From greg at electricrain.com Wed Oct 1 01:27:23 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 01:27:29 2003 Subject: [Python-Dev] Procedures for submitting patches to pybsddb In-Reply-To: <20030928005252.6fd1b4b9.itamar@itamarst.org> References: <20030928005252.6fd1b4b9.itamar@itamarst.org> Message-ID: <20031001052723.GK17491@zot.electricrain.com> On Sun, Sep 28, 2003 at 12:52:52AM -0400, Itamar Shtull-Trauring wrote: > I have a patch (DBCursor.get_current_size(), returns size of data for > current entry) which I'd like to submit. This involves changes to > pybsddb cvs as well as python cvs, from what I can tell (for tests and > docs in the pybsddb repo). > > While I have developer access to pybsddb, I don't have it for Python. > Submitting patches for two different repositories seems cumbersome, so > where should I add it? Python SF tracker? If you submit the patch to the python project there's less dependence on me to check it and commit it. I suggest including the pybsddb repository diffs as a second file in the patch and emailing the pybsddb-users mailing list afterwards to point to it. If a patch is submitted to pybsddb, I'll usually notice and do the right thing, but far fewer eyes watch that patch manager. The pybsddb-users email is a nice heads-up because I read that low-volume list more often than python-dev. -g From martin at v.loewis.de Wed Oct 1 01:33:24 2003 From: martin at v.loewis.de (Martin v. Löwis) Date: Wed Oct 1 01:33:32 2003 Subject: [Python-Dev] Good way of finding out what C functions we have? In-Reply-To: <3F79F878.4070805@ocf.berkeley.edu> References: <3F779BA6.6050407@ocf.berkeley.edu> <3F78E57E.8090403@ocf.berkeley.edu> <16249.33065.797263.217111@montanaro.dyndns.org> <3F79F878.4070805@ocf.berkeley.edu> Message-ID: "Brett C." writes: > Right. I was wondering if there are any checks in configure.in such that, > if something was not available, Python itself would not compile > *at all*.
I would suspect not since that is what the ANSI C/POSIX > coding requirement is supposed to handle, right? Mostly. There are several conditions under which configure would abort; search for exit. One case is that you try to run it on a not-longer-supported system. Regards, Martin From greg at electricrain.com Wed Oct 1 01:55:53 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 01:55:56 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <200309291445.h8TEj5hc002980@localhost.localdomain> References: <200309291445.h8TEj5hc002980@localhost.localdomain> Message-ID: <20031001055553.GM17491@zot.electricrain.com> On Tue, Sep 30, 2003 at 12:45:05AM +1000, Anthony Baxter wrote: > > For those of you not following every bug in the SF tracker closely, > in http://www.python.org/sf/775414 it's been suggested that the docs > for 2.3.2 include a warning about using the old-style interface to > bsddb (without a DBEnv) with multi-threaded applications. This seems > like a prudent suggestion - does someone want to supply some words? > > Anthony Attached is a patch. Commit it if you like it. -------------- next part -------------- Index: libbsddb.tex =================================================================== RCS file: /cvsroot/python/python/dist/src/Doc/lib/libbsddb.tex,v retrieving revision 1.11 diff --unified=5 -r1.11 libbsddb.tex --- libbsddb.tex 28 May 2003 16:20:03 -0000 1.11 +++ libbsddb.tex 1 Oct 2003 05:51:11 -0000 @@ -28,10 +28,16 @@ The following is a description of the legacy \module{bsddb} interface compatible with the old python bsddb module. For details about the more modern Db and DbEnv object oriented interface see the above mentioned pybsddb URL. + +\warning{This legacy interface is not thread safe in python 2.3.x +or earlier. Data corruption, core dumps or deadlocks may occur if you +attempt multi-threaded access. 
You must use the modern pybsddb +interface linked to above if you need multi-threaded or multi-process +database access.} The \module{bsddb} module defines the following functions that create objects that access the appropriate type of Berkeley DB file. The first two arguments of each function are the same. For ease of portability, only the first two arguments should be used in most From theller at python.net Wed Oct 1 03:52:00 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 03:52:30 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16250.151.309386.606310@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Tue, 30 Sep 2003 18:15:51 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <16250.151.309386.606310@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > > And it would help if I could build the HTML docs myself from CVS. I did > > manage to create the pdf files with TeTex under windows, but I didn't > > succeed with the html pages so far. > > Are you using Cygwin? What problems did you encounter? I'll help if > I can; I have a Windows machine (sometimes), but don't know anything > about non-Cygwin Windows TeX systems (and I don't have Cygwin > installed most of the time). No, I'm not using cygwin. I have seen too many broken cvs files with line-end problems; I suspect people check in files with MSDOS line endings under cygwin. Now that I can build the docs on starship (thanks, Greg!) it's not needed anymore to do it under Windows, but for the archives here are my experiences: Doing 'nmake pdf' (this is the MSVC6 make utility) in the src/Doc directory worked, it created the pdf docs with MikTeX I had installed. Maybe I had to trivially edit the Makefile (replace 'cp' with 'copy' and such) before. With the recent checkins to the Makefile it doesn't work anymore, although installing the Mingw32 gnumake helped.
Then I tried to bring 'make html' to work, installed latex2html (I have Perl already), but this always complained about pnmtopng missing (or something like that). And the make failed with an error such as 'image format unsupported'. Well, I tried to find and install native windows pnm2png and png2pnm tools, had to replace incompatible zlib.dll and so on. It didn't work, instead it broke my ssh and maybe other stuff. At this point I gave up, removed the software, and be happy that I managed to get my ssh working again. Thomas From theller at python.net Wed Oct 1 03:54:20 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 03:54:49 2003 Subject: [Starship] Re: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <20030930012556.GA6451@cthulhu.gerg.ca> (Greg Ward's message of "Mon, 29 Sep 2003 21:25:56 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <8yoauf87.fsf@python.net> <20030928191828.GA2852@cthulhu.gerg.ca> <2mvfrb271d.fsf@starship.python.net> <20030930012556.GA6451@cthulhu.gerg.ca> Message-ID: Greg Ward writes: > Argh. Installing tetex-bin (and -doc, -extra, -lib just for fun) now. > Thanks, Greg. Works great now. Thomas From mwh at python.net Wed Oct 1 06:45:06 2003 From: mwh at python.net (Michael Hudson) Date: Wed Oct 1 06:44:23 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16249.65446.699521.103706@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Tue, 30 Sep 2003 18:11:50 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> Message-ID: <2mbrt11bkd.fsf@starship.python.net> "Fred L. Drake, Jr." 
writes: > Michael Hudson writes: > > It occurs to me that I don't know *why* Fred is so much the > > documentation man; I've not had any trouble processing the docs into > > HTML lately (haven't tried on Windows, admittedly, and I haven't > > tried to make info ever). > > It's certainly gotten easier to deal with the documentation on modern > Linux distributions. At CNRI, we used mostly Solaris boxes, and I > have to build my own teTeX installations from source, and hand-select > a version of LaTeX2HTML that worked for me. Oh, there was a reason I put a "lately" in what I said... > At this point, all the software that I can't just install from a > RedHat CD is part of what gets pulled down from CVS. I've been able > to build the docs on Cygwin as well, though I've not tried lately. > A lot of what it takes to build the docs is written into Doc/Makefile, > but it does require a solid make (it even uses $(shell ...) now, so > maybe only GNU make will do; not sure). One thing that puzzled me: Doc/Makefile seems to require that Doc/tools is on $PATH, unless I'm misunderstanding something. > > What else needs to be done? There must be quite a bit of mucking > > about on creosote to do, I guess. > > There's a bit, but that's getting easier and easier as I've gone > through it a few times now. I updated PEP 101 the other evening so > anyone can do what's needed to build the packages and get them in the > download locations. There's more to be written to explain what else > needs to be updated on the site. Well, progress! I think it's a worthy goal that no single person is required to make a release, and that actually this isn't too far off. Cheers, mwh -- Never meddle in the affairs of NT. It is slow to boot and quick to crash. 
-- Stephen Harris -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From gerrit at nl.linux.org Wed Oct 1 07:23:05 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 1 07:23:11 2003 Subject: [Python-Dev] Documentation packages In-Reply-To: <16249.61774.732640.328478@grendel.zope.com> References: <16249.61774.732640.328478@grendel.zope.com> Message-ID: <20031001112305.GA3667@nl.linux.org> Hi, Fred L. Drake, Jr. wrote: > After a brief discussion on the Doc-SIG, it looks like I can > reasonably drop the .tar.gz packaging for the documentation, leaving > only .zip and .tar.bz2 formats. > > Are there any strong objections to this change? What is the reason to do so? Can it do any harm do leave it in? just curious... Gerrit Holl. -- 6. If any one steal the property of a temple or of the court, he shall be put to death, and also the one who receives the stolen thing from him shall be put to death. -- 1780 BC, Hammurabi, Code of Law -- Asperger Syndroom - een persoonlijke benadering: http://people.nl.linux.org/~gerrit/ Het zijn tijden om je zelf met politiek te bemoeien: http://www.sp.nl/ From anthony at interlink.com.au Wed Oct 1 01:54:26 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:05 2003 Subject: [Python-Dev] HP Test Drive systems Message-ID: <200310010554.h915sQBT002065@localhost.localdomain> If you've found the Sourceforge Compile Farm a bit lacking in the numbers of systems available, I highly recommend the HP testdrive program (ok, to be fair, it was originally the DEC testdrive program to allow you to play with OSF/1 on Alphas, then the Compaq testdrive program, but HP's added a bunch of their own systems to it, as well as a wide variety of the free linux and bsd variants). I've appended the list of systems that are currently available to the bottom of this list. mwh and I have been using these to track down all manner of wacky O/S- dependent errors. 
Sign up for it at http://www.testdrive.compaq.com/ Anthony Test Drive System Type HP Tru64 Unix 4.0g(JAVA) AS1200 2@533MHz (ev56) HP Tru64 Unix 5.1b(JAVA) DS20L 2@833MHz(ev68) HP Tru64 Unix 5.1b(JAVA) DS20E 2@667MHz (ev67) HP Tru64 Unix 5.1b(JAVA) ES40 4@833MHz (ev67) HP Tru64 Unix 5.1b(JAVA) ES45 4@1GHz (ev68) HP Tru64 Unix 5.1b(JAVA) ES47 2x1GHz (ev7) HP OpenVMS 7.3-2 EFT DS10-L 1@466MHz (ev6) HP OpenVMS 7.3-1 DS20 2@500MHz (ev6) HP-UX 11i 11.22 rx2600 2@900MHz (Itanium II) HP-UX 11i 11.22 rx2600 2@900MHz (Itanium II) HP-UX 11i 11.11 rp2470 2@750MHz (PA-RISC) HP-UX 11i 11.11 rp2470 2@750MHz (PA-RISC) Linux Test Drives: Test Drive System Type Debian GNU/Linux 3.0 on Intel ProLiant DL360 G2 1.4GHz (P3) Debian GNU/Linux 3.0 on Intel rx2600 2@900MHz (Itanium II) Debian GNU/Linux 3.0 on Alpha XP1000a 1@667MHz (ev6) Debian GNU/Linux 3.0 on Alpha DS20 2@500MHz (ev6) Debian GNU/Linux 3.0 on PA-RISC rp5470 1@550MHz (PA-RISC) Mandrake Linux 9.1 on Intel ProLiant ML530 2@800MHz (P3) Red Hat Ent Linux ES 2.1 on Intel ProLiant ML530 2@1.0GHz (P3) Red Hat Ent Linux AS 2.1 on Intel ProLiant DL360 2@800MHz (P3) Red Hat Ent Linux AS 2.1 on Intel rx2600 2@900MHz (ItaniumII) Red Hat Ent Linux AS 2.1 on Intel rx2600 2@900MHz (ItaniumII) Red Hat Ent Linux AS 2.1 on Intel Intel 4@1.4GHz (Itanium II) Red Hat Linux 7.2 on Alpha DS20 2@500MHz (ev6) Red Hat Linux 7.2 on Alpha(JAVA) ES40 4@667MHz (ev67) Slackware Linux 9.0 on Intel ProLiant ML530 2@800MHz (P3) SuSE Linux Ent Svr 8.0 on Intel ProLiant DL360 2@1.4GHz (P3) SuSE Linux 7.2a on Intel DL590 4@800MHz (Itanium I) SuSE Linux 7.1 on Alpha DS10-L 1@466MHz (ev6) SuSE Linux 7.1 on Alpha ES40 2@667MHz (ev67) SuSE Linux 7.1 on Alpha(JAVA) DS20e 2@667MHz (ev67) BSD Test Drives: Test Drive System Type FreeBSD 4.8 on Intel ProLiant DL360 2@1.4GHz (P3) FreeBSD 4.8 on Alpha XP1000a 1@667MHz (ev6) OpenBSD 3.2 on Intel ProLiant DL360 2@1.2GHz (P3) NetBSD 1.6 on Intel ProLiant DL360 2@1.2GHz (P3) Cluster Test Drives: Beowulf BrickWall Cluster 
DS10 & DS10-L(8) 466MHz (ev6) HP TruCluster Server 5.1b(JAVA) ES40 883Mhz & DS20E 667Mhz OpenVMS 7.3 Galaxy Cluster AS4100 EV56 OpenVMS 7.3 Galaxy Cluster AS4100 EV56 Red Hat Advanced Server Cluster ProLiant DL360x2 2@800MHz (P3) iPAQ TestDrive Developer Program: Test Drive System Type iPAQ iPAQ H3650 Application Test Drives: Test Drive System Type Oracle 9iAS Portal Tru64 Unix5.1B ES40 4@667MHz (ev6) Oracle 9iRAC 9.2.0 on Tru64 Unix ES45 @1GHz & ES40 @833MHz From anthony at interlink.com.au Wed Oct 1 02:00:09 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:25 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <20031001055553.GM17491@zot.electricrain.com> Message-ID: <200310010600.h91609IZ002197@localhost.localdomain> Looks good! I've upgraded it to a \begin{notice}[warning] so that it really stands out (see for instance http://www.python.org/doc/current/lib/node61.html ) Thanks! Anthony From anthony at interlink.com.au Wed Oct 1 02:10:20 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:33 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <20031001055553.GM17491@zot.electricrain.com> Message-ID: <200310010610.h916AK2B002343@localhost.localdomain> Just another thought - should the newer pybsddb API be folded into the library docs? Anthony From barry at python.org Wed Oct 1 09:01:03 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 1 09:01:09 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <200310010610.h916AK2B002343@localhost.localdomain> References: <200310010610.h916AK2B002343@localhost.localdomain> Message-ID: <1065013263.19531.24.camel@anthem> On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > Just another thought - should the newer pybsddb API be folded into the > library docs? They're big, but I think worth it. In general the pybsddb docs are excellent and invaluable, but I would make one change. 
I think the links to the C API point to pybsddb copies of the Sleepycat documentation. I'd change those to point to Sleepycat's own online documentation. It's more fragile, but 1) it means pulling less into Python's library, and 2) should be more up-to-date as Sleepycat makes changes and new releases. -Barry From Paul.Moore at atosorigin.com Wed Oct 1 09:06:10 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Wed Oct 1 09:07:01 2003 Subject: [Python-Dev] 2.3.2 and bsddb Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C097F1@UKDCX001.uk.int.atosorigin.com> From: Barry Warsaw [mailto:barry@python.org] > On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > > Just another thought - should the newer pybsddb API be folded into the > > library docs? > They're big, but I think worth it. In general the pybsddb docs are > excellent and invaluable [...] I think it should. I read the bsddb stuff in the Python manual, and barely noticed the reference to pybsddb. Subconsciously, I assumed that it was simply background reading, and not important for day to day use (much like pointers to RFCs in many of the Internet modules). I certainly never assumed that there might be functionality which wasn't documented in the Python library reference. Paul. From anthony at interlink.com.au Wed Oct 1 09:52:50 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 09:54:38 2003 Subject: [Python-Dev] first pass at a release checker Message-ID: <200310011352.h91DqoUe007522@localhost.localdomain> Here's the first hack at a quick script for checking a release tarball for sanity. Please suggest additional checks to make. 
At the moment it checks:
  tarball name
  tarball unpacks to a correctly named directory
  no CVS directories in the tarball
  no Release date: XXX in Misc/NEWS
  "configure ; make ; make test" works
Additional checks I plan to add at some point:
  check the version number in Include/patchlevel.h
  check the version number and build number in the windows-specific area
Where should something like this (cleaned up a bit) be checked in? Tools/something? Anthony

def Error(message):
    import sys
    print "ERROR:", message
    sys.exit(1)

def searchFile(filename, searchPattern, badPattern):
    import re
    searchRe = re.compile(searchPattern)
    badPatternRe = re.compile(badPattern)
    for line in open(filename):
        if searchRe.match(line):
            if badPatternRe.search(line):
                Error("found %s in %s"%(badPattern, filename))

def main(tarball):
    import os
    # make tarball path absolute
    if tarball[0] != "/":
        tarball = os.path.join(os.getcwd(), tarball)
    if tarball[-4:] != ".tgz":
        Error("tarball should end in .tgz")
    # Check tarball is gzipped, maybe check compression level?
    reldir = "checkrel-%d"%(os.getpid())
    os.mkdir(reldir)
    os.chdir(reldir)
    print "extracting in %s"%reldir
    print "tarball is %s"%(tarball)
    os.system("tar xzf %s"%(tarball))
    relname = os.path.basename(tarball)[:-4]
    entries = os.listdir(".")
    if len(entries) != 1 or entries[0] != relname:
        Error("tarball should have only created %s"%relname)
    os.chdir(relname)
    for dirpath, dirnames, filenames in os.walk('.'):
        if "CVS" in dirnames:
            Error("%s contains a CVS directory!"%dirpath)
    # additional checks go here.
    searchFile("Misc/NEWS", "^\*Release date:", "XXX")
    os.system("./configure")
    os.system("make")
    os.system("make testall")

import sys
main(sys.argv[1])

From aahz at pythoncraft.com Wed Oct 1 10:02:56 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 10:03:01 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031001112305.GA3667@nl.linux.org> References: <16249.61774.732640.328478@grendel.zope.com> <20031001112305.GA3667@nl.linux.org> Message-ID: <20031001140255.GA17311@panix.com> On Wed, Oct 01, 2003, Gerrit Holl wrote: > Fred L. Drake, Jr. wrote: >> >> After a brief discussion on the Doc-SIG, it looks like I can >> reasonably drop the .tar.gz packaging for the documentation, leaving >> only .zip and .tar.bz2 formats. >> >> Are there any strong objections to this change? > > What is the reason to do so? Can it do any harm do leave it in? Two points: * It's another step in the release process * It takes up extra space on the servers Following Fred's suggestion saves time and space. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From mwh at python.net Wed Oct 1 11:15:23 2003 From: mwh at python.net (Michael Hudson) Date: Wed Oct 1 11:14:40 2003 Subject: [Python-Dev] first pass at a release checker In-Reply-To: <200310011352.h91DqoUe007522@localhost.localdomain> (Anthony Baxter's message of "Wed, 01 Oct 2003 23:52:50 +1000") References: <200310011352.h91DqoUe007522@localhost.localdomain> Message-ID: <2m7k3p0z1w.fsf@starship.python.net> Anthony Baxter writes: > Where should something like this (cleaned up a bit) be checked in? > Tools/something? Tools/scripts/something, I'd have thought. Cheers, mwh -- Presumably pronging in the wrong place zogs it.
-- Aldabra Stoddart, ucam.chat From michael.l.schneider at eds.com Wed Oct 1 11:28:09 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 11:28:13 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 2 Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> SGI python 1.3.2 rc2 fails to build on irix. There is a compile error in termios.c. This is caused by the fact that SGI #defines some control chars, but does not implement them. If the following code is added to Modules/termios.c, then the problem is fixed, and all is well on IRIX. Can someone get this in? Thanks, Mike
--------------------------------------------------------------------------------
/* SGI #defines these, but does not support them */
#if defined(__sgi)
#ifdef CLNEXT
#undef CLNEXT
#endif
#ifdef CRPRNT
#undef CRPRNT
#endif
#ifdef CWERASE
#undef CWERASE
#endif
#ifdef CFLUSH
#undef CFLUSH
#endif
#ifdef CDSUSP
#undef CDSUSP
#endif
#endif /* __sgi */
----------------------------------------------------------------
Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" From anthony at interlink.com.au Wed Oct 1 11:54:17 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 11:56:14 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 2 In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> Message-ID: <200310011554.h91FsHBP009060@localhost.localdomain> >>> "Schneider, Michael" wrote > SGI python 1.3.2 rc2 fails to build on irix. > There is a compile error in termios.c. I'm not sure what 1.3.2 rc2 might mean. Is this 2.3.1c1? If so, which exact version of Irix is this on, and which compiler (and version of compiler)? > This is caused by the fact that SGI #defines some control chars, but does not > implement them. Argh.
This is a way ugly problem - why does the OS define them and not implement them? Anthony -- Anthony Baxter It's never too late to have a happy childhood. From aahz at pythoncraft.com Wed Oct 1 12:00:11 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 12:00:17 2003 Subject: [Python-Dev] Irix problems In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> References: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> Message-ID: <20031001160010.GA28676@panix.com> On Wed, Oct 01, 2003, Schneider, Michael wrote: > > SGI python 1.3.2 rc2 fails to build on irix. There is a compile error > in termios.c. > > This is caused by the fact that SGI #defines some control chars, but > does not implement them. > > If the following code is added to Modules/termios.c, then the problem > is fixed, and all is well on IRIX. > > Can someone get this in? Did 2.3 or 2.3.1 compile correctly? If not, it's too late to get this into 2.3.2. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From michael.l.schneider at eds.com Wed Oct 1 13:01:29 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 13:01:34 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... 
---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 12:00 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > SGI python 1.3.2 rc2 fails to build on irix. There is a compile error > in termios.c. > > This is caused by the fact that SGI #defines some control chars, but > does not implement them. > > If the following code is added to Modules/termios.c, then the problem > is fixed, and all is well on IRIX. > > Can someone get this in? Did 2.3 or 2.3.1 compile correctly? If not, it's too late to get this into 2.3.2. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From just at letterror.com Wed Oct 1 13:07:33 2003 From: just at letterror.com (Just van Rossum) Date: Wed Oct 1 13:07:28 2003 Subject: [Python-Dev] imp.findmodule and zip files In-Reply-To: <16250.1230.305462.233362@magrathea.basistech.com> Message-ID: Tom Emerson wrote: > Should imp.find_module() work for modules that are packaged in a zip > file in 2.3.x? I'm seeing that this doesn't, and before I dive in to > figure out why, I want to see if this is the intent. The imp module is not yet updated to have full access to the new import hooks :-(. See near the end of http://www.python.org/peps/pep-0302.html for a discussion of the issues. 
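[Editor's note: the PEP 302 zip-import machinery discussed here can be exercised directly, even though imp.find_module() doesn't see through it. A minimal sketch, using today's stdlib; the archive name, module name, and contents are invented for illustration:]

```python
import os
import sys
import tempfile
import zipfile
import zipimport
import importlib

# Build a zip archive containing a tiny module (hypothetical name).
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "example.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("zipped_mod.py", "ANSWER = 42\n")

# The PEP 302 path hook: putting the archive on sys.path makes the
# zipimporter serve modules out of it transparently.
sys.path.insert(0, zip_path)
mod = importlib.import_module("zipped_mod")
print(mod.ANSWER)

# The hook itself can also be driven by hand.
zi = zipimport.zipimporter(zip_path)
print(zi.archive)
```

imp.find_module(), by contrast, only walks real directories on sys.path, which is the behaviour Tom was seeing.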
Just From theller at python.net Wed Oct 1 13:31:27 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 13:31:58 2003 Subject: [Python-Dev] imp.findmodule and zip files In-Reply-To: (Just van Rossum's message of "Wed, 1 Oct 2003 19:07:33 +0200") References: Message-ID: Just van Rossum writes: > The imp module is not yet updated to have full access to the new import > hooks :-(. See near the end of http://www.python.org/peps/pep-0302.html > for a discussion of the issues. There's another minor issue with the new import hooks which it would be nice to fix: to my knowledge, the Py_VerboseFlag is not exposed to the Python layer. Sometimes it would come in handy when debugging a custom importer. Thomas From fdrake at acm.org Wed Oct 1 14:36:15 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 14:36:46 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031001140255.GA17311@panix.com> References: <16249.61774.732640.328478@grendel.zope.com> <20031001112305.GA3667@nl.linux.org> <20031001140255.GA17311@panix.com> Message-ID: <16251.7839.871263.562935@grendel.zope.com> Aahz writes: > * It's another step in the release process The way I wrote up the documentation release in PEP 101, generating the files isn't even a step. There are a couple of make commands that cause these to be generated; these would not change; just the definitions for those targets would change. > * It takes up extra space on the servers There is this; not a huge deal, but considering we're running on hardware owned by XS4ALL, and we're dependent on their goodwill, we shouldn't waste the space if we don't need to. > Following Fred's suggestion saves time and space. I think more important is that it reduces the number of options that get presented to someone who's looking to download something.
The plethora of documentation packages is almost embarrassing when compared to the number of packages for the interpreter itself: the sources as a .tar.gz package (no ZIP!), and the Windows installer. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake at acm.org Wed Oct 1 14:56:03 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 14:56:34 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <2mbrt11bkd.fsf@starship.python.net> References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> Message-ID: <16251.9027.715275.62943@grendel.zope.com> Michael Hudson writes: > One thing that puzzled me: Doc/Makefile seems to require that > Doc/tools is on $PATH, unless I'm misunderstanding something. It definitely doesn't require that; I've never used Doc/tools/ on $PATH. One thing it was requiring (only recently) was that there was a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto script. I've removed that constraint for the trunk. The intention is that we should be able to use a mkhowto script from a different checkout; you can still modify the MKHOWTO make variable to do that, but it's not so valuable on the trunk as on the maintenance branches (where we want to use mkhowto from the trunk). -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From theller at python.net Wed Oct 1 15:08:03 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 15:08:35 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9027.715275.62943@grendel.zope.com> (Fred L.
Drake, Jr.'s message of "Wed, 1 Oct 2003 14:56:03 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> Message-ID: <4qys7p4c.fsf@python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > One thing that puzzled me: Doc/Makefile seems to require that > > Doc/tools is on $PATH, unless I'm misunderstanding something. > > It definately doesn't require that; I've never used Doc/tools/ on > $PATH. One thing it was requiring (only recently) was that there was > a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto > script. > > I've removed that constraint for the trunk. > > The intention is that we should be able to use a mkhowto script from a > different checkout; you can still modify the MKHOWTO make variable to > do that, but it's not so valuable on the trunk as on the maintenance > branches (where we want to use mkhowto from the trunk). Do we? Why? Thomas From fdrake at acm.org Wed Oct 1 15:11:19 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 15:11:42 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <4qys7p4c.fsf@python.net> References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> Message-ID: <16251.9943.343427.455546@grendel.zope.com> I wrote: > The intention is that we should be able to use a mkhowto script from a > different checkout; you can still modify the MKHOWTO make variable to > do that, but it's not so valuable on the trunk as on the maintenance > branches (where we want to use mkhowto from the trunk). Thomas Heller writes: > Do we? Why? 
Definitely. I don't want to maintain several versions of the tools; they're almost an external application at this point. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz at pythoncraft.com Wed Oct 1 15:28:45 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 15:28:59 2003 Subject: [Python-Dev] Irix problems In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> References: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> Message-ID: <20031001192845.GA10491@panix.com> On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From theller at python.net Wed Oct 1 15:41:24 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 15:41:56 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9943.343427.455546@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 1 Oct 2003 15:11:19 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> <16251.9943.343427.455546@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > I wrote: > > The intention is that we should be able to use a mkhowto script from a > > different checkout; you can still modify the MKHOWTO make variable to > > do that, but it's not so valuable on the trunk as on the maintenance > > branches (where we want to use mkhowto from the trunk). > > Thomas Heller writes: > > Do we? Why? > > Definitely.
I don't want to maintain several versions of the tools; > they're almost an external application at this point. > Ok, but it would be nice if PEP 101 and 102 would contain instructions on how to build the docs (on Unix); I will try to do the same for Windows. Thanks, Thomas From fdrake at acm.org Wed Oct 1 15:48:24 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 15:48:42 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> <16251.9943.343427.455546@grendel.zope.com> Message-ID: <16251.12168.559381.428128@grendel.zope.com> Thomas Heller writes: > Ok, but it would be nice if PEP 101 and 102 would contain instructions > on how to build the docs (on Unix); I will try to do the same for Windows. That's in the version of PEP 101 in CVS; the online version isn't up-to-date due to the anonymous CVS access on SF using their backup repositories that aren't updated frequently enough. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From delza at blastradius.com Wed Oct 1 16:11:36 2003 From: delza at blastradius.com (Dethe Elza) Date: Wed Oct 1 16:11:49 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.7839.871263.562935@grendel.zope.com> Message-ID: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> My $0.02 (Canadian), for what it's worth: While Windows users may have trouble with *.bz2, and be unfamiliar enough with the extension *.tgz to not even try (even if it does work), I've never known a *nix box to have trouble with *.zip or known a unix user who had trouble with *.zip. So I'd suggest keeping the various flavors of documentation, but standardize on zip compression. That will at least remove one variable. 
I agree that the main point of all of this is to reduce confusion for the newbie coming to the site to download it. But 90% of those are going to be Windows users, and the rest of us have gotten used to living in a Windows-dominated world. Using bz2 may get you better compression and save bandwidth, but it wasn't standard the last time I installed RedHat or Debian. Zip has its faults, but everybody is familiar with it. --Dethe From tim at zope.com Wed Oct 1 16:18:38 2003 From: tim at zope.com (Tim Peters) Date: Wed Oct 1 16:19:44 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: [Dethe Elza] > While Windows users may have trouble with *.bz2, and be unfamiliar > enough with the extension *.tgz to not even try (even if it does > work), I've never known a *nix box to have trouble with *.zip or > known a unix user who had trouble with *.zip. So I'd suggest keeping > the various flavors of documentation, but standardize on zip > compression. That will at least remove one variable. A difficulty is that the HTML doc set compresses *much* better under bz2 than under zip format, and many people download over slow and expensive dialup lines. bz2 is preferred for that reason (smaller file == faster and cheaper download). > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, I don't believe that, because the Windows installer for Python includes the full doc set in a Windows-friendly format. So there's simply no reason for the vast majority of Windows Python users to download the doc distribution at all. Fred, do we have stats on how often each of the files got downloaded for previous releases? > and the rest of us have gotten used to living in a Windows-dominated > world. 
Using bz2 may get you better compression and save bandwidth, > but it wasn't standard the last time I installed RedHat or Debian. > Zip has its faults, but everybody is familiar with it. No argument there. From tree at basistech.com Wed Oct 1 16:18:23 2003 From: tree at basistech.com (Tom Emerson) Date: Wed Oct 1 16:22:41 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16251.13967.470512.219448@magrathea.basistech.com> Dethe Elza writes: [...] > I've never known a *nix box to have trouble with *.zip or known a unix > user who had trouble with *.zip. So I'd suggest keeping the various > flavors of documentation, but standardize on zip compression. That > will at least remove one variable. What Unix boxen do you use? I often run into Solaris, IRIX, and HP-UX boxen that lack unzip. > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, and the rest of us have gotten used to > living in a Windows-dominated world. Using bz2 may get you better > compression and save bandwidth, but it wasn't standard the last time I > installed RedHat or Debian. Zip has its faults, but everybody is > familiar with it. I wouldn't switch to bz2. Even tgz can be confusing. Having .zip files for Windows users and .tar.gz files for Unix users is a happy medium that should work most everywhere. Of course for maximum Unix portability I suppose you could use .tar.Z ;-) -tree -- Tom Emerson Basis Technology Corp. Software Architect http://www.basistech.com "Beware the lollipop of mediocrity: lick it once and you suck forever" From fdrake at acm.org Wed Oct 1 16:24:15 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed Oct 1 16:24:36 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16251.14319.693162.119201@grendel.zope.com> Dethe Elza writes: > While Windows users may have trouble with *.bz2, and be unfamiliar > enough with the extension *.tgz to not even try (even if it does work), > I've never known a *nix box to have trouble with *.zip or known a unix > user who had trouble with *.zip. So I'd suggest keeping the various > flavors of documentation, but standardize on zip compression. That > will at least remove one variable. At this point, the bzip2 compression has been the most-requested (in terms of emails begging us to add it); the most important aspect that makes it desirable is that the file sizes are so much better. From this perspective, ZIP files are the worst for the formats which cause a lot of individual files to be packaged (most importantly, the HTML and LaTeX source formats). There are still enough people who want to pull the files over slow links that this seems valuable, at least for those two formats. (It may be that it's *only* valuable for those formats, and can be dropped for the PDF and PostScript formats.) > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, and the rest of us have gotten used to > living in a Windows-dominated world. Using bz2 may get you better > compression and save bandwidth, but it wasn't standard the last time I > installed RedHat or Debian. Zip has its faults, but everybody is > familiar with it. Interesting; I don't recall the last time I had to build my own bzip2. I'm pretty sure I didn't do anything special to get it on RedHat recently. 
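For a rough feel of the size difference under discussion, Python's own compression modules can be compared directly. This is only a sketch on synthetic data; the half-megabyte saving mentioned in this thread comes from measuring the real HTML doc set:

```python
import bz2
import zlib

# Stand-in for a markup-heavy doc set; real numbers depend on the corpus.
data = b"<p>Python documentation sample paragraph.</p>\n" * 5000

gz = len(zlib.compress(data, 9))  # deflate, the algorithm behind gzip/zip
bz = len(bz2.compress(data, 9))

# Both shrink the input dramatically; on the real HTML docs, bz2's
# block-sorting compressor is what buys the extra savings over gzip.
```

The exact ratio varies with the input, which is why measuring on the actual release tarballs is the only number that settles the argument.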
The bandwidth savings aren't nearly so valuable to python.org as they are to end users on metered internet connections; those are the users who were so incredibly vocal that we actually started posting those. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry at python.org Wed Oct 1 16:26:46 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 1 16:26:51 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14319.693162.119201@grendel.zope.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> Message-ID: <1065040005.15765.18.camel@geddy> On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > Interesting; I don't recall the last time I had to build my own > bzip2. I'm pretty sure I didn't do anything special to get it on > RedHat recently. No, I'm sure you didn't. bzip2 decompression should be standard on RH9, and there's even a tar option to read and write it. What I don't know is whether bz2 decompression is generally available on MacOSX... minority-platform-ly y'rs, -Barry From fred at zope.com Wed Oct 1 16:31:17 2003 From: fred at zope.com (Fred L. Drake, Jr.) Date: Wed Oct 1 16:31:31 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> Message-ID: <16251.14741.876754.348214@grendel.zope.com> Tim Peters writes: > Fred, do we have stats on how often each of the files got downloaded for > previous releases? No, but we should be able to pull those from the server logs. Maybe this weekend I'll get time to write a script to pull that data out. Tom Emerson writes: > I wouldn't switch to bz2. Even tgz can be confusing. 
Having .zip files > for Windows users and .tar.gz files for Unix users is a happy medium > that should work most everywhere. Interesting. bzip2 saves half a MB over gzip for the HTML and PostScript formats. What reason do you have for not using bzip2? It was very heavily requested for the file-size advantage. > Of course for maximum Unix > portability I suppose you could use .tar.Z ;-) Except nobody remembers what to do with those anymore. ;-) I haven't used compress/uncompress in *many* years. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From michael.l.schneider at eds.com Wed Oct 1 16:32:16 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 16:32:19 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED07@uscimplm001.net.plm.eds.com> That's fine, I can apply the fix to our local src. Can this fix go into the next release? Thanks for your efforts, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 3:29 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From fdrake at acm.org Wed Oct 1 16:53:52 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed Oct 1 16:54:17 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <16250.151.309386.606310@grendel.zope.com> Message-ID: <16251.16096.224422.956995@grendel.zope.com> Thomas Heller writes: > Now that I can build the docs on starship (thanks, Greg!) it's not > needed anymore to do it under Windows, but for the archives here are my > experiences: I appreciate your taking the time! > Doing 'nmake pdf' (this is the MSVC6 make utility) in the src/Doc > directory worked, it created the pdf docs with MikTeX I had > installed. Maybe I had to trivially edit the Makefile (replace 'cp' with > 'copy' and such) before. Most of the "cp" commands were removed as mkhowto became more capable, and were replaced by calls to shutil.copyfile(). There are still a few "cp" commands in the Makefile, though, and the "clean" target and friends still use "rm". > It doesn't work anymore with the recent checkins to the Makefile, > although installing the Mingw32 gnumake helped. That's expected, since it now uses the GNU-ish $(shell ...) syntax to call an external script from Doc/tools/. Removing this would require even more painful gyrations to maintain the same functionality, or would require that Python version numbers once more appear in the documentation source tree. > Then I tried to bring 'make html' to work, installed latex2html (I have > Perl already), but this always complained about pnmtopng missing (or > something like that). And the make failed with an error such as 'image > format unsupported'. Well, I tried to find and install native windows > pnm2png and png2pnm tools, had to replace incompatible zlib.dll and so > on. It didn't work, instead it broke my ssh and maybe other stuff. That's painful. netpbm is a documented requirement for LaTeX2HTML, but is a pain. I had to install that from source under Cygwin. 
> At this point I gave up, removed the software, and be happy that I > managed to get my ssh working again. That certainly sounds like a pain. I'll think about what I can do to make it easier, but I don't think it can take a high priority. I'm glad you got it working on Starship. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From python at discworld.dyndns.org Wed Oct 1 17:05:16 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 1 17:07:14 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14319.693162.119201@grendel.zope.com>; from fdrake@acm.org on Wed, Oct 01, 2003 at 04:24:15PM -0400 References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> Message-ID: <20031001150516.B14797@discworld.dyndns.org> Fred L. Drake, Jr. wrote: > > Interesting; I don't recall the last time I had to build my own > bzip2. I'm pretty sure I didn't do anything special to get it on > RedHat recently. It was included in the RedHat 6.2 distribution, possibly in 6.1 and 6.0 as well, though I can't check that. It hasn't been an "exotic" package in many years, although it's not necessarily installed by default in a "base" install. I see no reason not to use .bz2 as the default format. 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From michael.l.schneider at eds.com Wed Oct 1 17:35:37 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 17:35:43 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED09@uscimplm001.net.plm.eds.com> Aahz, Correction to SGI code--------------------------------------------

// SGI #defines, but does not support these
#ifdef __sgi
#ifdef CLNEXT
#undef CLNEXT
#endif
#ifdef CRPRNT
# undef CRPRNT
#endif
#ifdef CWERASE
# undef CWERASE
#endif
#ifdef CFLUSH
#undef CFLUSH
#endif
#ifdef CDSUSP
#undef CDSUSP
#endif
#endif

---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Schneider, Michael Sent: Wednesday, October 01, 2003 4:32 PM To: 'Aahz' Cc: python-dev@python.org Subject: RE: [Python-Dev] Irix problems That's fine, I can apply the fix to our local src. Can this fix go into the next release? Thanks for your efforts, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 3:29 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. 
We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From pycon at python.org Wed Oct 1 17:41:59 2003 From: pycon at python.org (PyCon Chair) Date: Wed Oct 1 17:42:09 2003 Subject: [Python-Dev] PyCon DC 2004: Call for Proposals Message-ID: <20031001214159.GA8075@panix.com> [Please repost to local Python mailing lists.] Want to share your expertise? PyCon DC 2004 is looking for proposals to fill the formal presentation tracks. PyCon DC 2003 had a broad range of presentations, from reports on academic and commercial projects to tutorials and case studies, and we hope to extend that range this year. As long as the presentation is interesting and potentially useful to the Python community, it will be considered for inclusion in the program. The proposal deadline is December 1; the proposal submission system should be up by mid-October. We'll send out another notice with more info when the submission system goes live. Proposals should be 250-1000 words in text (plain or reST) or HTML. You may request either thirty or sixty minutes for your timeslot. Proposals will be accepted or rejected by January 1, 2004. If your proposal is accepted, you may include a companion paper for publication on the PyCon web site. If you don't want to make a formal presentation, there will be a significant amount of Open Space to allow for informal and spur-of-the-moment presentations for which no formal submission is required. There will also be several Lightning Talk sessions (five minutes or less). For more information, see http://www.python.org/pycon/dc2004/cfp.html PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). 
It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.python.org/pycon/dc2004/ -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python-kbutler at sabaydi.com Wed Oct 1 19:00:01 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 1 19:00:20 2003 Subject: [Python-Dev] Bug? re.finditer fails to terminate with empty match Message-ID: <3F7B5C71.7020801@sabaydi.com> The iterator returned by re.finditer appears to not terminate if the final match is empty, but rather keeps returning the final (empty) match. Is this a bug in _sre? If so, I'll be happy to file it, though fixing it is a bit beyond my _sre experience level at this point. The solution would appear to be either to add a check for a duplicate match in iterator.next(), or to increment the position by one after returning an empty match (which should be OK, because if a non-empty match started at that location, we would have returned it instead of the empty match). 
Code to illustrate the failure:

from re import finditer
last = None
for m in finditer( ".*", "asdf" ):
    if last == m.span():
        print "duplicate match:", last
        break
    print m.group(), m.span()
    last = m.span()

---
asdf (0, 4)
 (4, 4)
duplicate match: (4, 4)
---

findall works:

print re.findall( ".*", "asdf" )
['asdf', '']

Workaround is to explicitly check for a duplicate span, as I did above, or to check for a duplicate end(), which avoids the final empty match. kb From greg at electricrain.com Wed Oct 1 19:19:18 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 19:19:22 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <1065013263.19531.24.camel@anthem> References: <200310010610.h916AK2B002343@localhost.localdomain> <1065013263.19531.24.camel@anthem> Message-ID: <20031001231918.GP17491@zot.electricrain.com> On Wed, Oct 01, 2003 at 09:01:03AM -0400, Barry Warsaw wrote: > On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > > Just another thought - should the newer pybsddb API be folded into the > > library docs? +1 all for it (for python 2.4). the pybsddb docs should be TeX-ified and included. They were originally written by Robin using a zope-ish formatted ascii -> html generator of some sort so automating the bulk of the task should be possible. > ... but I would make one change. I think the > links to the C API point to pybsddb copies of the Sleepycat > documentation. I'd change those to point to Sleepycat's own online > documentation. It's more fragile, but 1) it means pulling less into > Python's library, and 2) should be more up-to-date as Sleepycat makes > changes and new releases. +1 agreed. One caveat: Sleepycat keeps the documentation for their current release of BerkeleyDB online at http://www.sleepycat.com/docs/. It doesn't mention any of the different behaviours or even API differences between it and older versions of BerkeleyDB. 
We have no way of knowing exactly what version the user's python is compiled against other than in Windows binary releases. Mentioning that caveat in the documentation should be enough. -g From bac at OCF.Berkeley.EDU Wed Oct 1 20:37:46 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 1 20:38:08 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <3F7B735A.8070401@ocf.berkeley.edu> Barry Warsaw wrote: > On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > > >>Interesting; I don't recall the last time I had to build my own >>bzip2. I'm pretty sure I didn't do anything special to get it on >>RedHat recently. > > > No, I'm sure you didn't. bzip2 decompression should be standard on RH9, > and there's even a tar option to read and write it. Considering RH hosts the bzip2 site I would hope you could build on their OS. =) > What I don't know > is whether bz2 decompression is generally available on MacOSX... > It is; StuffIt can decompress it. I just downloaded the GNU Info docs and had no problem with double-clicking the file and decompressing. -Brett From tinuviel at sparcs.kaist.ac.kr Wed Oct 1 23:59:54 2003 From: tinuviel at sparcs.kaist.ac.kr (Seo Sanghyeon) Date: Wed Oct 1 23:59:59 2003 Subject: [Python-Dev] re.finditer Message-ID: <20031002035954.GA27701@sparcs.kaist.ac.kr> Hello, python-dev! This is my first mail to python-dev. The attached one-line patch fixes the re.finditer bug reported by Kevin J. Butler. I read the cvs log to find out why this code was introduced, and it seems to be related to SF bug #581080. But that bug didn't appear after my patch, so I wonder why it was introduced in the first place. It seems beyond my understanding. Please enlighten me. 
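The rule the patch enforces (always advance the scan position by at least one character after an empty match, so the scan is guaranteed to make progress) can be sketched in pure Python. This is an illustrative re-implementation for clarity, not the actual _sre.c logic:

```python
import re

def finditer_sketch(pattern, text):
    # Illustrative scan loop: after an empty match, bump the start
    # position by one so the iterator always terminates.
    pat = re.compile(pattern)
    pos = 0
    while pos <= len(text):
        m = pat.match(text, pos)
        if m is None:
            pos += 1
            continue
        yield m
        pos = m.end() + 1 if m.end() == m.start() else m.end()

spans = [m.span() for m in finditer_sketch(r".*", "asdf")]
# Terminates with two matches: [(0, 4), (4, 4)]
ws = [m.span() for m in finditer_sketch(r"\s", "a b")]
# The #581080 case: a single match, [(1, 2)]
```

With this rule, both of the reported cases terminate.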
To test:

# 581080
import re
list(re.finditer('\s', 'a b'))
# expected: one item list
# bug: hang

# Kevin J. Butler
import re
list(re.finditer('.*', 'asdf'))
# expected: two item list (?)
# bug: hang

Seo Sanghyeon

-------------- next part --------------
? patch
Index: Modules/_sre.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_sre.c,v
retrieving revision 2.99
diff -c -r2.99 _sre.c
*** Modules/_sre.c	26 Jun 2003 14:41:08 -0000	2.99
--- Modules/_sre.c	2 Oct 2003 03:48:55 -0000
***************
*** 3062,3069 ****
          match = pattern_new_match((PatternObject*) self->pattern,
                                    state, status);
!         if ((status == 0 || state->ptr == state->start) &&
!             state->ptr < state->end)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;
--- 3062,3068 ----
          match = pattern_new_match((PatternObject*) self->pattern,
                                    state, status);
!         if (status == 0 || state->ptr == state->start)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;

From oussoren at cistron.nl Thu Oct 2 01:49:37 2003 From: oussoren at cistron.nl (Ronald Oussoren) Date: Thu Oct 2 01:49:46 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <34FEA612-F49C-11D7-862D-0003931CFE24@cistron.nl> On 1 okt 2003, at 22:26, Barry Warsaw wrote: > On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > >> Interesting; I don't recall the last time I had to build my own >> bzip2. I'm pretty sure I didn't do anything special to get it on >> RedHat recently. > > No, I'm sure you didn't. bzip2 decompression should be standard on RH9, > and there's even a tar option to read and write it. 
What I don't know > is whether bz2 decompression is generally available on MacOSX... The bzip command-line utilities are shipped as part of MacOS X. I'm not sure if Stuffit supports bzip-ed archives. Ronald From mwh at python.net Thu Oct 2 06:08:49 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 2 06:08:07 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9027.715275.62943@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 1 Oct 2003 14:56:03 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> Message-ID: <2msmmcymry.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > One thing that puzzled me: Doc/Makefile seems to require that > > Doc/tools is on $PATH, unless I'm misunderstanding something. > > It definately doesn't require that; I've never used Doc/tools/ on ^^^^^^^^^^ > $PATH. One thing it was requiring (only recently) was that there was > a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto > script. Well, OK. I was getting "mkhowto: command not found" messages. > I've removed that constraint for the trunk. Thanks! Cheers, mwh -- Very clever implementation techniques are required to implement this insanity correctly and usefully, not to mention that code written with this feature used and abused east and west is exceptionally exciting to debug. 
-- Erik Naggum on Algol-style "call-by-name" From andymac at bullseye.apana.org.au Thu Oct 2 05:55:10 2003 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Thu Oct 2 09:15:18 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14741.876754.348214@grendel.zope.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> Message-ID: <20031002195207.S85276@bullseye.apana.org.au> On Wed, 1 Oct 2003, Fred L. Drake, Jr. wrote: > Interesting. bzip2 saves half a MB over gzip for the HTML and > PostScript formats. If you're producing PDF, why produce Postscript? AFAIK, Ghostscript digests PDF and can generate Postscript for those that have/want to use a Postscript printer. Around here, print shops seem to actually _prefer_ PDF. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From skip at pobox.com Thu Oct 2 09:43:14 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:43:27 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16252.11122.781855.197919@montanaro.dyndns.org> Dethe> I've never known a *nix box to have trouble with *.zip or known a Dethe> unix user who had trouble with *.zip. So I'd suggest keeping the Dethe> various flavors of documentation, but standardize on zip Dethe> compression. That will at least remove one variable. Agreed. 
We did encounter a problem with a zip file in the SpamBayes group recently which we believe (though haven't confirmed - the OP has apparently gone underground) was related to WinZip problems. As I understand it, if you set an option in WinZip to "flatten" a zip file, all future zip files are also flattened. I guess it's a case of setting that option then poking the "Save Options" or "OK" button, then forgetting that other zip files will have structure which shouldn't be eliminated. Skip From skip at pobox.com Thu Oct 2 09:45:23 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:45:33 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <16252.11251.742634.458564@montanaro.dyndns.org> Barry> What I don't know is whether bz2 decompression is generally Barry> available on MacOSX... Fink is your friend: % type bzip2 bzip2 is /sw/bin/bzip2 so, no, it's not standard on Mac OS X. S From mwh at python.net Thu Oct 2 09:51:58 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 2 09:51:16 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org> (Skip Montanaro's message of "Thu, 2 Oct 2003 08:45:23 -0500") References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <2moewzzr0h.fsf@starship.python.net> Skip Montanaro writes: > Barry> What I don't know is whether bz2 decompression is generally > Barry> available on MacOSX... > > Fink is your friend: > > % type bzip2 > bzip2 is /sw/bin/bzip2 > > so, no, it's not standard on Mac OS X. 
Just because fink supplies something doesn't mean it didn't come with the base install. Jaguar has bzip2 installed; I don't think 10.1 did. Cheers, mwh -- SCSI is not magic. There are fundamental technical reasons why it is necessary to sacrifice a young goat to your SCSI chain now and then. -- John Woods From skip at pobox.com Thu Oct 2 09:58:10 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:58:23 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <16252.12018.287900.741829@montanaro.dyndns.org> Barry> What I don't know is whether bz2 decompression is generally Barry> available on MacOSX... Skip> Fink is your friend: Skip> % type bzip2 Skip> bzip2 is /sw/bin/bzip2 Skip> so, no, it's not standard on Mac OS X. Sorry, should have used "type -a" so I saw the version in /usr/bin. Skip From fred at zope.com Thu Oct 2 10:04:41 2003 From: fred at zope.com (Fred L. Drake, Jr.) Date: Thu Oct 2 10:04:55 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031002195207.S85276@bullseye.apana.org.au> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> <20031002195207.S85276@bullseye.apana.org.au> Message-ID: <16252.12409.982231.274869@grendel.zope.com> Andrew MacIntyre writes: > If you're producing PDF, why produce Postscript? AFAIK, Ghostscript > digests PDF and can generate Postscript for those that have/want to use a > Postscript printer. Around here, print shops seem to actually _prefer_ > PDF. 
I recall a number of people wanting to use the PostScript to drive real PostScript printers directly. That was some time ago; perhaps Ghostscript can handle PDF sufficiently now. If there's no longer any interest in having the PostScript available, I'll be glad to drop that. I guess I really should come up with a script that pulls the relevant stats from the site logs. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pinard at iro.umontreal.ca Thu Oct 2 11:14:24 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 2 11:14:35 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031002195207.S85276@bullseye.apana.org.au> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> <20031002195207.S85276@bullseye.apana.org.au> Message-ID: <20031002151424.GA14552@alcyon.progiciels-bpi.ca> [Andrew MacIntyre] > If you're producing PDF, why produce Postscript? AFAIK, Ghostscript > digests PDF and can generate Postscript for those that have/want to use a > Postscript printer. Around here, print shops seem to actually _prefer_ > PDF. But some of us are not print shops, and have Postscript printers, which are better fed with Postscript, and do not directly accept PDF. PDF to Postscript converters are not 100% dependable, even if they do the job most of the time. Given `.pdf' and `.ps', for one, I would almost always pick the `.ps' file, to avoid possible fights and trouble. -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From martin at v.loewis.de Thu Oct 2 14:39:00 2003 From: martin at v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 2 14:39:27 2003
Subject: [Python-Dev] re.finditer
In-Reply-To: <20031002035954.GA27701@sparcs.kaist.ac.kr>
References: <20031002035954.GA27701@sparcs.kaist.ac.kr>
Message-ID:

Seo Sanghyeon writes:

> But that bug didn't appear after my patch, so I wonder
> why it was introduced in the first place. It seems beyond
> my understanding. Please enlighten me.

Dear Seo Sanghyeon,

Welcome to the list! Please don't post patches here, though - they
*will* get lost. Instead, post them to SF (using a new tracker item),
and discuss them here if you want.

I don't have the time to read your patch right now, so I cannot
comment on the issue itself.

Regards,
Martin

From Jack.Jansen at cwi.nl Thu Oct 2 18:00:06 2003
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Oct 2 18:00:18 2003
Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages
In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org>
References: <16251.7839.871263.562935@grendel.zope.com>
	<75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com>
	<16251.14319.693162.119201@grendel.zope.com>
	<1065040005.15765.18.camel@geddy>
	<16252.11251.742634.458564@montanaro.dyndns.org>
Message-ID:

There's always machines out there that won't support newer formats out
of the box, so may I suggest the following course of action:

1. For now we add bz2 compression, and put that at the top of the
   list, with gz far below it. If we want to get real fancy we could
   even put it behind another link "old formats".

2. At some point in the future we look at the http logs to see how
   many people still use the older format.

.Z files were still very useful to some people long after .gz had
become the norm, just because they were stuck on old boxes. And if
Python goes out of its way to remain buildable on various old boxes
as-is it would be silly if we would require people to download
third-party stuff just to decode the documentation...
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From pinard at iro.umontreal.ca Thu Oct 2 20:39:41 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 2 20:39:50 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <20031003003941.GA17401@alcyon.progiciels-bpi.ca> [Jack Jansen] > .Z files were still very useful to some people long after .gz had > become the norm, just because they were stuck on old boxes. Do anybody remember `.z' files? (`pack' and `unpack' were the tools, unless I'm mistaken). I'm _not_ suggesting that they get supported :-). Despite `.Z' is not as old as `.z', they are not very far, once added some perspective. -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From azaidi at vsnl.com Thu Oct 2 20:05:24 2003 From: azaidi at vsnl.com (Arsalan Zaidi) Date: Thu Oct 2 21:07:12 2003 Subject: [Python-Dev] Any movement on a SIG for web lib enchancements? Message-ID: <008501c38942$1336a120$b9479cca@LocalHost> There was some discussion about this a few weeks ago. But there's still no SIG. Is anyone working on this yet? --Arsalan From janssen at parc.com Thu Oct 2 21:18:11 2003 From: janssen at parc.com (Bill Janssen) Date: Thu Oct 2 21:18:32 2003 Subject: [Python-Dev] Any movement on a SIG for web lib enchancements? In-Reply-To: Your message of "Thu, 02 Oct 2003 17:05:24 PDT." <008501c38942$1336a120$b9479cca@LocalHost> Message-ID: <03Oct2.181816pdt."58611"@synergy1.parc.xerox.com> Yes, I've been working on a charter. I'll put out a version for folks to look at tomorrow (probably announced on the Meta-SIG; Python-Dev really isn't the right place?). 
Bill > There was some discussion about this a few weeks ago. But there's still no > SIG. > > Is anyone working on this yet? > > --Arsalan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From fdrake at acm.org Thu Oct 2 23:54:10 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 2 23:54:27 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <16252.62178.436842.527925@grendel.zope.com> Jack Jansen writes: > 1. For now we add bz2 compression, and put that at the top of the > list, with gz far below it. If we want to get real fancy we > could even put it behind another link "old formats". At this point, we've been providing bzip2-compressed tarballs for three years; they became available with Python 1.6 (does anyone even remember that release?). > 2. At some point in the future we look at the http logs to see > how many people still use the older format. I'm hoping to write the script to do that this weekend. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From anthony at interlink.com.au Fri Oct 3 04:17:42 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Oct 3 04:19:51 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.16096.224422.956995@grendel.zope.com> Message-ID: <200310030817.h938Hgtk028368@localhost.localdomain> The current documentation release tools don't build the latex packages. I tried using the Makefile targets, but they seemed to want to check out a HEAD revision of the dist/src/Doc directory. Oops. I've commented out the latex row of the download table for now - Fred, can you look into this and fix? 
Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at python.org Fri Oct 3 04:35:58 2003 From: anthony at python.org (Anthony Baxter) Date: Fri Oct 3 04:37:58 2003 Subject: [Python-Dev] RELEASED Python 2.3.2 (final) Message-ID: <200310030835.h938ZwaN028812@localhost.localdomain> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.3.2 (final). Python 2.3.2 is a bug-fix release, to repair a couple of build problems and packaging errors in Python 2.3.1. For more information on Python 2.3.2, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.3.2 Highlights of this new release include: - A bug in autoconf that broke building on HP/UX systems is fixed. - A bug in the Python configure script that meant os.fsync() was never available is fixed. Highlights of the previous major Python release (2.3) are available from the Python 2.3 page, at http://www.python.org/2.3/highlights.html Many apologies for the flaws in 2.3.1 release. Hopefully the new release procedures should stop this happening again. Enjoy the new release, Anthony Anthony Baxter anthony@python.org Python 2.3.2 Release Manager (on behalf of the entire python-dev team) From fdrake at acm.org Fri Oct 3 10:50:04 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Oct 3 10:50:35 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <200310030817.h938Hgtk028368@localhost.localdomain> References: <16251.16096.224422.956995@grendel.zope.com> <200310030817.h938Hgtk028368@localhost.localdomain> Message-ID: <16253.35996.751345.757001@grendel.zope.com> Anthony Baxter writes: > The current documentation release tools don't build the latex packages. > I tried using the Makefile targets, but they seemed to want to check out > a HEAD revision of the dist/src/Doc directory. Oops. Argh. Yes; this is a problem with mksourcepkg on branches. 
I think I can improve that a bit, but the best way seems to be running
it with a second argument giving the specific tag we're interested in.
The script also runs into the "we can't keep our anonymous CVS servers
up to date" problem, so I'll mod my local copy to not try to use the
anonymous servers for now.

> I've commented out the latex row of the download table for now - Fred, can
> you look into this and fix?

I'll have them up shortly.

-Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From python-kbutler at sabaydi.com Fri Oct 3 11:26:01 2003
From: python-kbutler at sabaydi.com (Kevin J. Butler)
Date: Fri Oct 3 11:26:34 2003
Subject: [Python-Dev] re.finditer
In-Reply-To:
References:
Message-ID: <3F7D9509.7010802@sabaydi.com>

From: Seo Sanghyeon
>This is my first mail to python-dev.

Welcome, nice way to begin. :-)

>Attached one line patch fixes re.finditer bug reported by
>Kevin J. Butler. I read cvs log to find out why this code is
>introduced, and it seems to be related to SF bug #581080.

Excellent, I'll give it a shot.

Meanwhile, I filed a bug: 817234
http://sourceforge.net/tracker/index.php?func=detail&aid=817234&group_id=5470&atid=105470

I included your post & suggested patch.

Thanks!

kb

From python-kbutler at sabaydi.com Fri Oct 3 14:13:16 2003
From: python-kbutler at sabaydi.com (Kevin J. Butler)
Date: Fri Oct 3 14:13:52 2003
Subject: Resolution: was Re: [Python-Dev] re.finditer
In-Reply-To:
References:
Message-ID: <3F7DBC3C.6040003@sabaydi.com>

Summary: bug 817234
http://sourceforge.net/tracker/index.php?func=detail&aid=817234&group_id=5470&atid=105470

In Python 2.3 and 2.3.1, finditer does not raise StopIteration if the
end of the string matches with an empty match. That is, the following
code will loop forever:

>>> import re
>>> i = re.finditer( ".*", "asdf" )
>>> for m in i: print m.span()
...
(0, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)

Seo Sanghyeon posted what appears to be a correct fix.
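For reference, the runaway behavior can be reproduced and avoided without touching the regex engine at all. The sketch below is a hand-rolled workaround that always makes forward progress, not Seo's actual one-line patch; `finditer_workaround` is a name invented here for illustration.

```python
import re

def finditer_workaround(pattern, string):
    """Like re.finditer, but steps past empty matches by hand so an
    empty match at the end of the string cannot repeat forever.
    (A workaround sketch, not the committed fix.)"""
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(string):
        m = regex.search(string, pos)
        if m is None:
            break
        yield m
        # after a zero-width match, advance one character so the
        # scan always makes progress
        pos = m.end() + 1 if m.end() == m.start() else m.end()

spans = [m.span() for m in finditer_workaround(".*", "asdf")]
print(spans)  # [(0, 4), (4, 4)] -- terminates instead of repeating (4, 4)
```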
The code was introduced in the fix for bug 581080 http://sourceforge.net/tracker/index.php?func=detail&aid=581080&group_id=5470&atid=105470 but removing this line does not re-introduce that bug. Thanks, and kudos to Seo... kb From anthony at interlink.com.au Fri Oct 3 20:08:16 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Oct 3 20:10:23 2003 Subject: [Python-Dev] 2.3.3 plans Message-ID: <200310040008.h9408HtM008544@localhost.localdomain> I'm currently thinking of doing 2.3.3 in about 3 months time. My focus on 2.3.3 will be on fixing the various build glitches that we have on various platforms - I'd like to see 2.3.3 build on as many boxes as possible, "out of the box". Anthony From jason at mastaler.com Sat Oct 4 14:01:39 2003 From: jason at mastaler.com (Jason R. Mastaler) Date: Sat Oct 4 14:01:47 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.2 (final) References: <200310030835.h938ZwaN028812@localhost.localdomain> Message-ID: I wanted to say thanks to Anthony and everyone else for responding so quickly to our concerns with the 2.3.1 release. It's greatly appreciated! From python at discworld.dyndns.org Sat Oct 4 16:58:37 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Sat Oct 4 16:53:55 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.2 (final) In-Reply-To: ; from jason@mastaler.com on Sat, Oct 04, 2003 at 12:01:39PM -0600 References: <200310030835.h938ZwaN028812@localhost.localdomain> Message-ID: <20031004145837.A13335@discworld.dyndns.org> Jason R. Mastaler wrote: > I wanted to say thanks to Anthony and everyone else for responding so > quickly to our concerns with the 2.3.1 release. It's greatly > appreciated! Hear, hear. Thanks to all involved for their hard work. It's much easier when all you have to do is complain about it :). 
Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon  GPL'ed software available at:
http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------

From cstork at ics.uci.edu Sat Oct 4 19:40:00 2003
From: cstork at ics.uci.edu (Christian Stork)
Date: Sat Oct 4 19:40:55 2003
Subject: [Python-Dev] Efficient predicates for the standard library
Message-ID: <20031004234000.GG25813@ics.uci.edu>

Hi everybody,

This is my first post to python-dev and mailman told me to introduce
myself... I'm a computer science grad student at UC Irvine and I've
been programming in python for quite some time now. I'm originally
from Germany where I studied math together with Marc-Andre Lemburg,
who should be somewhat known on this list. ;-)

I'd like to advocate the inclusion of efficient (ie iterator-based)
predicates to the standard library. If that's asking too much :-) then
consider this just a suggestion for updating the documentation of
itertools.

My reasoning is that these predicates should be used in many places,
especially as part of assert statements. IMHO lowering the burden to
use assert statements is always a good idea.

The examples given in itertools' documentation are a good starting
point. More specifically I'm talking about the following:

def all(pred, seq):
    "Returns True if pred(x) is True for every element in the iterable"
    return False not in imap(pred, seq)

def some(pred, seq):
    "Returns True if pred(x) is True for at least one element in the iterable"
    return True in imap(pred, seq)

def no(pred, seq):
    "Returns True if pred(x) is False for every element in the iterable"
    return True not in imap(pred, seq)

But before including these functions, I would like to propose two
changes.

1. Meaning of None as predicate

The meaning of None as pred.
The above definitions use itertools.imap's interpretation of None as
pred arguments, ie None is interpreted as the function that returns a
tuple of its arguments. Therefore all(None, ) will always return True.
Similar reasoning renders None as pred useless for some() and no().

I would like to propose pred=None's meaning to be the same as for
itertools.ifilter, ie None is interpreted as the identity function,
which--in this context--is the same as the bool() function. Now
all(None, seq) is true iff all of seq's elements are interpreted as
True by bool(). This is potentially valuable information. ;-)

2. Argument order

Now that there's a useful default meaning for pred, we should give it
a default and make it an optional argument. For this the order of
arguments must be reversed. This is different from itertools'
consistent use of iterables as last arguments. I don't know if this is
relevant here. Anyway, since predicates are in general more useful
like this I think it's the better choice.

So, I propose an implementation like this:

def all(seq, pred=bool):
    return False not in imap(pred, seq)

def some(seq, pred=bool):
    return True in imap(pred, seq)

def no(seq, pred=bool):
    return True not in imap(pred, seq)

[ You can see now that the meaning of pred == None was just a
strawman. ;-) ]

For enhanced assert support I'd advocate additional predicates for
easy and fast type checking, eg allListType, allIntType, etc.

Maybe all this should go into its own `preds' module?
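Christian's final proposal runs as-is in modern Python once `imap` is replaced by the lazy built-in `map` (its Python 3 equivalent). A minimal sketch, with the functions renamed with a trailing underscore so they don't shadow today's `all`/`any` builtins:

```python
# Christian's proposed signatures, with lazy map() standing in for
# 2.x itertools.imap; trailing underscores avoid shadowing builtins.
def all_(seq, pred=bool):
    return False not in map(pred, seq)

def some_(seq, pred=bool):
    return True in map(pred, seq)

def no_(seq, pred=bool):
    return True not in map(pred, seq)

assert all_([1, 2, 3])                        # every element is truthy
assert some_([0, "", 5])                      # at least one truthy element
assert no_([0, "", None])                     # no truthy element at all
assert all_([2, 4, 6], lambda x: x % 2 == 0)  # explicit predicate
```

With `pred=bool` as the default, the membership tests compare against canonical `True`/`False` values, so the `in` formulation behaves correctly here.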
-- 
Chris Stork   <><><><><><><><><><><><><>   http://www.ics.uci.edu/~cstork/
OpenPGP fingerprint:  B08B 602C C806 C492 D069  021E 41F3 8C8D 50F9 CA2F

From python at rcn.com Sat Oct 4 21:24:48 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sat Oct 4 21:25:19 2003
Subject: [Python-Dev] Efficient predicates for the standard library
In-Reply-To: <20031004234000.GG25813@ics.uci.edu>
Message-ID: <000001c38adf$77f39ca0$e841fea9@oemcomputer>

[Christian Stork]
> So, I propose an implementation like this:
>
> def all(seq, pred=bool):
>     return False not in imap(pred, seq)
>
> def some(seq, pred=bool):
>     return True in imap(pred, seq)
>
> def no(seq, pred=bool):
>     return True not in imap(pred, seq)

The examples are all useful by themselves, but their primary purpose
is to teach how to use the basic tools. Accordingly, the examples
should not become complex and they should tie in as well as possible
to previously existing knowledge (i.e. common patterns for argument
order).

Your proposal is a net gain and I will change the docs as requested.
Having bool() as a default makes the functions more useful and less
error prone. Also, it increases instructional value by giving an
example of a predicate (for who skipped class that day). Also, your
proposed argument order matches common mathematical usage (i.e. All
elements in a such that is true).

For your own purposes, consider using a faster implementation:

def Some(seq, pred=None):
    for x in ifilter(None, seq):
        return True
    return False

All() and No() have similar fast implementations using ifilterfalse()
and reversing the return values.

> For enhanced assert support I'd advocate additional predicates for
> easy and fast type checking, eg allListType, allIntType, etc.

> Maybe all this should go into it's own `preds' module?
Or maybe not ;-)

Somewhere, Tim has an eloquent and pithy saying which roughly
translates to:

"""Adding trivial functions is a net loss because the burden of
learning or becoming aware of them (and their implementation nuances)
will far exceed the microscopic benefit of saving a line or two that
could be coded on the spot as needed."""

In this case, a single example in the docs may suffice:

    if False in imap(isinstance, seqn, repeat(int)):
        raise TypeError("All arguments must be of type int")

Raymond Hettinger

From fincher.8 at osu.edu Sun Oct 5 00:26:48 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Sat Oct 4 23:28:20 2003
Subject: [Python-Dev] Efficient predicates for the standard library
In-Reply-To: <20031004234000.GG25813@ics.uci.edu>
References: <20031004234000.GG25813@ics.uci.edu>
Message-ID: <200310050026.48218.fincher.8@osu.edu>

On Saturday 04 October 2003 07:40 pm, Christian Stork wrote:
> I'd like to advocate the inclusion of efficient (ie iterator-based)
> predicates to the standard library.

I agree. At the very least, I think such predicates should be in the
itertools module.

> My reasoning is that these predicate should be used in many places,
> especially as part of assert statements.

One of the places where I use them most, to be sure :)

> def all(pred, seq):
>     "Returns True if pred(x) is True for every element in the iterable"
>     return False not in imap(pred, seq)
>
> def some(pred, seq):
>     "Returns True if pred(x) is True at least one element in the iterable"
>     return True in imap(pred, seq)
>
> def no(pred, seq):
>     "Returns True if pred(x) is False for every element in the iterable"
>     return True not in imap(pred, seq)

I would instead call some "any" (it's more standard among the
functional languages I've worked with), and I wouldn't bother with
"no," since it's exactly the same as "not any" (or "not some," as the
case may be).
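Raymond's faster, short-circuiting formulation can be sketched with modern itertools (`filter`/`filterfalse` are the Python 3 spellings of 2.x `ifilter`/`ifilterfalse`); note that for a non-default predicate to take effect it must actually be passed into the filter:

```python
from itertools import filterfalse

def any_(seq, pred=None):
    # filter() with pred=None keeps truthy items; the loop stops at
    # the first one, giving short-circuit behavior.
    for _ in filter(pred, seq):
        return True
    return False

def all_(seq, pred=None):
    # filterfalse() yields items where pred is falsy; seeing even one
    # disproves "all".
    for _ in filterfalse(pred, seq):
        return False
    return True

print(any_([0, 0, 3]), all_([2, 4], lambda x: x % 2 == 0))  # True True
```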
As Raymond Hettinger already mentioned, obviously such predicates over sequences should exhibit short-circuit behavior -- any should return with the first True response and all should return with the first False response. Jeremy From cstork at ics.uci.edu Sun Oct 5 05:57:27 2003 From: cstork at ics.uci.edu ('Christian Stork') Date: Sun Oct 5 05:58:22 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <000001c38adf$77f39ca0$e841fea9@oemcomputer> References: <20031004234000.GG25813@ics.uci.edu> <000001c38adf$77f39ca0$e841fea9@oemcomputer> Message-ID: <20031005095727.GB32122@ics.uci.edu> On Sat, Oct 04, 2003 at 09:24:48PM -0400, Raymond Hettinger wrote: ... > Your proposal is a net gain and I will change the docs as requested. > Having bool() as a default makes the functions more useful and less > error prone. Also, it increases instructional value by giving an > example of a predicate (for who skipped class that day). Also, your > proposed argument order matches common mathematical usage (i.e. All > elements in a such that is true). Thanks, I agree. > For your own purposes, consider using a faster implementation: > > def Some(seq, pred=None): > for x in ifilter(None, seq): > return True > return False > > All() and No() have similar fast implementations using ifilterfalse() > and reversing the return values. Interesting, this is almost exactly what my first attempt at this looked like. Then I saw the examples in the doc and changed to the proposed ones. Honestly, I assumed that x in iterable has a short-circuit implementation. Why doesn't it? > > For enhanced assert support I'd advocate additional predicates for > > easy and fast type checking, eg allListType, allIntType, etc. > > > Maybe all this should go into it's own `preds' module? 
> > Or maybe not ;-) > > Somewhere, Tim has a eloquent and pithy saying which roughly translates > to: > > """Adding trivial functions is a net loss because the burden of learning > or becoming aware of them (and their implementation nuances) will far > exceed the microscopic benefit of saving a line or two that could be > coded on the spot as needed.""" I hear you/him :-) and I'd be fine if you just change the docs. I also agree that introducing predicates a la allInts seems like a bad idea since it's overspecialisation. (I could think of better ways to do type checking anyway.) Let me just give you the reasons (in no particular order) for my suggestion to include the `all' and `some/any' predicates: 1. Efficiency Maybe I'm a bit naive here, but it seems to me that since these predicates involve tight inner loops they offer good potential for speedup, especially when used often and over many iterations. 2. Readabilty If we offer universally-used predicates with succinct names which are available as part of the "batteries included" then that increases readabilty of code a lot. 3. Asserts 1. & 2. encourage the use of asserts, which increases code quality. 4. It's *not* trivial! Contrary to what you imply it's not trivial for everybody to just write efficient and well designed predicates with well-chosen names. This discussion is the proof. :-) > In this case, a single example in the docs may suffice: > > if False in imap(isinstance, seqn, repeat(int)): > raise TypeError("All arguments must be of type int") Just that this would be too much to type for me if I only wanted to quickly (and without too much runtime overhead) check on myself. I'd prefer assert isinstance(seq, [int]) ...but that doesn't exist yet. ...in Python, that is. ;-) Anyway, thanks for the itertools package. I especially enjoy the parts that remind me of Haskell! 
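Raymond's type-check one-liner quoted above runs unchanged in modern Python once `imap` becomes the lazy built-in `map`; `require_ints` is a wrapper name invented here for illustration:

```python
from itertools import repeat

def require_ints(seqn):
    # lazy map() plays the role of 2.x itertools.imap here; isinstance
    # returns a real bool, so membership against False is reliable
    if False in map(isinstance, seqn, repeat(int)):
        raise TypeError("All arguments must be of type int")

require_ints([1, 2, 3])        # passes silently
try:
    require_ints([1, "two", 3])
except TypeError as e:
    print(e)                   # All arguments must be of type int
```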
-- Chris Stork <><><><><><><><><><><><><> http://www.ics.uci.edu/~cstork/ OpenPGP fingerprint: B08B 602C C806 C492 D069 021E 41F3 8C8D 50F9 CA2F From cstork at ics.uci.edu Sun Oct 5 05:59:54 2003 From: cstork at ics.uci.edu (Christian Stork) Date: Sun Oct 5 06:00:49 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <200310050026.48218.fincher.8@osu.edu> References: <20031004234000.GG25813@ics.uci.edu> <200310050026.48218.fincher.8@osu.edu> Message-ID: <20031005095954.GC32122@ics.uci.edu> On Sun, Oct 05, 2003 at 12:26:48AM -0400, Jeremy Fincher wrote: ... > > def some(pred, seq): > > "Returns True if pred(x) is True at least one element in the iterable" > > return True in imap(pred, seq) > > > > def no(pred, seq): > > "Returns True if pred(x) is False for every element in the iterable" > > return True not in imap(pred, seq) > > I would instead call some "any" (it's more standard among the functional > languages I've worked with), and I wouldn't bother with "no," since it's > exactly the same as "not any" (or "not some," as the case may be). Yep, seems better. -- Chris Stork <><><><><><><><><><><><><> http://www.ics.uci.edu/~cstork/ OpenPGP fingerprint: B08B 602C C806 C492 D069 021E 41F3 8C8D 50F9 CA2F From skip at manatee.mojam.com Sun Oct 5 08:01:04 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 5 08:01:15 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310051201.h95C14wE019221@manatee.mojam.com> Bug/Patch Summary ----------------- 531 open / 4217 total bugs (+26) 206 open / 2405 total patches (+5) New Bugs -------- robotparser interactively prompts for username and password (2003-09-28) http://python.org/sf/813986 Grouprefs in lookbehind assertions (2003-09-28) http://python.org/sf/814253 LDFLAGS ignored in Makefile (2003-09-28) http://python.org/sf/814259 new.function raises TypeError for some strange reason... 
(2003-09-28) http://python.org/sf/814266 5454 - documentation wrong for ossaudiodev mixer device (2003-09-29) http://python.org/sf/814606 'import Tkinter' causes windows missing-DLL popup (2003-09-29) http://python.org/sf/814654 RedHat 9 blows up at dlclose of pyexpat.so (2003-09-29) http://python.org/sf/814726 OSF/1 test_dbm segfaults (2003-09-30) http://python.org/sf/814996 bug with ill-formed rfc822 attachments (2003-09-30) http://python.org/sf/815563 thread unsafe file objects cause crash (2003-09-30) http://python.org/sf/815646 test_locale and en_US (2003-10-01) http://python.org/sf/815668 SCO_SV: many modules cannot be imported (2003-10-01) http://python.org/sf/815753 tkMessageBox functions reject "type" and "icon" keywords (2003-10-01) http://python.org/sf/815924 ImportError: No module named _socket (2003-10-01) http://python.org/sf/815999 Missing import in email example (2003-10-01) http://python.org/sf/816344 Fatal Python error: GC object already tracked (2003-10-02) http://python.org/sf/816476 mark deprecated modules in indexes (2003-10-02) http://python.org/sf/816725 webbrowser.open hangs under certain conditions (2003-10-02) http://python.org/sf/816810 term.h present but cannot be compiled (2003-10-02) http://python.org/sf/816929 Float Multiplication (2003-10-02) http://python.org/sf/816946 invalid \U escape gives 0=length unistr (2003-10-03) http://python.org/sf/817156 Email.message example missing arg (2003-10-03) http://python.org/sf/817178 re.finditer hangs on final empty match (2003-10-03) http://python.org/sf/817234 Google kills socket lookup (2003-10-04) http://python.org/sf/817611 Need "new style note" (2003-10-04) http://python.org/sf/817742 select behavior undefined for empty lists (2003-10-04) http://python.org/sf/817920 ossaudiodev FileObject does not support closed const (2003-10-04) http://python.org/sf/818006 installer wakes up Windows File Protection (2003-10-05) http://python.org/sf/818029 use Windows' default programs location. 
(2003-10-05) http://python.org/sf/818030 os.listdir on empty strings. Inconsistent behaviour. (2003-10-05) http://python.org/sf/818059 mailbox._Subfile readline() bug (2003-10-05) http://python.org/sf/818065 New Patches ----------- invalid use of setlocale (2003-09-11) http://python.org/sf/804543 deprecated modules (2003-09-29) http://python.org/sf/814560 Extension logging.handlers.SocketHandler (2003-10-01) http://python.org/sf/815911 popen2 work, fixes bugs 768649 and 761888 (2003-10-01) http://python.org/sf/816059 urllib2.URLError don't calll IOError.__init__ (2003-10-02) http://python.org/sf/816787 dynamic popen2 MAXFD (2003-10-03) http://python.org/sf/817329 urllib2 does not allow for absolute ftp paths (2003-10-03) http://python.org/sf/817379 sprout more file operations in SSLFile, fixes 792101 (2003-10-04) http://python.org/sf/817854 Closed Bugs ----------- urllib/urllib2(?) timeouts (2003-09-10) http://python.org/sf/803634 invalid use of setlocale (2003-09-11) http://python.org/sf/804543 Crash if getvar of a non-existent Tcl variable (2003-09-16) http://python.org/sf/807314 exit() raises exception (2003-09-21) http://python.org/sf/810214 2.3.1 configure bug (2003-09-23) http://python.org/sf/811028 HP/UX vs configure (2003-09-23) http://python.org/sf/811160 webbrowser.open_new() opens in an existing browser window (2003-09-24) http://python.org/sf/812089 Closed Patches -------------- popen fix for multiple quoted arguments (2001-09-29) http://python.org/sf/466451 Add IPPROTO_IPV6 option to the socketmodule (2003-09-27) http://python.org/sf/813445 entry size for cursors (2003-09-27) http://python.org/sf/813877 From python at rcn.com Sun Oct 5 11:46:29 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 5 11:46:58 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031005095727.GB32122@ics.uci.edu> Message-ID: <000001c38b57$d7e91c20$e841fea9@oemcomputer> > Honestly, I assumed that > > x in iterable > > has a 
short-circuit implementation. Why doesn't it? It does. The ifilter() version is faster only because it doesn't have to continually return values to the 'in' iterator. The speedup is a small constant factor. > Let me just give you the reasons (in no particular order) for my > suggestion to include the `all' and `some/any' predicates: > > 1. Efficiency > Maybe I'm a bit naive here, but it seems to me that since these > predicates involve tight inner loops they offer good potential for > speedup, especially when used often and over many iterations. You're guessing incorrectly. The pure python versions use underlying itertools which run at full C speed. You cannot beat the ifilter() version.. > 2. Readabilty > If we offer universally-used predicates with succinct names which are > available as part of the "batteries included" then that increases > readabilty of code a lot. I put the code in the docs in a form so that people can cut and paste the function definitions it as needed. Then, they can use all(), any(), or no() to their heart's content. > 4. It's *not* trivial! > Contrary to what you imply it's not trivial for everybody to just write > efficient and well designed predicates with well-chosen names. This > discussion is the proof. :-) Cut and paste is easy. Raymond From python at rcn.com Sun Oct 5 11:49:28 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 5 11:49:58 2003 Subject: [Python-Dev] Efficient predicates for the standard library Message-ID: <000101c38b58$42d5da00$e841fea9@oemcomputer> > Honestly, I assumed that > > x in iterable > > has a short-circuit implementation. Why doesn't it? It does. The ifilter() version is faster only because it doesn't have to continually return values to the 'in' iterator. The speedup is a small constant factor. > Let me just give you the reasons (in no particular order) for my > suggestion to include the `all' and `some/any' predicates: > > 1. 
Efficiency > Maybe I'm a bit naive here, but it seems to me that since these > predicates involve tight inner loops they offer good potential for > speedup, especially when used often and over many iterations. You're guessing incorrectly. The pure python versions use underlying itertools which loop at full C speed. You cannot beat the ifilter() version. > 2. Readabilty > If we offer universally-used predicates with succinct names which are > available as part of the "batteries included" then that increases > readabilty of code a lot. I put the code in the docs in a form so that people can cut and paste the function definitions it as needed. Then, they can use all(), any(), or no() to their heart's content. > 4. It's *not* trivial! > Contrary to what you imply it's not trivial for everybody to just write > efficient and well designed predicates with well-chosen names. This > discussion is the proof. :-) Cut and paste is your friend. Raymond From jeremy at alum.mit.edu Mon Oct 6 01:18:27 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Oct 6 01:19:57 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python Message-ID: <1065417507.2095.5.camel@localhost.localdomain> test_bsddb hangs for me everytime. This is a current CVS python with BerkeleyDB 4.1.25. I've tried commenting out test_pop and test_mapping_iteration_methods, but it still hangs somewhere. localhost:~/src/python/build-pydebug> ./python ../Lib/test/test_bsddb.py -v test_change (__main__.TestBTree) ... ok test_clear (__main__.TestBTree) ... ok test_close_and_reopen (__main__.TestBTree) ... ok test_contains (__main__.TestBTree) ... ok test_first_next_looping (__main__.TestBTree) ... ok test_get (__main__.TestBTree) ... ok test_getitem (__main__.TestBTree) ... ok test_has_key (__main__.TestBTree) ... ok test_keyordering (__main__.TestBTree) ... ok test_len (__main__.TestBTree) ... ok test_mapping_iteration_methods (__main__.TestBTree) ... ok test_pop (__main__.TestBTree) ... 
ok strace says: stat64("./@test", 0xbfffc980) = -1 ENOENT (No such file or directory) stat64("./__db.@test.", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 stat64("./@test", 0xbfffc830) = -1 ENOENT (No such file or directory) rename("./__db.@test.", "./@test") = 0 stat64("./@test", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 open("./@test", O_RDWR|O_LARGEFILE) = 3 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0 fstat64(3, {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 pread(3, "\0\0\0\0\1\0\0\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0\0"..., 4096, 0) = 4096 pread(3, "\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\1\5\0"..., 4096, 4096) = 4096 futex(0x4055ad40, FUTEX_WAIT, 0, NULL Jeremy From aleaxit at yahoo.com Mon Oct 6 03:17:16 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 6 03:17:24 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031005095727.GB32122@ics.uci.edu> References: <20031004234000.GG25813@ics.uci.edu> <000001c38adf$77f39ca0$e841fea9@oemcomputer> <20031005095727.GB32122@ics.uci.edu> Message-ID: <200310060917.16869.aleaxit@yahoo.com> On Sunday 05 October 2003 11:57 am, 'Christian Stork' wrote: > On Sat, Oct 04, 2003 at 09:24:48PM -0400, Raymond Hettinger wrote: > ... > > > Your proposal is a net gain and I will change the docs as requested. > > Having bool() as a default makes the functions more useful and less > > error prone. Also, it increases instructional value by giving an > > example of a predicate (for who skipped class that day). Also, your > > proposed argument order matches common mathematical usage (i.e. All > > elements in a such that is true). > > Thanks, I agree. Adding to this chorus of agreement, I'd also point out that the form with seq first and pred second ALSO agrees with the usage in list comprehensions of analogous semantics -- [x for x in seq if pred(x)] . 
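[Editorial aside: the predicates under discussion can be sketched concretely. The following is a hypothetical rendering, not Raymond's exact doc text, using the seq-first, pred-second order Alex endorses; in modern Python 3 the built-in filter() is lazy, so it plays the role of 2.3's itertools.ifilter(). The trailing underscores in the names only avoid shadowing the any()/all() built-ins that later Python grew.]

```python
def any_(seq, pred=bool):
    # Short-circuits: filter() is lazy, so iteration stops at the
    # first item for which pred is true.
    for _ in filter(pred, seq):
        return True
    return False

def all_(seq, pred=bool):
    # False as soon as any item fails the predicate.
    for item in seq:
        if not pred(item):
            return False
    return True

def no_(seq, pred=bool):
    # True if no item satisfies the predicate.
    return not any_(seq, pred)
```

For example, `all_([2, 4, 6], lambda x: x % 2 == 0)` is True, while `no_([0, '', None])` is True because every item is falsy.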
> > For your own purposes, consider using a faster implementation: > > > > def Some(seq, pred=None): > > for x in ifilter(None, seq): I suspect that what Raymond means here is ifilter(pred, seq) -- the way he's written it, the pred argument would be ignored. > > return True > > return False > > > > All() and No() have similar fast implementations using ifilterfalse() > > and reversing the return values. > > Interesting, this is almost exactly what my first attempt at this looked > like. Then I saw the examples in the doc and changed to the proposed > ones. > > Honestly, I assumed that > > x in iterable > > has a short-circuit implementation. Why doesn't it? It does. But (assuming the occurrence of x is the Mth out of N items in seq), "return True in imap(pred, seq)" must yield M times and perform M comparisons, while "for x in ifilter(pred, seq): return True" -- while still performing M comparisons, inside ifilter -- yields only once, so it saves performing M-1 yields. The second form is also more tolerant in what it accepts (which is something of a golden rule...) -- it does not malfunction quietly if pred returns true/false values that differ from the "canonical" True and False instances of bool. In some applications, the resulting ability to use an existing pred function directly rather than wrapping it into a bool(...) may further accelerate things. Alex From aleaxit at yahoo.com Mon Oct 6 03:28:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 6 03:28:43 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <000001c38adf$77f39ca0$e841fea9@oemcomputer> References: <000001c38adf$77f39ca0$e841fea9@oemcomputer> Message-ID: <200310060928.38292.aleaxit@yahoo.com> On Sunday 05 October 2003 03:24 am, Raymond Hettinger wrote: ... 
> In this case, a single example in the docs may suffice: > > if False in imap(isinstance, seqn, repeat(int)): > raise TypeError("All arguments must be of type int") If assert is seen as the typical way to posit such checks, then maybe: assert False not in imap(isinstance, seq, repeat(int)), "All args must be int" might be considered to be a didactically preferable example. Personally, I would not really consider this optimal, with either way of expression. Python's error messages are slowly but surely getting better in that, instead of just saying that something (e.g.) "must be int" (and leaving the coder in the dark about WHAT it was instead), more and more such messages are saying "must be int, not str" or the like. Giving examples that lead to less-informative error messages is, IMHO, not a good idea; to give more information in case of errors, of course, does require a bit more code in the check. I guess for sanity checks that are meant to never really trigger an error message, one might be inclined to ignore this issue -- at least until the first time one such message does trigger and one has to go back and re-instrument the checks to be more informative;-). Sorry for the aside, but I care more about helpful error messages than about "efficient predicates", where the efficiency gain bids fair to be a micro-optimization... Alex From mwh at python.net Mon Oct 6 05:44:39 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 6 05:43:51 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <200310040008.h9408HtM008544@localhost.localdomain> (Anthony Baxter's message of "Sat, 04 Oct 2003 10:08:16 +1000") References: <200310040008.h9408HtM008544@localhost.localdomain> Message-ID: <2m3ce6zomw.fsf@starship.python.net> Anthony Baxter writes: > I'm currently thinking of doing 2.3.3 in about 3 months time. 
My focus > on 2.3.3 will be on fixing the various build glitches that we have on > various platforms - I'd like to see 2.3.3 build on as many boxes as > possible, "out of the box". This sounds good. It would be nice to have a more sustained effort this time, and also to get on board people who know the problem platforms (as opposed to "logging on to a testdrive machine and flailing"). What platforms have issues that we know about? There's old SCO (but the fix for that is known), HPUX/ia64, various oddities on Irix. Cheers, mwh -- You owe the Oracle a star-spangled dunce cap. -- Internet Oracularity Internet Oracularity #1299-08 From gmccaughan at synaptics-uk.com Mon Oct 6 07:19:54 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Mon Oct 6 07:20:29 2003 Subject: [Python-Dev] Efficient predicates for the standard library Message-ID: <200310061219.54092.gmccaughan@synaptics-uk.com> Chris Stork wrote: > Now that there's a useful default meaning for pred, we should give > it a default and make it an optional argument. For this the order of > arguments must be reversed. This is different from itertools' consistent > use of iterables as last arguments. I don't know if this is relevant > here. Anyway, since predicates are in general more useful like this > I think it's the better choice. Perhaps that's true for a piece of code given as an example. I don't think it would be sensible if, as you propose, these functions were to be put in the standard library, because there's something better to do with the default args for real applications. def any(pred, *iterables): I think the ability to work with multiple sequences (and not to have to use the argument order iter1, pred, iter2, ...) is more important than the ability to avoid typing "bool,". Another option would be def any(*iterables, pred=bool): for items in imap(None, *iterables): if pred(*items): return True return False which looks to me like it offers the best of both worlds.
-- g From cstork at ics.uci.edu Mon Oct 6 07:44:28 2003 From: cstork at ics.uci.edu (Christian Stork) Date: Mon Oct 6 07:45:30 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <200310061219.54092.gmccaughan@synaptics-uk.com> References: <200310061219.54092.gmccaughan@synaptics-uk.com> Message-ID: <20031006114428.GA7899@ics.uci.edu> On Mon, Oct 06, 2003 at 12:19:54PM +0100, Gareth McCaughan wrote: ... > def any(pred, *iterables): > > I think the ability to work with multiple sequences (and > not to have to use the argument order iter1, pred, iter2, ...) > is more important than the ability to avoid typing "bool,". Raymond would tell you to use either chain() or izip() on your *iterables. ;-) This would also make clear what is actually meant. > Another option would be > > def any(*iterables, pred=bool): >>> def any(*iterables, pred=bool): ------------------------------------------------------------ File "<stdin>", line 1 def any(*iterables, pred=bool): ^ SyntaxError: invalid syntax -Chris From gmccaughan at synaptics-uk.com Mon Oct 6 08:06:18 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Mon Oct 6 08:06:57 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031006114428.GA7899@ics.uci.edu> References: <200310061219.54092.gmccaughan@synaptics-uk.com> <20031006114428.GA7899@ics.uci.edu> Message-ID: <200310061306.18041.gmccaughan@synaptics-uk.com> I said: >> def any(pred, *iterables): >> >> I think the ability to work with multiple sequences (and >> not to have to use the argument order iter1, pred, iter2, ...) >> is more important than the ability to avoid typing "bool,". Chris Stork replied: > Raymond would tell you to use either chain() or izip() on your > *iterables. ;-) This would also make clear what is actually meant. Ugh.
:-) >> Another option would be >> >> def any(*iterables, pred=bool): >>>> def any(*iterables, pred=bool): > > ------------------------------------------------------------ > File "<stdin>", line 1 > def any(*iterables, pred=bool): > ^ > SyntaxError: invalid syntax Aieee! I was so sure you could do that, I didn't bother checking. In fact my thoughts went like this: "Hang on; can you do that? ... Yes, of course you can. I'm just thinking of Lisp, where you can't because of the way keyword args work there. That's a nice benefit of Python's less minimal syntax, isn't it?". How annoying. -- g From joerg at britannica.bec.de Mon Oct 6 08:48:10 2003 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Mon Oct 6 08:49:18 2003 Subject: [Python-Dev] nested packages and import order Message-ID: <20031006124810.GA1890@britannica.bec.de> Hi all, I have a package a.b with the following content: a/b/__init__.py: import a.b dir(a.b) Running this generates an AttributeError for b, obviously the import didn't add b to the module "a". Even though it can be argued that importing a package from within is bad style, this is clearly a bug since it's at least surprising. Shouldn't the import create the namespace entry in a after it created the module entry in sys.modules? Joerg P.S.: Please CC me, I'm not subscribed From michael.l.schneider at eds.com Mon Oct 6 10:07:53 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Mon Oct 6 10:07:57 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 10 Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED1D@uscimplm001.net.plm.eds.com> Tana, I have a conflict for the 10:30 meeting. Can we get together at 2:00 this afternoon.
I'm sorry for the clash, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: python-dev-request@python.org [mailto:python-dev-request@python.org] Sent: Sunday, October 05, 2003 12:02 PM To: python-dev@python.org Subject: Python-Dev Digest, Vol 3, Issue 10 From tim.one at comcast.net Mon Oct 6 10:20:56 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 6 10:21:02 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python In-Reply-To: <1065417507.2095.5.camel@localhost.localdomain> Message-ID: [Jeremy Hylton] > test_bsddb hangs for me everytime. This is a current CVS python with > BerkeleyDB 4.1.25. I've tried commenting out test_pop and > test_mapping_iteration_methods, but it still hangs somewhere. On Win98SE, it hangs every time in test_popitem, which I changed like so: def test_popitem(self): print [1] # ADDED THIS k, v = self.f.popitem() print [2] # AND THIS self.assert_(k in self.d) self.assert_(v in self.d.values()) self.assert_(k not in self.f) self.assertEqual(len(self.d)-1, len(self.f)) It prints [1], but not [2]: C:\Code\python\PCbuild>python_d ../lib/test/test_bsddb.py -v test_change (__main__.TestBTree) ... ok test_clear (__main__.TestBTree) ... ok test_close_and_reopen (__main__.TestBTree) ... ok test_contains (__main__.TestBTree) ... ok test_first_next_looping (__main__.TestBTree) ... ok test_get (__main__.TestBTree) ...
ok test_getitem (__main__.TestBTree) ... ok test_has_key (__main__.TestBTree) ... ok test_keyordering (__main__.TestBTree) ... ok test_len (__main__.TestBTree) ... ok test_mapping_iteration_methods (__main__.TestBTree) ... ok test_pop (__main__.TestBTree) ... ok test_popitem (__main__.TestBTree) ... [1] A stacktrace at the point it's hung; looks like deadlock: _BSDDB_D! __db_win32_mutex_lock + 134 bytes _BSDDB_D! __lock_get + 2264 bytes _BSDDB_D! __lock_get + 197 bytes _BSDDB_D! __db_lget + 365 bytes _BSDDB_D! __bam_search + 322 bytes _BSDDB_D! __bam_c_rget + 3535 bytes _BSDDB_D! __bam_c_dup + 1251 bytes _BSDDB_D! __db_c_get + 875 bytes _BSDDB_D! __db_delete + 378 bytes _DB_delete(DBObject * 0x00ba1ee8, __db_txn * 0x00000000, __db_dbt * 0x0062d9b0, int 0) line 545 + 29 bytes DB_ass_sub(DBObject * 0x00ba1ee8, _object * 0x00881d10, _object * 0x00000000) line 2343 + 17 bytes PyObject_DelItem(_object * 0x00ba1ee8, _object * 0x00881d10) line 155 + 16 bytes eval_frame(_frame * 0x0098a368) line 1460 + 13 bytes PyEval_EvalCodeEx(PyCodeObject * 0x00bb6550, _object * 0x008782d8, _object * 0x00000000, _object * * 0x008f034c, int 2, _object * * 0x00000000, int 0, _object * * 0x00000000, int 0, _object * 0x00000000) line 2663 + 9 bytes ... eval_frame() is executing DELETE_SUBSCR. Good(?) news: test_popitem continues to hang even if all other tests are commented out: C:\Code\python\PCbuild>python_d ../lib/test/test_bsddb.py -v test_popitem (__main__.TestBTree) ... [1] Same stacktrace at that point. I was using a debug-build CVS Python above. It also hangs the same place using a release-build Python. From tjreedy at udel.edu Mon Oct 6 12:41:03 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 6 12:41:08 2003 Subject: [Python-Dev] Re: nested packages and import order References: <20031006124810.GA1890@britannica.bec.de> Message-ID: "Joerg Sonnenberger" wrote in message news:20031006124810.GA1890@britannica.bec.de... 
> Hi all, > I have a package a.b with the following content: [snip] Please direct current Python version usage questions to comp.lang.python or the equivalent mailing list (see www.python.org). The py-dev mailing list (and its mirror, g.c.p.devel) is for discussion of the next and future version of Python. > P.S.: Please CC me, I'm not subscribed Done TJR From g_will at cyberus.ca Mon Oct 6 16:04:25 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Mon Oct 6 16:03:56 2003 Subject: [Python-Dev] ConfigParser items method Message-ID: <000e01c38c45$0aebe650$7654e640@amd950> Hi All, There is a consistency problem between the RawConfigParser and the ConfigParser and SafeConfigParser with the items method. In the first case a list of tuples is returned and in the second two a generator is returned. This is quite confusing and I thought that this was a bug, but the docs indicate that this is what is supposed to happen. An items method that returned a list of tuples as it does in the RawConfigParser would be a useful method to have for both ConfigParser and SafeConfigParser. The RawConfigParser docs say that items should return a list: items( section) Return a list of (name, value) pairs for each option in the given section. The ConfigParser docs say that items should return a generator: items( section[, raw[, vars]]) Create a generator which will return a tuple (name, value) for each option in the given section. Optional arguments have the same meaning as for the get() method. New in version 2.3. RawConfigParser returns list: >>> Config.config >>> Config.config.items("personal") [('age', '21'), ('company', 'Aztec'), ('name', 'karthik')] >>> ConfigParser and SafeConfigParser return generator: >>> Config.config >>> Config.config.items("personal") >>> for item in Config.config.items("personal"): ... print item ...
('age', '21') ('company', 'Aztec') ('name', 'karthik') >>> Config.config >>> Config.config.items("personal") >>> for item in Config.config.items("personal"): ... print item ... ('age', '21') ('company', 'Aztec') ('name', 'karthik') It doesn't make sense to me that the same method should return different objects. Maybe another name for ConfigParser and SafeConfigParser would be appropriate to indicate that a generator was being returned. Regards, Gordon Williams From skip at pobox.com Mon Oct 6 16:07:11 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 6 16:07:26 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <2m3ce6zomw.fsf@starship.python.net> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> Message-ID: <16257.52079.226636.407139@montanaro.dyndns.org> Michael> This sounds good. It would be nice to have a more sustained Michael> effort this time, and also to get on board people who know the Michael> problem platforms (as opposed to "logging on to a testdrive Michael> machine and flailing"). It's not quite exhaustive yet, but I will remind people about the PythonTesters wiki page: http://www.python.org/cgi-bin/moinmoin/PythonTesters Maybe that page should also mention some of the vendor-specific test sites (HP Test Drive, SourceForge compile farm, PBF server farm, ...). Michael> What platforms have issues that we know about? There's old SCO Michael> (but the fix for that is known), HPUX/ia64, various oddities on Michael> Irix. I think it would be real nice if we hammered hard on the bsddb3 problems. Whatever it is, it seems to affect a broad cross-section of the community. Skip From fdrake at acm.org Mon Oct 6 16:12:45 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Mon Oct 6 16:12:58 2003 Subject: [Python-Dev] ConfigParser items method In-Reply-To: <000e01c38c45$0aebe650$7654e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> Message-ID: <16257.52413.154509.392409@grendel.zope.com> Gordon Williams writes: > An items method that returned a list of tuples as it does in the > RawConfigParser would be a useful method to have for both ConfigParser and > SafeConfigParser. I'm happy for these to always return a list. I probably changed this around when I refactored the classes into raw/classic/safe flavors without really thinking about it. If there are no objections, feel free to file a bug report on SourceForge and assign it to me. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From g_will at cyberus.ca Mon Oct 6 16:35:48 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Mon Oct 6 16:35:19 2003 Subject: [Python-Dev] ConfigParser items method References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> Message-ID: <000701c38c49$6d75e510$7654e640@amd950> Hi Fred, I can't log into sourceforge bugs. I will leave it in your capable hands. Regards, Gordon Williams ----- Original Message ----- From: "Fred L. Drake, Jr." To: "Gordon Williams" Cc: Sent: Monday, October 06, 2003 4:12 PM Subject: Re: [Python-Dev] ConfigParser items method > > Gordon Williams writes: > > An items method that returned a list of tuples as it does in the > > RawConfigParser would be a useful method to have for both ConfigParser and > > SafeConfigParser. > > I'm happy for these to always return a list. I probably changed this > around when I refactored the classes into raw/classic/safe flavors > without really thinking about it. > > If there are no objections, feel free to file a bug report on > SourceForge and assign it to me. > > > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Zope Corporation > From fdrake at acm.org Mon Oct 6 16:41:42 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon Oct 6 16:41:53 2003 Subject: [Python-Dev] ConfigParser items method In-Reply-To: <000701c38c49$6d75e510$7654e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> <000701c38c49$6d75e510$7654e640@amd950> Message-ID: <16257.54150.334033.186260@grendel.zope.com> Gordon Williams writes: > I cant log into sourceforge bugs. I will leave it in your capable hands. Report filed: http://sourceforge.net/tracker/index.php?func=detail&aid=818861&group_id=5470&atid=105470 -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Mon Oct 6 16:52:21 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 6 16:52:32 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <16257.52079.226636.407139@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I think it would be real nice if we hammered hard on the bsddb3 problems. > Whatever it is, it seems to affect a broad cross-section of the community. But is there a single report that cannot be attributed to multi-threading, or multi-processes? Regards, Martin From skip at pobox.com Mon Oct 6 16:59:51 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 6 17:00:02 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: <16257.55239.526267.58978@montanaro.dyndns.org> Martin> Skip Montanaro writes: >> I think it would be real nice if we hammered hard on the bsddb3 >> problems. Whatever it is, it seems to affect a broad cross-section >> of the community. Martin> But is there a single report that cannot be attributed to Martin> multi-threading, or multi-processes? I don't know. 
But the fact that we have so far been unable to answer even this question reliably means we have some work to do. I have been assuming that the problems have all been related to access from multiple threads or processes, but others haven't seemed so sure. What about the popitem() hangs? Skip From greg at electricrain.com Mon Oct 6 18:01:16 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Mon Oct 6 18:01:27 2003 Subject: [Python-Dev] bsddb & popitems In-Reply-To: <16257.55239.526267.58978@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> <16257.55239.526267.58978@montanaro.dyndns.org> Message-ID: <20031006220116.GB8308@zot.electricrain.com> On Mon, Oct 06, 2003 at 03:59:51PM -0500, Skip Montanaro wrote: > > Martin> Skip Montanaro writes: > >> I think it would be real nice if we hammered hard on the bsddb3 > >> problems. Whatever it is, it seems to affect a broad cross-section > >> of the community. > > Martin> But is there a single report that cannot be attributed to > Martin> multi-threading, or multi-processes? > > I don't know. But the fact that we have so far been unable to answer even > this question reliably means we have some work to do. I have been assuming > that the problems have all been related to access from multiple threads or > processes, but others haven't seemed so sure. What about the popitem() > hangs? The popitem() stack trace on win98 that was just posted still looks like a BerkeleyDB issue. Its stuck in a lock. (bsddb always opens its database and environment with DB_INIT_LOCK and DB_THREAD flags because it can't tell if it will be used by multiple threads) I agree with your assumption and still believe that it is BerkeleyDB / OS locking issues causing the hangs on various platforms. 
also, not related to the bsddb problem but since i noticed it...: Looking at the code for popitem it looks like bsddb uses UserDict.DictMixin's implementation which does not look thread safe if two threads were removing things from the "dict" with only one of them using popitem. Am I missing something? A race condition exists in iteritems between finding that k exists in the dictionary and looking up self[k]. def iteritems(self): for k in self: yield (k, self[k]) Greg From greg at cosc.canterbury.ac.nz Mon Oct 6 20:33:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 6 20:33:41 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310060928.38292.aleaxit@yahoo.com> Message-ID: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> Alex Martelli : > Python's error messages are slowly but surely getting better in that, > instead of just saying that something (e.g.) "must be int" (and > leaving the coder in the dark about WHAT it was instead) While we're on the subject of error messages, I'd like to point out another one that could be improved. Often one sees things like TypeError: foo() takes exactly 1 argument (2 given) In the case where foo() is a method of some class, and there are various versions of foo() defined in various superclasses, it's sometimes hard to tell exactly *which* foo it was trying to call. It would be much more useful if the module and class names were included in the error message, e.g. TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 given) The same goes for function names quoted in the traceback. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 6 20:38:28 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 6 20:39:14 2003 Subject: Keyword-only arguments (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310061306.18041.gmccaughan@synaptics-uk.com> Message-ID: <200310070038.h970cSm02652@oma.cosc.canterbury.ac.nz> Gareth McCaughan : > >>>> def any(*iterables, pred=bool): > > > > ------------------------------------------------------------ > > File "", line 1 > > def any(*iterables, pred=bool): > > ^ > > SyntaxError: invalid syntax > > Aieee! I was so sure you could do that, I didn't bother > checking I was just thinking the other day that you *should* be able to say that. Any keyword arguments after a * arg would have to be specified by keyword in the call. So many PEP ideas, so little time... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 7 04:56:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 7 04:56:42 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> Message-ID: <200310071056.34954.aleaxit@yahoo.com> On Tuesday 07 October 2003 02:33 am, Greg Ewing wrote: > Alex Martelli : > > Python's error messages are slowly but surely getting better in that, > > instead of just saying that something (e.g.) "must be int" (and > > leaving the coder in the dark about WHAT it was instead) > > While we're on the subject of error messages, I'd like to > point out another one that could be improved. 
Often one > sees things like > > TypeError: foo() takes exactly 1 argument (2 given) > > In the case where foo() is a method of some class, and there > are various versions of foo() defined in various superclasses, > it's sometimes hard to tell exactly *which* foo it was trying > to call. It would be much more useful if the module and > class names were included in the error message, e.g. > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 > given) A perennial beginners' confusion (recently highlighted in a c.l.py thread whose subject claimed that Python can't count;-) is about that "number of arguments given" number: one calls zoop.bleep() and is told bleep "takes exactly 2 arguments (1 given)" when one is sure that one has given no argument at all (and should give exactly 1) -- the implied 'self' causing the beginners' confusion. It seems to me that, if we work on these messages, we may be able to distinguish the bound-method case into TypeError: bound method bleep() of Zoop instance takes exactly 1 argument (0 given) or some such... Alex From just at letterror.com Tue Oct 7 06:17:03 2003 From: just at letterror.com (Just van Rossum) Date: Tue Oct 7 06:17:11 2003 Subject: Keyword-only arguments (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310070038.h970cSm02652@oma.cosc.canterbury.ac.nz> Message-ID: Greg Ewing wrote: > > >>>> def any(*iterables, pred=bool): > > > > > > ------------------------------------------------------------ > > > File "", line 1 > > > def any(*iterables, pred=bool): > > > ^ > > > SyntaxError: invalid syntax > > > > Aieee! I was so sure you could do that, I didn't bother > > checking > > I was just thinking the other day that you *should* be > able to say that. Any keyword arguments after a * arg > would have to be specified by keyword in the call. Same here. 
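(The rule Greg and Just want -- keyword arguments after a `*` parameter
must be passed by keyword -- is exactly what Python 3.0 later adopted as
PEP 3102.  A sketch of such a definition under that syntax; `any_true`
is an illustrative stand-in for the `any` being discussed, not a real
builtin:)

```python
def any_true(*iterables, pred=bool):
    """Return True if pred(item) holds for any item of any iterable."""
    # 'pred' follows *iterables, so callers can only supply it by
    # keyword: any_true(a, b, pred=f) works, while a positional f
    # would simply become one more entry in 'iterables'.
    for iterable in iterables:
        for item in iterable:
            if pred(item):
                return True
    return False

print(any_true([0, 0, 3]))                       # True
print(any_true([0, ""], pred=lambda x: x == 0))  # True
```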
Here's another limitation I think is unnecessary:

>>> args = (1, 2, 3)
>>> foo(*args, 4, 5, 6)
  File "<stdin>", line 1
    foo(*args, 4, 5, 6)
               ^
SyntaxError: invalid syntax
>>>

> So many PEP ideas, so little time...

You got that right...

Just

From skip at pobox.com Tue Oct 7 07:41:01 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 7 07:41:13 2003
Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev
In-Reply-To: <20031006235549.GA14656@cthulhu.gerg.ca>
References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca>
Message-ID: <16258.42573.626779.73842@montanaro.dyndns.org>

    Greg> On 06 October 2003, Skip Montanaro said:
    >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev
    >> (and similarly for python-checkins/python-dev).  Can that be
    >> engineered through Mailman?

    Greg> Yes -- it's on the "General Options" page.  Look for
    Greg> reply_goes_to_list.

After seeing your answer I know I asked the wrong question.  I shouldn't
have said "Reply-To:".  In X?Emacs/VM, I just hit the 'f' key to reply to
you and to cc spambayes-dev.  Had this been a spambayes-checkins message,
it would have been nice if the cc went to spambayes-dev instead of
spambayes-checkins.

I can probably solve that for myself by tweaking the vm-followup command
(what the 'f' key is bound to), but there's probably not a general
solution.  Setting Reply-To: *might* be okay in a situation like this
where you don't want chit-chat on a checkins list to get lost or not seen
by the larger audience, but I'd only use it as a last resort.

Skip

[OT PS] cthulhu.gerg.ca?  Is that some sort of
pronounceable-only-by-Native-Canadians name?
From barry at python.org Tue Oct 7 08:30:24 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 08:30:29 2003 Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <1065529823.993.38.camel@anthem> On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > Greg> On 06 October 2003, Skip Montanaro said: > >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev > >> (and similarly for python-checkins/python-dev). Can that be > >> engineered through Mailman? > > Greg> Yes -- it's on the "General Options" page. Look for > Greg> reply_goes_to_list. > > After seeing your answer I know I asked the wrong question. I > shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to > reply to you and to cc spambayes-dev. Had this been a spambayes-checkins > message, it would have been nice if the cc went to spambayes-dev instead of > spambayes-checkin. > > I can probably solve that for myself by tweaking the vm-followup command > (what the 'f' key is bound to), but there's probably not a general solution. > Setting Reply-To: *might* be okay in a situation like this where you don't > want chit-chat on a checkins list to get lost or not seen by the larger > audience, but I'd only use it as a last resort. IMO as an anti-Reply-to munger, I think this is one situation where Reply-To hacking is perfectly legit. You don't want discussions on -checkins, you want them on the discuss mailing list (in this case spambayes-dev). 
MM2.1 can be configured to retain any existing Reply-To fields so people who have to set this to worm around their broken mail systems can still be coddled. python-devers and spambayes-devers, you vant I should do dis? -Barry From sjoerd at acm.org Tue Oct 7 08:44:53 2003 From: sjoerd at acm.org (Sjoerd Mullender) Date: Tue Oct 7 08:45:05 2003 Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065529823.993.38.camel@anthem> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <1065529823.993.38.camel@anthem> Message-ID: <3F82B545.2060702@acm.org> Barry Warsaw wrote: > On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > >> Greg> On 06 October 2003, Skip Montanaro said: >> >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev >> >> (and similarly for python-checkins/python-dev). Can that be >> >> engineered through Mailman? >> >> Greg> Yes -- it's on the "General Options" page. Look for >> Greg> reply_goes_to_list. >> >>After seeing your answer I know I asked the wrong question. I >>shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to >>reply to you and to cc spambayes-dev. Had this been a spambayes-checkins >>message, it would have been nice if the cc went to spambayes-dev instead of >>spambayes-checkin. >> >>I can probably solve that for myself by tweaking the vm-followup command >>(what the 'f' key is bound to), but there's probably not a general solution. >>Setting Reply-To: *might* be okay in a situation like this where you don't >>want chit-chat on a checkins list to get lost or not seen by the larger >>audience, but I'd only use it as a last resort. 
> > > IMO as an anti-Reply-to munger, I think this is one situation where > Reply-To hacking is perfectly legit. You don't want discussions on > -checkins, you want them on the discuss mailing list (in this case > spambayes-dev). MM2.1 can be configured to retain any existing Reply-To > fields so people who have to set this to worm around their broken mail > systems can still be coddled. > > python-devers and spambayes-devers, you vant I should do dis? +1 -- Sjoerd Mullender From aahz at pythoncraft.com Tue Oct 7 10:29:22 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 7 10:29:27 2003 Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <20031007142921.GA27594@panix.com> On Tue, Oct 07, 2003, Skip Montanaro wrote: > > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? http://cthulhu.fnord.at/ -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From kennypitt at hotmail.com Tue Oct 7 11:01:23 2003 From: kennypitt at hotmail.com (Kenny Pitt) Date: Tue Oct 7 11:02:02 2003 Subject: [Python-Dev] RE: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065529823.993.38.camel@anthem> Message-ID: Barry Warsaw wrote: > On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: >> Greg> On 06 October 2003, Skip Montanaro said: >> >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev >> >> (and similarly for python-checkins/python-dev). Can that be >> >> engineered through Mailman? 
>> [snip] > > IMO as an anti-Reply-to munger, I think this is one situation where > Reply-To hacking is perfectly legit. You don't want discussions on > -checkins, you want them on the discuss mailing list (in this case > spambayes-dev). MM2.1 can be configured to retain any existing > Reply-To fields so people who have to set this to worm around their > broken mail systems can still be coddled. > > python-devers and spambayes-devers, you vant I should do dis? > -Barry +1 -- Kenny Pitt From jeremy at zope.com Tue Oct 7 11:17:21 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Tue Oct 7 11:20:51 2003 Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <1065539841.2322.11.camel@localhost.localdomain> On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? No. It's H.P. Lovecraft. Just be glad he didn't choose yog-sothoth or tsathoggua. In college, we had a cluster of machines in my living group that were named (briefly) after Lovecraft's creatures. No one could remember how to spell the names. Jeremy From barry at python.org Tue Oct 7 11:42:48 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 11:42:56 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python In-Reply-To: <1065417507.2095.5.camel@localhost.localdomain> References: <1065417507.2095.5.camel@localhost.localdomain> Message-ID: <1065541367.17466.0.camel@anthem> On Mon, 2003-10-06 at 01:18, Jeremy Hylton wrote: > test_bsddb hangs for me everytime. This is a current CVS python with > BerkeleyDB 4.1.25. 
I've tried commenting out test_pop and > test_mapping_iteration_methods, but it still hangs somewhere. > > localhost:~/src/python/build-pydebug> ./python ../Lib/test/test_bsddb.py > -v > test_change (__main__.TestBTree) ... ok > test_clear (__main__.TestBTree) ... ok > test_close_and_reopen (__main__.TestBTree) ... ok > test_contains (__main__.TestBTree) ... ok > test_first_next_looping (__main__.TestBTree) ... ok > test_get (__main__.TestBTree) ... ok > test_getitem (__main__.TestBTree) ... ok > test_has_key (__main__.TestBTree) ... ok > test_keyordering (__main__.TestBTree) ... ok > test_len (__main__.TestBTree) ... ok > test_mapping_iteration_methods (__main__.TestBTree) ... ok > test_pop (__main__.TestBTree) ... ok > > strace says: > stat64("./@test", 0xbfffc980) = -1 ENOENT (No such file or > directory) > stat64("./__db.@test.", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > stat64("./@test", 0xbfffc830) = -1 ENOENT (No such file or > directory) > rename("./__db.@test.", "./@test") = 0 > stat64("./@test", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > open("./@test", O_RDWR|O_LARGEFILE) = 3 > fcntl64(3, F_SETFD, FD_CLOEXEC) = 0 > fstat64(3, {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > pread(3, "\0\0\0\0\1\0\0\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0\0"..., > 4096, 0) = 4096 > pread(3, "\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\1\5\0"..., > 4096, 4096) = 4096 > futex(0x4055ad40, FUTEX_WAIT, 0, NULL Same thing here, CVS Python 2.3+, RH9, BDB 4.1.25. -Barry From pje at telecommunity.com Tue Oct 7 11:44:50 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 7 11:45:57 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> References: <200310060928.38292.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> At 01:33 PM 10/7/03 +1300, Greg Ewing wrote: >While we're on the subject of error messages, I'd like to >point out another one that could be improved. Often one >sees things like > > TypeError: foo() takes exactly 1 argument (2 given) > >In the case where foo() is a method of some class, and there >are various versions of foo() defined in various superclasses, >it's sometimes hard to tell exactly *which* foo it was trying >to call. It would be much more useful if the module and >class names were included in the error message, e.g. > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 >given) AFAICT, this would at least require a compiler change, and a change to the layout of code objects, so that a code object would know its "dotted name". >The same goes for function names quoted in the traceback. Don't tracebacks give line number and file? From barry at python.org Tue Oct 7 11:46:51 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 11:47:08 2003 Subject: [Python-Dev] RE: [spambayes-dev] spambayes-checkins -> spambayes-dev,python-checkins -> python-dev In-Reply-To: References: Message-ID: <1065541611.17466.2.camel@anthem> On Tue, 2003-10-07 at 11:01, Kenny Pitt wrote: > Barry Warsaw wrote: > > IMO as an anti-Reply-to munger, I think this is one situation where > > Reply-To hacking is perfectly legit. You don't want discussions on > > -checkins, you want them on the discuss mailing list (in this case > > spambayes-dev). MM2.1 can be configured to retain any existing > > Reply-To fields so people who have to set this to worm around their > > broken mail systems can still be coddled. 
> >
> > python-devers and spambayes-devers, you vant I should do dis?
> > -Barry
>
> +1

Done for both lists.
-Barry

From popiel at wolfskeep.com Tue Oct 7 14:09:25 2003
From: popiel at wolfskeep.com (T. Alexander Popiel)
Date: Tue Oct 7 14:09:36 2003
Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev
In-Reply-To: Message from Barry Warsaw of "Tue, 07 Oct 2003 08:30:24 EDT." <1065529823.993.38.camel@anthem>
References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <1065529823.993.38.camel@anthem>
Message-ID: <20031007180925.8F8362DE90@cashew.wolfskeep.com>

In message: <1065529823.993.38.camel@anthem>
            Barry Warsaw writes:

[ talk of mangling the reply-to on -checkins to point to -dev,
  iff there is no pre-existing reply-to ]

>python-devers and spambayes-devers, you vant I should do dis?

+1

- Alex

From g_will at cyberus.ca Tue Oct 7 14:11:36 2003
From: g_will at cyberus.ca (Gordon Williams)
Date: Tue Oct 7 14:11:10 2003
Subject: [Python-Dev] ConfigParser case sensitive and strings vs objects returned
References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com>
Message-ID: <005b01c38cfe$724ee220$6c57e640@amd950>

Hi Fred,

A couple of other things about the ConfigParser module that I find a bit
strange and I'm not sure are intended behavior.

1. Option gets converted to lower case and therefore is not case
sensitive, but section is case sensitive.  I would have thought that both
would be or neither would be case sensitive.  (My preference would be that
neither would be case sensitive.)
example if I have a config.txt file with:

[File 1]
databaseADF adsfa:octago DASFDAS
user:Me
password:blank

then when this gets written out it is (where databaseADF is now
databaseadf):

[File 1]
databaseadf adsfa = octago DASFDAS
password = blank
user = Me

Using 'file 1' instead of 'File 1':

>>> Config.config
>>> Config.config.options('file 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 240, in options
    raise NoSectionError(section)
NoSectionError: No section: 'file 1'

But using 'dataBASEadf adsfa' instead of 'databaseADF adsfa' or
'databaseadf adsfa ' is OK and returns the correct value:

>>> Config.config.get('File 1', 'dataBASEadf adsfa')
'octago DASFDAS'

The differences in handling the option and section are annoying and should
at least be described in the docs if they can't be changed.

2. SafeConfigParser is the recommended ConfigParser in the docs.  I'm not
sure what is meant by safe.  When values are read in from a file they are
first converted to strings.  This is not true for values set within the
code.  If I set an option with anything other than a string then this
occurs:

>>> Config.config.set('File 1', 'test', 2)
>>> Config.config.get('File 1', 'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'int' object has no attribute 'find'

Likely the value assigned to the object should be first converted to a
string before it is stored.
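(A sketch of the coercion Gordon is suggesting.  `CoercingParser` is a
hypothetical name, and the code is written against the modern
`configparser` spelling rather than the 2.3-era `ConfigParser`; the idea
is simply that `set()` turns non-strings into strings before storing
them, so `get()` never hands a non-string to the interpolation code:)

```python
from configparser import ConfigParser

class CoercingParser(ConfigParser):
    """Hypothetical parser that stores every value as a string."""

    def set(self, section, option, value=None):
        # Coerce non-string values (ints, floats, dicts, ...) to str
        # on the way in, mirroring what happens to values read from a
        # file, so interpolation in get() always sees a string.
        if value is not None and not isinstance(value, str):
            value = str(value)
        super().set(section, option, value)

cp = CoercingParser()
cp.add_section('File 1')
cp.set('File 1', 'test', 2)               # stored as the string '2'
print(repr(cp.get('File 1', 'test')))     # '2'
```

Modern `configparser` takes the other option Fred mentions below and
raises TypeError for non-string values, putting the burden on the caller.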
The same thing happens if a dict or float is passed in as a default:

>>> c = Config.SafeConfigParser({'test':{'1':"One",'2':"Two"}, 'foo':2.3})

This looks OK:

>>> c.write(sys.stdout)
[DEFAULT]
test = {'1': 'One', '2': 'Two'}
foo = 2.3

Problem with get:

>>> c.get('DEFAULT', 'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'dict' object has no attribute 'find'

>>> c.get('DEFAULT', 'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'float' object has no attribute 'find'
>>>

If we set raw=True, then we get back an object and not a string:

>>> c.get('DEFAULT', 'foo', raw=True)
2.2999999999999998

If we use vars={} an exception is also thrown:

>>> c.get('DEFAULT', 'junk', vars={'junk': 99})
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'int' object has no attribute 'find'

One last comment is that 'interpolation' is a bit confusing in the docs.
Maybe 'substitution' would be a better word. Thanks, Gordon Williams From fdrake at acm.org Tue Oct 7 14:48:13 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 7 14:48:30 2003 Subject: [Python-Dev] Re: ConfigParser case sensitive and strings vs objects returned In-Reply-To: <005b01c38cfe$724ee220$6c57e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> <005b01c38cfe$724ee220$6c57e640@amd950> Message-ID: <16259.2669.257010.619400@grendel.zope.com> Gordon Williams writes: > Hi Fred, > > A couple of other things about the ConfigParser module that I find a bit > strange and I'm not sure that is intended behaivior. > > > 1. Option gets converted to lower case and therefore is not case sensitive, > but section is case sensitive. I would have thought that both would be or > neither would be case sensitive. (My preference would be that neither would > be case sensitive.) And mine would be that both are case sensitive! ;-) I guess that's why we have optionxform to override the transform for option names at least. Ideally, both option and section names should be transformed, but the specific transforms should be independently pluggable. I'm not adverse to a patch which adds a sectionxform, but don't have the time or motivation to change it myself. Feel free to post a patch to SourceForge and assign it to me for review. Documentation and tests are required. [...examples elided...] > The differences in handling the option and section are annoying and should > at least be described in the docs if they cant be changed. Please suggest specific changes; I don't expect to have much time for ConfigParser anytime soon, so specific changes (esp. a patch if you can deal with the LaTeX) would be greatly appreciated. > 2. SafeConfigParser is the recommended ConfigParser in the docs. I'm not > sure what is meant be safe. When values are read in from a file they are > first converted to strings. 
This is not true for values set within the > code. True. I'd suggest that at most, a typecheck for a value being a string could be added to the code; the documentation may need further elaboration. The "Safe" was intended to refer specifically to the string substitution algorithm; it uses a more careful implementation that isn't as subject to weird border conditions. Again, the documentation may require improvements. > If I set an option with anything other than a string then this occurs: ... > Likely the value assigned to the object should be first converted to a > string before it is stored. Or an exception should be raised, placing the burden squarely on the caller to do the right thing, instead of guessing what the right thing is. > One last comment is that 'interpolation' is a bit confusing in the docs. > Maybe 'substitution' would be a better word. Agreed. I'd like to suggest two things: - Get a SourceForge account and file a bug report (you don't need to be on the Python project, just having an account is sufficient). - Take a look at some of the alternate configuration libraries; they may be more suited to your requirements. My current favorite is ZConfig, for which a new version is expected in the next week or so: http://www.python.org/pypi?%3Aaction=search&name=ZConfig But I might be biased about this one. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From ianb at colorstudy.com Tue Oct 7 20:41:48 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Oct 7 20:41:45 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> Message-ID: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> On Tuesday, October 7, 2003, at 10:44 AM, Phillip J. Eby wrote: > At 01:33 PM 10/7/03 +1300, Greg Ewing wrote: >> While we're on the subject of error messages, I'd like to >> point out another one that could be improved. 
Often one >> sees things like >> >> TypeError: foo() takes exactly 1 argument (2 given) >> >> In the case where foo() is a method of some class, and there >> are various versions of foo() defined in various superclasses, >> it's sometimes hard to tell exactly *which* foo it was trying >> to call. It would be much more useful if the module and >> class names were included in the error message, e.g. >> >> TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 >> given) > > AFAICT, this would at least require a compiler change, and a change to > the layout of code objects, so that a code object would know its > "dotted name". Methods know their class, and classes know their name, so it should be okay. In the case of functions, they know their module. >> The same goes for function names quoted in the traceback. > > Don't tracebacks give line number and file? Yeah, that seems unnecessary. In the other case (incorrect arguments) it can be hard because you only get the line number of the caller, not the function being called. There's other situations, like list.index, which says "list.index(x): x not in list", when it is almost always useful to know what "x" is. I can't think of other ones off the top of my head, but I know there's many more. Is it helpful (or annoying) to open bugs on these? Personally, I usually add the repr of any interesting arguments to my exceptions. But many of Python's exceptions don't do this. Is there a reasoning there? Sometimes the repr of an object can be verbose, or in getting it you can cause a second error. Is this the reason for the lack of information, or is it just an oversight? Or a differing opinion on how one should debug things? 
-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From gward at python.net Tue Oct 7 21:47:16 2003 From: gward at python.net (Greg Ward) Date: Tue Oct 7 21:47:22 2003 Subject: [Python-Dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <20031008014716.GA22217@cthulhu.gerg.ca> On 07 October 2003, Skip Montanaro said: > > Greg> On 06 October 2003, Skip Montanaro said: > >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev > >> (and similarly for python-checkins/python-dev). Can that be > >> engineered through Mailman? > > Greg> Yes -- it's on the "General Options" page. Look for > Greg> reply_goes_to_list. > > After seeing your answer I know I asked the wrong question. I > shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to > reply to you and to cc spambayes-dev. Had this been a spambayes-checkins > message, it would have been nice if the cc went to spambayes-dev instead of > spambayes-checkin. I *think* what you want is a Mailman feature to set the Mail-Followup-To header. Not sure if such a feature exists. > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? Shhh!!! Don't want spammers to guess that my secret email address is "${my_first_name}@${my_personal_domain}". (Tee-hee-hee, using shell/Perl syntax on python-dev should cause some consternation.) Actually, Cthulhu is an ancient eldritch entity lurking in the depths beneath the Pacific Ocean, waiting to be awakened for the day when he shall DEVOUR ALL HUMANITY!! 
Good summaries here: http://www.kuro5hin.org/story/2003/9/1/172415/6523 http://www.necfiles.org/part2.htm Greg -- Greg Ward http://www.gerg.ca/ Secrecy is the beginning of tyranny. From barry at python.org Tue Oct 7 22:15:55 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 22:16:04 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <20031008014716.GA22217@cthulhu.gerg.ca> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> Message-ID: <1065579355.18519.43.camel@anthem> On Tue, 2003-10-07 at 21:47, Greg Ward wrote: > I *think* what you want is a Mailman feature to set the Mail-Followup-To > header. Not sure if such a feature exists. Unfortunately, Mail-Followup-To is neither standard nor widely implemented in mail readers. Too bad, it's a good idea. -Barry From fdrake at acm.org Tue Oct 7 22:20:04 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 7 22:20:20 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> References: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> Message-ID: <16259.29780.396447.476541@grendel.zope.com> Ian Bicking writes: > Personally, I usually add the repr of any interesting arguments to my > exceptions. But many of Python's exceptions don't do this. Is there a > reasoning there? Sometimes the repr of an object can be verbose, or in > getting it you can cause a second error. Is this the reason for the > lack of information, or is it just an oversight? 
Or a differing
> opinion on how one should debug things?

Another reason is efficiency.  Some exceptions are raised and caught
within the C code of the interpreter.  For these cases, it is important
that the raise be as efficient as possible, so the interpreter attempts
to avoid instantiation of the exception instance; this cost was once
blamed for a fairly bad performance degradation when we tried a nicer
message for AttributeError that caused the exception instance to always
be created (fixed before release, of course, IIRC!).

That's not to say that there aren't several places where better exception
messages can be used effectively.  This is only an issue for exceptions
that are going to be frequently raised and caught in C code.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation From greg at cosc.canterbury.ac.nz Tue Oct 7 22:34:15 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 22:34:37 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> Message-ID: <200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 > >given) > > AFAICT, this would at least require a compiler change, and a change to the > layout of code objects, so that a code object would know its "dotted name". Perhaps. I had the idea that methods already had some notion of the name of the class they were defined in, but maybe that's only bound methods. In any case, I think it would be worth making this change. > Don't tracebacks give line number and file? Yes, but the exception occurs just *before* the function is entered, so the traceback stops one level short of showing you where the function being called is defined! That's what makes this problem so annoying. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 7 22:49:07 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 7 22:49:32 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: Your message of "Wed, 08 Oct 2003 15:34:15 +1300." 
<200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> References: <200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> Message-ID: <200310080249.h982n7R22500@12-236-54-216.client.attbi.com>

> > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2
> > > given)
> >
> > AFAICT, this would at least require a compiler change, and a
> > change to the layout of code objects, so that a code object would
> > know its "dotted name".
>
> Perhaps. I had the idea that methods already had some notion of
> the name of the class they were defined in, but maybe that's
> only bound methods. In any case, I think it would be worth
> making this change.

Only bound methods. What should the error message be in this case?

class C:
    pass

def f(self, a): pass

C.f = f
C().f()

> > Don't tracebacks give line number and file?
>
> Yes, but the exception occurs just *before* the function is entered,
> so the traceback stops one level short of showing you where the
> function being called is defined! That's what makes this problem
> so annoying.

Yes, that's one of the issues of this.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Tue Oct 7 22:50:13 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 22:50:31 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> Message-ID: <200310080250.h982oD012177@oma.cosc.canterbury.ac.nz>

Ian Bicking :
> > Don't tracebacks give line number and file?
>
> Yeah, that seems unnecessary.

Even when it does give a line number and file, I don't always want to have to go looking them all up just to get an idea of the call path that led to the error.
This is a particularly severe problem in Pyrex, where frequently I will get tracebacks telling me things like there was an error in a 27-level deep stack of calls to various functions called "generate_execution_code" scattered among the 50 or so classes in Nodes.py... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 7 23:06:15 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 23:06:33 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <16259.29780.396447.476541@grendel.zope.com> Message-ID: <200310080306.h9836F312300@oma.cosc.canterbury.ac.nz> "Fred L. Drake, Jr." : > this cost was once attributed with a fairly bad performance > degradation when we tried a nicer message for AttributeError that > caused the exception instance to always be created This suggests that perhaps using exceptions for non-exceptional flow control isn't such a good idea, if it forces things like AttributeError to be less useful for debugging than they would otherwise be. I know the Python philosophy holds that you *should* be able to use exceptions freely for both purposes, but perhaps that philosophy needs to be re-examined in the light of this consideration. I know I find myself preferring these days to use getattr et al with default arguments rather than catching exceptions when testing for the presence of something, as it seems to more directly express what I'm trying to do, and avoids all chance of catching the wrong exception. Perhaps the equivalent should be done inside the interpreter, too? 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg at cosc.canterbury.ac.nz Tue Oct 7 23:12:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 23:12:28 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080249.h982n7R22500@12-236-54-216.client.attbi.com> Message-ID: <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz>

> What should the error message be in this case?
>
> class C:
>     pass
>
> def f(self, a): pass
>
> C.f = f

I wouldn't mind if it reported f as a top-level function in that case. It wouldn't be any worse than what happens now if you do

def f(a): pass
g = f
g()

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tim.one at comcast.net Wed Oct 8 00:02:35 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 00:02:45 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080306.h9836F312300@oma.cosc.canterbury.ac.nz> Message-ID:

[Fred L. Drake, Jr.]
>> this cost was once attributed with a fairly bad performance
>> degradation when we tried a nicer message for AttributeError that
>> caused the exception instance to always be created

[Greg Ewing]
> This suggests that perhaps using exceptions for non-exceptional flow
> control isn't such a good idea, if it forces things like
> AttributeError to be less useful for debugging than they would
> otherwise be.
> > I know the Python philosophy holds that you *should* be able to use > exceptions freely for both purposes, but perhaps that philosophy needs > to be re-examined in the light of this consideration. > > I know I find myself preferring these days to use getattr et al with > default arguments rather than catching exceptions when testing for the > presence of something, as it seems to more directly express what I'm > trying to do, and avoids all chance of catching the wrong > exception. Perhaps the equivalent should be done inside the > interpreter, too? The equivalent is already done inside the interpreter, about as far as is possible. Under the covers, the only way getattr(obj, name, default) *can* work is to search obj's inheritance chain, trying to get the attribute at each level, and clearing whatever internal AttributeErrors may be raised along the way. The presence of user-written __getattr__ hooks dooms simpler schemes. An internal PyExc_AttributeError isn't the same as a user-visible AttributeError, though -- a class instance isn't created unless and until PyErr_NormalizeException() gets called because the exception needs to be made user-visible. If the latter never happens, setting and clearing exceptions internally is pretty cheap (a pointer to the global PyExc_AttributeError object is stuffed into the thread state). OTOH, almost every call to a C API function has to test+branch for an error-return value, and I've often wondered whether a setjmp/longjmp-based hack might allow for cleaner and more optimizable code (hand-rolled "real exception handling"). 
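The trade-off Greg and Tim describe is easy to see from pure Python. The following is an illustrative sketch (the class and attribute names are invented, not code from the thread):

```python
class Config:
    # hypothetical example class; 'port' is deliberately absent
    host = "localhost"

c = Config()

# Probing with an exception handler works, but a try/except this broad
# can also swallow an AttributeError raised *inside* a property or a
# __getattr__ hook, masking a real bug.
try:
    port = c.port
except AttributeError:
    port = 8080

# The default-argument form states the intent directly; under the covers
# the interpreter still walks the inheritance chain and clears an internal
# AttributeError, but no user-visible exception instance is created.
port = getattr(c, "port", 8080)

print(port)                      # falls back to the default
print(getattr(c, "host", "?"))   # attribute that does exist
```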
From barry at python.org Wed Oct 8 00:03:04 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 8 00:03:14 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16259.29908.594392.212493@grendel.zope.com> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> <1065579355.18519.43.camel@anthem> <16259.29908.594392.212493@grendel.zope.com> Message-ID: <1065585784.18519.50.camel@anthem> On Tue, 2003-10-07 at 22:22, Fred L. Drake, Jr. wrote: > Barry Warsaw writes: > > Unfortunately, Mail-Followup-To is neither standard nor widely > > implemented in mail readers. Too bad, it's a good idea. > > There'd be more motivation for mailers to support it if lists > generated it. Add it to Mailman, and you'll give mailer > authors/maintainers another reason to support it. ;-) Yeah, we've tried that with the RFC 2369 (List-*) headers for years and it hasn't seemed to work. Besides, the closest thing to a standard for Mail-Followup-To says that list servers should never set the header[1]. I support your efforts to fight the good fight though, by setting your MUA to add those headers. :) -Barry [1] http://cr.yp.to/proto/replyto.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031008/2b8c66f5/attachment.bin

From guido at python.org Wed Oct 8 00:14:04 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 00:14:22 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: Your message of "Wed, 08 Oct 2003 16:12:20 +1300." <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz> References: <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz> Message-ID: <200310080414.h984E4a22702@12-236-54-216.client.attbi.com>

> > What should the error message be in this case?
> >
> > class C:
> >     pass
> >
> > def f(self, a): pass
> >
> > C.f = f
>
> I wouldn't mind if it reported f as a top-level function
> in that case.

My point was that the runtime can't distinguish this case from

class C:
    def f(self, a): pass

so the error message has to be the same in both cases. If we want the error message in the latter example to be different, we'll have to provide extra information in the code object. If we don't want to do that, we *may* still be able to recover the fact that we were calling a bound method, but in that case the former example will give the same error message as the latter.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Wed Oct 8 01:39:05 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 01:39:16 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080414.h984E4a22702@12-236-54-216.client.attbi.com> Message-ID: <200310080539.h985d5q12749@oma.cosc.canterbury.ac.nz>

> If we want the error message in the latter example to be different,
> we'll have to provide extra information in the code object.

Yes, I understand that.
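As a historical footnote, later Pythons grew exactly the "extra information" discussed here: PEP 3155 gave functions and classes a __qualname__ attribute recording the dotted name. A sketch, runnable on Python 3.3 and later:

```python
class C:
    pass

def f(self, a):
    pass

C.f = f   # monkey-patched onto the class after the fact

class D:
    def g(self, a):
        pass

# A function defined at top level keeps its top-level qualified name even
# when patched onto a class -- matching Greg's suggestion that it be
# reported as a top-level function...
assert f.__qualname__ == "f"
assert C.f.__qualname__ == "f"

# ...while a function defined inside a class body records the dotted name.
assert D.g.__qualname__ == "D.g"
```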
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg at electricrain.com Wed Oct 8 04:03:02 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 8 04:03:11 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310071056.34954.aleaxit@yahoo.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> Message-ID: <20031008080302.GA15666@zot.electricrain.com>

On Tue, Oct 07, 2003 at 10:56:34AM +0200, Alex Martelli wrote:
> > > > TypeError: foo() takes exactly 1 argument (2 given)
>
> A perennial beginners' confusion (recently highlighted in a c.l.py thread
> whose subject claimed that Python can't count;-) is about that "number
> of arguments given" number: one calls zoop.bleep() and is told bleep
> "takes exactly 2 arguments (1 given)" when one is sure that one has
> given no argument at all (and should give exactly 1) -- the implied 'self'
> causing the beginners' confusion. It seems to me that, if we work on these
> messages, we may be able to distinguish the bound-method case into
>
> TypeError: bound method bleep() of Zoop instance takes exactly 1
> argument (0 given)

I've had to answer that question about the "wrong" numbers for python newbies[1] frequently as well. Even a simple cleaning up of the user-visible off-by-one error to be:

TypeError: method bleep() takes exactly 1 argument (0 given)

At the time the TypeError is constructed it shouldn't add serious overhead to check if it's a method or a function and subtract 1 accordingly.
Greg

[1] where newbie is defined as someone who doesn't know the answer to that yet ;)

From guido at python.org Wed Oct 8 09:47:07 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 09:47:29 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: Your message of "Wed, 08 Oct 2003 01:03:02 PDT." <20031008080302.GA15666@zot.electricrain.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> Message-ID: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com>

> At the time the TypeError is constructed it shouldn't add serious
> overhead to check if its a method or a function and subtract 1
> accordingly.

You'd think so, eh? Have you looked at the code? Have you tried to come up with a patch? Why do you think that in 13 years this hasn't been fixed if it's such a common complaint?

I'm not arguing against fixing this (I think it would be great) but the number of people who've implied that this should be an easy thing to fix annoys me.

For better or for worse, the distinction between a function and a bound method is gone by the time it's called, and recovering that difference is going to be tough. Not in terms of serious overhead, but in terms of serious changes to code that is already extremely subtle. That code is so subtle *because* we want to keep function call overhead as low as possible, and anything that would add even a fraction of a microsecond to the cost of calling a function with the correct number of arguments will be scrutinized to death.
--Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Wed Oct 8 10:11:19 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Oct 8 10:11:27 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <3F841B07.2000503@iinet.net.au> Guido van Rossum strung bits together to say: > For better or for worse, the distinction between a function and a > bound method is gone by the time it's called, and recovering that > difference is going to be tough. Not in terms of serious overhead, > but in terms of serious changes to code that is already extremely > subtle. That code it's so subtle *because* we want to keep function > call overhead as low as possible, and anything that would add even a > fraction of a microsecond to the cost of calling a function with the > correct number of arguments will be scrutinized to death. Given this, perhaps a simple addition to the error string might be enough to help reduce confusion: ------------- TypeError: foo() takes exactly 1 argument (2 given). (Note: For bound methods, the argument count includes the object the method is bound to) ------------- Experienced users are unlikely to care, and newer users should then be able to figure out why the argument count is one more than they expect. About the only problem I can see is that it is hard to be clear, without also making the error string rather long (like the one above). Regards, Nick. It's simple, but if it works. . . -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." 
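The confusion under discussion is easy to reproduce; a minimal sketch follows (class and method names are invented, and the exact message wording varies across Python versions — later releases even prefix the qualified name, much as this thread wished for):

```python
class Zoop:
    def bleep(self, flag):
        return flag

z = Zoop()
try:
    # The caller passes two arguments, but the bound method's implicit
    # 'self' makes the interpreter count three.
    z.bleep(1, 2)
except TypeError as exc:
    message = str(exc)

# On Python 3 the message reads along the lines of
# "bleep() takes 2 positional arguments but 3 were given"
print(message)
```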
From gerrit at nl.linux.org Wed Oct 8 10:13:18 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 8 10:13:32 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065585784.18519.50.camel@anthem> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> <1065579355.18519.43.camel@anthem> <16259.29908.594392.212493@grendel.zope.com> <1065585784.18519.50.camel@anthem> Message-ID: <20031008141318.GA5352@nl.linux.org>

Barry Warsaw wrote:
> List-Id: Python core developers
> List-Unsubscribe: ,
> List-Archive:
> List-Post:
> List-Help:
> List-Subscribe: ,
>
> Yeah, we've tried that with the RFC 2369 (List-*) headers for years and
> it hasn't seemed to work.

I use them... The ultimate proof that they work ;)!

Gerrit.

-- 41. If any one fence in the field, garden, and house of a chieftain, man, or one subject to quit-rent, furnishing the palings therefor; if the chieftain, man, or one subject to quit-rent return to field, garden, and house, the palings which were given to him become his property.
-- 1780 BC, Hammurabi, Code of Law

-- Asperger Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/ Rise up against this cabinet: http://www.sp.nl/

From gerrit at nl.linux.org Wed Oct 8 10:52:09 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 8 10:52:16 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <20031008145209.GB5352@nl.linux.org>

[I'm not a regular poster, so I'll introduce myself briefly: I am a first-year Physics student without CS knowledge, have learned programming with Python a few years ago]

Gregory Smith wrote:
> > At the time the TypeError is constructed it shouldn't add serious
> > overhead to check if its a method or a function and subtract 1
> > accordingly.

Guido van Rossum replied:
> You'd think so, eh? Have you looked at the code? Have you tried to
> come up with a patch? Why do you think that in 13 years this hasn't
> been fixed if it's such a common complaint?

Would it be possible to have this code at IDE-level? E.g., is it possible for Idle to catch TypeError's and try to find out whether this is about the number of arguments to a callable, and if so, try to find out whether it is about a method or a function? This is of course a lot of overhead, but since it is only for an interactive session, I think this is not a big problem, or am I mistaken here?

Something like:

except TypeError, msg:
    if "takes exactly" in msg[0]: # something with tb_lasti?
        name = msg[0].split('(')[0]
        typ, val, tb = sys.exc_info()
        if name in tb.tb_frame.f_locals.keys():
            if 'instancemethod' in type(tb.tb_frame.f_locals[name]):
                # subtract 1
            else:
                # don't subtract 1
        else:
            # hmm, if it is a method, how do we find it?
            # etc.
    else:
        raise

It seems quite difficult to do so. It is certainly not always possible, but is it worth the pain?

regards, Gerrit Holl.

-- 201. If he knock out the teeth of a freed man, he shall pay one-third of a gold mina. -- 1780 BC, Hammurabi, Code of Law

-- Asperger Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/ Rise up against this cabinet: http://www.sp.nl/

From Scott.Daniels at Acm.Org Wed Oct 8 11:26:43 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Oct 8 11:26:46 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: References: Message-ID: <3F842CB3.1000400@Acm.Org>

[Tim Peters]
> .... OTOH, almost
> every call to a C API function has to test+branch for an error-return value,
> and I've often wondered whether a setjmp/longjmp-based hack might allow for
> cleaner and more optimizable code (hand-rolled "real exception handling").

setjmp/longjmp are nightmares for compiler writers. The writers tend to turn off optimizations around them and/or get corner cases wrong. If you read the C standard, precious little is guaranteed around setjmp/longjmp. The C code using disciplined setjmp/longjmp will read well and probably be quite optimizable, but.... At least some of the C compilers will mis-optimize such code and others will be painfully slow due to the interaction of two compiler coding strategies: first, emit straightforward sloppy code easily cleaned up in the optimization passes, and second, turn off optimization in the presence of setjmp/longjmp. Maybe the general compiler world has changed, but I had nightmares supporting a language which generated C including setjmp/longjmp calls, and ran it on top of three C compilers. Each compiler had nasty cases to avoid, and the resulting least common denominator was painfully inept.
-Scott David Daniels

From guido at python.org Wed Oct 8 11:30:00 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 11:30:52 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: Your message of "Wed, 08 Oct 2003 16:52:09 +0200." <20031008145209.GB5352@nl.linux.org> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> <20031008145209.GB5352@nl.linux.org> Message-ID: <200310081530.h98FU0K23484@12-236-54-216.client.attbi.com>

> Would it be possible to have this code at IDE-level? E.g., is possible
> for Idle to catch TypeError's and try to find out whether this is about
> the number of arguments to a callable, and if so, try to find out whether
> it is about a method or a function? This is of course a lot of overhead,
> but since it is only for an interactive session, I think this is not a big
> problem, or am I mistaken here?

It could be done, probably with 99% reliability. But there are many IDEs out there, and many people don't run their code under an IDE at all, so it would be much preferred to do it in the VM.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python-kbutler at sabaydi.com Wed Oct 8 11:34:54 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 8 11:35:07 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: References: Message-ID: <3F842E9E.3000600@sabaydi.com>

> From: Nick Coghlan
>
> Given this, perhaps a simple addition to the error string might be enough to
> help reduce confusion:

Agreed.

> -------------
> TypeError: foo() takes exactly 1 argument (2 given). (Note: For bound methods,
> the argument count includes the object the method is bound to)
> -------------

Agreed, but then the newbies will wonder what bound methods are, and to what a method would be bound.
Shorter and easier for the uninitiated:

TypeError: foo() takes exactly 1 argument (2 given). Counts may include 'self'.

kb

From mal at lemburg.com Wed Oct 8 12:32:37 2003 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed Oct 8 12:32:43 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031004234000.GG25813@ics.uci.edu> References: <20031004234000.GG25813@ics.uci.edu> Message-ID: <3F843C25.8090808@lemburg.com>

Christian Stork wrote:
> The examples given in itertools' documentation are a good starting
> point. More specifically I'm talking about the following:
>
> def all(pred, seq):
>     "Returns True if pred(x) is True for every element in the iterable"
>     return False not in imap(pred, seq)
>
> def some(pred, seq):
>     "Returns True if pred(x) is True at least one element in the iterable"
>     return True in imap(pred, seq)
>
> def no(pred, seq):
>     "Returns True if pred(x) is False for every element in the iterable"
>     return True not in imap(pred, seq)

FYI, similar APIs have been part of mxTools for years, except that they are called exists() and forall() (the terms used in math for these things), plus there are a few more:

count(condition,sequence)
    Counts the number of objects in sequence for which condition returns true and returns the result as integer. condition must be a callable object.

exists(condition,sequence)
    Return 1 if and only if condition is true for at least one of the items in sequence and 0 otherwise. condition must be a callable object.

forall(condition,sequence)
    Return 1 if and only if condition is true for all of the items in sequence and 0 otherwise. condition must be a callable object.

index(condition,sequence)
    Return the index of the first item for which condition is true. A ValueError is raised in case no item is found. condition must be a callable object.
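The semantics described above are easy to model in pure Python. These are illustrative equivalents written with the modern built-ins any() and all() — not the mxTools implementation, and they return True/False rather than 1/0:

```python
def count(condition, sequence):
    """Number of items for which condition(item) is true."""
    return sum(1 for item in sequence if condition(item))

def exists(condition, sequence):
    """True iff condition holds for at least one item."""
    return any(condition(item) for item in sequence)

def forall(condition, sequence):
    """True iff condition holds for every item."""
    return all(condition(item) for item in sequence)

def index(condition, sequence):
    """Index of the first item satisfying condition; ValueError if none."""
    for i, item in enumerate(sequence):
        if condition(item):
            return i
    raise ValueError("no item satisfies the condition")

is_even = lambda n: n % 2 == 0
assert count(is_even, [1, 2, 3, 4]) == 2
assert exists(is_even, [1, 3, 4])
assert not forall(is_even, [1, 3, 4])
assert index(is_even, [1, 3, 4]) == 2
```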
Note that the signatures are similar to those of map(), filter() and reduce() and they do truth checking rather than comparing True/False to the result value, which is useful sometimes. mxTools currently does not support iterable objects for sequence, but that should be easy to add. More's here: http://www.egenix.com/files/python/mxTools.html

-- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, Oct 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ianb at colorstudy.com Wed Oct 8 12:33:37 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 8 12:33:43 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <20031008145209.GB5352@nl.linux.org> Message-ID: <2AA75D24-F9AD-11D7-B53E-000393C2D67E@colorstudy.com>

On Wednesday, October 8, 2003, at 09:52 AM, Gerrit Holl wrote:
> [I'm not a regular poster, so I'll introduce myself shortly: I am a
> first-year Physics student without CS knowledge, have learned
> programming with Python a few years ago]
>
> Gregory Smith wrote:
> >> At the time the TypeError is constructed it shouldn't add serious
> >> overhead to check if its a method or a function and subtract 1
> >> accordingly.
>
> Guido van Rossum replied:
> > You'd think so, eh? Have you looked at the code? Have you tried to
> > come up with a patch? Why do you think that in 13 years this hasn't
> > been fixed if it's such a common complaint?
>
> Would it be possible to have this code at IDE-level? E.g., is possible
> for Idle to catch TypeError's and try to find out whether this is about
> the number of arguments to a callable, and if so, try to find out whether
> it is about a method or a function? This is of course a lot of overhead,
> but since it is only for an interactive session, I think this is not a big
> problem, or am I mistaken here?

Or more generally, what if we just add more helpful information to tracebacks? If we care about the particulars of the message, it is always in the context of a traceback. And we don't care about the efficiency of tracebacks. What if, say, exceptions had a method strfortraceback(tb), which was smarter when that would be helpful? Like the code you have here, only as a method of TypeError (or some subclass)...

> Something like:
>
> except TypeError, msg:
>     if "takes exactly" in msg[0]: # something with tb_lasti?
>         name = msg[0].split('(')[0]
>         typ, val, tb = sys.exc_info()
>         if name in tb.tb_frame.f_locals.keys():
>             if 'instancemethod' in type(tb.tb_frame.f_locals[name]):
>                 # subtract 1
>             else:
>                 # don't subtract 1
>         else:
>             # hmm, if it is a method, how do we find it?
>             # etc.
>     else:
>         raise

-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org

From greg at electricrain.com Wed Oct 8 14:07:30 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 8 14:07:46 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <20031008180730.GB15666@zot.electricrain.com>

> I'm not arguing against fixing this (I think it would be great) but
> the number of people who've implied that this should be an easy thing
> to fix annoys me.
>
> For better or for worse, the distinction between a function and a
> bound method is gone by the time it's called, and recovering that
> difference is going to be tough. Not in terms of serious overhead,
> but in terms of serious changes to code that is already extremely
> subtle.
> That code is so subtle *because* we want to keep function
> call overhead as low as possible, and anything that would add even a
> fraction of a microsecond to the cost of calling a function with the
> correct number of arguments will be scrutinized to death.

Agreed. I just looked at the code to see why.
Its much more > difficult than I imagined (except in one easy looking case in ceval.c). > > For anyone who hasn't read the code, the Python/getargs.c vgetargs1() > function that parses the argument description string has no knowledge > of the PyCFunction object its checking arguments for. Major restruring > to do this could be done several ways but is a huge task for speed and > C interface compatibility reasons. Um, when is this a problem for methods implemented in C? AFAIK the problem only exists for Python methods: take e.g. append() as an example of a C method, and everything is fine: >>> [].append(1,2 ) Traceback (most recent call last): File "", line 1, in ? TypeError: append() takes exactly one argument (2 given) >>> The issue is really in ceval.c... --Guido van Rossum (home page: http://www.python.org/~guido/) From gminick at hacker.pl Wed Oct 8 15:42:18 2003 From: gminick at hacker.pl (gminick) Date: Wed Oct 8 15:47:25 2003 Subject: [Python-Dev] obj.__contains__() returns 1/0... Message-ID: <20031008194218.GA17069@hannibal> ...shouldn't it return True/False? examples: >>> a = 'python' >>> a.__contains__('perl') 0 >>> a.__contains__('python') 1 >>> a = { 'python' : 1, 'ruby' : 1 } >>> a.__contains__('perl') 0 >>> a.__contains__('python') 1 >>> instead of: >>> a = 'python' >>> a.__contains__('perl') False >>> a.__contains__('python') True >>> a = { 'python' : 1, 'ruby' : 1 } >>> a.__contains__('perl') False >>> a.__contains__('python') True >>> The reason for asking is that i.e. obj.__eq__() returns True/False, and besides True/False looks nicer than 1/0 ;> ps. I'll send a patch to sf.net in a matter of minutes. Please, decide if it should be applied, thanks. -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." 
] From greg at cosc.canterbury.ac.nz Wed Oct 8 22:29:09 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 22:29:35 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <20031008080302.GA15666@zot.electricrain.com> Message-ID: <200310090229.h992T9m20070@oma.cosc.canterbury.ac.nz> "Gregory P. Smith" : > At the time the TypeError is constructed it shouldn't add serious overhead > to check if its a method or a function and subtract 1 accordingly. Except that by the time the error is detected, we've lost track of whether it's a method or not. Maybe a heuristic could be applied, e.g. if the first parameter is called 'self', say something like "foo() takes exactly 1 argument (excluding 'self'), 0 given". Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at comcast.net Wed Oct 8 22:41:35 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 22:41:41 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <3F842CB3.1000400@Acm.Org> Message-ID: [Scott David Daniels] > setjmp/longjmp are nightmares for compiler writers. I was one for 15 years, so that won't scare me off . > The writers tend to turn off optimizations around them and/or get > corner cases wrong. > ... They do. I would aim for a tiny total number of setjmp and longjmp calls, inside very simple functions. So, e.g., a routine that wanted to die with an error wouldn't call longjmp directly, it would call a common utility function containing the longjmp. The latter function simply wouldn't return. Optimizations short of interprocedural analysis aren't harmed then in the calling function, because nothing in *that* is the target, or direct source, of a non-local goto. 
Last I looked, the Perl source seemed to do such a thing in places, and that's about as widely ported as Python. It struck me with force when I was looking at Perl's version of an adaptive mergesort last year, and got jealous of how much shorter and clearer the C code could be when every stinkin' call didn't have to be followed by an error test-and-branch. The Python sort code hides most of that syntactically via macros, but the runtime cost is always there. In real life, not one sort in a million actually raises an exception, so executing O(N log N) test-branch blocks per sort has astronomically low bang for the buck. In cases like that (which are common), it doesn't matter how slow actually raising an exception would be; it's not even tempting to put the longjmp calls inline. Whatever, I'll never have time to pursue it, so screw it . From python at rcn.com Wed Oct 8 22:53:06 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 8 22:53:40 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Message-ID: <000001c38e10$77e31ea0$e841fea9@oemcomputer> if (res == -1 && PyErr_Occurred()) return NULL; ! return PyInt_FromLong((long)res); } --- 3577,3583 ---- if (res == -1 && PyErr_Occurred()) return NULL; ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? Py_True : Py_False; The line above leaks and does unnecessary work. I believe it should read: ret = res ? Py_True : Py_False; Also, there is another one of these in Objects/descrobject.c line 712. 
Raymond Hettinger From greg at cosc.canterbury.ac.nz Wed Oct 8 23:21:34 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 23:22:12 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: Message-ID: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> > It struck me with force when I was looking at Perl's version of an > adaptive mergesort last year, and got jealous of how much shorter and > clearer the C code could be when every stinkin' call didn't have to be > followed by an error test-and-branch. Rewrite they Python core in Pyrex. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at comcast.net Wed Oct 8 23:40:05 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 23:40:14 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Rewrite they Python core in Pyrex. And steal the glory from you? No way. Whip up a patch, and I'll assign it to Guido . From guido at python.org Wed Oct 8 23:45:45 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 23:46:06 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Wed, 08 Oct 2003 22:53:06 EDT." <000001c38e10$77e31ea0$e841fea9@oemcomputer> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> Message-ID: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> > if (res == -1 && PyErr_Occurred()) > return NULL; > ! return PyInt_FromLong((long)res); > } > > --- 3577,3583 ---- > if (res == -1 && PyErr_Occurred()) > return NULL; > ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? 
Py_True : > Py_False; > > > The line above leaks and does unnecessary work. I believe it should > read: > > ret = res ? Py_True : Py_False; Ai. I did the review while only half awake. :-) But the correct thing to do is to use PyBool_FromLong(res); there's really no need to inline what that function does. > Also, there is another one of these in Objects/descrobject.c line 712. I'll fix that one while I'm at it. BTW, I notice there are a bunch of uses of PyBool_FromLong() that are preceded by something like "if (res < 0) return NULL;" (or "!= -1"). Maybe PyBool_FromLong() itself could make this unneeded by adding something like if (ok < 0 && PyErr_Occurred()) return NULL; to its start? And, while we're reviewing usage patterns of PyBool_FromLong(), the string and unicode types are full of places where it is called by a return statement with a constant 1 or 0 as argument. This seems wasteful to me; I imagine that Py_INCREF(Py_True); return Py_True; takes less time than return PyBool_FromLong(1); Maybe a pair of macros Py_return_True and Py_return_False would make sense? --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Thu Oct 9 00:14:48 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 9 00:14:47 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <03a601c38e1b$e2182440$f502a8c0@eden> > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? Include Py_return_None, and a solid +1 from me (even if that isn't how I would spell it .) Mark. 
From greg at cosc.canterbury.ac.nz Thu Oct 9 00:17:55 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 9 00:18:58 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> Guido van Rossum : > Maybe PyBool_FromLong() itself could make this unneeded by adding > something like > > if (ok < 0 && PyErr_Occurred()) > return NULL; > > to its start? Not sure if it would be a good idea to encourage reliance on one API function doing error checking on behalf of others. I can see someone coming along later and adding some code in between whatever returned the result and the PyBool_FromLong call, not realising that doing so would upset the error checking. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Oct 9 00:40:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 9 00:40:31 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > Maybe PyBool_FromLong() itself could make this unneeded by adding > something like > > if (ok < 0 && PyErr_Occurred()) > return NULL; > > to its start? That would an incompatible change. I would expect PyBool_FromLong(i) do the same thing as bool(i). > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? You should, of course, add Py_return_None to it, as well. 
Then you will find that some contributor goes on a crusade to use these throughout very quickly :-) Regards, Martin From guido at python.org Thu Oct 9 00:43:20 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 00:43:39 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Thu, 09 Oct 2003 17:17:55 +1300." <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> References: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> Message-ID: <200310090443.h994hKM00786@12-236-54-216.client.attbi.com> > Guido van Rossum : > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > something like > > > > if (ok < 0 && PyErr_Occurred()) > > return NULL; > > > > to its start? [Greg Ewing] > Not sure if it would be a good idea to encourage reliance > on one API function doing error checking on behalf of others. Well, most functions in the abstract.c file already do this. And it would actually *catch* bugs -- in fact, the one that Raymond found in descrobject.c originally had return PyInt_FromLong(PySequence_Contains(pp->dict, key)); which was not checking for errors from PySequence_Contains(). > I can see someone coming along later and adding some code > in between whatever returned the result and the PyBool_FromLong > call, not realising that doing so would upset the error > checking. Well, they would have to miss two clues: the documented behavior of PyBool_FromLong() and the fact that whatever produced the value passed in could fail. I'm not sure if that's a big worry, especially since this is typically in dead-simple code. OTOH, explicit is better than implicit. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 9 00:44:25 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 00:44:42 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Thu, 09 Oct 2003 14:14:48 +1000." <03a601c38e1b$e2182440$f502a8c0@eden> References: <03a601c38e1b$e2182440$f502a8c0@eden> Message-ID: <200310090444.h994iPW00807@12-236-54-216.client.attbi.com> > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? > > Include Py_return_None, and a solid +1 from me (even if that isn't how I > would spell it .) How would you spell it? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 9 01:03:03 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 01:03:33 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "09 Oct 2003 06:40:30 +0200." References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <200310090503.h99533G00867@12-236-54-216.client.attbi.com> > Guido van Rossum writes: > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > something like > > > > if (ok < 0 && PyErr_Occurred()) > > return NULL; > > > > to its start? [MvL] > That would an incompatible change. I would expect PyBool_FromLong(i) > do the same thing as bool(i). Well, it still does, *except* if you have a pending exception. IMO what happens when you make a Python API call while an exception is pending is pretty underspecified, so it's doubtful whether this incompatibility matters. > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? > > You should, of course, add Py_return_None to it, as well. 
> > Then you will find that some contributor goes on a crusade to use > these throughout very quickly :-) There's the minor issue of how to spell it (Mark Hammond may have a different suggestion) but that certain contributor has my approval once we get the spelling agreed upon. --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Thu Oct 9 01:21:12 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 9 01:21:51 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <200310090444.h994iPW00807@12-236-54-216.client.attbi.com> Message-ID: <03dc01c38e25$27ba49c0$f502a8c0@eden> > > > Maybe a pair of macros Py_return_True and Py_return_False > would make > > > sense? > > > > Include Py_return_None, and a solid +1 from me (even if > that isn't how I > > would spell it .) > > How would you spell it? For some reason, I am somewhat conditioned to macros with all caps. So I would personally go for Py_RETURN_NONE/TRUE/FALSE But have no objection to any reasonable spelling. Mark. From gminick at hacker.pl Thu Oct 9 01:41:16 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 01:40:20 2003 Subject: [Python-Dev] obj.__contains__() returns 1/0... In-Reply-To: <20031008194218.GA17069@hannibal> References: <20031008194218.GA17069@hannibal> Message-ID: <20031009054116.GA232@hannibal> On Wed, Oct 08, 2003 at 09:42:18PM +0200, gminick wrote: > ...shouldn't it return True/False? [...] > The reason for asking is that i.e. obj.__eq__() returns True/False, > and besides True/False looks nicer than 1/0 ;> > > ps. I'll send a patch to sf.net in a matter of minutes. > Please, decide if it should be applied, thanks. Ok, no need to discuss: Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Wojtek Walczak (gminick) Assigned to: Nobody/Anonymous (nobody) Summary: obj.__contains__() returns 1/0... 
-- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." ] From gminick at hacker.pl Thu Oct 9 02:28:31 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 02:27:30 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <000001c38e10$77e31ea0$e841fea9@oemcomputer> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> Message-ID: <20031009062831.GA274@hannibal> On Wed, Oct 08, 2003 at 10:53:06PM -0400, Raymond Hettinger wrote: > if (res == -1 && PyErr_Occurred()) > return NULL; > ! return PyInt_FromLong((long)res); > } > > --- 3577,3583 ---- > if (res == -1 && PyErr_Occurred()) > return NULL; > ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? Py_True : > Py_False; > > > The line above leaks and does unnecessary work. I believe it should > read: > > ret = res ? Py_True : Py_False; PyInt_FromLong() returns PyObject, so you need to use PyObject_IsTrue() (the way I did) or hack the code not to use PyInt_FromLong(). I used PyInt_FromLong() because it was there before. Original code: res = (*func)(self, value); if (res == -1 && PyErr_Occurred()) return NULL; return PyInt_FromLong((long)res); } If you're sure it isn't needed, then of course we can use the easier way changing the snippet above into: res = (*func)(self, value); if (res == -1 && PyErr_Occurred()) return NULL; ret = res ? Py_True : Py_False; Py_INCREF(ret); return ret; } So, why was there PyInt_FromLong()? :> -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." 
] From mwh at python.net Thu Oct 9 05:49:23 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 9 05:48:39 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> (Greg Ewing's message of "Thu, 09 Oct 2003 16:21:34 +1300 (NZDT)") References: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> Message-ID: <2mbrsqyc4c.fsf@starship.python.net> Greg Ewing writes: >> It struck me with force when I was looking at Perl's version of an >> adaptive mergesort last year, and got jealous of how much shorter and >> clearer the C code could be when every stinkin' call didn't have to be >> followed by an error test-and-branch. > > Rewrite they Python core in Pyrex. That wouldn't alleviate the runtime cost, would it? Maybe one day this sort of fundamental rearrangement will be easier to play with, thanks to PyPy. Cheers, mwh (who has some similarly impractical ideas about memory management...) -- If i don't understand lisp, it would be wise to not bray about how lisp is stupid or otherwise criticize, because my stupidity would be archived and open for all in the know to see. -- Xah, comp.lang.lisp From python at rcn.com Thu Oct 9 11:16:42 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 9 11:21:58 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <20031009062831.GA274@hannibal> Message-ID: <000001c38e78$58ec3280$e841fea9@oemcomputer> [Wojtek Walczak] > PyInt_FromLong() returns PyObject, so you need to use PyObject_IsTrue() > (the way I did) or hack the code not to use PyInt_FromLong(). I used > PyInt_FromLong() because it was there before. 
Original code: > > res = (*func)(self, value); > if (res == -1 && PyErr_Occurred()) > return NULL; > return PyInt_FromLong((long)res); > } > > If you're sure it isn't needed, then of course we can use the easier way > changing the snippet above into: > > res = (*func)(self, value); > if (res == -1 && PyErr_Occurred()) > return NULL; > ret = res ? Py_True : Py_False; > Py_INCREF(ret); > return ret; > } > > So, why was there PyInt_FromLong()? :> obj.__contains__() returns a python object. "res" is a C numeric object. So, PyInt_FromLong() was needed to change it from a C long into a PyObject * to a Python integer (either 0 or 1). Wrapping that return value in Py_ObjectIsTrue() does successfully convert the Python integer into a Python bool. One issue with the way you wrote it is that both PyInt_FromLong() and Py_ObjectIsTrue() create new Python objects but only one of them is returned. The other needs to have its reference count lowered by one so that obj.__contains__() won't leak. The other issue is that it wasn't necessary to create an intermediate PyInt value. Instead, the PyBool can be created directly from "res" using PyBool_FromLong() or the equivalent: ret = res ? 
Py_True : Py_False; Py_INCREF(ret); return ret; Looking at the functions signatures may make it more clear: wrap_objobjproc: argstuple --> PyObject* PyInt_FromLong: long --> PyObject* PyObject_IsTrue: PyObject* --> PyObject* PyBool_FromLong: long --> PyObject* Hope this helps, Raymond Hettinger From tjreedy at udel.edu Thu Oct 9 12:58:40 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 9 12:58:45 2003 Subject: [Python-Dev] Re: RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 References: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> <200310090443.h994hKM00786@12-236-54-216.client.attbi.com> Message-ID: > > Guido van Rossum : > > > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > > something like > > > > > > if (ok < 0 && PyErr_Occurred()) > > > return NULL; > > > > > > to its start? > > [Greg Ewing] > > I can see someone coming along later and adding some code > > in between whatever returned the result and the PyBool_FromLong > > call, not realising that doing so would upset the error > > checking. My C is a bit rusty (from being swallowed by a Python)... but in the particular snippet being discussed, it seems that incorporating the error check in PyBool... would eliminate the need for the temporary res variable, so that all can be written as PyBool_FromLong( (*func)(self, value)); /* is (long) cast needed? */ leaving very little 'in between' space in which to insert upsetting code. I have no idea how well this generalizes to other prospective uses. Terry J. 
Reedy From gminick at hacker.pl Thu Oct 9 15:42:30 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 15:41:45 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <000001c38e78$58ec3280$e841fea9@oemcomputer> References: <20031009062831.GA274@hannibal> <000001c38e78$58ec3280$e841fea9@oemcomputer> Message-ID: <20031009194229.GA250@hannibal> On Thu, Oct 09, 2003 at 11:16:42AM -0400, Raymond Hettinger wrote: [...] > One issue with the way you wrote it is that both PyInt_FromLong() and > Py_ObjectIsTrue() create new Python objects but only one of them is > returned. The other needs to have its reference count lowered by one so > that obj.__contains__() won't leak. While everything else was clear before, the text above is a nice reminder, some kind of warning, how to code better. Thank you. -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." ] From greg at cosc.canterbury.ac.nz Thu Oct 9 19:45:27 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 9 19:46:11 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <2mbrsqyc4c.fsf@starship.python.net> Message-ID: <200310092345.h99NjRt13077@oma.cosc.canterbury.ac.nz> Michael Hudson : > > Rewrite they Python core in Pyrex. > > That wouldn't alleviate the runtime cost, would it? No, but it would save having to write all the refcounting and error checking code by hand, with attendant abundancy of opportunities for getting it wrong... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Fri Oct 10 02:17:50 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Oct 10 02:17:59 2003 Subject: [Python-Dev] python-dev Summary for 2003-09-16 through 2003-09-30 [draft] Message-ID: <3F864F0E.2070407@ocf.berkeley.edu> Here is everyone's chance to show why Cal Poly should flunk me on the writing proficiency test I have to take this Saturday to prove I can write at a college graduate level. I will probably send the final vesion of this summary on Sunday so you have at least until then to make any corrections and such. And a head's up: I managed to write that guide to Python development but I need to do a quick proof-read (yes, I am actually going to proof-read something for once) and get one other person to take a quick look at it before I post it here to be checked. But it is coming and will be here before December. =) ------------------------------- python-dev Summary for 2003-09-16 through 2003-09-30 ++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from September 16, 2003 through September 30, 2003. It is intended to inform the wider Python community of on-going developments on the list. To comment on anything mentioned here, just post to `comp.lang.python`_ (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the twenty-sixth summary written by Brett Cannon (homework, the Summaries, how does he find the time?). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . 
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`_ and gives some perks for the HTML output. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. .. _PEP Markup: http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ and should be used when looking up any documentation on something mentioned here. PEPs (Python Enhancement Proposals) are located at http://www.python.org/peps/ . To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . Reported bugs and suggested patches can be found at the SourceForge_ project page. .. _python-dev: http://www.python.org/dev/ .. _SourceForge: http://sourceforge.net/tracker/?group_id=5470 .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-09-01_2003-09-15.html ===================== Summary Announcements ===================== First, sorry about the lateness of this summary. I have started my first quarter at `Cal Poly SLO`_. Not only do I get to deal with being back in school for the first time in over a year, but I also get to be abruptly introduced to the quarter system. Joys abound for me. I am still reworking how I manage my time and the Summaries were the first thing to take a back seat. Hopefully this won't happen again. 
In case you have not been following general Python news, `Python 2.3.2`_ is now the newest release of Python. In case you missed the Python 2.3.1 release, then you missed the little hiccup in that release, which is fine. The Python 2.3.2 release does not technically fall under the jurisdiction of this summary, but I am not going to wait half a month to let people know about it. .. _Cal Poly SLO: http://www.calpoly.edu/ .. _Python 2.3.2: http://www.python.org/2.3.2/ ========= Summaries ========= ---------------------------------------------------------- Deprecations won't spontaneously appear in a micro release ---------------------------------------------------------- In case you don't know, sets.BaseSet.update() has been deprecated in favor of union_update() in order to cut out the unneeded duplication of functionality in Python 2.4 . While 2.3.1 was still under development it grew a PendingDeprecationWarning. This did not sit well with some people. The argument for the PendingDeprecationWarning was that it is silent by default and gives people a heads-up in terms of things that are known to be deprecated in the next minor version of Python. Against this idea, the argument that it adds a change between micro versions that is not a bug fix was raised. In the end this won. Contributing threads: - `pending deprecation warning for Set.update `__ ------------------------------ Web-SIG on its way, supposedly ------------------------------ Bill Janssen is working on a charter so a Web SIG_ can be started in order to redesign the cgi module as the main goal, but also just making Python friendlier to web coding in general. .. 
_SIG: http://www.python.org/community/sigs.html Contributing threads: - `Improving the CGI module `__ ------------------------------------------- Threads and the desolation that is shutdown ------------------------------------------- Tim Peters decided to try to deal with the fact that the Zope 3 testing suite was spitting out a ton of messages about unhandled exceptions during shutdown of the interpreter. It turned out that threads were still running during shutdown and thus were throwing a fit because they were accessing module globals that were being torn down and set to None. The problem went away when the second call to PyGC_Collect() in Py_Finalize() was commented out. This is not totally acceptable since the second call is there to help collect garbage at shutdown so that things clean up properly. Tim did end up suggesting just taking it out, though, for a future version of Python. He also suggested tearing down the sys module even later (and thus "even more of a special case than it is now"). This would leave sys.modules around and thus not cause globals to turn to None and cause errors from that side-effect. Neither solution has been taken yet. A temporary solution if you keep running into this is to make sure that either your cleanup code only accesses local variables (if you have to store references to globals since that will keep them around for you during shutdown). Contributing threads: - `Fun with 2.3 shutdown `__ ---------------------- Where is str.rsplit?!? ---------------------- The reason str.rsplit does not exist in Python is because the method is not difficult to code on your own. And yet people still want it. But there was not of a public outcry and the topic just fizzled. Contributing threads: - `Discussion on adding rsplit() for strings and unicode objects. 
`__ ----------------- Waxing on PEP 310 ----------------- Holger Krekel brought up PEP 310 (entitled "Reliable Acquisition/Release Pairs") in terms of how code blocks should handle exceptions and such. Michael Hudson suggested that might be taking PEP 310 beyond what it is meant to cover. To this, Holger suggested that then perhaps some other route should be taken. As with all PEPs, discussion of them is always helpful for python-dev and the community. It helps hash out ideas and gives python-dev feedback on whether a PEP should be rejected. Contributing threads: ` pep 310 (reliable acquisition/release pairs) `__ ------------------------------------------------------------ bsddb3 failures and the database system it wraps, news at 10 ------------------------------------------------------------ The bsddb3 regression tests were failing during preparation for Python 2.3.1 . Beyond the "the test just fails sometimes" issues that come up with tests that are finicky because of timing, it was suggested that the failures are the fault of the Sleepycat_ DB code. It is still being looked into. .. _Sleepycat: http://www.sleepycat.com/ Contributing threads: - `latest bsddb3 test problems `__ ---------------------------------------------------- We want *you* to help with the war on SF patch items ---------------------------------------------------- Someone wanted to help but wasn't sure how they could. Martin v. L?wis sent an email listing common things anyone can do to help with dealing with the patch items on SourceForge_. The email can be found at http://mail.python.org/pipermail/python-dev/2003-September/038253.html . Contributing threads: - `Help offered `__ --------------- Python glossary --------------- Skip Montanaro converted the glossary he has as a wiki at http://manatee.mojam.com/python-glossary to the proper format to be included in the Python documentation. 
You can peruse the glossary as it stands in the documentation at http://www.python.org/dev/doc/devel/tut/node16.html. Thanks to Skip for doing the grunt work and getting this done. If you wish to help, please visit the wiki and add/edit/whatever. Contributing threads: - `Python Glossary `__ ---------------------------------- Mitch Kapor to speak at PyCon 2004 ---------------------------------- Mitch Kapor is founder of the `Open Source Application Foundation`_ (OSAF), co-founder of the `Electronic Frontier Foundation`_, and developer of Chandler_. He is going to be the keynote speaker at `PyCon 2004`_. The general `Call for Papers`_ has gone out. If you have any desire to speak at PyCon, take a look at the CFP. .. _PyCon 2004: http://www.python.org/pycon/dc2004/ .. _Open Source Application Foundation: http://www.osafoundation.org/ .. _Electronic Frontier Foundation: http://www.eff.org/ .. _Chandler: http://www.osafoundation.org/Chandler-Product_FAQ.htm .. _Call for Papers: http://www.python.org/pycon/dc2004/cfp.html ----------------------------------------------------- Python 2.3.1 released, people were happy... initially ----------------------------------------------------- Python 2.3.1 was released to the general public. It was meant to be a bug-fix release to fix bugs that were discovered after Python 2.3 went out the door. But then a typo in the configure.in script that prevented os.fsync() from ever being included was discovered. A rather vocal group of users of this function got out their pitchforks and torches while screaming, "blood, blood!" (actually they were nice about it, but saying, "they kindly asked for a new release," isn't that dramatic, is it?) How were the rioting masses (who were actually not rioting) appeased? Contributing threads: - `2.3.1 is (almost) a go `__ - `RELEASED Python 2.3.1 `__ - `How to test for stuff like fsync? 
`__ ---------------------------------------------- Let them eat cake while releasing Python 2.3.2 ---------------------------------------------- Python 2.3.2 was released to deal with the os.fsync() snafu. HP/UX compiling issues were also addressed. The bsddb3 problems are still there, but it is becoming more and more certain that the issues are with Sleepycat and not the bsddb module. Contributing threads: - `plans for 2.3.2 `__ - `Python2.3.2 and release23-maint branch `__ - `2.3.2 and bsddb `__ - `RELEASED Python 2.3.2, release candidate 1 `__ - `OpenSSL vulnerability `__ - `RELEASED Python 2.3.2 (final) `__ From zeddicus at satokar.com Fri Oct 10 07:37:21 2003 From: zeddicus at satokar.com (Michael Bartl) Date: Fri Oct 10 07:39:31 2003 Subject: [Python-Dev] Patches & Bug help revisited Message-ID: <20031010113721.GA4148@satokar.com> Hi! I found myself mentioned in the summary so I thought I'd drop a line again. After my initial offer to help I started with reviewing and writing (very simple) patches. To be more explicit: patch 813200, patch 810914, bug 810408, 811082, (can't remember the rest from memory, can provide a better list later). I was quite wondering that no-one seemed to have a look at the patches, but thought that this might be due to the pressing 2.3.2 release. I find patch writing rather unsatisfying if they aren't applied (or rejected :) btw: Is there any possibility to search through the bugs/patches? I couldn't find it and see the sf.net capabilities as rather limited. Have fun, Michael From theller at python.net Fri Oct 10 10:39:48 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 10 10:39:56 2003 Subject: [Python-Dev] buildin vs. shared modules Message-ID: What is the rationale to decide whether a module is builtin or an extension module in core Python (I only care about Windows)? 
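Whether a particular module ended up builtin or shared in a given build can be checked from Python itself; the following is a minimal sketch (the modules probed here are only examples, and which ones show up as builtin varies by platform and build options):

```python
import sys

# Modules compiled directly into the interpreter binary are listed in
# sys.builtin_module_names; they have no __file__ attribute.
print(sorted(sys.builtin_module_names))

# An extension module, by contrast, lives in a separate .pyd/.so file
# and reports its path via __file__.
import zlib  # a shared extension on most builds, builtin on some
print(getattr(zlib, '__file__', '(builtin: no file on disk)'))
```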
To give examples, could zlib be made into a builtin module (because it's useful for zipimport), _sre (because it's used by warnings), or are there reasons preventing this? Thomas From theshadow at shambala.net Fri Oct 10 11:25:25 2003 From: theshadow at shambala.net (John Hoffman) Date: Fri Oct 10 11:25:38 2003 Subject: [Python-Dev] IPv6 in Windows binary distro Message-ID: <3F86CF65.1000401@shambala.net> Hello... Could you please tell me why IPv6 support isn't present in the 2.3.1 and 2.3.2 Windows binary releases? Is it broken for Windows? If not, I'd really appreciate if someone could make a new build for me... Thanks. From wtrenker at hotmail.com Fri Oct 10 04:38:51 2003 From: wtrenker at hotmail.com (William Trenker) Date: Fri Oct 10 11:42:16 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010113721.GA4148@satokar.com> References: <20031010113721.GA4148@satokar.com> Message-ID: <20031010083851.6c238978.wtrenker@hotmail.com> On Fri, 10 Oct 2003 13:37:21 +0200 Michael Bartl wrote regarding [Python-Dev] Patches & Bug help revisited: > btw: Is there any possibility to search through the bugs/patches. I > couldn't find it and see the sf.net capabilities as rather limited. If you go to the patches page, on the left-hand side you will see a search box right under the Sourceforge logo. The drop-down list will say "Patches". In the text box under that, type in your search text and click on the search button. You can do the same thing on the bugs page. Of course the search drop-down on the bugs page will have the word "Bugs" in it. Hope this is helpful. 
Bill From zeddicus at satokar.com Fri Oct 10 11:51:16 2003 From: zeddicus at satokar.com (Michael Bartl) Date: Fri Oct 10 11:51:20 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010083851.6c238978.wtrenker@hotmail.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> Message-ID: <20031010155116.GA5520@satokar.com> On Fri, Oct 10, 2003 at 08:38:51AM +0000, William Trenker wrote: > On Fri, 10 Oct 2003 13:37:21 +0200 > Michael Bartl wrote regarding [Python-Dev] Patches & Bug help revisited: > > > btw: Is there any possibility to search through the bugs/patches. I > > couldn't find it and see the sf.net capabilities as rather limited. > > If you go to the patches page, on the left-hand side you will see a search box right under the Sourceforge logo. The drop-down list will say "Patches". In the text box under that, type in your search text and click on the search button. > > You can do the same thing on the bugs page. Of course the search drop-down on the bugs page will have the word "Bugs" in it. > > Hope this is helpful. > Bill Indeed it is! I never had a look at this, because I thought it's only for software/people. It's not a full text search which would be nice, but it's a start :) From mwh at python.net Fri Oct 10 11:54:55 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 10 11:54:04 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <3F86CF65.1000401@shambala.net> (John Hoffman's message of "Fri, 10 Oct 2003 09:25:25 -0600") References: <3F86CF65.1000401@shambala.net> Message-ID: <2mad89ulyo.fsf@starship.python.net> John Hoffman writes: > Hello... Could you please tell me why IPv6 support isn't present in > the 2.3.1 and 2.3.2 Windows binary releases? Is it broken for > Windows? If not, I'd really appreciate if someone could make a new > build for me... Thanks. Did the 2.3 builds have IPv6 support? Then this would be a nasty regression. 
However, I *thought* that you had to build with VC++ 7 or higher to get IPv6 support on Windows, and we've never done that. Cheers, mwh (not a windows victim) -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? -- The Hitch-Hikers Guide to the Galaxy, Episode 9 From nas-python at python.ca Fri Oct 10 12:11:16 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 10 12:10:35 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? Message-ID: <20031010161116.GA1238@mems-exchange.org> Python 2.3 seems to be really sluggish starting up. Even on my relatively fast development machine I am getting annoyed when running small scripts. I've only just starting digging into it but I think I've found something interesting. Here's an excerpt of strace output for 2.2: 0.0511 rt_sigaction(SIGRT_30, NULL, {SIG_DFL}, 8) = 0 0.0511 rt_sigaction(SIGRT_31, NULL, {SIG_DFL}, 8) = 0 0.0512 rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0 0.0512 rt_sigaction(SIGINT, {0x4002e610, [], SA_RESTORER, 0x4010e578}, NULL, 8) 0.0513 stat64("/home/nascheme/lib/python", {st_mode=S_IFDIR|0775, st_size=504, 0.0514 stat64("/home/nascheme/lib/python/site", 0xbfffed08) = -1 ENOENT (No suc 0.0515 open("/home/nascheme/lib/python/site.so", O_RDONLY|O_LARGEFILE) = -1 ENO 0.0516 open("/home/nascheme/lib/python/sitemodule.so", O_RDONLY|O_LARGEFILE) = 0.0516 open("/home/nascheme/lib/python/site.py", O_RDONLY|O_LARGEFILE) = -1 ENO 0.0517 open("/home/nascheme/lib/python/site.pyc", O_RDONLY|O_LARGEFILE) = -1 EN 0.0517 stat64("/www/plat/python/lib/python23.zip", 0xbfffe3c4) = -1 ENOENT (No 0.0518 stat64("/www/plat/python/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...} 0.0519 stat64("/www/plat/python/lib/python23.zip/site", 0xbfffed08) = -1 ENOENT 0.0519 open("/www/plat/python/lib/python23.zip/site.so", O_RDONLY|O_LARGEFILE) 0.0523 
open("/www/plat/python/lib/python23.zip/sitemodule.so", O_RDONLY|O_LARGE 0.0538 open("/www/plat/python/lib/python23.zip/site.py", O_RDONLY|O_LARGEFILE) 0.0541 open("/www/plat/python/lib/python23.zip/site.pyc", O_RDONLY|O_LARGEFILE) 0.0556 stat64("/www/plat/python/lib/python2.3/", {st_mode=S_IFDIR|0755, st_size 0.0557 stat64("/www/plat/python/lib/python2.3/site", 0xbfffed08) = -1 ENOENT (N 0.0557 open("/www/plat/python/lib/python2.3/site.so", O_RDONLY|O_LARGEFILE) = - 0.0561 open("/www/plat/python/lib/python2.3/sitemodule.so", O_RDONLY|O_LARGEFIL 0.0575 open("/www/plat/python/lib/python2.3/site.py", O_RDONLY|O_LARGEFILE) = 4 0.0593 fstat64(4, {st_mode=S_IFREG|0644, st_size=11784, ...}) = 0 0.0594 open("/www/plat/python/lib/python2.3/site.pyc", O_RDONLY|O_LARGEFILE) = 0.0611 fstat64(5, {st_mode=S_IFREG|0664, st_size=11417, ...}) = 0 and for 2.3: 0.0521 rt_sigaction(SIGRT_30, NULL, {SIG_DFL}, 8) = 0 0.0521 rt_sigaction(SIGRT_31, NULL, {SIG_DFL}, 8) = 0 0.0522 rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0 0.0522 rt_sigaction(SIGINT, {0x4002e610, [], SA_RESTORER, 0x4010e578}, NULL 0.0524 stat64("/home/nascheme/lib/python", {st_mode=S_IFDIR|0775, st_size=5 0.0551 stat64("/home/nascheme/lib/python/site", 0xbfffed18) = -1 ENOENT (No 0.0716 open("/home/nascheme/lib/python/site.so", O_RDONLY|O_LARGEFILE) = -1 0.0811 open("/home/nascheme/lib/python/sitemodule.so", O_RDONLY|O_LARGEFILE 0.0925 open("/home/nascheme/lib/python/site.py", O_RDONLY|O_LARGEFILE) = -1 0.1079 open("/home/nascheme/lib/python/site.pyc", O_RDONLY|O_LARGEFILE) = - 0.1145 stat64("/www/python/lib/python23.zip", 0xbfffe3d4) = -1 ENOENT (No s 0.1258 stat64("/www/python/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) 0.1260 stat64("/www/python/lib/python23.zip/site", 0xbfffed18) = -1 ENOENT 0.1261 open("/www/python/lib/python23.zip/site.so", O_RDONLY|O_LARGEFILE) = 0.1284 open("/www/python/lib/python23.zip/sitemodule.so", O_RDONLY|O_LARGEF 0.1384 open("/www/python/lib/python23.zip/site.py", 
O_RDONLY|O_LARGEFILE) = 0.1443 open("/www/python/lib/python23.zip/site.pyc", O_RDONLY|O_LARGEFILE) 0.1492 stat64("/www/python/lib/python2.3/", {st_mode=S_IFDIR|0755, st_size= 0.1512 stat64("/www/python/lib/python2.3/site", 0xbfffed18) = -1 ENOENT (No 0.1623 open("/www/python/lib/python2.3/site.so", O_RDONLY|O_LARGEFILE) = -1 0.1756 open("/www/python/lib/python2.3/sitemodule.so", O_RDONLY|O_LARGEFILE 0.1994 open("/www/python/lib/python2.3/site.py", O_RDONLY|O_LARGEFILE) = 4 0.2081 fstat64(4, {st_mode=S_IFREG|0644, st_size=11784, ...}) = 0 0.2083 open("/www/python/lib/python2.3/site.pyc", O_RDONLY|O_LARGEFILE) = 5 0.2222 fstat64(5, {st_mode=S_IFREG|0664, st_size=11417, ...}) = 0 I cut off the long lines since the first column showing time in seconds since startup is the interesting bit. Notice that 2.3 is making a few more system calls due to the zip import feature but it is taking a lot more time to find the 'site' module. I'm going to keep digging but perhaps someone has a theory as to what's going on. Neil From kbg at kadnet.dk Fri Oct 10 12:11:37 2003 From: kbg at kadnet.dk (kasper b. graversen) Date: Fri Oct 10 12:13:02 2003 Subject: [Python-Dev] attaching methods to an object at runtime and compiler enhancement ideas... Message-ID: <200310101811370393.020A912A@lisbeth.kadnet.dom> Hello all. This is my first posting here. My name is Kasper Graversen, a Ph.D. student at the IT University of Copenhagen. I'm playing with python for doing roles, that is, runtime specialization per object with the ability of multiple views on each object. So far it has been fun playing with python, but I ponder why it is only possible to introduce functions and not methods to object instances? 
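An instance-specific bound method can in fact be created by binding a function to one object by hand; a minimal sketch using the types module of modern Python (in Python 2.3 the same effect was available via new.instancemethod; the Person/greet names are only illustrative):

```python
import types

class Person(object):
    def __init__(self, name):
        self.name = name

def greet(self):
    return "hi, I am %s" % self.name

p = Person("kasper")
# Bind greet to this single instance; the Person class and every
# other instance are left untouched.
p.greet = types.MethodType(greet, p)
print(p.greet())  # greet now receives p as 'self'
```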
I am also wondering if it is possible to change the parsed code at compile time by gaining access to the AST and by the use of some mechanism (somewhat similar to meta classes) to be able to patch in before the execution of the code. Finally, one of the most difficult things of moving from Java to python is the lack of checking done by the compiler ;) Here are two things I really miss which I would like future versions of the compiler to support: * A flag, when set, checks that each __init__ method calls its super __init__ * A flag, when set, checks that an inner class in a subclass by the same name of its super's inner class subclasses this class. Oddly only methods and not also inner classes are virtual by default in Python. eg

    class A(object):
        class B(object): ...

    class C(A):
        class B(object): ...

should raise an error since C.B should extend A.B * A flag, when set, raises an error if a field is introduced in code outside the __init__() block. This ensures that spelling mistakes are caught at compile time, when misspelling the field to be accessed. sincerely Kasper B. Graversen please help save our planet! At least click daily on http://rainforest.care2.com/ and http://www.therainforestsite.com/ and tell your friends to do the same... From nas-python at python.ca Fri Oct 10 12:25:46 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 10 12:25:03 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? In-Reply-To: <20031010161116.GA1238@mems-exchange.org> References: <20031010161116.GA1238@mems-exchange.org> Message-ID: <20031010162546.GA1319@mems-exchange.org> On Fri, Oct 10, 2003 at 09:11:16AM -0700, Neil Schemenauer wrote: >Here's an excerpt of strace output for 2.2: Argh, please ignore that garbage. I was using the wrong binary for Python 2.2. Neil From fdrake at acm.org Fri Oct 10 12:25:41 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Fri Oct 10 12:26:11 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010155116.GA5520@satokar.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> <20031010155116.GA5520@satokar.com> Message-ID: <16262.56709.186862.565850@grendel.zope.com> Michael Bartl writes: > Indeed it is! I never had a look at this, because I thought it's only > for software/people. It's not a full text search which would be nice, > but it's a start :) It used to be only for software and people, but that was fixed. Now, it searches through all the issues in the current tracker and doesn't let you filter or sort the results in any way. But better than it was. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip at pobox.com Fri Oct 10 12:26:34 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 10 12:44:43 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010155116.GA5520@satokar.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> <20031010155116.GA5520@satokar.com> Message-ID: <16262.56762.435465.943403@montanaro.dyndns.org> >> If you go to the patches page, on the left-hand side you will see a >> search box ... Michael> I never had a look at this, because I thought it's only for Michael> software/people. It's not a full text search which would be Michael> nice, but it's a start :) It's also limited by the fact that you can't specify any other search constraints (like assignee or current state). If you search for a term which turns up frequently, you'll get dozens of hits, most of which will be useless because the bug or patch has already been closed. Because of this, I often find it easier to use the browse function using various criteria to limit its scope. 
Skip From tjreedy at udel.edu Fri Oct 10 14:19:51 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 10 14:19:56 2003 Subject: [Python-Dev] Re: attaching methods to an object at runtime and compiler enhancement ideas... References: <200310101811370393.020A912A@lisbeth.kadnet.dom> Message-ID: "kasper b. graversen" wrote in message news:200310101811370393.020A912A@lisbeth.kadnet.dom... At least your first two questions are about usage rather than development and would be better directed to comp.lang.python (or g.c.p.general). In the meanwhile... >each object. So far it has been fun playing with python, but I ponder why it is >only possible to introduce functions and not methods to object instances? 1. Methods are functions attached to classes as attributes. 2. The need for instance-specific 'methods' is rare. 3. Rare needs are covered by explicitly passing the instance as an arg: person.role(person, *args) > I am also wondering if it is possible to change the parsed code at compile > time by gaining access to the AST See compiler module/package and its AST walker. >Finally, one of the most difficult things of moving from Java to python is the >lack of checking done by the compiler ;) Here are two things I really miss >which I would like future versions of the compiler to support: The current interpreter checks only for syntactic correctness and deprecated usages (to issue warnings). This is unlikely to change soon. Enforcing coding standards is more the province of PyChecker and PyLint. If neither has the checks you want, give both authors your suggestions. Terry J. Reedy From bac at OCF.Berkeley.EDU Fri Oct 10 15:29:11 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Oct 10 15:29:19 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010113721.GA4148@satokar.com> References: <20031010113721.GA4148@satokar.com> Message-ID: <3F870887.4090201@ocf.berkeley.edu> Michael Bartl wrote: > Hi! 
> > I found myself mentioned in the summary so I thought I'd drop a line > again. After my inital offer to help I started with reviewing and > writing (very simple) patches. To be more explicit: patch 813200, > patch 810914, bug 810408, 811082, (can't remember the rest from > memory, can provide a better list later). > > I was quite wondering that no-one seemed to have a look at the patches, > but thought that this might be due to the pressing 2.3.2 release. > > I find patch writing rather unsatisfying if they aren't applied (or > rejected :) > Sorry about no one getting to your patches, Michael. The help is truly appreciated, even if no one has gotten around to looking at them. As for why no one has dealt with them, Python 2.3.2 was part of it. The other issue is just people on python-dev being busy. Beyond a handful of people, not everyone goes through the new patches, bugs, etc. because of time constraints and having to prioritize what time they have to work on Python. It is an unfortunate drawback of having everyone who works on Python be a volunteer (this can be solved if someone gave the PSF *tons* of money and thus could sponsor someone to work on Python full time, but I haven't won the lottery yet =). Someone will get to them at some point, I promise. If anything I will get to them because I am going to go through all the patch items in the near future; I will reach the end at some point. =) -Brett From neal at metaslash.com Fri Oct 10 16:01:08 2003 From: neal at metaslash.com (Neal Norwitz) Date: Fri Oct 10 16:01:17 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <3F870887.4090201@ocf.berkeley.edu> References: <20031010113721.GA4148@satokar.com> <3F870887.4090201@ocf.berkeley.edu> Message-ID: <20031010200108.GR30467@epoch.metaslash.com> On Fri, Oct 10, 2003 at 12:29:11PM -0700, Brett C. 
wrote: > Michael Bartl wrote: > > >I was quite wondering that no-one seemed to have a look at the patches, > >but thought that this might be due to the pressing 2.3.2 release. > > > >I find patch writing rather unsatisfying if they aren't applied (or > >rejected :) > > As for why no one has dealt with them, Python 2.3.2 was part of it. The > other issue is just people on python-dev being busy. Beyond a handful > of people, not everyone goes through the new patches, bugs, etc. because > of time constraints and having to prioritize what time they have to work > on Python. Brett is correct. Speaking for myself, most of my free time has gone from working on pychecker to working on python to working on the PSF (I'm treasurer). > It is an unfortunate drawback of having everyone who works > on Python be a volunteer (this can be solved if someone gave the PSF > *tons* of money and thus could sponsor someone to work on Python full > time, but I haven't won the lottery yet =). It would be even better for the PSF to win grants from other Public Charities or government organizations. To win grants we need to identify grant opportunities and, more importantly, write proposals. Neal From guido at python.org Fri Oct 10 18:08:16 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 10 18:08:28 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? In-Reply-To: Your message of "Fri, 10 Oct 2003 09:11:16 PDT." <20031010161116.GA1238@mems-exchange.org> References: <20031010161116.GA1238@mems-exchange.org> Message-ID: <200310102208.h9AM8Ha03802@12-236-54-216.client.attbi.com> > Python 2.3 seems to be really sluggish starting up. There have been I think at least two past threads on this issue; it might be useful to look them up. While a bit of work was done to alleviate the problem, 2.3 remains much slower because it imports a much larger set of modules at startup... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Oct 10 18:17:13 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 10 18:17:21 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: Message-ID: <3F872FE9.9070508@v.loewis.de> Thomas Heller wrote: > What is the rationale to decide whether a module is builtin or an > extension module in core Python (I only care about Windows)? I believe it is mostly tradition, on Windows: We continue to do things the way they have always been done. On Linux, there is an additional rationale: small executables and many files are cool, so we try to have as many shared libraries as possible. (if you smell sarcasm - that is intentional) > To give examples, could zlib be made into a builtin module (because it's > useful for zipimport), _sre (because it's used by warnings), or are > there reasons preventing this? I think that anything that would be reasonably replaced by third parties (such as pyexpat.pyd) should be shared, and anything else should be part of pythonxy.dll. Regards, Martin From martin at v.loewis.de Fri Oct 10 18:20:07 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 10 18:20:15 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <2mad89ulyo.fsf@starship.python.net> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> Message-ID: <3F873097.7050201@v.loewis.de> Michael Hudson wrote: > Did the 2.3 builds have IPv6 support? Then this would be a nasty > regression. However, I *thought* that you had to build with VC++ 7 or > higher to get IPv6 support on Windows, and we've never done that. No, 2.3 did not have IPv6. You don't strictly need VC7, though - if you have the SDK installed in addition to VC6, you could also include IPv6 support. 
PC/pyconfig.h does not detect this case automatically, so you would have to manually activate this support (i.e. include winsock2.h). Apart from that, you are right - IPv6 is not supported in the Windows builds because of lacking support in the compiler's header files. Regards, Martin From aahz at pythoncraft.com Fri Oct 10 21:18:25 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 10 21:18:28 2003 Subject: [Python-Dev] OS testing (was Re: 2.3.3 plans) In-Reply-To: <16257.52079.226636.407139@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: <20031011011825.GA15528@panix.com> On Mon, Oct 06, 2003, Skip Montanaro wrote: > > It's not quite exhaustive yet, but I will remind people about the > PythonTesters wiki page: > > http://www.python.org/cgi-bin/moinmoin/PythonTesters > > Maybe that page should also mention some of the vendor-specific test sites > (HP Test Drive, SourceForge compile farm, PBF server farm, ...). Added HP Test Drive. I also added links to PythonTesters on the /dev/ Tools page and the Dev FAQ. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From tim.one at comcast.net Fri Oct 10 23:52:49 2003 From: tim.one at comcast.net (Tim Peters) Date: Fri Oct 10 23:52:54 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Message-ID: [Thomas Heller] > What is the rationale to decide whether a module is builtin or an > extension module in core Python (I only care about Windows)? I don't know that there is one. Maybe to avoid chewing address space for code that some programs won't use. Generally speaking, it appears some effort was made to make stuff an extension module on Windows if it was an optional part of the Unix build. 
There was certainly an effort made to build an extension for Python modules wrapping external code (like the _bsddb and _tkinter projects). > To give examples, could zlib be made into a builtin module (because > it's useful for zipimport), _sre (because it's used by warnings), or > are there reasons preventing this? zlib was there long before Python routinely made use of it; indeed, I doubt I ever used one byte of the zlib code outside of Python testing before zip import came along (and since I have no zip files to import from I guess I still never use it). Leaving _sre an extension seems odd now, but at the time it was competing with the external-to-Python PCRE code. Why do you ask? Answers must be accurate to 10 decimal digits. From bac at OCF.Berkeley.EDU Sat Oct 11 18:50:25 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 11 18:50:29 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? Message-ID: <3F888931.9000706@ocf.berkeley.edu> I just ran ``regrtest.py -unetwork`` after a fresh update and noticed that test__locale failed for me under OS X 10.2.8 because _locale.RADIXCHAR does not exist. Anyone else getting this failure? If so I will add it to the expected skip list. -Brett From martin at v.loewis.de Sat Oct 11 18:59:55 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Oct 11 19:00:02 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> References: <3F888931.9000706@ocf.berkeley.edu> Message-ID: <3F888B6B.6080002@v.loewis.de> Brett C. wrote: > I just ran ``regrtest.py -unetwork`` after a fresh update and noticed > that test__locale failed for me under OS X 10.2.8 because > _locale.RADIXCHAR does not exist. Anyone else getting this failure? If > so I will add it to the expected skip list. A test failure is different from a skipped test. 
Martin From jepler at unpythonic.net Sat Oct 11 20:30:28 2003 From: jepler at unpythonic.net (Jeff Epler) Date: Sat Oct 11 20:30:35 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> References: <3F888931.9000706@ocf.berkeley.edu> Message-ID: <20031012003021.GA20833@unpythonic.net> This test was added for python.org/sf/798145. I assume that the 'from _locale import ... RADIXCHAR ...' line fails, and the test skips? Any system which doesn't have RADIXCHAR is expected to skip this test. According to glibc's documentation (the version google coughed up: http://www.delorie.com/gnu/docs/glibc/libc_119.html ), RADIXCHAR is among the identifiers specified in "the X/Open standard". OS X isn't Unix enough for this situation? Jeff From greg at cosc.canterbury.ac.nz Sat Oct 11 21:13:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 11 21:14:19 2003 Subject: [Python-Dev] MacPython - access to FinderInfo of a directory Message-ID: <200310120113.h9C1Dsv21694@oma.cosc.canterbury.ac.nz> I discovered recently that the File Manager wrappings in MacPython don't seem to provide any way of getting at the FinderInfo of a directory, because GetFInfo/SetFInfo only work on files, and access to the finderInfo field of the FSCatalogInfo structure hasn't been implemented. I have come up with a patch to _Filemodule.c to remedy this, but patching this file directly probably isn't the right thing to do, because it seems to have been generated automatically using bgen. Unfortunately I don't know enough about bgen to fix this properly. Should I go ahead and submit a patch anyway, and hope that someone will be able to reverse-engineer it into whatever fix is appropriate? It would be good to get this incorporated into the standard distribution if possible. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From sdm7g at mac.com Sat Oct 11 22:00:28 2003 From: sdm7g at mac.com (Steven Majewski) Date: Sat Oct 11 22:00:35 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> Message-ID: tkFileDialog also fails when run as a main module: % pythonw tkFileDialog.py Traceback (most recent call last): File "tkFileDialog.py", line 189, in ? locale.setlocale(locale.LC_ALL,'') File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/ locale.py", line 381, in setlocale return _setlocale(category, locale) locale.Error: locale setting not supported All the functions seem to work ok when imported. The problem is the following in the 'if __name__ == "__main__" : ' clause: # See whether CODESET is defined try: import locale locale.setlocale(locale.LC_ALL,'') enc = locale.nl_langinfo(locale.CODESET) except (ImportError, AttributeError): pass The tail end of the OSX setlocale man page reads: ------------------------------------------------------------------------ --------------------------------- STANDARDS The setlocale() and localeconv() functions conform to ISO/IEC 9899:1990 (``ISO C89''). HISTORY The setlocale() and localeconv() functions first appeared in 4.4BSD. BUGS The current implementation supports only the "C" and "POSIX" locales for all but the LC_COLLATE, LC_CTYPE, and LC_TIME categories. In spite of the gnarly currency support in localeconv(), the standards don't include any functions for generalized currency formatting. Use of LC_MONETARY could lead to misleading results until we have a real time currency conversion function. LC_NUMERIC and LC_TIME are personal choices and should not be wrapped up with the other categories. 
BSD June 9, 1993 BSD ------------------------------------------------------------------------ -------------------------------- In fact, setlocale( LC_ALL, "POSIX" ) and setlocale( LC_ALL, "C" ) both work. ( Some other things also don't work, but I'm not sure exactly what things should work. ) -- Steve Majewski From skip at manatee.mojam.com Sun Oct 12 08:00:33 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 12 08:00:42 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310121200.h9CC0XC6010757@manatee.mojam.com> Bug/Patch Summary ----------------- 542 open / 4234 total bugs (+13) 209 open / 2411 total patches (+3) New Bugs -------- distutils: clean -b ignored; set_undefined_options doesn't (2003-10-05) http://python.org/sf/818201 Shared object modules in Windows have no __file__. (2003-10-05) http://python.org/sf/818315 socketmodule.c compile error using SunPro cc (2003-10-06) http://python.org/sf/818490 optparse "append" action should always make the empty list. (2003-10-07) http://python.org/sf/819178 httplib.SSLFile lacks readlines() method (2003-10-07) http://python.org/sf/819510 PythonIDE interactive window Unicode bug (2003-10-08) http://python.org/sf/819860 Ref Man Index: Symbols -- Latex leak (2003-10-08) http://python.org/sf/820344 urllib2 silently returns None when auth_uri is mismatched (2003-10-09) http://python.org/sf/820583 tkinter's 'after' and 'threads' on multiprocessor (2003-10-09) http://python.org/sf/820605 dbm Error (2003-10-09) http://python.org/sf/820953 reduce docs neglect a very important piece of information. 
(2003-10-11) http://python.org/sf/821701 pyclbr.readmodule_ex() (2003-10-11) http://python.org/sf/821818 _set_cloexec of tempfile.py uses incorrect error handling (2003-10-11) http://python.org/sf/821896 fcntl() not working on sparc (python 2.2.3) (2003-10-11) http://python.org/sf/821948 Carbon.CarbonEvt.ReceiveNextEvent args wrong (2003-10-11) http://python.org/sf/822005 New Patches ----------- Fix for former/latter confusion in Extending documentation (2003-10-06) http://python.org/sf/819012 fix import problem(unittest.py) (2003-10-07) http://python.org/sf/819077 fix doc typos (2003-10-10) http://python.org/sf/821093 ftplib: Strict RFC 959 (telnet in command channel) (2003-10-11) http://python.org/sf/821862 Closed Bugs ----------- compiler package needs better documentation. (2000-11-27) http://python.org/sf/223616 threads and profiler don't work together (2001-02-08) http://python.org/sf/231540 docs need to discuss // and __future__.division (2001-08-08) http://python.org/sf/449093 urljoin fails RFC tests (2001-08-11) http://python.org/sf/450225 new int overflow handling needs docs (2001-08-22) http://python.org/sf/454446 docs should include man page (2001-10-09) http://python.org/sf/469773 Lib/profile.doc should be updated (2001-12-04) http://python.org/sf/489256 Using the lib index mechanically (2002-04-03) http://python.org/sf/538961 Fuzziness in inspect module documentatio (2002-06-01) http://python.org/sf/563298 Automated daily documentation builds (2002-06-26) http://python.org/sf/574241 -S hides standard dynamic modules (2002-07-25) http://python.org/sf/586680 site-packages & build-dir python (2002-07-25) http://python.org/sf/586700 cPickle documentation incomplete (2002-09-28) http://python.org/sf/616013 File write examples are inadequate (2002-10-09) http://python.org/sf/621057 Creation of struct_seq types (2002-10-17) http://python.org/sf/624827 pydoc.Helper.topics not based on docs (2002-10-24) http://python.org/sf/628258 pygettext should be installed 
(2002-11-22) http://python.org/sf/642309 extra __builtin__ stuff not documented (2002-12-12) http://python.org/sf/652749 cPickle not always same as pickle (2002-12-18) http://python.org/sf/655802 Accept None for time.ctime() and friends (2002-12-24) http://python.org/sf/658254 HTMLParser attribute parsing bug (2003-02-10) http://python.org/sf/683938 _iscommand() in webbrowser module (2003-02-16) http://python.org/sf/687747 Provide "plucker" format docs. (2003-03-06) http://python.org/sf/698900 Building lib.pdf fails on MacOSX (2003-04-14) http://python.org/sf/721157 urlopen object's read() doesn't read to EOF (2003-04-21) http://python.org/sf/725265 platform module needs docs (LaTeX) (2003-04-24) http://python.org/sf/726911 Importing anydbm generates exception if _bsddb unavailable (2003-05-02) http://python.org/sf/731501 markupbase parse_declaration cannot recognize comments (2003-05-12) http://python.org/sf/736659 Failed assert in stringobject.c (2003-05-14) http://python.org/sf/737947 Tutorial: executable scripts on Windows (2003-06-25) http://python.org/sf/760657 test_ossaudiodev timing failure (2003-08-04) http://python.org/sf/783242 Section 13.1 HTMLParser documentation error (2003-08-23) http://python.org/sf/793702 Clarify trailing comma in func arg list (2003-09-01) http://python.org/sf/798652 Mode argument of dumbdbm does not work (2003-09-07) http://python.org/sf/802128 super instances don't support item assignment (2003-09-12) http://python.org/sf/805304 refleaks in _hotshot.c (2003-09-18) http://python.org/sf/808756 tex to html convert bug (2003-09-19) http://python.org/sf/809599 Py2.2.3: Problem with Expat/XML/Zope on MacOSX 10.2.8 (2003-09-23) http://python.org/sf/811070 randint is always even (2003-09-24) http://python.org/sf/812202 mark deprecated modules in indexes (2003-10-02) http://python.org/sf/816725 Float Multiplication (2003-10-02) http://python.org/sf/816946 invalid \U escape gives 0=length unistr (2003-10-03) http://python.org/sf/817156 
use Windows' default programs location. (2003-10-05) http://python.org/sf/818030 Closed Patches -------------- Add multicall support to xmlrpclib (2002-03-18) http://python.org/sf/531629 PyArg_VaParseTupleAndKeywords (2002-04-30) http://python.org/sf/550732 attributes for urlsplit, urlparse result (2002-10-16) http://python.org/sf/624325 HTMLParser.py - more robust SCRIPT tag parsing (2003-01-19) http://python.org/sf/670664 test_htmlparser -- more robust SCRIPT tag handling (2003-01-24) http://python.org/sf/674449 Kill off docs for unsafe macros (2003-03-13) http://python.org/sf/702933 Remove __file__ after running $PYTHONSTARTUP (2003-04-11) http://python.org/sf/719777 build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Nails down the semantics of dict setitem (2003-06-03) http://python.org/sf/748126 Let pprint.py use issubclass instead of is for type checking (2003-06-07) http://python.org/sf/750542 Glossary (2003-08-13) http://python.org/sf/788509 Improve "veryhigh.tex" API docs (2003-09-01) http://python.org/sf/798638 dynamic popen2 MAXFD (2003-10-03) http://python.org/sf/817329 From fdrake at acm.org Sun Oct 12 10:43:08 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun Oct 12 10:43:17 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib urlparse.py, 1.41, 1.42 In-Reply-To: References: Message-ID: <16265.26748.451661.390136@grendel.zope.com> bcannon@users.sourceforge.net writes: > Log Message: > (revision purely to add comment) You can use "cvs admin" to fix broken comments. It doesn't generate an email, but it avoids an extra entry in the history. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From bac at OCF.Berkeley.EDU Sun Oct 12 16:05:21 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Sun Oct 12 16:05:34 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib urlparse.py, 1.41, 1.42 In-Reply-To: <16265.26748.451661.390136@grendel.zope.com> References: <16265.26748.451661.390136@grendel.zope.com> Message-ID: <3F89B401.6070601@ocf.berkeley.edu> Fred L. Drake, Jr. wrote: > bcannon@users.sourceforge.net writes: > > Log Message: > > (revision purely to add comment) > > You can use "cvs admin" to fix broken comments. It doesn't generate > an email, but it avoids an extra entry in the history. > OK, good to know. I think I will add that to the dev FAQ. -Brett From tjreedy at udel.edu Sun Oct 12 18:25:03 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Sun Oct 12 18:25:18 2003 Subject: [Python-Dev] Re: Weekly Python Bug/Patch Summary References: <200310121200.h9CC0XC6010757@manatee.mojam.com> Message-ID: "Skip Montanaro" wrote in message news:200310121200.h9CC0XC6010757@manatee.mojam.com... > > Bug/Patch Summary > ----------------- > > 542 open / 4234 total bugs (+13) > 209 open / 2411 total patches (+3) > > New Bugs (15 new) > Closed Bugs Something hiccuped. About 45 are listed. If these were really closed, then net change would be about -30. Spot check of about 5 showed still open. TJR From Jack.Jansen at cwi.nl Mon Oct 13 05:52:34 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon Oct 13 05:52:05 2003 Subject: [Python-Dev] MacPython - access to FinderInfo of a directory In-Reply-To: <200310120113.h9C1Dsv21694@oma.cosc.canterbury.ac.nz> Message-ID: On Sunday, October 12, 2003, at 03:13 AM, Greg Ewing wrote: > I discovered recently that the File Manager wrappings in > MacPython don't seem to provide any way of getting at the > FinderInfo of a directory, because GetFInfo/SetFInfo only > work on files, and access to the finderInfo field of the > FSCatalogInfo structure hasn't been implemented. 
> > I have come up with a patch to _Filemodule.c to remedy > this, but patching this file directly probably isn't the > right thing to do, because it seems to have been generated > automatically using bgen. Unfortunately I don't know > enough about bgen to fix this properly. Greg, there's an SF bug for this one: #706585. If you could attach your patch to this one I'll do the magic to work it around to bgen. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From Jack.Jansen at cwi.nl Mon Oct 13 06:28:43 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon Oct 13 06:28:06 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/Tools/IDE PyConsole.py, 1.17, 1.18 In-Reply-To: Message-ID: <0560A63A-FD68-11D7-A415-0030655234CE@cwi.nl> Just, could you backport this to the 2.3 maintenance branch too? Actually, that may be the only place where IDE bug fixes need to go, I hope we have something new by the time 2.4 comes out... On Sunday, October 12, 2003, at 09:27 PM, jvr@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Mac/Tools/IDE > In directory sc8-pr-cvs1:/tmp/cvs-serv4435 > > Modified Files: > PyConsole.py > Log Message: > fix for bug [819860]: make sure the buffer gets emptied, even if > WEInsert() fails > > Index: PyConsole.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Mac/Tools/IDE/PyConsole.py,v > retrieving revision 1.17 > retrieving revision 1.18 > diff -C2 -d -r1.17 -r1.18 > *** PyConsole.py 9 May 2003 11:47:23 -0000 1.17 > --- PyConsole.py 12 Oct 2003 19:27:24 -0000 1.18 > *************** > *** 128,135 **** > stuff = string.join(stuff, '\r') > self.setselection_at_end() > ! 
self.ted.WEInsert(stuff, None, None) > selstart, selend = self.getselection() > self._inputstart = selstart > - self._buf = "" > self.ted.WEClearUndo() > self.updatescrollbars() > --- 128,137 ---- > stuff = string.join(stuff, '\r') > self.setselection_at_end() > ! try: > ! self.ted.WEInsert(stuff, None, None) > ! finally: > ! self._buf = "" > selstart, selend = self.getselection() > self._inputstart = selstart > self.ted.WEClearUndo() > self.updatescrollbars() > *************** > *** 330,335 **** > self.w.outputtext.setselection(end, end) > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 0) > ! self.w.outputtext.ted.WEInsert(stuff, None, None) > ! self._buf = "" > self.w.outputtext.updatescrollbars() > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 1) > --- 332,339 ---- > self.w.outputtext.setselection(end, end) > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 0) > ! try: > ! self.w.outputtext.ted.WEInsert(stuff, None, None) > ! finally: > ! self._buf = "" > self.w.outputtext.updatescrollbars() > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 1) > > > > _______________________________________________ > Python-checkins mailing list > Python-checkins@python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From skip at pobox.com Mon Oct 13 15:10:50 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 15:11:00 2003 Subject: [Python-Dev] Re: Weekly Python Bug/Patch Summary In-Reply-To: References: <200310121200.h9CC0XC6010757@manatee.mojam.com> Message-ID: <16266.63674.958495.69851@montanaro.dyndns.org> >> Bug/Patch Summary >> ----------------- >> >> 542 open / 4234 total bugs (+13) >> 209 open / 2411 total patches (+3) >> >> New Bugs (15 new) >> Closed Bugs Terry> Something hiccuped. About 45 are listed. If these were really Terry> closed, then net change would be about -30. 
Spot check of about Terry> 5 showed still open. Thanks, I'll take a look at it when I get a chance. Skip From raymond.hettinger at verizon.net Mon Oct 13 15:34:15 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon Oct 13 15:34:57 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> For Py2.4, I propose adding an optional list.sort() argument to support the decorate-sort-undecorate pattern. The current, pure Python approach to DSU is pure arcana. It is obscure enough and cumbersome enough that cmpfunc() tends to get used instead. Built-in C support for DSU requires much less skill to use, results in more readable code, and runs faster. Raymond Hettinger ------ Concept demonstration ------------------ def sort(self, cmpfunc=None, decorator=None): """Show how list.sort() could support a decorating function""" args = () if cmpfunc is not None: args = (cmpfunc,) if decorator is None: self.sort(*args) else: aux = zip(map(decorator, self), self) # Decorate aux.sort(*args) self[:] = list(zip(*aux)[1]) # Un-decorate a = 'the Quick brown Fox jumped Over the Lazy Dog'.split() sort(a) # the no argument form is unchanged print a, 'Normal sort' sort(a, lambda x,y: -cmp(x,y)) # old code still works without change print a, 'Reverse sort' sort(a, decorator=str.lower) # the new way is fast, clean, and readable print a, 'Lowercase sort' # The decorator form works especially well with mappings so that database # keys can be sorted by any field. ages = dict(john=5, amy=3, andrea=32, henry=12) names = ages.keys() location = dict(john='alaska', amy='spain', andrea='peru', henry='iowa') sort(names) print names, '<-- by name' sort(names, decorator=ages.__getitem__) print names, '<-- by age' sort(names, decorator=location.__getitem__) print names, '<-- by location' -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031013/a4413f54/attachment.html From Paul.Moore at atosorigin.com Mon Oct 13 15:49:35 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 13 15:50:20 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > For Py2.4, I propose adding an optional list.sort() argument to > support the decorate-sort-undecorate pattern. [...] > def sort(self, cmpfunc=None, decorator=None): I like it! But "decorator" isn't a good name - it describes how it's being done, rather than what is being done. How about "key"? After all, "key=str.lower" reads more or less as "the key is the lowercase equivalent of the value", and "key=ages.__getitem__" reads "get the key by getting the appropriate item from the ages dictionary". But names apart, it's nice. It lets people use the builtin, without going for the performance-reducing comparison function... Paul. From Patrick.Maupin at silabs.com Mon Oct 13 16:08:13 2003 From: Patrick.Maupin at silabs.com (Patrick Maupin) Date: Mon Oct 13 16:08:45 2003 Subject: [Python-Dev] Python and Coercion Message-ID: An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031013/a4ce9338/attachment.html From guido at python.org Mon Oct 13 16:35:59 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 16:36:27 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 20:49:35 BST." 
<16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310132035.h9DKZxv22377@12-236-54-216.client.attbi.com> > From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > > > For Py2.4, I propose adding an optional list.sort() argument to > > support the decorate-sort-undecorate pattern. > [...] > > > def sort(self, cmpfunc=None, decorator=None): [Paul Moore] > I like it! But "decorator" isn't a good name - it describes how it's > being done, rather than what is being done. How about "key"? After > all, "key=str.lower" reads more or less as "the key is the lowercase > equivalent of the value", and "key=ages.__getitem__" reads "get the > key by getting the appropriate item from the ages dictionary". Agreed, that was my first thought too. > But names apart, it's nice. It lets people use the builtin, without > going for the performance-reducing comparison function... +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From Patrick.Maupin at silabs.com Mon Oct 13 16:55:23 2003 From: Patrick.Maupin at silabs.com (Patrick Maupin) Date: Mon Oct 13 16:55:59 2003 Subject: [Python-Dev] Python and Coercion Message-ID: Sorry about the HTML. I _hate_ the way they configure things at work -- I have to remember to force text, and can't get around the stoopid legal disclaimer at the bottom. I'll try to remember not to post from here again. Best regards, Pat -----Original Message----- From: Guido van Rossum [mailto:guido@esi.elementalsecurity.com] Sent: Monday, October 13, 2003 3:39 PM To: Patrick Maupin Cc: pyython-dev@python.org Subject: RE: [Python-Dev] Python and Coercion This is a feature. It is mentioned in passing in http://www.python.org/2.2.2/descrintro.html : """Note that while in general operator overloading works just as for classic classes, there are some differences. 
(The biggest one is the lack of support for __coerce__; new-style classes should always use the new-style numeric API, which passes the other operand uncoerced to the __add__ and __radd__ methods, etc.) """ PS Next time don't post HTML. --Guido van Rossum (home page: http://www.python.org/~guido) -----Original Message----- From: python-dev-bounces@python.org [mailto:python-dev-bounces@python.org] On Behalf Of Patrick Maupin Sent: Monday, October 13, 2003 1:08 PM To: python-dev@python.org Subject: [Python-Dev] Python and Coercion Dear developers: __coerce__ does not seem to work in new-style classes, e.g. class foo: def __int__(self): return 1 def __coerce__(self,other): return int(self), int(other) x = foo() print 1+x works fine, but if foo is derived from object, it fails with: TypeError: unsupported operand type(s) for +: 'int' and 'foo' After finding this difference, I could not figure out if this was an interpreter error or a documentation error. http://www.python.org/doc/current/ref/coercion-rules.html states that: "In Python 3.0, coercion will not be supported." so I thought maybe this was the first round of removing this support. I googled around for awhile trying to find supporting documentation for this -- it appears it might have to do with PEP 228, but I'm not really sure, so I was hoping someone could point me at a reference point for this statement. Regards, Pat Maupin This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
From pnorvig at google.com Mon Oct 13 17:17:46 2003 From: pnorvig at google.com (Peter Norvig) Date: Mon Oct 13 17:17:50 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 References: Message-ID: I like "sort" better than "decorator"; I would also like "by", as in sort(names, by=ages.__getitem__). I would also advocate an optional reverse=False argument, so that result = sort(names, reverse=True) is equivalent to result = sort(names) result.reverse() > Date: Mon, 13 Oct 2003 15:34:15 -0400 > From: "Raymond Hettinger" > Subject: [Python-Dev] decorate-sort-undecorate > To: > Message-ID: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> > Content-Type: text/plain; charset="us-ascii" > > For Py2.4, I propose adding an optional list.sort() argument to support > the decorate-sort-undecorate pattern. > > The current, pure Python approach to DSU is pure arcana. It is obscure > enough and cumbersome enough that cmpfunc() tends to get used instead. > > Built-in C support for DSU requires much less skill to use, results in > more readable code, and runs faster. > > Raymond Hettinger > From skip at pobox.com Mon Oct 13 17:22:59 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 17:23:17 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> Message-ID: <16267.6067.441841.822258@montanaro.dyndns.org> >> def sort(self, cmpfunc=None, decorator=None): Paul> I like it! But "decorator" isn't a good name - it describes how Paul> it's being done, rather than what is being done. How about "key"? How about keyfunc?
"keyfunc=str.lower" reads to me more like "generate sort keys using str.lower". "key" doesn't suggest (to me, at least) its value should be a function. Skip From ianb at colorstudy.com Mon Oct 13 17:29:01 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Mon Oct 13 17:29:06 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> On Monday, October 13, 2003, at 02:34 PM, Raymond Hettinger wrote: > For Py2.4, I propose adding an optional list.sort() argument to > support the decorate-sort-undecorate pattern. I've seen proposals for an extension to list comprehension, which would be quite nice: [s for s in lst sortby s.lower()] It reads nicely, and avoids lambdas and tiny helper functions. Also handles the sort-returns-None criticism. But it adds syntax. And since it's not an in-place sort it won't perform as well (but probably better than the decorator idiom anyway...?) -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From guido at python.org Mon Oct 13 18:04:28 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:04:36 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: Your message of "Mon, 13 Oct 2003 14:17:46 PDT." References: Message-ID: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> > I like "sort" better than "decorator"; I would also like "by", as in > sort(names, by=ages.__getitem__). > > I would also advocate an optional reverse=False argument, so that > > result = sort(names, reverse=True) > > is equivalent to > > result = sort(names) > result.reverse() While we're at it, +1. 
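[Editor's note: the key/reverse combination endorsed above is what later shipped in Python 2.4. A minimal sketch of the proposed calling convention, reusing the ages mapping from Raymond's demonstration:]

```python
ages = dict(john=5, amy=3, andrea=32, henry=12)

# Proposed one-step form: key= replaces the decorate step,
# reverse= replaces the trailing .reverse() call.
names = list(ages)
names.sort(key=ages.__getitem__, reverse=True)

# The equivalent two-step form from the message above.
names2 = list(ages)
names2.sort(key=ages.__getitem__)
names2.reverse()

assert names == names2 == ['andrea', 'henry', 'john', 'amy']
```

[With all keys distinct the two spellings agree; with ties they can differ, since a stable sort with reverse=True keeps equal keys in their original order while .reverse() inverts it.]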
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 13 18:05:32 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:05:54 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 16:22:59 CDT." <16267.6067.441841.822258@montanaro.dyndns.org> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> <16267.6067.441841.822258@montanaro.dyndns.org> Message-ID: <200310132205.h9DM5Wr22484@12-236-54-216.client.attbi.com> > Paul> I like it! But "decorator" isn't a good name - it describes how > Paul> it's being done, rather than what is being done. How about "key"? [Skip] > How about keyfunc? "keyfunc=str.lower" reads to me more like "generate sort > keys using str.lower". "key" doesn't suggest (to me, at least) its value > should be a function. But remember that a parameter name doesn't need to be documentation. It just needs to be a memory-jogger. I think key is fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 13 18:07:48 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:08:02 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 16:29:01 CDT." <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> References: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> Message-ID: <200310132207.h9DM7m022506@12-236-54-216.client.attbi.com> > I've seen proposals for an extension to list comprehension, which would > be quite nice: > > [s for s in lst sortby s.lower()] > > It reads nicely, and avoids lambdas and tiny helper functions. Also > handles the sort-returns-None criticism. But it adds syntax. And > since it's not an in-place sort it won't perform as well (but probably > better than the decorator idiom anyway...?) This has a very low probability to be accepted. 
It suffers IMO from the "SQL syndrome": having reserved words to the language that are only meaningful in a very specific syntax yet are reserved everywhere. Until we have a general way to avoid that, I'd rather not go that route. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Oct 13 18:51:52 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 18:52:05 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> Message-ID: <16267.11400.169738.924956@montanaro.dyndns.org> >> I would also advocate an optional reverse=False argument, so that >> >> result = sort(names, reverse=True) >> >> is equivalent to >> >> result = sort(names) >> result.reverse() Guido> While we're at it, +1. direction=[ascending|descending] ? Just a thought. Skip From fincher.8 at osu.edu Mon Oct 13 19:59:44 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Mon Oct 13 19:01:25 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: References: Message-ID: <200310131959.44982.fincher.8@osu.edu> On Monday 13 October 2003 05:17 pm, Peter Norvig wrote: > I like "sort" better than "decorator"; I would also like "by", as in > sort(names, by=ages.__getitem__). Coincidentally, my own decorate-sort-undecorate function is named "sortBy" :) So I'm +1 on naming the argument "by". > I would also advocate an optional reverse=False argument, so that > > result = sort(names, reverse=True) > > is equivalent to > > result = sort(names) > result.reverse() I like it. Jeremy From tim.one at comcast.net Mon Oct 13 19:14:17 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 13 19:14:31 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... 
> # The decorator form works especially well with mappings so that > # database keys can be sorted by any field. Unfortunately, the case of sorting database records by keys is one where it's most important to use a different form of DSU to avoid outrageous runtime. If you sort [(key1, record1), (key2, record2), ...] then whenever two keys compare equal, tuple comparison goes on to compare the records too, and general-object comparison can be arbitrarily expensive. The right way to do this with DSU now is to sort: [(key1, 0, record1), (key2, 1, record2), ...] instead. Then ties on the keys (which are very common when sorting a database) are always resolved quickly by comparing two distinct ints. This is the same way used to force a stable sort in pre-2.3 Python, and remains the best thing for non-experts to do by default. Indeed, if it's not done, then despite that the 2.3 sort *is* stable, sorting on [(key1, record1), (key2, record2), ...] is *not* stable wrt just sorting on the keys. DSU actually changes the natural result unless the original indices are inserted after "the keys". Alas, in decorator=function syntax, there's no clear way in general to write function so that it knows the index of the object passed to it. Internally, I suppose the sort routine could sort a temp list consisting of only the keys, mirroring the relative data movement in the real list too. That should blow the cache all to hell . From pje at telecommunity.com Mon Oct 13 19:25:57 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 13 19:27:26 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: <5.1.1.6.0.20031013192501.032177d0@telecommunity.com> At 07:14 PM 10/13/03 -0400, Tim Peters wrote: >Alas, in decorator=function syntax, there's no clear way in general to write >function so that it knows the index of the object passed to it. Why not just have the decoration be (key,index,value) then? 
Why does the key function need the index? From tim.one at comcast.net Mon Oct 13 19:45:40 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 13 19:45:57 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <5.1.1.6.0.20031013192501.032177d0@telecommunity.com> Message-ID: [Phillip J. Eby] > Why not just have the decoration be (key,index,value) then? Why does > the key function need the index? It doesn't if indices are synthesized by magic under the covers. Then it starts acting more like Zope (we know it does *something*, but it's not clear what ). If you want to pay that expense now, you do so explicitly, and nothing about it is hidden. From Scott.Daniels at Acm.Org Mon Oct 13 20:40:35 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon Oct 13 20:40:51 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F8B4603.4000701@Acm.Org> Raymond Hettinger wrote: >For Py2.4, I propose adding an optional list.sort() argument to >support the decorate-sort-undecorate pattern. > ... >def sort(self, cmpfunc=None, key=None): > > ... > if key is None: > self.sort(*args) > else: > aux = zip(map(key, self), self) # Decorate > aux.sort(*args) > self[:] = list(zip(*aux)[1]) # Un-decorate > If the argument is for simplicity, do we need to make this stable? Will warning about incomparables be sufficient? I'm thinking about: data = [(1-1j), -2, 1, 1j] data.sort(key=abs) Or would we prefer the code to end: else: # Decorate aux = [(key(el), nbr, el) for nbr, el in enumerate(self)] aux.sort(*args) self[:] = list(zip(*aux)[2]) # Un-decorate I think the answer comes down to performance vs. law of least surprise. I suppose I am slightly in favor of throwing in the stabilizing count (fewer explanations; those who need speed can do it themselves. 
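[Editor's note: Scott's complex-number data makes the stability question concrete. Written out, the index-stabilized decoration he sketches is:]

```python
data = [(1-1j), -2, 1, 1j]

# Decorate with (key, index, value): ties on abs() are settled by the
# index, so the complex values themselves are never compared.
aux = [(abs(el), nbr, el) for nbr, el in enumerate(data)]
aux.sort()
result = [el for _, _, el in aux]  # un-decorate

assert result == [1, 1j, (1-1j), -2]
```

[Without the stabilizing index, the sort would reach `1 < 1j` when the two abs() keys tie and raise TypeError, since complex numbers define no ordering.]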
-Scott David Daniels Scott.Daniels@Acm.Org From greg at cosc.canterbury.ac.nz Mon Oct 13 20:45:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 13 20:46:23 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310132205.h9DM5Wr22484@12-236-54-216.client.attbi.com> Message-ID: <200310140045.h9E0jL609446@oma.cosc.canterbury.ac.nz> Guido: > But remember that a parameter name doesn't need to be documentation. > It just needs to be a memory-jogger. I think key is fine. +1 on "key" from me, too. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 13 20:48:47 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 13 20:50:10 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310132207.h9DM7m022506@12-236-54-216.client.attbi.com> Message-ID: <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz> Guido: > > [s for s in lst sortby s.lower()] > > It suffers IMO from the "SQL syndrome": having reserved words to the > language that are only meaningful in a very specific syntax yet are > reserved everywhere. It could probably be a non-reserved keyword in this case. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Mon Oct 13 20:56:34 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 20:56:43 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 13:48:47 +1300." 
             <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz>
References: <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz>
Message-ID: <200310140056.h9E0uYf22690@12-236-54-216.client.attbi.com>

> Guido:
> > > [s for s in lst sortby s.lower()]
> >
> > It suffers IMO from the "SQL syndrome": having reserved words to the
> > language that are only meaningful in a very specific syntax yet are
> > reserved everywhere.

[Greg]
> It could probably be a non-reserved keyword in this case.

Yes, but that would be error-prone, because to the parser it would have
to look like an expression followed by an identifier followed by another
expression. Many typos in the first expression can then turn this into a
valid but different expression.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Mon Oct 13 21:00:58 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Oct 13 21:01:12 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com>
Message-ID: <200310140100.h9E10wq09475@oma.cosc.canterbury.ac.nz>

Ian Bicking :
> [s for s in lst sortby s.lower()]
>
> It reads nicely, and avoids lambdas and tiny helper functions. Also
> handles the sort-returns-None criticism.

But it adds syntax. And makes the definition of list comprehension
semantics in terms of an equivalent for-loop nest much less elegant.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From mike at nospam.com  Mon Oct 13 21:34:55 2003
From: mike at nospam.com (Mike Rovner)
Date: Mon Oct 13 21:40:26 2003
Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30
References:
Message-ID:

Brett C. wrote:
> We want *you* to help with the war on SF patch items
> ----------------------------------------------------
> Someone wanted to help but wasn't sure how they could. Martin v.
> Loewis sent an email listing common things anyone can do to help with
> dealing with the patch items on SourceForge_. The email can be found
> at
> http://mail.python.org/pipermail/python-dev/2003-September/038253.html

24 Sep 2003 09:26:12 +0200 martin v.loewis.de wrote:
>> Aahz pythoncraft.com> writes:
>
> Also, try to classify the patch somehow, indicating what most likely
> the problem is for the patch not being reviewed/accepted:
>
>> - the patch might be incomplete. Ping the submitter. If the submitter
>>   is incomplete, either complete it yourself, or suggest rejection
>>   of the patch.

All I can do as an SF registered user is add a comment to an existing
patch. I can't extend it, submit extra files, i.e. "complete" it.

Please clarify the preferable way to "help with the war on SF patch
items".

Regards,
Mike

From pje at telecommunity.com  Mon Oct 13 21:53:33 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Oct 13 21:53:27 2003
Subject: [Python-Dev] decorate-sort-undecorate
Message-ID: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>

At 07:45 PM 10/13/03 -0400, Tim Peters wrote:
>[Phillip J. Eby]
> > Why not just have the decoration be (key,index,value) then? Why does
> > the key function need the index?
>
>It doesn't if indices are synthesized by magic under the covers.

It's not magic if that's the defined behavior, e.g.:

"""Specifying a 'key' callable causes items' sort order to be determined
by comparing 'key(item)' in place of the item being compared. In the
event that 'key()' returns an equal value for two different items, the
items' order in the original list is preserved. The 'key' callable is
called only once for each item in the list, so in general sorting with
'key' is faster than sorting with 'cmpfunc'.
It requires more memory, however, because it creates a temporary list of
'(key(item),original_item_position,item)' tuples in order to perform the
sort."""

>If you want to pay that expense now, you do so
>explicitly, and nothing about it is hidden.

What expense? The extra memory overhead for the index? I suppose so. But
if you *don't* want that behavior, you can still DSU manually, no?

From barry at python.org  Mon Oct 13 22:04:51 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 13 22:04:59 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
References: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
Message-ID: <1066097091.19072.11.camel@geddy>

On Mon, 2003-10-13 at 21:53, Phillip J. Eby wrote:
> """Specifying a 'key' callable causes items' sort order to be determined by
> comparing 'key(item)' in place of the item being compared.

Using this explanation, "key" doesn't seem right to me. I can't think of
anything that I like better though, so I guess I just won't send this
email after all...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20031013/e475a006/attachment.bin

From tim.one at comcast.net  Mon Oct 13 22:25:51 2003
From: tim.one at comcast.net (Tim Peters)
Date: Mon Oct 13 22:25:53 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
Message-ID:

[Phillip J. Eby]
> What expense? The extra memory overhead for the index? I suppose
> so.

Yes, that is an expense. Partly because of the extra memory space in
len(list) temp tuples, but mostly because space allocated for integer
objects is immortal.
That is,

    range(1000000)

grabs space for 1000000 distinct integer objects that's never reused for
any other kind of object, and so does stuffing a million distinct int
objects into a temp DSU list. Note that this is very different from
doing

    for i in xrange(1000000):

which allocates space for only three integer objects (1000000, the
current value of i, and the preceding value of i), and keeps reusing it.

A cleverer implementation might be able to avoid permanently ratcheting
the space devoted to int objects.

> But if you *don't* want that behavior, you can still DSU manually, no?

I hope so .

From bac at OCF.Berkeley.EDU  Mon Oct 13 22:26:19 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Oct 13 22:26:28 2003
Subject: [Python-Dev] Draft of an essay on Python development (and how to help)
Message-ID: <3F8B5ECB.4030207@ocf.berkeley.edu>

I finally got around to proof-reading a guide to Python development that
I wrote based on a presentation I gave to the SF Bay Area Python user
group. I would like to get it checked by everybody to make sure it is to
everyone's liking.

The main goal of this doc is twofold: 1) to have something to point
people to when they ask how they can help or get started on python-dev
(maybe even be referenced in the welcome email) and 2) to act as a basis
for a presentation at PyCon 2004 covering how Python is developed. In
other words I want this to be good enough to put up on python.org.

Since correcting the summaries works well by pasting into an email, I am
going to do that here as well. Comment on any errors in grammar,
spelling, etc. If you think an important point is missing, please say
so. Do realize, though, this is not meant to replace the dev FAQ. I
specifically wrote it like an essay so that people can just read it from
beginning to end. If there is some specific point that people need to be
pointed to it should probably go into the dev FAQ rather than here.
I would like to view this as a gentle intro to python-dev's workings to
help lower the fear factor.

OK, enough explanations. Here is the doc:

------------------------------------

How Python is Developed
+++++++++++++++++++++++

Introduction
============

Software does not make itself. Code does not spontaneously come from the
ether of the universe. Python_ is no exception to this rule. Since
Python made its public debut back in 1991, people beyond the BDFL
(Benevolent Dictator For Life, `Guido van Rossum`_) have helped
contribute time and energy to making Python what it is today: a
powerful, simple programming language available to all. But it has not
been a random process of people doing whatever they wanted to Python.
Over the years a process for the development of Python has emerged
within the group that heads Python's growth and maintenance:
`python-dev`_. This document is an attempt to write this process down in
hopes of lowering any barriers possibly preventing people from
contributing to the development of Python.

.. _Python: http://www.python.org/
.. _Guido van Rossum: http://www.python.org/~guido/
.. _python-dev: http://mail.python.org/mailman/listinfo/python-dev

Tools Used
==========

To help facilitate the development of Python, certain tools are used.
Beyond the obvious ones such as a text editor and email client, two
tools are very pervasive in the development process.

SourceForge_ is used by python-dev to keep track of feature requests,
reported bugs, and contributed patches. A detailed explanation on how to
use SourceForge is covered later in `General SourceForge Guidelines`_.

CVS_ is a networked file versioning system that stores all of the files
that make up Python. It allows the developers to have a single
repository for the files along with being able to keep track of any and
all changes to every file. The basic commands and uses can be found in
the `dev FAQ`_ along with a multitude of tutorials spread across the
web.

.. _SourceForge: http://sourceforge.net/projects/python/
.. _CVS: http://www.cvshome.org/
.. _dev FAQ: http://www.python.org/dev/devfaq.html

Communicating
=============

Python development is not just programming. It requires a great deal of
communication between people. This communication is not just between the
members of python-dev; communication within the greater Python community
also helps with development. Several mailing lists and newsgroups are
used to help organize all of these discussions.

In terms of Python development, the primary location for communication
is the `python-dev`_ mailing list. This is where the members of
python-dev hash out ideas and iron out issues. It is an open list;
anyone can subscribe to the mailing list. While the discussion can get
quite technical, it is not at all out of reach for even a novice and
thus should not discourage anyone from joining the list. Please realize,
though, this list is **only** for the discussion of the development of
Python; all other questions should be directed somewhere else, such as
`python-list`_.

When the greater Python community is involved in a discussion, it always
ends up on `python-list`_. This mailing list is a gateway to the
newsgroup `comp.lang.python`_. This is also a good place to go when you
have a question about Python that does not pertain to the actual
development of the language.

Using CVS_ allows the development team to know who made a change to a
file and when they made their change. But unless one wants to
continuously update their local checkout of the repository, the best way
to stay on top of changes to the repository is to subscribe to
`Python-checkins`_. This list sends out an email for each and every
change to a file in Python. This list can generate a large amount of
traffic since even changing a typo in some text will trigger an email to
be sent out. But if you wish to be kept abreast of all changes to Python
then this is a good way to do so.
The Patches_ mailing list sends out an email for all changes to patch
items on SourceForge_. This list, just like Python-checkins, can
generate a large amount of email traffic. It is in general useful to
people who wish to help out with the development of Python by knowing
about all new submitted patches as well as any new developments on
preexisting ones.

`Python-bugs-list`_ functions much like the Patches mailing list except
it is for bug items on SourceForge. If you find yourself wanting to help
to close and remove bugs in Python this is the right list to subscribe
to if you can handle the volume of email.

.. _python-list: http://mail.python.org/mailman/listinfo/python-list
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _Patches: http://mail.python.org/mailman/listinfo/patches
.. _Python-bugs-list: http://mail.python.org/mailman/listinfo/python-bugs-list

The Actual Development
======================

Developing Python is not all just conversations about neat new language
features (although those neat conversations do come up and there is a
process to it). Developing Python also involves maintaining it by
eliminating discovered bugs, adding and changing features, and various
other jobs that are not necessarily glamorous but are just as important
to the language as anything else.

General SourceForge Guidelines
------------------------------

Since a good amount of Python development involves using SourceForge_,
it is important to follow some guidelines when handling a tracker item
(bug, patch, etc.). Probably one of the most important things you can do
is make sure to set the various options in a new tracker item properly.
The submitter should make sure that the Data Type, Category, and Group
are all set to reasonable values. The remaining values (Assigned To,
Status, and Resolution) should in general be left to Python developers
to set.
The exception to this rule is when you want to retract a patch; then
"close" the patch by setting Status to "closed" and Resolution to
whatever is appropriate.

Do a cursory check to make sure that whatever you are submitting was not
previously submitted by someone else. Duplication just uses up valuable
time.

And **please** do not post feature requests, bug reports, or patches to
the python-dev mailing list. If you do you will be instructed to create
an appropriate SourceForge tracker item. When in doubt as to whether you
should bring something to python-dev's attention, you can always ask on
`comp.lang.python`_; Python developers actively participate there and
will move the conversation over if it is deemed reasonable.

Feature Requests
----------------

`Feature requests`_ are for features that you wish Python had but you
have no plans on actually implementing by writing a patch. On occasion
people do go through the feature requests (also called RFCs on
SourceForge) to see if there is anything there that they think should be
implemented and actually do the implementation. But in general do not
expect something put here to be implemented without some participation
on your part.

The best way to get something implemented is to campaign for it in the
greater Python community. `comp.lang.python`_ is the best place to
accomplish this. Post to the newsgroup with your idea and see if you can
either get support or convince someone to implement it. It might even
end up being added to `PEP 42`_ so that the idea does not get lost in
the noise as time passes.

.. _feature requests: http://sourceforge.net/tracker/?group_id=5470&atid=355470
.. _PEP 42: http://www.python.org/peps/pep-0042.html

Bug Reports
-----------

Think you found a bug? Then submit a `bug report`_ on SourceForge. Make
sure you clearly specify what version of Python you are using, what OS,
and under what conditions the bug was triggered.
The more information you can give, the faster the bug can be fixed,
since time will not be wasted requesting more information from you.

.. _bug report: http://sourceforge.net/tracker/?group_id=5470&atid=105470

Patches
-------

Create a patch_ tracker item on SourceForge for any code you think
should be applied to the Python CVS tree. For practically any change to
Python's functionality the documentation and testing suite will need to
be changed as well. Doing this in the first place speeds things up
considerably.

Please make sure your patch is against the CVS repository. If you don't
know how to use it (basics are covered in the `dev FAQ`_), then make
sure you specify what version of Python you made your patch against.

In terms of coding standards, `PEP 8`_ covers Python code while `PEP 7`_
covers C code. Always try to maximize your code reuse; it makes
maintenance much easier.

For C code make sure to limit yourself to ANSI C code as much as
possible. If you must use non-ANSI C code then see if what you need is
checked for by looking in pyconfig.h. You can also look in
Include/pyport.h for more helpful C code. If what you need is still not
there but it is in general available, then add a check in configure.in
for it (don't forget to run autoreconf to make the changes take effect).
And if that *still* doesn't fit your needs then code up a solution
yourself. The reason for all of this is to limit the dependence on
external code that might not be available for all OSs that Python runs
on.

Be aware of intellectual property when handling patches. Any code with
no copyright will fall under the copyright of the `Python Software
Foundation`_. If you have no qualms with that, wonderful; this is the
best solution for Python. But if you feel the need to include a
copyright then make sure that it is compatible with the copyright used
on Python (i.e., BSD-style). The best solution, though, is to sign the
copyright over to the Python Software Foundation.

.. _patch: http://sourceforge.net/tracker/?group_id=5470&atid=305470
.. _dev FAQ: http://www.python.org/dev/devfaq.html
.. _PEP 7: http://www.python.org/peps/pep-0007.html
.. _PEP 8: http://www.python.org/peps/pep-0008.html
.. _Python Software Foundation: http://www.python.org/psf/

Changing the Language
=====================

You understand how to file a patch. You think you have a great idea on
how Python should change. You are ready to write code for your change.
Great, but you need to realize that certain things must be done for a
change to be accepted. Changes fall into two categories: changes to the
standard library (referred to as the "stdlib") and changes to the
language proper.

Changes to the stdlib
---------------------

Changes to the stdlib can consist of adding functionality or changing
existing functionality. Adding minor functionality (such as a new
function or method) requires convincing a member of python-dev that the
addition of code caused by implementing the feature is worth it. A big
addition such as a module tends to require more support than just a
single member of python-dev. As always, getting community support for
your addition is a good idea. With all additions, make sure to write up
documentation for your new functionality. Also make sure that proper
tests are added to the testing suite.

If you want to add a module, be prepared to be called upon for any bug
fixes or feature requests for that module. Getting a module added to the
stdlib makes you by default its maintainer. If you can't take that level
of responsibility and commitment and cannot get someone else to take it
on for you then your battle will be very difficult; when there is not a
specific maintainer of code python-dev takes responsibility and thus
your code must be useful to them or else they will reject the module.

Changing existing functionality can be difficult to do if it breaks
backwards-compatibility.
If your code will break existing code, you must provide a legitimate
reason why making the code act in a non-compatible way is better than
the status quo. This requires python-dev as a whole to agree to the
change.

Changing the Language Proper
----------------------------

Changing Python the language is taken **very** seriously. Python is
often heralded for its simplicity and cleanliness. Any additions to the
language must continue this tradition and view. Thus any changes must go
through a long process.

First, you must write a PEP_ (Python Enhancement Proposal). This is
basically just a document that explains what you want, why you want it,
what could be bad about the change, and how you plan on implementing the
change. It is best to get feedback on PEPs on `comp.lang.python`_ and
from python-dev. Once you feel the document is ready you can request a
PEP number and have it added to the official list of PEPs in `PEP 0`_.

Once you have a PEP, you must then convince python-dev and the BDFL that
your change is worth it. Expect to be bombarded with questions and
counter-arguments. It can drag on for over a month, easy. If you are not
up for that level of discussion then do not bother with trying to get
your change in.

If you manage to convince a majority of python-dev and the BDFL (or most
of python-dev; that can lead to the BDFL changing his mind) then your
change can be applied. As with all new code make sure you also have
appropriate documentation patches along with tests for the new
functionality.

.. _PEP: http://www.python.org/peps/pep-0001.html
.. _PEP 0: http://www.python.org/peps/pep-0000.html

Helping Out
===========

Many people say they wish they could help out with the development of
Python but feel they are not up to writing code. There are plenty of
things one can do, though, that do not require you to write code.
Regardless of your coding abilities, there is something for everyone to
help with.
For feature requests, adding a comment about what you think is helpful.
State whether or not you would like to see the feature. You can also
volunteer to write the code to implement the feature if you feel up to
it.

For bugs, stating whether or not you can reproduce the bug yourself can
be extremely helpful. If you can write a fix for the bug that is very
helpful as well.

For patches, apply the patch and run the testing suite. You can do a
code review on the patch to make sure that it is good, clean code. Help
add to the patch if it is missing documentation patches or needed
regression tests. If the patch adds a new feature, comment on whether
you think it is worth adding. If it changes functionality then comment
on whether you think it might break code; if it does, say whether you
think it is worth the cost of breaking existing code.

For language changes, make your voice heard. Comment about any PEPs on
`comp.lang.python`_ so that the general opinion of the community can be
assessed.

If there is nothing specific you want to work on but you still feel like
contributing, there are several things you can do. The documentation can
always use fleshing out. Adding more tests to the testing suite is
always useful. Contribute to discussions on python-dev or
`comp.lang.python`_. Just helping out in the community by spreading the
word about Python or helping someone with a question is helpful.

If you really want to get knee-deep in all of this, join python-dev.
Once you have been actively participating for a while and are generally
known on python-dev you can request to have checkin rights on the CVS
tree. It is a great way to learn how to work in a large, distributed
group along with how to write great code.

And if all else fails, give money: the `Python Software Foundation`_ is
a non-profit organization that accepts donations that are tax-deductible
in the United States.
The funds are used for various things, from lawyers for handling the
intellectual property of Python to funding PyCon_. But the PSF could do
a lot more if they had the funds. Every dollar does help, so please
contribute if you can.

.. _PyCon: http://www.python.org/pycon/

Conclusion
==========

If you get any message from this document, it should be that *anyone*
can help Python. All help is greatly appreciated and keeps the language
the wonderful piece of software that it is.

From bac at OCF.Berkeley.EDU  Mon Oct 13 22:31:10 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Oct 13 22:31:24 2003
Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30
In-Reply-To:
References:
Message-ID: <3F8B5FEE.6010203@ocf.berkeley.edu>

Mike Rovner wrote:
> Brett C. wrote:
>
>>We want *you* to help with the war on SF patch items
>>----------------------------------------------------
>>Someone wanted to help but wasn't sure how they could. Martin v.
>>Loewis sent an email listing common things anyone can do to help with
>>dealing with the patch items on SourceForge_. The email can be found
>>at
>>http://mail.python.org/pipermail/python-dev/2003-September/038253.html
>
> 24 Sep 2003 09:26:12 +0200 martin v.loewis.de wrote:
>
>>>Aahz pythoncraft.com> writes:
>>
>>Also, try to classify the patch somehow, indicating what most likely
>>the problem is for the patch not being reviewed/accepted:
>>
>>>- the patch might be incomplete. Ping the submitter. If the submitter
>>>  is incomplete, either complete it yourself, or suggest rejection
>>>  of the patch.
>
> All I can do as an SF registered user is add a comment to an existing
> patch. I can't extend it, submit extra files, i.e. "complete" it.
>
> Please clarify the preferable way to "help with the war on SF patch
> items".

There is a lot you can do even if you can just comment. Apply the patch
and verify that it works for you (especially if it relies on OS-specific
code) and say what happens.
Comment on the cleanliness of the code. If it adds a new feature, state
whether you think it is a good addition or not. Make sure that
backwards-compatibility is not broken. And if it is, say whether you
think it is a good idea to break backwards-compatibility.

Doing *anything* to help the patch along is a great help since it allows
people who do have checkin abilities to spend less time double-checking
the patch and thus can get to more patches with what little time they
have available to work on Python.

-Brett

From guido at python.org  Mon Oct 13 23:31:58 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 13 23:32:22 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Mon, 13 Oct 2003 22:25:51 EDT."
References:
Message-ID: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>

> [Phillip J. Eby]
> > What expense? The extra memory overhead for the index? I suppose
> > so.
>
> Yes, that is an expense. Partly because of the extra memory space in
> len(list) temp tuples, but mostly because space allocated for integer
> objects is immortal. That is,
>
>     range(1000000)
>
> grabs space for 1000000 distinct integer objects that's never reused for any
> other kind of object, and so does stuffing a million distinct int objects
> into a temp DSU list. Note that this is very different from doing
>
>     for i in xrange(1000000):
>
> which allocates space for only three integer objects (1000000, the current
> value of i, and the preceding value of i), and keeps reusing it.
>
> A cleverer implementation might be able to avoid permanently ratcheting the
> space devoted to int objects.
>
> > But if you *don't* want that behavior, you can still DSU manually, no?
>
> I hope so .

After reading this exchange, I'm not sure I agree with Tim about the
importance of avoiding to compare the full records.
Certainly the *cost* of that comparison doesn't bother me: I expect it's
usually going to be a tuple or list of simple types like ints and
strings, and comparing those is pretty efficient.

Sure, there's a pattern that requires records with equal keys to remain
in the original order, but that seems an artefact of a pattern used only
for external sorts (where the sorted data is too large to fit in memory
-- Knuth Vol. III is full of algorithms for this, but they seem mostly
of historical importance in this age of Gigabyte internal memory). The
pattern is that if you want records sorted by zipcode and within each
zipcode sorted by name, you first sort by name and then do a stable sort
by zipcode. This was common in the days of external sorts. (External
sorts are still common in some application domains, of course, but I
doubt that you'd find much Python being used there for the main sort.)

But using Raymond's proposal, you can do that by specifying a tuple
consisting of zipcode and name as the key, as follows:

    myTable.sort(key = lambda rec: (rec.zipcode, rec.name))

This costs an extra tuple, but the values in the tuple are not new, so
it costs less space than adding the index int (not even counting the
effect of the immortality of ints). And tuples aren't immortal. (To be
honest, for tuples of length < 20, the first 2000 of a given size *are*
immortal, but that's a strictly fixed amount of wasted memory.)

Given that this solution isn't exactly rocket science, I think the
default behavior of Raymond's original proposal (making the whole record
the second sort key) is fine.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 13 23:33:37 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 13 23:33:45 2003
Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25
In-Reply-To: Your message of "Mon, 13 Oct 2003 17:51:52 CDT."
             <16267.11400.169738.924956@montanaro.dyndns.org>
References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com>
        <16267.11400.169738.924956@montanaro.dyndns.org>
Message-ID: <200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>

> >> I would also advocate an optional reverse=False argument, so that
> >>
> >> result = sort(names, reverse=True)
> >>
> >> is equivalent to
> >>
> >> result = sort(names)
> >> result.reverse()
>
> Guido> While we're at it, +1.

[Skip]
> direction=[ascending|descending]
>
> ? Just a thought.

But where would these constants be defined? Using direction='ascending'
feels ugly.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Mon Oct 13 23:44:19 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Oct 13 23:44:36 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To:
Message-ID: <200310140344.h9E3iJF10129@oma.cosc.canterbury.ac.nz>

Tim Peters :
> mostly because space allocated for integer objects is immortal.

The implementation doesn't necessarily have to store the sequence
numbers as Python integers.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From mike at nospam.com  Tue Oct 14 00:21:07 2003
From: mike at nospam.com (Mike Rovner)
Date: Tue Oct 14 00:21:11 2003
Subject: [Python-Dev] Re: Draft of an essay on Python development (and how to help)
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID:

Brett C. wrote:
> The main goal of this doc is twofold: 1) to have something to point
> people to when they ask how they can help or get started on python-dev
> (maybe even be referenced in the welcome email)

Very nice welcome reading (probably you want to hear from a novice to
python-dev).
> Help add to the patch if it is missing documentation patches or needed
> regression tests.

Please don't call for things that can't be done by anyone except the
patch author or a py-dev member.

Regards,
Mike

From greg at cosc.canterbury.ac.nz  Tue Oct 14 03:13:52 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 14 03:14:09 2003
Subject: [Python-Dev] MacPython - access to FinderInfo of a directory
In-Reply-To:
Message-ID: <200310140713.h9E7DqO10997@oma.cosc.canterbury.ac.nz>

> Greg,
> there's an SF bug for this one: #706585. If you could attach your
> patch to this one I'll do the magic to work it around to bgen.

Okay, I've done that.

Greg

From aleaxit at yahoo.com  Tue Oct 14 04:01:10 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 04:01:19 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <200310141001.10705.aleaxit@yahoo.com>

On Tuesday 14 October 2003 05:31 am, Guido van Rossum wrote:
   ...
> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I have and have seen many use cases where the things being sorted are
dictionaries (comparisons can be costlier) or instances (they can be
non-comparable).

I agree that the "stable" nature of sorting is not all that important in
our context. But avoiding whole-record comparison in the general case
seems important enough to me that I'd accept any arbitrary non-comparing
behavior (e.g. making the id of the thing being sorted the secondary
key!-) rather than default to whole-record compares.
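Alex's dictionary case is where a key argument (the feature under discussion here, which eventually shipped in Python 2.4) pays off most directly: the records themselves are never compared, only the key values are, and a tuple-valued key gives multi-level ordering in a single pass. A small sketch with made-up data:

```python
# Hypothetical records; dicts have no ordering of their own, so sorting
# them works only because the (zipcode, name) key tuples are what
# actually get compared -- the dicts never are.
people = [
    {"name": "Ng",    "zipcode": "10001"},
    {"name": "Adams", "zipcode": "10001"},
    {"name": "Baker", "zipcode": "02139"},
]
people.sort(key=lambda rec: (rec["zipcode"], rec["name"]))
print([p["name"] for p in people])   # ['Baker', 'Adams', 'Ng']
```

This is Guido's zipcode-and-name pattern from earlier in the thread; a plain `people.sort()` here would have to compare the dicts themselves, which is exactly the whole-record comparison Alex wants to avoid.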
Alex

From gerrit at nl.linux.org  Tue Oct 14 04:15:53 2003
From: gerrit at nl.linux.org (Gerrit Holl)
Date: Tue Oct 14 04:16:06 2003
Subject: [Python-Dev] Draft of an essay on Python development (and how to help)
In-Reply-To: <3F8B5ECB.4030207@ocf.berkeley.edu>
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID: <20031014081553.GA2976@nl.linux.org>

Brett C. wrote:
> Feature Requests
> ----------------
> `Feature requests`_ are for features that you wish Python had but you
> have no plans on actually implementing by writing a patch. On occasion
> people do go through the features requests (also called RFCs on
> SourceForge) to see if there is anything there that they think should be
> implemented and actually do the implementation. But in general do not
> expect something put here to be implemented without some participation
> on your part.

I think feature requests are called RFE's in SF terminology, not RFC's.

regards,
Gerrit.

--
173. If this woman bear sons to her second husband, in the place to
which she went, and then die, her earlier and later sons shall divide
the dowry between them.
        -- 1780 BC, Hammurabi, Code of Law
--
Asperger Syndrome - a personal approach:
        http://people.nl.linux.org/~gerrit/
Resist this cabinet:
        http://www.sp.nl/

From just at letterror.com  Tue Oct 14 04:37:33 2003
From: just at letterror.com (Just van Rossum)
Date: Tue Oct 14 04:37:43 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I have no opinion about the importance, but I do have a use case that
differs from Tim's.
The other week I found myself sorting a list of dictionary keys by an
arbitrary attribute of the dict values. The sort needed to be stable, in
the sense that for attributes that contained equal values, the previous
sort order was to be maintained. The dict values themselves weren't
meaningfully sortable. What I did (had to do, given the requirements) is
almost exactly what Tim proposes (I included the indices in the sort),
so having that functionality built into list.sort() would have been
helpful for me. Not having that functionality would mean I'd either not
use the decorator sort feature (ie. do what I do now) or go through
hoops and make the decorator generate the indices. The latter approach
doesn't sound very appealing to me.

Just

From anthony at interlink.com.au  Tue Oct 14 04:47:54 2003
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Oct 14 04:50:25 2003
Subject: [Python-Dev] server side digest auth support
Message-ID: <200310140847.h9E8ltLn028921@localhost.localdomain>

We've got http digest auth [RFC 2617] support at the client level in
the standard library, but it doesn't seem like there's server side
support. I'm planning on adding this (for pypi) but it's not clear
where it should go - I want to use it from a CGI, but I can see it
being useful for people writing HTTP servers as well. Should I just
make a new module httpdigest.py?

Anthony

--
Anthony Baxter     It's never too late to have a happy childhood.

From python at rcn.com  Tue Oct 14 05:00:34 2003
From: python at rcn.com (Raymond Hettinger)
Date: Tue Oct 14 05:01:12 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To:
Message-ID: <000701c39231$a10332a0$e841fea9@oemcomputer>

I've got a first draft patch (sans docs and tests) loaded at:
www.python.org/sf/823292

The argument keywords are: cmpfunc, key, reverse

The patch passes regression tests and a minimal set of basic
functionality tests which need to be expanded considerably. I'll need
to go back over this one in more detail to check:

* Whether the code was inserted in the right place with respect to the
  existing anti-mutation code.

* Is the strategy of decorating in-place too aggressive? Decoration
  consists of *replacing* each value x with (x, key(x)).

* Verify reference counting and error handling.


Raymond Hettinger

From Paul.Moore at atosorigin.com  Tue Oct 14 05:31:42 2003
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Tue Oct 14 05:32:29 2003
Subject: [Python-Dev] decorate-sort-undecorate
Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C09890@UKDCX001.uk.int.atosorigin.com>

From: Raymond Hettinger [mailto:python@rcn.com]
> I've got a first draft patch (sans docs and tests) loaded at:
> www.python.org/sf/823292
>
> The argument keywords are: cmpfunc, key, reverse

Can I just clarify the meaning of reverse (the original posting was a
little unclear)? I think that

    l.sort(reverse=True)

should mean the same as

    l.sort()
    l.reverse()

(ie, both sort and reverse inplace, with a void return). The original
posting gave me the impression that a copy would be done (which I
don't think is necessary).

Paul.

From sholden at holdenweb.com  Tue Oct 14 08:15:57 2003
From: sholden at holdenweb.com (Steve Holden)
Date: Tue Oct 14 08:20:46 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <1066097091.19072.11.camel@geddy>
Message-ID:

[bazzer][ ... ]
>
> Using this explanation, "key" doesn't seem right to me. I can't think
> of anything that I like better though, so I guess I just
> won't send this
> email afteral...
>
That was a sensible decision. It saved me from having to send this one.
regards
--
Steve Holden          +1 703 278 8281        http://www.holdenweb.com/
Improve the Internet  http://vancouver-webpages.com/CacheNow/
Python Web Programming http://pydish.holdenweb.com/pwp/
Interview with GvR August 14, 2003  http://www.onlamp.com/python/

From pinard at iro.umontreal.ca  Tue Oct 14 08:28:01 2003
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Tue Oct 14 09:57:20 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <20031014122801.GA2559@titan.progiciels-bpi.ca>

[Guido van Rossum]

> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records.

It could be useful to avoid comparing the full records. Once in a
while, I have the problem of comparing objects which are not comparable
to start with, and have to choose between making them comparable, or
using decoration for the time of the sort, in which case there is a
guarantee that the objects themselves will not be used in comparisons
(by ensuring decoration keys never compare equal).

The third option, providing a comparison function, is something I have
succeeded in avoiding so far, as it seems to me a good habit to rely on
fast idioms at hand instead of on speed-impacting formulations, and
good habits are best kept by sticking to them. :-)

The problem with making objects comparable is that you fix a preferred
or "natural" ordering for the objects, which might not be so "natural"
when you need to sort them differently. In some circumstances, maybe
many of them, it is significantly cleaner to leave the objects as not
comparable.

--
François Pinard   http://www.iro.umontreal.ca/~pinard

From guido at python.org  Tue Oct 14 10:37:44 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:38:02 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 05:00:34 EDT."
	<000701c39231$a10332a0$e841fea9@oemcomputer>
References: <000701c39231$a10332a0$e841fea9@oemcomputer>
Message-ID: <200310141437.h9EEbii23697@12-236-54-216.client.attbi.com>

> I've got a first draft patch (sans docs and tests) loaded at:
> www.python.org/sf/823292

No time to review, so feedback just on this email. :-(

> The argument keywords are: cmpfunc, key, reverse

I'd suggest using 'cmp' instead of 'cmpfunc'. (Same argument as for
'key' vs. 'keyfunc'.)

> The patch passes regression tests and a minimal set of basic
> functionality tests which need to be expanded considerably. I'll need
> to go back over this one in more detail to check:
>
> * Whether the code was inserted in the right place with respect to the
> existing anti-mutation code.
>
> * Is the strategy of decorating in-place too aggressive? Decoration
> consists of *replacing* each value x with (x, key(x)).

Should be fine. AFAIR Tim's sort code sets the length of the list to
0, so accessing the list while it's being sorted is not supported
anyway.

> * Verify reference counting and error handling.

Write unit tests and measure process size.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 14 10:46:16 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:47:00 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:37:33 +0200."
References:
Message-ID: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>

[Just]
> I have no opinion about the importance, but I do have a use case that
> differs from Tim's.
>
> The other week I found myself sorting a list of dictionary keys by an
> arbitrary attribute of the dict values. The sort needed to be stable, in
> the sense that for attributes that contained equal values, the previous
> sort order was to be maintained. The dict values themselves weren't
> meaningfully sortable. What I did (had to do, given the requirements) is
> almost exactly what Tim proposes (I included the indices in the sort),
> so having that functionality built into list.sort() would have been
> helpful for me. Not having that functionality would mean I'd either not
> use the decorator sort feature (ie. do what I do now) or go through
> hoops and make the decorator generate the indices. The latter approach
> doesn't sound very appealing to me.

Hm. I wonder if this could be solved by yet another keyword argument
(maybe "stable"?) controlling whether to add the index to the key a la
Tim's recipe.

I note that there are many different uses of sort. Many common uses
only sort small lists, where performance doesn't matter much; I often
use an inefficient cmp function without worrying about performance in
such cases. But there are also uses that really test the extremes of
Python's performance, and it's a tribute to Tim that his sort code
stands up so well in that case. I think it's inevitable that the
default options aren't necessarily best for *all* use cases.

I'm not sure whether the defaults should cater to the extreme
performance cases or to the smaller cases; I expect that the latter
are more common, and people who are sorting truly huge lists should
read the manual if they care about performance. But that's just me.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 14 10:50:36 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:51:04 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:01:10 +0200."
	<200310141001.10705.aleaxit@yahoo.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
Message-ID: <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>

[Alex]
> I have and have seen many use cases where the things being sorted
> are dictionaries (comparisons can be costlier) or instances (they can
> be non-comparable).
>
> I agree that the "stable" nature of sorting is not all that important in
> our context. But avoiding whole-record comparison in the general
> case seems important enough to me that I'd accept any arbitrary
> non-comparing behavior (e.g. making the id of the thing being sorted
> the secondary key!-) rather than default to whole-record compares.

Given that internally we still do a DSU, sorting tuples of (key,
something), using the id of the record for 'something' is just as
inefficient as using the original index -- in both cases we'd have to
allocate len(lst) ints.

Greg Ewing suggested that the ints shouldn't have to be Python ints.
While this is true, it would require a much larger overhaul of the
existing sort code, which assumes the "records" to be sorted are
pointers to objects.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com  Tue Oct 14 10:57:40 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 10:57:45 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>
References: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>
Message-ID: <200310141657.40072.aleaxit@yahoo.com>

On Tuesday 14 October 2003 04:46 pm, Guido van Rossum wrote:
   ...
> I'm not sure whether the defaults should cater to the extreme
> performance cases or to the smaller cases; I expect that the latter
> are more common, and people who are sorting truly huge lists should
> read the manual if they care about performance. But that's just me.

I think your general philosophy on "defaults cover normal cases" is
part of what makes Python so good, so, if it's just you, that need not
be a bad thing;-).

However, it seems to me that, in a normal case (sorting a smallish
number of easily comparable thingies), whether the indices are or are
not added to the decoration is not going to make an enormous difference
either way. So, maybe we should focus on two slightly less normal cases
where performance or correctness may be impacted:

-- if we're sorting a huge list of easily comparable thingies then the
   overhead of adding so many indices to the decoration might hurt

-- if we're sorting a list of expensive-to-compare thingies (e.g.
   dicts) or non-comparable thingies, the indices (or something, but
   might as well be the indices, it seems to me) are needed in the
   decoration (except in the special cases where all keys can be
   guaranteed to differ, of course) -- whether the list is huge or not

This, plus your indication that only people sorting truly huge lists
should have to read the manual, suggests to me that defaulting to
decoration-with-indices (perhaps with an option to omit the indices)
might be a preferable choice.


Alex

From tim.one at comcast.net  Tue Oct 14 10:58:10 2003
From: tim.one at comcast.net (Tim Peters)
Date: Tue Oct 14 10:58:13 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID:

[Guido]
> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I made the remark about cost in reference to database sorts
specifically, where falling back to whole-record comparison can very
easily double the cost of a sort -- or much worse than that.
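[Editorial note: the tie-breaking scheme under discussion can be sketched in a few lines of Python. The `Record` class below is a made-up stand-in, not Kevin's application; decorating each record with (key, index, record) means equal keys fall through to a cheap int comparison, and the records themselves are never compared.]

```python
# Sketch only: decorate with (key, index, record) so key ties are
# broken by the cheap int index instead of whole-record comparison.
class Record:
    # Hypothetical record type; deliberately has no comparison methods.
    def __init__(self, exchange, name):
        self.exchange = exchange
        self.name = name

records = [Record("NYSE", "Acme"), Record("NASDAQ", "Zeta"),
           Record("NYSE", "Baker")]

# Decorate, sort, undecorate.  Indices are unique, so tuple comparison
# never reaches the third element, and the result is stable.
decorated = [(rec.exchange, i, rec) for i, rec in enumerate(records)]
decorated.sort()
result = [rec for exchange, i, rec in decorated]

assert [r.name for r in result] == ["Zeta", "Acme", "Baker"]
```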
We saw that in a sample database-sort application Kevin Altis sent
when the 2.3 sort was being developed, where the results were grossly
distorted at first because the driver *didn't* arrange to break key
ties with cheap compares. Sort a database of companies by (as that
example did, among other things) the stock exchange each is listed on,
and you're guaranteed that a great many duplicate keys exist (there
are more companies than stock exchanges). Comparing two ints is then
vastly cheaper than firing up a written-in-Python general
database-record long-winded __cmp__ function.

> Sure, there's a pattern that requires records with equal keys to remain
> in the original order, but that seems an artefact of a pattern used only
> for external sorts (where the sorted data is too large to fit in
> memory -- Knuth Vol. III is full of algorithms for this, but they seem
> mostly of historical importance in this age of Gigabyte internal
> memory).

It's not externality, it's decomposability: stability is what allows
an N-key primary-secondary-etc sort to be done one key at a time
instead, in N passes, and get the same result either way. Almost all
sorts you're likely to use in real life are stable in order to support
this, whether it's clicking on an email-metadata column in Outlook, or
sorting an array of data by a contained column in Excel. These are
in-memory sorts, but interactive, where the user refines sort criteria
on the fly and the app has no memory of what steps were taken before
the current sort. Then stability is essential to getting the right
result -- or the user has to fill out a complex multi-key sort dialog
each time.

> The pattern is that if you want records sorted by zipcode and within
> each zipcode sorted by name, you first sort by name and then do a
> stable sort by zipcode. This was common in the days of external
> sorts.

I wager it's much more common now (not externality, but sorting first
by one key, then by another -- it's interactivity that drives this
now).

> ...
> But using Raymond's proposal, you can do that by specifying a tuple
> consisting of zipcode and name as the key, as follows:
>
> myTable.sort(key = lambda rec: (rec.zipcode, rec.name))
>
> This costs an extra tuple, but the values in the tuple are not new, so
> it costs less space than adding the index int (not even counting the
> effect of the immortality of ints).

A hidden cost is that apps supporting interactive (or any other form
of multi-stage) sort refinements have to keep track of the full set of
sort keys ever applied.

> And tuples aren't immortal. (To be honest, for tuples of length < 20,
> the first 2000 of a given size *are* immortal, but that's a strictly
> fixed amount of wasted memory.)

I don't care about that either.

> Given that this solution isn't exactly rocket science, I think the
> default behavior of Raymond's original proposal (making the whole
> record the second sort key) is fine.

It does approach rocket science for an end user to understand why
their database sort is so slow in the presence of many equal keys, and
the absence of a cheap tie-breaker. It's something I didn't appreciate
either until digging into why Kevin's sample app was so bloody slow in
some cases (but not all! it was the sorts with many equal keys that
were pig slow, and because -- as the app was coded -- they fell back
to whole-record comparison).

From aleaxit at yahoo.com  Tue Oct 14 11:00:33 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 11:00:39 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
References: <200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
Message-ID: <200310141700.33217.aleaxit@yahoo.com>

On Tuesday 14 October 2003 04:50 pm, Guido van Rossum wrote:
   ...
> > case seems important enough to me that I'd accept any arbitrary
> > non-comparing behavior (e.g. making the id of the thing being sorted
> > the secondary key!-) rather than default to whole-record compares.
>
> Given that internally we still do a DSU, sorting tuples of (key,
> something), using the id of the record for 'something' is just as
> inefficient as using the original index -- in both cases we'd have to
> allocate len(lst) ints.

Yes, of course, I was just being facetious -- sorry for not making
that clearer.

> Greg Ewing suggested that the ints shouldn't have to be Python ints.
> While this is true, it would require a much larger overhaul of the
> existing sort code, which assumes the "records" to be sorted are
> pointers to objects.

Again, true. But maybe the performance increase would be worth the
substantial effort (I don't understand the current sort code enough to
say more than "maybe"!-).


Alex

From skip at pobox.com  Tue Oct 14 11:19:19 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 14 11:19:30 2003
Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25
In-Reply-To: <200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>
References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com>
	<16267.11400.169738.924956@montanaro.dyndns.org>
	<200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>
Message-ID: <16268.5111.311067.830227@montanaro.dyndns.org>

    Guido> [Skip]
    >> direction=[ascending|descending]
    >>
    >> ? Just a thought.

    Guido> But where would these constants be defined? Using
    Guido> direction='ascending' feels ugly.

I agree there are problems with the concept. I was just thinking that
reverse=True implies that the user knows without being told what
"forward" is (without relying on past experience with stuff like the
Unix sort() function). Fortunately, it's easy enough to try things out
in Python. ;-)

Skip

From skip at pobox.com  Tue Oct 14 11:20:57 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 14 11:21:06 2003
Subject: [Python-Dev] Re: Draft of an essay on Python development (and how
	to help)
In-Reply-To:
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID: <16268.5209.53004.566117@montanaro.dyndns.org>

    >> Help add to the patch if it is missing documentation patches or
    >> needed regression tests.

    Mike> Please don't herald for things that can't be done by anyone except
    Mike> patch author or py-dev member.

Or identify such barriers and suggest alternate paths to submit such
info.

Skip

From nas-python at python.ca  Tue Oct 14 11:27:52 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 14 11:27:03 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <20031014152752.GA11335@mems-exchange.org>

On Mon, Oct 13, 2003 at 08:31:58PM -0700, Guido van Rossum wrote:
> But using Raymond's proposal, you can do that by specifying a tuple
> consisting of zipcode and name as the key, as follows:
>
> myTable.sort(key = lambda rec: (rec.zipcode, rec.name))

This reads nicely. +1 on 'key'.

  Neil

From jrw at pobox.com  Tue Oct 14 12:09:15 2003
From: jrw at pobox.com (John Williams)
Date: Tue Oct 14 12:09:20 2003
Subject: [Python-Dev] decorate-sort-undecorate
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
Message-ID: <3F8C1FAB.8020607@pobox.com>

Guido van Rossum wrote:
> Given that internally we still do a DSU, sorting tuples of (key,
> something), using the id of the record for 'something' is just as
> inefficient as using the original index -- in both cases we'd have to
> allocate len(lst) ints.
>
> Greg Ewing suggested that the ints shouldn't have to be Python ints.
> While this is true, it would require a much larger overhaul of the
> existing sort code, which assumes the "records" to be sorted are
> pointers to objects.

Why not use a special tuple type for the DSU algorithm that ignores
its last element when doing a comparison? It eliminates the problem of
creating a zillion int objects, and it would be easy to implement.

jw

From nas-python at python.ca  Tue Oct 14 12:17:56 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 14 12:17:04 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <3F8C1FAB.8020607@pobox.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
Message-ID: <20031014161756.GC11579@mems-exchange.org>

On Tue, Oct 14, 2003 at 11:09:15AM -0500, John Williams wrote:
> Why not use a special tuple type for the DSU algorithm that ignores its
> last element when doing a comparison?

Clever idea I think. You don't need a special tuple, just a little
wrapper object that holds the key and the original value and uses the
key for tp_richcompare.

  Neil

From guido at python.org  Tue Oct 14 12:31:47 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 12:32:07 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:58:10 EDT."
References:
Message-ID: <200310141631.h9EGVlf24023@12-236-54-216.client.attbi.com>

> [Guido]
> > After reading this exchange, I'm not sure I agree with Tim about the
> > importance of avoiding to compare the full records. Certainly the
> > *cost* of that comparison doesn't bother me: I expect it's usually
> > going to be a tuple or list of simple types like ints and strings, and
> > comparing those is pretty efficient.
[Tim]
> I made the remark about cost in reference to database sorts specifically,
> where falling back to whole-record comparison can very easily double the
> cost of a sort -- or much worse than that.

When exactly do you consider a sort a "database sort"?

> We saw that in a sample
> database-sort application Kevin Altis sent when the 2.3 sort was being
> developed, where the results were grossly distorted at first because the
> driver *didn't* arrange to break key ties with cheap compares. Sort a
> database of companies by (as that example did, among other things) the
> stock exchange each is listed on, and you're guaranteed that a great many
> duplicate keys exist (there are more companies than stock exchanges).
> Comparing two ints is then vastly cheaper than firing up a written-in-Python
> general database-record long-winded __cmp__ function.

No argument there.

> > Sure, there's a pattern that requires records with equal keys to remain
> > in the original order, but that seems an artefact of a pattern used only
> > for external sorts (where the sorted data is too large to fit in
> > memory -- Knuth Vol. III is full of algorithms for this, but they seem
> > mostly of historical importance in this age of Gigabyte internal
> > memory).
>
> It's not externality, it's decomposability: stability is what allows an
> N-key primary-secondary-etc sort to be done one key at a time instead, in N
> passes, and get the same result either way. Almost all sorts you're likely
> to use in real life are stable in order to support this, whether it's
> clicking on an email-metadata column in Outlook, or sorting an array of data
> by a contained column in Excel.

I experimented a bit with the version of Outlook I have, and it seems
to always use the delivery date/time as the second key, and always in
descending order.

> These are in-memory sorts, but interactive,
> where the user refines sort criteria on the fly and the app has no memory of
> what steps were taken before the current sort. Then stability is essential
> to getting the right result -- or the user has to fill out a complex
> multi-key sort dialog each time.

I'm not sure that this helps us decide the default behavior of sorts
in Python, which are rarely interactive in this sense. (If someone
writes an Outlook substitute, they can pretty well code the sort to do
whatever they want.)

> > The pattern is that if you want records sorted by zipcode and within
> > each zipcode sorted by name, you first sort by name and then do a
> > stable sort by zipcode. This was common in the days of external
> > sorts.
>
> I wager it's much more common now (not externality, but sorting first by one
> key, then by another -- it's interactivity that drives this now).

But I don't see the interactivity in Python apps, and that's what
counts here.

> > ...
> > But using Raymond's proposal, you can do that by specifying a tuple
> > consisting of zipcode and name as the key, as follows:
> >
> > myTable.sort(key = lambda rec: (rec.zipcode, rec.name))
> >
> > This costs an extra tuple, but the values in the tuple are not new, so
> > it costs less space than adding the index int (not even counting the
> > effect of the immortality of ints).
>
> A hidden cost is that apps supporting interactive (or any other form of
> multi-stage) sort refinements have to keep track of the full set of sort
> keys ever applied.

A small cost compared to the cost of writing an interactive app, *if*
you really want this behavior (I doubt it matters to most people using
Outlook).

> > And tuples aren't immortal. (To be honest, for tuples of length < 20,
> > the first 2000 of a given size *are* immortal, but that's a strictly
> > fixed amount of wasted memory.)
>
> I don't care about that either.
>
> > Given that this solution isn't exactly rocket science, I think the
> > default behavior of Raymond's original proposal (making the whole
> > record the second sort key) is fine.
>
> It does approach rocket science for an end user to understand why their
> database sort is so slow in the presence of many equal keys, and the absence
> of a cheap tie-breaker. It's something I didn't appreciate either until
> digging into why Kevin's sample app was so bloody slow in some cases (but
> not all! it was the sorts with many equal keys that were pig slow, and
> because -- as the app was coded -- they fell back to whole-record comparison).

Yeah, understanding performance anomalies is hard.

To cut all this short, I propose that we offer using the index as a
second sort key as an option, on by default, whose name can be (barely
misleading) "stable". On by default nicely matches the behavior of the
2.3 sort without any options.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From theller at python.net  Tue Oct 14 12:37:00 2003
From: theller at python.net (Thomas Heller)
Date: Tue Oct 14 12:37:06 2003
Subject: [Python-Dev] buildin vs. shared modules
In-Reply-To: <3F872FE9.9070508@v.loewis.de>
	( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sat, 11 Oct 2003
	00:17:13 +0200")
References: <3F872FE9.9070508@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> Thomas Heller wrote:
>> What is the rationale to decide whether a module is builtin or an
>> extension module in core Python (I only care about Windows)?
>
> I believe it is mostly tradition, on Windows: We continue to do
> things the way they have always been done.
>
> On Linux, there is an additional rationale: small executables and
> many files are cool, so we try to have as many shared libraries as
> possible.

(if you smell sarcasm - that is intentional)

>
>> To give examples, could zlib be made into a builtin module (because it's
>> useful for zipimport), _sre (because it's used by warnings), or are
>> there reasons preventing this?
>
> I think that anything that would be reasonably replaced by third parties
> (such as pyexpat.pyd) should be shared, and anything else should be part
> of pythonxy.dll.

If I look at the file sizes in the DLLs directory, it seems that at
least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly
grow python23.dll. Is unicodedata.pyd used by the encoding/decoding
methods?

"Tim Peters" writes:

> [Thomas Heller]
>> What is the rationale to decide whether a module is builtin or an
>> extension module in core Python (I only care about Windows)?
>
> I don't know that there is one. Maybe to avoid chewing address space for
> code that some programs won't use. Generally speaking, it appears some
> effort was made to make stuff an extension module on Windows if it was an
> optional part of the Unix build. There was certainly an effort made to
> build an extension for Python modules wrapping external code (like the
> _bsddb and _tkinter projects).
>
>> To give examples, could zlib be made into a builtin module (because
>> it's useful for zipimport), _sre (because it's used by warnings), or
>> are there reasons preventing this?
>
> zlib was there long before Python routinely made use of it; indeed, I doubt
> I ever used one byte of the zlib code outside of Python testing before zip
> import came along (and since I have no zip files to import from I guess I
> still never use it). Leaving _sre an extension seems odd now, but at the
> time it was competing with the external-to-Python PCRE code.
>
> Why do you ask? Answers must be accurate to 10 decimal digits.

Well, people complain about the number of files py2exe creates. And
especially the modules used to init Python itself (in the 1.5 days,
exceptions.py, nowadays zlib.pyd) have to be special cased because
they cannot use the import hooks.

Thomas

From guido at python.org  Tue Oct 14 12:55:54 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 12:56:01 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 11:09:15 CDT."
	<3F8C1FAB.8020607@pobox.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
Message-ID: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>

> Why not use a special tuple type for the DSU algorithm that ignores its
> last element when doing a comparison? It eliminates the problem of
> creating a zillion int objects, and it would be easy to
> implement.

If we're going to do a custom object, it should be a fixed-length
struct containing (1) the key, (2) a C int of sufficient size to hold
the record index; (3) a pointer to the record, and its comparison
should only use (1) and (2).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From just at letterror.com  Tue Oct 14 13:07:34 2003
From: just at letterror.com (Just van Rossum)
Date: Tue Oct 14 13:07:45 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> > Why not use a special tuple type for the DSU algorithm that ignores
> > its last element when doing a comparison? It eliminates the problem
> > of creating a zillion int objects, and it would be
> > easy to implement.
>
> If we're going to do a custom object, it should be a fixed-length
> struct containing (1) the key, (2) a C int of sufficient size to hold
> the record index; (3) a pointer to the record, and its comparison
> should only use (1) and (2).

But since we have a stable sort, (2) can be omitted. I agree with Neil
that this is a very clever idea!

Just

From fdrake at acm.org  Tue Oct 14 13:22:29 2003
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Oct 14 13:22:43 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
	<200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
Message-ID: <16268.12501.505494.397966@grendel.zope.com>

Guido van Rossum writes:
> If we're going to do a custom object, it should be a fixed-length
> struct containing (1) the key, (2) a C int of sufficient size to hold
> the record index; (3) a pointer to the record, and its comparison
> should only use (1) and (2).

As has been pointed out, we already have a stable sort. Instead of
making stability an option, let's just keep it.

We could allocate a second array of PyObject* to mirror the list
contents; that would have only the keys. When two values are switched
in the sort, the values in both the key list and the value list can be
switched. When done, we only need to decref the computed keys and free
the array of keys. No additional structures would be needed.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation From nas-python at python.ca Tue Oct 14 13:26:27 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 14 13:25:41 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com> <200310141001.10705.aleaxit@yahoo.com> <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com> <3F8C1FAB.8020607@pobox.com> <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> Message-ID: <20031014172627.GC11868@mems-exchange.org> On Tue, Oct 14, 2003 at 09:55:54AM -0700, Guido van Rossum wrote: > If we're going to do a custom object, it should be a fixed-length > struct containing (1) the key, (2) a C int of sufficient size to hold > the record index; (3) a pointer to the record, and its comparison > should only use (1) and (2). I just thought of another reason why this is a good idea. Imagine I want to sort a list of objects that cannot be compared (e.g. complex numbers). I would expect cnums.sort(key = lambda n: n.real) to work, not fail with an exception. Neil From guido at python.org Tue Oct 14 13:43:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 13:44:01 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 13:22:29 EDT." <16268.12501.505494.397966@grendel.zope.com> References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com> <200310141001.10705.aleaxit@yahoo.com> <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com> <3F8C1FAB.8020607@pobox.com> <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> <16268.12501.505494.397966@grendel.zope.com> Message-ID: <200310141743.h9EHho324220@12-236-54-216.client.attbi.com> > As has been pointed out, we already have a stable sort. Instead of > making stability an option, let's just keep it. 
If this can be done without any of the disadvantages brought up at some point (especially allocating millions of ints), by all means let's do it. > We could allocate a second array of PyObject* to mirror the list > contents; that would have only the keys. When two values are switched > in the sort, the values in both the key list and the value list can be > switched. When done, we only need to decref the computed keys and > free the array of keys. I can't tell if that'll work, but if it does, it would be a great solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 14 14:01:49 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 14:01:57 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141743.h9EHho324220@12-236-54-216.client.attbi.com> Message-ID: [Fred] >> We could allocate a second array of PyObject* to mirror the list >> contents; that would have only the keys. When two values are >> switched in the sort, the values in both the key list and the value >> list can be switched. When done, we only need to decref the >> computed keys and free the array of keys. [Guido] > I can't tell if that'll work, but if it does, it would be a great > solution. I mentioned that before -- doubling the amount of data movement would hurt, at best by blowing cache all to hell. There's a related approach, though: build a distinct vector of custom objects, each containing: 1. A pointer to the key. 2. The original index, as a C integer. This is similar to, but smaller than, something mentioned before. The comparison function for this kind of object redirects to comparing only the keys -- the integers are ignored during the sort. Sort this list with the sorting code exactly as it exists now. At the end of sorting, the integer members can be used to permute the original list into order. This can be done in-place efficiently (not entirely obvious; Knuth gives at least one algorithm for it). 
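The in-place permutation step Tim calls "not entirely obvious" can be sketched in pure Python. This is only an illustration of the idea, not code from any patch (the real proposal is C-level inside the sort implementation); `apply_permutation` and the sample data are made-up names:

```python
def apply_permutation(items, perm):
    # Rearrange items in place so that the new items[i] is the old
    # items[perm[i]].  Follows each cycle of the permutation using
    # O(1) extra scratch per cycle -- the trick Knuth describes.
    perm = list(perm)                 # working copy; entries get marked as placed
    for i in range(len(items)):
        if perm[i] == i:
            continue                  # fixed point, or already placed
        saved = items[i]
        j = i
        while perm[j] != i:           # walk the cycle, shifting values back
            items[j] = items[perm[j]]
            prev, j = j, perm[j]
            perm[prev] = prev         # mark slot as done
        items[j] = saved              # close the cycle
        perm[j] = j

# Sorting by a key, Tim's way: order the indices by the key alone
# (the index field itself is never compared), then permute the list.
words = ["sort", "a", "stable", "is"]
order = sorted(range(len(words)), key=lambda i: len(words[i]))
apply_permutation(words, order)
# words is now ['a', 'is', 'sort', 'stable']
```

Because the index ordering is produced by a stable sort comparing keys only, equal keys keep their original relative order, which is where the inherited stability comes from.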
From python at rcn.com Tue Oct 14 14:05:09 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 14:06:32 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16268.12501.505494.397966@grendel.zope.com> Message-ID: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> [Fred L. Drake] > As has been pointed out, we already have a stable sort. Instead of > making stability an option, let's just keep it. Right. If a fast tie-breaker is provided, why would anyone ever choose stable=False? > We could allocate a second array of PyObject* to mirror the list > contents; that would have only the keys. When two values are switched > in the sort, the values in both the key list and the value list can be > switched. When done, we only need to decref the computed keys and > free the array of keys. > > No additional structures would be needed. I would rather wrap Tim's existing code than muck with assignment logic. Ideally, the performance of list.sort() should stay unchanged when the key function is not specified. Tim's original (key, index, value) idea seems to be simplest. The only sticking point is the immortality of PyInts. One easy, but not so elegant way around this is to use another mortal object for a tiebreaker (for example, "00000", "00001", ...). Alternatively, is there a way of telling a PyInt to be mortal? Besides immortality and speed, another consideration is the interaction between the cmp function and the key function. If both are specified, then the underlying decoration becomes visible to the user: def viewcmp(a, b): print a, b # the decoration just became visible return cmp(a,b) mylist.sort(cmp=viewcmp, key=str.lower) Since the decoration can be visible, it should be as understandable as possible. Viewed this way, PyInts are preferable to a custom object. Raymond Hettinger From fdrake at acm.org Tue Oct 14 14:15:52 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue Oct 14 14:16:10 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> References: <16268.12501.505494.397966@grendel.zope.com> <000901c3927d$b4bda2c0$e841fea9@oemcomputer> Message-ID: <16268.15704.529902.155365@grendel.zope.com> Raymond Hettinger writes: > Besides immortality and speed, another consideration is the interaction > between the cmp function and the key function. If both are specified, > then the underlying decoration becomes visible to the user: Or the cmp function is only passed the decoration. Or we disallow specifying both. Passing (decoration, index, value) for each value strikes me as just plain wrong. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Tue Oct 14 14:17:52 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 14:18:00 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> Message-ID: <3F8C3DD0.4020400@v.loewis.de> Thomas Heller wrote: > If I look at the file sizes in the DLLs directory, it seems that at > least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow > python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). I don't see why it matters, though. Adding modules to pythonxy.dll does not increase the memory consumption if the modules are not used. It might decrease the memory consumption in case the modules are used.
Regards, Martin From theller at python.net Tue Oct 14 14:25:31 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 14 14:25:37 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <3F873097.7050201@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sat, 11 Oct 2003 00:20:07 +0200") References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: "Martin v. L?wis" writes: > Michael Hudson wrote: > >> Did the 2.3 builds have IPv6 support? Then this would be a nasty >> regression. However, I *thought* that you had to build with VC++ 7 or >> higher to get IPv6 support on Windows, and we've never done that. > > No, 2.3 did not have IPv6. You don't strictly need VC7, though - if > you have the SDK installed in addition to VC6, you could also include > IPv6 support. PC/pyconfig.h does not detect this case automatically, > so you would have to manually activate this support (i.e. include > winsock2.h). > > Apart from that, you are right - IPv6 is not supported in the Windows > builds because of lacking support in the compiler's header files. Ok, I installed the Feb 2003 Platform SDK, and it seems I'm now able to compile with IPv6 support - after minor twiddling of the header files. Now these questions arise: - Should the next binary release (2.3.3, scheduled for the end of 2003) include this support? - Should there be any attempts to detect this support in the header files automatically (black magic to me), or should the platform sdk be required to compile Python with IPv6? Thomas From bac at OCF.Berkeley.EDU Tue Oct 14 14:29:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Tue Oct 14 14:30:11 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <20031014081553.GA2976@nl.linux.org> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <20031014081553.GA2976@nl.linux.org> Message-ID: <3F8C40A5.1060902@ocf.berkeley.edu> Gerrit Holl wrote: > Brett C. wrote: > >>Feature Requests >>---------------- >>`Feature requests`_ are for features that you wish Python had but you >>have no plans on actually implementing by writing a patch. On occasion >>people do go through the features requests (also called RFCs on >>SourceForge) to see if there is anything there that they think should be >>implemented and actually do the implementation. But in general do not >>expect something put here to be implemented without some participation >>on your part. > > > I think feature requests are called RFE's in SF terminology, not RFC's. > They are; Requested Feature Enhancements. Typo on my part. Thanks for catching it. -Brett From bac at OCF.Berkeley.EDU Tue Oct 14 14:34:11 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Oct 14 14:34:43 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <3F8C41A3.3040700@ocf.berkeley.edu> Thomas Heller wrote: > "Martin v. L?wis" writes: > > >>No, 2.3 did not have IPv6. You don't strictly need VC7, though - if >>you have the SDK installed in addition to VC6, you could also include >>IPv6 support. PC/pyconfig.h does not detect this case automatically, >>so you would have to manually activate this support (i.e. include >>winsock2.h). >> >>Apart from that, you are right - IPv6 is not supported in the Windows >>builds because of lacking support in the compiler's header files. > > > Ok, I installed the Feb 2003 Platform SDK, and it seems I'm now able to > compile with IPv6 support - after minor twiddling of the header files. 
> > Now these questions arise: > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? > Following the rule of thumb that says differences between micro releases should be minimized, I would say -1 on this. > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? > No opinion from me on this one. -Brett From bac at OCF.Berkeley.EDU Tue Oct 14 14:36:19 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Oct 14 14:36:35 2003 Subject: [Python-Dev] Re: Draft of an essay on Python development (and how tohelp) In-Reply-To: <16268.5209.53004.566117@montanaro.dyndns.org> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <16268.5209.53004.566117@montanaro.dyndns.org> Message-ID: <3F8C4223.7000708@ocf.berkeley.edu> Skip Montanaro wrote: > >> Help add to the patch if it is missing documentation patches or > >> needed regression tests. > > Mike> Please don't herald for things that can't be done by anyone except > Mike> patch author or py-dev member. > > Or identify such barriers and suggest alternate paths to submit such info. > I think I will mention that you can always post the files somewhere else online and paste the link into a comment. I don't want to suggest creating another patch item just to store extra stuff for an existing patch since that would probably lead to an explosion in patch items and that is the last thing we need. 
-Brett From martin at v.loewis.de Tue Oct 14 14:49:22 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 14:49:38 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <3F8C4532.8060305@v.loewis.de> Thomas Heller wrote: > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? I'm leaning actually somewhat towards requesting this, although the "no changes in Micro releases" is a strong point. I would not at all be concerned if this was a pure add-on feature. However, there might be minor changes to existing behaviour: - python23.dll would now require winsock2.dll. I'm unsure whether Win95 was already providing this library. - the getaddrinfo implementation would now be the Microsoft "native emulation", instead of the Python one. It is a native emulation because it detects proper getaddrinfo dynamically if available, and falls back to emulation otherwise. This might cause minor semantic changes over our emulation code. [there would also be significant semantic changes in case a host has an IPv6 address - but that would be the whole point of making the change] Given that the feature is going to be requested more and more, and given that Microsoft's getaddrinfo emulation is likely more correct, thread-safe, etc. than our own, I'm still leaning towards "include IPv6". > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? If this is implemented for 2.3, I think it should be an easily-tunable option, defaulting to "on" - anybody building Python without the SDK would have to turn it off. For 2.4, I hope we will move to VC7, in which case the SDK is not required anymore for IPv6. 
Regards, Martin From tim.one at comcast.net Tue Oct 14 15:12:37 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 15:12:44 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031014161756.GC11579@mems-exchange.org> Message-ID: [Neil Schemenauer] > Clever idea I think. You don't need a special tuple, just a little > wrapper object that holds the key and the original value and uses > the key for tp_richcompare. That could work well. If a comparison function was specified too, it would only see the key (addressing one of Raymond's concerns). From guido at python.org Tue Oct 14 15:16:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:16:26 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 14:01:49 EDT." References: Message-ID: <200310141916.h9EJGHA24421@12-236-54-216.client.attbi.com> > [Fred] > >> We could allocate a second array of PyObject* to mirror the list > >> contents; that would have only the keys. When two values are > >> switched in the sort, the values in both the key list and the value > >> list can be switched. When done, we only need to decref the > >> computed keys and free the array of keys. > > [Guido] > > I can't tell if that'll work, but if it does, it would be a great > > solution. [Tim] > I mentioned that before -- doubling the amount of data movement would hurt, > at best by blowing cache all to hell. > > There's a related approach, though: build a distinct vector of custom > objects, each containing: > > 1. A pointer to the key. > 2. The original index, as a C integer. > > This is similar to, but smaller than, something mentioned before. But wouldn't the memory allocated for all those tiny custom objects also be spread all over the place and hence blow away the cache? I guess another approach would be to in-line those objects so that we sort an array of structs like this: struct { PyObject *key; long index; } rather than an array of PyObject*. 
But this would probably require all of the sort code to be cloned. > The comparison function for this kind of object redirects to comparing only > the keys -- the integers are ignored during the sort. Sort this list with > the sorting code exactly as it exists now. > > At the end of sorting, the integer members can be used to permute the > original list into order. This can be done in-place efficiently (not > entirely obvious; Knuth gives at least one algorithm for it). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:20:18 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:20:29 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Tue, 14 Oct 2003 20:17:52 +0200." <3F8C3DD0.4020400@v.loewis.de> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> > I don't see why it matters, though. Adding modules to pythonxy.dll does > not increase the memory consumption if the modules are not used. Can you explain why not? Doesn't the whole DLL get loaded into memory? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:19:22 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:20:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 14:05:09 EDT." <000901c3927d$b4bda2c0$e841fea9@oemcomputer> References: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> Message-ID: <200310141919.h9EJJMp24444@12-236-54-216.client.attbi.com> > I would rather wrap Tim's existing code than muck with assignment logic. > Ideally, the performance of list.sort() should stay unchanged when the > key function is not specified. Impossible -- the aux objects tax the memory cache more. Also the characteristics of the data will be very different. > Tim's original (key, index, value) idea seems to be simplest. 
The only > sticking point is the immortality of PyInts. One easy, but not so > elegant way around this is to use another mortal object for a tiebreaker > (for example, "00000", "00001", ...). Alternatively, is there a way of > telling a PyInt to be mortal? I still like custom objects better. > Besides immortality and speed, another consideration is the interaction > between the cmp function and the key function. If both are specified, > then the underlying decoration becomes visible to the user: > > def viewcmp(a, b): > print a, b # the decoration just became visible > return cmp(a,b) > mylist.sort(cmp=viewcmp, key=str.lower) > > Since the decoration can be visible, it should be as understandable as > possible. Viewed this way, PyInts are preferable to a custom object. I think we should disallow specifying both cmp and key arguments. Using both just doesn't make sense. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:21:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:21:31 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: Your message of "Tue, 14 Oct 2003 20:25:31 +0200." References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? It would be a new feature, wouldn't it? > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? The Windows build doesn't do any feature detection, does it? All it's got is a hand-edited config file. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From andrew at gaul.org Tue Oct 14 15:39:05 2003 From: andrew at gaul.org (Andrew Gaul) Date: Tue Oct 14 15:39:09 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> Message-ID: <20031014193905.GA32597@paat.pair.com> On Tue, Oct 14, 2003 at 12:20:18PM -0700, Guido van Rossum wrote: > > I don't see why it matters, though. Adding modules to pythonxy.dll does > > not increase the memory consumption if the modules are not used. > > Can you explain why not? Doesn't the whole DLL get loaded into > memory? The OS maps the entire DLL but only pages in the parts that are referenced. This is the same behavior as mmapping an ordinary file because that is how shared libraries are usually implemented (with some magic when multiple libraries want the same virtual addresses). -- Andrew Gaul http://gaul.org/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031014/203b20cf/attachment.bin From python at rcn.com Tue Oct 14 15:39:27 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 15:40:43 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <002601c3928a$e15b3920$e841fea9@oemcomputer> > [Neil Schemenauer] > > Clever idea I think. You don't need a special tuple, just a little > > wrapper object that holds the key and the original value and uses > > the key for tp_richcompare. > > That could work well. If a comparison function was specified too, it > would > only see the key (addressing one of Raymond's concerns). Don't you still need a tie-breaker index to preserve stability? 
Raymond Hettinger From python at rcn.com Tue Oct 14 15:41:39 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 15:42:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <002701c3928b$3039a720$e841fea9@oemcomputer> > > [Neil Schemenauer] > > > Clever idea I think. You don't need a special tuple, just a little > > > wrapper object that holds the key and the original value and uses > > > the key for tp_richcompare. [Tim] > > That could work well. If a comparison function was specified too, it > > would > > only see the key (addressing one of Raymond's concerns). [Me] > Don't you still need a tie-breaker index to preserve stability? Arghh! I see it now.
Raymond From theller at python.net Tue Oct 14 15:48:04 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 14 15:48:08 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Tue, 14 Oct 2003 12:21:04 -0700") References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> Message-ID: <65irbnyj.fsf@python.net> Guido van Rossum writes: >> - Should the next binary release (2.3.3, scheduled for the end of 2003) >> include this support? > > It would be a new feature, wouldn't it? Sure. And since it is additional work for me, I'm all for leaving it out, especially since I don't use it myself ;-). While this is reason enough for *me*, I'm willing to do the additional work if this feature is requested (if the community is willing to take the risk of the new stuff). >> - Should there be any attempts to detect this support in the header >> files automatically (black magic to me), or should the platform sdk be >> required to compile Python with IPv6? > > The Windows build doesn't do any feature detection, does it? All it's > got is a hand-edited config file. All it does detect now is the C compiler used.
Thomas From tim.one at comcast.net Tue Oct 14 15:56:32 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 15:56:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141631.h9EGVlf24023@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > When exactly do you consider a sort a "database sort"? Oh, roughly anything that sorts large objects according to a small piece of each. In the sorting app I talked about, the database was a giant XML file, and was read into memory with each database record represented as a class instance with a few dozen data attributes. The full-blown __cmp__ for this class was large. With a more memory-efficient database representation, I'd expect the in-memory object to be a kind of wrapper around database-access code, maybe with a cache of recently-referenced attributes. Then it's mondo expensive if you have to break ties by fetching data from disk. ... >> It's not externality, it's decomposability: .... > I experimented a bit with the version of Outlook I have, and it seems > to always use the delivery date/time as the second key, and always in > descending order. It depends some on the current view, but I misremembered Outlook's UI anyway: to get a multi-heading sort, you have to depress the shift key when clicking on the 2nd (and 3rd, etc) column (and click twice (not double-click!) to reverse the sort order on the current column; the shift key applies there too if you want a multi-key sort order). > ... > I'm not sure that this helps us decide the default behavior of sorts > in Python, which are rarely interactive in this sense. Python is used to implement interactive apps. > (If someone writes an Outlook substitute, they can pretty well code the > sort to do whatever they want.) The 2.3 sort is stable. That's not only the default, there's no choice about it.
What's getting proposed is to give up stability to ease an implementation trick in the cases where both the stability and speed of a sort are most often most important. If I'm just sorting a list of floats, it's very hard to detect whether the sort was stable (I'd have to stare at the memory addresses of the float objects to tell). The cases where DSU gets used are the ones where the object isn't the key (so that stability or lack thereof becomes obvious), and where the user cares about speed (else they'd just pass a custom comparison function instead of bothering with DSU). Most sorts I do won't specify a key= argument, so most sorts I do couldn't care less what auto-DSU does. When I do code a DSU by hand, I nearly always include the index component, but more for speed reasons than for stability reasons -- but I rarely write interactive apps. I'm told that other people do. > ... > But I don't see the interactivity in Python apps, and that's what > counts here. Won't your current employer write a web-based system security monitor in Python, showing tables of information? An app that doesn't make a table view sortable on a column isn't a real app. > ... > To cut all this short, I propose that we offer using the index as a > second sort key as an option, on by default, whose name can be (barely > misleading) "stable". On by default nicely matches the behavior of > the 2.3 sort without any options. At this point, I may be losing track of how many options the 2.4 sort is growing -- cmp, key, and stable? I'd rather drop stable, and that when cmp or key (or both) is used, sort promises not to fall back to whole-object comparison by magic (if cmp or key invoke whole-object comparison, fine, that's on the user's head then). There are several ways to implement the latter, most of which would inherit stability from the core 2.3 sort.
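The hand-rolled DSU with an index component that Tim describes looks like this when written out (an illustrative sketch; `Record` and `dsu_sort` are made-up names, not code from any patch). The unique index breaks all key ties, so tuple comparison never falls through to the objects themselves -- which both preserves stability and lets otherwise-uncomparable objects (Neil's complex numbers) sort cleanly:

```python
class Record:
    # Deliberately defines no ordering, like complex numbers.
    def __init__(self, name, size):
        self.name, self.size = name, size

def dsu_sort(items, key):
    # Decorate with (key, index, object): the index is unique, so
    # comparing two decorated tuples is always settled by key or
    # index, never by the objects; equal keys keep original order.
    decorated = [(key(obj), i, obj) for i, obj in enumerate(items)]
    decorated.sort()
    items[:] = [obj for _, _, obj in decorated]   # undecorate

records = [Record("b", 2), Record("a", 1), Record("c", 2)]
dsu_sort(records, key=lambda r: r.size)
# order is now a, b, c -- the two size-2 records kept their order
```

Dropping the index component from the decoration is exactly the "fall back to whole-object comparison by magic" case: on a key tie, the tuple comparison would reach the `Record` objects and break.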
From guido at python.org Tue Oct 14 15:58:15 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:58:37 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 15:39:27 EDT." <002601c3928a$e15b3920$e841fea9@oemcomputer> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> Message-ID: <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> > Don't you still need a tie-breaker index to preserve stability? No, because the sort algorithm is already stable. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:59:47 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:59:54 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: Your message of "Tue, 14 Oct 2003 21:48:04 +0200." <65irbnyj.fsf@python.net> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> <65irbnyj.fsf@python.net> Message-ID: <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> > >> - Should the next binary release (2.3.3, scheduled for the end of 2003) > >> include this support? > > > > It would be a new feature, wouldn't it? > > Sure. And since it is additional work for me, I'm all for leaving it > out, especially since I don't use it myself ;-). > > While this is reason enough for *me*, I'm willing to do the additional > work if this feature is requested (if the community is willing to take > the risk of the new stuff). I think that the community is pretty strong against new features, however neat. Is there a way to offer this functionality as an extension instead? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 14 16:08:59 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:09:05 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002701c3928b$3039a720$e841fea9@oemcomputer> Message-ID: [Neil Schemenauer] >>>> Clever idea I think. You don't need a special tuple, just a little >>>> wrapper object that holds the key and the original value and uses >>>> the key for tp_richcompare. [Tim] >>> That could work well. If a comparison function was specified too, >>> it would only see the key (addressing one of Raymond's concerns). [Raymond Hettinger, in darkness] >> Don't you still need a tie-breaker index to preserve stability? [Raymond, in light] > Arghh! I see it now. In case everyone doesn't, "the trick" is that the core sorting algorithm is already stable. The only reason it needs a "cheap tie breaker" in hand-rolled DSU is to stop (key, object) tuple comparison from falling back to whole-object comparison when two keys tie. Falling back to whole-object comparison is what can break stability (and chew up an enormous # of cycles). If comparison is never handed the objects (only the keys), those potential problems vanish, and stability is inherited. From gtalvola at nameconnector.com Tue Oct 14 16:25:17 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Tue Oct 14 16:25:39 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E4@mailbox.nameconnector.com> Tim Peters wrote: > The > cases where DSU > gets used are the ones where the object isn't the key (so > that stability or > lack thereof becomes obvious), and where the user cares about > speed (else > they'd just pass a custom comparison function instead of > bothering with > DSU). I disagree... 
I almost always use DSU in any circumstances because I find it easier and more natural to write:

def keyfunc(record):
    return record.LastName.lower(), record.FirstName.lower(), record.PhoneNumber

mylist = sortUsingKeyFunc(mylist, keyfunc)

than to have to write an equivalent comparison function:

def cmpfunc(r1, r2):
    return cmp((r1.LastName.lower(), r1.FirstName.lower(), r1.PhoneNumber),
               (r2.LastName.lower(), r2.FirstName.lower(), r2.PhoneNumber))

mylist.sort(cmpfunc)

so for me, ease of use is the reason, not speed. Of course, it doesn't _hurt_ that DSU is faster... - Geoff From martin at v.loewis.de Tue Oct 14 16:37:19 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:37:21 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> Message-ID: <3F8C5E7F.9040004@v.loewis.de> Guido van Rossum wrote: >>- Should the next binary release (2.3.3, scheduled for the end of 2003) >>include this support? > > > It would be a new feature, wouldn't it? It would be a new feature only in the binary. The source code has supported that for quite some time.
Regards, Martin From martin at v.loewis.de Tue Oct 14 16:39:23 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:39:37 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> <65irbnyj.fsf@python.net> <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> Message-ID: <3F8C5EFB.5000806@v.loewis.de> Guido van Rossum wrote: > Is there a way to offer this functionality as an extension instead? You could replace _socket.pyd. Regards, Martin From tim.one at comcast.net Tue Oct 14 16:44:30 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:44:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141916.h9EJGHA24421@12-236-54-216.client.attbi.com> Message-ID: [Fred] >>>> We could allocate a second array of PyObject* to mirror the list >>>> contents; that would have only the keys. When two values are >>>> switched in the sort, the values in both the key list and the value >>>> list can be switched. When done, we only need to decref the >>>> computed keys and free the array of keys. [Guido] >>> I can't tell if that'll work, but if it does, it would be a great >>> solution. [Tim] >> I mentioned that before -- doubling the amount of data movement >> would hurt, at best by blowing cache all to hell. >> >> There's a related approach, though: build a distinct vector of >> custom objects, each containing: >> >> 1. A pointer to the key. >> 2. The original index, as a C integer. >> >> This is similar to, but smaller than, something mentioned before. [Guido] > But wouldn't the memory allocated for all those tiny custom objects > also be spread all over the place and hence blow away the cache? 
Probably no more than that the data in the original list was spread all over the place. The mergesort has (per merge) two input areas and one output area, which are contiguous vectors and accessed strictly left to right, one slot at a time, a cache-friendly access pattern. The real data pointed to by the vectors is all over creation, but the vectors themselves are contiguous. We seem to get a lot of good out of that. For example, the version of weak heapsort I coded did very close to the theoretical minimum # of comparisons on randomly ordered data, and was algorithmically much simpler than the adaptive mergesort, yet ran much slower. That can't be explained by # of comparisons or by instruction count. The one obvious difference is that weak heapsort leaps all over the vector, in as cache-hostile a way imaginable. The mergesort also has some success in reusing small merged areas ASAP, while they're still likely to be in cache. If we were to mirror loads and stores across two lists in lockstep, we'd be dealing with 4 input areas and 2 output areas per merge. That's a lot more strain on a cache, even if the access pattern per area is cache-friendly on its own. Of course a large sort is going to blow the cache wrt what the program was doing before the sort regardless. > I guess another approach would be to in-line those objects so that we > sort an array of structs like this: > > struct { > PyObject *key; > long index; > } > > rather than an array of PyObject*. But this would probably require > all of the sort code to be cloned. Following Neil, we could change that to store a pointer to the object rather than an index. I agree that would be cache-friendlier, but I don't know how much better it might do. Dereferencing key pointers acts in the other direction in all schemes, because the memory holding the keys is likely all over the place (and probably gets worse as the sort goes on, and the initial order gets scrambled). No way to know except to try it and time it. 
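In pure-Python terms, the per-item wrapper holding a precomputed key plus a reference to the object, which this subthread keeps weighing, might look like the sketch below (the real proposal is C code inside the list implementation; `KeyWrapper` and `sort_by_key` are invented names, and `__lt__` stands in for the era's `__cmp__`, since the sort only needs `<`):

```python
# One wrapper per list element: comparison looks only at the key, so
# whole-object comparison is never triggered and the stability of the
# underlying sort carries through to the wrapped objects.
class KeyWrapper:
    __slots__ = ("key", "obj")

    def __init__(self, key, obj):
        self.key = key
        self.obj = obj

    def __lt__(self, other):  # the only comparison the sort performs
        return self.key < other.key

def sort_by_key(items, keyfunc):
    wrapped = [KeyWrapper(keyfunc(obj), obj) for obj in items]
    wrapped.sort()  # stable: equal keys keep their original order
    return [w.obj for w in wrapped]

words = ["bb", "aa", "c"]
print(sort_by_key(words, len))  # ['c', 'bb', 'aa'], "bb" still before "aa"
```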
From tim.one at comcast.net Tue Oct 14 16:45:44 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:45:49 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141919.h9EJJMp24444@12-236-54-216.client.attbi.com> Message-ID: [Raymond] >> I would rather wrap Tim's existing code than muck with assignment >> logic. Ideally, the performance of list.sort() should stay unchanged >> when the key function is not specified. [Guido] > Impossible -- the aux objects tax the memory cache more. Also the > characteristics of the data will be very different. I think Raymond has in mind that if a key argument isn't specified, then aux objects aren't needed, and wouldn't be constructed -- the list would get sorted the same way it does now. >> ... >> Alternatively, is there a way of telling a PyInt to be mortal? There isn't, but we shouldn't let that drive anything. I don't think any law requires that Python always have an unbounded freelist for int objects. Most objects with custom freelists put a bound on the freelist size (as Guido noted in this thread for small tuples). I'm not sure why ints don't; I guess nobody ever felt motivated enough to stick a bound on 'em. From martin at v.loewis.de Tue Oct 14 16:50:38 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:50:41 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> Message-ID: <3F8C619E.2000103@v.loewis.de> Guido van Rossum wrote: >>I don't see why it matters, though. Adding modules to pythonxy.dll does >>not increase the memory consumption if the modules are not used. > > > Can you explain why not? Doesn't the whole DLL get loaded into > memory? No. In modern operating systems (including all Win32 implementations, i.e. 
W9x and NT+), the code segment is *mapped* instead of being loaded (in Win32 terminology, by means of MapViewOfFileEx). This causes demand-paging of the code, meaning that code is only in memory if it is actually executed. There are some pitfalls, e.g. that paging operates only on 4k (on x86) granularity, and that relocations may cause the code to get loaded at startup time instead of at run-time (in the latter case, it also stops being shared across processes). The code still consumes *address space*, but of this, any process has plenty (2GB on Win32, unless you use the /3GB boot.ini option of W2k+). The same is true for executables and shared libraries on Unix, meaning that making extension modules shared libraries does not reduce memory consumption. It may increase it, as code segments are 4k-aligned, so if you have many small segments, you may experience rounding overhead. Regards, Martin From tim.one at comcast.net Tue Oct 14 16:53:17 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:53:22 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <61957B071FF421419E567A28A45C7FE59AF6E4@mailbox.nameconnector.com> Message-ID: [Geoffrey Talvola] > I disagree... I almost always use DSU in any circumstances because I > find it easier and more natural to write: > > def keyfunc(record): > return record.LastName.lower(), record.FirstName.lower(), > record.PhoneNumber > mylist = sortUsingKeyFunc(mylist, keyfunc) You've left out the body of sortUsingKeyFunc, so I expect you're unusual in having built up helper routines for doing DSU frequently. > than to have to write an equivalent comparison function: > > def cmpfunc(r1, r2): > return cmp((r1.LastName.lower(), r1.FirstName.lower(), > r1.PhoneNumber), (r2.LastName.lower(), > r2.FirstName.lower(), r2.PhoneNumber)) > mylist.sort(cmpfunc) This is wordier than need be, though, duplicating code for the purpose of making it look bad . 
I'd do mylist.sort(lambda a, b: cmp(keyfunc(a), keyfunc(b))) > so for me, ease of use is the reason, not speed. Of course, it > doesn't _hurt_ that DSU is faster... If your records often tie on the result of keyfunc (doesn't look likely given the names you picked here), and your sortUsingKeyFunc() function doesn't inject the original list index (or otherwise force a cheap tie-breaker), DSU can be much slower than passing a custom comparison function. Not likely, though. From gtalvola at nameconnector.com Tue Oct 14 17:06:19 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Tue Oct 14 17:06:38 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> Tim Peters wrote: > [Geoffrey Talvola] >> I disagree... I almost always use DSU in any circumstances because I >> find it easier and more natural to write: >> >> def keyfunc(record): >> return record.LastName.lower(), record.FirstName.lower(), >> record.PhoneNumber mylist = sortUsingKeyFunc(mylist, keyfunc) > > You've left out the body of sortUsingKeyFunc, so I expect > you're unusual in > having built up helper routines for doing DSU frequently. > Yes, I do use a helper routine, but I suspect I'm not the only one out there who does... >> than to have to write an equivalent comparison function: >> >> def cmpfunc(r1, r2): >> return cmp((r1.LastName.lower(), r1.FirstName.lower(), >> r1.PhoneNumber), (r2.LastName.lower(), >> r2.FirstName.lower(), r2.PhoneNumber)) >> mylist.sort(cmpfunc) > > This is wordier than need be, though, duplicating code for > the purpose of > making it look bad . I'd do > > mylist.sort(lambda a, b: cmp(keyfunc(a), keyfunc(b))) I'm not a huge fan of lambdas from a readability perspective, so I'd probably wrap _that_ up into a helper function if I didn't know about DSU. The point I'm trying to make is that a key function is usually more natural to use than a comparison function.
You're right, DSU isn't the only way to make use of a key function. But, I think it would be a good thing for list.sort() to take a key function because it will guide users toward using the cleaner key function idiom and therefore improve the readability of Python code everywhere. > >> so for me, ease of use is the reason, not speed. Of course, it >> doesn't _hurt_ that DSU is faster... > > If your records often tie on the result of keyfunc (doesn't > look likely > given the names you picked here), and your sortUsingKeyFunc() function > doesn't inject the original list index (or otherwise force a cheap > tie-breaker), DSU can be much slower than passing a custom comparison > function. Not likely, though. Not to worry, I do inject the original list index. For the record, here's my helper function, probably not optimally efficient but good enough for me:

def sortUsingKeyFunc(l, keyfunc):
    l = zip(map(keyfunc, l), range(len(l)), l)
    l.sort()
    return [x[2] for x in l]

- Geoff From greg at cosc.canterbury.ac.nz Tue Oct 14 21:43:42 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 14 21:44:07 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> Guido: > > Don't you still need a tie-breaker index to preserve stability? > > No, because the sort algorithm is already stable. In which case it makes no sense at all for stability to be an option, since you'd have to go out of your way to make it *un*stable! The only issue then is to avoid comparing the whole record, and this presumably should be non-optional as well. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 14 21:52:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 21:52:15 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 14:43:42 +1300." <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> References: <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> Message-ID: <200310150152.h9F1q4C25064@12-236-54-216.client.attbi.com> > > > Don't you still need a tie-breaker index to preserve stability? > > > > No, because the sort algorithm is already stable. > > In which case it makes no sense at all for stability > to be an option, since you'd have to go out of your > way to make it *un*stable! > > The only issue then is to avoid comparing the whole > record, and this presumably should be non-optional > as well. Right. I think we've settled on using small wrapper objects instead of tuples, whose comparison *only* compares the key value, and whose other field contains a reference to the full record. When passing both cmp and key, cmp is passed the key field from the wrapper. The wrapper objects don't need to have any general purpose functionality so their implementation should be very simple. (We *could* go further and have a custom allocator for these objects, but I'm not sure that that's necessary -- pymalloc should be fast enough, and the bulk cost is going to be the O(N log N) behavior of the sort anyway.) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Oct 14 22:40:44 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 22:41:31 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310150152.h9F1q4C25064@12-236-54-216.client.attbi.com> Message-ID: <004f01c392c5$bb94a740$e841fea9@oemcomputer> [Greg Ewing] > > The only issue then is to avoid comparing the whole > > record, and this presumably should be non-optional > > as well. 
[Guido] > Right. I think we've settled on using small wrapper objects instead > of tuples, whose comparison *only* compares the key value, and whose > other field contains a reference to the full record. When passing > both cmp and key, cmp is passed the key field from the wrapper. Here is an implementation to try out. This second patch includes unittests and docs. The reference counts work out fine for repeated test runs and the rest of the test suite passes just fine: www.python.org/sf/823292 * The optional keyword arguments are: cmp, key, reverse. * The key function triggers a DSU step with a wrapper object that holds the full record but returns only the key for a comparison. This is fast, memory efficient, and doesn't change the underlying stability characteristics of the sort. (I think this was Neil's idea -- and it works like a charm.) * If the key function is not specified, no wrapping occurs so that sort performance is not affected. Raymond Hettinger From tdelaney at avaya.com Tue Oct 14 22:47:11 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 14 22:47:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFE7CA@au3010avexu1.global.avaya.com> > From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz] > > Guido: > > > > Don't you still need a tie-breaker index to preserve stability? > > > > No, because the sort algorithm is already stable. > > In which case it makes no sense at all for stability > to be an option, since you'd have to go out of your > way to make it *un*stable!
> The only issue then is to avoid comparing the whole > record, and this presumably should be non-optional > as well. How would we document this? To date sort() gives no guarantees about stability. We could continue to give this lack of guarantee by stating that only the key as returned from the key function is used in the comparison. Alternatively, we could guarantee that the resulting sort will be stable (which would make it incumbent to use the index if an unstable sort is introduced in a future version). Personally, I think it would be a good idea to make the guarantee that from 2.3 sort() will be stable when the comparison function returns equal, or the keys compare equal, or the objects compare equal (in the case of no comparison func or key func). But that could just be me. Tim Delaney From aleaxit at yahoo.com Wed Oct 15 02:24:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Wed Oct 15 02:24:40 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> References: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> Message-ID: <200310150824.35396.aleaxit@yahoo.com> On Tuesday 14 October 2003 11:06 pm, Geoffrey Talvola wrote: ... > The point I'm trying to make is that a key function is usually more natural > to use than a comparison function.
You're right, DSU isn't the only way to

I agree, with ONE important exception: when your desired sort order is e.g "primary key ascending field X, secondary key descending field Y", writing a key-extraction function can be an absolute BEAR (you have to know the type of field Y, and even then how to build a key that will sort in descending order by it is generally anything but easy), while a comparison function is trivial:

def compafun(a, b):
    return cmp(a.X,b.X) or cmp(b.Y,a.Y)

i.e., you obtain descending vs ascending order by switching the arguments to builtin cmp, and join together multiple calls to cmp with 'or' thanks to the fact that cmp returning 0 (equal on this key) is exactly what needs to trigger the "moving on to further, less-significant keys". In fact I find that the simplest general way to do such compares with a key extraction function is a wrapper:

class reverser(object):
    def __init__(self, obj):
        self.obj = obj
    def __cmp__(self, other):
        return cmp(other.obj, self.obj)

relying on the fact that an instance of reverser will only ever be compared with another instance of reverser; now, for the same task as above,

def keyextract(obj):
    return obj.X, reverser(obj.Y)

does work. However, the number of calls to reverser.__cmp__ is generally O(N log N) [unless you can guarantee that most X subkeys differ] so that the performance benefits of DSU are, alas, not in evidence any more here. A C-coded 'reverser' would presumably be able to counteract this, though (but I admit I have never had to write one in practice).
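Alex's reverser wrapper can be exercised like this (an illustrative sketch using the key= argument the thread is converging on, spelled with __lt__ rather than 2.3's __cmp__; `Reverser` and `data` are invented names):

```python
# Ascending on the first field, descending on the second, via a wrapper
# that inverts comparisons for the subkey it wraps.
class Reverser:
    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):  # swapped operands: descending order
        return other.obj < self.obj

data = [(1, "a"), (1, "c"), (0, "b")]
data.sort(key=lambda r: (r[0], Reverser(r[1])))
print(data)  # [(0, 'b'), (1, 'c'), (1, 'a')]
```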
Alex From duncan at rcp.co.uk Wed Oct 15 04:42:37 2003 From: duncan at rcp.co.uk (Duncan Booth) Date: Wed Oct 15 04:42:30 2003 Subject: [Python-Dev] decorate-sort-undecorate References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum wrote in news:200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com: >> Don't you still need a tie-breaker index to preserve stability? > > No, because the sort algorithm is already stable. What about the situation where you want the list sorted in reverse order? If you simply sort and then reverse the list you've broken the stability. You *could* preserve the stability by using a negative index when the list is to be reversed, but might it also be possible to get the special comparison object to invert the result of the comparison? -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? From theller at python.net Wed Oct 15 08:55:13 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 08:55:20 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed Message-ID: Sigh. The 2.3.2 windows binary contains invalid MS dlls. I copied them from my system directory, instead of using those of the MSVC 6 SP5 redistributables. There are already 3 bug reports about this: http://www.python.org/sf/818029, http://www.python.org/sf/824016, and http://www.python.org/sf/823405. Strongly affected are probably win98 and NT4 users. I suggest to remove the Python-2.3.2.exe from the downloads page (or to hide it), until this issue is resolved. FWIW, Python-2.3.1.exe should have the same problem. All this is, of course, only my fault. Thomas From list-python-dev at ccraig.org Wed Oct 15 09:41:26 2003 From: list-python-dev at ccraig.org (Christopher A.
Craig) Date: Wed Oct 15 09:41:33 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: If this goes in can we document that adding a key parameter makes the sort stable and key=None causes a stable sort to happen? Current CPython won't have to do anything at all with that, but other Pythons (or a future CPython where a mythical faster-than-timsort nonstable sort is discovered) would have a documented way to force stability. -- Christopher A. Craig From mcherm at mcherm.com Wed Oct 15 09:51:17 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 15 09:51:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <1066225877.3f8d50d5eff8b@mcherm.com> Tim writes: > Almost all sorts you're likely > to use in real life are stable in order to support this, whether it's > clicking on an email-metadata column in Outlook, or sorting an array of data > by a contained column in Excel. Guido tries it out: > I experimented a bit with the version of Outlook I have, and it seems > to always use the delivery date/time as the second key, and always in > descending order. Which is simply evidence that Outlook is poorly designed, and that Microsoft should have hired Tim to help with design specs. Although Outlook lacks this feature, I have FREQUENTLY desired it, and been annoyed at its absence. Tim in a later email: > It depends some on the current view, but I misremembered Outlook's UI > anyway: to get a multi-heading sort, you have to be depress the shift key > when clicking on the 2nd (and 3rd, etc) column (and click twice (not > double-click!) to reverse the sort order on the current column; the shift > key applies there too if you want a multi-key sort order). Well well... it's nice to learn that! I find Tim's (and other's) arguments quite convincing. We can go with a stable sort... ie, compare the keys, then fall back on stability and NEVER try comparing the objects themselves, and I think it will make complete sense. 
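The difference Michael describes is directly observable (a sketch using the 2.4-style sorted()/key spelling; `pairs`, `by_key` and `whole` are invented names):

```python
pairs = [(1, "z"), (1, "a"), (0, "m")]

# "Just the key": equal first elements tie, and the tie is left alone.
by_key = sorted(pairs, key=lambda p: p[0])
print(by_key)   # [(0, 'm'), (1, 'z'), (1, 'a')]

# Comparing whole objects instead breaks that tie on the second element.
whole = sorted(pairs)
print(whole)    # [(0, 'm'), (1, 'a'), (1, 'z')]
```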
After all, sorting with a "key" parameter provided is *NOT* really a DSU algorithm... it's a new sort feature which happens to be _implemented_ using DSU. Should that new feature sort on "just the key" (leaving ties stable) or should it sort on "the key and then the objects themselves". I'd say both make sense, and in fact "just the key" is more obvious to me. -- Michael Chermside From larsga at garshol.priv.no Wed Oct 15 10:19:00 2003 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Wed Oct 15 10:19:00 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066225877.3f8d50d5eff8b@mcherm.com> References: <1066225877.3f8d50d5eff8b@mcherm.com> Message-ID: * Michael Chermside | | I find Tim's (and other's) arguments quite convincing. We can go | with a stable sort... ie, compare the keys, then fall back on | stability and NEVER try comparing the objects themselves, and I | think it will make complete sense. After all, sorting with a "key" | parameter provided is *NOT* really a DSU algorithm... it's a new | sort feature which happens to be _implemented_ using DSU. Should | that new feature sort on "just the key" (leaving ties stable) or | should it sort on "the key and then the objects themselves". I'd say | both make sense, and in fact "just the key" is more obvious to me. +1. Very glad to see this being added. This has to take the prize for utility-I-most-often-reimplement. -- Lars Marius Garshol, Ontopian GSM: +47 98 21 55 50 From anthony at interlink.com.au Wed Oct 15 10:59:47 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 15 11:02:12 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: <200310151459.h9FExmvu011497@localhost.localdomain> [resend - my adsl fell over, don't think the original went out] I've put a note on the 2.3.2 page. Please email me when you've got a fixed installer, and I'll do the magic to install it on creosote and gpg sign it. 
Anthony From gtalvola at nameconnector.com Wed Oct 15 11:05:17 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Wed Oct 15 11:06:59 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E9@mailbox.nameconnector.com> Alex Martelli wrote: > On Tuesday 14 October 2003 11:06 pm, Geoffrey Talvola wrote: > ... >> The point I'm trying to make is that a key function is usually more >> natural to use than a comparison function. You're right, DSU isn't >> the only way to > > I agree, with ONE important exception: when your desired sort order > is e.g "primary key ascending field X, secondary key descending > field Y", writing > a key-extraction function can be an absolute BEAR (you have to know > the type of field Y, and even then how to build a key that > will sort in > descending order by it is generally anything but easy), while > a comparison > function is trivial: In this case, how about sorting twice, taking advantage of stability? Using the proposed new syntax:

mylist.sort(key = lambda r: r.Y)
mylist.reverse()
mylist.sort(key = lambda r: r.X)

It might actually be the fastest way for very large lists, and while it's not immediately obvious what it's doing, it's not _that_ unreadable... - Geoff From theller at python.net Wed Oct 15 11:11:37 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 11:11:44 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151459.h9FExmvu011497@localhost.localdomain> (Anthony Baxter's message of "Thu, 16 Oct 2003 00:59:47 +1000") References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: Anthony Baxter writes: > [resend - my adsl fell over, don't think the original went out] > > I've put a note on the 2.3.2 page. Please email me when you've got a fixed > installer, and I'll do the magic to install it on creosote and gpg sign it.
Before I'd like some questions to be answered, probably Martin or Tim have an opinion here (but others are also invited). First, I hope that it's ok to build the installer with the VC6 SP5 dlls. The other possibility that comes to mind is to not include *any* MS runtime dlls, and provide the MS package VCREDIST.EXE separately. Second, what about the filename / version number / build number? IMO one should be able to distinguish the new installer from the old one. The easiest thing would be to just change the filename into maybe Python-2.3.2.1.exe. Thomas From guido at python.org Wed Oct 15 11:16:51 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 11:17:02 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "15 Oct 2003 09:41:26 EDT." References: Message-ID: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> > If this goes in can we document that adding a key parameter makes the > sort stable and key=None causes a stable sort to happen? Current > CPython won't have to do anything at all with that, but other Pythons > (or a future CPython where a mythical faster-than-timsort nonstable > sort is discovered) would have a documented way to force stability. That sounds like an extremely roundabout way of doing it; *if* there had to be a way to request a stable sort, I'd say that specifying a 'stable' keyword would be the way to do it. But I think that's unnecessary. Given that the Jython folks had Tim's sort algorithm translated into Java in half a day, I don't see why we can't require all implementations to have a stable sort. It's not like you can gain significant speed over Timsort. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 15 11:52:54 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 11:53:30 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 09:42:37 BST." 
References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> > What about the situation where you want the list sorted in reverse order? > If you simply sort and then reverse the list you've broken the stability. Yes, that's the same thing Alex Martelli brought up. You could also supply a cmp function, as Geoffrey Talvola suggested (though this will make the comparisons more costly). > You *could* preserve the stability by using a negative index when the list > is to be reversed, but might it also be possible to get the special > comparison object to invert the result of the comparison? That's a possibility. Since we've got a reverse keyword argument, that could be implemented. (There would have to be two classes, one with a forward comparison and one with a reverse, to get this info efficiently into the wrapper objects without using globals.) But then I wonder what should happen if you specify reverse without key. The obvious way to implement this is to do the stable sort without wrappers and then reverse the whole list, but this also breaks stability (as you define it). So maybe specifying reverse should force using wrappers? But that's unintuitive in a different way: if you don't care about the stability of the sort (e.g. if equal keys are impossible or unlikely), you'd expect the reverse option to simply reverse the list after sorting it, and using wrappers would make it a lot slower than that. How important do you think this is? We could punt on the issue, implement reverse by reverting the list afterwards. (I could define stability differently and be totally happy with getting everything in reverse order rather than only the specified key.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 15 12:02:12 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:02:23 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 08:52:54 PDT." <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> Message-ID: <200310151602.h9FG2CV02321@12-236-54-216.client.attbi.com> > > What about the situation where you want the list sorted in reverse order? > > If you simply sort and then reverse the list you've broken the stability. > > Yes, that's the same thing Alex Martelli brought up. You could also > supply a cmp function, as Geoffrey Talvola suggested (though this will > make the comparisons more costly). Oops. I misremembered Geoffrey's suggestion; he suggested two sorts with a reverse() call in between. I think that would have the same problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From gtalvola at nameconnector.com Wed Oct 15 12:03:03 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Wed Oct 15 12:03:27 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6EC@mailbox.nameconnector.com> Guido van Rossum wrote: >> What about the situation where you want the list sorted in reverse >> order? If you simply sort and then reverse the list you've broken >> the stability. > > ... > How important do you think this is? We could punt on the issue, > implement reverse by reverting the list afterwards. (I could define > stability differently and be totally happy with getting everything in > reverse order rather than only the specified key. 
:-) If you make that the documented behavior, then if someone really needs the items sorted in reverse order, but stable with respect to the original list, then this will work:

    mylist.reverse()
    mylist.sort(key=keyfunc, reverse=True)

- Geoff From guido at python.org Wed Oct 15 12:07:41 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:08:09 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 17:11:37 +0200." References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> > The other possibility that comes to mind is to not include *any* MS > runtime dlls, and provide the MS package VCREDIST.EXE separately. This sounds like a bad idea; all previous installers have included the right DLLs and not gotten any problems. > Second, what about the filename / version number / build number? > > IMO one should be able to distinguish the new installer from the old > one. The easiest thing would be to just change the filename into maybe > Python-2.3.2.1.exe. I can't think of anything better, so I think it's okay. Adding a letter would be confusing because normally suffixes like b2 or c1 come *before* the final version. Sigh indeed.
:-( --Guido van Rossum (home page: http://www.python.org/~guido/) From phil at riverbankcomputing.co.uk Wed Oct 15 12:16:28 2003 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Wed Oct 15 12:16:35 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> Message-ID: <200310151716.28576.phil@riverbankcomputing.co.uk> On Wednesday 15 October 2003 5:07 pm, Guido van Rossum wrote: > > The other possibility that comes to mind is to not include *any* MS > > runtime dlls, and provide the MS package VCREDIST.EXE separately. > > This sounds like a bad idea; all previous installers have included the > right DLLs and not gotten any problems. > > > Second, what about the filename / version number / build number? > > > > IMO one should be able to distinguish the new installer from the old > > one. The easiest thing would be to just change the filename into maybe > > Python-2.3.2.1.exe. > > I can't think of anything better, so I think it's okay. Adding a > letter would be confusing because normally suffixes like b2 or c1 come > *before* the final version. I would suggest Python-2.3.2-1.exe which more strongly implies the same version of software but a different version of packaging. Phil From guido at python.org Wed Oct 15 12:41:44 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:42:09 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 17:16:28 BST." 
<200310151716.28576.phil@riverbankcomputing.co.uk> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> Message-ID: <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> > I would suggest Python-2.3.2-1.exe which more strongly implies the same > version of software but a different version of packaging. +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Wed Oct 15 12:50:45 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 15 12:46:06 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com>; from guido@python.org on Wed, Oct 15, 2003 at 09:41:44AM -0700 References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> Message-ID: <20031015105045.A30228@discworld.dyndns.org> Guido van Rossum wrote: > > I would suggest Python-2.3.2-1.exe which more strongly implies the same > > version of software but a different version of packaging. > > +1 How about making it "-2", then, as the previous (broken) package would have been "-1". Some might assume "Python-2.3.2.exe" and "Python-2.3.2-1.exe" were identical, but I would think few would make that assumption with "Python-2.3.2.exe" and "Python-2.3.2-2.exe". 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From guido at python.org Wed Oct 15 12:51:14 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:51:25 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 10:50:45 MDT." <20031015105045.A30228@discworld.dyndns.org> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> <20031015105045.A30228@discworld.dyndns.org> Message-ID: <200310151651.h9FGpEj02501@12-236-54-216.client.attbi.com> > How about making it "-2", then, as the previous (broken) package > would have been "-1". Some might assume "Python-2.3.2.exe" and > "Python-2.3.2-1.exe" were identical, but I would think few would > make that assumption with "Python-2.3.2.exe" and > "Python-2.3.2-2.exe". Haven't you noticed that Python uses 0-based indexing? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Wed Oct 15 13:35:53 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 15 13:36:03 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> Message-ID: <1066239353.25726.19.camel@geddy> While we're hacking on [].sort(), how horrible would it be if we modified it to return self instead of None? I don't mind the sort-in-place behavior, but it's just so inconvenient that it doesn't return anything useful. 
I know it would be better if it returned a new list, but practicality beats purity. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031015/e434ed8a/attachment.bin From guido at python.org Wed Oct 15 13:52:26 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 13:52:34 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 13:35:53 EDT." <1066239353.25726.19.camel@geddy> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> Message-ID: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? -1000. This is non-negotiable. --Guido van Rossum (home page: http://www.python.org/~guido/) From marktrussell at btopenworld.com Wed Oct 15 14:42:47 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Wed Oct 15 14:44:14 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> Message-ID: <1066243367.1463.30.camel@straylight> On Wed, 2003-10-15 at 18:52, Guido van Rossum wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? > > -1000. This is non-negotiable. 
I have a trivial wrapper function sortcopy() in my I-wish-these-were-builtins module:

    def sortcopy(vals, cmpfunc=None):
        """Non in-place wrapper for list.sort()."""
        copy = list(vals)
        copy.sort(cmpfunc)
        return copy

I use this more often than list.sort(), because most of the time performance and memory use is not an issue and code using the in-place version is irritatingly verbose. Maybe this is worth adding as a builtin, to satisfy the people that want a non in-place sort. Mark Russell From theller at python.net Wed Oct 15 14:47:57 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 14:48:03 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Thomas Heller's message of "Wed, 15 Oct 2003 19:05:50 +0200") References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: Anthony, did you get this? Thomas Heller writes: > Ok, here it is: > http://starship.python.net/crew/theller/Python-2.3.2-1.exe > > 87aed0e4a79c350065b770f9a4ddfd75 Python-2.3.2-1.exe > > *Exactly* the same as before, except for the MS dlls and the filename.
> > Thanks (and apologies) > > Thomas From jeremy at alum.mit.edu Wed Oct 15 15:11:30 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 15 15:13:49 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066243367.1463.30.camel@straylight> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> <1066243367.1463.30.camel@straylight> Message-ID: <1066245090.2611.19.camel@localhost.localdomain> On Wed, 2003-10-15 at 14:42, Mark Russell wrote: > I have a trivial wrapper function sortcopy() in my > I-wish-these-were-builtins module: > > def sortcopy(vals, cmpfunc=None): > """Non in-place wrapper for list.sort().""" > copy = list(vals) > copy.sort(cmpfunc) > return copy > > I use this more often than list.sort(), because most of the time > performance and memory use is not an issue and code using the in-place > version is irritatingly verbose. Maybe this is worth adding as a > builtin, to satisfy the people that want a non in-place sort. No. This is so easy to write, we're all destined to write it again and again <0.4 wink>. 
I also use sort():

    def sort(L):
        L.sort()
        return L

Jeremy From barry at python.org Wed Oct 15 15:25:32 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 15 15:26:07 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> Message-ID: <1066245932.25726.36.camel@geddy> On Wed, 2003-10-15 at 13:52, Guido van Rossum wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? > > -1000. This is non-negotiable. Sniff.

    >>> class mylist(list):
    ...     def sort(self, *args, **kws):
    ...         super(mylist, self).sort(*args, **kws)
    ...         return self
    ...
    >>> mylist([5, 4, 3, 2, 1])
    [5, 4, 3, 2, 1]
    >>> x = mylist([5, 4, 3, 2, 1])
    >>> x.sort()
    [1, 2, 3, 4, 5]

Bliss. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031015/d011f721/attachment.bin From aahz at pythoncraft.com Wed Oct 15 15:26:10 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 15 15:26:14 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> Message-ID: <20031015192610.GA14327@panix.com> On Wed, Oct 15, 2003, Guido van Rossum wrote: > > That sounds like an extremely roundabout way of doing it; *if* there > had to be a way to request a stable sort, I'd say that specifying a > 'stable' keyword would be the way to do it. But I think that's > unnecessary.
> > Given that the Jython folks had Tim's sort algorithm translated into > Java in half a day, I don't see why we can't require all > implementations to have a stable sort. It's not like you can gain > significant speed over Timsort. But in the discussion leading up to adopting Timsort, you (or Tim, same difference ;-) explicitly said that you didn't want to make any doc guarantees about stability in case the sort algorithm changed in the future. I don't have an opinion about whether we should keep our options open, but I do think there should be a clearly explicit decision rather than suddenly assuming that we're going to require Python's core sort to be stable. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From raymond.hettinger at verizon.net Wed Oct 15 14:06:32 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed Oct 15 15:43:14 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <005201c39347$10c03960$e841fea9@oemcomputer> If the discussion is wrapped up, I'm ready to commit the patch: www.python.org/sf/823292 Summary:

. Adds keyword arguments: cmp, key, reverse.
. Stable for any combination of arguments (including reverse).
. If key is not specified, then no wrapper is applied and nothing is changed (performance is unchanged).
. If cmp and key are specified, the wrapper is removed and the original key is passed to the cmp function (the wrapper is not visible to the user).
. Has unittests and docs.

Passes the full test suite and repeated runs show stable refcounts.
Raymond Hettinger From ianb at colorstudy.com Wed Oct 15 15:48:04 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 15 15:48:09 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066239353.25726.19.camel@geddy> Message-ID: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> On Wednesday, October 15, 2003, at 12:35 PM, Barry Warsaw wrote: > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? I don't mind the > sort-in-place behavior, but it's just so inconvenient that it doesn't > return anything useful. I know it would be better if it returned a new > list, but practicality beats purity. When doing DSU sorting, the in-place sorting isn't really a performance win, is it? You already have to allocate and populate an entire alternate list with the sort keys, though I suppose you could have those mini key structs point to the original list. Anyway, while it's obviously in bad taste to propose .sort change its return value based on the presence of a key, wouldn't it be good if we had access to the new sorted list, instead of always clobbering the original list? Otherwise people's sorted() functions will end up copying lists unnecessarily. Okay, really I'm just hoping for [x for x in l sortby key(x)], if not now then someday -- if only there was a decent way of expressing that without a keyword... [...in l : key(x)] is the only thing I can think of that would be syntactically possible (without introducing a new keyword, new punctuation, or reusing a wholly inappropriate existing keyword). Or ";" instead of ":", but neither is very good. Sigh... -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From list-python-dev at ccraig.org Wed Oct 15 16:55:16 2003 From: list-python-dev at ccraig.org (Christopher A.
Craig) Date: Wed Oct 15 16:55:47 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031015192610.GA14327@panix.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> Message-ID: Aahz writes: > But in the discussion leading up to adopting Timsort, you (or Tim, same > difference ;-) explicitly said that you didn't want to make any doc > guarantees about stability in case the sort algorithm changed in the > future. I don't have an opinion about whether we should keep our > options open, but I do think there should be a clearly explicit decision > rather than suddenly assuming that we're going to require Python's core > sort to be stable. Yeah, that's mainly what I meant by my post. Currently if I want guarantees that the sort is stable on any future Python I have to manually DSU. If DSU is going to be internalized I'd like some way to guarantee stability (if that involves no arguments at all, great). -- Christopher A. Craig "It's a fairly embarrassing situation to admit that we can't find 90 percent of the universe." Bruce H. Margon (astrophysicist) From python-kbutler at sabaydi.com Wed Oct 15 17:05:34 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 15 17:06:00 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F8DB69E.2070406@sabaydi.com> From: Barry Warsaw > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? BDFL: > -1000. This is non-negotiable. [Barry's blissful demo code snipped] +1 Just 998 votes to go - nice to have a precise value on BDFL pronouncements. No voting twice with bigger numbers! 
;-) I think just about everyone gets tripped up by the "sort returns None" behavior, and though one (e.g., BDFL) can declare that it is a less significant stumble than not realizing the list is sorted in place, it is a _continuing_ inconvenience, with virtually every call to [].sort, even for Python experts (like Barry, not me). Small-ongoing-issue-trumps-one-time-surprise-ly y'rs, kb PS. Just realized I made a similar post over 6 years ago. http://www.google.com/groups?selm=w4niv00k9sc.fsf%40jamaica.cs.byu.edu Does that mean I should just give it up already, or does it emphasize that it is an ongoing issue? Though I still like the fact that the change would not break /any/ existing code... From guido at python.org Wed Oct 15 17:17:41 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 17:18:59 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 15:26:10 EDT." <20031015192610.GA14327@panix.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> Message-ID: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> > On Wed, Oct 15, 2003, Guido van Rossum wrote: > > That sounds like an extremely roundabout way of doing it; *if* there > > had to be a way to request a stable sort, I'd say that specifying a > > 'stable' keyword would be the way to do it. But I think that's > > unnecessary. > > > > Given that the Jython folks had Tim's sort algorithm translated into > > Java in half a day, I don't see why we can't require all > > implementations to have a stable sort. It's not like you can gain > > significant speed over Timsort. [Aahz] > But in the discussion leading up to adopting Timsort, you (or Tim, same > difference ;-) explicitly said that you didn't want to make any doc > guarantees about stability in case the sort algorithm changed in the > future. That was before Timsort had proven to be such a tremendous success. 
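The guarantee being debated here is what makes multi-key sorting by successive passes work. A minimal sketch using the key/reverse keywords proposed in this thread (the (X, Y) tuple records are illustrative, not taken from the patch):

```python
# Goal: ascending by primary key X, descending by secondary key Y.
records = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]

# Sort by the secondary key first, then by the primary key; because
# both passes are stable, ties on X keep their descending-Y order.
records.sort(key=lambda r: r[1], reverse=True)
records.sort(key=lambda r: r[0])

print(records)  # [('a', 2), ('a', 1), ('b', 2), ('b', 1)]
```

Note that reverse=True is itself stability-preserving: elements with equal keys keep their original order rather than having it flipped, which is what lets the second pass rely on the first.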
> I don't have an opinion about whether we should keep our > options open, but I do think there should be a clearly explicit decision > rather than suddenly assuming that we're going to require Python's core > sort to be stable. OK, I pronounce on this: Python's list.sort() shall be stable. --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Wed Oct 15 17:28:32 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 15 17:23:55 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <3F8DB69E.2070406@sabaydi.com>; from python-kbutler@sabaydi.com on Wed, Oct 15, 2003 at 03:05:34PM -0600 References: <3F8DB69E.2070406@sabaydi.com> Message-ID: <20031015152832.A32481@discworld.dyndns.org> Kevin J. Butler wrote: > > I think just about everyone gets tripped up by the "sort returns None" > behavior, and though one (e.g., BDFL) can declare that it is a less > significant stumble than not realizing the list is sorted in place, it > is a _continuing_ inconvenience, with virtually every call to [].sort, > even for Python experts (like Barry, not me). Sure. I regularly find myself wishing "foo.sort().reverse()" and similar constructions would work, even in-place. Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From esr at thyrsus.com Wed Oct 15 17:31:55 2003 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed Oct 15 17:31:59 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: <20031015213155.GA24331@thyrsus.com> Guido van Rossum : > OK, I pronounce on this: Python's list.sort() shall be stable. Excellent. I've been keeping out of this discussion, but this is the outcome I wanted. -- Eric S. Raymond From eppstein at ics.uci.edu Wed Oct 15 19:03:43 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 15 19:03:47 2003 Subject: [Python-Dev] Re: decorate-sort-undecorate References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: In article <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com>, Guido van Rossum wrote: > OK, I pronounce on this: Python's list.sort() shall be stable. And there was much rejoicing. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From mcherm at mcherm.com Wed Oct 15 19:20:05 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 15 19:20:04 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: <1066260005.3f8dd625626c0@mcherm.com> BDFL: > -1000. This is non-negotiable. Kevin Butler: > +1 > > Just 998 votes to go - nice to have a precise value on BDFL > pronouncements.
No voting twice with bigger numbers! ;-) Make it 999 after my -1. Seriously, the BDFL isn't just making this up. Beginners would be tripped up by this ALL the time. People like me who move from language to language and can never remember which behavior goes with which language would be tripped up. Returning None prevents being tripped up. And the work-around is *a 2-line function*! Why can't YOU live with writing a 2-line helper function to save lots of frustration for those of us who might forget whether it's in-place or not? I'm not flaming you here, just trying to point out that the BDFL is *not* alone on this issue. Don't-all-dictators-hold-faux-elections-these-days lly, yours -- Michael Chermside From greg at cosc.canterbury.ac.nz Wed Oct 15 19:33:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 15 19:34:12 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: <200310152333.h9FNX3c26830@oma.cosc.canterbury.ac.nz> Ian Bicking : > Okay, really I'm just hoping for [x for x in l sortby key(x)], if > not now then someday -- if only there was a decent way of expressing > that without a keyword... [...in l : key(x)] is the only thing I can > think of that would be syntactically possible (without introducing a > new keyword, new punctuation, or reusing a wholly inappropriate > existing keyword).

    [x >> key(x) for x in l]   # ascending sort
    [x << key(x) for x in l]   # descending sort

(Well, we got print >> f, so it was worth a try...) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From pnorvig at google.com Wed Oct 15 20:27:40 2003 From: pnorvig at google.com (Peter Norvig) Date: Wed Oct 15 20:27:46 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: Greg Ewing wrote: > [x >> key(x) for x in l] # ascending sort > [x << key(x) for x in l] # descending sort > >(Well, we got print >> f, so it was worth a try...) I hope you're not serious about that. As it turns out, I have a proposed syntax for something I call an "accumulation display", and with it I was able to implement and test a SortBy in about a minute. It uses the syntax

    >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)]
    [1, -2, 3, -4]

where SortBy is an expression (in this case an identifier bound to a class object), not a keyword. Other examples of accumulation displays include:

    [Sum: x*x for x in numbers]
    [Product: Prob_spam(word) for word in email_msg]
    [Min: temp(hour) for hour in range(24)]
    [Top(10): humor(joke) for joke in jokes]
    [Argmax: votes[c] for c in candidates]

You can read the whole proposal at http://www.norvig.com/pyacc.html -Peter Norvig From python-kbutler at sabaydi.com Wed Oct 15 20:50:09 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 15 20:50:30 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066260005.3f8dd625626c0@mcherm.com> References: <1066260005.3f8dd625626c0@mcherm.com> Message-ID: <3F8DEB41.6000209@sabaydi.com> Michael Chermside wrote: > Make it 999 after my -1. Seriously, the BDFL isn't just > making this up. A brief google search showed that the python posters whose names I recognize automatically, and who had expressed opinions, were about evenly split on the issue. (I was startled to see my own name - that was where I came across my post of six years ago.) So yes, Guido isn't alone. (If he were, he _probably_ would have caved in to peer pressure. Maybe not, though...) > Beginners would be tripped up by this ALL the > time.
People like me who move from language to language and > can never remember which behavior goes with which language would > be tripped up. I have yet to see a convincing code example (e.g., "Here is some real code - look how confused people would be if list.sort() had returned self"). Generally, list.sort() returning self would make the code more clear & concise. In contrast, I've seen multiple people say that using list.sort() in an expression caused real bugs (one said it was his most common Python bug), and many express irritation about the final code. (Especially people with a functional programming background, but I'm not one of them.) > Returning None prevents being tripped up. And > the work-around is *a 2-line function*! Why can't YOU live with > writing a 2-line helper function to save lots of frustration > for those of us who might forget whether it's in-place or not? Oh, we have! Concrete frustration outweighs speculative frustration. ;-) (pun intended). Or maybe we could have list.sort() return "Error: .sort method does not return self." That would make the following idiom entertaining: for i in list.sort(): print i kb From ianb at colorstudy.com Wed Oct 15 21:22:14 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 15 21:22:20 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <2CAD2EAE-FF77-11D7-9282-000393C2D67E@colorstudy.com> On Wednesday, October 15, 2003, at 07:27 PM, Peter Norvig wrote: > As it turns out, I have a proposed syntax for something I call an > "accumulation display", and with it I was able to implement and test a > SortBy in about a minute. It uses the syntax > >>>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > [1, -2, 3, -4] > > where SortBy is an expression (in this case an identifier bound to a > class object), not a keyword. 
Other examples of accumulation displays > include: > > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] > > You can read the whole proposal at http://www.norvig.com/pyacc.html Neat. +1. I think it would be nice if accumulators were created more like iterators, maybe with an __accum__ method. Then builtins like min and max could be turned into accumulators, kind of like the int function was turned into a class. Then you also wouldn't have to check for and instantiate classes, which seems a little crude. Then if a sorted() function/class was added to builtins, and it was also an accumulator, you'd be all set. And all the sort method haters out there (they number many!) would be happy. But [sorted: abs(x) for x in lst] doesn't seem right at all, it should return a list of abs(x) sorted by x, not a list of x sorted by abs(x). [sorted.by: abs(x) for x in lst] is perhaps more clever than practical -- it could work and it reads nicely, but it doesn't look normal. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Wed Oct 15 21:36:44 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Oct 15 21:36:27 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> At 05:27 PM 10/15/03 -0700, Peter Norvig wrote: >As it turns out, I have a proposed syntax for something I call an >"accumulation display", and with it I was able to implement and test a >SortBy in about a minute. It uses the syntax > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] >[1, -2, 3, -4] > >where SortBy is an expression (in this case an identifier bound to a >class object), not a keyword.
Other examples of accumulation displays >include: > > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] +0. You can do any of these with a function, if you're willing to let the entire list be created, and put any needed parameters in as a tuple, e.g.: Top(10, [(humor(joke),joke) for joke in jokes]) So, if we had generator comprehensions, the proposed mechanism would be unnecessary. Also, note that [] implies the return value is a list or sequence of some kind, when it's not. IMO, it would really be better to have some kind of generator comprehension to make inline iterator creation easy, and then put the function or class or whatever outside the generator comprehension. Then, it's clear that some function is being applied to a sequence, and that you should look to the function to find out the type of the result, e.g.: Top(10, [yield humor(joke),joke for joke in jokes]) From eppstein at ics.uci.edu Wed Oct 15 22:58:34 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 15 22:58:42 2003 Subject: [Python-Dev] Re: decorate-sort-undecorate References: Message-ID: In article , Peter Norvig wrote: > As it turns out, I have a proposed syntax for something I call an > "accumulation display", and with it I was able to implement and test a > SortBy in about a minute. It uses the syntax > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > [1, -2, 3, -4] > > where SortBy is an expression (in this case an identifier bound to a > class object), not a keyword. ... > You can read the whole proposal at http://www.norvig.com/pyacc.html Would this proposal also allow [Yield: expr(x) for x in someiterator] ? -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From tim.one at comcast.net Wed Oct 15 21:06:07 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:48 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > OK, I pronounce on this: Python's list.sort() shall be stable. Wow. I thought the time machine may have broken on its way to California, but I see this already reached back to the 2.3 release! Relief. +1-ing-ly y'rs - tim From tim.one at comcast.net Wed Oct 15 21:36:07 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:52 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller] > Sigh. > > The 2.3.2 windows binary contains invalid MS dlls. Oops. Guido can tell you how anal I was about this, but I don't think it ever got documented. Sorry! It's why the Wise script has C:\Code\MSDLLs as a choice for where to get redistributables from. > I copied them from my system directory, instead of using those of the > MSVC 6 SP5 redistributables. That's a good choice. I can't find it now, but somewhere in the MS gigabytes of stuff is a list of which versions of these guys are redistributable. Sometimes a service pack will install one that isn't *generally* usable, because it relies on other stuff installed by the same service pack. These oddballs often show up in security patches, where they're seemingly ramming out a fix as fast as possible. > ... > Strongly affected are probably win98 and NT4 users. The happier news is that I've got 2.3.2 on two Win98SE boxes with no ill effects. I keep these scrupulously up-to-date, though.
From tim.one at comcast.net Wed Oct 15 21:41:14 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:56 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller] > Before I'd like some questions to be answered, probably Martin or Tim > have an opinion here (but others are also invited). > > First, I hope that it's ok to build the installer with the VC6 SP5 > dlls. I have in the past . It's OK by me. The Wise script should already be refusing to replace newer versions of these DLLs. > The other possibility that comes to mind is to not include > *any* MS runtime dlls, and provide the MS package VCREDIST.EXE > separately. Martin pointed out correctly that Win95 didn't ship with these things, so it's safest to keep shipping them until Python moves to VC7 (at which point I don't think we can pretend to support Win9x anymore). > Second, what about the filename / version number / build number? The build number should definitely change. When someone sends a snippet from an interactive prompt with an incomprehensible error report, the build number they're unwittingly tricked into including is the best clue about what they're really running. The version number shouldn't change. > IMO one should be able to distinguish the new installer from the old > one. The easiest thing would be to just change the filename into maybe > Python-2.3.2.1.exe. +1. From pnorvig at google.com Thu Oct 16 01:04:41 2003 From: pnorvig at google.com (Peter Norvig) Date: Thu Oct 16 01:04:47 2003 Subject: [Python-Dev] decorate-sort-undecorate References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: Yes, you're right -- with generator comprehensions, you can have short-circuit evaluation via functions on the result, and you can get at both original element and some function of it, at the cost of writing f(x), x. 
So my proposal would be only a small amount of syntactic sugar over what you can do with generator comprehensions. (But you could say the same for list/generator comprehensions over raw generators.) -Peter Norvig On Wed Oct 15 18:36:44 PDT 2003, Phillip J. Eby wrote: > At 05:27 PM 10/15/03 -0700, Peter Norvig wrote: > >As it turns out, I have a proposed syntax for something I call an > >"accumulation display", and with it I was able to implement and test a > >SortBy in about a minute. It uses the syntax > > > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > >[1, -2, 3, -4] > > > >where SortBy is an expression (in this case an identifier bound to a > >class object), not a keyword. Other examples of accumulation displays > >include: > > > > [Sum: x*x for x in numbers] > > [Product: Prob_spam(word) for word in email_msg] > > [Min: temp(hour) for hour in range(24)] > > [Top(10): humor(joke) for joke in jokes] > > [Argmax: votes[c] for c in candidates] > > +0. You can do any of these with a function, if you're willing to let the > entire list be created, and put any needed parameters in as a tuple, e.g.: > > Top(10, [(humor(joke),joke) for joke in jokes]) > > So, if we had generator comprehensions, the proposed mechanism would be > unnecessary. Also, note that [] implies the return value is a list or > sequence of some kind, when it's not. > > IMO, it would really be better to have some kind of generator comprehension > to make inline iterator creation easy, and then put the function or class > or whatever outside the generator comprehension. 
Then, it's clear that > some function is being applied to a sequence, and that you should look to > the function to find out the type of the result, e.g.: > > Top(10, [yield humor(joke),joke for joke in jokes]) > > From greg at cosc.canterbury.ac.nz Thu Oct 16 01:11:45 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 01:11:58 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <200310160511.h9G5BjS28355@oma.cosc.canterbury.ac.nz> Peter Norvig : > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] Interesting idea, but I'm a bit worried by the enclosing [], which suggests that a list is being constructed, whereas in most of your examples the result isn't a list. I still think it would be fun if Python had an "up" operator, so with suitably defined accumulator objects you could say things like total = add up my_numbers product = multiply up some_probabilities Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 16 01:16:23 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 01:16:44 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> "Phillip J. Eby" : > IMO, it would really be better to have some kind of generator > comprehension > > Top(10, [yield humor(joke),joke for joke in jokes]) I like the *idea* of a generator comprehension, but I'm not sure I like the [yield ...] syntax. 
It's a bit idiomatic looking -- the [] still imply a list, even though it's not building a list at all. Maybe there should be a different kind of bracketing, e.g. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From anthony at interlink.com.au Thu Oct 16 01:15:30 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Oct 16 01:18:37 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: <200310160515.h9G5FUqc025443@localhost.localdomain> >>> Thomas Heller wrote > Anthony, did you get this? Yep, sorry - I sleep during the night. Installed on creosote (along with signature) Anthony From martin at v.loewis.de Thu Oct 16 02:04:16 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:04:22 2003 Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30 In-Reply-To: References: Message-ID: "Mike Rovner" writes: > >> - the patch might be incomplete. Ping the submitter. If the patch > >> is incomplete, either complete it yourself, or suggest rejection > >> of the patch. > > All I can do as SF registered user is add a comment to existing patch. > I can't extend it, submit extra files, i.e. "complete" it. > > Please clarify the preferable way to "help with the war on SF patch items". If you think the patch is best revised in a new form, please submit a new patch, and leave a message in the original one indicating that you think your patch should supersede the patch of the original submitter. However, as Brett explains, there might be other (perhaps better) ways to achieve the same effect: If you think the patch needs revision in a certain direction, ask the submitter to revise the patch accordingly.
If you come up with a competing patch, the competition itself may cause bad feelings - so try to work with the submitter, not against her. Regards, Martin From martin at v.loewis.de Thu Oct 16 02:05:12 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:05:18 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <3F8B5ECB.4030207@ocf.berkeley.edu> References: <3F8B5ECB.4030207@ocf.berkeley.edu> Message-ID: "Brett C." writes: > If you get any message from this document, it should be that *anyone* > can help Python. It should be what? Regards, Martin From martin at v.loewis.de Thu Oct 16 02:07:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:08:00 2003 Subject: [Python-Dev] server side digest auth support In-Reply-To: <200310140847.h9E8ltLn028921@localhost.localdomain> References: <200310140847.h9E8ltLn028921@localhost.localdomain> Message-ID: Anthony Baxter writes: > We've got http digest auth [RFC 2617] support at the client level in > the standard library, but it doesn't seem like there's server side > support. I'm planning on adding this (for pypi) but it's not clear > where it should go - I want to use it from a CGI, but I can see it > being useful for people writing HTTP servers as well. Should I just > make a new module httpdigest.py? Can you actually implement it from CGI? How do you get hold of the WWW-Authenticate header? Regards, Martin From theller at python.net Thu Oct 16 02:51:14 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:51:22 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Tim Peters's message of "Wed, 15 Oct 2003 21:41:14 -0400") References: Message-ID: <65ip8yl9.fsf@python.net> "Tim Peters" writes: > [Thomas Heller] >> Before I'd like some questions to be answered, probably Martin or Tim >> have an opinion here (but others are also invited). 
>> >> First, I hope that it's ok to build the installer with the VC6 SP5 >> dlls. > > I have in the past . It's OK by me. The Wise script should already > be refusing to replace newer versions of these DLLs. > >> The other possibility that comes to mind is to not include >> *any* MS runtime dlls, and provide the MS package VCREDIST.EXE >> separately. > > Martin pointed out correctly that Win95 didn't ship with these things, so > it's safest to keep shipping them until Python moves to VC7 (at which point > I don't think we can pretend to support Win9x anymore). > >> Second, what about the filename / version number / build number? > > The build number should definitely change. When someone sends a snippet > from an interactive prompt with an incomprehensible error report, the build > number they're unwittingly tricked into including is the best clue about > what they're really running. The version number shouldn't change. Too late. Anthony already published on creosote what I sent him. With the exception of the MS dlls, the installer contains and installs the exactly identical files as Python-2.3.2.exe, and this includes the build number since I did not rebuild Python itself. >> IMO one should be able to distinguish the new installer from the old >> one. The easiest thing would be to just change the filename into maybe >> Python-2.3.2.1.exe. > > +1. Python-2.3.2-1.exe is it now. Thomas From theller at python.net Thu Oct 16 02:55:47 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:56:01 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Tim Peters's message of "Wed, 15 Oct 2003 21:36:07 -0400") References: Message-ID: <1xtd8ydo.fsf@python.net> "Tim Peters" writes: > [Thomas Heller] >> Sigh. >> >> The 2.3.2 windows binary contains invalid MS dlls. > > Oops. Guido can tell you how anal I was about this, but I don't think it > ever got documented. Sorry! 
It's why the Wise script has C:\Code\MSDLLs as > a choice for where to get redistributables from. I was probably confused because it had C:\Windows\System also ;-(. I will change the WISE script to remove these, and update the relevant PEPs so that this (hopefully) doesn't happen again. >> I copied them from my system directory, instead of using those of the >> MSVC 6 SP5 redistributables. > > That's a good choice. Apparently not - they were XP specific. > I can't find it now, but somewhere in the MS > gigabytes of stuff is a list of which versions of these guys are > redistributable. Sometimes a service pack will install one that isn't > *generally* usable, because it relies on other stuff installed by the same > service pack. These oddballs often show up in security patches, where > they're seemingly ramming out a fix as fast as possible. > >> Strongly affected are probably win98 and NT4 users. > > The happier news is that I've got 2.3.2 on two Win98SE boxes with no ill > effects. I keep these scrupulously up-to-date, though. See the bug reports I mentioned to find out what happened to other people <0.0 wink>. Apologies to everyone affected by my mistake. Thomas From theller at python.net Thu Oct 16 02:56:36 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:56:43 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310160515.h9G5FUqc025443@localhost.localdomain> (Anthony Baxter's message of "Thu, 16 Oct 2003 15:15:30 +1000") References: <200310160515.h9G5FUqc025443@localhost.localdomain> Message-ID: Anthony Baxter writes: >>>> Thomas Heller wrote >> Anthony, did you get this? > Yep, sorry - I sleep during the night. Hm, sometimes I totally forget the timezones.
> Installed on creosote (along with signature) > > Anthony Thanks, Thomas From tim.one at comcast.net Wed Oct 15 20:33:39 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:14 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <3F8DB69E.2070406@sabaydi.com> Message-ID: [Kevin J. Butler] > BDFL: >> -1000. This is non-negotiable. > > [Barry's blissful demo code snipped] > > +1 > > Just 998 votes to go - nice to have a precise value on BDFL > pronouncements. No voting twice with bigger numbers! ;-) -1. Back to 999. > I think just about everyone gets tripped up by the "sort returns None" > behavior, and though one (e.g., BDFL) can declare that it is a less > significant stumble than not realizing the list is sorted in place, it > is a _continuing_ inconvenience, with virtually every call to [].sort, > even for Python experts (like Barry, not me). People would get in worse (subtler) trouble if it did return self. The trouble they get from it returning None is all of shallow, immediate, easily fixed, and 100% consistent with other builtin container mutating methods (dict.update, dict.clear, list.remove, list.append, list.extend, list.insert, list.reverse). That said, since we're having a fire sale on optional sort arguments in 2.4, I wouldn't oppose an optional Boolean argument you could explicit set to have x.sort() return x. For example, >>> [1, 2, 3].sort(happy_guido=False) [1, 2, 3] >>> From tim.one at comcast.net Wed Oct 15 20:46:47 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:25 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <004f01c392c5$bb94a740$e841fea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > * The key function triggers a DSU step with a wrapper object that > holds the full record but returns only the key for a comparison. > This is fast, memory efficient, and doesn't change the underlying > stability characteristics of the sort. 
(I think this was Neil's idea > -- and it works like a charm.) I see the wrapper object participates in cyclic GC. This adds 12 (32-bit Linux) to 16 (32-bit Windows) gc overhead bytes per wrapper object, more than the # of bytes needed to hold the 2 useful pointers. Since the wrapper objects only live for the life of the sort, I don't think it's important that they participate in cyclic gc. In particular, since the key and value objects being wrapped stay alive for the life of the sort too, no cyclic trash they appear in can become collectible during the sort, and so tracing cycles involving these things can't do any good (it can fritter away time moving the wrapper objects into older generations, but that's not usually "good" ). From tim.one at comcast.net Wed Oct 15 20:59:00 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:34 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > Given that the Jython folks had Tim's sort algorithm translated into > Java in half a day, I don't see why we can't require all > implementations to have a stable sort. It's not like you can gain > significant speed over Timsort. I object to any sort that claims to be more stable than its author. Speaking of which, by giving up the so-called stability of 2.3's list.sort(), I can speed sorting of exponentially distributed random floats by nearly 0.017%! That's almost a fiftieth of a percent. Some of the floats in the result are smaller than their left neighbor, but that's only because I had to mutate some of the values, and they're not a lot smaller anyway. 
adopting-the-consensus-view-of-floating-point-will-deliver- many-such-benefits-ly y'rs - tim From tim.one at comcast.net Wed Oct 15 21:13:12 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:39 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: [Ian Bicking] > When doing DSU sorting, the in-place sorting isn't really a > performance win, is it? You already have to allocate and populate an > entire alternate list with the sort keys, though I suppose you could > have those mini key structs point to the original list. IIUC, Raymond's patch actually (re)uses the original list object to hold (pointers to) the wrapper objects. No additional list is allocated. Since the wrapper objects hold (pointers to) the original objects, it's easy to make the list point back to the original objects at the end. It's better this way than hand-rolled DSU coded in Python, although the same effect *could* be gotten via

class Wrapper:
    def __init__(self, key, obj):
        self.key = key
        self.obj = obj
    def __lt__(a, b):
        return a.key < b.key

for i, obj in enumerate(L):
    L[i] = Wrapper(key(obj), obj)
L.sort()
for i, w in enumerate(L):
    L[i] = w.obj

assuming no exceptions occur along the way. > Anyway, while it's obviously in bad taste to propose .sort change its > return value based on the presence of a key, wouldn't it be good if we > had access to the new sorted list, instead of always clobbering the > original list? Otherwise people's sorted() functions will end up > copying lists unnecessarily. Give it an optional clobber argument -- your own sort function doesn't *have* to copy the list.
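For concreteness, the same hand-rolled DSU can also be written with plain tuples instead of a wrapper class, decorating each item with (key, index, item) so that ties fall back to original input order and the wrapped items themselves are never compared. This is a minimal sketch; the helper name sort_by is hypothetical and not part of the patch under discussion:

```python
def sort_by(lst, key):
    # Decorate: pair each item with its key; the original index breaks
    # ties, so equal keys keep input order and the items themselves are
    # never compared against each other.
    decorated = [(key(item), i, item) for i, item in enumerate(lst)]
    decorated.sort()
    # Undecorate in place, mirroring list.sort()'s in-place semantics.
    lst[:] = [item for _key, _i, item in decorated]

words = ["banana", "Apple", "cherry", "date"]
sort_by(words, key=str.lower)
print(words)  # ['Apple', 'banana', 'cherry', 'date']
```

Like list.sort() itself this mutates the list and returns None; a copying variant would simply build and return the undecorated list instead of assigning to lst[:].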
From aleaxit at yahoo.com Thu Oct 16 04:20:24 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 04:21:16 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> References: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: <200310161020.24161.aleaxit@yahoo.com> On Wednesday 15 October 2003 09:48 pm, Ian Bicking wrote: > On Wednesday, October 15, 2003, at 12:35 PM, Barry Warsaw wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? I don't mind the > > sort-in-place behavior, but it's just so inconvenient that it doesn't > > return anything useful. I know it would be better if it returned a new > > list, but practicality beats purity. > > When doing DSU sorting, the in-place sorting isn't really a performance > win, is it? You already have to allocate and populate an entire > alternate list with the sort keys, though I suppose you could have > those mini key structs point to the original list. I thought the idea being implemented avoided making a new list -- i.e., that the idea being implemented is the equivalent of:

# decorate
for i, item in enumerate(thelist):
    thelist[i] = CleverWrapper((key(item), item))
# sort (with the new stability guarantee)
thelist.sort()
# undecorate
for i, item in enumerate(thelist):
    thelist[i] = item[1]

where (the equivalent of):

class CleverWrapper(tuple):
    def __cmp__(self, other):
        return cmp(self[0], other[0])

so, there is no allocation of another list -- just (twice) a repopulation of the existing one. How _important_ that is to performance, I dunno, but wanted to double-check on my understanding of this anyway. > Okay, really I'm just hoping for [x for x in l sortby key(x)], if not > now then someday -- if only there was a decent way of expressing that > without a keyword...
[...in l : key(x)] is the only thing I can think > of that would be syntactically possible (without introducing a new > keyword, new punctuation, or reusing a wholely inappropriate existing > keyword). Or ";" instead of ":", but neither is very good. Peter Norvig's just-proposed "accumulator" syntax looks quite good to me from this point of view, and superior to the "generator comprehension" alternative (though I think the semantics might perhaps be tweaked, but I'm thinking of writing a separate message about that). IOW, if we can accept that [ ... ] is not necessarily a list, then [SortedBy: key(x) for x in L] would look good to me. (in this case this WOULD be a list, but I think the notation pays for itself only if we can use it more generally). Or maybe SortedBy[key(x) for x in L] -- extending indexing syntax [ ... ] to mean something different if the ... includes a 'for', just like we already extended list display syntax [ ... ] to mean list comprehension in just such a case. Alex From pyth at devel.trillke.net Thu Oct 16 04:35:52 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Thu Oct 16 04:36:11 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: ; from tim.one@comcast.net on Wed, Oct 15, 2003 at 08:33:39PM -0400 References: <3F8DB69E.2070406@sabaydi.com> Message-ID: <20031016103552.H14453@prim.han.de> Tim Peters wrote: > [Kevin J. Butler] > > I think just about everyone gets tripped up by the "sort returns None" > > behavior, and though one (e.g., BDFL) can declare that it is a less > > significant stumble than not realizing the list is sorted in place, it > > is a _continuing_ inconvenience, with virtually every call to [].sort, > > even for Python experts (like Barry, not me). > > People would get in worse (subtler) trouble if it did return self. 
The > trouble they get from it returning None is all of shallow, immediate, easily > fixed, and 100% consistent with other builtin container mutating methods > (dict.update, dict.clear, list.remove, list.append, list.extend, > list.insert, list.reverse). > > That said, since we're having a fire sale on optional sort arguments in 2.4, > I wouldn't oppose an optional Boolean argument you could explicit set to > have x.sort() return x. For example, > > >>> [1, 2, 3].sort(happy_guido=False) > [1, 2, 3] > >>> If anything at all, i'd suggest a std-module which contains e.g. 'sort', 'reverse' and 'extend' functions which always return a new list, so that you could write: for i in reverse(somelist): ... which wouldn't modify the list but return a new one. I don't have a name for such a module, but i have once written a "oneliner" to implement the above methods (working on tuples, strings, lists): http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/119596 (sorry this was in my early days :-) have fun, holger From aleaxit at yahoo.com Thu Oct 16 05:14:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 05:14:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> Message-ID: <200310161114.31192.aleaxit@yahoo.com> On Thursday 16 October 2003 07:16 am, Greg Ewing wrote: > "Phillip J. Eby" : > > IMO, it would really be better to have some kind of generator > > comprehension > > > > Top(10, [yield humor(joke),joke for joke in jokes]) > > I like the *idea* of a generator comprehension, but I'm > not sure I like the [yield ...] syntax. It's a bit > idiomatic looking -- the [] still imply a list, even > though it's not building a list at all. > > Maybe there should be a different kind of bracketing, > e.g. 
> > I think we could extend indexing to mean something different when the [ ] contain a 'for', just like we extended list display to mean something different (list comprehension) when the [ ] contain a 'for'. Syntax such as: Top(10)[ humor(joke) for joke in jokes ] does not suggest a list is _returned_, just like foo[23] doesn't. And I have an idea on semantics (which I intend to post separately) which might let accumulator display syntax work for both "iterator comprehensions" AND "return of ordinary non-iterator" results. Alex From aleaxit at yahoo.com Thu Oct 16 06:00:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 06:00:12 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: <200310161200.04846.aleaxit@yahoo.com> On Thursday 16 October 2003 07:04 am, Peter Norvig wrote: > Yes, you're right -- with generator comprehensions, you can have > short-circuit evaluation via functions on the result, and you can get > at both original element and some function of it, at the cost of > writing f(x), x. So my proposal would be only a small amount of > syntactic sugar over what you can do with generator comprehensions. I _like_ your proposal, particularly in my proposed variant syntax foo[ x*x for x in xs if cond(x) ] vs your original syntax [ foo: x*x for x in xs if cond(x) ] I think the "indexing-like" syntax I proposed solves Greg's objection that your "list display-like" syntax (and similar proposals for iterator comprehensions) misleadingly suggest that a list is the result; an indexing makes no such suggestion, as foo[bar] may just as well be a sequence, an iterator, or anything else whatsoever depending on foo (and perhaps on bar:-). But syntax apart, let's dig a little bit more in the semantics. 
At http://www.norvig.com/pyacc.html you basically propose that the infrastructure for an accumulator display perform the equivalent of:

for x in it:
    if a.add(f(x), x): break
return a.result()

where a, in Ian Bicking's proposal, would be acc.__accum__() (I like this, as it lets us use existing sum, max, etc, as accumulators, by adding suitable methods __accum__ to them). However, this would not let accumulator displays usefully return iterators -- since the entire for loop is done by the infrastructure, the best a could do would be to store all needed intermediates to return an iterator on them as a.result() -- possible memory waste. My idea about this is still half-baked, but I think it's ready to post and get your and others' feedback on. Why not move the for loop, if needed, out of the hard-coded infrastructure and just have accumulator display syntax such as:

acc[x*x for x in it]

be exactly equivalent to:

a = acc.__accum__(lambda x: x*x, iter(it))
return a.result()

i.e., pass the callable corresponding to the expression, and the iterator corresponding to the sequence, to the user-coded accumulator. Decisions would have to be taken regarding what exactly to do when the display contains multiple for, if, and/or control variables, as in acc[f(x,y,z) for x, y in it1 if y>x for z in g(y) if z(x) for x in ] where x can be a tuple of the multiple control variables involved and iterable 'it' already encodes all nested-for's and if's into one "stream" of values (some similar kind of decision will have to be taken for your original suggestion, for iterator comprehensions, and for any other such idea, it seems to me). The advantage of my idea would be to let accumulator display syntax just as easily return iterators.
E.g., with something like:

    class Accum(object):
        def __accum__(cls, exp, it):
            " make __accum__ a classmethod equivalent to calling the class "
            return cls(exp, it)
        __accum__ = classmethod(__accum__)
        def __init__(self, exp, it):
            " factor-out the common case of looping into this base-class "
            for item in it:
                if self.add(exp(item), item):
                    break
        def result(self):
            " let self.add implicitly accumulate into self._result by default "
            return self._result

    class Iter(Accum):
        def __init__(self, exp, it):
            " overriding Accum.__init__ as we don't wanna loop "
            self.exp = exp
            self.it = it
        def result(self):
            " overriding Accum.result with a generator "
            for item in self.it:
                yield self.exp(item)

you could code e.g.

    for y in Iter[ x*x for x in nums if good(x) ]:
        blahblah(y)

as being equivalent to:

    for x in nums:
        if good(x):
            y = x*x
            blahblah(y)

but you could also code, roughly as in your original proposal,

    class Mean(Accum):
        def __init__(self, exp=None, it=()):
            " do self attribute initializations then chain up to base class "
            self.total, self.n = 0, 0
            Accum.__init__(self, exp, it)
        def add(self, value, _ignore):
            " the elementary step is unchanged "
            self.total, self.n = self.total+value, self.n+1
        def result(self):
            " override Accum.result as this is better computed just once "
            return self.total / self.n

to keep the .add method factored out for non-display use (the default empty it argument to __init__ is there for this specific purpose, too), if you wished. Basically, my proposal amounts to a different factoring of accumulator display functionality between Python's hard-coded infrastructure, and functionality to be supplied by the standard library module accum that you already propose.
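(Aside for readers trying this today: Alex's Iter[...] spelling can even be approximated in modern Python via __class_getitem__, provided the generator expression is parenthesized -- a toy sketch that simply hands back the genexp, rather than implementing his full expression-plus-iterator protocol:)

```python
# Minimal sketch: Iter[(expr for ...)] just returns the generator,
# mirroring the lazy behaviour Alex wants from accumulator displays.
class Iter:
    def __class_getitem__(cls, gen):
        # the parenthesized genexp inside [ ] is already the desired iterator
        return gen

nums = [1, 2, 3, 4]
def good(x):
    return x % 2 == 0

squares = Iter[(x * x for x in nums if good(x))]
assert list(squares) == [4, 16]
```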
By having much less in the hard-coded parts -- basically just the identification and passing-on of the proper expression and iterator -- and correspondingly more in the standard library, we gain flexibility because a base class in the library may be more flexibly "overridden", in part or in its entirety (an accumulator doesn't HAVE to inherit from class Accum at all, if it just wants to reimplement both of the __accum__ and result methods on its own). If this slows things down a bit we may perhaps in the future hard-code some special cases, but worrying about it now would feel like premature optimization to me. Alex From anthony at interlink.com.au Thu Oct 16 06:08:01 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Oct 16 06:10:49 2003 Subject: [Python-Dev] server side digest auth support In-Reply-To: Message-ID: <200310161008.h9GA817Z030936@localhost.localdomain> >>> Martin v. Löwis wrote > Can you actually implement it from CGI? How do you get hold of the > WWW-Authenticate header? Hm. You're right - it's been far too long since I used plain old CGI for anything. Wow, it's a really awful interface. Been spoiled by app servers and fastcgi and the like, I guess. Ah well - can at least implement it for the various server-side things and client-side things in the std lib. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From just at letterror.com Thu Oct 16 07:19:06 2003 From: just at letterror.com (Just van Rossum) Date: Thu Oct 16 07:19:09 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <005201c39347$10c03960$e841fea9@oemcomputer> Message-ID: Raymond Hettinger wrote: > If the discussion is wrapped up, I'm ready to commit the patch: > > www.python.org/sf/823292 > > Summary: > > .. Adds keyword arguments: cmp, key, reverse. > .. Stable for any combination of arguments (including reverse). [ ...
] On the sf tracker item you write:

    def sort(self, cmp=None, key=None, reverse=None):
        if cmp is not None and key is not None:
            cmp = cmpwrapper(cmp)
        if key is not None:
            self[:] = [sortwrapper(key(x), x) for x in self]
        if reverse is not None:
            self.reverse()
        self.sort(cmp)
        if key is not None:
            self[:] = [x.getvalue() for x in self]
        if reverse is not None:
            self.reverse()

Is there consensus at all about the necessity of that first reverse call? To me it's not immediately obvious that the reverse option should maintain the _original_ stable order. In my particular application I would actually want reverse to do just that: reverse the result of the sort. Easy enough to work around of course: I could do the reverse myself after the sort. But it does feel odd: sort() now _has_ a reverse feature, but I can't use it... (Also: how does timsort perform when fed a (partially) sorted list compared to a reversed sorted list? If there's a significant difference there, then that first reverse call may actually hurt performance in some cases. Not that I care much about that...) Just From paul-python at svensson.org Thu Oct 16 07:38:33 2003 From: paul-python at svensson.org (Paul Svensson) Date: Thu Oct 16 07:38:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310161114.31192.aleaxit@yahoo.com> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> <200310161114.31192.aleaxit@yahoo.com> Message-ID: <20031016073514.Q41936@familjen.svensson.org> On Thu, 16 Oct 2003, Alex Martelli wrote:

>I think we could extend indexing to mean something different when
>the [ ] contain a 'for', just like we extended list display to mean
>something different (list comprehension) when the [ ] contain a
>'for'. Syntax such as:
>
> Top(10)[ humor(joke) for joke in jokes ]
>
>does not suggest a list is _returned_, just like foo[23] doesn't.

But it does immediately suggest

    iter[humor(joke) for joke in jokes]

as the format for iterator comprehensions. Is that good or bad ?
/Paul From aleaxit at yahoo.com Thu Oct 16 07:56:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 07:56:40 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031016073514.Q41936@familjen.svensson.org> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> <200310161114.31192.aleaxit@yahoo.com> <20031016073514.Q41936@familjen.svensson.org> Message-ID: <200310161356.35452.aleaxit@yahoo.com> On Thursday 16 October 2003 01:38 pm, Paul Svensson wrote:

> On Thu, 16 Oct 2003, Alex Martelli wrote:
> >I think we could extend indexing to mean something different when
> >the [ ] contain a 'for', just like we extended list display to mean
> >something different (list comprehension) when the [ ] contain a
> >'for'. Syntax such as:
> >
> > Top(10)[ humor(joke) for joke in jokes ]
> >
> >does not suggest a list is _returned_, just like foo[23] doesn't.
>
> But it does immediately suggest
>
> iter[humor(joke) for joke in jokes]
>
> as the format for iterator comprehensions.
>
> Is that good or bad ?

Personally I consider it very good, because, in my other message about "accumulator display semantics", I show exactly how to achieve that by generalizing the semantics of these displays (well, I show it for a class Iter, but the built-in iter might perfectly well define an __accum__ special method and achieve exactly the same effect). Alex From barry at python.org Thu Oct 16 07:57:10 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 16 07:57:15 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <1066305429.18702.1.camel@anthem> On Wed, 2003-10-15 at 20:27, Peter Norvig wrote:

> As it turns out, I have a proposed syntax for something I call an
> "accumulation display", and with it I was able to implement and test a
> SortBy in about a minute.
> You can read the whole proposal at http://www.norvig.com/pyacc.html

BTW, now is a great time to start writing those Python 2.4 PEPs .
-Barry From mcherm at mcherm.com Thu Oct 16 08:17:44 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 16 08:17:47 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <1066306664.3f8e8c687ef61@mcherm.com> Tim writes: > That said, since we're having a fire sale on optional sort arguments in 2.4, > I wouldn't oppose an optional Boolean argument you could explicit set to > have x.sort() return x. For example, I just wanted to call everyone's attention to the fact that Tim may (again... ) have come up with a decent idea. Seriously... Guido (and apparently Tim and I too) insist that aList.sort() must return None since it mutates the list. Meanwhile, Kevin, Barry, and perhaps others want to be able to write aList.sort().reverse().chainMoreHere(). But both sides could probably be happy with: aList.sort(chain=True).reverse() Right? -- Michael Chermside
I read [] used for subscripting as completely different from [] used for list literals and list comprehensions. They just happen to share the same pair of symbols. To me, this confuses the two somewhat. -- Michael Chermside From tim.one at comcast.net Thu Oct 16 09:53:15 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 09:53:18 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <1xtd8ydo.fsf@python.net> Message-ID: >>> I copied them from my system directory, instead of using those of >>> the MSVC 6 SP5 redistributables. >> That's a good choice. > Apparently not - they were XP specific. We may be compounding ambiguity here. By "that" I meant the SP redistributables. By "they" I expect you mean whatever was sitting in your system directory, in which case I switch from saying that's a good choice to that's a rotten choice . From aleaxit at yahoo.com Thu Oct 16 10:02:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 10:02:42 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <1066307494.3f8e8fa687e5a@mcherm.com> References: <1066307494.3f8e8fa687e5a@mcherm.com> Message-ID: <200310161602.35561.aleaxit@yahoo.com> On Thursday 16 October 2003 02:31 pm, Michael Chermside wrote: > Alex Martelli writes: > > I think we could extend indexing to mean something different when > > the [ ] contain a 'for', just like we extended list display to mean > > something different (list comprehension) when the [ ] contain a > > 'for'. Syntax such as: > > > > Top(10)[ humor(joke) for joke in jokes ] > > > > does not suggest a list is _returned_, just like foo[23] doesn't. > > I find the syntax a bit confusing. > > Are we subscripting here, or are we juxtaposing one expression > ("Top(10)"), with a list comprehension ("[humor(joke) for joke in jokes]")? 
"Subscripting", just like we would do with, say,

    Top(10)[ humor(joke) and joke in jokes ]

This syntax, too, is a bit confusing -- because we rarely use indexing right on the result of a function call -- but it's perfectly valid Python today. If you dislike the syntax, nothing stops you from writing, today:

    select_top_10 = Top(10)
    select_top_10[ humor(joke) and joke in jokes ]

and similarly nothing will stop you, if something like this accumulator display syntax is approved, from writing in the second statement

    select_top_10[ humor(joke) for joke in jokes ]

and indeed some would consider this other form more readable. I am not proposing any newfangled "juxtaposing" syntax, writing two expressions right one after the other, which would have no precedent in Python; just an extension of the syntax allowed within brackets in indexing syntax (by analogy with that allowed today within brackets in list comprehension / list display syntax) -- for the semantics, see my separate post "accumulator display semantics". (Both of my posts are commentary on the proposal by Peter Norvig for a new accumulator display syntax and semantics: this syntax looks good to me to avoid the objection that Peter's proposed "[foo: x for x in bar]" ``looks like it should be returning a list'' due to the square brackets and the similar objection against the separately proposed iterator-comprehension syntax).

> Not totally unreadable, but it rubs me the wrong way. I read [] used
> for subscripting as completely different from [] used for list literals
> and list comprehensions. They just happen to share the same pair of
> symbols. To me, this confuses the two somewhat.

Not long ago, what could go inside those square brackets was an expression, period -- no matter whether the brackets stood on their own (list display) or followed an expression (indexing/slicing).
Some minor differences, of course, such as empty [ ] being valid only in list display but syntactically invalid in indexing, and slice notation [a:b] being valid in indexing but syntactically invalid in list display; but typical uses such as [a,b,c] overlapping -- with different meanings, of course (indexing X[a,b,c] -> the tuple (a,b,c) used as key; display [a,b,c] -> creating a 3-items list). So, different semantics but very similar syntax. Then list comprehensions were introduced and the syntax admitted inside [ ] got far wider, in "list display" cases only. Why would it be a problem if now the syntax admitted in the "similar syntax, different semantics" case of "indexing" got similarly wider? How would it infringe on the "completely different ... just happen to share the same pair of symbols" (and a lot about the syntax relating to what can go inside those symbols, too) perception, which seems to me to be pretty accurate? Alex From tim.one at comcast.net Thu Oct 16 10:25:54 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 10:25:55 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID:

[Just]
> On the sf tracker item you [Raymond] write:
>
>     def sort(self, cmp=None, key=None, reverse=None):
>         if cmp is not None and key is not None:
>             cmp = cmpwrapper(cmp)
>         if key is not None:
>             self[:] = [sortwrapper(key(x), x) for x in self]
>         if reverse is not None:
>             self.reverse()
>         self.sort(cmp)
>         if key is not None:
>             self[:] = [x.getvalue() for x in self]
>         if reverse is not None:
>             self.reverse()
>
> Is there consensus at all about the necessity of that first reverse
> call? To me it's not immediately obvious that the reverse option
> should maintain the _original_ stable order. In my particular
> application I would actually want reverse to do just that: reverse
> the result of the sort. Easy enough to work around of course: I could
> do the reverse myself after the sort.
> But it does feel odd: sort()
> now _has_ a reverse feature, but I can't use it...

"reverse" here is being used in the sense of "flip the sense of the cmp() result", so that instead of using cmp(x, y), it (conceptually) uses the negation of cmp(x, y). This swaps "less than" with "greater than" outcomes, but leaves "equal" outcomes alone. In that sense, Raymond's is a clever and correct implementation. I don't know that it helps Alex's use case, though (multi-key sort where some keys want ascending and others descending; those are still tricky to write directly in one bite, although the reverse argument makes them easy to do by chaining sorts one key at a time).

> (Also: how does timsort perform when fed a (partially) sorted list
> compared to a reversed sorted list?

I'll need a concrete example to figure out exactly what that's intended to mean. The algorithm is equally happy with descending runs as with ascending runs, although the former need a little time to transform them to ascending runs, and the all-equal case counts as an ascending run.

> If there's a significant difference there, then that first reverse
> call may actually hurt performance in some cases. Not that I care
> much about that...)

Say we're doing [1, 2, 3].sort(reverse=True). Raymond first reverses it:

    [3, 2, 1]

In one pass, using two compares (N-1 for a list of length N), the algorithm recognizes that the whole thing is a single descending run. It then reverses it in one pass (swapping elements starting at both ends and moving toward the middle):

    [1, 2, 3]

and it's done. Raymond then reverses it again:

    [3, 2, 1]

So there are 3 reversals in all. Reversals are cheap, since they just swap pointers in a tight little C loop, and never call back into Python.
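(Aside for readers following along today: the semantics Tim describes -- reverse=True flips the comparison but keeps the sort stable -- can be checked directly in any Python with the 2.4-era key/reverse arguments; the data below is made up purely for illustration:)

```python
# Records with equal keys keep their original relative order even with
# reverse=True, exactly because the list is reversed both before and
# after the stable ascending sort.
pairs = [('b', 1), ('a', 2), ('b', 3), ('a', 4)]
pairs.sort(key=lambda p: p[0], reverse=True)
assert pairs == [('b', 1), ('b', 3), ('a', 2), ('a', 4)]
```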
From python at rcn.com Thu Oct 16 10:26:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 10:27:32 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310161020.24161.aleaxit@yahoo.com> Message-ID: <003301c393f1$8aa9efa0$e841fea9@oemcomputer>

[Alex Martelli]
> I thought the idea being implemented avoided making a new list --
> i.e., that the idea being implemented is the equivalent of:
>
>     # decorate
>     for i, item in enumerate(thelist):
>         thelist[i] = CleverWrapper((key(item), item))
>
>     # sort (with the new stability guarantee)
>     thelist.sort()
>
>     # undecorate
>     for i, item in enumerate(thelist):
>         thelist[i] = item[1]
>
> where (the equivalent of):
>
>     class CleverWrapper(tuple):
>         def __cmp__(self, other): return cmp(self[0], other[0])
>
> so, there is no allocation of another list -- just (twice) a repopulation
> of the existing one. How _important_ that is to performance, I dunno,
> but wanted to double-check on my understanding of this anyway.

Yes, that is how it works in a nutshell ;-) Of course, it looks more impressive and was harder to write in C. Raymond Hettinger From pje at telecommunity.com Thu Oct 16 10:50:44 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 16 10:50:42 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <1066307494.3f8e8fa687e5a@mcherm.com> Message-ID: <5.1.0.14.0.20031016104316.01ecbec0@mail.telecommunity.com> At 05:31 AM 10/16/03 -0700, Michael Chermside wrote:

>Alex Martelli writes:
> > I think we could extend indexing to mean something different when
> > the [ ] contain a 'for', just like we extended list display to mean
> > something different (list comprehension) when the [ ] contain a
> > 'for'. Syntax such as:
> >
> > Top(10)[ humor(joke) for joke in jokes ]
> >
> > does not suggest a list is _returned_, just like foo[23] doesn't.
>
>I find the syntax a bit confusing.
>Are we subscripting here, or are we juxtaposing one expression
>("Top(10)"), with a list comprehension ("[humor(joke) for joke in jokes]")?
>
>Not totally unreadable, but it rubs me the wrong way. I read [] used
>for subscripting as completely different from [] used for list literals
>and list comprehensions. They just happen to share the same pair of
>symbols. To me, this confuses the two somewhat.

I have to second the syntax confusion, but for a different reason. This:

    Top(10)[ humor(joke) for joke in jokes ]

Looks to me like some kind of *slice* syntax. I would read this as being roughly equivalent to:

    temp = Top(10)
    [temp[humor(joke)] for joke in jokes ]

Top(10) and all the other accumulators proposed are, IMO, nothing more than transformations of a sequence or iterator. Transformations are what functions are for, and function syntax clearly expresses that the function is being applied to the sequence or iterator, and returning a result. Peter's syntax is too magical, and Alex's implies subscripting that doesn't really exist. Both are misleading to a casual reader of the code. From neal at metaslash.com Thu Oct 16 11:52:12 2003 From: neal at metaslash.com (Neal Norwitz) Date: Thu Oct 16 11:52:21 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.158, 2.159 In-Reply-To: References: Message-ID: <20031016155212.GE30467@epoch.metaslash.com> On Wed, Oct 15, 2003 at 08:41:11PM -0700, rhettinger@users.sourceforge.net wrote:

> Index: listobject.c
> ===================================================================
> + static PyObject *
> + cmpwrapper_call(cmpwrapperobject *co, PyObject *args, PyObject *kwds)
> + {
> +     PyObject *x, *y, *xx, *yy;
> +
> +     if (!PyArg_UnpackTuple(args, "", 2, 2, &x, &y))
> +         return NULL;
> +     if (!PyObject_TypeCheck(x, &sortwrapper_type) ||
> +         !PyObject_TypeCheck(x, &sortwrapper_type)) {

The second line should be checking y, not x?
Neal From ianb at colorstudy.com Thu Oct 16 12:17:35 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Oct 16 12:17:46 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <200310161200.04846.aleaxit@yahoo.com> Message-ID: <40DEE1A8-FFF4-11D7-9282-000393C2D67E@colorstudy.com> On Thursday, October 16, 2003, at 05:00 AM, Alex Martelli wrote: > Why not move the for loop, if needed, out of the hard-coded > infrastructure and just have accumulator display syntax such as: > acc[x*x for x in it] > be exactly equivalent to: > a = acc.__accum__(lambda x: x*x, iter(it)) > return a.result() > i.e., pass the callable corresponding to the expression, and the > iterator corresponding to the sequence, to the user-coded > accumulator. Seems simpler if you could get an iterator for [x*x for x in it] that returned (x*x, x), then call acc.__accum__(that_iter). I suppose for some accumulators you could sometimes avoid calling the expression, but that doesn't seem like a big feature. It seems like it complicates the semantics that you have to turn the list comprehension's expression into a function, where (I imagine) it doesn't get turned into a real function otherwise, but is executed without a new scope. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From aahz at pythoncraft.com Thu Oct 16 12:23:41 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 12:23:45 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <200310161200.04846.aleaxit@yahoo.com> References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> <200310161200.04846.aleaxit@yahoo.com> Message-ID: <20031016162341.GA7305@panix.com> I'm having a difficult time following this discussion. Would someone please write a PEP once things settle down? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan From tim.one at comcast.net Thu Oct 16 12:50:41 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 12:50:41 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066306664.3f8e8c687ef61@mcherm.com> Message-ID: [Michael Chermside] > ... > But both sides could probably be happy with: > > aList.sort(chain=True).reverse() > > Right? Probably not: some people want list.sort() to return a (shallow) copy of the list in sorted order, leaving the original list alone, sometimes. Sometimes not. It's all dead easy already, of course. From python at rcn.com Thu Oct 16 13:09:22 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:10:07 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.158, 2.159 In-Reply-To: <20031016155212.GE30467@epoch.metaslash.com> Message-ID: <004501c39408$3eed63a0$e841fea9@oemcomputer> > > + !PyObject_TypeCheck(x, &sortwrapper_type)) { [Neal] > The second line should be checking y, not x? Yes. Will checkin a fix. Raymond From python at rcn.com Thu Oct 16 13:21:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:22:02 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <20031016162341.GA7305@panix.com> Message-ID: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> [Aahz] > I'm having a difficult time following this discussion. Would someone > please write a PEP once things settle down? Peter's link is essentially a PEP already and covers all the essentials: http://www.norvig.com/pyacc.html Still, if his ideas aspire to immortality, he should go the last yard and format it for pephood.
Raymond Hettinger From aahz at pythoncraft.com Thu Oct 16 13:45:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 13:45:29 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> References: <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <20031016174526.GA20332@panix.com> On Thu, Oct 16, 2003, Raymond Hettinger wrote: > [Aahz] >> >> I'm having a difficult time following this discussion. Would someone >> please write a PEP once things settle down? > > Peter's link is essentially a PEP already and covers all the essentials: > > http://www.norvig.com/pyacc.html Gotcha. Didn't realize he'd been summarizing the discussion. Well, I'll hold my opinion on the whole proposal pending a PEP, but I'll make two comments on the proposal as it stands: * I'm strongly opposed to the return idea instead of raising StopAccumulation (which should be a subclass of StopIteration). Using return this way is IMO unPythonic. * If we're using bracket notation, I think accumulators must return a list. I think it would be a Bad Idea to permit other types (although I'm willing for leeway to permit list subclasses). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Thu Oct 16 13:49:57 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:50:38 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <1066306664.3f8e8c687ef61@mcherm.com> Message-ID: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> [Tim Peters] > > That said, since we're having a fire sale on optional sort arguments in > 2.4, > > I wouldn't oppose an optional Boolean argument you could explicit set to > > have x.sort() return x. For example, [Michael Chermside] > I just wanted to call everyone's attention to the fact that Tim may > (again... ) have come up with a decent idea. 
>
> Seriously... Guido (and apparently Tim and I too) insist that aList.sort()
> must return None since it mutates the list. Meanwhile, Kevin, Barry, and
> perhaps others want to be able to write
> aList.sort().reverse().chainMoreHere().

Are you proposing something like:

    print mylist.sort(inplace=False)  # prints a new, sorted list while
                                      # leaving the original list intact

which would be implemented something like this:

    def inlinesort(alist, *args, **kwds):
        newref = alist[:]
        newref.sort(*args, **kwds)
        return newref

If that is what you're after, I think it is a good idea. It avoids the perils of mutating methods returning self. It is explicit and pleasing to write:

    for elem in mylist.sort(inplace=False):
        . . .

It is extra nice in a list comprehension:

    peckingorder = [d.name for d in duck.sort(key=seniority, inplace=False)]

Instead of "inplace=False", an alternative is "inline=True". Raymond Hettinger From guido at python.org Thu Oct 16 14:03:26 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 16 14:04:03 2003 Subject: [Python-Dev] inline sort option In-Reply-To: Your message of "Thu, 16 Oct 2003 13:49:57 EDT." <004f01c3940d$ea1c1320$e841fea9@oemcomputer> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> Message-ID: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com>

> Are you proposing something like:
>
>     print mylist.sort(inplace=False)  # prints a new, sorted list while
>                                       # leaving the original list intact
>
> which would be implemented something like this:
>
>     def inlinesort(alist, *args, **kwds):
>         newref = alist[:]
>         newref.sort(*args, **kwds)
>         return newref
>
> If that is what you're after, I think it is a good idea. It avoids the
> perils of mutating methods returning self. It is explicit and pleasing
> to write:
>
>     for elem in mylist.sort(inplace=False):
>         . . .
>
> It is extra nice in a list comprehension:
>
>     peckingorder = [d.name for d in duck.sort(key=seniority,
>                                               inplace=False)]
>
> Instead of "inplace=False", an alternative is "inline=True".

*If* we're going to consider this, I would recommend using a different method name rather than a keyword argument. Arguments whose value changes the return type present a problem for program analysis tools like type checkers (and IMO are also easily overseen by human readers). And, it's easier to write l.sorted() rather than l.sort(inline=True).

--Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Thu Oct 16 14:25:13 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Thu Oct 16 14:25:28 2003 Subject: [Python-Dev] Re: inline sort option References: <1066306664.3f8e8c687ef61@mcherm.com> <004f01c3940d$ea1c1320$e841fea9@oemcomputer> Message-ID: In article <004f01c3940d$ea1c1320$e841fea9@oemcomputer>, "Raymond Hettinger" wrote:

> Are you proposing something like:
>
>     print mylist.sort(inplace=False)  # prints a new, sorted list while
>                                       # leaving the original list intact

What's wrong with writing your own three-line function

    def sort(L):
        copy = list(L)
        copy.sort()
        return copy

then you can do

    print sort(mylist)

etc to your heart's content... -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From pje at telecommunity.com Thu Oct 16 14:26:13 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Oct 16 14:26:02 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <20031016174526.GA20332@panix.com> References: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <5.1.0.14.0.20031016140604.02e53260@mail.telecommunity.com> At 01:45 PM 10/16/03 -0400, Aahz wrote: >On Thu, Oct 16, 2003, Raymond Hettinger wrote: > > [Aahz] > >> > >> I'm having a difficult time following this discussion. Would someone > >> please write a PEP once things settle down? > > > > Peter's link is essentially a PEP already and covers all the essentials: > > > > http://www.norvig.com/pyacc.html > >Gotcha. Didn't realize he'd been summarizing the discussion. Well, >I'll hold my opinion on the whole proposal pending a PEP, but I'll make >two comments on the proposal as it stands: > >* I'm strongly opposed to the return idea instead of raising >StopAccumulation (which should be a subclass of StopIteration). Using >return this way is IMO unPythonic. > >* If we're using bracket notation, I think accumulators must return a >list. I think it would be a Bad Idea to permit other types (although I'm >willing for leeway to permit list subclasses). And while we're writing comments for the "Objections" part of the PEP... :)

* This does nothing functions can't do today (with less magic and greater readability) over any iterable

* If you don't want to allocate memory for the whole list, you can always write an iterator object or generator function -- today, even in Python 2.2.

* If you really want a way to create a generator inline, let's just have a way to create a generator inline in 2.4. And any accumulator functions you previously wrote for 2.2 or 2.3 will "just work" with the new kind of generator.

Note too, that inline generators would have other uses besides accumulation expressions. From bac at OCF.Berkeley.EDU Thu Oct 16 14:29:15 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Oct 16 14:29:30 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: References: <3F8B5ECB.4030207@ocf.berkeley.edu> Message-ID: <3F8EE37B.6070803@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>If you get any message from this document, it should be that *anyone* >>can help Python. > > > It should be what? > "...that *anyone* can help with the development of Python"? -Brett From aahz at pythoncraft.com Thu Oct 16 14:51:57 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 14:52:00 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> Message-ID: <20031016185156.GA3580@panix.com> On Thu, Oct 16, 2003, Guido van Rossum wrote: > > *If* we're going to consider this, I would recommend using a different > method name rather than a keyword argument. Arguments whose value > changes the return type present a problem for program analysis tools > like type checkers (and IMO are also easily overseen by human > readers). And, it's easier to write l.sorted() rather than > l.sort(inline=True). Let's make explicit: l.copysort() I'm not a big fan of grammatical suffixes for distinguishing between similar meanings. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Thu Oct 16 15:18:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 15:18:51 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <20031016185156.GA3580@panix.com> Message-ID: <006701c3941a$3c5e9d40$e841fea9@oemcomputer> [Guido van Rossum] > > *If* we're going to consider this, I would recommend using a different > > method name rather than a keyword argument.
Arguments whose value > > changes the return type present a problem for program analysis tools > > like type checkers (and IMO are also easily overseen by human > > readers). And, it's easier to write l.sorted() rather than > > l.sort(inline=True). [Aahz] > Let's make explicit: l.copysort() > > I'm not a big fan of grammatical suffixes for distinguishing between > similar meanings. +1 Raymond From theller at python.net Thu Oct 16 15:38:01 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 15:38:05 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <3F8C3DD0.4020400@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Tue, 14 Oct 2003 20:17:52 +0200") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: > Thomas Heller wrote: > >> If I look at the file sizes in the DLLs directory, it seems that at >> least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow >> python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? > > No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). "Martin v. Löwis" writes: > I don't see why it matters, though. Adding modules to pythonxy.dll > does not increase the memory consumption if the modules are not > used. It might decrease the memory consumption in case the modules are > used. So, would a patch be accepted (for 2.4, I assume there is no way for 2.3.3) which made everything builtin except for the following modules: _testcapi - not used outside the testsuite _tkinter - needs external stuff anyway pyexpat - may be replaced by a third party module _ssl - needs Python to be built Thomas From FBatista at uniFON.com.ar Thu Oct 16 15:51:40 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Thu Oct 16 15:52:34 2003 Subject: [Python-Dev] inline sort option Message-ID: #- > > like type checkers (and IMO are also easily overseen by human #- > > readers).
And, it's easier to write l.sorted() rather than #- > > l.sort(inline=True). #- #- [Aahz] #- > Let's make explicit: l.copysort() #- > #- > I'm not a big fan of grammatical suffixes for #- distinguishing between #- > similar meanings. #- #- +1 +2, considering that the difference in behaviour with sort and sorted it's no so clear to a non-english speaker. (my first post to the development list, :D ) . Facundo . -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031016/7f1786a9/attachment.html From shane.holloway at ieee.org Thu Oct 16 16:00:31 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Thu Oct 16 16:01:23 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> Message-ID: <3F8EF8DF.3030003@ieee.org> Guido van Rossum wrote: > *If* we're going to consider this, I would recommend using a different > method name rather than a keyword argument. Arguments whose value > changes the return type present a problem for program analysis tools > like type checkers (and IMO are also easily overseen by human > readers). And, it's easier to write l.sorted() rather than > l.sort(inline=True). > > --Guido van Rossum (home page: http://www.python.org/~guido/) I'd like to see that as an inplace sort still -- because copysort is easy to get to... l.sorted() # inplace sort, returning self l[:].sorted() # copy sort, returning new list Just my 1/50th of a dollar. ;) -Shane Holloway From guido at python.org Thu Oct 16 16:19:47 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 16 16:20:27 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Thu, 16 Oct 2003 21:38:01 +0200." References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> > So, would a patch be accepted (for 2.4, I assume there is no way for > 2.3.3) which made everything builtin except for the following modules: > > _testcapi - not used outside the testsuite > _tkinter - needs external stuff anyway > pyexpat - may be replaced by a third party module > _ssl - needs Python to be built I'd rather see an explicit list of the "everything" that you want to bundle into the main DLL. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Thu Oct 16 16:46:01 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 16 16:44:20 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <5.1.0.14.0.20031016140604.02e53260@mail.telecommunity.com> References: <20031016174526.GA20332@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <5.2.1.1.0.20031016224531.02804310@pop.bluewin.ch> At 14:26 16.10.2003 -0400, Phillip J. Eby wrote: >* If you really want a way to create a generator inline, let's just have a >way to create a generator inline in 2.4. And any accumulator functions >you previously wrote for 2.2 or 2.3 will "just work" with the new kind of >generator. Note too, that inline generators would have other uses besides >accumulation expressions. agreed. From barry at barrys-emacs.org Thu Oct 16 17:25:51 2003 From: barry at barrys-emacs.org (Barry Scott) Date: Thu Oct 16 17:26:01 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: References: <200310160515.h9G5FUqc025443@localhost.localdomain> Message-ID: <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> You said you are using the SP5 DLLs. They are old... We use the ones from vc6redist.exe from Microsoft; they have fixes that you may need. It's also the versions that you will encounter on XP systems, I believe. So long as you have the version checking done right in the installer you will not rewind a DLL backwards. Barry From martin at v.loewis.de Thu Oct 16 17:37:00 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 17:37:14 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <3F8EE37B.6070803@ocf.berkeley.edu> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <3F8EE37B.6070803@ocf.berkeley.edu> Message-ID: "Brett C."
writes: > >>If you get any message from this document, it should be that *anyone* > >>can help Python. > > It should be what? > > > > "...that *anyone* can help with the development of Python"? Ah, ok. I was expecting something like "it should be clear/obvious/doubtful that anyone can help with Python" Regards, Martin From niemeyer at conectiva.com Thu Oct 16 18:50:59 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Thu Oct 16 18:52:10 2003 Subject: [Python-Dev] SRE recursion Message-ID: <20031016225058.GB19133@ibook.distro.conectiva> Hello folks! I'd like to get back to the SRE recursion issue (#757624). Is this a good time to commit the patch? -- Gustavo Niemeyer http://niemeyer.net From bac at OCF.Berkeley.EDU Thu Oct 16 19:25:16 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 16 19:25:21 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: References: <3F8B5ECB.4030207@ocf.berkeley.edu> <3F8EE37B.6070803@ocf.berkeley.edu> Message-ID: <3F8F28DC.1000106@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>>>If you get any message from this document, it should be that *anyone* >>>>can help Python. >>> >>>It should be what? >>> >> >>"...that *anyone* can help with the development of Python"? > > > Ah, ok. I was expecting something like > > "it should be clear/obvious/doubtful that anyone can help with Python" > Could, but I don't want to come off as patronizing. Last thing I want to happen is someone to read that line with "obvious" and then have them feel stupid because it didn't come off as obvious. Even if the person isn't that smart they can still give the PSF money so I want to minimize the chance of insulting a possible sugar-daddy for the PSF.
=) -Brett From niemeyer at conectiva.com Thu Oct 16 19:24:44 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Thu Oct 16 19:25:54 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031016103552.H14453@prim.han.de> References: <3F8DB69E.2070406@sabaydi.com> <20031016103552.H14453@prim.han.de> Message-ID: <20031016232444.GA27936@ibook.distro.conectiva> > If anything at all, i'd suggest a std-module which contains e.g. > 'sort', 'reverse' and 'extend' functions which always return > a new list, so that you could write: > > for i in reverse(somelist): > ... You can do reverse with [::-1] now. -- Gustavo Niemeyer http://niemeyer.net From bac at OCF.Berkeley.EDU Thu Oct 16 19:28:36 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 16 19:28:47 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031016225058.GB19133@ibook.distro.conectiva> References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <3F8F29A4.7060904@ocf.berkeley.edu> Gustavo Niemeyer wrote: > Hello folks! > > I'd like to get back to the SRE recursion issue (#757624). Is this > a good time to commit the patch? > I don't see why not. I assume this is only going into the main trunk. Might as well get it in now if you feel it is ready so that there is that much more time for testing and any possible fixing. -Brett From greg at cosc.canterbury.ac.nz Thu Oct 16 20:07:38 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 20:08:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310161602.35561.aleaxit@yahoo.com> Message-ID: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Alex Martelli : > Then list comprehensions were introduced and the syntax admitted > inside [ ] got far wider, in "list display" cases only. Why would it be > a problem if now the syntax admitted in the "similar syntax, different > semantics" case of "indexing" got similarly wider? 
List comprehensions extended the semantics of list construction by providing new ways to specify the contents of the list. Extended slice notation extended the semantics of indexing by providing new ways to specify the index. What you're proposing hijacks the indexing syntax and uses it to mean something completely different from indexing, which is a much bigger change, and potentially a very confusing one. So, no, sorry, it doesn't overcome my objection! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Fri Oct 17 01:49:38 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 01:52:36 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031016225058.GB19133@ibook.distro.conectiva> References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > I'd like to get back to the SRE recursion issue (#757624). Is this > a good time to commit the patch? It would be good if you could find somebody who reviews the patch. However, if nobody volunteers to review, please go ahead - it might well be that you are the last active SRE maintainer left on this planet ... Regards, Martin From gherron at islandtraining.com Fri Oct 17 02:05:27 2003 From: gherron at islandtraining.com (Gary Herron) Date: Fri Oct 17 02:06:28 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <200310162305.27509.gherron@islandtraining.com> On Thursday 16 October 2003 10:49 pm, Martin v. Löwis wrote: > Gustavo Niemeyer writes: > > I'd like to get back to the SRE recursion issue (#757624). Is this > > a good time to commit the patch? > > It would be good if you could find somebody who reviews the > patch.
However, if nobody volunteers to review, please go ahead - it > might well be that you are the last active SRE maintainer left on this > planet ... I jumped into SRE and wallowed around a bit before the last release, then got swamped with real (i.e., money earning) work. I'd be willing to jump in again if it would help. Gustavo, would you like me to review the patch? Or if you submit it, I'll just get it from cvs and poke around it that way. Gary Herron From whisper at oz.net Fri Oct 17 02:55:31 2003 From: whisper at oz.net (David LeBlanc) Date: Fri Oct 17 02:55:35 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: > > So, would a patch be accepted (for 2.4, I assume there is no way for > > 2.3.3) which made everything builtin except for the following modules: > > > > _testcapi - not used outside the testsuite > > _tkinter - needs external stuff anyway > > pyexpat - may be replaced by a third party module > > _ssl - needs Python to be built > > I'd rather see an explicit list of the "everything" that you want to > bundle into the main DLL. > > --Guido van Rossum (home page: http://www.python.org/~guido/) I have no really good technical reason for this, but it gives me a bad feeling - it's Windows, ok? ;) A few things come to mind: What's the cost of mapping the world (all those entry points) at startup? You have to rebuild all of the main dll just to do something to one component. To me, that's maybe the biggest single issue. Any possibility of new bugs? Are app users/programmers going to have a bloat perception? How many of them really understand that a dll is mapped and not loaded at startup? IMO, it contradicts the unix way of smaller, compartmentalized is better. It's not unix we're talking about, but it still makes sense to me, whatever the OS.
On the plus side, it does make some debugging easier if you're working on extension dlls: fewer sources to have to point Vis Studio at. On a related side note: has anyone done any investigation to determine which small percentage of the extensions account for 99% of the dll loads? Maybe there's no such pattern, but experience suggests there probably is and that subset might be a better candidate than the whole world. Dave LeBlanc Seattle, WA USA From greg at electricrain.com Fri Oct 17 03:49:39 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Fri Oct 17 03:49:43 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <20031017074939.GG32250@zot.electricrain.com> On Thu, Oct 16, 2003 at 09:38:01PM +0200, Thomas Heller wrote: > > Thomas Heller wrote: > > > >> If I look at the file sizes in the DLLs directory, it seems that at > >> least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow > >> python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? > > > > No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). > > "Martin v. Löwis" writes: > > > I don't see why it matters, though. Adding modules to pythonxy.dll > > does not increase the memory consumption if the modules are not > > used. It might decrease the memory consumption in case the modules are > > used. > > So, would a patch be accepted (for 2.4, I assume there is no way for > 2.3.3) which made everything builtin except for the following modules: > > _testcapi - not used outside the testsuite > _tkinter - needs external stuff anyway > pyexpat - may be replaced by a third party module > _ssl - needs Python to be built > I really don't like the idea of linking _bsddb.pyd statically into the main python DLL (or .so on other OSes). It adds significantly to the size of the python DLL which isn't fair to projects not using BerkeleyDB.
Statically linking any BerkeleyDB version into python on linux (and presumably bsd and un*x) means that attempts to use more recent pybsddb modules with an updated version of the BerkeleyDB library built in don't work properly due to symbol conflicts causing the old library to be used with the new module code. I don't know if this problem applies to windows. I don't see any good reason to want fewer .pyd files and a monolithic main DLL. Greg From aleaxit at yahoo.com Fri Oct 17 03:53:55 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 03:54:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Message-ID: <200310170953.55170.aleaxit@yahoo.com> On Friday 17 October 2003 02:07 am, Greg Ewing wrote: > Alex Martelli : > > Then list comprehensions were introduced and the syntax admitted > > inside [ ] got far wider, in "list display" cases only. Why would it be > > a problem if now the syntax admitted in the "similar syntax, different > > semantics" case of "indexing" got similarly wider? > > List comprehensions extended the semantics of list construction by > providing new ways to specify the contents of the list. > > Extended slice notation extended the semantics of indexing by > providing new ways to specify the index. > > What you're proposing hijacks the indexing syntax and uses it to mean > something completely different from indexing, which is a much bigger > change, and potentially a very confusing one. Hmmm -- on this thread I meant to discuss the syntax only, but, OK, let's touch on the semantics. Let's say, then, that my proposed syntax: foo[x*x for x in blah] gets turned into "extending the semantics of indexing" just like, e.g., extended slicing did. 
That basically requires making this syntax correspond to Python calling: type(foo).__getitem__(foo, <index>) just like it does for other possible contents of the parentheses. E.g., today:

>>> class x(object):
...     def __getitem__(self, index): return index
...
>>> a = x()
>>> print a['tanto':'va':'la', 'gatta':'al':'lardo']
(slice('tanto', 'va', 'la'), slice('gatta', 'al', 'lardo'))
>>>

while hypothetically if this syntax (and corresponding semantics) were adopted, we might have: >>> print a[x*x for x in blaap] Of course, it would be up to a's type x to know what to do with that iterator, just as, today, it is to know what to do with that tuple of slice objects with (e.g.) string attributes. Coding objects that support iterators as indices would be slightly harder than having objects receive such indexing via a separate special method, such as the previously proposed __accum__; but then, this just corresponds to the slight hardship we pay for generality in coding objects that support slices as indices via __getitem__ -- the older and less general approach of having a separate special method, quondam __getslice__, was easier for special cases but not as general and extensible as today's. So, if framing what the subject still calls "accumulator displays" as "new ways to specify the index" -- and renaming the whole concept to e.g. "iterators as indices", since there is no necessary connection of the proposed new syntax and semantics to accumulation -- can ease acceptance, that's fine with me.
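Alex's "iterators as indices" idea can already be mimicked today by passing a list (or any iterable) as the subscript; a minimal sketch, with a hypothetical `Summer` class standing in for the proposed `sum[...]` (the class name and semantics here are illustrative assumptions, not anything in Python):

```python
class Summer(object):
    # hypothetical stand-in for the proposed sum[...] semantics:
    # __getitem__ receives whatever object appears between the brackets,
    # and here simply sums it
    def __getitem__(self, index):
        return sum(index)

s = Summer()
# today the "index" has to be an explicit list (or other iterable);
# the proposal would let a bare comprehension appear between the brackets
print(s[[x * x for x in range(4)]])  # -> 14 (0 + 1 + 4 + 9)
```

This also shows why no new special method is strictly needed: the bracketed expression simply arrives as the single argument to `__getitem__`, just as a tuple of slices does.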
So is the collapsing of the arguments into a single iterator, rather than a separate pair of underlying iterator and exp callable to be applied to each item -- this requires changing the Top(10) use case to pass both sort-key and item explicitly: Top(10)[ (humor(joke), joke) for joke in jokes ] with Top having semantics roughly equivalent to (though no doubt easily optimized -- by using a heap -- wrt):

def Top(N):
    class topper(object):
        def __getitem__(self, iter):
            values_and_items = list(iter)
            values_and_items.sort()
            return [ item for value, item in values_and_items[:N] ]
    return topper()

But this may in fact be preferable wrt both my and Peter Norvig's previous ideas as posted on this thread. > So, no, sorry, it doesn't overcome my objection! What about this latest small change, of having the indexing syntax invoke __getitem__ -- just like any other indexing, just with an iterator as the index rather than (e.g.) a tuple of slices etc? What, if anything, is "very confusing" in, e.g., sum[x*x for x in blaap] compared with e.g. the currently accepted: a['tanto':'va':'la', 'gatta':'al':'lardo'] ? Alex From Paul.Moore at atosorigin.com Fri Oct 17 05:47:07 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Fri Oct 17 05:47:52 2003 Subject: [Python-Dev] buildin vs. shared modules Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> From: Gregory P. Smith [mailto:greg@electricrain.com] > I don't see any good reason to want fewer .pyd files and a > monolithic main DLL. Agreed. The arguments on both sides seem weak, so I'd prefer to leave things as they are. My own (weak) argument against a monolithic DLL is that when packaging a standalone distribution (Installer, py2exe, cx_Freeze or whatever) it reduces the distribution size to omit unneeded DLLs. In particular, _tkinter, pyexpat, _bsddb and _ssl are over 100k each.
Maybe only the DLLs which are necessary for Python to start should be built in (eg, zlib for zipfile support, _sre seems impossible to avoid, others I don't know - _winreg?) But as I said, I see no arguments which aren't weak, so why change? Paul From paoloinvernizzi at dmsware.com Fri Oct 17 07:03:00 2003 From: paoloinvernizzi at dmsware.com (Paolo Invernizzi) Date: Fri Oct 17 08:10:37 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> Message-ID: Moore, Paul wrote: > Maybe only the DLLs which are necessary for Python to start should > be built in (eg, zlib for zipfile support, _sre seems impossible to > avoid, others I don't know - _winreg?) _winreg is only 36k and the most valuable use I think is that it is used by distutils for searching for the VC compiler, but I think it can stay out... But I agree for zlib and _sre. With only the core DLL and a zip of necessary modules (os module stuff and so on) you can start a minimal python and import whatever other zip of modules you need... Python DLL is actually 933k, zlib is 61k and _sre is 57k... so it will be around 1050k... --- Paolo Invernizzi > > But as I said, I see no arguments which aren't weak, so why change? > Paul From arigo at tunes.org Fri Oct 17 08:54:29 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 17 08:58:22 2003 Subject: [Python-Dev] Trashing recursive objects comparison? Message-ID: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> Hello all, I'm bringing (again) the subject of comparison of recursive objects to the table because I just happened to write a buggy piece of code:

class X:
    def __eq__(self, other):
        return self.content == other

This code was buggy because 'self.content' could occasionally be 'self'.
In this case it should have triggered an infinite recursion, and I should have got a nice (if a bit long) RuntimeError traceback that told me where the problem was. At least, this is how I would expect my piece of code to misbehave. Instead, the answer was 'True', whatever 'other' actually was. Puzzlement would have gained me if I had no idea about what a bisimulation, or graph isomorphism, is, and what Python's implementation of that idea is. Quoting Tim on bug #625698: > As Erik's latest example shows, the outcome isn't always > particularly well defined either. An alternative to speeding > this > silliness is to raise an exception instead when recursive > objects are detected. There was some hack value in doing > the graph isomorphism bit, but no real practical value I can > see. If the pretty academic subject of graph isomorphisms is well-worn enough to be sent to the trash, I'll submit a patch that just removes all this code and instead uses the existing sys.recursionlimit counter to catch infinite recursions and throw the usual RuntimeError. Armin From andymac at bullseye.apana.org.au Fri Oct 17 08:33:16 2003 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Fri Oct 17 09:15:08 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <20031017223015.N64463@bullseye.apana.org.au> On Fri, 17 Oct 2003, Martin v. Löwis wrote: > Gustavo Niemeyer writes: > > > I'd like to get back to the SRE recursion issue (#757624). Is this > > a good time to commit the patch? > > It would be good if you could find somebody who reviews the > patch. However, if nobody volunteers to review, please go ahead - it > might well be that you are the last active SRE maintainer left on this > planet ... Because of the stack recursion issue on FreeBSD (in the presence of threads), I tested several of Gustavo's patches. I didn't scrutinise them for style though...
+1 on getting the patch in early in the 2.4 cycle. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From guido at python.org Fri Oct 17 10:41:04 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:41:14 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 09:53:55 +0200." <200310170953.55170.aleaxit@yahoo.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> Message-ID: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> I'd just like to pipe into this discussion saying that while Peter Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main reason being that the proposed notation doesn't return a list. I agree that having generator comprehensions would be a more general solution. I don't have a proposal for generator comprehension syntax though, and [yield ...] has the same problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 10:46:31 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:46:38 2003 Subject: [Python-Dev] Trashing recursive objects comparison? In-Reply-To: Your message of "Fri, 17 Oct 2003 13:54:29 BST." <20031017125429.GA25854@vicky.ecs.soton.ac.uk> References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> Message-ID: <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com> > I'm bringing (again) the subject of comparison of recursive objects to the > table because I just happened to write a buggy piece of code: > > class X: > def __eq__(self, other): > return self.content == other > > This code was buggy because 'self.content' could occasionally be 'self'.
In > this case it should have triggered an infinite recursion, and I should have > got a nice (if a bit long) RuntimeError traceback that told me where the > problem was. At least, this is how I would expect my piece of code to > misbehave. > > Instead, the answer was 'True', whatever 'other' actually was. Puzzlement > would have gained me if I had no idea about what a bisimulation, or graph > isomorphism, is, and what Python's implementation of that idea is. > > Quoting Tim on bug #625698: > > As Erik's latest example shows, the outcome isn't always > > particularly well defined either. An alternative to speeding > > this > > silliness is to raise an exception instead when recursive > > objects are detected. There was some hack value in doing > > the graph isomorphism bit, but no real practical value I can > > see. > > If the pretty academic subject of graph isomorphisms is well-worn > enough to be sent to the trash, I'll submit a patch that just > removes all this code and instead use the existing > sys.recursionlimit counter to catch infinite recursions and throw > the usual RuntimeError. Go for it, Armin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 10:56:38 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:56:42 2003 Subject: [Python-Dev] sort() return value Message-ID: <200310171456.h9HEuc606316@12-236-54-216.client.attbi.com> I'd like to explain once more why I'm so adamant that sort() shouldn't return 'self'. This comes from a coding style (popular in various other languages, I believe especially Lisp revels in it) where a series of side effects on a single object can be chained like this: x.compress().chop(y).sort(z) which would be the same as x.compress() x.chop(y) x.sort(z) I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. 
The second form makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else. I'd like to reserve chaining for operations that return new values, like string processing operations: y = x.rstrip("\n").lower().split(":") There are a few standard library modules that encourage chaining of side-effect calls (pstat comes to mind). There shouldn't be any new ones; pstat slipped through my filter when it was weak. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Oct 17 11:54:49 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 11:54:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> At 07:41 AM 10/17/03 -0700, Guido van Rossum wrote: >I'd just like to pipe into this discussion saying that while Peter >Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main >reason that the proposed notation doesn't return a list. I agree that >having generator comprehensions would be a more general solution. I >don't have a proposal for generator comprehension syntax though, and >[yield ...] has the same problem. (yield x*2 for x in foo) or maybe: (yield: x*2 for x in foo) would "yield" better visibility that this is a value that *does* something (like lambda). Or perhaps without the parentheses, but I think they're better for clarity, and I'd add them in practice even if they weren't required.
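For reference, the inline-generator idea sketched here is essentially what Python later adopted in 2.4 as generator expressions (PEP 289): parenthesized, but without the `yield` keyword Phillip suggests:

```python
# a generator expression: creates a lazy generator, computing nothing yet
gen = (x * 2 for x in range(4))
assert list(gen) == [0, 2, 4, 6]

# the parentheses may be dropped when the expression is the sole
# argument of a call -- the accumulator-display use case
assert sum(x * 2 for x in range(4)) == 12
```

As predicted in the thread, accumulator functions written for lists "just work" when handed a generator instead.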
The main problem with a gencomp syntax is that some people are going to use it for everything whether they need it or not, even when they have a small list and the frame overhead for the generator is going to make it slower. So it almost wants to be a really awkward ugly thing in order to discourage them... but then again, that way lies Ruby. :) From pje at telecommunity.com Fri Oct 17 12:03:41 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 12:03:43 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310170953.55170.aleaxit@yahoo.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Message-ID: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> At 09:53 AM 10/17/03 +0200, Alex Martelli wrote: >What about this latest small change, of having the indexing syntax >invoke __getitem__ -- just like any other indexing, just with an >iterator as the index rather than (e.g.) a tuple of slices etc? > >What, if anything, is "very confusing" in, e.g., > > sum[x*x for x in blaap] > >compared with e.g. the currently accepted: > > a['tanto':'va':'la', 'gatta':'al':'lardo'] > >? Because it's arguably bad coding style to use slices or indexes on an object in order to perform a function on the indexes supplied. Wouldn't you find a program where this held true: TimesTwo[2] == 4 to be in bad style? Function calls are for transforming arguments, indexing is for accessing the contents of a *container*. Top(10) is not a container, it has nothing in it, and neither does TimesTwo. I suppose you could argue that TimesTwo is a conceptual infinite sequence of even integers, but for most of the proposed accumulators, similar arguments would be a *big* stretch. Yes, what you propose is certainly *possible*. But again, if you really needed an iterator as an index, you can right now do: sum[ [x*x for x in blaap] ] And if there are gencomps, you could do the same. 
So, why single out subscripting for special consideration with regard to generator comprehensions, thus forcing clever tricks of questionable style in order to do what ought to be function calls? I shudder to think of trying to have to explain Top(10)[...] to a Python newbie, even if they're an experienced programmer. Because Top(10) isn't a *container*. I suppose a C++ veteran might consider it an ugly operator overloading hack... and they'd be right. Top(10,[...]) on the other hand, is crystal clear to anybody that gets the idea of function calls. From guido at python.org Fri Oct 17 12:10:45 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 12:10:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 11:54:49 EDT." <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> Message-ID: <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> > (yield x*2 for x in foo) > > or maybe: > > (yield: x*2 for x in foo) > > would "yield" better visibility that this is a value that *does* > something (like lambda). Or perhaps without the parentheses, but I > think they're better for clarity, and I'd add them in practice even > if they weren't required. Both look decent to me, and in fact the first is what I was thinking of this morning in the shower. :-) > The main problem with a gencomp syntax is that some people are going > to use it for everything whether they need it or not, even when they > have a small list and the frame overhead for the generator is going > to make it slower. So it almost wants to be a really awkward ugly > thing in order to discourage them... but then again, that way lies > Ruby. :) Actually, that's also Python's philosophy, if you turn it around: only things that can be done efficiently should look cute... 
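Eby's preferred Top(10, [...]) spelling, a function that simply takes the iterable as an argument, is easy to sketch; the name `top` and its use of `heapq.nlargest` are illustrative assumptions, not anything proposed in the thread:

```python
import heapq

def top(n, iterable):
    # Plain function-call spelling -- Top(10, [...]) rather than Top(10)[...]:
    # the iterable is an ordinary argument, no __getitem__ tricks involved.
    return heapq.nlargest(n, iterable)

print(top(3, [5, 1, 9, 7, 3]))    # -> [9, 7, 5]
```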
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Fri Oct 17 12:12:45 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:12:56 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <16272.5373.514560.225999@montanaro.dyndns.org> Greg> What you're proposing hijacks the indexing syntax and uses it to Greg> mean something completely different from indexing, which is a much Greg> bigger change, and potentially a very confusing one. Greg> So, no, sorry, it doesn't overcome my objection! I agree. Any expression bracketed by '[' and ']', no matter how many other clues to the ultimate result it might contain, ought to result in a list as far as I'm concerned. Skip From skip at pobox.com Fri Oct 17 12:13:37 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:13:56 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: <16272.5425.470101.367084@montanaro.dyndns.org> >> If anything at all, i'd suggest a std-module which contains e.g. >> 'sort', 'reverse' and 'extend' functions which always return >> a new list, so that you could write: >> >> for i in reverse(somelist): >> ... Gustavo> You can do reverse with [::-1] now. I don't think that is considered "stable" in the sorting sense. If I sort in descending order vs ascending order, they are not mere reversals of each other. I may well still want adjacent records whose sort keys are identical to remain in the same order. What will the new reverse=True keyword arg do? 
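Skip's stability concern can be made concrete: a stable descending sort keeps records with equal keys in their original order, which a mere reversal of an ascending sort does not. The records below are hypothetical, and `reverse=True` here shows the behavior list.sort/sorted adopted, which preserves stability:

```python
records = [("alice", 2), ("bob", 1), ("carol", 2)]

# A stable descending sort keeps equal keys in their original order:
by_key_desc = sorted(records, key=lambda r: r[1], reverse=True)
# -> [('alice', 2), ('carol', 2), ('bob', 1)]

# Merely reversing an ascending sort flips the order of equal keys:
reversed_asc = list(reversed(sorted(records, key=lambda r: r[1])))
# -> [('carol', 2), ('alice', 2), ('bob', 1)]
```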
Skip From aleaxit at yahoo.com Fri Oct 17 12:21:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 12:21:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.5373.514560.225999@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> Message-ID: <200310171821.39895.aleaxit@yahoo.com> On Friday 17 October 2003 06:12 pm, Skip Montanaro wrote: > Greg> What you're proposing hijacks the indexing syntax and uses it to > Greg> mean something completely different from indexing, which is a > much Greg> bigger change, and potentially a very confusing one. > > Greg> So, no, sorry, it doesn't overcome my objection! > > I agree. Any expression bracketed by '[' and ']', no matter how many other > clues to the ultimate result it might contain, ought to result in a list as > far as I'm concerned. Hmmm, how is, e.g. foo[x*x for x in bar] any more an "expression bracketed by [ and ]" than, say, foo = {'wot': 'tow'} foo['wot'] ...? Yet the latter doesn't involve any lists that I can think of. Nor do I see why the former need "mean something completely different from indexing" -- it means to call foo's __getitem__ with the appropriately constructed object, just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] today calls it with a tuple of two weird slice objects (and doesn't happen to involve any lists whatsoever). Alex From skip at pobox.com Fri Oct 17 12:38:07 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:38:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171821.39895.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> Message-ID: <16272.6895.233187.510629@montanaro.dyndns.org> >> I agree. Any expression bracketed by '[' and ']', no matter how many >> other clues to the ultimate result it might contain, ought to result >> in a list as far as I'm concerned. Alex> Hmmm, how is, e.g. 
Alex> foo[x*x for x in bar] Alex> any more an "expression bracketed by [ and ]" than, say, Alex> foo = {'wot': 'tow'} Alex> foo['wot'] Alex> ...? When I said "expression bracketed by '[' and ']'" I agree I was thinking of list construction sorts of things like: foo = ['wot'] not indexing sorts of things like: foo['wot'] I'm not in a mood to try and explain anything in more precise terms this morning (for other reasons, it's been a piss poor day so far) and must trust your ability to infer my meaning. I have no idea at this point how to interpret foo[x*x for x in bar] That looks like a syntax error to me. You have what is probably an identifier followed by a list comprehension. Here's a slightly more precise term: If a '['...']' construct exists in a context where a list constructor would be legal today, it ought to evaluate to a list, not to something else. Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... I have absolutely no idea how to interpret this. Is this existing or proposed Python syntax? Skip From seandavidross at hotmail.com Fri Oct 17 12:43:31 2003 From: seandavidross at hotmail.com (Sean Ross) Date: Fri Oct 17 12:43:36 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: Hi. I've not posted to this group before, but I've been following most of the discussions on it with interest for about 6 months. Yesterday I saw this post: Guido van Rossum wrote: >I don't have a proposal for generator comprehension syntax though, and >[yield ...] has the same problem. I actually like the [yield ...] syntax, (I find the intent clear, and have no expectations of its returning a list) but since that doesn't look like it will be happening, I've tried to think of some other possible syntax. I've come up with 16 different possibilities so far, including [yield ...], which I've listed below.
I'm not advocating any one of them (in fact, many of them are abhorrent), I'm just listing some possibilities, in no particular order other than as they occurred to me:

# (1) no delimiter
sumofsquares = sum(yield x*x for x in myList)

# (2) brackets
sumofsquares = sum([yield x*x for x in myList])

# (3) parentheses
sumofsquares = sum((yield x*x for x in myList))

# (4) braces
sumofsquares = sum({yield x*x for x in myList})

# (5) pipes
sumofsquares = sum(|yield x*x for x in myList|)

# (6) slashes
sumofsquares = sum(/yield x*x for x in myList/)

# (7) carets
sumofsquares = sum(^yield x*x for x in myList^)

# (8) angle brackets
sumofsquares = sum(<yield x*x for x in myList>)

# (9) sigil @
sumofsquares = sum(@yield x*x for x in myList@)

# (10) sigil $
sumofsquares = sum($yield x*x for x in myList$)

# (11) question marks
sumofsquares = sum(?yield x*x for x in myList?)

# (12) ellipses
sumofsquares = sum(...yield x*x for x in myList...)

# (13) yield:
sumofsquares = sum(yield:[x*x for x in myList])

# (14) unpacking (*)
sumofsquares = sum(*[x*x for x in myList])

# (15) <-
sumofsquares = sum(<-[x*x for x in myList])

# (16) ^
sumofsquares = sum(^[x*x for x in myList])

These last few suggestions (from (13) on) may require some explanation. The notion I've had for "yield:" is to have it act something like a lambda so that the list comprehension is not evaluated, i.e., no list is constructed in memory. Instead, an iterator is created that can be used to generate the items, one at a time, that would have been in that list. Something like

def squares(myList):
    for x in myList:
        yield x*x

sumofsquares = sum(squares(myList))

The other suggestions, after (13), are based on this same notion. Okay. So, there are some generator comprehension syntax ideas. Hopefully they will be useful, even if they just serve as items to point to and say "we definitely don't want this". I thank you for your time, and I apologize if these unsolicited suggestions are unwanted.
Sean Ross From aleaxit at yahoo.com Fri Oct 17 12:52:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 12:52:39 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: <200310171852.34515.aleaxit@yahoo.com> On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: ... > Because it's arguably bad coding style to use slices or indexes on an > object in order to perform a function on the indexes supplied. Wouldn't > you find a program where this held true: > > TimesTwo[2] == 4 > > to be in bad style? Function calls are for transforming arguments, > indexing is for accessing the contents of a *container*. Top(10) is not a Yes, I would find _gratuitous_ use of indexing where other means are perfectly adequate to be in bad style. On the other hand, where Python 'wants' me to use indexing for other purposes, I already do: >>> class Eval: ... def __getitem__(self, expr): return eval(expr) ... >>> print '2 + 2 is %(2 + 2)s' % Eval() 2 + 2 is 4 and given we don't have a better way to "interpolate expressions in strings", I don't feel particularly troubled by this, either. > container, it has nothing in it, and neither does TimesTwo. I suppose you > could argue that TimesTwo is a conceptual infinite sequence of even > integers, but for most of the proposed accumulators, similar arguments > would be a *big* stretch. Yes; any pure function is mathematically a mapping, but arguing for general confusion on that score between indexing and function calls would be stretchy indeed, I agree.
Before we had iterators and generators, I did use "indexing as pure function call" to get infinite sequences for use in for loops (to be terminated by break or return when appropriate), but I'm much happier with iterators for this purpose (they keep state, so, having dealt with some prefix of a sequence in a for loop, I still have the sequence's tail intact for possible future processing -- that's often VERY useful to me! -- AND it's often SO much easier to compute "the next item" than it is to compute "the i-th item" for an arbitrary natural i). > Yes, what you propose is certainly *possible*. But again, if you really > needed an iterator as an index, you can right now do: > > sum[ [x*x for x in blaap] ] Actually, I need to use parentheses on the outside and brackets only on the inside -- I assume that's what you meant, of course. If the iterator is finite, and memory consumption not an issue, sure. An infinite iterator would in any case not be suitable for sum (but it _might_ be suitable for other uses, of course). I truly dislike the way foo([...]) _looks_, with those ([ and ]) pairs, but, oh well, not _every_ frequently used construct can look nice, after all. > And if there are gencomps, you could do the same. So, why single out > subscripting for special consideration with regard to generator > comprehensions, thus forcing clever tricks of questionable style in order > to do what ought to be function calls? I guess I let my dislike of ([ ... ]) get away with me:-). If gencomps use your proposed syntax, I'll have no problem whatsoever coding sum((yield: x*x for x in blaap)) particularly since the (( ... )) don't look at all bad;-). 
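The "tail intact" property described above is easy to demonstrate (illustrative values):

```python
it = iter(range(10))

for x in it:
    if x >= 3:     # stop once we've seen 3
        break

# The loop consumed only 0..3; the iterator still holds the rest:
tail = list(it)
print(tail)        # -> [4, 5, 6, 7, 8, 9]
```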
Seriously, what I'm after is the functionality: since the _syntax_ seemed to be the stumbling block, I thought of an alternative syntax that seemed fine to me (not any more of a stretch of the concept of indexing than the Eval class above, or the not-so-long-ago use of __getitem__ to get infinite sequences in for loops) and proposed it. If your (yield: ...) syntax is approved instead, I'll be first in line to cheer:-). > I shudder to think of trying to have to explain Top(10)[...] to a Python > newbie, even if they're an experienced programmer. Because Top(10) isn't a > *container*. I suppose a C++ veteran might consider it an ugly operator > overloading hack... and they'd be right. Top(10,[...]) on the other hand, > is crystal clear to anybody that gets the idea of function calls. I agree it's clearer -- a tad less flexible, as you don't get to do separately selector = Top(10) and then somewhere else selector[...] but "oh well", and anyway the issue would be overcome if we had currying (we could be said to have it, but -- I assume you'd consider selector = Top.__get__(10) some kind of abuse, and besides, this 'currying' isn't very general). Alex From nas-python at python.ca Fri Oct 17 12:57:54 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 17 12:56:59 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> References: <200310170953.55170.aleaxit@yahoo.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> Message-ID: <20031017165754.GA22522@mems-exchange.org> On Fri, Oct 17, 2003 at 09:10:45AM -0700, Guido van Rossum wrote: > > (yield x*2 for x in foo) > > > > or maybe: > > > > (yield: x*2 for x in foo) > > Both look decent to me, and in fact the first is what I was thinking > of this morning in the shower. 
:-) So would you write: sum(yield: x*2 for x in foo) or sum((yield: x*2 for x in foo)) At the moment I like the latter better. Neil From aleaxit at yahoo.com Fri Oct 17 13:03:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 13:03:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.6895.233187.510629@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> Message-ID: <200310171903.42578.aleaxit@yahoo.com> On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: ... > Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... > > I have absolutely no idea how to interpret this. Is this existing or > proposed Python syntax? Perfectly valid and current existing Python syntax: >>> class F(object): ... def __getitem__(self, x): return x ... >>> foo=F() >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4))) Not particularly _sensible_, mind you, and I hope nobody's yet written any container that IS to be indexed by such tuples of slices of multifarious nature. But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). Alex From eppstein at ics.uci.edu Fri Oct 17 13:15:10 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 13:15:14 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> Message-ID: In article <16272.6895.233187.510629@montanaro.dyndns.org>, Skip Montanaro wrote: > I'm not in a mood to try and explain anything in more precise terms this > morning (for other reasons, it's been a piss poor day so far) and must trust > your ability to infer my meaning. 
I have no idea at this point how to > interpret > > foo[x*x for x in bar] > > That looks like a syntax error to me. You have a probably identifier > followed by a list comprehension. foo[ anything ] does not look like an identifier followed by a list, it looks like an indexing operation. So I would interpret foo[x*x for x in bar] to equal foo.__getitem__(i) where i is an iterator of x*x for x in bar. In particular if iter.__getitem__ works appropriately, then iter[x*x for x in bar] could be a generator comprehension and iter[1:n] could be an xrange. Similarly sum and max could be given appropriate __getitem__ methods. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From guido at python.org Fri Oct 17 13:15:21 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 13:16:40 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 19:03:42 +0200." <200310171903.42578.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> > But, indexing does stretch quite > far in the current Python syntax and semantics (in Python's > *pragmatics* you're supposed to use it far more restrainedly). Which is why I didn't like the 'sum[x for x in S]' notation much. Let's look for an in-line generator notation instead. I like sum((yield x for x in S)) but perhaps we can make this work: sum(x for x in S) (Somebody posted a whole bunch of alternatives that were mostly picking random delimiters; it didn't look like the right approach.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Oct 17 13:42:59 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 13:43:09 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Thu, 16 Oct 2003 13:19:47 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> So, would a patch be accepted (for 2.4, I assume there is no way for >> 2.3.3) which made everything builtin except for the following modules: >> >> _testcapi - not used outside the testsuite >> _tkinter - needs external stuff anyway >> pyexpat - may be replaced by a third party module >> _ssl - needs Python to be built > > I'd rather see an explicit list of the "everything" that you want to > bundle into the main DLL. Here is the list of Python 2.3 extension modules, in decreasing order of my preference to be converted into a builtin. Needed to start Python - should be builtin: zlib _sre Used by myself every day - would like them to be builtin: _socket _winreg mmap select I'm undecided on these modules, I do not use them now but may in the future - so I'm undecided: _csv winsound datetime bz2 These should remain in separate pyd files for various reasons: _tkinter _bsddb _testcapi pyexpat Don't know what these do, so I cannot really comment: _symtable parser unicodedata And while we're at it, I have looked at sys.builtin_module_names (again, from Python 2.3), and wondered if there aren't too many. I have *never* used any of these (xxsubtype is only a source code example, isn't it): audioop imageop rgbimg xxsubtype and I guess some of these could also be moved out of python.dll (rotor is even deprecated): _hotshot cmath rotor sha md5 xreadlines ---- There may be incompatibilities - that's why I asked about 2.3.3 or 2.4.
The biggest problem would probably be that you would have to download additional sources - zlib is one example. Who cares about the python.dll file getting larger? As Martin explained, this shouldn't increase memory usage, and since zlib and _sre are loaded anyway at Python startup, the startup time should decrease IMO. Let me conclude that I have no pressing need for changing this, but the decision whether an extension module is builtin or in a dll should follow a certain pattern. To reduce the number of files py2exe (or installer) produces, the best way would be to build custom python dlls containing the most popular extensions as builtins. Of course this can be done by everyone owning a C compiler and a text editor. And my own version would certainly include _ctypes ;-) Thomas From pje at telecommunity.com Fri Oct 17 13:53:56 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 13:53:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171852.34515.aleaxit@yahoo.com> References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: >On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: > > Yes, what you propose is certainly *possible*. But again, if you really > > needed an iterator as an index, you can right now do: > > > > sum[ [x*x for x in blaap] ] > >Actually, I need to use parentheses on the outside and brackets only >on the inside -- I assume that's what you meant, of course. No, I meant what I said, which was that if you "really needed an iterator as an *index*" (emphasis added). I suppose I technically should have said, if you really want to provide an *iterable*, since a list is not an iterator. But I figured you'd know what I meant.
:) >I agree it's clearer -- a tad less flexible, as you don't get to do separately > selector = Top(10) >and then somewhere else > selector[...] >but "oh well", and anyway the issue would be overcome if we had currying >(we could be said to have it, but -- I assume you'd consider > selector = Top.__get__(10) >some kind of abuse, and besides, this 'currying' isn't very general). Hmmm... that's a hideously sick hack to perform currying... but I *like* it. :) Not to use inline, of course, I'd wrap it in a 'curry' function. But what a lovely way to *implement* it, under the hood. Of course, I'd actually use 'new.instancemethod', since it would do the same thing for any callable, not just functions. But I never thought of using method objects for providing a currying operation (in the general sense) before, even though I've sometimes used them as part of a framework to pass along extra operators to chained functions. From shane.holloway at ieee.org Fri Oct 17 13:55:53 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 13:56:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <3F902D29.2060109@ieee.org> Guido van Rossum wrote: > but perhaps we can make this work: > > sum(x for x in S) Being able to use generator comprehensions as an expression would be useful. In that case, I assume the following would be possible as well: mygenerator = x for x in S for y in x for x in S: print y return x for x in S Thanks, -Shane Holloway From theller at python.net Fri Oct 17 13:56:48 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 13:56:57 2003 Subject: [Python-Dev] buildin vs.
shared modules In-Reply-To: (David LeBlanc's message of "Thu, 16 Oct 2003 23:55:31 -0700") References: Message-ID: "David LeBlanc" writes: > A few things come to mind: > > What's the cost of mapping the world (all those entry points) at startup? > > You have to rebuild all of the main dll just to do something to one > component. To me, that's maybe the biggest single issue. Hm. How often do you hack the C code of the extension modules included with Python? > Are app users/programmers going to have a bloat perception? How many of them > really understand that a dll is mapped and not loaded at startup? > > IMO, it contradicts the unix way of smaller, compartmentalized is better. > It's not unix we're talking about, but it still makes sense to me, whatever > the OS. Maybe unix solves all this, but on Windows it's called DLL Hell. > On the plus side, it does make some debugging easier if you're working on > extension dlls: fewer sources to have to point Vis Studio at. That's never been a problem for me. It always finds the sources itself, at least for extensions built with distutils (because distutils in debug builds passes absolute pathnames to the compiler). > On a related side note: has anyone done any investigation to determine which > few percentage of the extensions account for 99% of the dll loads? Maybe > there's no such pattern, but experience suggests there probably is and that > subset might be a better candidate than the whole world. That might be. 
Thomas From theller at python.net Fri Oct 17 14:02:21 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 14:02:31 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> (Barry Scott's message of "Thu, 16 Oct 2003 22:25:51 +0100") References: <200310160515.h9G5FUqc025443@localhost.localdomain> <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> Message-ID: Barry Scott writes: > You said you are using the SP5 DLLs. They are old... > > We use the ones from vc6redist.exe from microsoft they have fixes that > you may need. Its also the versions that you will encounter on XP > systems I believe. Well, isn't SP5 the latest service pack available for Visual Studio 6.0? I took it from the Oct 2003 MSDN shipment. > So long as you have the version checking done right in the installer > you will not rewind a DLL backwards. The problem in this case was not the installer doing things wrong, the fault was alone on my side: I did use the dlls from my WinXP system directory, and the installer correctly used them to replace the versions on the target computers. If this was a win2k system, the file protection reverted this change, and the users were lucky again (except they had an entry in the event log). Unfortunately win98 and NT4 users were not so happy, for them it broke the system. Thomas From pje at telecommunity.com Fri Oct 17 14:04:52 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 14:04:57 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote: > > But, indexing does stretch quite > > far in the current Python syntax and semantics (in Python's > > *pragmatics* you're supposed to use it far more restrainedly). > >Which is why I didn't like the 'sum[x for x in S]' notation much. >Let's look for an in-line generator notation instead. I like > > sum((yield x for x in S)) > >but perhaps we can make this work: > > sum(x for x in S) Offhand, it seems like the grammar might be rather tricky, but it actually does seem more Pythonic than the "yield" syntax, and it retroactively makes listcomps shorthand for 'list(x for x in s)'. However, if gencomps use this syntax, then what does: for x in y*2 for y in z if y<20: ... mean? ;) It's a little clearer with parentheses, of course, so perhaps they should be required: for x in (y*2 for y in z if y<20): ... It would be more efficient to code that stuff inline in the loop, if the gencomp creates another frame, but it *looks* more efficient to put it in the for statement. But maybe I worry too much, since you could slap a listcomp in a for loop now, and I've never even thought of doing so. From guido at python.org Fri Oct 17 14:04:53 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:06:34 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 19:42:59 +0200." 
References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> > >> So, would a patch be accepted (for 2.4, I assume there is no way for > >> 2.3.3) which made everything builtin except for the following modules: > >> > >> _testcapi - not used outside the testsuite > >> _tkinter - needs external stuff anyway > >> pyexpat - may be replaced by a third party module > >> _ssl - needs Python to be built > > > > I'd rather see an explicit list of the "everything" that you want to > > bundle into the main DLL. > > Here is the list of Python 2.3 extension modules, in decreasing order of > my preference to be converted into a builtin. > > Needed to start Python - should be builtin: > > zlib _sre +1 for _sre. I'd be +1 for zlib, but see bz2 below for a quibble. (How important is this *really* for bootstrap reasons?) > Used by myself every day - would like them to be builtin: > > _socket _winreg mmap select +1 on _winreg and mmap (they're small enough). Long ago, when I first set up the VC5 project, there were still some target systems out there that didn't have a working winsock DLL, and "import socket" or "import select" would fail there for that reason. If this is no longer a problem, I'm +1 on this. > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 I'm -1 on bz2; I think bz2 requires a 3rd party external library; for developers building their own Python who don't want to bother with that, it's much easier to ignore a DLL that can't be built than to have to cut a module out of the core DLL. The same argument applies to zlib -- but I could be swayed by the counterargument that zlib is needed for zipimport bootstrap purposes. (Though is it? you can create zip files without using compression.)
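The parenthetical is correct: the zipfile module can write archives with no compression at all, so an uncompressed bootstrap archive would not need zlib. A minimal sketch (in-memory archive; the file name and contents are illustrative):

```python
import io
import zipfile

buf = io.BytesIO()
# ZIP_STORED stores members uncompressed -- no zlib involved.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("bootstrap.py", "MAGIC = 42\n")

# Read the member back to show the archive is a valid zip file.
with zipfile.ZipFile(buf) as zf:
    data = zf.read("bootstrap.py")

print(data)    # -> b'MAGIC = 42\n'
```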
> These should remain in separate pyd files for various reasons: > > _tkinter _bsddb _testcapi pyexpat Agreed. > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata _symtable is tiny and can be included. parser is huge but has no external deps; if MvL's argument is correct, the DLL size increase doesn't translate into a memory usage increase, so I'd be +1 on including it; ditto for unicodedata. > And while we're at it, I have looked at sys.builtin_module_names (again, > from Python 2.3), and wondered if there aren't too many. > > I have *never* used any of these (xxsubtype is only a source code > example, isn't it): > > audioop imageop rgbimg xxsubtype They could all be moved out, but why bother? (xxsubtype is just a source code sample module, there's no need to enable it in distributions, but it doesn't hurt anybody either I think!) > and I guess some of these could also be moved out of python.dll (rotor > is even deprecated): > > _hotshot cmath rotor sha md5 xreadlines Ditto. None of these are big. xreadlines should also be deprecated. But let it stay in the DLL until we stop distributing it (again, assuming MvL's argument about memory usage is valid). > ---- > There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. I wouldn't mess with 2.3.3. > The biggest problem would probably be that you would have to download > additional sources - zlib is one example. Right. > Who cares about the python.dll file getting larger? As Martin explained, > this shouldn't increase memory usage, and since zlib and _sre are loaded > anyway at Python startup, the startup time should decrease IMO. Right. > Let me conclude that I have no pressing need for changing this, but the > decision whether an extension module is builtin or in a dll should > follow a certain pattern. "Historical precedent" is a pattern too.
:-) > To reduce the number of files py2exe (or installer) produces the best > way would be to build custom python dlls containing the most popular > extensions as builtins. Of course this can be done by everyone owning a > C compiler and a text editor. And my own version would certainly > include _ctypes ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Fri Oct 17 14:06:28 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 17 14:06:40 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz><200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: "Phillip J. Eby" wrote in message news:5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com... > At 09:53 AM 10/17/03 +0200, Alex Martelli wrote: > >What, if anything, is "very confusing" in, e.g., > > sum[x*x for x in blaap] To me, it both *looks* a lot like a Lisp macro ... > >compared with e.g. the currently accepted: > > a['tanto':'va':'la', 'gatta':'al':'lardo'] (this does use ':' and ',', at least) > Because it's arguably bad coding style to use slices or indexes on an > object in order to perform a function on the indexes supplied. and acts like a Lisp macro in plugging code pieces into a template that leads to surprising behavior, given the original form. > I shudder to think of trying to have to explain Top(10)[...] to a Python > newbie, even if they're an experienced programmer. Ditto. Getting the reductive functionality thru a gencomp would be better. Terry J. Reedy From guido at python.org Fri Oct 17 14:08:09 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:08:23 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 19:56:48 +0200." 
References: Message-ID: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> > "David LeBlanc" writes: > > > A few things come to mind: > > > > What's the cost of mapping the world (all those entry points) at startup? > > > > You have to rebuild all of the main dll just to do something to one > > component. To me, that's maybe the biggest single issue. [Thomas Heller] > Hm. How often do you hack the C code of the extension modules included > with Python? There's a small but important group of people who rebuild Python from source with different compiler options (perhaps to enable debugging their own extensions). They often don't want to have to bother with downloading external software that they don't use (like bz2 or bsddb). > > Are app users/programmers going to have a bloat perception? How > > many of them really understand that a dll is mapped and not loaded > > at startup? > > > > IMO, it contradicts the unix way of smaller, compartmentalized is better. > > It's not unix we're talking about, but it still makes sense to me, whatever > > the OS. > > Maybe unix solves all this, but on Windows it's called DLL Hell. It's not DLL hell unless there are version issues. I don't think multiple extension modules contribute to that (they aren't in the general Windows DLL search path anyway, only pythonXY.dll is, for the benefit of Mark Hammond's COM support in win32all). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 14:09:03 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:09:10 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 11:55:53 MDT."
<3F902D29.2060109@ieee.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> Message-ID: <200310171809.h9HI93U06856@12-236-54-216.client.attbi.com> > Guido van Rossum wrote: > > but perhaps we can make this work: > > > > sum(x for x in S) [Shane Holloway] > Being able to use generator comprehensions as an expression would be > useful. In that case, I assume the following would be possible as well: > > mygenerator = x for x in S > > for y in x for x in S: > print y > > return x for x in S You'd probably have to add extra parentheses around (x for x in S) to help the poor parser (and the human reader). --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Fri Oct 17 14:17:50 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 17 14:17:56 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org><200310171821.39895.aleaxit@yahoo.com><16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: "Alex Martelli" wrote in message news:200310171903.42578.aleaxit@yahoo.com... > On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: > ... > > Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... > > > > I have absolutely no idea how to interpret this. Is this existing or > > proposed Python syntax? > > Perfectly valid and current existing Python syntax: > > >>> class F(object): > ... def __getitem__(self, x): return x > ... > >>> foo=F() > >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] > (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4))) > > Not particularly _sensible_, mind you, and I hope nobody's yet
But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). In your commercial programming group, would you accept such a slice usage from another programmer, especially without prior agreement of the group? Or would you want to edit, as you would with 'return x (Guido van Rossum's message of "Fri, 17 Oct 2003 11:04:53 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> Message-ID: <4qy7lnuc.fsf@python.net> >> Needed to start Python - should be builtin: >> >> zlib _sre > > +1 for _sre. > > I'd be +1 for zlib, but see bz2 below for a quibble. (How important > is this *really* for bootstrap reasons?) > [...] > I'm -1 on bz2; I think bz2 requires a 3rd party external library; for > developers building their own Python who don't want to bother with > that, it's much easier to ignore a DLL that can't be built than to > have to cut a module out of the core DLL. > > The same argument applies to zlib -- but I could be swayed by the > counterargument that zlib is needed for zipimport bootstrap purposes. > (Though is it? you can create zip files without using compression.) No, it has nothing to do with zipimport's bootstrap. When zlib is available, you can import from compressed zipfiles, when it's not available, you cannot. (Hopefully Just corrects me if I'm wrong) Of course, uncompressed zipfiles would always work - and they may be preferred because they might be even faster. > Long ago, when I first set up the VC5 project, there were still some > target systems out there that didn't have a working winsock DLL, and > "import socket" or "import select" would fail there for that reason. > If this is no longer a problem, I'm +1 on this. Not on the systems that I work on.
To be double sure, _socket could be rewritten to load the winsock dll dynamically. And maybe this becomes an issue again if IPv6 is compiled in. >> There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. > > I wouldn't mess with 2.3.3. Ok. Thomas From arigo at tunes.org Fri Oct 17 14:28:11 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 17 14:32:05 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F902D29.2060109@ieee.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> Message-ID: <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Hello, On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > mygenerator = x for x in S > > for y in x for x in S: > print y > > return x for x in S Interesting but potentially confusing: we could expect the last one to mean that we are executing 'return' repeatedly, i.e. returning a value more than once, which is not what occurs. Similarly, yield x for x in g() in a generator would be quite close to the syntax discussed some time ago to yield all the values yielded by a sub-generator g, but in your proposal it wouldn't have that meaning: it would only yield a single object, which happens to be an iterator with the same elements as g().
Even with parentheses, and assuming a syntax to yield from a sub-generator for performance reasons, the two syntaxes would be dangerously close: yield x for x in g() # means for x in g(): yield x yield (x for x in g()) # means yield g() Armin From barry at python.org Fri Oct 17 14:35:47 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 14:35:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031017165754.GA22522@mems-exchange.org> References: <200310170953.55170.aleaxit@yahoo.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> <20031017165754.GA22522@mems-exchange.org> Message-ID: <1066415746.18702.131.camel@anthem> On Fri, 2003-10-17 at 12:57, Neil Schemenauer wrote: > sum((yield: x*2 for x in foo)) +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031017/bbb5c200/attachment.bin From shane.holloway at ieee.org Fri Oct 17 14:35:41 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 14:36:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <3F90367D.200@ieee.org> Thomas Heller wrote: > Here is the list of Python 2.3 extension modules, in decreasing order of > my preference to be converted into a builtin. > > Needed to start Python - should be builtin: > > zlib _sre +1 -- would not these speed startup time?
> Used by myself everyday - would like them to be builtin: > > _socket _winreg mmap select +0 -- I use them a lot, but the overhead importing them is definitely acceptable to me. > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 -0 -- Useful modules, but not on my everyday use list. > These should remain in separate pyd files for various reasons: > > _tkinter _bsddb _testcapi pyexpat Definitely agreed. :) > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata Neither do I. Although unicodedata is fairly big. > And while we're at it, I have looked at sys.builtin_module_names (again, > from Python 2.3), and wondered if there aren't too many. > > I have *never* used any of these (xxsubtype is only a source code > example, isn't it): > > audioop imageop rgbimg xxsubtype +1 -- I agree that these would not suffer too badly from being external pyds either. > and I guess some of these could also be moved out of python.dll (rotor > is even deprecated): > > _hotshot cmath rotor sha md5 xreadlines +1 for _hotshot, rotor, and xreadlines -- External would be good. -0 for sha, md5 -- I like these the way they are, but I see your point. -1 for cmath -- complex types are part of the language, and should be builtin, IMO. > ---- > There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. -1 for 2.3.3 or any point release in 2.3 +1 for 2.4 > The biggest problem would probably be that you would have to download > additional sources - zlib is one example. > > Who cares about the python.dll file getting larger? As Martin explained, > this shouldn't increase memory usage, and since zlib and _sre are loaded > anyway at Python startup, the startup time should decrease IMO. Small is beautiful. Fast is good. I don't like the idea of statically linking pyds into python simply because we can. Nor does reducing the number of external files for packagers like py2exe justify it.
I know that pain too, but I don't want python to suffer from too much bloat for that reason. > Let me conclude that I have no pressing need for changing this, but the > decision whether an extension module is builtin or in a dll should > follow a certain pattern. > > To reduce the number of files py2exe (or installer) produces the best > way would be to build custom python dlls containing the most popular > extensions as builtins. Of course this can be done by everyone owning a > C compiler and a text editor. And my own version would certainly > include _ctypes ;-) > > Thomas I love ctypes :) It saves me from doing hard work ;) Thanks for reading :) -Shane Holloway From guido at python.org Fri Oct 17 14:40:53 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:41:06 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 20:29:31 +0200." <4qy7lnuc.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> <4qy7lnuc.fsf@python.net> Message-ID: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> > > The same argument applies to zlib -- but I could be swayed by the > > counterargument that zlib is needed for zipimport bootstrap purposes. > > (Though is it? you can create zip files without using compression.) > > No, it has nothing to do with zipimport's bootstrap. When zlib is > available, you can import from compressed zipfiles, when it's not > available, you cannot. (Hopefully Just corrects me if I'm wrong) > > Of course, uncompressed zipfiles would always work - and they may be > preferred because they might be even faster. Right. Compression should be used to save network bandwidth, but in general, these days, files on disk should be uncompressed.
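Thomas's claim - that zipimport itself does not depend on zlib, only compressed entries do - is easy to check in any later Python, since entries written with ZIP_STORED are read back without any decompression. A minimal self-contained sketch (the module name and greeting string are invented for the demo):

```python
import os
import sys
import tempfile
import zipfile

# Build a zip archive containing a tiny module. ZIP_STORED means the
# entry is left uncompressed, so importing it never touches zlib.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "lib.zip")
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.py", "GREETING = 'hello from a stored zip'\n")

# zipimport handles zip files that appear on sys.path transparently.
sys.path.insert(0, archive)
import hello

assert hello.GREETING == 'hello from a stored zip'
```

This also illustrates Guido's point above: a stored archive still gives you single-file packaging, and compression can be reserved for transport.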
> > Long ago, when I first set up the VC5 project, there were still some > > target systems out there that didn't have a working winsock DLL, and > > "import socket" or "import select" would fail there for that reason. > > If this is no longer a problem, I'm +1 on this. > > Not on the sytems that I work on. To be double sure, _socket could be > rewritten to load the winsock dll dynamically. And maybe this becomes > an issue again if IPv6 is compiled in. I'd rather not have more Windows-specific cruft in the socket and select module source code -- they are bad enough already. Dynamically loading winsock probably would mean that ever call into it has to be coded differently, right? --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Oct 17 14:42:42 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 14:42:51 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Fri, 17 Oct 2003 11:08:09 -0700") References: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> "David LeBlanc" writes: >> >> > A few things come to mind: >> > >> > What's the cost of mapping the world (all those entry points) at startup? >> > >> > You have to rebuild all of the main dll just to do something to one >> > component. To me, that's maybe the biggest single issue. > > [Thomas Heller] >> Hm. How often do you hack the C code of the extension modules included >> with Python? > > There's a small but important group of people who rebuild Python from > source with different compiler options (perhaps to enable debugging > their own extensions). They often don't want to have to bother with > downloading external software that they don't use (like bz2 or bsddb). Well, couldn't there be a mechanism which allows to switch easily between builtin/external? 
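For Unix builds such a mechanism already exists: CPython's Modules/Setup file lets you flip a module between static and shared by moving its line relative to the *shared* marker. A sketch of such a fragment (module lines and linker flags are illustrative and vary by platform; Windows builds instead configure this in the MSVC project files):

```text
# Modules/Setup fragment (illustrative).
# Lines before *shared* are linked statically into the interpreter;
# lines after it are built as separate shared objects.
zlib zlibmodule.c -lz

*shared*
_testcapi _testcapimodule.c
```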
>> > Are app users/programmers going to have a bloat perception? How >> > many of them really understand that a dll is mapped and not loaded >> > at startup? >> > >> > IMO, it contradicts the unix way of smaller, compartmentalized is better. >> > It's not unix we're talking about, but it still makes sense to me, whatever >> > the OS. >> >> Maybe unix solves all this, but on Windows it's called DLL Hell. > > It's not DLL hell unless there are version issues. > I don't think multiple extension modules contribute to that (they > aern't in the general Windows DLL search path anyway, only > pythonXY.dll is, for the benefit of Mark Hammond's COM support in > win32all). I tried to be funny but obviously failed ;-) Although it smells a little bit like DLL hell. Thomas From guido at python.org Fri Oct 17 14:46:34 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:47:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 19:28:11 BST." <20031017182811.GA28889@vicky.ecs.soton.ac.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Message-ID: <200310171846.h9HIkYY06961@12-236-54-216.client.attbi.com> > On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > > mygenerator = x for x in S > > > > for y in x for x in S: > > print y > > > > return x for x in S > > Interesting but potentially confusing: we could expect the last one > to mean that we executing 'return' repeatedly, i.e. returning a > value more than once, which is not what occurs. 
I'm not sure what you mean by executing 'return' repeatedly; the closest thing in Python is returning a sequence, and this is pretty close (for many practical purposes, returning an iterator is just as good as returning a sequence). > Similarily, > > yield x for x in g() > > in a generator would be quite close to the syntax discussed some > time ago to yield all the values yielded by a sub-generator g, but > in your proposal it wouldn't have that meaning: it would only yield > a single object, which happens to be an iterator with the same > elements as g(). IMO this is not at all similar to what it suggests for return, as executing 'yield' multiple times *is* a defined thing. This is why I'd prefer to require extra parentheses; yield (x for x in g()) is pretty clear about how many times yield is executed. > Even with parenthesis, and assuming a syntax to yield from a > sub-generator for performance reason, the two syntaxes would be > dangerously close: > > yield x for x in g() # means for x in g(): yield x > yield (x for x in g()) # means yield g() I don't see why we need yield x for x in g() when we can already write for x in g(): yield x This would be a clear case of "more than one way to do it". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 14:47:42 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:48:02 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 20:42:42 +0200." References: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> Message-ID: <200310171847.h9HIlgL06985@12-236-54-216.client.attbi.com> > > There's a small but important group of people who rebuild Python from > > source with different compiler options (perhaps to enable debugging > > their own extensions). They often don't want to have to bother with > > downloading external software that they don't use (like bz2 or bsddb). 
> > Well, couldn't there be a mechanism which allows one to switch easily > between builtin/external? Of course there *could*, but why bother? What we have works just as well IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From shane.holloway at ieee.org Fri Oct 17 14:50:34 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 14:51:20 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031017182811.GA28889@vicky.ecs.soton.ac.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Message-ID: <3F9039FA.8070608@ieee.org> Armin Rigo wrote: > Hello, > > On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > >> mygenerator = x for x in S >> >> for y in x for x in S: >> print y >> >> return x for x in S > > > Interesting but potentially confusing: we could expect the last one to mean > that we are executing 'return' repeatedly, i.e. returning a value more than once, > which is not what occurs. Similarly, > > yield x for x in g() > > in a generator would be quite close to the syntax discussed some time ago to > yield all the values yielded by a sub-generator g, but in your proposal it > wouldn't have that meaning: it would only yield a single object, which happens > to be an iterator with the same elements as g(). Yes, this is one of the things I was trying to get at -- If gencomps are expressions, then they must be expressions everywhere, or my poor brain will explode. As for the subgenerator "unrolling", I think there has to be something added to the yield statement to accomplish it -- because it is also useful to yield a generator itself and not have it unrolled.
My favorite was "yield *S" for that discussion... -Shane Holloway From skip at pobox.com Fri Oct 17 14:57:46 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 14:57:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <16272.15274.781344.230479@montanaro.dyndns.org> >> But, indexing does stretch quite far in the current Python syntax and >> semantics (in Python's *pragmatics* you're supposed to use it far >> more restrainedly). Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. Guido> Let's look for an in-line generator notation instead. I like Guido> sum((yield x for x in S)) Guido> but perhaps we can make this work: Guido> sum(x for x in S) Forgive my extreme density on this matter, but I don't understand what (yield x for x in S) is supposed to do. Is it supposed to return a generator function which I can assign to a variable (or pass to the builtin function sum() as in your example) and call later, or is it supposed to turn the current function into a generator function (so that each executed yield statement returns a value to the caller of the current function)? Assuming the result is a generator function (a first class object I can assign to a variable then call later), is there some reason the current function notation is inadequate? This seems to me to suffer the same expressive shortcomings as lambda. Lambda seems to be hanging on by the hair on its chinny chin chin. Why is this construct gaining traction? If you don't like lambda, I can't quite see why this syntax is all that appealing. OTOH, if lambda x: x+1 is okay, then why not: yield: x for x in S ?
Skip From niemeyer at conectiva.com Fri Oct 17 14:39:16 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Oct 17 15:04:28 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16272.5425.470101.367084@montanaro.dyndns.org> References: <16272.5425.470101.367084@montanaro.dyndns.org> Message-ID: <20031017183915.GA29652@ibook.distro.conectiva> > >> If anything at all, i'd suggest a std-module which contains e.g. > >> 'sort', 'reverse' and 'extend' functions which always return > >> a new list, so that you could write: > >> > >> for i in reverse(somelist): > >> ... > > Gustavo> You can do reverse with [::-1] now. > > I don't think that is considered "stable" in the sorting sense. If I > sort in descending order vs ascending order, they are not mere > reversals of each other. I may well still want adjacent records whose > sort keys are identical to remain in the same order. > > What will the new reverse=True keyword arg do? Erm.. what are you talking about!? :-) I was just saying that his reverse(...) method is completely equivalent to [::-1] now, so it could safely be implemented as: reverse = lambda x: x[::-1] I wasn't trying to mention anything about sort nor keyword arguments (perhaps I just wasn't the real target of the message!?). -- Gustavo Niemeyer http://niemeyer.net From pje at telecommunity.com Fri Oct 17 15:20:31 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 15:20:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017151235.034fad20@mail.telecommunity.com> At 01:57 PM 10/17/03 -0500, Skip Montanaro wrote: > >> But, indexing does stretch quite far in the current Python syntax and > >> semantics (in Python's *pragmatics* you're supposed to use it far > >> more restrainedly). > > Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. > Guido> Let's look for an in-line generator notation instead. I like > > Guido> sum((yield x for x in S)) > > Guido> but perhaps we can make this work: > > Guido> sum(x for x in S) > >Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > >is supposed to do. Is it supposed to return a generator function which I >can assign to a variable (or pass to the builtin function sum() as in your >example) and call later, or is it supposed to turn the current function into >a generator function (so that each executed yield statement returns a value >to the caller of the current function)? Neither. It returns an *iterator*, conceptually equivalent to: def temp(): for x in S: yield x temp = temp() Except of course without creating a 'temp' name. I suppose you could also think of it as: (lambda: for x in S: yield x)() except of course that you can't make a generator lambda. If you look at it this way, then you can consider [x for x in S] to be shorthand syntax for list(x for x in S), as they would both produce the same result. 
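Eby's temp() expansion can be written out directly with the parenthesized form Python eventually adopted as generator expressions (PEP 289), which drops the yield keyword from the syntax debated here. A sketch of the equivalence:

```python
S = [1, 2, 3]

# The inline form: an expression that evaluates to an iterator.
gen = (x * 2 for x in S)

# Eby's conceptual expansion: a throwaway generator function, called once.
def _temp():
    for x in S:
        yield x * 2

assert list(gen) == list(_temp()) == [2, 4, 6]

# And list(...) of the expression matches the equivalent list comprehension.
assert list(x * 2 for x in S) == [x * 2 for x in S]
```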
However, IIRC, the current listcomp implementation actually binds 'x' in the current local namespace, whereas the generator version would not. (And the listcomp version might be faster.) From bac at OCF.Berkeley.EDU Fri Oct 17 15:46:52 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Oct 17 15:47:12 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <3F90472C.9060702@ocf.berkeley.edu> Skip Montanaro wrote: > >> But, indexing does stretch quite far in the current Python syntax and > >> semantics (in Python's *pragmatics* you're supposed to use it far > >> more restrainedly). > > Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. > Guido> Let's look for an in-line generator notation instead. I like > > Guido> sum((yield x for x in S)) > > Guido> but perhaps we can make this work: > > Guido> sum(x for x in S) > > Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > > is supposed to do. In an attempt to make sure I understand what is being discussed, I am going to take a stab at this. That way when someone corrects me two people get there questions; two birds, one shotgun. > Is it supposed to return a generator function which I > can assign to a variable (or pass to the builtin function sum() as in your > example) and call later, or is it supposed to turn the current function into > a generator function (so that each executed yield statement returns a value > to the caller of the current function)? > It returns a generator function. 
> Assuming the result is a generator function (a first class object I can > assign to a variable then call later), is there some reason the current > function notation is inadequate? This seems to me to suffer the same > expressive shortcomings as lambda. Lambda seems to be hanging on by the > hair on its chinny chin chin. Why is this construct gaining traction? If > you don't like lambda, I can't quite see why this syntax is all that > appealing. > Extreme shorthand for a common idiom? > OTOH, if lambda x: x+1 is okay, then why not: > > yield: x for x in S > I was actually thinking that myself, but I would rather keep lambda as this weird little child of Python who can always be spotted for its predisposition toward pink hot pants (images of "Miami Vice" flash in my head...). Personally I am not seeing any extreme need for this feature. I mean the example I keep seeing is ``sum((yield x*2 for x in foo))``. But how is this such a huge win over ``sum([x*2 for x in foo])``? I know there is a memory perk since the entire list won't be constructed, but unless there is a better reason I see abuse on the horizon. The misuse of __slots__ has shown that when something is added that seems simple and powerful it will be abused by a lot of programmers thinking it is the best thing to use for anything they can shoehorn it into. I don't see this as such an abuse issue as __slots__, mind you, but I can still see people using it where a list comp may have been better. Or even having people checking themselves on whether to use this or a list comp and just using this because it seems cooler. I know I am personally +0 on this even after my above worries since I don't see my above arguments are back-breakers and those of us who do know how to properly use it will get a perk out of it.
-Brett From aleaxit at yahoo.com Fri Oct 17 16:01:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:01:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <200310172201.50930.aleaxit@yahoo.com> On Friday 17 October 2003 08:57 pm, Skip Montanaro wrote: ... > Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > > is supposed to do. Is it supposed to return a generator function which I > can assign to a variable (or pass to the builtin function sum() as in your > example) and call later, or is it supposed to turn the current function > into a generator function (so that each executed yield statement returns a > value to the caller of the current function)? Neither: it returns an iterator, _equivalent_ to the one that would be returned by _calling_ a generator such as def xxx(): for x in S: yield x like xxx() [the result of the CALL to xxx, as opposed to xxx itself], (yield: x for x in S) is not callable; rather, it's loopable-on. > you don't like lambda, I can't quite see why syntax this is all that > appealing. I don't really like the current state of lambda (and it will likely never get any better), I particularly don't like the use of the letter lambda for this idea (Church's work notwithstanding, even Paul Graham in his new lispoid language has chosen a more sensible keyword, 'func' I believe), but I like comprehensions AND iterators, and the use of the word yield in generators. I'm not quite sure what parallels you see between the two cases. Alex From pje at telecommunity.com Fri Oct 17 16:15:04 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 16:15:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F90472C.9060702@ocf.berkeley.edu> References: <16272.15274.781344.230479@montanaro.dyndns.org> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <5.1.0.14.0.20031017160243.03453220@mail.telecommunity.com> At 12:46 PM 10/17/03 -0700, Brett C. wrote: >Skip Montanaro wrote: >>Is it supposed to return a generator function which I >>can assign to a variable (or pass to the builtin function sum() as in your >>example) and call later, or is it supposed to turn the current function into >>a generator function (so that each executed yield statement returns a value >>to the caller of the current function)? > >It returns a generator function. No, it returns an iterator. Technically a generator-iterator, but definitely not a generator function, just as [x for x in y] doesn't return a function that returns a list. :) >Personally I am not seeing any extreme need for this feature. I mean the >example I keep seeing is ``sum((yield x*2 for x in foo))``. But how is >this such a huge win over ``sum([x*2 for x in foo])``? I know there is a >memory perk since the entire list won't be constructed, but unless there >is a better reason I see abuse on the horizon. It's not an extreme need; if it were, it'd have been added in 2.2, where all extreme Python needs were met. ;) >I know I am personally +0 on this even after my above worries since I >don't see my above arguments are back-breakers and those of us who do know >how to properly to use it will get a perk out of it. I'm sort of +0 myself; there are probably few occasions where I'd use a gencomp. 
But I'm -1 on creating special indexing or listcomp-like accumulator syntax, so gencomps are a fallback position.

I'm not sure gencomp is the right term for these things anyway... calling them iterator expressions probably makes more sense.  Then there's not the confusion with generator functions, which get called.  And this discussion has made it clearer that having 'yield' in the syntax is just plain wrong, because yield is a control flow statement.  These things are really just expressions that act over iterators to return another iterator.

In essence, an iterator expression is just syntax for imap and ifilter, in the same way that a listcomp is syntax for map and filter.  Really, you could now write imap and ifilter as functions that compute iterator expressions, e.g.:

    imap = lambda func,items: func(item) for item in items
    ifilter = lambda func, items: item for item in items if func(item)

Which of course means there'd be little need for imap and ifilter, just as there's now little need for map and filter.

Anyway, if you look at '.. for .. in .. [if ..]' as a ternary or quaternary operator on an iterator (or iterable) that returns an iterator, it makes a lot more sense than thinking of it as having anything to do with generator(s).  (Even if it might be implemented that way.)

From tim.one at comcast.net Fri Oct 17 16:16:24 2003 From: tim.one at comcast.net (Tim Peters) Date: Fri Oct 17 16:16:32 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller]
> ...
> The problem in this case was not the installer doing things wrong, the
> fault was alone on my side: I did use the dlls from my WinXP system
> directory, and the installer correctly used them to replace the
> versions on the target computers.  If this was a win2k system, the
> file protection reverted this change, and the users were lucky again
> (except they had an entry in the event log).
Unfortunately win98 and > NT4 users were not so happy, for them it broke the system. For some of them, and probably a small minority (else we would have been deluged with bug reports about this, not just gotten a handful). For example, there were no problems after installing 2.3.2 on two different Win98SE boxes I use. I *did* note at the time I was surprised installation asked me to reboot (which is a sure sign that Wise detected it needed to replace an in-use DLL), but I forgot to panic about it. Under the theory that the boxes where this broke are the same ones contributing to worm spew exploiting MS bugs that were fixed a year ago, you were doing the world a favor by calling their owners' attention to how out of date they were . From skip at pobox.com Fri Oct 17 16:20:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 16:20:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172201.50930.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> Message-ID: <16272.20246.883506.360730@montanaro.dyndns.org> >> Is it supposed to return a generator function which I can assign to a >> variable (or pass to the builtin function sum() as in your example) >> and call later, or is it supposed to turn the current function into a >> generator function (so that each executed yield statement returns a >> value to the caller of the current function)? Alex> Neither: it returns an iterator, _equivalent_ to the one that Alex> would be returned by _calling_ a generator such as Alex> def xxx(): Alex> for x in S: Alex> yield x All the more reason not to like this. Why not just define the generator function and call it? While Perl sprouts magical punctuation, turning its syntax into line noise, Python seems to be sprouting multiple function-like things. 
We have

    * functions
    * unbound methods
    * bound methods
    * generator functions
    * iterators (currently invisible via syntax, but created by calling a
      generator function?)
    * instances magically callable via __call__

and now this new (rather limited) syntax for creating iterators.  I am beginning to find it all a bit confusing and unsettling.

Skip

From aleaxit at yahoo.com Fri Oct 17 16:21:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:21:48 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <200310172221.43456.aleaxit@yahoo.com>

On Friday 17 October 2003 08:17 pm, Terry Reedy wrote:
...
> > >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ]
> >
> > (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4)))
> >
> > Not particularly _sensible_, mind you, and I hope nobody's yet
...
> In your commercial programming group, would you accept such a slice
> usage from another programmer, especially without prior agreement of
> the group?  Or would you want to edit, as you would with 'return x and
> True or False'?  If you would reject it in practice, then it is hardly
> an argument for something arguably even odder.

I'm happy to be using a language which supplies good elementary components and good general "composability", even though it IS possible to overuse the composition and end up with weird constructs.  Personally, I don't think that allowing comprehensions in indices would be particularly odd: just another "good elementary component".  So would "iterator comprehensions", as an alternative.  Both of them are quite usable in composition with other existing components and rules to produce weirdness, sure: but showing that weirdness is already quite possible, whether new constructs are allowed or not, appears to me to be a perfectly valid argument for a new construct that's liable to be used in either good or weird ways.
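The quoted slice example works because of how Python parses subscripts.  A minimal sketch of the mechanics (the Show class is mine, purely for illustration):

```python
# Inside [], each colon-separated group becomes a slice object, and a
# comma-separated series of them reaches __getitem__ as a tuple.  A
# slice's start/stop/step can be arbitrary objects, hence the example.
class Show:
    def __getitem__(self, index):
        return index

foo = Show()
index = foo['va':23:2j, {'zip': 'zop'}:45:(3, 4)]

assert isinstance(index, tuple) and len(index) == 2
assert index[0] == slice('va', 23, 2j)
assert index[1].start == {'zip': 'zop'}
assert index[1].stop == 45 and index[1].step == (3, 4)
```

Nothing here requires the indexed object to treat the slices "sensibly" — it is entirely up to __getitem__, which is what makes both the flexibility and the weirdness possible.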
Alex From pf_moore at yahoo.co.uk Fri Oct 17 16:34:21 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Oct 17 16:34:17 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> Message-ID: "Phillip J. Eby" writes: > At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote: >>Which is why I didn't like the 'sum[x for x in S]' notation much. >>Let's look for an in-line generator notation instead. I like >> >> sum((yield x for x in S)) >> >>but perhaps we can make this work: >> >> sum(x for x in S) I like the look of this. In this context, it looks very natural. > Offhand, it seems like the grammar might be rather tricky, but it > actually does seem more Pythonic than the "yield" syntax, and it > retroactively makes listcomps shorthand for 'list(x for x in s)'. > However, if gencomps use this syntax, then what does: > > for x in y*2 for y in z if y<20: > ... > > mean? ;) It means you're trying to be too clever, and should use parentheses :-) > It's a little clearer with parentheses, of course, so perhaps they > should be required: > > for x in (y*2 for y in z if y<20): > ... I'd rather not require parentheses in general. Guido's example of sum(x for x in S) looks too nice for me to want to give it up without a fight. But I'm happy to have cases where the syntax is ambiguous, or even out-and-out unparseable, without the parentheses. Whether it's possible to express this in a way that Python's grammar can deal with, I don't know. Paul. -- This signature intentionally left blank From pje at telecommunity.com Fri Oct 17 16:38:00 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 16:38:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> At 03:20 PM 10/17/03 -0500, Skip Montanaro wrote: > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by calling a > generator function?) > * instances magically callable via __call__ The last item on the list encompasses at least the first three. But you also left out __init__ and __new__, which are really ClassType.__call__ or type.__call__, though. :) To me (and the interpreter, actually), there's just tp_call, tp_iter, and tp_iternext (or whatever their actual names are). Callability, iterability, and iterator-next. Many kinds of objects may have these aspects, just as many kinds of objects may be addable with '+'. Of the things you mention, however, most don't actually have different syntax for creating them, and some are even the same object type (e.g. unbound and bound methods). And the syntax for *using* them is always uniform: () always calls an object, for ... in ... creates an iterator from an iterable, .next() goes to the next item. >and now this new (rather limited) syntax for creating iterators. Actually, as now being discussed, list comprehensions would be a special case of an iterator expression. >I am beginning to find it all a bit confusing and unsettling. Ironically, with iterator comprehension in place, a list comprehension would now look like a list containing an iterator, which I agree might be confusing. 
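Phillip's worry can be made concrete on a modern interpreter, since this proposal eventually landed as Python 2.4's generator expressions: brackets around an iterator expression give a list *containing* the iterator, while list() consumes it.

```python
s = [1, 2, 3]

listcomp = [x * x for x in s]        # the list itself
via_list = list(x * x for x in s)    # "list(itercomp)" spelling
assert listcomp == via_list == [1, 4, 9]

boxed = [(x * x for x in s)]         # a list containing one generator
assert len(boxed) == 1
assert list(boxed[0]) == [1, 4, 9]   # consuming the boxed generator
```

In practice the `boxed` form is almost never what anyone wants, which is why the visual similarity was worth worrying about.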
Too bad we didn't do iterator comps first, or list(itercomp) would be the idiomatic way to make a listcomp. That's really the only confusing bit I see about itercomps... that you have to be careful where you put your parentheses, in order to make your intentions clear in some contexts. However, that's true for many kinds of expressions even now. From aleaxit at yahoo.com Fri Oct 17 16:40:28 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:40:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <200310172240.28322.aleaxit@yahoo.com> On Friday 17 October 2003 07:53 pm, Phillip J. Eby wrote: > At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: > >On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: > > > Yes, what you propose is certainly *possible*. But again, if you > > > really needed an iterator as an index, you can right now do: > > > > > > sum[ [x*x for x in blaap] ] > > > >Actually, I need to use parentheses on the outside and brackets only > >on the inside -- I assume that's what you meant, of course. > > No, I meant what I said, which was that if you "really needed an iterator > as an *index*" (emphasis added). I suppose I technically should have said, > if you really want to provide an *iterable*, since a list is not an > iterator. But I figured you'd know what I meant. :) Ah, no, I didn't get your meaning. But yes, you could of course pass iter([ x*x for x in blaap ]) as an iterator (not just iterable) index to whatever... as long as blaap was a FINITE iterator, of course. 
If you can't count on blaap being finite, you'd need to code and name a separate generator such as:

    def squares(blaap):
        for x in blaap:
            yield x*x

then pass the result of calling squares(blaap), or you could choose to use itertools.imap and a lambda, etc etc.

> >I agree it's clearer -- a tad less flexible, as you don't get to do
> >separately    selector = Top(10)
> >and then somewhere else    selector[...]
> >but "oh well", and anyway the issue would be overcome if we had currying
> >(we could be said to have it, but -- I assume you'd consider
> >    selector = Top.__get__(10)
> >some kind of abuse, and besides, this 'currying' isn't very general).
>
> Hmmm... that's a hideously sick hack to perform currying... but I *like*
> it. :)  Not to use inline, of course, I'd wrap it in a 'curry'
> function.  But what a lovely way to *implement* it, under the hood.  Of
> course, I'd actually use 'new.instancemethod', since it would do the same

Yes,

    def curry(func, arg):
        return new.instancemethod(func, arg, object)

IS indeed way more general than func.__get__(arg) [notably, you get to call it repeatedly to curry more than one argument, from the left].
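The "hideously sick hack" can be demonstrated directly.  Note new.instancemethod is Python 2 only (the new module is gone in Python 3), but the underlying descriptor trick still works there, and functools.partial is the modern general-purpose spelling; add is an illustrative function of mine, not from the thread:

```python
# Functions are descriptors: f.__get__(obj) returns a bound method with
# obj frozen in as the first argument -- the "accidental" currying.
from functools import partial

def add(a, b):
    return a + b

curried = add.__get__(10)        # a "bound method" of the int 10
assert curried(32) == 42

assert partial(add, 10)(32) == 42   # the sanctioned modern equivalent
```

As Alex notes below, chaining __get__ re-binds the underlying function rather than currying a second argument, which is why this is not general currying.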
But if you have to define a curry function anyway, it's not a huge win vs

    def curry(func, arg):
        def curried(*args):
            return func(arg, *args)
        return curried

or indeed more general variations thereof such as

    def curry(func, *curried_args):
        def curried(*args):
            return func(*(curried_args+args))
        return curried

Alex

From pyth at devel.trillke.net Fri Oct 17 16:49:24 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Fri Oct 17 16:49:43 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031016232444.GA27936@ibook.distro.conectiva>; from niemeyer@conectiva.com on Thu, Oct 16, 2003 at 08:24:44PM -0300 References: <3F8DB69E.2070406@sabaydi.com> <20031016103552.H14453@prim.han.de> <20031016232444.GA27936@ibook.distro.conectiva> Message-ID: <20031017224924.L14453@prim.han.de>

Gustavo Niemeyer wrote:
> > If anything at all, i'd suggest a std-module which contains e.g.
> > 'sort', 'reverse' and 'extend' functions which always return
> > a new list, so that you could write:
> >
> >     for i in reverse(somelist):
> >         ...
>
> You can do reverse with [::-1] now.

sure, but it's a bit unintuitive and i mentioned not only reverse :-)

Actually i think that 'reverse', 'sort' and 'extend' algorithms could nicely be put into the new itertools module.  There it's obvious that they wouldn't mutate objects.  And these algorithms (especially extend and reverse) would be very efficient as iterators because they wouldn't create temporary lists/tuples.

cheers, holger

From FBatista at uniFON.com.ar Fri Oct 17 16:49:44 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 17 16:50:36 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID:

Here I send it.  Suggestions and all kinds of recommendations are more than welcomed.  If it all goes ok, it'll be a PEP when I finish writing the code.

Thank you.
Facundo

------------------------------------------------------------------------

PEP: XXXX
Title: Money data type
Version: $Revision: 0.1 $
Last-Modified: $Date: 2003/10/17 17:34:00 $
Author: Facundo Batista
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 17-Oct-2003
Python-Version: 2.3.3

Abstract
========

The idea is to make a Money data type, basically for financial uses, where decimals are needed but floating point is too inexact.  The Money data type should support the Python standard functions and operations.

Rationale
=========

The details of the requirements are in the `Requirements`_ section.  Here I'll include all the decisions made and why, and all the subjects still in discussion.  The requirements will be numbered, to simplify discussion on each point.

As an XP exercise, I'll write the test cases before the class itself, so it'll comply exactly with the requirements of those tests.  Please see them for an exact specification (and if you propose a different behaviour, please propose the corresponding test case if possible, thanks).

Why Not To Use Tim Peters' FixedPoint?
--------------------------------------

As we'll see in Requirements, there are items with which FixedPoint doesn't comply (because it doesn't do something, or does it differently).  It could be extended or modified to comply with the Requirements, but some needs are specific to currency, and some features of FixedPoint are too much for Money, so taking them out will make this class simpler.  Anyway, maybe someday one could be made a subclass of the other, or one could be made from both.

The code of the Money class is based in large part on the code of Tim Peters' FixedPoint: thank you for your (very) valuable ideas.

Items In Discussion
-------------------

6. About repr().  Should ``myMoney == eval(repr(myMoney))``?

Requirements
============

1. The syntax should be ``Money(value, [precision])``.

2.
The value could be of the type:

   - another Money (if you don't include *precision*, it gets inherited)

   - int or long (default *precision*: 0)::

        Money(45): 45
        Money(45, 2): 45.00
        Money(5000000000,3): 5000000000.000

   - float (*precision* must be included)::

        Money(50.33, 3): 50.330

   - string (*precision* gets extracted from the string)::

        Money('25.32'): 25.32
        Money('25.32', 4): 25.3200

   - something that could be coerced by long() or float()

3. Not to support strings with engineering notation (you don't need this when using money).

4. Precision must be a non-negative integer, and after the object is created you cannot change it.

5. Attributes ``decimalSeparator``, ``currencySymbol`` and ``thousandSeparator`` could be overloaded, to easily change them by subclassing.  This same *decimalSeparator* is the one used by the constructor when it receives a string.  Defaults are::

      decimalSeparator = '.'
      currencySymbol = '$'
      thousandSeparator = ''

6. Calling repr() should not return str(self), because if the subclass indicates that ``decimalSeparator=''``, this could lead to confusion.  So, repr() should show a tuple of three values: IntPart, FracPart, Precision.

7. To comply with the test case of Mark McEahern::

      cost = Money('5.99')
      percentDiscount = 10
      months = 3
      subTotal = cost * months
      discount = subTotal * (percentDiscount * 1.0) / 100
      total = subTotal - discount
      assertEqual(total, Money('16.17'))

8. To support the basic arithmetic (``+, -, *, /, //, **, %, divmod``) and the comparisons (``==, !=, <, >, <=, >=, cmp``) in the following cases:

   - Money op Money
   - Money op otherType
   - otherType op Money
   - Money op= otherType

   OtherType could be int, float or long.  It will automatically be converted to Money, inheriting the precision from the other component of the operation (and, in the case of float, maybe losing precision **before** the operation).  When both are Moneys, the result has the larger precision of the two.

9. To support unary operators (``-, +, abs``).

10.
To support the built-in methods:

   - min, max
   - float, int, long (int and long are rounded by Money)
   - str, repr
   - hash
   - copy, deepcopy
   - bool (0 is false, otherwise true)

11. To have methods that return its components.  The value of Money will be ``(int part) + (frac part) / (10 ** precision)``.

    - ``getPrecision()``: the precision
    - ``getFracPart()``: the fractional part (as long)
    - ``getIntPart()``: the int part (as long)

12. The rounding is to be financial.  This means that to round a number at a position, if the digit to the right of that position is bigger than 5, the digit to the left of that position is incremented by one; if it's smaller than 5, it isn't::

       1.123 --> 1.12
       1.128 --> 1.13

    But when the digit to the right of that position is exactly 5, then if the digit to the left of that position is odd, it gets incremented; otherwise it isn't::

       1.125 --> 1.12
       1.135 --> 1.14

Reference Implementation
========================

To be included later:

   - code
   - test code
   - documentation

Copyright
=========

This document has been placed in the public domain.
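Requirement 12 describes round-half-to-even, "banker's rounding".  The decimal module that Facundo later wrote (PEP 327, shipped in Python 2.4) spells this rule ROUND_HALF_EVEN; a sketch of the same examples using it, with money_round as a hypothetical helper rather than anything from the prePEP:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def money_round(value, precision=2):
    # Quantize to 10**-precision using banker's rounding.
    quantum = Decimal(10) ** -precision
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

assert str(money_round('1.123')) == '1.12'
assert str(money_round('1.128')) == '1.13'
assert str(money_round('1.125')) == '1.12'   # tie: digit to the left (2) is even
assert str(money_round('1.135')) == '1.14'   # tie: digit to the left (3) is odd
```

Constructing the Decimal from a string matters: Decimal(1.125) from a float could already carry binary representation error before the rounding step.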
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031017/b78dd115/attachment-0001.html From pf_moore at yahoo.co.uk Fri Oct 17 16:52:14 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Oct 17 16:52:09 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310171852.34515.aleaxit@yahoo.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: "Phillip J. Eby" writes: > At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: >>I assume you'd consider >> selector = Top.__get__(10) >>some kind of abuse, and besides, this 'currying' isn't very general). > > Hmmm... that's a hideously sick hack to perform currying... but I > *like* it. :) Urk. I just checked, and this works. But I haven't the foggiest idea why! Could someone please explain? If you do, I promise never to reveal who told me :-) Paul.
-- This signature intentionally left blank From niemeyer at conectiva.com Fri Oct 17 16:26:20 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Oct 17 16:54:32 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <200310162305.27509.gherron@islandtraining.com> References: <20031016225058.GB19133@ibook.distro.conectiva> <200310162305.27509.gherron@islandtraining.com> Message-ID: <20031017202619.GA31350@ibook.distro.conectiva> > > > I'd like to get back to the SRE recursion issue (#757624). Is this > > > a good time to commit the patch? > > > > It would be good if you could find somebody who reviews the > > patch. However, if nobody volunteers to review, please go ahead - it > > might well be that you are the last active SRE maintainer left on this > > planet ... > > I jumped into SRE and wallowed around a bit before the last release, > then got swamped with real (i.e., money earning) work. I'd be willing > to jump in again if it would help. Gustavo, would you like me to > review the patch? Or if you submit it, I'll just get it from cvs and > poke around it that way. Great! I'll submit it then. Thanks! -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Fri Oct 17 16:55:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:55:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <200310172255.43697.aleaxit@yahoo.com> On Friday 17 October 2003 07:15 pm, Guido van Rossum wrote: > > But, indexing does stretch quite > > far in the current Python syntax and semantics (in Python's > > *pragmatics* you're supposed to use it far more restrainedly). > > Which is why I didn't like the 'sum[x for x in S]' notation much. Let it rest in peace, then. 
> Let's look for an in-line generator notation instead. I like > > sum((yield x for x in S)) So do I, _with_ the mandatory extra parentheses and all, and in fact I think it might be even clearer with the extra colon that Phil had mentioned, i.e. sum((yield: x for x in S)) > but perhaps we can make this work: > > sum(x for x in S) Perhaps the parser can be coerced to make this work, but the mandatory parentheses, the yield keyword, and possibly the colon, too, may all help, it seems to me, in making this syntax stand out more. Yes, some uses may "read" more naturally with as little extras as feasible, notably [examples that might be better done with list comprehensions except for _looks_...]: even_digits = Set(x for x in range(0, 10) if x%2==0) versus even_digits = Set((yield: x for x in range(0, 10) if x%2==0)) but that may be because the former notation leads back to the "set comprehensions" that list comprehensions were originally derived from. I don't think it's that clear in other cases which have nothing to do with sets, such as, e.g., Peter Norvig's original examples of "accumulator displays". And as soon as you consider the notation being used in any situation EXCEPT as the ONLY argument in a call...: foo(x, y for y in glab for x in blag) yes, I know this passes ONE x and one iterator, because to pass one iterator of pairs one would have to write foo((x, y) for y in glab for x in blag) but the distinction between the two seems quite error prone to me. BTW, semantically, it WOULD be OK for these iterator comprehension to NOT "leak" their control variables to the surrounding scope, right...? I do consider the fact that list comprehensions "leak" that way a misfeature, and keep waiting for some fanatic of assignment-as-expression to use it IN EARNEST, e.g., to code his or her desired "while c=beep(): boop(c)", use while [c for c in [beep()] if c]: boop(c) ...:-). 
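Alex's wish on the leak was eventually granted: generator expressions got their own scope from the start, and Python 3 list comprehensions followed, so control variables no longer escape into the surrounding function.  A sketch verifiable on Python 3 (probe is my own illustrative function):

```python
# Neither the listcomp's c nor the genexp's d survives in the enclosing
# function's local namespace on Python 3.
def probe():
    squares = [c * c for c in range(5)]
    total = sum(d * d for d in range(5))
    return squares, total, set(locals())

squares, total, names = probe()
assert squares == [0, 1, 4, 9, 16]
assert total == 30
assert 'c' not in names and 'd' not in names
```

On Python 2, by contrast, the listcomp's c would have leaked, which is exactly the misfeature being complained about here.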
Anyway, back to the subject, those calls to foo seem very error-prone, while: foo(x, (yield: y for y in glab for x in blag)) (mandatory extra parentheses, 'yield', and colon) seems far less likely to cause any such error. Alex From skip at pobox.com Fri Oct 17 16:56:01 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 16:56:10 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> Message-ID: <16272.22369.546606.870697@montanaro.dyndns.org> >>> sum((yield x for x in S)) >>> >>> but perhaps we can make this work: >>> >>> sum(x for x in S) Paul> I like the look of this. In this context, it looks very natural. How would it look if you used the optional start arg to sum()? Would either of these work? sum(x for x in S, start=5) sum(x for x in S, 5) or would you have to parenthesize the first arg? sum((x for x in S), start=5) sum((x for x in S), 5) Again, why parens? Why not sum(, start=5) sum(, 5) or something similar? Also, sum(x for x in S) and sum([x for x in S]) look very similar. I don't think it would be obvious to the casual observer what the difference between them was or why the first form didn't raise a SyntaxError. >> It's a little clearer with parentheses, of course, so perhaps they >> should be required: >> >> for x in (y*2 for y in z if y<20): >> ... Paul> I'd rather not require parentheses in general. Parens are required in certain situations within list comprehensions around tuples (probably for syntactic reasons, but perhaps to aid the reader as well) where tuples can often be defined without enclosing parens. 
Here's a contrived example:

    >>> [(a,b) for (a,b) in zip(range(5), range(10))]
    [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
    >>> [a,b for (a,b) in zip(range(5), range(10))]
      File "<stdin>", line 1
        [a,b for (a,b) in zip(range(5), range(10))]
           ^
    SyntaxError: invalid syntax

Paul> Guido's example of sum(x for x in S) looks too nice for me to want
Paul> to give it up without a fight.  But I'm happy to have cases where
Paul> the syntax is ambiguous, or even out-and-out unparseable, without
Paul> the parentheses.  Whether it's possible to express this in a way
Paul> that Python's grammar can deal with, I don't know.

I rather suspect parens would be required for tuples if they were added to the language today.  I see no reason to make an exception here.

Skip

From aleaxit at yahoo.com Fri Oct 17 17:08:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:08:11 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <200310172308.04840.aleaxit@yahoo.com>

On Friday 17 October 2003 10:52 pm, Paul Moore wrote:
...
> >> selector = Top.__get__(10)
...
> Urk.  I just checked, and this works.  But I haven't the foggiest idea
> why!  Could someone please explain?  If you do, I promise never to
> reveal who told me :-)

Functions are descriptors, and func.__get__(obj) returns a bound method with im_self set to obj -- that's how functions become bound methods, in today's Python, when accessed with attribute syntax obj.func on an instance obj of a class which has func in its dict.  But the mechanism is NOT meant for general currying... you could say the latter just works as a weird-ish side effect, and not in too general a way: consider for example:

    >>> def p(s): print s
    ...
    >>> p.__get__('one case').__get__('another')()
    another
    >>>

The second __get__ "replaces" the im_self [[it works on _p_ again, the im_func of the bound method given by the first, NOT on "the bound method itself", as that isn't a descriptor]]... now if we had a marketing dept it could sell this as a feature, "rebindable curried functions", perhaps, but in fact it's an "accidental side effect"...;-)

Alex

From eppstein at ics.uci.edu Fri Oct 17 17:10:20 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 17:10:28 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: In article <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com>, "Phillip J. Eby" wrote:

> >I am beginning to find it all a bit confusing and unsettling.
>
> Ironically, with iterator comprehension in place, a list comprehension
> would now look like a list containing an iterator, which I agree might be
> confusing.

Along with that confusion, (x*x for x in S) would look like a tuple comprehension, rather than a bare iterator.

-- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From Jack.Jansen at cwi.nl Fri Oct 17 17:11:44 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Oct 17 17:11:59 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <83175E26-00E6-11D8-907C-000A27B19B96@cwi.nl> On 17-okt-03, at 22:20, Skip Montanaro wrote: > All the more reason not to like this. Why not just define the > generator > function and call it? > > While Perl sprouts magical punctuation, turning its syntax into line > noise, > Python seems to be sprouting multiple function-like things. We have > > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by > calling a > generator function?) > * instances magically callable via __call__ > > and now this new (rather limited) syntax for creating iterators. And you even forget lambda:-) I agree with Skip here: there's all this magic that crept into Python since 2.0 (approximately) that really hampers readability to novices. And here I mean novices in the wide sense of the word, i.e. including myself (novice to the new concepts). Some of these look like old concepts but are really something completely different (generators versus functions), some are really little more than keystroke savers (list comprehensions). 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From eppstein at ics.uci.edu Fri Oct 17 17:14:41 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 17:20:17 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310171903.42578.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: In article <16272.22369.546606.870697@montanaro.dyndns.org>, Skip Montanaro wrote: > Parens are required in certain situations within list comprehensions around > tuples (probably for syntactic reasons, but perhaps to aid the reader as > well) where tuples can often be defined without enclosing parens. Here's a > contrived example: > > >>> [(a,b) for (a,b) in zip(range(5), range(10))] > [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)] > >>> [a,b for (a,b) in zip(range(5), range(10))] > File "", line 1 > [a,b for (a,b) in zip(range(5), range(10))] > ^ > SyntaxError: invalid syntax This one has bitten me several times. When it does, I discover the error quickly due to the syntax error, but it would be bad if this became valid syntax and returned a list [a,X] where X is an iterator. I don't think you could count on this getting caught by a being unbound, because often the variables in list comprehensions can be single letters that shadow previous bindings. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science From guido at python.org Fri Oct 17 17:19:30 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:20:33 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 15:56:01 CDT." <16272.22369.546606.870697@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: <200310172119.h9HLJUF07430@12-236-54-216.client.attbi.com> > Again, why parens? Why not > > sum(<x for x in S>, start=5) > sum(<x for x in S>, 5) Because the parser doesn't know whether the > after S is the end of the <...> brackets or a binary > operator. (Others can answer your other questions.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Fri Oct 17 17:21:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:21:54 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310172321.48818.aleaxit@yahoo.com> On Friday 17 October 2003 10:49 pm, Batista, Facundo wrote: ... > The idea is to make a Money data type, basically for financial uses, where > decimals are needed but floating point is too inexact. The Money data type Good, but the name seems ambiguous -- I would expect 'money' to include a *currency unit*, while these are just numbers. E.g., these days for me a "money amount" of "1000" isn't immediately significant -- does it mean "old liras", Euros, SEK, ...? If a clearer name (perhaps Decimal?) was adopted, the type's purposes would be also clearer, perhaps. > 6. About repr(). Should ``myMoney == eval(repr(myMoney))``? I don't see why not. > 3.
Not to support strings with engineer notation (you don't need this when > using money). Actually, with certain very depreciated currencies exponent notation would be VERY handy to have. E.g., given than a Euro is worth 1670000 Turkish Liras today, you have to count zeros accurately when expressing any substantial amount in Turkish Liras -- exponential notation would help. > 10. To support the built-in methods: I think you mean functions, not methods, in Python terminology. > - min, max > - float, int, long (int and long are rounded by Money) Rounding rather than truncation seems strange to me here. > - str, repr > - hash > - copy, deepcopy > - bool (0 is false, otherwise true) > > 11. To have methods that return its components. The value of Money will be > ``(int part) + (frac part) / (10 ** precision)``. > > - ``getPrecision()``: the precision > - ``getFracPart()``: the fractional part (as long) > - ``getIntPart()``: the int part (as long) Given we're talking about Python and not Java, I would suggest read-only accessors (like e.g. the complex type has) rather than accessor methods. E.g., x.precision , x.fraction and x.integer rather than x.getPrecision() etc. > 12. The rounding to be financial. This means that to round a number in a > position, if the digit at the right of that position is bigger than 5, > the digit at the left of that position is incremented by one, if it's > smaller than 5 isn't:: > > 1.123 --> 1.12 > 1.128 --> 1.13 > > But when the digit at the right of that position is ==5. There, if the > digit at the left of that position is odd, it gets incremented, > otherwise > isn't:: > > 1.125 --> 1.12 > 1.135 --> 1.14 I don't think these are the rules in the European Union (they're popular in statistics, but, I suspect, not legally correct in accounting). I can try to research that, if you need me to. 
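(For concreteness, here is the round-half-to-even rule under discussion, sketched with the decimal module that later entered the standard library; money_round is a hypothetical helper for illustration, not part of the prePEP:)

```python
from decimal import Decimal, ROUND_HALF_EVEN

def money_round(value, places=2):
    # Ties go to the nearest even digit ("banker's rounding").
    exp = Decimal(10) ** -places          # e.g. Decimal('0.01')
    return Decimal(value).quantize(exp, rounding=ROUND_HALF_EVEN)

print(money_round('1.123'))  # 1.12  (below the halfway point)
print(money_round('1.128'))  # 1.13  (above it)
print(money_round('1.125'))  # 1.12  (tie: 2 is even, so it stays)
print(money_round('1.135'))  # 1.14  (tie: 3 is odd, so it rounds up)
```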
Alex From aleaxit at yahoo.com Fri Oct 17 17:28:23 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:28:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> References: <200310172201.50930.aleaxit@yahoo.com> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: <200310172328.23057.aleaxit@yahoo.com> On Friday 17 October 2003 10:38 pm, Phillip J. Eby wrote: ... > Ironically, with iterator comprehension in place, a list comprehension > would now look like a list containing an iterator, which I agree might be > confusing. Too bad we didn't do iterator comps first, or list(itercomp) > would be the idiomatic way to make a listcomp. Yes. But don't mind me, I'm still sad that we have range and xrange when iter(a:b) and list(a:b:c) would be SUCH good replacements for them if slicing-notation was accepted elsewhere than in indexing, or iter[a:b] and list[a:b:c] if some people didn't so strenuously object to certain perfectly harmless uses of indexing...;-) > That's really the only confusing bit I see about itercomps... that you > have to be careful where you put your parentheses, in order to make your > intentions clear in some contexts. However, that's true for many kinds of > expressions even now. Yes. But since iterator comprehensions are being designed from scratch I think we can MANDATE parentheses around them, and a 'yield' right after the open parenthesis for good measure, to ensure they are not ambiguous to human readers as well as to parsers. Alex From FBatista at uniFON.com.ar Fri Oct 17 17:33:48 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 17 17:34:34 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: #- Good, but the name seems ambiguous -- I would expect 'money' #- to include #- a *currency unit*, while these are just numbers. 
E.g., #- these days for me a #- "money amount" of "1000" isn't immediately significant -- #- does it mean "old #- liras", Euros, SEK, ...? If a clearer name (perhaps #- Decimal?) was adopted, #- the type's purposes would be also clearer, perhaps. Specifically it doesn't differentiate it. It is printed with a '$' prefix, but that's all. The name really is a problem. Decimal doesn't imply the different rounding. #- > 6. About repr(). Should ``myMoney == eval(repr(myMoney))``? #- #- I don't see why not. OK, should. But must? #- > 3. Not to support strings with engineer notation (you #- don't need this when #- > using money). #- #- Actually, with certain very depreciated currencies exponent #- notation would #- be VERY handy to have. E.g., given than a Euro is worth #- 1670000 Turkish #- Liras today, you have to count zeros accurately when expressing any #- substantial amount in Turkish Liras -- exponential notation #- would help. You got me. Taking note. #- > 10. To support the built-in methods: #- #- I think you mean functions, not methods, in Python terminology. #- #- > - min, max #- > - float, int, long (int and long are rounded by Money) #- #- Rounding rather than truncation seems strange to me here. To me too. It could be truncated, and if you want to round m to zero precision, you can always use Money(m, 0). #- > 11. To have methods that return its components. The value #- of Money will be #- > ``(int part) + (frac part) / (10 ** precision)``. #- > #- > - ``getPrecision()``: the precision #- > - ``getFracPart()``: the fractional part (as long) #- > - ``getIntPart()``: the int part (as long) #- #- Given we're talking about Python and not Java, I would #- suggest read-only #- accessors (like e.g. the complex type has) rather than #- accessor methods. #- E.g., x.precision , x.fraction and x.integer rather than #- x.getPrecision() etc. Nice. #- > But when the digit at the right of that position is #- ==5.
There, if the #- > digit at the left of that position is odd, it gets incremented, #- > otherwise #- > isn't:: #- > #- > 1.125 --> 1.12 #- > 1.135 --> 1.14 #- #- I don't think these are the rules in the European Union #- (they're popular #- in statistics, but, I suspect, not legally correct in #- accounting). I can try #- to research that, if you need me to. Please. Because I found it in FixedPoint, and researching, think that in Argentina that's the way banks get rounded money. From guido at python.org Fri Oct 17 17:45:43 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:46:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 22:55:43 +0200." <200310172255.43697.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <200310172255.43697.aleaxit@yahoo.com> Message-ID: <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> [Guido] > > Let's look for an in-line generator notation instead. I like > > > > sum((yield x for x in S)) [Alex] > So do I, _with_ the mandatory extra parentheses and all, and in > fact I think it might be even clearer with the extra colon that Phil > had mentioned, i.e. > > sum((yield: x for x in S)) > > > but perhaps we can make this work: > > > > sum(x for x in S) > > Perhaps the parser can be coerced to make this work, but the > mandatory parentheses, the yield keyword, and possibly the colon, > too, may all help, it seems to me, in making this syntax stand > out more. Hm. I'm not sure that it *should* stand out more. The version with the yield keyword and the colon draws undue attention to the mechanism. 
I bet that if you showed sum(x for x in range(10)) to a newbie they'd have no problem understanding it (their biggest problem would be that range(10) is [0, 1, ..., 9] rather than [1, 2, ..., 10]) but if you showed them sum((yield: x for x in S)) they would probably scratch their heads. I also note that if it wasn't for list comprehensions, the form <expr> for <var> in <seq> poses absolutely no problems to the parser, since it's just a ternary operator (though the same is true for the infamous <x> if <cond> else <y> :-). List comprehensions make this a bit difficult because they use the same form in a specific context for something different; at the very best this would mean that [x for x in S] and [(x for x in S)] are completely different beasts: the first would be equivalent to list(S) while the second would be equivalent to [iter(S)] i.e. a list whose only element is an iterator over S (not a very useful thing to have, except perhaps if you had a function taking a list of iterators as an argument). > Yes, some uses may "read" more naturally with as > little extras as feasible, notably [examples that might be better > done with list comprehensions except for _looks_...]: > > even_digits = Set(x for x in range(0, 10) if x%2==0) > > versus > > even_digits = Set((yield: x for x in range(0, 10) if x%2==0)) > > but that may be because the former notation leads back to > the "set comprehensions" that list comprehensions were > originally derived from. I don't think it's that clear in other > cases which have nothing to do with sets, such as, e.g., > Peter Norvig's original examples of "accumulator displays".
Let's go over the examples from http://www.norvig.com/pyacc.html : [Sum: x*x for x in numbers] sum(x*x for x in numbers) [Product: Prob_spam(word) for word in email_msg] product(Prob_spam(word) for word in email_msg) [Min: temp(hour) for hour in range(24)] min(temp(hour) for hour in range(24)) [Mean: f(x) for x in data] mean(f(x) for x in data) [Median: f(x) for x in data] median(f(x) for x in data) [Mode: f(x) for x in data] mode(f(x) for x in data) So far, these can all be written as simple functions that take an iterable argument, and they look as good with an iterator comprehension as with a list argument. [SortBy: abs(x) for x in (-2, -4, 3, 1)] This one is a little less obvious, because it requires the feature from Norvig's PEP that if add() takes a second argument, the unadorned loop control variable is passed in that position. It could be done with this: sortby((abs(x), x) for x in (-2, 3, 4, 1)) but I think that Raymond's code in CVS is just as good. :-) Norvig's Top poses no problem: top(humor(joke) for joke in jokes) In conclusion, I think this syntax is pretty cool. (It will probably die the same death as the ternary expression though.) > And as soon as you consider the notation being used in > any situation EXCEPT as the ONLY argument in a call...: Who said that? I fully intended it to be an expression, acceptable everywhere, though possibly requiring parentheses to avoid ambiguities (in list comprehensions) or excessive ugliness (e.g. to the right of 'in' or 'yield'). > foo(x, y for y in glab for x in blag) > > yes, I know this passes ONE x and one iterator, because > to pass one iterator of pairs one would have to write > > foo((x, y) for y in glab for x in blag) > > but the distinction between the two seems quite error > prone to me. 
It would require extra parentheses here: foo(x, (y for y in glab for x in blag)) > BTW, semantically, it WOULD be OK for > these iterator comprehension to NOT "leak" their > control variables to the surrounding scope, right...? Yes. (I think list comprehensions shouldn't do this either; it's just a pain to introduce a new scope; maybe such control variables should simply be renamed to "impossible" names like the names used for the anonymous first argument to f below: def f((a, b), c): ... > I > do consider the fact that list comprehensions "leak" that > way a misfeature, and keep waiting for some fanatic of > assignment-as-expression to use it IN EARNEST, e.g., > to code his or her desired "while c=beep(): boop(c)", use > > while [c for c in [beep()] if c]: > boop(c) > > ...:-). Yuck. Fortunately that would be quite slow, and the same fanatics usually don't like that. :-) > Anyway, back to the subject, those calls to foo seem > very error-prone, while: > > foo(x, (yield: y for y in glab for x in blag)) > > (mandatory extra parentheses, 'yield', and colon) seems > far less likely to cause any such error. I could live with the extra parentheses. Then we get: (x for x in S) # iter(S) [x for x in S] # list(S) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Fri Oct 17 17:45:26 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 17:46:19 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031017224924.L14453@prim.han.de> Message-ID: <002c01c394f7$fa270500$e841fea9@oemcomputer> [Gustavo Niemeyer wrote] > > > If anything at all, i'd suggest a std-module which contains e.g. > > > 'sort', 'reverse' and 'extend' functions which always return > > > a new list, so that you could write: > > > > > > for i in reverse(somelist): > > > ... sort: This is being addressed by the proposed list.copysort() method reverse: This is being addressed by PEP-0322.
When I get a chance, the PEP will be revised to propose a builtin instead of various methods attached to specific sequence objects. extend: How would this differ from itertools.chain() ? > > You can do reverse with [::-1] now. [Holger Krekel] > sure, but it's a bit unintuitive and i mentioned not only reverse :-) > > Actually i think that 'reverse', 'sort' and 'extend' algorithms > could nicely be put into the new itertools module. > > There it's obvious that they wouldn't mutate objects. And these > algorithms > (especially extend and reverse) would be very efficient as iterators > because > they wouldn't create temporary lists/tuples. To be considered as a possible itertool, an ideal candidate should: * work well in combination with other itertools * be a fundamental building block * accept all iterables as inputs * return only an iterator as an output * run lazily so as not to force the inputs to run to completion unless externally requested by list() or some such. * consume constant memory (this rule was bent for itertools.cycle(), but should be followed as much as possible). * run finitely if some of the inputs are finite (itertools.repeat(), count() and cycle() are the only intentionally infinite tools) There is no chance for isort(). Once you've sorted the whole list, there is no advantage to returning an iterator instead of a list. The problem with ireverse() is that it only works with objects that support __getitem__() and len(). That pretty much precludes generators, user defined class based iterators, and the outputs from other itertools. So, while it may make a great builtin (which is what PEP-322 is going to propose), it doesn't fit in with other itertools. Raymond Hettinger From guido at python.org Fri Oct 17 17:46:32 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:46:52 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 14:10:20 PDT." 
References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: <200310172146.h9HLkWB07494@12-236-54-216.client.attbi.com> > Along with that confusion, (x*x for x in S) would look like a tuple > comprehension, rather than a bare iterator. Well, () is already heavily overloaded, so I can live with that. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 17:48:33 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:48:40 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 14:14:41 PDT." References: <200310171903.42578.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: <200310172148.h9HLmXk07520@12-236-54-216.client.attbi.com> > > >>> [(a,b) for (a,b) in zip(range(5), range(10))] > > [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)] > > >>> [a,b for (a,b) in zip(range(5), range(10))] > > File "", line 1 > > [a,b for (a,b) in zip(range(5), range(10))] > > ^ > > SyntaxError: invalid syntax > > This one has bitten me several times. > > When it does, I discover the error quickly due to the syntax error, Generally, when we talk about something "biting", we mean something that *doesn't* give a syntax error, but silently does something quite different than what you'd naively expect. This was made a syntax error specifically because of this ambiguity. 
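(That rejection is easy to demonstrate by compiling the ambiguous form directly -- a sketch, using the behavior as it still stands in modern Python:)

```python
# The unparenthesized form is refused at compile time, so it can never
# silently produce a list whose second element is an iterator.
ambiguous = "[a, b for (a, b) in zip(range(5), range(10))]"
try:
    compile(ambiguous, "<example>", "eval")
    print("accepted")
except SyntaxError:
    print("rejected")  # this branch runs

# The parenthesized form is fine:
print(eval("[(a, b) for (a, b) in zip(range(5), range(10))]")[:3])
```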
> but it would be bad if this became valid syntax and returned a list > [a,X] where X is an iterator. I don't think you could count on this > getting caught by a being unbound, because often the variables in > list comprehensions can be single letters that shadow previous > bindings. No, [a,X] would be a syntax error if X was an iterator comprehension. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 17:50:34 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:50:57 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 23:28:23 +0200." <200310172328.23057.aleaxit@yahoo.com> References: <200310172201.50930.aleaxit@yahoo.com> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> <200310172328.23057.aleaxit@yahoo.com> Message-ID: <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> > Yes. But don't mind me, I'm still sad that we have range and xrange > when iter(a:b) and list(a:b:c) would be SUCH good replacements for > them if slicing-notation was accepted elsewhere than in indexing, This has been proposed more than once (I think the last time by Paul Dubois, who wanted x:y:z to be a general expression), and has a certain elegance, but is probably too terse. --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Fri Oct 17 17:51:42 2003 From: janssen at parc.com (Bill Janssen) Date: Fri Oct 17 17:52:05 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 13:20:38 PDT." <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <03Oct17.145145pdt."58611"@synergy1.parc.xerox.com> > All the more reason not to like this. Why not just define the generator > function and call it? +1. 
Bill From aleaxit at yahoo.com Fri Oct 17 17:54:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:54:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <200310172354.32838.aleaxit@yahoo.com> On Friday 17 October 2003 10:20 pm, Skip Montanaro wrote: ... > Alex> Neither: it returns an iterator, _equivalent_ to the one that > Alex> would be returned by _calling_ a generator such as > > Alex> def xxx(): > Alex> for x in S: > Alex> yield x > > All the more reason not to like this. Why not just define the generator > function and call it? The usual problems: having to use several separate statements, and name something that you are only interested in using once, is a bit conceptually cumbersome when you could use a clear inline expression "right where you need it" for the same purpose. Moreover, it seems a bit strange to be able to use the well-liked comprehension syntax only at the price of storing all intermediate steps in memory -- and have to zoom up to several separate statements + a name if you'd rather avoid the memory overhead, e.g.: sum( [x+x*x for x in short_sequence if x >0] ) is all right, BUT if the sequence becomes too long then def gottagiveitaname(): for x in long_sequence: if x>0: yield x+x*x sum( gottagiveitaname() ) That much being said, I entirely agree that the proposal is absolutely NOT crucial to Python -- it will not enormously expand its power nor its range of applicability. I don't think it's SO terribly complicated to require application of such extremely high standards, though. 
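(The memory tradeoff Alex describes can be made concrete with the parenthesized syntax Python 2.4 eventually adopted as generator expressions -- a sketch:)

```python
import sys

n = 1_000_000
squares_list = [x + x * x for x in range(n) if x > 0]  # materializes every item now
squares_gen = (x + x * x for x in range(n) if x > 0)   # produces items on demand

# The list pays for n stored results; the generator is a small fixed-size object.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True
print(sum(squares_gen) == sum(squares_list))                     # True: same values
```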
But if the consensus is that ONLY lists are important enough to deserve the beauty of comprehensions, and EVERY other case must either pay the memory price of a list or the conceptual one of calling and then invoking a one-use-only generator, so be it, I guess. > While Perl sprouts magical punctuation, turning its syntax into line noise, > Python seems to be sprouting multiple function-like things. We have > > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by calling a > generator function?) > * instances magically callable via __call__ Every one of these was in Python when I first met it, except generators -- and iterators, which are NOT function-like in the least, nor "invisible" (often, an iterator is an instance of an explicitly coded class or type with a next() method). You seem to have forgotten lambda, though -- and classes/types (all callable -- arguably via __call__ in some sense, but you could say just the same of functions &c). Which ALSO were in Python when I first met it. So, I see no "sprouting" -- Python has "always" (from my POV) had a wide variety of callables. > and now this new (rather limited) syntax for creating iterators. ...which isn't function-like either, neither in syntax nor in semantics. Yes, it's limited -- basically to the same cases as list comprehensions, except that (being an iterator and not a list) there is no necessary implication of finiteness. > I am beginning to find it all a bit confusing and unsettling. I hear you, and I worry about this general effect on you, but I do not seem to be able to understand the real reasons. Any such generalized objection from an experienced Pythonista like you is well worthy of making everybody sit up and care, it seems to me. But exactly because of that, it might help if you were able to articulate your unease more precisely. 
Python MAY well have accumulated a few too many things in its long, glorious story -- because (and for good reason!) we keep the old cruft around for backwards compatibility, any change means (alas) growth. Guido is on record as declaring that release 3.0 will be about simplification: removing some of the cruft, taking advantage of the 2->3 bump in release number to break a little bit (not TOO much) backwards compatibility. Is Python so large today that we can't afford another release, 2.4, with _some_ kind of additions to the language proper, without confusing and unsettling long-time, experienced, highly skilled Pythonistas like you? Despite the admirable _stationariety_ of the language proper throughout the 2.2 and 2.3 eras...? If something like that is your underlying feeling, it may be well worth articulating -- and perhaps we need to sit back and listen and take stock (hey, I'd get to NOT have to write another edition of the Nutshell for a while -- maybe I should side strongly with this thesis!-). If it's something else, more specific to this set of proposals for accumulators / comprehensions, then maybe there's some _area_ in which any change is particularly unwelcome? But I can't guess with any accuracy... Alex From python at rcn.com Fri Oct 17 17:55:11 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 17:55:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172255.43697.aleaxit@yahoo.com> Message-ID: <002d01c394f9$56b6e460$e841fea9@oemcomputer> [GvR] > > Which is why I didn't like the 'sum[x for x in S]' notation much. [Alex] > Let it rest in peace, then. Goodbye, weird __getitem__ hack! [GvR] > > Let's look for an in-line generator notation instead. I like > > > > sum((yield x for x in S)) [Alex] > So do I, _with_ the mandatory extra parentheses and all, and in > fact I think it might be even clearer with the extra colon that Phil > had mentioned, i.e. 
> > sum((yield: x for x in S)) +1 [David Eppstein, in a separate note] > Along with that confusion, (x*x for x in S) would look like a tuple > comprehension, rather than a bare iterator. Phil's idea cleans that up pretty well: (yield: x*x for x in S) This is no more tuple-like than any expression surrounded by parens. Raymond Hettinger From pje at telecommunity.com Fri Oct 17 17:58:06 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 17:58:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172255.43697.aleaxit@yahoo.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> At 10:55 PM 10/17/03 +0200, Alex Martelli wrote: >while [c for c in [beep()] if c]: > boop(c) > >..:-). That is positively *evil*. Good thing you didn't post it on python-list. :) >Anyway, back to the subject, those calls to foo seem >very error-prone, while: > >foo(x, (yield: y for y in glab for x in blag)) > >(mandatory extra parentheses, 'yield', and colon) seems >far less likely to cause any such error. And also much uglier. Even though I originally proposed it, I like Guido's version (sans yield) much better. OTOH, I can also see where the "tuple comprehension" and other possible confusing uses seem to shoot it down. Hm. What if list comprehensions returned a "lazy list", that if you took an iterator of it, you'd get a generator-iterator, but if you tried to use it as a list, it would populate itself? Then there'd be no need to ever *not* use a listcomp, and only one syntax would be necessary. More specifically, if all you did with the list was iterate over it, and then throw it away, it would never actually populate itself. 
The principal drawback to this idea from a semantic viewpoint is that listcomps can be done over expressions that have side-effects. :( From martin at v.loewis.de Fri Oct 17 18:00:54 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:01:19 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031017223015.N64463@bullseye.apana.org.au> References: <20031016225058.GB19133@ibook.distro.conectiva> <20031017223015.N64463@bullseye.apana.org.au> Message-ID: Andrew MacIntyre writes: > Because of the stack recursion issue on FreeBSD (in the presence of > threads), I tested several of Gustavo's patches. I didn't scrutinise them > for style though... It's not primarily style that I'm concerned about, but hard-to-find-in-testing bugs, such as memory leaks, bad decrefs, incompatibilities in boundary cases, and so on. Regards, Martin From aleaxit at yahoo.com Fri Oct 17 18:05:24 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:06:09 2003 Subject: Currying with instancemethod (was Re: [Python-Dev] accumulator display syntax) In-Reply-To: <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> References: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> Message-ID: <200310180005.24686.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Phillip J. Eby wrote: ... > At 10:40 PM 10/17/03 +0200, Alex Martelli wrote: > >Yes, def curry(func, arg): return new.instancemethod(func, arg, object) ... > >def curry(func, arg): > > def curried(*args): return func(arg, *args) > > return curried ... > It is a big win if the curried function will be used in a > performance-sensitive way. Instance method objects don't pay for setting > up an extra frame object, and for the single curried argument, the > interpreter even shortcuts some of the instancemethod overhead!
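(For reference, the two currying styles being compared, restated for modern Python -- this is the editor's translation, not part of the original exchange: types.MethodType stands in for the long-removed new.instancemethod, and functools.partial is today's idiomatic spelling:)

```python
import types
from functools import partial

def f(a, b, c):
    return a, b, c

def curry_closure(func, arg):
    # The closure-based version from the thread.
    def curried(*args):
        return func(arg, *args)
    return curried

bound = types.MethodType(f, 23)   # binds 23 as f's first argument
closed = curry_closure(f, 23)
part = partial(f, 23)

print(bound(45, 67))   # (23, 45, 67)
print(closed(45, 67))  # (23, 45, 67)
print(part(45, 67))    # (23, 45, 67)
```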
So, if I You're right: the instancemethod version has impressively better performance (should the curried function be used in a bottleneck, of course) -- i.e., given a.py: import new def curry1(func, arg): return new.instancemethod(func, arg, object) def curry2(func, arg): def curried(*args): return func(arg, *args) return curried def f(a, b, c): return a, b, c I've measured: [alex@lancelot ba]$ timeit.py -c -s' import a g = a.curry2(a.f, 23) ' 'g(45, 67)' 100000 loops, best of 3: 2 usec per loop [alex@lancelot ba]$ timeit.py -c -s' import a g = a.curry1(a.f, 23) ' 'g(45, 67)' 1000000 loops, best of 3: 1.09 usec per loop I sure didn't expect an almost 2:1 ratio, while you did predict it. Alex From martin at v.loewis.de Fri Oct 17 18:07:33 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:07:51 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: Message-ID: "David LeBlanc" writes: > What's the cost of mapping the world (all those entry points) at startup? I believe it is measurable. It also adds maintenance costs to have extension modules, both in terms of the build procedure, and in packaging. > You have to rebuild all of the main dll just to do something to one > component. To me, that's maybe the biggest single issue. When did you last wish to rebuild one of the modules without having a PCBuild directory in the first place? If that ever happened, which module did you wish to rebuild and why? > Any possibility of new bugs? Not likely. > Are app users/programmers going to have a bloat perception? This is possible; it appears that all readers who, in this thread, have spoken in favour of keeping the status quo have done so because of a bloat perception. > IMO, it contradicts the unix way of smaller, compartmentalized is better.
I dislike the usage of shared libraries on Unix, and still hope that the Python build procedure becomes sane again by reducing its usage of shared extension modules, in favour of a single complete binary. > It's not unix we're talking about, but it still makes sense to me, whatever > the OS. It makes no sense to me whatsoever. > On a related side note: has anyone done any investigation to > determine which few percentage of the extensions account for 99% of > the dll loads? Do you have any specific concerns beyond FUD? Regards, Martin From seandavidross at hotmail.com Fri Oct 17 18:08:08 2003 From: seandavidross at hotmail.com (Sean Ross) Date: Fri Oct 17 18:08:27 2003 Subject: [Python-Dev] accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <200310172255.43697.aleaxit@yahoo.com> Message-ID: Hello. Perhaps looking at some examples of what nested itercomps might look like (because they _will_ be used if they're available...) using each of the leading syntaxes would be useful in trying to decide which form, if any, is most acceptable (or least unacceptable, whichever the case may be): # (1) without parentheses: B(y) for y in A(x) for x in myIterable # (2) for clarity, we'll add some optional parentheses: B(y) for y in (A(x) for x in myIterable) # (3) OK. Now, with required parentheses: (B(y) for y in (A(x) for x in myIterable)) # (4) And, now with the required "yield:" and parentheses: (yield: B(y) for y in (yield: A(x) for x in myIterable)) #(5) And, finally, for completeness, using the rejected PEP 289 syntax: [yield B(y) for y in [yield A(x) for x in myIterable]] Hope that's useful, Sean p.s. I'm only a Python user, and not a developer, so if my comments are not welcome here, please let me know, and I will refrain in future. Thanks for your time. 
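[Editor's note: for reference, the nested pipeline that all five of Sean's spellings denote can be written with plain generator functions, the only lazy tool already in the language; `A`, `B`, and `myIterable` are hypothetical stand-ins, just as in his examples.]

```python
def apply_lazily(func, iterable):
    # Equivalent of (func(x) for x in iterable): one result at a time.
    for x in iterable:
        yield func(x)

# Hypothetical stand-ins for A, B, and myIterable:
def A(x):
    return x + 1

def B(y):
    return y * 2

myIterable = range(3)

# B(y) for y in (A(x) for x in myIterable)
pipeline = apply_lazily(B, apply_lazily(A, myIterable))
print(list(pipeline))  # [2, 4, 6]
```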
From aleaxit at yahoo.com Fri Oct 17 18:09:09 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:09:13 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> References: <200310172201.50930.aleaxit@yahoo.com> <200310172328.23057.aleaxit@yahoo.com> <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> Message-ID: <200310180009.09257.aleaxit@yahoo.com> On Friday 17 October 2003 11:50 pm, Guido van Rossum wrote: > > Yes. But don't mind me, I'm still sad that we have range and xrange > > when iter(a:b) and list(a:b:c) would be SUCH good replacements for > > them if slicing-notation was accepted elsewhere than in indexing, > > This has been proposed more than once (I think the last time by Paul > Dubois, who wanted x:y:z to be a general expression), and has a > certain elegance, but is probably too terse. Perhaps mandatory parentheses around it (as sole argument in a function call, say) might make it un-terse enough for acceptance...? The frequence of counted loops IS such that replacing for x in range(9): ... with for x in (0:9): ... WOULD pay for itself soon in reduced wear and tear on keyboards...;-) [Using iter(0:9) instead would be only "conceptually neat", no typing advantage on range -- conceded]. Alex From martin at v.loewis.de Fri Oct 17 18:10:49 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:11:10 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: Thomas Heller writes: > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 I think Guido's point that you should be able to build pythonxy.dll without downloading additional source is good, so _csv, winsound, datetime would go in, and bz2 would stay out. Regards, Martin From martin at v.loewis.de Fri Oct 17 18:14:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:14:49 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <3F90367D.200@ieee.org> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <3F90367D.200@ieee.org> Message-ID: "Shane Holloway (IEEE)" writes: > > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata > > Neither do I. Although unicodedata is fairly big. As I tried to explain: the size of the library is relatively irrelevant, atleast for performance (it might matter for py2exe-style standalone binary production). What matters (as Guido explains) is whether you need additional libraries to download or link with, which is not the case for either of these modules. Regards, Martin From aleaxit at yahoo.com Fri Oct 17 18:14:51 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:14:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> Message-ID: <200310180014.51336.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Guido van Rossum wrote: ... 
> In conclusion, I think this syntax is pretty cool. (It will probably > die the same death as the ternary expression though.) Ah well -- in this case I guess I won't go to the bother of deciding whether I like your preferred "lighter" syntax or the "stands out more" one. The sad, long, lingering death of the ternary expression was too painful to repeat -- let's put this one out of its misery sooner. Alex From aleaxit at yahoo.com Fri Oct 17 18:18:15 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:18:23 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002c01c394f7$fa270500$e841fea9@oemcomputer> References: <002c01c394f7$fa270500$e841fea9@oemcomputer> Message-ID: <200310180018.15836.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Raymond Hettinger wrote: ... > To be considered as a possible itertool, an ideal candidate should: Very nice set of specs! Which reminds me: why don't we have take(n, it) and drop(n, it) there? I find myself rewriting those quite often. Alex From pje at telecommunity.com Fri Oct 17 17:45:32 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 18:24:57 2003 Subject: Currying with instancemethod (was Re: [Python-Dev] accumulator display syntax) In-Reply-To: <200310172240.28322.aleaxit@yahoo.com> References: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> At 10:40 PM 10/17/03 +0200, Alex Martelli wrote: >Yes, def curry(func, arg): return new.instancemethod(func, arg, object) >IS indeed way more general than func.__get__(arg) [notably, you get to >call it repeatedly to curry more than one argument, from the left].
But >if you have to define a curry function anyway, it's not a huge win vs > >def curry(func, arg): > def curried(*args): return func(arg, *args) > return curried > >or indeed more general variations thereof such as > >def curry(func, *curried_args): > def curried(*args): return func(*(curried_args+args)) > return curried It is a big win if the curried function will be used in a performance-sensitive way. Instance method objects don't pay for setting up an extra frame object, and for the single curried argument, the interpreter even shortcuts some of the instancemethod overhead! So, if I were taking the time to write a currying function, I'd probably implement your latter version by chaining instance methods. (Of course, I'd also want to test to find out how many I could chain before the frame overhead was less than the chaining overhead.) Whenever I've run into a performance problem in Python (usually involving loops over 10,000+ items), I've almost invariably found that the big culprit is how many (Python) function calls happen in the loop. In such cases, execution time is almost linearly proportional to how many function calls happen, and inlining functions or resorting to a Pyrex version of the same function can often eliminate the performance problem on that basis alone. (For the Pyrex conversion, I have to use PyObject_GetAttr() in place of Pyrex's native attribute access, because it otherwise uses GetAttrString(), which seems to often make up for the lack of frame creation overhead.) 
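[Editor's note: a minimal sketch of the bound-method curry Phillip describes, including the chaining he mentions; it uses `types.MethodType`, the modern spelling of the era's `new.instancemethod(func, arg, object)`.]

```python
import types

def curry(func, arg):
    # Bind arg as the "self" of func: calling the result prepends arg
    # without the overhead of an extra Python stack frame.
    return types.MethodType(func, arg)

def f(a, b, c):
    return a, b, c

g = curry(f, 23)
print(g(45, 67))   # (23, 45, 67)

# Chaining binds further arguments from the left:
h = curry(g, 45)
print(h(67))       # (23, 45, 67)
```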
From pyth at devel.trillke.net Fri Oct 17 18:27:42 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Fri Oct 17 18:27:50 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002c01c394f7$fa270500$e841fea9@oemcomputer>; from python@rcn.com on Fri, Oct 17, 2003 at 05:45:26PM -0400 References: <20031017224924.L14453@prim.han.de> <002c01c394f7$fa270500$e841fea9@oemcomputer> Message-ID: <20031018002742.M14453@prim.han.de> Raymond Hettinger wrote: > [Gustavo Niemeyer wrote] > > > > If anything at all, i'd suggest a std-module which contains e.g. > > > > 'sort', 'reverse' and 'extend' functions which always return > > > > a new list > > > > a new list, so that you could write: > > > > > > > > for i in reverse(somelist): > > > > ... > > sort: This is being addressed by the proposed list.copysort() method > reverse: This is being addressed by PEP-0322. When I get a chance, > the PEP will be revised to propose a builtin instead of > various methods attached to specific sequence objects. > extend: How would this differ from itertools.chain() ? pointing someone to these three different specific (somewhat limited) solutions for the "i want reverse/sort/extend/... not to work inplace but on-the-fly" requirement seems tedious. > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. Providing a uniform concept counts as an advantage IMO. Agreed, performance wise there probably is no advantage with the current sorting-algorithm. > The problem with ireverse() is that it only works with objects that > support __getitem__() and len(). That pretty much precludes > generators, user defined class based iterators, and the outputs > from other itertools. So, while it may make a great builtin (which > is what PEP-322 is going to propose), it doesn't fit in with other > itertools. 
I wouldn't mind if reverse would - as a fallback - suck all elements and then spit them out in reverse order. After all, you sometimes want to process yielded values from an iterator in reverse order and there is not much else you can do than to exhaust the iterator. cheers, holger From Scott.Daniels at Acm.Org Fri Oct 17 18:30:44 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Fri Oct 17 18:30:58 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F906D94.5030902@Acm.Org> [Raymond Hettinger] > To be considered as a possible itertool, an ideal candidate should: > * work well in combination with other itertools > * be a fundamental building block > * accept all iterables as inputs > * return only an iterator as an output > * run lazily so as not to force the inputs to run to completion > unless externally requested by list() or some such. > * consume constant memory (this rule was bent for itertools.cycle(), > but should be followed as much as possible). > * run finitely if some of the inputs are finite (itertools.repeat(), > count() and cycle() are the only intentionally infinite tools) > > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. Actually, some case can be made: loading prepares a heap, iterating extracts the heap top. sit = isort(someiter) sit.next() is the winner. Then sit.next() is second-place (or a tie with the winner). q = sit.next() [q] + takewhile(lambda x: x==q, sit) is all who tied with the runner-up. Which isn't to say I think it fits. But there are reasons to get everything and then dole out parts. -Scott David Daniels Scott.Daniels@Acm.Org From theller at python.net Fri Oct 17 18:42:26 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 18:42:31 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: (Martin v.'s message of "18 Oct 2003 00:10:49 +0200") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <8ynj1o6l.fsf@python.net> martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) writes: > Thomas Heller writes: > >> I'm undecided on these modules, I do not use them now but may in the >> future - so I'm undecided: >> >> _csv winsound datetime bz2 > > I think Guido's point that you should be able to build pythonxy.dll > without downloading additional source is good, so _csv, winsound, > datetime would go in, and bz2 would stay out. Yes, and _ssl would also stay out (it seems I forgot to list it). The only module needing external source is zlib - and this is one I care about because it may be useful for zipimport of compressed modules. Can't we simply import the zlib sources into Python's CVS? Thomas From aleaxit at yahoo.com Fri Oct 17 18:44:47 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:44:51 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310180044.47083.aleaxit@yahoo.com> On Friday 17 October 2003 11:33 pm, Batista, Facundo wrote: ... > #- I don't think these are the rules in the European Union > #- (they're popular > #- in statistics, but, I suspect, not legally correct in > #- accounting). I can try > #- to research that, if you need me to. > > Please. Because I found it in FixedPoint, and researching, think that in > Argentina that's the way banks get rounded money. Found it -- article 5 of the Council Regulation which established the Euro a few years ago is titled "Rounding" and specifies (I quote selectively): """ shall be rounded up or down to the nearest cent ... If ... a result ... is exactly half-way, the sum shall be rounded up. 
""" The regulation goes on to show cases in which two conversions back and forth (EUR to/from older currencies) can lose or gain a cent, and specifies: """ such difference cannot be invoked to dispute the correctness of payments. The difference must be allowed as a 'tolerance' insofar as it results from the application of the European Regulation. This 'tolerance' should also be incorporated in data processing programmes, especially accounting programmes, in order to avoid problems connected with the reconciliation of amounts. """ The Visual Basic FAQ, for example, explicitly warns that VB does *NOT* respect the legal requirements of Euro conversion rules. The Euro rules are summarized in the FAQ as: """ When rounding to an x number of decimals, the last decimal must be: - Rounded down (i.e. left alone) when the following decimal (if any) is 4 or less. - Rounded up when the following decimal is 5 or more. """ while VB's rules are: """ If after the digit that is to be rounded, the digits following are exactly equal to 5, the value is rounded to the NEAREST EVEN NUMBER. """ (I _think_ it means "the digit ... is", NOT "the digits ... are"). In fact, follow-ons clarify that VB isn't fully coeherent on these rules (hah). But the point remains: rounding half a cent to even rather than always up violates European Union law; nor can the "tolerance rule" be invoked, because it's specifically limited to one-cent discrepancies that "result from the application of the European Regulation", while this one would result from the _violation_ thereof. Oh BTW, other sites quite explicitly state that the rule applies throughout the EU, _not_ only to countries that have adopted the Euro. FWIW, Rogue Wave's Money class lets you specify _either_ rounding approach -- ROUND_PLAIN specifies EU-rules-compliant rounding, ROUND_BANKERS specifies round-to-even, for exactly in-between amounts. 
Offhand, it would seem impossible to write an accounting program that respects the law in Europe AND the praxis you mention at the same time, unless you somehow tell it what rule to use. Sad, and seems weird to go to such trouble for a cent, but accountants live and die by such minutiae: I think it would not be wise to ignore them, PARTICULARLY if we name the type so as to make it appear to the uninitiated that it "will do the right thing" regarding rounding... when there isn't ONE right thing, it depends on locale &c:-(. Alex From python at rcn.com Fri Oct 17 18:46:54 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 18:47:36 2003 Subject: [Python-Dev] RE: itertools, was RE: list.sort In-Reply-To: <200310180018.15836.aleaxit@yahoo.com> Message-ID: <003201c39500$9006a8c0$e841fea9@oemcomputer> [Raymond] > > To be considered as a possible itertool, an ideal candidate should: [Alex] > Very nice set of specs! Thanks! > Which reminds me: why don't we have take(n, it) > and drop(n, it) there? I find myself rewriting those quite often. Yeah, me too. When you write them, do they return lists or iterators? For me, take() has been most useful in list form, but my point of view is biased because I use it to experiment with itertool suggestions and need an easy way manifest a portion of a potentially infinite iterator. My misgivings about drop() and take() are, firstly, that they are expressible in-terms of islice() so they don't really add any new capability. Secondly, the number of tools needs to be kept to a minimum -- already, the number of tools is large enough to complicate the task of figuring out how to use them in combination -- the examples page in the docs is intended, in part, to record the best discoveries so they won't have to be continually re-invented. Raymond Hettinger P.S. 
Itertool tip for the day: to generate a stream of random numbers, write: starmap(random.random, repeat(())) From guido at python.org Fri Oct 17 18:47:28 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 18:48:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 17:58:06 EDT." <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <200310172247.h9HMlSI07690@12-236-54-216.client.attbi.com> > Hm. What if list comprehensions returned a "lazy list", that if you took > an iterator of it, you'd get a generator-iterator, but if you tried to use > it as a list, it would populate itself? Then there'd be no need to ever > *not* use a listcomp, and only one syntax would be necessary. > > More specifically, if all you did with the list was iterate over it, and > then throw it away, it would never actually populate itself. The principle > drawback to this idea from a semantic viewpoint is that listcomps can be > done over expressions that have side-effects. :( I don't think this can be done without breaking b/w compatibility. Example: a = [x**2 for x in range(10)] for i in a: print i print a Your proposed semantics would throw away the values in the for loop, so what would it print in the third line? --Guido van Rossum (home page: http://www.python.org/~guido/) From mike at nospam.com Fri Oct 17 19:01:45 2003 From: mike at nospam.com (Mike Rovner) Date: Fri Oct 17 19:01:44 2003 Subject: [Python-Dev] Re: prePEP: Money data type References: Message-ID: Batista, Facundo wrote: > #- Good, but the name seems ambiguous -- I would expect 'money' > #- to include > #- a *currency unit*, while these are just numbers.
E.g., > #- these days for me a > #- "money amount" of "1000" isn't immediately significant -- > #- does it mean "old > #- liras", Euros, SEK, ...? If a clearer name (perhaps > #- Decimal?) was adopted, > #- the type's purposes would be also clearer, perhaps. > > Specifically it doesn't diferenciate it. It is printed with a '$' > prefix, but that's all. >From the prePEP it's not clear (for me) the purpose of curencySymbol. If it's intended for localisation, then prefix isn't enough, some countries use suffix or even such format Money(123.45, 2) --> 123 FF 45 GG where FF is suffix1 and GG is suffix2. Regards, Mike PS. If it's not appropriate to post such comments to c.l.p.dev, just tell me. From pje at telecommunity.com Fri Oct 17 19:06:22 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 19:06:25 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172247.h9HMlSI07690@12-236-54-216.client.attbi.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017185718.0256f490@mail.telecommunity.com> At 03:47 PM 10/17/03 -0700, Guido van Rossum wrote: > > Hm. What if list comprehensions returned a "lazy list", that if you took > > an iterator of it, you'd get a generator-iterator, but if you tried to use > > it as a list, it would populate itself? Then there'd be no need to ever > > *not* use a listcomp, and only one syntax would be necessary. > > > > More specifically, if all you did with the list was iterate over it, and > > then throw it away, it would never actually populate itself. The > principle > > drawback to this idea from a semantic viewpoint is that listcomps can be > > done over expressions that have side-effects. 
:( > >I don't think this can be done without breaking b/w compatibility. Example: > > a = [x**2 for x in range(10)] > for i in a: print i > print a > >Your proposed semantics would throw away the values in the for loop, >so what would it print in the third line? I should've been more specific... some pseudocode: class LazyList(list): materialized = False def __init__(self, generator_func): self.generator = generator_func def __iter__(self): # When iterating, use the generator, unless # we've already computed contents. if self.materialized: return super(LazyList,self).__iter__() else: return self.generator() def __getitem__(self,index): if not self.materialized: self[:] = list(self.generator()) self.materialized = True return super(LazyList,self).__getitem__(index) def __len__(self): if not self.materialized: self[:] = list(self.generator()) self.materialized = True return super(LazyList,self).__len__() # etc. So, the problem isn't that the code you posted would fail on 'print a', it's that the generator function would be run *twice*, which would be a no-no if it had side effects, and would also take longer. It was just a throwaway idea, in the hopes that maybe it would lead to an idea that would actually work. Ah well, maybe in Python 3.0, there'll just be itercomps, and we'll use list(itercomp) when we want a list. From aleaxit at yahoo.com Fri Oct 17 19:29:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 19:29:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <200310180129.42506.aleaxit@yahoo.com> On Friday 17 October 2003 11:58 pm, Phillip J. Eby wrote: ... > Hm. 
What if list comprehensions returned a "lazy list", that if you took > an iterator of it, you'd get a generator-iterator, but if you tried to use > it as a list, it would populate itself? Then there'd be no need to ever > *not* use a listcomp, and only one syntax would be necessary. > > More specifically, if all you did with the list was iterate over it, and > then throw it away, it would never actually populate itself. The principle > drawback to this idea from a semantic viewpoint is that listcomps can be > done over expressions that have side-effects. :( The big problem I see is e.g. as follows: l1 = range(6) lc = [ x for x in l1 ] for a in lc: l1.append(a) (or insert the LC inline in the for, same thing either way I'd sure hope). Today, this is perfectly well-defined, since the LC "takes a snapshot" when evaluated -- l1 becomes a 12-elements list, as if I had done l1 *= 2. But if lc _WASN'T_ "populated"... shudder... it would be as nonterminating as "for a in l1:" same loop body. Unfortunately, it seems to me that turning semantics from strict to lazy is generally unfeasible because of such worries (even if one could somehow ignore side effects). Defining semantics as lazy in the first place is fine: as e.g. "for a in iter(l1):" has always produced a nonterminating loop for that body (iter has always been lazy), people just don't use it. But once it has been defined as strict, going to lazy is probably unfeasible. Pity... Alex From aleaxit at yahoo.com Fri Oct 17 19:43:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 19:43:42 2003 Subject: [Python-Dev] Re: itertools, was RE: list.sort In-Reply-To: <003201c39500$9006a8c0$e841fea9@oemcomputer> References: <003201c39500$9006a8c0$e841fea9@oemcomputer> Message-ID: <200310180143.36999.aleaxit@yahoo.com> On Saturday 18 October 2003 12:46 am, Raymond Hettinger wrote: ... 
> My misgivings about drop() and take() are, firstly, that they > are expressible in-terms of islice() so they don't really add > any new capability. Secondly, the number of tools needs to be True. I gotta remember that -- I find it unintuitive, maybe it's islice's odious range-like ordering of arguments. Alex From martin at v.loewis.de Fri Oct 17 18:51:03 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 17 19:57:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <8ynj1o6l.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <8ynj1o6l.fsf@python.net> Message-ID: <3F907257.1030406@v.loewis.de> Thomas Heller wrote: > Yes, and _ssl would also stay out (it seems I forgot to list it). The > only module needing external source is zlib - and this is one I care > about because it may be useful for zipimport of compressed > modules. Can't we simply import the zlib sources into Python's CVS? I would advise against that: On Unix, it wouldn't be used, because people would ask that the platform's zlib shared library is used. It appears that in this specific case, Guido is willing to compromise that downloading zlib source to build pythonxy.dll could be acceptable. Also, in this specific case, making it easy to remove zlib support would be possible: add a HAVE_ZLIB in pyconfig.h, and put HAVE_ZLIB around the reference in config.c. Anybody who does not want to download zlib would need to edit pyconfig.h (or perhaps the pythoncore project). Regards, Martin From guido at python.org Fri Oct 17 19:57:45 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 19:57:54 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Sat, 18 Oct 2003 00:42:26 +0200." 
<8ynj1o6l.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <8ynj1o6l.fsf@python.net> Message-ID: <200310172357.h9HNvjC07788@12-236-54-216.client.attbi.com> > The only module needing external source is zlib - and this is one I > care about because it may be useful for zipimport of compressed > modules. Can't we simply import the zlib sources into Python's CVS? I don't like that very much; there are always licensing issues. (Even though we did do this for expat.) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Fri Oct 17 21:10:05 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 21:11:36 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <000201c39514$ac006f20$e841fea9@oemcomputer> [GvR] > I'd just like to pipe into this discussion saying that while Peter > Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main > reason that the proposed notation doesn't return a list. I agree that > having generator comprehensions would be a more general solution. I > don't have a proposal for generator comprehension syntax though, and > [yield ...] has the same problem. Is Phil's syntax acceptable to everyone? (yield: x*x for x in roots) I think this form works nicely. looking-for-resolution-and-consensus-ly yours, Raymond From python at rcn.com Fri Oct 17 21:25:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 21:26:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Message-ID: <000401c39516$c440b520$e841fea9@oemcomputer> [GvR] > > > like type checkers (and IMO are also easily overseen by human > > > readers).? And, it's easier to write l.sorted() rather than > > > l.sort(inline=True). 
[Aahz] > > Let's make explicit: l.copysort() > > > > I'm not a big fan of grammatical suffixes for > distinguishing between > > similar meanings. > > +1 [Facundo] > +2, considering that the difference in behaviour with sort and > sorted it's no so clear to a non-english speaker. FWIW, I've posted a patch to implement list.copysort() that includes a news announcement, docs, and unittests: www.python.org/sf/825814 Raymond Hettinger From guido at python.org Fri Oct 17 23:20:31 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 23:20:42 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Fri, 17 Oct 2003 21:25:50 EDT." <000401c39516$c440b520$e841fea9@oemcomputer> References: <000401c39516$c440b520$e841fea9@oemcomputer> Message-ID: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> > FWIW, I've posted a patch to implement list.copysort() that > includes a news announcement, docs, and unittests: > > www.python.org/sf/825814 Despite my suggesting a better name, I'm not in favor of this (let's say -0). For one, this will surely make lots of people write for key in D.keys().copysort(): ... which makes an unnecessary copy of the keys. I'd rather continue to write keys = D.keys() keys.sort() for key in keys: ... --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Oct 17 23:28:39 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 23:28:47 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> At 09:10 PM 10/17/03 -0400, Raymond Hettinger wrote: >[GvR] > > I'd just like to pipe into this discussion saying that while Peter > > Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main > > reason that the proposed notation doesn't return a list. I agree that > > having generator comprehensions would be a more general solution. I > > don't have a proposal for generator comprehension syntax though, and > > [yield ...] has the same problem. > >Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) Ironically, I'm opposed. :)

* Yield is a control flow statement, this is an expression
* yield: looks like lambda, and this is not a function
* Yield only makes sense if you come into this thinking about generators
* Yield distracts from the purpose of the expression

To put it another way, Python is "executable pseudocode". Listcomps are pseudocode. Yield in a generator is pseudocode. (x*x for x in roots) is pseudocode. But (yield: x*x for x in roots) looks like some kind of weird programming language gibberish. :) I think the worst misinterpretation I could have about the yield-less syntax is that I might think it was a "tuple comprehension" or something that returned a sequence instead of an iterator. However, I'll find out it's not a sequence or tuple if I try to do anything with it that requires a sequence or tuple. My worst case problem is re-execution of the iterator. Which, by the way, brings up a question: should iterator comps be reiterable? I don't see any reason right now why they shouldn't be, and can think of situations where reiterability would be useful.
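For reference, the one-shot behaviour at issue, and the "preserve reiterability" idea being raised, can be sketched using the parenthesized generator-expression spelling that appears later in the thread. The ReiterableComp wrapper name below is invented for illustration only, not something proposed verbatim:

```python
class ReiterableComp(object):
    """Wrap a no-argument generator function; each iter() call
    re-invokes it, so the result is reiterable whenever the
    underlying iterable is."""
    def __init__(self, genfunc):
        self.genfunc = genfunc

    def __iter__(self):
        return self.genfunc()

roots = [1, 2, 3]

# A bare generator expression is one-shot:
g = (x * x for x in roots)
once = list(g)       # [1, 4, 9]
again = list(g)      # [] (already exhausted)

# The wrapper restarts the underlying generator on every iteration:
squares = ReiterableComp(lambda: (x * x for x in roots))
first = list(squares)
second = list(squares)
```

Note that the wrapper is only as reiterable as its base: if the lambda closed over a one-shot iterator instead of a list, the second pass would come up empty.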
From aahz at pythoncraft.com Fri Oct 17 23:54:46 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 17 23:54:50 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <000201c39514$ac006f20$e841fea9@oemcomputer> Message-ID: <20031018035445.GA14929@panix.com> On Fri, Oct 17, 2003, Raymond Hettinger wrote: > [GvR] >> >> I'd just like to pipe into this discussion saying that while Peter >> Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main >> reason that the proposed notation doesn't return a list. I agree that >> having generator comprehensions would be a more general solution. I >> don't have a proposal for generator comprehension syntax though, and >> [yield ...] has the same problem. > > Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) I'm not sure. Let's try it out:

    for square in (yield: x*x for x in roots):
        print square

That doesn't look *too* bad. Okay, how about this:

    def grep(pattern, iter):
        pattern = re.compile(pattern)
        for item in iter:
            if pattern.search(str(item)):
                yield item

    for item in grep("1", (yield: x*x for x in roots) ):
        print item

Now that looks disgusting. OTOH, I doubt any syntax for a generator comprehension could improve that. On the gripping hand, I'm concerned that we're going in Lisp's direction with too many parens. At least with the listcomp you have more of a visual cue:

    for item in grep("1", [x*x for x in roots] ):

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code."
--Bill Harlan From aahz at pythoncraft.com Fri Oct 17 23:57:05 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 17 23:57:07 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <20031018035705.GB14929@panix.com> On Fri, Oct 17, 2003, Guido van Rossum wrote: >Raymond: >> >> FWIW, I've posted a patch to implement list.copysort() that >> includes a news announcement, docs, and unittests: > > Despite my suggesting a better name, I'm not in favor of this (let's > say -0). I'm actually -1, particularly with your clear argument; I just didn't like your suggestion of l.sorted(). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From guido at python.org Fri Oct 17 23:57:21 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 23:57:38 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 23:28:39 EDT." <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> Message-ID: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> > Which, by the way, brings up a question: should iterator comps be > reiterable? I don't see any reason right now why they shouldn't be, and > can think of situations where reiterability would be useful. Oh, no. Not reiterability again. How can you promise something to be reiterable if you don't know whether the underlying iterator can be reiterated? Keeping a hidden buffer would be a bad idea. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sat Oct 18 02:07:26 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 02:07:36 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <20031018035705.GB14929@panix.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <20031018035705.GB14929@panix.com> Message-ID: <3F90D89E.3060706@ocf.berkeley.edu> Aahz wrote: > On Fri, Oct 17, 2003, Guido van Rossum wrote: > >>Raymond: >> >>>FWIW, I've posted a patch to implement list.copysort() that >>>includes a news announcement, docs, and unittests: >> >>Despite my suggesting a better name, I'm not in favor of this (let's >>say -0). > > > I'm actually -1, particularly with your clear argument; I just didn't > like your suggestion of l.sorted(). I'm -1 as well. Lists do not need to grow a method for something that only replaces two lines of code that are not tricky in any form of the word. -Brett From python at rcn.com Sat Oct 18 02:52:46 2003 From: python at rcn.com (Raymond Hettinger) Date: Sat Oct 18 02:53:29 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <001f01c39544$70314de0$e841fea9@oemcomputer> [Raymond] > > FWIW, I've posted a patch to implement list.copysort() that > > includes a news announcement, docs, and unittests: > > > > www.python.org/sf/825814 [Guido] > Despite my suggesting a better name, I'm not in favor of this (let's > say -0). > > For one, this will surely make lots of people write > > for key in D.keys().copysort(): > ... > > which makes an unnecessary copy of the keys. I'd rather continue to > write > > keys = D.keys() > keys.sort() > for key in keys: > ... Interesting that you saw this at the same time I was fretting about it over dinner. 
The solution is to bypass the copy step for the common case of:

    for elem in somelistmaker().copysort():
        . . .

The revised patch is at: www.python.org/sf/825814 The technique is to re-use the existing list whenever the refcount is one. This keeps the mutation invisible. Advantages of a copysort() method:

* Avoids creating an unnecessary, stateful variable that remains visible after the sort is no longer needed. In the above example, the definition of the "keys" variable changes from unsorted to sorted. Also, the lifetime of the variable extends past the loop where it was intended to be used. In longer code fragments, this unnecessarily increases code complexity, code length, and the number of variables, and increases the risk of using a variable in the wrong state, which is a common source of programming errors.

* By avoiding control flow (the assignments in the current approach), an inline sort becomes usable anywhere an expression is allowed. This includes important places like function call arguments and list comprehensions:

    todo = [t for t in tasks.copysort() if due_today(t)]
    genhistory(date, events.copysort(key=incidenttime))

Spreading these out over multiple lines is an unnecessary distraction from the problem domain, resulting in code that is harder to read, write, visually verify, grok, or debug. Raymond Hettinger P.S. There are probably better names than copysort, but the idea still holds.
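For illustration, the copysort() semantics under discussion reduce to a few lines of pure Python. This sketch omits the C-level refcount trick (which cannot be expressed at the Python level); keyword options are simply forwarded to list.sort():

```python
def copysort(iterable, **kwds):
    """Return a new sorted list, leaving the input untouched.

    A Python-level sketch of the proposed copysort, minus the
    refcount optimization: this version always copies.
    """
    lst = list(iterable)   # snapshot the input
    lst.sort(**kwds)       # same options as list.sort()
    return lst

tasks = ["deploy", "build", "test"]
ordered = copysort(tasks)   # tasks itself is unchanged
```

Because the copy is explicit, the function works on any iterable (sets, dict keys, generators), not just lists.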
From aleaxit at yahoo.com Sat Oct 18 05:20:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 05:20:55 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> Message-ID: <200310181120.45477.aleaxit@yahoo.com> On Saturday 18 October 2003 05:57 am, Guido van Rossum wrote: > > Which, by the way, brings up a question: should iterator comps be > > reiterable? I don't see any reason right now why they shouldn't be, and > > can think of situations where reiterability would be useful. > > Oh, no. Not reiterability again. How can you promise something to be > reiterable if you don't know whether the underlying iterator can be > reiterated? Keeping a hidden buffer would be a bad idea. I agree it would be bad to have "black magic" performed by every iterator to fulfil a contract that may or may not be useful to clients and might be costly to fulfil. IF "reiterability" is useful (and I'd need to see some use cases, because I don't particularly recall pining for it in Python) it should be exposed as a separate protocol that may or may not be offered by any given iterator type. E.g., the presence of a special method __reiter__ could indicate that this iterator IS able to supply another iterator which retraces the same steps from the start; and perhaps iter(xxx, reiterable=True) could strive to provide a reiterable iterator for xxx, which might justify building one that keeps a hidden buffer as a last resort. But first, I'd like use cases... There ARE other features I'd REALLY have liked to get from iterators in some applications. 
A "snapshot" -- providing me two iterators, the original one and another, which will step independently over the same sequence of items -- would have been really handy at times. And a "step back" facility ("undo" of the last call to next) -- sometimes one level would suffice, sometimes not; often I could have provided the item to be "pushed back" so the iterator need not retain memory of it independently, but that wouldn't always be handy. Now any of these can be built as a wrapper over an existing iterator, of course -- just like 'reiterability' could (and you could in fact easily implement reiterability in terms of snapshotting, by just ensuring a snapshot is taken at the start and further snapshotted but never disturbed); but not knowing the abilities of the underlying iterator would mean these wrappers would often duplicate functionality needlessly. E.g.:

    class snapshottable_sequence_iter(object):
        def __init__(self, sequence, i=0):
            self.sequence = sequence
            self.i = i

        def __iter__(self):
            return self

        def next(self):
            try:
                result = self.sequence[self.i]
            except IndexError:
                raise StopIteration
            self.i += 1
            return result

        def snapshot(self):
            return self.__class__(self.sequence, self.i)

Here, snapshotting is quite cheap, requiring just a new counter and another reference to the same underlying sequence. So would be restarting and stepping back, directly implemented. But if we need to wrap a totally generic iterator to provide "snapshottability", we inevitably end up keeping a list (or the like) of items so far seen from one but not both 'independent' iterators obtained by a snapshot -- all potentially redundant storage, not to mention the possible coding trickiness in maintaining that FIFO queue. As I said I do have use cases for all of these.
Simplest is the ability to push back the last item obtained by next, since a frequent pattern is:

    for item in iterator:
        if isok(item):
            process(item)
        else:
            # need to push item back onto iterator, then break
    else:
        # all items were OK, iterator exhausted, blah blah

...and later...

    for item in iterator:
        # process some more items

Of course, as long as just a few levels of pushback are enough, THIS one is an easy and light-weight wrapper to write:

    class pushback_wrapper(object):
        def __init__(self, it):
            self.it = it
            self.pushed_back = []

        def __iter__(self):
            return self

        def next(self):
            try:
                return self.pushed_back.pop()
            except IndexError:
                return self.it.next()

        def pushback(self, item):
            self.pushed_back.append(item)

A "snapshot" would be useful whenever more than one pass on a sequence _or part of it_ is needed (more useful than a "restart" because of the "part of it" provision). And a decent wrapper for it is a bear... Alex From mrussell at verio.net Sat Oct 18 05:44:35 2003 From: mrussell at verio.net (Mark Russell) Date: Sat Oct 18 05:46:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <3F90D89E.3060706@ocf.berkeley.edu> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <20031018035705.GB14929@panix.com> <3F90D89E.3060706@ocf.berkeley.edu> Message-ID: <1066470275.1346.25.camel@straylight> On Sat, 2003-10-18 at 07:07, Brett C. wrote: > I'm -1 as well. Lists do not need to grow a method for something that > only replaces two lines of code that are not tricky in any form of the word. And don't forget that the trivial function will sort any iterable, not just lists. I think

    for member in copysort(someset):

is better than

    for member in list(someset).copysort():

I'm against list.copysort(), and for either leaving things unchanged or adding copysort() as a builtin (especially if it can use the reference count trick to avoid unnecessary copies).
Mark Russell From aleaxit at yahoo.com Sat Oct 18 07:31:17 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 07:31:23 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066470275.1346.25.camel@straylight> References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F90D89E.3060706@ocf.berkeley.edu> <1066470275.1346.25.camel@straylight> Message-ID: <200310181331.17795.aleaxit@yahoo.com> On Saturday 18 October 2003 11:44 am, Mark Russell wrote: > On Sat, 2003-10-18 at 07:07, Brett C. wrote: > > I'm -1 as well. Lists do not need to grow a method for something that > > only replaces two lines of code that are not tricky in any form of the > > word. > > And don't forget that the trivial function will sort any iterable, not > just lists. I think > > for member in copysort(someset): > > is better than > > for member in list(someset).copysort(): > > I'm against list.copysort(), and for either leaving things unchanged or > adding copysort() as a builtin (especially if it can use the reference > count trick to avoid unnecessary copies). The trick would need to check that the argument is a list, of course, as well as checking that the reference to it on the stack is the only one around. But given this, yes, I guess a built-in would be "better" by occasionally saving the need to type a few extra characters (though maybe "worse" by enlarging the built-in module rather than remaining inside the smaller namespace of the list type...?). The built-in, or method, 'copysort', would have to accept the same optional arguments as the sort method of lists has just grown, of course. 
Alex From aleaxit at yahoo.com Sat Oct 18 08:26:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 08:26:45 2003 Subject: [Python-Dev] The Trick (was Re: copysort patch, was Re: inline sort option) In-Reply-To: <200310181331.17795.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <1066470275.1346.25.camel@straylight> <200310181331.17795.aleaxit@yahoo.com> Message-ID: <200310181426.39116.aleaxit@yahoo.com> Wondering about the trick of copysort not copying a singly-referenced list I decided to try it out in a tiny extension module, and, yes, it is just as trivial as one might wish (haven't dealt with optional args to sort, just wanting to check performance &c):

    static PyObject*
    copysort(PyObject* self, PyObject* args)
    {
        PyObject *sequence, *listresult;
        if(!PyArg_ParseTuple(args, "O", &sequence))
            return 0;
        if(PyList_CheckExact(sequence) && sequence->ob_refcnt==1) {
            listresult = sequence;
            Py_INCREF(listresult);
        } else {
            listresult = PySequence_List(sequence);
        }
        if(listresult) {
            if(PyList_Sort(listresult) == -1) {
                Py_DECREF(listresult);
                listresult = 0;
            }
        }
        return listresult;
    }

and performance on an equally trivial testcase:

    x = dict.fromkeys(range(99999))

    def looponsorted1(x):
        keys = x.keys()
        keys.sort()
        for k in keys:
            pass

    def looponsorted2(x, c=copysort.copysort):
        for k in c(x.keys()):
            pass

turns out to be identical between the two _with_ The Trick (4.4e+04 usec with timeit.py -c on my box) while without The Trick copysort would slow down to about 5.5e+04 usec. But, this reminds me -- function filter, in bltinmodule.c, uses just about the same trick too (to modify in-place when possible rather than making a new list -- even though when it does make a new list it's an empty one, not a copy, so the gain is less). There must be other cases of applicability which just haven't been considered. So... Shouldn't The Trick be embodied in PySequence_List itself...?
So, the whole small tricky part above:

    if(PyList_CheckExact(sequence) && sequence->ob_refcnt==1) {
        listresult = sequence;
        Py_INCREF(listresult);
    } else {
        listresult = PySequence_List(sequence);
    }

would collapse to a single PySequence_List call -- *AND* potential calls from Python code such as "x=list(somedict.keys())" might also be speeded up analogously... [Such a call looks silly when shown like this, but in some cases one might not know, in polymorphic use, whether a method returns a new or potentially shared list, or other sequence, and a call to list() on the method's result then may be needed to ensure the right semantics in all cases]. Is there any hidden trap in The Trick that makes it unadvisable to insert it in PySequence_List? Can't think of any, but I'm sure y'all will let me know ASAP what if anything I have overlooked...;-). One might even be tempted to reach down all the way to PyList_GetSlice, placing THERE The Trick in cases of all-list slicing of a singly-referenced list (PyList_GetSlice is what PySequence_List uses, so it would also get the benefit), but that might be overdoing it -- and encouraging list(xxx) instead of xxx[:], by making the former a bit faster in some cases, would be no bad thing IMHO (already I'm teaching newbies to prefer using list(...) rather than ...[:] strictly for legibility and clarity, being able to mention possible future performance benefits might well reinforce the habit...;-).
Alex From marktrussell at btopenworld.com Sat Oct 18 08:32:53 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Sat Oct 18 08:35:14 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310181331.17795.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F90D89E.3060706@ocf.berkeley.edu> <1066470275.1346.25.camel@straylight> <200310181331.17795.aleaxit@yahoo.com> Message-ID: <1066480373.1942.50.camel@straylight> On Sat, 2003-10-18 at 12:31, Alex Martelli wrote: > The built-in, or method, 'copysort', would have to accept the same > optional arguments as the sort method of lists has just grown, of course. Yes. In fact one point in its favour is that there aren't any choices to be made - the interface should track that of list.sort(), so it avoids the usual objection to trivial functions that there are many possible variants. Mark Russell From pf_moore at yahoo.co.uk Sat Oct 18 09:18:30 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sat Oct 18 09:18:20 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.15274.781344.230479@montanaro.dyndns.org> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <3F90472C.9060702@ocf.berkeley.edu> <5.1.0.14.0.20031017160243.03453220@mail.telecommunity.com> Message-ID: <8yniofa1.fsf@yahoo.co.uk> "Phillip J. Eby" writes: > Which of course means there'd be little need for imap and ifilter, > just as there's now little need for map and filter. > > Anyway, if you look at '.. for .. in .. [if ..]' as a ternary or > quaternary operator on an iterator (or iterable) that returns an > iterator, it makes a lot more sense than thinking of it as having > anything to do with generator(s).
(Even if it might be implemented > that way.) I've reached the point of skimming this discussion, but this struck a chord. I think the original proposal (for special syntax for accumulators) is too limited, and if anything is needed (not clear on that) it should be a generalised iterator comprehension construct. In that context, it seems to me that iterator comprehensions bear a very similar relationship to imap/ifilter to the relationship between map/filter and list comprehensions. Paul. -- This signature intentionally left blank From pje at telecommunity.com Sat Oct 18 10:04:52 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Oct 18 10:05:04 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031018094933.03ce4780@mail.telecommunity.com> At 08:57 PM 10/17/03 -0700, Guido van Rossum wrote: > > Which, by the way, brings up a question: should iterator comps be > > reiterable? I don't see any reason right now why they shouldn't be, and > > can think of situations where reiterability would be useful. > >Oh, no. Not reiterability again. How can you promise something to be >reiterable if you don't know whether the underlying iterator can be >reiterated? Keeping a hidden buffer would be a bad idea. I think I phrased my question poorly. What I should have said was: "Should iterator expressions preserve the reiterability of the base expression?" I don't want to make them guarantee reiterability, only to preserve it if it already exists. Does that make more sense? In essence, this would be done by having an itercomp expression resolve to an object whose __iter__ method calls the underlying generator, returning a generator-iterator. 
Thus, any iteration over the itercomp is equivalent to calling a no-arguments generator. The result is reiterable if the base iterable is reiterable, otherwise not. I suppose technically, this means the itercomp doesn't return an iterator, but an iterable, which I suppose could be confusing if you try to call its 'next()' method. But then, it could have a next() method that raises an error saying "call 'iter()' on me first". From niemeyer at conectiva.com Sat Oct 18 10:47:03 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sat Oct 18 10:48:14 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <20031018144703.GA10212@ibook> The SRE recursion removal patch is finally in. Please, let me know if you find any problems. -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Sat Oct 18 11:14:19 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 11:14:25 2003 Subject: [Python-Dev] The Trick (was Re: copysort patch, was Re: inline sort option) In-Reply-To: <200310181426.39116.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181331.17795.aleaxit@yahoo.com> <200310181426.39116.aleaxit@yahoo.com> Message-ID: <200310181714.19688.aleaxit@yahoo.com> On Saturday 18 October 2003 02:26 pm, Alex Martelli wrote: ...oops... > x = dict.fromkeys(range(99999)) here, x.keys() IS already sorted, so the importance of The Trick is emphasized because the sort itself has little work to do: > turns out to be identical between the two _with_ The Trick (4.4e+04 usec > with timeit.py -c on my box) while without The Trick copysort would slow > down to about 5.5e+04 usec. 
I've changed the initialization of x to > x = dict.fromkeys(map(str,range(99999))) so that x.keys() is not already sorted (still has several runs that the sort will exploit -- perhaps representative of some real-world sorts...;-) and the numbers change to about 240 milliseconds with The Trick (or with separate statements to get and sort the keys), 265 without -- so, more like 10% advantage, NOT 20%-ish (a list.copysort method, from Raymond's patch, has 240 milliseconds too -- indeed it's just about the same code I was using in the standalone function I posted, give or take some level of indirectness in C calls that clearly don't matter much here). Of course, the % advantage will vary with the nature of the list (how many runs that sort can exploit) and be bigger for smaller lists (given we're comparing O(N) copy efforts vs O(N log N) sorting efforts). Alex From skip at pobox.com Sat Oct 18 11:16:00 2003 From: skip at pobox.com (Skip Montanaro) Date: Sat Oct 18 11:16:14 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <16273.22832.456737.861600@montanaro.dyndns.org> Guido> For one, this will surely make lots of people write Guido> for key in D.keys().copysort(): Guido> ... Guido> which makes an unnecessary copy of the keys. It might be viewed as unnecessary if you intend to change D's keys within the loop. Guido> keys = D.keys() Guido> keys.sort() Guido> for key in keys: Guido> ... Current standard practice is also fine. 
Skip From aleaxit at yahoo.com Sat Oct 18 11:43:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 11:43:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <16273.22832.456737.861600@montanaro.dyndns.org> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <16273.22832.456737.861600@montanaro.dyndns.org> Message-ID: <200310181743.38959.aleaxit@yahoo.com> On Saturday 18 October 2003 05:16 pm, Skip Montanaro wrote: > Guido> For one, this will surely make lots of people write > > Guido> for key in D.keys().copysort(): > Guido> ... > > Guido> which makes an unnecessary copy of the keys. > > It might be viewed as unnecessary if you intend to change D's keys within > the loop. D.keys() makes a _snapshot_ of the keys of D -- it doesn't matter what you do to D in the loop's body. Admittedly, that's anything but immediately obvious (quite apart from copysorting or whatever) -- I've seen people change perfectly good code of the form: for k in D.keys(): vastly_alter_a_dictionary(D, k) into broken code of the form: for k in D: vastly_alter_a_dictionary(D, k) because of having missed this crucial difference -- snapshot in the first case, but NOT in the second one. And viceversa, I've seen people carefully copy.copy(D.keys()) or the equivalent to make sure they did not suffer from modifying D in the loop's body -- the latter is in a sense even worse, because the bad effects of the former tend to show up pretty fast as weird bugs and broken unit-tests, while the latter is "just" temporarily wasting some memory and copying time. Anyway, copysort with The Trick, either as a method or function, has no performance problems - exactly the same performance as: > Guido> keys = D.keys() > Guido> keys.sort() > Guido> for key in keys: > Guido> ... > > Current standard practice is also fine. Nolo contendere. It DOES feel a bit like boilerplate, that's all. 
Alex From pf_moore at yahoo.co.uk Sat Oct 18 11:47:54 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sat Oct 18 11:47:38 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> <200310180014.51336.aleaxit@yahoo.com> Message-ID: <4qy6o8d1.fsf@yahoo.co.uk> Alex Martelli writes: > On Friday 17 October 2003 11:45 pm, Guido van Rossum wrote: > ... >> In conclusion, I think this syntax is pretty cool. (It will probably >> die the same death as the ternary expression though.) > > Ah well -- in this case I guess I won't go to the bother of deciding > whether I like your preferred "lighter" syntax or the "stands our more" > one. The sad, long, lingering death of the ternary expression was > too painful to repeat -- let's put this one out of its misery sooner. The saddest thing about the ternary operator saga (and it may be the fate of this as well) was that the people who wanted the *semantics* destroyed their own case by arguing over *syntax*. I suspect that the only way out of this would be for someone to have just implemented it, with whatever syntax they preferred. Then it either goes in or not, with Guido's final veto applying, as always. Possibly the same is the case here. Unless someone implements iterator comprehensions, with whatever syntax they feel happiest with, arguments about syntax are sterile, and merely serve to fragment the discussion, obscuring the more fundamental question of whether the semantics is wanted or not. Paul -- This signature intentionally left blank From guido at python.org Sat Oct 18 12:27:50 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 12:27:58 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Sat, 18 Oct 2003 16:47:54 BST." 
<4qy6o8d1.fsf@yahoo.co.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> <200310180014.51336.aleaxit@yahoo.com> <4qy6o8d1.fsf@yahoo.co.uk> Message-ID: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> > The saddest thing about the ternary operator saga (and it may be the > fate of this as well) was that the people who wanted the *semantics* > destroyed their own case by arguing over *syntax*. I don't see it that way. There were simply too many people who didn't want it in *any* form (and even if they weren't a strict majority, there were certainly too many to ignore). > I suspect that the only way out of this would be for someone to have > just implemented it, with whatever syntax they preferred. Then it > either goes in or not, with Guido's final veto applying, as always. It was implemented (several times). That wasn't the point at all. > Possibly the same is the case here. Unless someone implements iterator > comprehensions, with whatever syntax they feel happiest with, > arguments about syntax are sterile, and merely serve to fragment the > discussion, obscuring the more fundamental question of whether the > semantics is wanted or not. Not true. There are only two major syntax variations contending (with or without yield) and some quibble about parentheses, and everybody here seems to agree that either version could work. The real issue is whether it adds enough to make it worthwhile to change the language (again). My current opinion is that it isn't: for small datasets, the extra cost of materializing the list using a list comprehension is negligeable, so there's no need for a new feature, and if you need to support truly large datasets, you can afford the three extra lines of code it takes to make a custom iterator or generator. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 12:33:49 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 12:34:05 2003 Subject: [Python-Dev] The Trick In-Reply-To: Your message of "Sat, 18 Oct 2003 17:14:19 +0200." <200310181714.19688.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181331.17795.aleaxit@yahoo.com> <200310181426.39116.aleaxit@yahoo.com> <200310181714.19688.aleaxit@yahoo.com> Message-ID: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> I don't like the trick of avoiding the copy if the refcount is one; AFAIK it can't be done in Jython. I think the application area is too narrow to warrant a built-in, *and* lists shouldn't grow two similar methods. Let's keep the language small! (I know, by that argument several built-ins shouldn't exist. Well, they might be withdrawn in 3.0; let's not add more.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 13:17:40 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 13:17:52 2003 Subject: [Python-Dev] Reiterability In-Reply-To: Your message of "Sat, 18 Oct 2003 11:20:45 +0200." <200310181120.45477.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> Message-ID: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> [Guido] > >Oh, no. Not reiterability again. How can you promise something to be > >reiterable if you don't know whether the underlying iterator can be > >reiterated? Keeping a hidden buffer would be a bad idea. [Alex] > I agree it would be bad to have "black magic" performed by every > iterator to fulfil a contract that may or may not be useful to > clients and might be costly to fulfil. 
>
> IF "reiterability" is useful (and I'd need to see some use cases,
> because I don't particularly recall pining for it in Python) it
> should be exposed as a separate protocol that may or may not be
> offered by any given iterator type.  E.g., the presence of a special
> method __reiter__ could indicate that this iterator IS able to
> supply another iterator which retraces the same steps from the
> start; and perhaps iter(xxx, reiterable=True) could strive to
> provide a reiterable iterator for xxx, which might justify building
> one that keeps a hidden buffer as a last resort.  But first, I'd
> like use cases...

In cases where reiterability can be implemented without much effort,
there is already an underlying object representing the sequence
(e.g. a collection object, or an object defining a numerical series).
Reiteration comes for free if you hold on to that underlying object
rather than passing an iterator around.

[Phillip]
> I think I phrased my question poorly.  What I should have said was:
>
> "Should iterator expressions preserve the reiterability of the base
> expression?"

(An iterator expression being something like

  (f(x) for x in S)

right?)

> I don't want to make them guarantee reiterability, only to preserve
> it if it already exists.  Does that make more sense?
>
> In essence, this would be done by having an itercomp expression
> resolve to an object whose __iter__ method calls the underlying
> generator, returning a generator-iterator.  Thus, any iteration over
> the itercomp is equivalent to calling a no-arguments generator.  The
> result is reiterable if the base iterable is reiterable, otherwise
> not.

OK, I think I understand what you're after.  The code for an iterator
expression has to create a generator function behind the scenes, and
call it.  For example:

  A = (f(x) for x in S)

could be translated into:

  def gen(seq):
      for x in seq:
          yield f(x)
  A = gen(S)

(Note that S could be an arbitrary expression and should be evaluated
only once.
This translation does that correctly.)

This allows one to iterate once over A (a generator function doesn't
allow reiteration).  What you are asking looks like it could be done
like this (never mind the local names):

  def gen(seq):
      for x in seq:
          yield f(x)
  class Helper:
      def __init__(self, seq):
          self.seq = seq
      def __iter__(self):
          return gen(self.seq)
  A = Helper(S)

Then every time you use iter(A) gen() will be called with the saved
value of S as argument.

> I suppose technically, this means the itercomp doesn't return an
> iterator, but an iterable, which I suppose could be confusing if you
> try to call its 'next()' method.  But then, it could have a next()
> method that raises an error saying "call 'iter()' on me first".

I don't mind that so much, but I don't think all the extra machinery
is worth it; the compiler generally can't tell if it is needed so it
has to produce the reiterable code every time.  If you *want* to
have an iterable instead of an iterator, it's usually easy enough to do
(especially given knowledge about the type of S).

[Alex again]
> There ARE other features I'd REALLY have liked to get from iterators
> in some applications.
>
> A "snapshot" -- providing me two iterators, the original one and
> another, which will step independently over the same sequence of
> items -- would have been really handy at times.  And a "step back"
> facility ("undo" of the last call to next) -- sometimes one level
> would suffice, sometimes not; often I could have provided the item
> to be "pushed back" so the iterator need not retain memory of it
> independently, but that wouldn't always be handy.
Now any of these
> can be built as a wrapper over an existing iterator, of course --
> just like 'reiterability' could (and you could in fact easily
> implement reiterability in terms of snapshotting, by just ensuring a
> snapshot is taken at the start and further snapshotted but never
> disturbed); but not knowing the abilities of the underlying iterator
> would mean these wrappers would often duplicate functionality
> needlessly.

I don't see how it can be done without an explicit request for such a
wrapper in the calling code.  If the underlying iterator is ephemeral
(is not reiterable) the snapshotter has to save a copy of every item,
and that would defeat the purpose of iterators if it was done
automatically.  Or am I misunderstanding?

> E.g.:
>
> class snapshottable_sequence_iter(object):
>     def __init__(self, sequence, i=0):
>         self.sequence = sequence
>         self.i = i
>     def __iter__(self): return self
>     def next(self):
>         try: result = self.sequence[self.i]
>         except IndexError: raise StopIteration
>         self.i += 1
>         return result
>     def snapshot(self):
>         return self.__class__(self.sequence, self.i)
>
> Here, snapshotting is quite cheap, requiring just a new counter and
> another reference to the same underlying sequence.  So would be
> restarting and stepping back, directly implemented.  But if we need
> to wrap a totally generic iterator to provide "snapshottability", we
> inevitably end up keeping a list (or the like) of items so far seen
> from one but not both 'independent' iterators obtained by a snapshot
> -- all potentially redundant storage, not to mention the possible
> coding trickiness in maintaining that FIFO queue.

I'm not sure what you are suggesting here.  Are you proposing that
*some* iterators (those which can be snapshotted cheaply) sprout a new
snapshot() method?

> As I said I do have use cases for all of these.
Simplest is the
> ability to push back the last item obtained by next, since a frequent
> pattern is:
>
> for item in iterator:
>     if isok(item): process(item)
>     else:
>         # need to push item back onto iterator, then
>         break
> else:
>     # all items were OK, iterator exhausted, blah blah
>
> ...and later...
>
> for item in iterator:    # process some more items
>
> Of course, as long as just a few levels of pushback are enough, THIS
> one is an easy and light-weight wrapper to write:
>
> class pushback_wrapper(object):
>     def __init__(self, it):
>         self.it = it
>         self.pushed_back = []
>     def __iter__(self): return self
>     def next(self):
>         try: return self.pushed_back.pop()
>         except IndexError: return self.it.next()
>     def pushback(self, item):
>         self.pushed_back.append(item)

This definitely sounds like you'd want to create an explicit wrapper
for this; there is too much machinery here to make this a standard
feature.

Perhaps a snapshottable iterator could also have a backup() method
(which would decrement self.i in your first example) or a prev()
method (which would return self.sequence[self.i] and decrement
self.i).

> A "snapshot" would be useful whenever more than one pass on a
> sequence _or part of it_ is needed (more useful than a "restart"
> because of the "part of it" provision).  And a decent wrapper for it
> is a bear...

Such wrappers for specific container types (or maybe just one for
sequences) could be in a standard library module.  Is more needed?
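Alex's pushback_wrapper needs only the Python 3 spelling of the iterator protocol (next becomes __next__) to run today. A minimal runnable sketch, with the class name kept and the demo data invented:

```python
class pushback_wrapper:
    """Wrap any iterator so consumed items can be pushed back onto it."""
    def __init__(self, it):
        self.it = iter(it)
        self.pushed_back = []          # LIFO stack of pushed-back items

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back items first, newest first.
        if self.pushed_back:
            return self.pushed_back.pop()
        return next(self.it)

    def pushback(self, item):
        self.pushed_back.append(item)


it = pushback_wrapper([3, 5, 8, 2, 7])
kept = []
for item in it:
    if item < 6:
        kept.append(item)
    else:
        it.pushback(item)   # not ours to consume; put it back
        break

print(kept)        # [3, 5]
print(list(it))    # [8, 2, 7] -- the pushed-back 8 comes out first
```

This is exactly the "frequent pattern" from the quoted text: stop at the first unacceptable item without losing it.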
--Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Sat Oct 18 13:18:07 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Oct 18 13:18:05 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310181743.38959.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <16273.22832.456737.861600@montanaro.dyndns.org> <200310181743.38959.aleaxit@yahoo.com> Message-ID: <3F9175CF.3040408@iinet.net.au> Alex Martelli strung bits together to say: >> Guido> keys = D.keys() >> Guido> keys.sort() >> Guido> for key in keys: >> Guido> ... >> >>Current standard practice is also fine. > > Nolo contendere. It DOES feel a bit like boilerplate, that's all. Hi, While I'm not an active Python contributor (yet), I've been lurking on python-dev since March. Something was bugging me about the whole l.copysort() ('sortedcopy'?) idea. For whatever reason, the above comment crystalised it - if there's going to be a special 'sortedcopy' to allow minimalist chaining, then what about 'reversedcopy' or 'sortedreversedcopy', or any of the other list methods that may be considered worth chaining? 
Particularly since the following trick seems to work:
==============
>>> def chain(method, *args, **kwds):
	method(*args, **kwds)
	return method.__self__

>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print chain(mylist.sort)
[1, 1, 2, 2, 3, 3]
>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print chain(chain(mylist.sort).reverse)
[3, 3, 2, 2, 1, 1]
>>> print mylist
[3, 3, 2, 2, 1, 1]
>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print mylist
[1, 2, 3, 3, 2, 1]
>>> print chain(chain(list(mylist).sort).reverse)
[3, 3, 2, 2, 1, 1]
>>> print mylist
[1, 2, 3, 3, 2, 1]
>>>
==============
(Tested with Python 2.3rc2, which is what is currently installed on my
home machine)

Not exactly the easiest to read, but it does do the job of "sorted copy
as an expression", as well as letting you chain arbitrary methods of
any object.

Regards,
Nick.

--
Nick Coghlan           |   Brisbane, Australia
ICQ#: 68854767         |   ncoghlan@email.com
Mobile: 0409 573 268   |   http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From python at rcn.com  Sat Oct 18 13:53:17 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sat Oct 18 13:54:01 2003
Subject: [Python-Dev] in-line sort
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <002e01c395a0$b65d9c40$e841fea9@oemcomputer>

> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

As Alex demonstrated, the time savings for an O(n) operation inside
an O(n log n) function is irrelevant anyway.

> I think the application area is too narrow to warrant a built-in,
> *and* lists shouldn't grow two similar methods.  Let's keep the
> language small!

Not to be hard headed here, but if dropped now, it will never be
considered again.  Did you have a chance to look at the rationale for
change in my previous note and in the comments added to the patch?
I think they offer some examples and reasons stronger than "saving a
little typing":  www.python.org/sf/825814

Raymond

From jacobs at penguin.theopalgroup.com  Sat Oct 18 14:00:28 2003
From: jacobs at penguin.theopalgroup.com (Kevin Jacobs)
Date: Sat Oct 18 14:01:58 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: 

On Sat, 18 Oct 2003, Guido van Rossum wrote:
> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

There is also a problem with the strategy if it gets called by a C
extension.  It is perfectly feasible for a C extension to hold the
only reference to an object, call the copying sort (directly or
indirectly), and then be very surprised that the copy did not take
place.

-Kevin

--
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (440) 871-6725 x 19       E-mail: jacobs@theopalgroup.com
Fax:   (440) 871-6722            WWW:    http://www.theopalgroup.com/

From martin at v.loewis.de  Sat Oct 18 14:13:28 2003
From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: Sat Oct 18 14:13:54 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: <20031018144703.GA10212@ibook>
References: <20031018144703.GA10212@ibook>
Message-ID: 

Gustavo Niemeyer writes:

> The SRE recursion removal patch is finally in. Please, let me know
> if you find any problems.

What is the purpose of the USE_RECURSION #define? It looks to me like
you have added a lot of dead code; I recommend to remove all this code.

Regards,
Martin

From niemeyer at conectiva.com  Sat Oct 18 14:22:16 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sat Oct 18 14:23:27 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: 
References: <20031018144703.GA10212@ibook>
Message-ID: <20031018182215.GA10756@ibook>

> > The SRE recursion removal patch is finally in. Please, let me know
> > if you find any problems.
>
> What is the purpose of the USE_RECURSION #define? It looks to me like
> you have added a lot of dead code; I recommend to remove all this code.

If you enable USE_RECURSION it will become recursive again, so it's
nice to see if some problem is related to the non-recursive algorithm
or not, and makes it easy to understand the change made.

The "dead" code you're talking about is probably the unused macros,
right?  I've used them in some ideas, and gave up later.  OTOH, they may
be used in further extensions.  If you don't mind, I'd rather leave them
there, than thinking about it again if I need it.  But if they're really
a problem, well, I'll remove.  Just let me know.

--
Gustavo Niemeyer
http://niemeyer.net

From pje at telecommunity.com  Sat Oct 18 14:32:56 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Oct 18 14:33:12 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com>
 <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com>
 <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com>
 <200310181120.45477.aleaxit@yahoo.com>
Message-ID: <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>

At 10:17 AM 10/18/03 -0700, Guido van Rossum wrote:
>[Phillip]
> > I think I phrased my question poorly.  What I should have said was:
> >
> > "Should iterator expressions preserve the reiterability of the base
> > expression?"
>
>(An iterator expression being something like
>
> (f(x) for x in S)
>
>right?)

Yes.

> > In essence, this would be done by having an itercomp expression
> > resolve to an object whose __iter__ method calls the underlying
> > generator, returning a generator-iterator.  Thus, any iteration over
> > the itercomp is equivalent to calling a no-arguments generator.  The
> > result is reiterable if the base iterable is reiterable, otherwise
> > not.
>
>OK, I think I understand what you're after.
The code for an iterator
>expression has to create a generator function behind the scenes, and
>call it.  For example:
>
> A = (f(x) for x in S)
>
>could be translated into:
>
> def gen(seq):
>     for x in seq:
>         yield f(x)
> A = gen(S)
>
>(Note that S could be an arbitrary expression and should be evaluated
>only once.  This translation does that correctly.)

Interesting.  That wasn't the semantics I envisioned.  I was thinking
(implicitly, anyway) that an iterator comprehension was a closure.
That is, that S would be evaluated each time.  However, if S is a
sequence, you don't need to reevaluate it, and if S is another
iterator expression that preserves reiterability, you still don't need
to.  So, in that sense there's never a need to reevaluate S.

>This allows one to iterate once over A (a generator function doesn't
>allow reiteration).  What you are asking looks like it could be done
>like this (never mind the local names):

Yes, that's actually what I said, but I guess I was once again unclear.

> def gen(seq):
>     for x in seq:
>         yield f(x)
> class Helper:
>     def __init__(self, seq):
>         self.seq = seq
>     def __iter__(self):
>         return gen(self.seq)
> A = Helper(S)
>
>Then every time you use iter(A) gen() will be called with the saved
>value of S as argument.

Yes, except of course Helper would be a builtin type.

>I don't mind that so much, but I don't think all the extra machinery
>is worth it; the compiler generally can't tell if it is needed so it
>has to produce the reiterable code every time.

It has to produce the generator every time, anyway, presumably as a
nested function with access to the current locals.  The only question
is whether it can be invoked more than once, and whether you create
the helper object.  But maybe that's what you mean, and now you're
being unclear instead of me. ;)

> If you *want* to
I just tend to wish that I didn't have to think about whether iterators are reiterable or not, as it forces me to expose to callers of a function whether the value they pass must be an iterator or an iterable. But I don't want to reopen the entire reiterability discussion, as I don't have any better solutions and the previously proposed solutions make my head hurt just trying to make sure I understand the implications. From martin at v.loewis.de Sat Oct 18 15:01:52 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 15:02:30 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031018182215.GA10756@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> Message-ID: Gustavo Niemeyer writes: > If you enable USE_RECURSION it will become recursive again, so it's > nice to see if some problem is related to the non-recursive algorithm > or not, and makes it easy to understand to change made. Hmm. Either you trust that your code is basically correct or you don't. If you trust that it is basically correct, you should remove the old code, and trust that any problems in SRE (be they related to your code or independent) can be fixed, in which case maintaining the old code would be pointless. Or, if you don't trust that your code is basically correct, you should not have applied the patch. > The "dead" code you're talking about is probably the unused macros, > right? No, I'm talking about the now-disabled recursive code. I also wonder whether the code performing recursion checks has any function still. So I wonder whether USE_STACKCHECK, USE_RECURSION_LIMIT are "essentially" dead. > But if they're really a problem, well, I'll remove. Just let me > know. IMO, any unused code in SRE is a problem, because it makes already difficult-to-follow code more difficult to follow. 
It is ok to maintain dead code if the code might be used in the
future, but only if there are specific plans to actually use it in a
foreseeable future.  It is not ok otherwise.

Regards,
Martin

From da-x at gmx.net  Sat Oct 18 15:13:19 2003
From: da-x at gmx.net (Dan Aloni)
Date: Sat Oct 18 15:13:31 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <3F9175CF.3040408@iinet.net.au>
References: <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com>
 <16273.22832.456737.861600@montanaro.dyndns.org>
 <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au>
Message-ID: <20031018191319.GA23071@callisto.yi.org>

On Sun, Oct 19, 2003 at 03:18:07AM +1000, Nick Coghlan wrote:
> Alex Martelli strung bits together to say:
> >> Guido> keys = D.keys()
> >> Guido> keys.sort()
> >> Guido> for key in keys:
> >> Guido>     ...
> >>
> >>Current standard practice is also fine.
> >
> >Nolo contendere.  It DOES feel a bit like boilerplate, that's all.
> [...]
>
> Particularly since the following trick seems to work:
> ==============
> >>> def chain(method, *args, **kwds):
> 	method(*args, **kwds)
> 	return method.__self__
>
> >>> mylist = [1, 2, 3, 3, 2, 1]
> >>> print chain(mylist.sort)
> [1, 1, 2, 2, 3, 3]
> >>> mylist = [1, 2, 3, 3, 2, 1]
> >>> print chain(chain(mylist.sort).reverse)
> [...]
>
> (Tested with Python 2.3rc2, which is what is currently installed on my
> home machine)
>
> Not exactly the easiest to read, but it does do the job of "sorted copy
> as an expression", as well as letting you chain arbitrary methods of
> any object.

(I'm new on this list)

Actually, there is a way to do this out-of-the-box without the chain()
function:

>>> a = [1,2,3,3,2,1]
>>> (a, (a, a.sort())[0].reverse())[0]
[3, 3, 2, 2, 1, 1]

And there is also one for copysort():

>>> a
[1, 2, 3, 3, 2, 1]
>>> (lambda x:(x, x.sort())[0])(list(a))
[1, 1, 2, 2, 3, 3]
>>> a
[1, 2, 3, 3, 2, 1]

But that's probably not more readable.
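With hindsight, the "sorted copy as an expression" problem these tuple tricks work around was solved by the sorted() built-in, added in Python 2.4 shortly after this thread. A quick comparison, using the same sample list:

```python
a = [1, 2, 3, 3, 2, 1]

# Sorted copy as an expression, leaving `a` untouched:
print(sorted(a))                # [1, 1, 2, 2, 3, 3]

# Sort-then-reverse as a single expression:
print(sorted(a, reverse=True))  # [3, 3, 2, 2, 1, 1]

# The original list is unchanged:
print(a)                        # [1, 2, 3, 3, 2, 1]
```

Unlike list.sort(), which sorts in place and returns None, sorted() accepts any iterable and returns a new list, which is what makes it chainable inside larger expressions.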
Architect's Sketch...

--
Dan Aloni
da-x@gmx.net

From michel at dialnetwork.com  Sat Oct 18 15:43:52 2003
From: michel at dialnetwork.com (Michel Pelletier)
Date: Sat Oct 18 15:17:50 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310181331.17795.aleaxit@yahoo.com>
 <200310181426.39116.aleaxit@yahoo.com>
 <200310181714.19688.aleaxit@yahoo.com>
 <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <3630.67.160.160.177.1066506232.squirrel@squirrel.dialnetwork.com>

> I don't like the trick of avoiding the copy if
> the refcount is one;
> AFAIK it can't be done in Jython.

It may be possible with the java.lang.ref package using a somewhat
similar trick by (I imagine) holding a soft reference and examining
the object's reachability to the collector.

-Michel

From pedronis at bluewin.ch  Sat Oct 18 15:29:56 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Sat Oct 18 15:27:45 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <3630.67.160.160.177.1066506232.squirrel@squirrel.dialnetwork.com>
References: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
 <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310181331.17795.aleaxit@yahoo.com>
 <200310181426.39116.aleaxit@yahoo.com>
 <200310181714.19688.aleaxit@yahoo.com>
 <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <5.2.1.1.0.20031018212309.027fb348@pop.bluewin.ch>

At 14:43 18.10.2003 -0500, Michel Pelletier wrote:
> > I don't like the trick of avoiding the copy if
> > the refcount is one;
> > AFAIK it can't be done in Jython.
>
>It may be possible with the java.lang.ref
>package using a somewhat similar trick by (I
>imagine) holding a soft reference and examining
>the object's reachability to the collector.
>-Michel no, if you put the last reference to an object in a weak-ref and trigger a GC (which is btw expensive), well you can discover that there was just one reference but you have also lost the object. Now if you have a tiny wrapper/contents organization you can overcome this, playing the trick with the wrapper and keeping the contents, OTOH as I said, the explicit GC is expensive, likely more than allocating and copying. regards. From guido at python.org Sat Oct 18 15:35:19 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 15:35:31 2003 Subject: [Python-Dev] in-line sort In-Reply-To: Your message of "Sat, 18 Oct 2003 13:53:17 EDT." <002e01c395a0$b65d9c40$e841fea9@oemcomputer> References: <002e01c395a0$b65d9c40$e841fea9@oemcomputer> Message-ID: <200310181935.h9IJZJ609921@12-236-54-216.client.attbi.com> > > I think the application area is too narrow to warrant a built-in, > > *and* lists shouldn't grow two similar methods. Let's keep the > > language small! > > Not to be hard headed here, but if dropped now, it will never > be considered again. Did you have a chance to look at the > rationale for change in my previous note and added in the > comments for the patch? I think they offer some examples > and reasons stronger than "saving a little typing": > www.python.org/sf/825814 I'm taking that into account. It still smells funny. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 15:37:12 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 15:37:44 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: Your message of "Sat, 18 Oct 2003 15:22:16 -0300." <20031018182215.GA10756@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> Message-ID: <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> > > What is the purpose of the USE_RECURSION #define? It looks to me like > > you have added a lot of dead code; I recommend to remove all this code. 
> > If you enable USE_RECURSION it will become recursive again, so it's > nice to see if some problem is related to the non-recursive algorithm > or not, and makes it easy to understand to change made. That's okay. > The "dead" code you're talking about is probably the unused macros, > right? I've used them in some ideas, and gave up later. OTOH, they may > be used in further extensions. If you don't mind, I'd rather leave them > there, than thinking about it again if I need it. But if they're really > a problem, well, I'll remove. Just let me know. That is *not* okay. Dead code is a distraction for future maintainers. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 15:42:53 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 15:43:08 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > > If you enable USE_RECURSION it will become recursive again, so it's > > nice to see if some problem is related to the non-recursive algorithm > > or not, and makes it easy to understand to change made. > > That's okay. There is no interface to enable USE_RECURSION except for editing _sre.c, and I cannot see why anybody would do that (except to see whether a bug goes away if it is enabled). So isn't then the old code essentially dead as well? 
Regards, Martin From aleaxit at yahoo.com Sat Oct 18 15:46:20 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 15:46:26 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> Message-ID: <200310182146.20751.aleaxit@yahoo.com> On Saturday 18 October 2003 07:17 pm, Guido van Rossum wrote: ... > > offered by any given iterator type. E.g., the presence of a special > > method __reiter__ could indicate that this iterator IS able to > > supply another iterator which retraces the same steps from the ... > In cases where reiterabiliy can be implemented without much effort, > there is already an underlying object representing the sequence > (e.g. a collection object, or an object defining a numerical series). ...or a generator that needs to be called again, with the same parameters. > Reiteration comes for free if you hold on to that underlying object > rather than passing an iterator to them around. Yes, but you need to pass around a somewhat complicated thing -- the iterator (to have the "current state in the iteration"), the callable that needs to be called to generate the iterator again (iter, or the generator, or the class whose instances are numerical series, ...) and the arguments for that callable (the sequence, the generator's arguments, the parameters with which to instantiate the class, ...). 
Nothing terrible, admittedly, and that's presumably how I'd architect
things IF I ever met a use case for a "reiterable iterator":

class ReiterableIterator(object):
    def __init__(self, thecallable, *itsargs, **itskwds):
        self.c, self.a, self.k = thecallable, itsargs, itskwds
        self.it = thecallable(*itsargs, **itskwds)
    def __iter__(self): return self
    def next(self): return self.it.next()
    def reiter(self): return self.__class__(self.c, *self.a, **self.k)

typical toy example use:

def printwice(n, reiter):
    for i, x in enumerate(reiter):
        if i>=n: break
        print x
    for i, x in enumerate(reiter.reiter()):
        if i>=n: break
        print x

def evens():
    x = 0
    while 1:
        yield x
        x += 2

printwice(5, ReiterableIterator(evens))

> > "Should iterator expressions preserve the reiterability of the base
> > expression?"
>
> (An iterator expression being something like
>
> (f(x) for x in S)
>
> right?)
...
> OK, I think I understand what you're after.  The code for an iterator
> expression has to create a generator function behind the scenes, and
> call it.  For example:

Then if I am to be able to plug it into ReiterableIterator or some such
mechanism, I need to be able to get at said generator function in order
to stash it away (and call it again), right?  Hmmm, maybe an iterator
built by a generator could keep a reference to the generator it's a
child of... but that still wouldn't give the args to call it with,
darn... and I doubt it makes sense to burden every generator-made
iterator with all those thingies, for the one-in-N need to possibly
reiterate on it...

> def gen(seq):
>     for x in seq:
>         yield f(x)
> class Helper:
>     def __init__(self, seq):
>         self.seq = seq
>     def __iter__(self):
>         return gen(self.seq)
> A = Helper(S)
>
> Then every time you use iter(A) gen() will be called with the saved
> value of S as argument.

Yes, that would let ReiterableIterator(iter, A) work, of course.
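Alex's ReiterableIterator needs only the Python 3 spelling of the iterator protocol (next becomes __next__) to run today. A sketch with a small demo; the islice counts are arbitrary:

```python
import itertools

class ReiterableIterator:
    """Remember the callable and arguments that produced an iterator,
    so an equivalent fresh iterator can be made on demand."""
    def __init__(self, thecallable, *args, **kwds):
        self.c, self.a, self.k = thecallable, args, kwds
        self.it = thecallable(*args, **kwds)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.it)

    def reiter(self):
        # Rebuild a fresh iterator from the remembered call.
        return self.__class__(self.c, *self.a, **self.k)


def evens():
    x = 0
    while True:
        yield x
        x += 2

it = ReiterableIterator(evens)
print(list(itertools.islice(it, 3)))           # [0, 2, 4]
print(list(itertools.islice(it.reiter(), 3)))  # [0, 2, 4] -- restarted copy
print(list(itertools.islice(it, 3)))           # [6, 8, 10] -- original resumed
```

Note that reiter() does not rewind the original iterator; it returns an independent restarted one, which is exactly the separation the thread is debating.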
> > I suppose technically, this means the itercomp doesn't return an > > iterator, but an iterable, which I suppose could be confusing if you > > try to call its 'next()' method. But then, it could have a next() > > method that raises an error saying "call 'iter()' on me first". > > I don't mind that so much, but I don't think all the extra machinery > is worth it; the compiler generally can't tell if it is needed so it > has to produce the reiterable code every time. If you *want* to > have an iterable instead of an iterator, it's usually easy enough do > (especially given knowledge about the type of S). Yeah, that seems sensible to me. > [Alex again] > > > There ARE other features I'd REALLY have liked to get from iterators > > in some applications. > > > > A "snapshot" -- providing me two iterators, the original one and > > another, which will step independently over the same sequence of > > items -- would have been really handy at times. And a "step back" ... > > disturbed); but not knowing the abilities of the underlying iterator > > would mean these wrappers would often duplicate functionality > > needlessly. > > I don't see how it can be done without an explicit request for such a > wrapper in the calling code. If the underlying iterator is ephemeral > (is not reiterable) the snapshotter has to save a copy of every item, > and that would defeat the purpose of iterators if it was done > automatically. Or am I misunderstanding? No, you're not. But, if the need to snapshot (or reiterate, very different thing) was deemed important (and I have my doubts if either of them IS important enough -- I suspect snapshot perhaps, reiterable not, but I don't _know_), we COULD have those iterators which "know how to snapshot themselves" expose a .snapshot or __snapshot__ method. Then a function make_a_snapshottable(it) [the names are sucky, sorry, bear with me] would return it if that method was available, otherwise the big bad wrapper around it. 
Basically, by exposing suitable methods an iterator could "make its
abilities known" to functions that may or may not need to wrap it in
order to achieve certain semantics -- so the functions can build only
those wrappers which are truly indispensable for the purpose.  Roughly
the usual "protocol" approach -- functions use an object's ability IF
that object exposes methods providing that ability, and otherwise fake
it on their own.

> I'm not sure what you are suggesting here.  Are you proposing that
> *some* iterators (those which can be snapshotted cheaply) sprout a
> new snapshot() method?

If snapshottability (eek!) is important enough, yes, though
__snapshot__ might perhaps be more traditional (but for iterators we
do have the precedent of method next without __underscores__).

> > As I said I do have use cases for all of these.  Simplest is the
> > ability to push back the last item obtained by next, since a
> > frequent

Yeah, that's really easy to provide by a lightweight wrapper, which
was my not-so-well-clarified intended point.

> This definitely sounds like you'd want to create an explicit wrapper

Absolutely.

> Perhaps a snapshottable iterator could also have a backup() method
> (which would decrement self.i in your first example) or a prev()
> method (which would return self.sequence[self.i] and decrement
> self.i).

It seems to me that the ability to back up and that of snapshotting
are somewhat independent.

> > A "snapshot" would be useful whenever more than one pass on a
> > sequence _or part of it_ is needed (more useful than a "restart"
> > because of the "part of it" provision).  And a decent wrapper for
> > it is a bear...
>
> Such wrappers for specific container types (or maybe just one for
> sequences) could be in a standard library module.  Is more needed?
I think that if it's worth providing a wrapper it's also worth having
those iterators that don't need the wrapper (because they already
intrinsically have the needed ability) sprout the relevant method or
special method; "factory functions" provided with the wrappers could
then just return the already-satisfactory iterator, or a wrapper built
around it, depending.

Problem is, I'm NOT sure if "it's worth providing a wrapper" in each
of these cases.  snapshottingability (:-) is the one case where, if I
had to decide myself right now, I'd say "go for it"... but that may be
just because it's the one case for which I happened to stumble on some
use cases in production (apart from "undoing", which isn't too bad to
handle in other ways anyway).

Alex

From guido at python.org Sat Oct 18 15:55:48 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 15:56:01 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Sat, 18 Oct 2003 14:32:56 EDT." <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>
Message-ID: <200310181955.h9IJtmP10005@12-236-54-216.client.attbi.com>

> >OK, I think I understand what you're after.  The code for an iterator
> >expression has to create a generator function behind the scenes, and
> >call it.  For example:
> >
> >    A = (f(x) for x in S)
> >
> >could be translated into:
> >
> >    def gen(seq):
> >        for x in seq:
> >            yield f(x)
> >    A = gen(S)
> >
> >(Note that S could be an arbitrary expression and should be evaluated
> >only once.  This translation does that correctly.)
>
> Interesting.  That wasn't the semantics I envisioned.  I was thinking
> (implicitly, anyway) that an iterator comprehension was a closure.
> That is, that S would be evaluated each time.

We must be miscommunicating.  In

    A = [f(x) for x in S]

I certainly don't expect S to be evaluated more than once!  Did you
mean "each time through the loop" or "each time we reach this
statement" or "each time someone loops over A" ???

Also note that I was giving the NON-reiterable semantics.  I don't
think there's any other way to do it (of course 'gen' should be an
anonymous function).

> However, if S is a sequence, you don't need to reevaluate it, and if
> S is another iterator expression that preserves reiterability, you
> still don't need to.  So, in that sense there's never a need to

> >This allows one to iterate once over A (a generator function doesn't
> >allow reiteration).  What you are asking looks like it could be done
> >like this (never mind the local names):

Yes, that's actually what I said, but I guess I was once again unclear.

> >    def gen(seq):
> >        for x in seq:
> >            yield f(x)
> >    class Helper:
> >        def __init__(seq):
> >            self.seq = seq
> >        def __iter__(self):
> >            return gen(self.seq)
> >    A = Helper(S)
> >
> >Then every time you use iter(A) gen() will be called with the saved
> >value of S as argument.
>
> Yes, except of course Helper would be a builtin type.

Sure, and its constructor would take 'gen' as an argument:

class Helper:
    def __init__(self, seq, gen):
        self.seq = seq
        self.gen = gen
    def __iter__(self):
        return self.gen(self.seq)

def gen(seq):
    for x in seq:
        yield f(x)

A = Helper(S, gen)
> ;)

I meant creation of the Helper instance.  Given that in most practical
situations if you *need* reiterability you can provide it using
something much simpler, I don't like using a Helper instance.  But in
fact I don't even like having the implicit generator function.  I
guess that's one reason I'm falling down on the -0 side of this
anyway...

> > If you *want* to have an iterable instead of an iterator, it's
> > usually easy enough to do (especially given knowledge about the
> > type of S).
>
> I just tend to wish that I didn't have to think about whether
> iterators are reiterable or not, as it forces me to expose to callers
> of a function whether the value they pass must be an iterator or an
> iterable.

To me that's a perfectly reasonable requirement, as long as functions
taking an iterator also take an iterable (i.e. they call iter() on
their argument), so a caller who has only iterables doesn't have to
care about the difference.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Sat Oct 18 15:58:57 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 15:59:07 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: Your message of "18 Oct 2003 21:42:53 +0200."
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com>
Message-ID: <200310181958.h9IJwvo10028@12-236-54-216.client.attbi.com>

> > > If you enable USE_RECURSION it will become recursive again, so
> > > it's nice to see if some problem is related to the non-recursive
> > > algorithm or not, and makes it easy to understand the change made.
> >
> > That's okay.
>
> There is no interface to enable USE_RECURSION except for editing
> _sre.c, and I cannot see why anybody would do that (except to see
> whether a bug goes away if it is enabled).  So isn't then the old
> code essentially dead as well?
Given that we're talking about a very complicated change to extremely
delicate code, and we're pre-alpha, and we've explicitly discussed
giving the code the benefit of the doubt because nobody has the guts
to review it, I find it perfectly reasonable to leave the old code in
with a quick way to re-enable it in case someone produces a test case
that they claim breaks with the new code.  The old code can be phased
out once we're certain the new code is rock solid.  I don't mind
having the #ifdef in for one release cycle.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Sat Oct 18 16:01:52 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 16:01:57 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <20031018191319.GA23071@callisto.yi.org>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F9175CF.3040408@iinet.net.au> <20031018191319.GA23071@callisto.yi.org>
Message-ID: <200310182201.52605.aleaxit@yahoo.com>

On Saturday 18 October 2003 09:13 pm, Dan Aloni wrote:
...
> > >> Guido> keys = D.keys()
> > >> Guido> keys.sort()
> > >> Guido> for key in keys: ...
>
> Actually, there is a way to do this out-of-the-box without the
> chain() function:
>
> >>> a = [1,2,3,3,2,1]
> >>> (a, (a, a.sort())[0].reverse())[0]

This cannot be applied to D.keys() for some dictionary D.

> >>> (lambda x:(x, x.sort())[0])(list(a))

This one can, because the lambda lets you give a temporary name x to
the otherwise-unnamed list returned by D.keys().  It can be made a
_little_ better, too, I think:

>>> D=dict.fromkeys('ciao')
>>> D.keys()
['i', 'a', 'c', 'o']
>>> (lambda x: x.sort() or x)(D.keys())
['a', 'c', 'i', 'o']

and if you want it reversed after sorting,

>>> (lambda x: x.sort() or x.reverse() or x)(D.keys())
['o', 'i', 'c', 'a']

> But that's probably not more readable.

You have a gift for understatement.
Still, probably more readable than the classic list comprehension hack:

>>> [x for x in [D.keys()] for y in [x.sort(), x] if y][0]
['a', 'c', 'i', 'o']

also, the lambda hack doesn't leak names into the surrounding scope,
while the list comprehension hack, alas, does.

BTW, welcome to python-dev!

Alex

From whisper at oz.net Sat Oct 18 16:03:14 2003
From: whisper at oz.net (David LeBlanc)
Date: Sat Oct 18 16:03:24 2003
Subject: [Python-Dev] buildin vs. shared modules
In-Reply-To:
Message-ID:

> IOW, I have *never* seen anybody who wanted to rebuild a stock Python
> module without having to download the entire Python source code, and
> rebuild everything.  After years of listening to python-help, I found
> that the most common application of rebuilding parts or all of Python
> is building debug binaries on Windows, to debug your own extension
> modules.  This requires rebuilding all of Python, and people accept
> that (even though they don't like it).

Are we talking about the same thing?  I'm by no means suggesting that
the python.x.x.tar.gz source be broken up!  I'm not aware of any
ability to download component sources any other way, nor would I want
that!

I have had the experience of building and rebuilding specific
extensions of Python to find a bug.  I found it and reported it.  I'm
proud to say that I found a very obscure bug in Python that affected
how Zope was written at the time.  My 2 seconds of fame ;)

> > Perception does count for a lot, especially when reviewers are
> > making gross comparisons of executable (including dlls) sizes.
>
> I, personally, would not make technical decisions on grounds of
> perception which I know would be unfounded.  I can see how other
> people would let their decisions guide by incorrect perception, and
> I find that unfortunate (but can accept it).
>
> I would, personally, strive to correct incorrect perception by means
> of education.  I know this is a tedious process.
Yes, me too, but there are no end to the number of people who will look at the surface and never dig deeper. If perception turns people away from Python, that is not a good thing. > > Yes. What big benefit does this offer compared to the status quo? Aren't > > there more important things to devote resources to? > > The big benefit is that it simplifies packaging and deployment, and I > believe this is the reason why Thomas Heller, who has just taken over > Windows packaging, wants to see it implemented. It simplifies his life, > and wasting volunteer time should not be taken lightly. As from above, I'm not, nor do I think it was the original poster's intent, suggesting that the Python distro be broken up. At this point, I download 2 files: the python binary and the python source. I thought this was about merging all the .pyd files into a single python dll? As far as I can see, that's not a distribution issue per se. > It also simplifies my life, as I plan to maintain a Win64 port. I have > to perform manual adjustments in each project file - the fewer project > files, the better. > > There are certainly more important things to devote resources to, like > fixing bugs. Unfortunately, there are no volunteers for these > important things, and the volunteers tend to look into things that are > not important but fun. > > > If there is enough feeling about it, would it be possible to create an > > alternate VS project that could do the all in one dll instead of pulling > > everyone along one path because a few like an idea? > > That would cause DLL hell - there must not be competing versions of > python23.dll. If I build it one way on my machine, how would that cause dll hell? > There is also the issue of converting the VS6 projects to VS7.1; when > that happens, a re-organization might be in place. > > > Why do you find it necessary to characterize some honest > questions as FUD > > instead of speaking to the merits or demerits of the discussion? 
> Because it is: this was not meant as a critique, or bad-mouthing, or
> some such; if you have taken it in this sense, I apologize.  The only
> rationale for leaving things as-is was that users might fear things
> that you also knew were unfounded fears because of uncertainty - FUD.
>
> Regards,
> Martin

David LeBlanc
Seattle, WA USA

From aleaxit at yahoo.com Sat Oct 18 16:11:24 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 16:11:34 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <200310182211.24635.aleaxit@yahoo.com>

On Saturday 18 October 2003 06:33 pm, Guido van Rossum wrote:
> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

No, but if it's only a small optimization, who cares?  Anyway, the
objection that these functions might be called by _C_ code who's
holding the only reference to a PyObject* probably kills The Trick
(particularly my hope of moving it into PySequence_List whether
copysort survived or not).

> I think the application area is too narrow to warrant a built-in,
> *and* lists shouldn't grow two similar methods.  Let's keep the
> language small!

Aye aye, captain.

Can we dream of a standard library module of "neat hacks that don't
really warrant a built-in" in which to stash some of these
general-purpose, no-specific-appropriate-module, useful functions and
classes?  Pluses: would save some people reimplementing them over and
over and sometimes incorrectly; would remove any pressure to add
not-perfectly-appropriate builtins.  Minuses: one more library module
(the, what, 211th?  doesn't seem like a biggie).  Language unchanged
-- just library.  Pretty please?

> (I know, by that argument several built-ins shouldn't exist.
Well, > they might be withdrawn in 3.0; let's not add more.) "Amen and Hallelujah" to the hope of slimming language and built-ins in 3.0 (presumably the removed built-ins will go into a "legacy curiosa" module, allowing a "from legacy import *" to ease making old code run in 3.0? seems cheap & sensible). Alex From aleaxit at yahoo.com Sat Oct 18 16:15:28 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 16:15:46 2003 Subject: [Python-Dev] The Trick In-Reply-To: References: Message-ID: <200310182215.28092.aleaxit@yahoo.com> On Saturday 18 October 2003 08:00 pm, Kevin Jacobs wrote: > On Sat, 18 Oct 2003, Guido van Rossum wrote: > > I don't like the trick of avoiding the copy if the refcount is one; > > AFAIK it can't be done in Jython. > > There is also a problem with the strategy if if gets called by a C > extension. It is perfectly feasible for a C extension to hold the only > reference to an object, call the copying sort (directly or indirectly), and > then be very surprised that the copy did not take place. Alas, I fear you're right. Darn -- so much for a possible little but cheap optimization (which might have been neat in PySequence_List even if copysort never happens and the optimization is only for CPython -- I don't see why an optimization being impossible in Jython should stop CPython from making it, as long as semantics remain compatible). It's certainly possible for C code to call PySequence_List or whatever while holding the only reference, and count on the returned and argument objects being distinct:-(. 
Alex From aleaxit at yahoo.com Sat Oct 18 16:24:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 16:24:38 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <3F9175CF.3040408@iinet.net.au> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au> Message-ID: <200310182224.33499.aleaxit@yahoo.com> On Saturday 18 October 2003 07:18 pm, Nick Coghlan wrote: > Alex Martelli strung bits together to say: > >> Guido> keys = D.keys() > >> Guido> keys.sort() > >> Guido> for key in keys: > >> Guido> ... > >> > >>Current standard practice is also fine. > > > > Nolo contendere. It DOES feel a bit like boilerplate, that's all. > > Hi, > > While I'm not an active Python contributor (yet), I've been lurking on > python-dev since March. Hi Nick! > Something was bugging me about the whole l.copysort() ('sortedcopy'?) idea. > For whatever reason, the above comment crystalised it - if there's going to > be a special 'sortedcopy' to allow minimalist chaining, then what about > 'reversedcopy' or 'sortedreversedcopy', or any of the other list methods > that may be considered worth chaining? sort has just (in CVS right now) sprouted an optional reverse=True parameter that makes sort-reverse chaining a non-issue (thanks to the usual indefatigable Raymond, too). But, for the general case: the BDFL has recently Pronounced that he does not LIKE chaining and doesn't want to encourage it in the least. Yes, your trick does allow chaining, but the repeated chain(...) calls are cumbersome enough to not count as an encouragement IMHO;-). A generalized wrapping system that wraps all methods so that any "return None" is transformed into a "return self" WOULD constitute encouragement, and thus a case of Lese BDFLity, which would easily risk the wrath of the PSU (of COURSE there ain't no such thing!)... 
Alex

From niemeyer at conectiva.com Sat Oct 18 16:28:54 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sat Oct 18 16:30:05 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook>
Message-ID: <20031018202854.GA22482@ibook>

> Hmm.  Either you trust that your code is basically correct or you
> don't.  If you trust that it is basically correct, you should remove
> the old code, and trust that any problems in SRE (be they related to
> your code or independent) can be fixed, in which case maintaining the
> old code would be pointless.
>
> Or, if you don't trust that your code is basically correct, you
> should not have applied the patch.

Hey.. Martin, are you ok?  What's going on?  You're being extremely
aggressive without an apparent reason.  I'm putting a prize on my head
for hacking the *hairy* code in SRE and removing a serious limitation,
and that's your reaction!?  I'm disappointed.

> I also wonder whether the code performing recursion checks has any
> function still.  So I wonder whether USE_STACKCHECK,
> USE_RECURSION_LIMIT are "essentially" dead.

Yeah.. I can clean it.  Let's please wait a little bit to see the new
code working?

> IMO, any unused code in SRE is a problem, because it makes already
> difficult-to-follow code more difficult to follow.  It is ok to
> maintain dead code if the code might be used in the future, but only
> if there are specific plans to actually use it in a foreseeable
> future.  It is not ok

Dead *debug* code is something common all over the world.  Should we
remove VERBOSE usage as well!?
:-)

--
Gustavo Niemeyer
http://niemeyer.net

From da-x at gmx.net Sat Oct 18 16:54:41 2003
From: da-x at gmx.net (Dan Aloni)
Date: Sat Oct 18 16:54:53 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310182201.52605.aleaxit@yahoo.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F9175CF.3040408@iinet.net.au> <20031018191319.GA23071@callisto.yi.org> <200310182201.52605.aleaxit@yahoo.com>
Message-ID: <20031018205441.GA24562@callisto.yi.org>

On Sat, Oct 18, 2003 at 10:01:52PM +0200, Alex Martelli wrote:
> >
> > >>> (lambda x:(x, x.sort())[0])(list(a))
>
> This one can, because the lambda lets you give a temporary name x to
> the otherwise-unnamed list returned by D.keys().  It can be made a
> _little_ better, too, I think:
>
> >>> D=dict.fromkeys('ciao')
> >>> D.keys()
> ['i', 'a', 'c', 'o']
> >>> (lambda x: x.sort() or x)(D.keys())
> ['a', 'c', 'i', 'o']
>
> and if you want it reversed after sorting,
>
> >>> (lambda x: x.sort() or x.reverse() or x)(D.keys())
> ['o', 'i', 'c', 'a']

Good, so this way the difference between copied and not copied is
minimized:

>>> (lambda x: x.sort() or x)(a)

And:

>>> (lambda x: x.sort() or x)(list(a))

Nice, this lambda hack is a cleaner, more specific, and simple
deviation of the chain() function.  Perhaps it could be made more
understandable like:

>>> sorted = lambda x: x.sort() or x
>>> sorted(list(a))
['a', 'c', 'i', 'o']

And:

>>> sorted(a)
['a', 'c', 'i', 'o']

The only problem is that you assume .sort() always returns a non True
value.  If some time in the future .sort() would return self, your
code would break and then the rightful usage would be:

>>> a = ['c', 'i', 'a', 'o']
>>> list(a).sort()
['a', 'c', 'i', 'o']
>>> a
['c', 'i', 'a', 'o']

And:

>>> a.sort()
['a', 'c', 'i', 'o']
>>> a
['a', 'c', 'i', 'o']

I didn't see the beginning of this discussion, but it looks to me that
sort() returning self is much better than adding a .copysort().

> BTW, welcome to python-dev!

Thanks!
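[A historical footnote, not part of the original message: the
hand-rolled sorted = lambda x: x.sort() or x above is, name and all,
essentially what Python 2.4 later shipped as the sorted() builtin -- a
copy-and-sort that returns a new list and leaves its argument alone:]

```python
# sorted() returns a new sorted list; the argument is untouched.
a = ['c', 'i', 'a', 'o']
print(sorted(a))                # ['a', 'c', 'i', 'o']
print(a)                        # ['c', 'i', 'a', 'o'] -- unchanged
print(sorted(a, reverse=True))  # ['o', 'i', 'c', 'a']
```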
-- Dan Aloni da-x@gmx.net From pje at telecommunity.com Sat Oct 18 17:00:17 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Oct 18 17:00:35 2003 Subject: [Python-Dev] The Trick In-Reply-To: <200310182211.24635.aleaxit@yahoo.com> References: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031018165255.0399fd90@mail.telecommunity.com> At 10:11 PM 10/18/03 +0200, Alex Martelli wrote: >Can we dream of a standard library module of "neat hacks that >don't really warrant a built-in" in which to stash some of these >general-purpose, no-specific-appropriate-module, useful functions >and classes? Pluses: would save some people reimplementing >them over and over and sometimes incorrectly; would remove >any pressure to add not-perfectly-appropriate builtins. Minuses: >one more library module (the, what, 211th? doesn't seem like >a biggie). Language unchanged -- just library. Pretty please? Hmmm. import tricky.hacks from dont_try_this_at_home_kids import * I suppose 'shortcuts' would probably be a less contentious name. :) The downside to having such a module would be that it would entertain ongoing pressure to add more things to it. I suppose it'd be better to have a huge shortcuts module (or maybe shortcuts package, divided by subject matter) than to keep adding builtins. > > (I know, by that argument several built-ins shouldn't exist. Well, > > they might be withdrawn in 3.0; let's not add more.) > >"Amen and Hallelujah" to the hope of slimming language and >built-ins in 3.0 (presumably the removed built-ins will go into a >"legacy curiosa" module, allowing a "from legacy import *" to >ease making old code run in 3.0? seems cheap & sensible). I like it. Or, for symmetry, maybe 'from __past__ import lambda'. ;-) Say, in 3.0, will there be perhaps *no* builtins? 
After all, you don't need builtins to import things.  Nah, that'd be
too much like Java, and not enough like pseudocode.

Ah well, time for me to stop making suggestions on what color to paint
the bicycle shed, and start doing some real work today. :)

From aleaxit at yahoo.com Sat Oct 18 17:12:38 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 17:12:43 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <20031018205441.GA24562@callisto.yi.org>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310182201.52605.aleaxit@yahoo.com> <20031018205441.GA24562@callisto.yi.org>
Message-ID: <200310182312.38553.aleaxit@yahoo.com>

On Saturday 18 October 2003 10:54 pm, Dan Aloni wrote:
...
> Perhaps it could be made more understandable like:
>
> >>> sorted = lambda x: x.sort() or x
> >>> sorted(list(a))

No fair -- that's not a single expression any more!-)

> The only problem is that you assume .sort() always returns a non
> True value.  If some time in the future .sort() would return self,
> your code would break and then the rightful usage would be:

Why do you think it would break?  It would do a _tiny_ amount of
avoidable work, but still return the perfectly correct result.  Sure
you don't think I'd post an unreadable inline hack that would break in
the unlikely case the BDFL ever made a change he's specifically
Pronounced against, right?-)

> I didn't see the beginning of this discussion, but it looks to me
> that sort() returning self is much better than adding a .copysort().

The BDFL has Pronounced against it: he doesn't LIKE chaining.

Alex

From guido at python.org Sat Oct 18 17:22:00 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 17:22:06 2003
Subject: [Python-Dev] The Trick
In-Reply-To: Your message of "Sat, 18 Oct 2003 22:11:24 +0200."
<200310182211.24635.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> <200310182211.24635.aleaxit@yahoo.com> Message-ID: <200310182122.h9ILM0m10190@12-236-54-216.client.attbi.com> > Can we dream of a standard library module of "neat hacks that > don't really warrant a built-in" in which to stash some of these > general-purpose, no-specific-appropriate-module, useful functions > and classes? Pluses: would save some people reimplementing > them over and over and sometimes incorrectly; would remove > any pressure to add not-perfectly-appropriate builtins. Minuses: > one more library module (the, what, 211th? doesn't seem like > a biggie). Language unchanged -- just library. Pretty please? Modules should be about specific applications, or algorithms, or data types, or some other unifying principle. I think "handy" doesn't qualify. :-) > > (I know, by that argument several built-ins shouldn't exist. Well, > > they might be withdrawn in 3.0; let's not add more.) > > "Amen and Hallelujah" to the hope of slimming language and > built-ins in 3.0 (presumably the removed built-ins will go into a > "legacy curiosa" module, allowing a "from legacy import *" to > ease making old code run in 3.0? seems cheap & sensible). Let's not speculate yet about how to get old code to run in 3.0. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 17:27:43 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 17:28:06 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031018202854.GA22482@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> Message-ID: Gustavo Niemeyer writes: > Hey.. Martin, are you ok? What's going on? You're being extremelly > aggressive without an aparent reason. 
> I'm putting a prize on my head for hacking the *hairy* code in SRE
> and removing a serious limitation, and that's your reaction!?  I'm
> disappointed.

Please accept my apologies; I don't want to diminish your efforts, and
I do appreciate them.

However, I'm concerned that track is completely lost as to how SRE
works - is it or is it not the case that the current implementation
which is in CVS is recursive, with arbitrary deep nesting?  If it is
not recursive anymore (which the subject suggests), then why is the
'level' argument still in?  Can we or can we not remove the ad-hoc
determination of USE_RECURSION_LIMIT?

> Yeah.. I can clean it.  let's please wait a little bit to
> see the new code working?

Certainly.  However, I was hoping that we have better means of finding
out whether the code still does what it is supposed to do than
testing.  Perhaps that is an illusion.

Regards,
Martin

From guido at python.org Sat Oct 18 18:05:38 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 18:06:16 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Sat, 18 Oct 2003 21:46:20 +0200." <200310182146.20751.aleaxit@yahoo.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com>
Message-ID: <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com>
> > Nothing terrible, admittedly, and that's presumably how I'd architect > things IF I ever met a use case for a "reiterable iterator": > > class ReiterableIterator(object): > def __init__(self, thecallable, *itsargs, **itskwds): > self.c, self.a, self.k = thecallable, itsargs, itskwds > self.it = thecallable(*itsargs, **itskwds) > def __iter__(self): return self > def next(self): return self.it.next() > def reiter(self): return self.__class__(self.c, *self.a, **self.k) Why put support for a callable with arbitrary arguments in the ReiterableIterator class? Why not say it's called without args, and if the user has a need to use something with args, they can use one of the many approaches to currying? > typical toy example use: > > def printwice(n, reiter): > for i, x in enumerate(reiter): > if i>=n: break > print x > for i, x in enumerate(reiter.reiter()): > if i>=n: break > print x > > def evens(): > x = 0 > while 1: > yield x > x += 2 > > printwice(5, ReiterableIterator(evens)) Are there any non-toy examples? I'm asking because I can't remember ever having had this need myself. > > [Alex again] > > > > > There ARE other features I'd REALLY have liked to get from iterators > > > in some applications. > > > > > > A "snapshot" -- providing me two iterators, the original one and > > > another, which will step independently over the same sequence of > > > items -- would have been really handy at times. And a "step back" > ... > > > disturbed); but not knowing the abilities of the underlying iterator > > > would mean these wrappers would often duplicate functionality > > > needlessly. > > > > I don't see how it can be done without an explicit request for such a > > wrapper in the calling code. If the underlying iterator is ephemeral > > (is not reiterable) the snapshotter has to save a copy of every item, > > and that would defeat the purpose of iterators if it was done > > automatically. Or am I misunderstanding? > > No, you're not. 
But, if the need to snapshot (or reiterate, very > different thing) was deemed important (and I have my doubts if > either of them IS important enough -- I suspect snapshot perhaps, > reiterable not, but I don't _know_), we COULD have those iterators > which "know how to snapshot themselves" expose a .snapshot or > __snapshot__ method. Then a function make_a_snapshottable(it) [the > names are sucky, sorry, bear with me] would return it if that method > was available, otherwise the big bad wrapper around it. A better name would be clone(); copy() would work too, as long as it's clear that it copies the iterator, not the underlying sequence or series. (Subtle difference!) Reiteration is a special case of cloning: simply stash away a clone before you begin. > Basically, by exposing suitable methods an iterator could "make its > abilities known" to functions that may or may not need to wrap it in > order to achieve certain semantics -- so the functions can build > only those wrappers which are truly indispensable for the purpose. > Roughly the usual "protocol" approach -- functions use an object's > ability IF that object exposes methods providing that ability, and > otherwise fake it on their own. In this case I'm not sure if it is desirable to do this automatically. If I request a clone of an iterator for a data stream coming from a pipe or socket, it would have to start buffering everything. Sure, I can come up with a buffering class that throws away buffered data that none of the existing clones can reach, but I very much doubt if it's worth it; a customized buffering scheme for the application at hand would likely be more efficient than a generic solution. > > I'm not sure what you are suggesting here. Are you proposing that > > *some* iterators (those which can be snapshotted cheaply) sprout a new > > snapshot() method? > > If snapshottability (eek!)
is important enough, yes, though __snapshot__ > might perhaps be more traditional (but for iterators we do have the > precedent of method next without __underscores__). (Which I've admitted before was a mistake.) A problem I have with making iterator cloning a standard option is that this would pretty much require that all iterators for which cloning can be implemented should implement clone(). That in turn means that iterator implementors have to work harder (sometimes cloning can be done cheaply, but it might require a different refactoring of the iterator implementation). Another issue is that it would make generators second-class citizens, since they cannot be cloned. (It would seem to be possible to copy a stack frame, but then the question begs whether to use shallow or deep copying -- if a local variable in a generator references a list, should the list be copied or not? And if it should be copied, should it be a deep or shallow copy? There's no good answer without knowing the intention of the programmer.) > > > As I said I do have use cases for all of these. Simplest is the > > > ability to push back the last item obtained by next, since a > > > frequent > > Yeah, that's really easy to provide by a lightweight wrapper, which > was my not-so-well-clarified intended point. > > > This definitely sounds like you'd want to create an explicit wrapper > > Absolutely. > > > Perhaps a snapshottable iterator could also have a backup() method > > (which would decrement self.i in your first example) or a prev() > > method (which would return self.sequence[self.i] and decrement > > self.i). > > It seems to me that the ability to back up and that of snapshotting > are somewhat independent. Backing up suggests a strictly limited buffer; cloning suggests a potentially arbitrarily large buffer. If backing up is what you really need, it's easy to provide a wrapper for it (with a buffer limit argument). 
Since the buffer is only limited, keeping a few copies of items that aren't strictly necessary won't hurt; it doesn't have the issue of wasting space with a full copy of an existing sequence (or worse, of an easily regenerated series). > > > A "snapshot" would be useful whenever more than one pass on a > > > sequence _or part of it_ is needed (more useful than a "restart" > > > because of the "part of it" provision). And a decent wrapper > > > for it is a bear... > > > > Such wrappers for specific container types (or maybe just one for > > sequences) could be in a standard library module. Is more needed? > > I think that if it's worth providing a wrapper it's also worth > having those iterators that don't need the wrapper (because they > already intrinsically have the needed ability) sprout the relevant > method or special method; "factory functions" provided with the > wrappers could then just return the already-satisfactory iterator, > or a wrapper built around it, depending. > > Problem is, I'm NOT sure if "it's worth providing a wrapper" in each > of these cases. snapshottingability (:-) is the one case where, if > I had to decide myself right now, I'd say "go for it"... but that > may be just because it's the one case for which I happened to > stumble on some use cases in production (apart from "undoing", which > isn't too bad to handle in other ways anyway). I'd like to hear more about those cases, to see if they really need cloning (:-) or can live with a fixed limited backup capability. I think a standard backup wrapper would be a useful thing to have (maybe in itertools?); since generator functions can't be cloned, I'm going to push back on the need for cloning for now until I see a lot more non-toy evidence. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 18:22:59 2003 From: martin at v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 18:23:21 2003 Subject: [Python-Dev] Be Honest about LC_NUMERIC Message-ID: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> What happened to this PEP? I can't find it in the PEP list. Personally, I am satisfied with the patch that evolved from the discussion (#774665), and I would be willing to apply it even without a PEP. Thoughts? Regards, Martin From Scott.Daniels at Acm.Org Sat Oct 18 18:44:46 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat Oct 18 18:45:00 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 59 In-Reply-To: References: Message-ID: <3F91C25E.2050409@Acm.Org> [Alex Martelli] >[Kevin Jacobs] >>[Guido van Rossum] >>>I don't like the trick of avoiding the copy if the refcount is one; >>>AFAIK it can't be done in Jython. >> >>There is also a problem with the strategy if it gets called by a >>C only extension. It is perfectly feasible for a C extension to >>hold the reference to an object, call the copying sort (directly >>or indirectly), and then be very surprised that the copy did not >>take place. > > Alas, I fear you're right. Darn--so much for a possible little but > cheap optimization (which might have been neat in PySequence_List > even if copysort never happens and the optimization is only for > CPython -- I don't see why an optimization being impossible in > Jython should stop CPython from making it, as long as semantics > remain compatible). It's certainly possible for C code to call > PySequence_List or whatever while holding the only reference, > and count on the returned and argument objects being distinct:-(. I'm afraid I'm confused here. If the C code is like:

    ... at this point PTR refers to an object with refcount 1
    OTHER = (PTR)
    ... Then it might be that PTR == OTHER ...

What possible harm could come? The C code should expect a sortcopy method to recycle the object referred to by PTR if "the Trick" isn't used.
I am a trifle confused about what harm occurs. Seems to me that list(v) (and alist[:]) could quite happily implement "the Trick" without fear of failure. -Scott David Daniels Scott.Danies@Acm.Org From bac at OCF.Berkeley.EDU Sat Oct 18 22:30:27 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 22:30:37 2003 Subject: [Python-Dev] How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: <200310090503.h99533G00867@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> Message-ID: <3F91F743.6090801@ocf.berkeley.edu> Guido van Rossum wrote: >>Guido van Rossum writes: >> >> >>>Maybe PyBool_FromLong() itself could make this unneeded by adding >>>something like >>> >>> if (ok < 0 && PyErr_Occurred()) >>> return NULL; >>> >>>to its start? > > > [MvL] > >>That would an incompatible change. I would expect PyBool_FromLong(i) >>do the same thing as bool(i). > > > Well, it still does, *except* if you have a pending exception. IMO > what happens when you make a Python API call while an exception is > pending is pretty underspecified, so it's doubtful whether this > incompatibility matters. > > >>>Maybe a pair of macros Py_return_True and Py_return_False would make >>>sense? >> >>You should, of course, add Py_return_None to it, as well. >> >>Then you will find that some contributor goes on a crusade to use >>these throughout very quickly :-) > > > There's the minor issue of how to spell it (Mark Hammond may have a > different suggestion) but that certain contributor has my approval > once we get the spelling agreed upon. > So I just grepped the source and checked the patch manager and don't see any resolution on this. I know there was no objections from anyone to do this beyond just coming up with an agreed spelling. 
So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the all-caps for macros, but I can easily live with the first suggestion as well. -Brett From guido at python.org Sat Oct 18 22:40:46 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 22:40:54 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: Your message of "Sat, 18 Oct 2003 19:30:27 PDT." <3F91F743.6090801@ocf.berkeley.edu> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> Message-ID: <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> > So I just grepped the source and checked the patch manager and don't see > any resolution on this. I know there was no objections from anyone to > do this beyond just coming up with an agreed spelling. > > So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the > all-caps for macros, but I can easily live with the first suggestion as > well. Py_RETURN_NONE, _FALSE, _TRUE are fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sat Oct 18 23:23:17 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 23:23:25 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> Message-ID: <3F9203A5.2030407@ocf.berkeley.edu> Guido van Rossum wrote: >>So I just grepped the source and checked the patch manager and don't see >>any resolution on this. 
I know there was no objections from anyone to >>do this beyond just coming up with an agreed spelling. >> >>So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the >>all-caps for macros, but I can easily live with the first suggestion as >>well. > > > Py_RETURN_NONE, _FALSE, _TRUE are fine. > OK, great. I can code them up, but fair warning, I have not done C macros in a *long* time so if someone would rather do it then please do so. Regardless, Brett Newbie Question time: what file should they go in? -Brett From bac at OCF.Berkeley.EDU Sat Oct 18 23:38:21 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 23:38:29 2003 Subject: [Python-Dev] python-dev Summary for 2003-10-01 through 2003-10-15 [draft] Message-ID: <3F92072D.70601@ocf.berkeley.edu> python-dev Summary for 2003-10-01 through 2003-10-15 ++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from October 1, 2003 through October 15, 2003. It is intended to inform the wider Python community of on-going developments on the list. To comment on anything mentioned here, just post to `comp.lang.python`_ (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the twenty-seventh summary written by Brett Cannon (about to turn a quarter century old; so young yet so wise =). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . 
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`_ and gives some perks for the HTML output. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. .. _PEP Markup: http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ and should be used when looking up any documentation on something mentioned here. PEPs (Python Enhancement Proposals) are located at http://www.python.org/peps/ . To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . Reported bugs and suggested patches can be found at the SourceForge_ project page. .. _python-dev: http://www.python.org/dev/ .. _SourceForge: http://sourceforge.net/tracker/?group_id=5470 .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-09-01_2003-09-15.html ===================== Summary Announcements ===================== Python-dev had a major explosion in emails thanks to some proposed changes to list.sort (summarized in `Decorate-sort-undecorate eye for the list.sort guy`_). That got covered. Some behind-the-scenes stuff that would not interest the general Python community was left out for my personal sanity. It looks like I will not have major issues continuing writing the Summaries in terms of school interfering. 
The only big issue will be how long past their closure date it takes me to get them out. In other words, unless my schoolwork load suddenly becomes heavy continuously I should be able to keep doing the Summaries until my personal sanity gives out. This summary is brought to you by the song "Insanity_" by `Liz Phair`_ and "`Harder to Breathe`_" by `Maroon 5`_. .. _Insanity: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewAlbum?playlistId=1760071&selectedItemId=1759480 .. _Liz Phair: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewArtist?artistId=22707 .. _Harder to Breathe: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewAlbum?playlistId=1798612&selectedItemId=1798604 .. _Maroon 5: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewArtist?artistId=1798556 ========= Summaries ========= -------------------------------------------------------------------- I gave a talk at PyCon 2004 and all I got was respect and admiration -------------------------------------------------------------------- I summarized this last month, but this is important so I am doing it again (and will continue to mention it until no more proposals are being accepted). PyCon_ is ramping up for 2004 and is putting out a `Call for Proposals`_. Since PyCon is meant to be very broad-reaching you can propose anything from a scientific paper to a tutorial. If you have any inkling to give a talk please send in a proposal. It can be rough; the key is that what you want to discuss can be understood from the proposal. So take a look at the link and consider coming to PyCon as a speaker and not just an attendee. .. _PyCon: http://www.python.org/pycon/dc2004/ .. _Call for Proposals: http://www.python.org/pycon/dc2004/cfp.html Contributing threads: `PyCon DC 2004: Call for Proposals `__ --------------- Web-SIG started --------------- As stated on the SIGs page, "The Python `Web SIG`_ is dedicated to improving Python's support for interacting with World Wide Web services and clients."
If there is some web-related functionality that you think Python should have, this is the place to discuss it. If you think an existing Python module could stand a redesign then this is the proper forum for your ideas. .. _Web SIG: http://www.python.org/sigs/web-sig/ Contributing threads: `Any movement on a SIG for web lib enchancements? `__ -------------------------------------------- I have seen the future and it includes 2.3.3 -------------------------------------------- Anthony Baxter, release manager for Python `2.3.1`_ and `2.3.2`_, is already planning a 2.3.3 release in about three months' time. He initially suggested that the goal of this release should be to have Python build on as many platforms as possible. Michael Hudson listed "HPUX/ia64, various oddities on Irix" as the major troublemakers. He suggested that a sustained push to fix these build problems happen instead of trying to do it last-minute. Michael also thought it would be a good idea to try to find experts on the trouble platforms instead of having someone getting access to some machine and floundering since they don't know the OS. Skip Montanaro quickly chimed in with http://www.python.org/cgi-bin/moinmoin/PythonTesters which is a wiki page that lists people who are available to help with testing on various OSs. Please have a look and if you think you could help out on an OS add yourself. .. _2.3.1: http://www.python.org/2.3.1/ .. _2.3.2: http://www.python.org/2.3.2/ Contributing threads: `2.3.3 plans `__ ------------------- Helping you help us ------------------- In response to Martin v. Löwis' email on how to handle patches, Michael Bartl expressed his disappointment that nothing had happened to his patches. It was explained to him that because of time constraints on python-dev it can take time for people to get to all of the patches, but that his work was greatly appreciated and would eventually be looked at. The question of searching on SourceForge_ through the tracker items also came up.
There is a search box on the left side of the page, but it is not extensive. Better than nothing. I also posted an essay I wrote that is meant to act as a guide to how Python is developed and how anyone can help with the development regardless of abilities. You can look at the email in the "Draft of an essay on Python development" thread referenced below in "Contributing threads". Hopefully it will end up on python.org once it is in its final form. Contributing threads: `Patches & Bug help revisited `__ `Draft of an essay on Python development (and how to help) `__ -------------------------------------------- Making DLLs fatter for lower file dependency -------------------------------------------- Thomas Heller suggested adding more modules to the Windows DLL as built-in so as to cut back on the number of files required to get Python to run (py2exe_ stands to benefit from this). The issue of having a larger DLL to have to load into memory was brought up, but Martin v. Löwis said that DLLs only load into memory what is needed to run and not the entire DLL. The issue of making the overall DLL larger in terms of disk space was brought up, but the worry was partially minimized when the list of modules to add was limited to small modules that do not have external dependencies. But zlib might break that last rule in order to allow importation from compressed zip files. The idea of integrating the zlib source into the Python tree was brought up, but shot down for licensing issues on top of keeping the code synchronized. .. _py2exe: http://py2exe.sf.net/ Contributing threads: `buildin vs.
shared modules `__ -------------------------------------------------- Decorate-sort-undecorate eye for the list.sort guy -------------------------------------------------- Raymond Hettinger suggested adding built-in support for the decorate-sort-undecorate (DSU) sorting idiom to list.sort (see the Python Cookbook recipe at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234 which is recipe 2.3 in the dead tree version or Tim Peters' intro to chapter 2 for details on the idiom). After a long discussion on the technical merits of various ways to do this, list.sort gained the keyword arguments 'key' and 'reverse'. 'key' takes in a function that accepts one argument and returns what the item should be sorted based on. So running ``[(0,0), (0,2), (0,1)].sort(key=lambda x: x[1])`` will sort the list based on the second item in each tuple. Technically the sort algorithm just runs the item it is currently looking at through the function and then handles the sorting. This avoids having to actually allocate another list. 'reverse' does what it sounds like based on whether its argument is true or false. list.sort also became guaranteed to be stable (this includes 'reverse'). A discussion of whether list.sort should return self came up and was *very* quickly squashed by Guido. The idea of having a second method, though, that did sort and returned a copy of the sorted list is still being considered. Contributing threads: `decorate-sort-undecorate `__ `list.sort `__ ------------------------------- New Python 2.3.2 Windows binary ------------------------------- Some invalid DLLs made it into the 2.3.2 Windows binary distribution by accident. It seems to mostly affect Windows 98 and NT 4 users. The binary has been fixed and put up online. You can tell if you downloaded the fixed version by checking the filename; the new one is named Python-2.3.2-1.exe (notice the "-1").
Contributing threads: `Python-2.3.2 windows binary screwed `__ From martin at v.loewis.de Sun Oct 19 03:37:11 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 03:37:29 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <3F9203A5.2030407@ocf.berkeley.edu> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> <3F9203A5.2030407@ocf.berkeley.edu> Message-ID: "Brett C." writes: > OK, great. I can code them up, but fair warning, I have not done C > macros in a *long* time so if someone would rather do it then please > do so. Regardless, Brett Newbie Question time: what file should they > go in? I would put them along with the things they are returning, i.e. Py_RETURN_TRUE into boolobject.h, Py_RETURN_NONE into object.h (after the Py_None definition). Regards, Martin From martin at v.loewis.de Sun Oct 19 03:40:50 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 03:41:07 2003 Subject: [Python-Dev] python-dev Summary for 2003-10-01 through 2003-10-15 [draft] In-Reply-To: <3F92072D.70601@ocf.berkeley.edu> References: <3F92072D.70601@ocf.berkeley.edu> Message-ID: "Brett C." writes: > In response to Martin v. L?wis' email on how to handle patches, > Michael Bartl expressed his disappointment that nothing had happened > to his patches. It was explained to him that because of time > restraints on python-dev that it can take time for people to get to > all of the patches, but that his work was greatly appreciated and > would eventually be looked at. Follow-up: I have accepted one of his patches. The other I consider incorrect, waiting for him to further comment (or withdraw the patch). 
Regards, Martin From aleaxit at yahoo.com Sun Oct 19 06:05:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 06:06:07 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com> <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> Message-ID: <200310191205.57016.aleaxit@yahoo.com> On Sunday 19 October 2003 00:05, Guido van Rossum wrote: ... > > class ReiterableIterator(object): > > def __init__(self, thecallable, *itsargs, **itskwds): ... > Why put support for a callable with arbitrary arguments in the > ReiterableIterator class? Why not say it's called without args, and > if the user has a need to use something with args, they can use one of > the many approaches to currying? The typical and most frequent case would be that generating a new iterator requires calling iter(asequence) -- i.e., the typical case does require arguments. So, just like e.g. for threading.Thread, atexit.register, and other callables that take a callable argument, it makes more sense to NOT require the user to invent a currying approach (note btw that iter does NOT support the iter.__get__ trick, of course, as it's a builtin function and not a Python function). It would be different if Python supported a curry built-in, but it doesn't. > > typical toy example use: ... > Are there any non-toy examples? I have not met any, yet -- whence my interest in hearing about use cases from anybody who might have. > I'm asking because I can't remember ever having had this need myself. Right, me neither. > A better name would be clone(); copy() would work too, as long as it's > clear that it copies the iterator, not the underlying sequence or > series. (Subtle difference!) > > Reiteration is a special case of cloning: simply stash away a clone > before you begin. Good name, and good point. 
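[The "stash away a clone before you begin" idea quoted just above later got a standard-library spelling: itertools.tee, added in Python 2.4 after this thread, which splits one iterator into independent buffered iterators. A minimal sketch in today's Python, with tee playing the role of the clone:]

```python
import itertools

def evens():
    x = 0
    while True:
        yield x
        x += 2

# "Reiteration is a special case of cloning: simply stash away
# a clone before you begin."
snap, it = itertools.tee(evens())

first_pass = [next(it) for _ in range(5)]
second_pass = [next(snap) for _ in range(5)]  # independent re-run of the same items
```

[tee's internal buffer holds only items one branch has consumed and the other has not yet reached -- essentially the "buffering class that throws away buffered data that none of the existing clones can reach" Guido describes.]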
> > Roughly the usual "protocol" approach -- functions use an object's > > ability IF that object exposes methods providing that ability, and > > otherwise fake it on their own. > > In this case I'm not sure if it is desirable to do this automatically. Ah, yes, the automatism might be a performance trap -- good point. > If I request a clone of an iterator for a data stream coming from a > pipe or socket, it would have to start buffering everything. Sure, I > can come up with a buffering class that throws away buffered data that > none of the existing clones can reach, but I very much doubt if it's > worth it; a customized buffering scheme for the application at hand > would likely be more efficient than a generic solution. Then clone(it) should raise an exception if it does NOT expose a method supplying "easy cloning" (or more simply it.clone() could do it, e.g. an AttributeError:-) alerting the user of the need to use such a "buffering class" wrapper:

    try:
        clo = it.clone()
    except AttributeError:
        clo = BufferingWrapper(it)

But if no existing iterator supplies the .clone -- even when it would be very easy for it to do so -- this would bufferingwrap everything. > > > I'm not sure what you are suggesting here. Are you proposing that > > > *some* iterators (those which can be snapshotted cheaply) sprout a > > > new snapshot() method? > > > > If snapshottability (eek!) is important enough, yes, though > > __snapshot__ might perhaps be more traditional (but for iterators we do > > have the precedent of method next without __underscores__). > > (Which I've admitted before was a mistake.) Ah, I didn't recall that admission, sorry. OK, underscores everywhere then. > A problem I have with making iterator cloning a standard option is > that this would pretty much require that all iterators for which > cloning can be implemented should implement clone().
That in turn > means that iterator implementors have to work harder (sometimes > cloning can be done cheaply, but it might require a different > refactoring of the iterator implementation). Making iterator authors aware of their clients' possible need to clone doesn't sound bad to me. There's no _compulsion_ to provide the functionality, but some "social pressure" to do it if a refactoring can afford it, well, why not? > Another issue is that it would make generators second-class citizens, > since they cannot be cloned. (It would seem to be possible to copy a > stack frame, but then the question begs whether to use shallow or deep > copying -- if a local variable in a generator references a list, > should the list be copied or not? And if it should be copied, should > it be a deep or shallow copy? There's no good answer without knowing > the intention of the programmer.) Hmmm, there's worse -- if a generator uses an iterator the latter should be cloned, not copied, to produce the generator-clone effect, e.g.

    def by2(it):
        for x in it:
            yield x*2

If it is a list I don't think this is a problem -- already now the user cannot change it for the lifetime of iterators produced by by2(it) without weird effects, e.g. "for x in by2(L): L.append(x)" gives an infinite loop. But if it is an iterator it should be cloned at the time an iterator produced by by2(it) is cloned. Eeep. No, you're right, in the general case I cannot see how to clone generator-produced iterators. > > It seems to me that the ability to back up and that of snapshotting > > are somewhat independent. > > Backing up suggests a strictly limited buffer; cloning suggests a Unless you need to provide "unlimited undo", yes, but that's a harder problem anyway (needing different architecture). > > may be just because it's the one case for which I happened to > > stumble on some use cases in production (apart from "undoing", which > > isn't too bad to handle in other ways anyway).
> > I'd like to hear more about those cases, to see if they really need > cloning (:-) or can live with a fixed limited backup capability. I have an iterator it whose items, after an arbitrary prefix terminated by the first empty item, are supposed to be each 'yes' or 'no'. I need to process it with different functions depending if it has certain proportions of 'yes'/'no' (and yet another function if it has any invalid items) -- each of those functions needs to get the iterator from right after that 'first empty item'. Today, I do:

    def dispatchyesno(it, any_invalid, selective_processing):
        # skip the prefix
        for x in it:
            if not x: break
        # snapshot the rest
        snap = list(it)
        it = iter(snap)
        # count and check
        yeses = noes = 0
        for x in it:
            if x=='yes': yeses += 1
            elif x=='no': noes += 1
            else: return any_invalid(snap)
        total = float(yeses+noes)
        if not total:
            raise ValueError, "sequence empty after prefix"
        ratio = yeses / total
        for threshold, function in selective_processing:
            if ratio <= threshold: return function(snap)
        raise ValueError, "no function to deal with a ratio of %s" % ratio

(yes, I could use bisect, but the number of items in selective_processing is generally quite low so I didn't bother). Basically, I punt and "snapshot" by making a list out of what is left of my iterator after the prefix. That may be the best I can do in some cases, but in others it's a waste. (Oh well, at least infinite iterators are not a consideration here, since I do need to exhaust the iterator to get the ratio:-). What I plan to do if this becomes a serious problem in the future is add something like an optional 'clone=None' argument so I can code:

    if clone is None:
        snap = list(it)
        it = iter(snap)
    else:
        snap = clone(it)

instead of what I have hardwired now.
But, I _would_ like to just do, e.g.:

    try: snap = it.clone()
    except AttributeError:
        snap = list(it)
        it = iter(snap)

using some standardized protocol for "easily clonable iterators" rather
than requiring such awareness of the issue on the caller's part.

> I think a standard backup wrapper would be a useful thing to have
> (maybe in itertools?); since generator functions can't be cloned, I'm
> going to push back on the need for cloning for now until I see a lot
> more non-toy evidence.

Very reasonable, sure.  I suspect the discussion of backup wrapper is
best moved to another thread, given this msg is so long and there are
all the usual finicky details to nail down....

Alex

From aleaxit at yahoo.com  Sun Oct 19 06:25:34 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 19 06:25:40 2003
Subject: [Python-Dev] why The Trick can't work
In-Reply-To: <3F91C25E.2050409@Acm.Org>
References: <3F91C25E.2050409@Acm.Org>
Message-ID: <200310191225.34383.aleaxit@yahoo.com>

On Sunday 19 October 2003 00:44, Scott David Daniels wrote:
   ...
> >>There is also a problem with the strategy if it gets called by a
> >>C only extension.  It is perfectly feasible for a C extension to
> >>hold the reference to an object, call the copying sort (directly
> >>or indirectly), and then be very surprised that the copy did not
> >>take place.
> >
> > Alas, I fear you're right.  Darn--so much for a possible little but
> > cheap optimization (which might have been neat in PySequence_List
   ...
> I'm afraid I'm confused here.  If the C code is like:
>
>      ... at this point PTR refers to an object with refcount 1
>      OTHER = (PTR)
>      ... Then it might be that PTR == OTHER ...
>
> What possible harm could come?  The C code should expect a
> sortcopy method to recycle the object referred to by PTR
> if "the Trick" isn't used.

No!  The point of a sorted *copy* is to NOT "recycle the object",
else you'd just call PyList_Sort.
There is no precedent for a C function documented as "may steal a reference if it feels like it but need not". Without The Trick, PTR* is unchanged -- so it cannot be changed by The Trick without exposing weirdness for such a C-coded extension in these circumstances. I think such a C-coded extension must ALREADY avoid calling filter under such circumstances (haven't tested yet) -- and that undocumented issue is bad enough already... > I am a trifle confused about > what harm occurs. Seems to me that list(v) (and alist[:]) > could quite happily implement "the Trick" without fear of > failure. Yes, but they're not "C-coded extensions". These Python expressions cause PySequence_List and PyList_CopySlice respectively to be called (in the end it always goes rapidly to the copy-slice one), and it's not trivial to find a spot that is NOT callable by a C-coded extension where The Trick could live safely. Alex From ncoghlan at iinet.net.au Sun Oct 19 06:28:45 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Oct 19 06:29:32 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310182224.33499.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au> <200310182224.33499.aleaxit@yahoo.com> Message-ID: <3F92675D.4070406@iinet.net.au> Alex Martelli strung bits together to say: > But, for the general case: the BDFL has recently Pronounced that he does > not LIKE chaining and doesn't want to encourage it in the least. Yes, your > trick does allow chaining, but the repeated chain(...) calls are cumbersome > enough to not count as an encouragement IMHO;-). Well, yes, that was sort of the point. For those who _really_ like chaining (I'm not one of them - I agree with Guido that it is less readable and harder to maintain), the 'chain' function provides a way to do it with what's already in the language. Cheers, Nick. 
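[Editor's note: the 'chain' function Nick mentions is not shown in this excerpt. A plausible reconstruction — the name comes from the thread, the exact signature is guessed — is a helper that calls a None-returning mutator such as list.sort and hands the object back:]

```python
def chain(obj, methodname, *args, **kwds):
    # Guessed reconstruction: invoke a mutating method that returns
    # None (e.g. list.sort, list.append) and return the object so
    # calls can be strung together.
    getattr(obj, methodname)(*args, **kwds)
    return obj
```

A use such as `chain(chain(list(tasks), 'sort'), 'reverse')` illustrates exactly the cumbersome repeated chain(...) calls Alex objects to above.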
--
Nick Coghlan               |  Brisbane, Australia
ICQ#: 68854767             |  ncoghlan@email.com
Mobile: 0409 573 268       |  http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."

From skip at manatee.mojam.com  Sun Oct 19 08:00:48 2003
From: skip at manatee.mojam.com (Skip Montanaro)
Date: Sun Oct 19 08:00:54 2003
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200310191200.h9JC0m9G003255@manatee.mojam.com>


Bug/Patch Summary
-----------------

545 open / 4255 total bugs (+44)
210 open / 2423 total patches (+13)

New Bugs
--------

tarfile exception on large .tar files  (2003-10-13)
        http://python.org/sf/822668

Telnet.read_until() timeout parameter misleading  (2003-10-13)
        http://python.org/sf/822974

cmath.log doesn't have the same interface as math.log.  (2003-10-13)
        http://python.org/sf/823209

urllib2 digest auth is broken  (2003-10-14)
        http://python.org/sf/823328

os.strerror doesn't understand windows error codes  (2003-10-14)
        http://python.org/sf/823672

ntpath.expandvars doesn't expand Windows-style variables.  (2003-10-15)
        http://python.org/sf/824371

exception with Message.get_filename()  (2003-10-15)
        http://python.org/sf/824417

Package Manager Scrolling Behavior  (2003-10-15)
        http://python.org/sf/824430

bad value of INSTSONAME in Makefile  (2003-10-15)
        http://python.org/sf/824565

dict.__init__ doesn't call subclass's __setitem__.  (2003-10-16)
        http://python.org/sf/824854

Memory error on AIX in email.Utils._qdecode  (2003-10-16)
        http://python.org/sf/824977

code.InteractiveConsole interprets escape chars incorrectly  (2003-10-17)
        http://python.org/sf/825676

reference to Built-In Types section in file() documentation  (2003-10-17)
        http://python.org/sf/825810

Class Problem with repr and getattr on PY2.3.2  (2003-10-18)
        http://python.org/sf/826013

New Patches
-----------

add option to NOT use ~/.netrc in nntplib.NNTP()  (2003-10-13)
        http://python.org/sf/823072

Updated .spec file.
  (2003-10-14)
        http://python.org/sf/823259

use just built python interp. to build the docs.  (2003-10-14)
        http://python.org/sf/823775

Add additional isxxx functions to string object.  (2003-10-16)
        http://python.org/sf/825313

telnetlib timeout fix (bug 822974)  (2003-10-17)
        http://python.org/sf/825417

let's get rid of cyclic object comparison  (2003-10-17)
        http://python.org/sf/825639

Add list.copysort()  (2003-10-17)
        http://python.org/sf/825814

cmath.log optional base argument, fixes #823209  (2003-10-18)
        http://python.org/sf/826074

Closed Bugs
-----------

tempfile.mktemp() for directories  (2002-11-22)
        http://python.org/sf/642391

MacOS.Error for platform.mac_ver under OS X  (2003-07-30)
        http://python.org/sf/780461

access fails on Windows with Unicode file name  (2003-08-17)
        http://python.org/sf/789995

a bug in IDLE on Python 2.3 i think  (2003-08-17)
        http://python.org/sf/790162

mkstemp doesn't return abspath  (2003-09-21)
        http://python.org/sf/810408

Google kills socket lookup  (2003-10-04)
        http://python.org/sf/817611

installer wakes up Windows File Protection  (2003-10-05)
        http://python.org/sf/818029

PythonIDE interactive window Unicode bug  (2003-10-08)
        http://python.org/sf/819860

tkinter's 'after' and 'threads' on multiprocessor  (2003-10-09)
        http://python.org/sf/820605

reduce docs neglect a very important piece of information.
  (2003-10-11)
        http://python.org/sf/821701

Closed Patches
--------------

642391: tempfile.mktemp() docs to include dir info  (2003-01-04)
        http://python.org/sf/662475

Documentation for platform module  (2003-08-08)
        http://python.org/sf/785752

Tidying error messages in compile.c  (2003-08-21)
        http://python.org/sf/792869

Windows installer changes for 2.3.1  (2003-08-28)
        http://python.org/sf/796919

Mention behavior of seek() on text files  (2003-09-19)
        http://python.org/sf/809535

fix for mkstemp with relative paths (bug #810408)  (2003-09-22)
        http://python.org/sf/810914

fix doc typos  (2003-10-10)
        http://python.org/sf/821093

From niemeyer at conectiva.com  Sun Oct 19 10:06:39 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sun Oct 19 10:07:50 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook>
Message-ID: <20031019140638.GA23157@ibook>

> However, I'm concerned that track is completely lost as to how SRE
> works -

You may have lost it. I haven't.

> is it or is it not the case that the current implementation
> which is in CVS is recursive, with arbitrary deep nesting? If it is

It is *NOT* the case. Do the following test: set USE_RECURSION_LIMIT
to *2*, and run the tests.

There's a single case of single recursion, that's why it can't be 1:
when SRE_COUNT() is called from the main loop, and then it calls
SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never*
recurse again, unless there's a serious bug in the expression compiler
(which is not the case right now). To give an example of when this
single recursion happens, suppose the following expression: "[ab]*".

> not recursive anymore (which the subject suggests), then why is
> the 'level' argument still in? Can we or can we not remove the ad-hoc
> determination of USE_RECURSION_LIMIT?

USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already
discussed this in other messages.
> > Yeah.. I can clean it. let's please wait a little bit to
> > see the new code working?
>
> Certainly. However, I was hoping that we have better means of finding
> out whether the code still does what it is supposed to do than
> testing. Perhaps that is an illusion.

I'm shocked. Do you really believe that I've done all the changes and
past fixes in SRE without knowing how it works? I thought my
credibility was a little higher.

--
Gustavo Niemeyer
http://niemeyer.net

From jacobs at penguin.theopalgroup.com  Sun Oct 19 10:33:00 2003
From: jacobs at penguin.theopalgroup.com (Kevin Jacobs)
Date: Sun Oct 19 10:33:05 2003
Subject: [Python-Dev] why The Trick can't work
In-Reply-To: <200310191225.34383.aleaxit@yahoo.com>
Message-ID:

On Sun, 19 Oct 2003, Alex Martelli wrote:
> On Sunday 19 October 2003 00:44, Scott David Daniels wrote:
> > I'm afraid I'm confused here.  If the C code is like:
> >
> >      ... at this point PTR refers to an object with refcount 1
> >      OTHER = (PTR)
> >      ... Then it might be that PTR == OTHER ...
> >
> > What possible harm could come?  The C code should expect a
> > sortcopy method to recycle the object referred to by PTR
> > if "the Trick" isn't used.
>
> No!  The point of a sorted *copy* is to NOT "recycle the object",
> else you'd just call PyList_Sort.  There is no precedent for a C
> function documented as "may steal a reference if it feels like it
> but need not".  Without The Trick, PTR* is unchanged -- so it
> cannot be changed by The Trick without exposing weirdness
> for such a C-coded extension in these circumstances.

Even worse, the C code may not know that it is calling a copy sort, and
assume that the new object is distinct from the original.  I'm not
saying that such C code is optimal, or even correct, but I suspect that
a great deal of it does exist.

Even worse than that is the possibility that a list subclass can change
the behavior of copysort in such a way that it changes the refcount of
the object within the copysort call.
This could confound any code naively attempting to use refcounts to detect the "optimization", either by masking a copy or masking the lack of a copy. So I'm not saying that it isn't a neat idea, but it does present more problems than it solves. Besides that, I find it no burden to copy then sort, or write a utility function that does so -- and I manage a project with well over 1 million lines of Python/C-extension code at the moment. -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From guido at python.org Sun Oct 19 12:30:15 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 12:30:28 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 12:05:56 +0200." <200310191205.57016.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com> <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> Message-ID: <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> > > A problem I have with making iterator cloning a standard option is > > that this would pretty much require that all iterators for which > > cloning can be implemented should implement clone(). That in turn > > means that iterator implementors have to work harder (sometimes > > cloning can be done cheaply, but it might require a different > > refactoring of the iterator implementation). > > Making iterator authors aware of their clients' possible need to clone > doesn't sound bad to me. There's no _compulsion_ to provide the > functionality, but some "social pressure" to do it if a refactoring can > afford it, well, why not? Well, since it can't be done for the very important class of generators, I think it's better to prepare the users of all iterators for their non-reiterability. 
It would surely be a shame if the social pressure to provide cloning
ended up making generators second-class citizens!

> > I'd like to hear more about those cases, to see if they really need
> > cloning (:-) or can live with a fixed limited backup capability.
>
> I have an iterator it whose items, after an arbitrary prefix terminated by
> the first empty item, are supposed to be each 'yes' or 'no'.

This is a made-up toy example, right?  Does it correspond with
something you've had to do in real life?

> I need to process it with different functions depending if it has certain
> proportions of 'yes'/'no' (and yet another function if it has any invalid
> items) -- each of those functions needs to get the iterator from right
> after that 'first empty item'.
>
> Today, I do:
>
> def dispatchyesno(it, any_invalid, selective_processing):
>     # skip the prefix
>     for x in it:
>         if not x: break
>     # snapshot the rest
>     snap = list(it)
>     it = iter(snap)
>     # count and check
>     yeses = noes = 0
>     for x in it:
>         if x=='yes': yeses += 1
>         elif x=='no': noes += 1
>         else: return any_invalid(snap)
>     total = float(yeses+noes)
>     if not total: raise ValueError, "sequence empty after prefix"
>     ratio = yeses / total
>     for threshold, function in selective_processing:
>         if ratio <= threshold: return function(snap)
>     raise ValueError, "no function to deal with a ratio of %s" % ratio
>
> (yes, I could use bisect, but the number of items in selective_processing
> is generally quite low so I didn't bother).
>
> Basically, I punt and "snapshot" by making a list out of what is left of
> my iterator after the prefix.  That may be the best I can do in some cases,
> but in others it's a waste.  (Oh well, at least infinite iterators are not a
> consideration here, since I do need to exhaust the iterator to get the
> ratio:-).
> What I plan to do if this becomes a serious problem in the
> future is add something like an optional 'clone=None' argument so I
> can code:
>
>     if clone is None:
>         snap = list(it)
>         it = iter(snap)
>     else: snap = clone(it)
>
> instead of what I have hardwired now.  But, I _would_ like to just do, e.g.:
>
>     try: snap = it.clone()
>     except AttributeError:
>         snap = list(it)
>         it = iter(snap)
>
> using some standardized protocol for "easily clonable iterators" rather
> than requiring such awareness of the issue on the caller's part.

Is this from a real app?  What it most reminds me of is parsing email
messages that can come either from a file or from a pipe; often you
want to scan the body to find the end of its MIME structure and then go
back and do things to the various MIME parts.  If you know it comes
from a real file, it's easy to save the file offsets for the parts as
you parse them; but when it's a pipe, that doesn't work.  In practice,
these days, the right thing to do is probably to save the data read
from a pipe to a temp file first, and then parse the temp file; or if
you insist on parsing it as it comes in, copy the data to a temp file
as you go and save file offsets in the temp file.

But I'm not sure that abstracting this away all the way to an iterator
makes sense.  For one, the generic approach to cloning if the iterator
doesn't have __clone__ would be to make a memory copy, but in this app
a disk copy is desirable (I can invent something that overflows to disk
above a certain limit, but it's cumbersome, and you have cleanup
issues, and it needs parameterization since not everybody agrees on
when to spill to disk).
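[Editor's note: the "overflows to disk above a certain limit" idea Guido waves at might be sketched like this for an iterator of text lines — the threshold, the helper name, and leaving cleanup to the caller are all invented here:]

```python
import itertools
import tempfile

def snapshot_lines(it, spill_after=1000):
    """Hypothetical snapshot helper: keep up to spill_after text lines
    in memory; past that, spill everything to a temp file and iterate
    from disk.  Closing the file is left to the caller -- exactly the
    cleanup issue Guido points out."""
    head = list(itertools.islice(it, spill_after))
    extra = next(it, None)
    if extra is None:
        return iter(head)              # small enough: stay in memory
    f = tempfile.TemporaryFile('w+')   # disk copy above the limit
    f.writelines(head)
    f.write(extra)
    f.writelines(it)
    f.seek(0)
    return iter(f)
```

Note how the parameterization problem shows up immediately: there is no universally right value for `spill_after`, which is part of Guido's argument against baking this into a generic iterator protocol.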
Another issue is that the application doesn't require iterating over
the clone and the original iterator simultaneously, but a generic
auto-cloner can't assume that; for files, this would either mean that
each clone must have its own file descriptor (and dup() doesn't cut it
because it shares the file offset), or each clone must keep a file
offset, but now you lose the performance effect of a streaming buffer
unless you code up something extremely hairy with locks etc.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Sun Oct 19 12:50:24 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 19 12:51:08 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310191205.57016.aleaxit@yahoo.com>
Message-ID: <003f01c39661$17d5fd80$e841fea9@oemcomputer>

> > A better name would be clone(); copy() would work too, as long as it's
> > clear that it copies the iterator, not the underlying sequence or
> > series.  (Subtle difference!)
> >
> > Reiteration is a special case of cloning: simply stash away a clone
> > before you begin.

So far, all of my needs for re-iteration have been met by storing some
of the iterator's data.  If all of it needs to be saved, I use list(it).
If only a portion needs to be saved, then I use the code from the tee()
example in the itertools documentation:

    def tee(iterable):
        "Return two independent iterators from a single iterable"
        def gen(next, data={}, cnt=[0]):
            dpop = data.pop
            for i in itertools.count():
                if i == cnt[0]:
                    item = data[i] = next()
                    cnt[0] += 1
                else:
                    item = dpop(i)
                yield item
        next = iter(iterable).next
        return (gen(next), gen(next))

Raymond Hettinger

From itamar at itamarst.org  Sun Oct 19 13:05:39 2003
From: itamar at itamarst.org (Itamar Shtull-Trauring)
Date: Sun Oct 19 13:06:32 2003
Subject: [Python-Dev] Fw: [Fwd: Re: Python-Dev Digest, Vol 3, Issue 37]
Message-ID: <20031019130539.7a689a26.itamar@itamarst.org>

(Glyph is having some issues sending mail to mail.python.org, so I'm
forwarding this for him.)
-------------- next part --------------
An embedded message was scrubbed...
From: Glyph Lefkowitz
Subject: Re: Python-Dev Digest, Vol 3, Issue 37
Date: Wed, 15 Oct 2003 20:57:52 -0400
Size: 1725
Url: http://mail.python.org/pipermail/python-dev/attachments/20031019/95f7b658/attachment.mht

From python at rcn.com  Sun Oct 19 13:23:40 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 19 13:24:24 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <3F92675D.4070406@iinet.net.au>
Message-ID: <004701c39665$bd6ff440$e841fea9@oemcomputer>

[Alex Martelli]
> > But, for the general case: the BDFL has recently Pronounced that he does
> > not LIKE chaining and doesn't want to encourage it in the least.  Yes,
> > your trick does allow chaining, but the repeated chain(...) calls are
> > cumbersome enough to not count as an encouragement IMHO;-).

[Nick Coghlan]
> Well, yes, that was sort of the point. For those who _really_ like
> chaining (I'm not one of them - I agree with Guido that it is less
> readable and harder to maintain), the 'chain' function provides a way
> to do it with what's already in the language.
Remember, list.copysort() isn't about chaining or even "saving a line
or two".  It is about using an expression instead of a series of
statements.  That makes it possible to use it wherever expressions are
allowed, including function call arguments and list comprehensions.

Here are some examples taken from the patch comments:

    genhistory(date, events.copysort(key=incidenttime))

    todo = [t for t in tasks.copysort() if due_today(t)]

To break these back into multiple statements is to cloud their intent
and take away their expressiveness.  Using multiple statements requires
introducing auxiliary, state-changing variables that remain visible
longer than necessary.  State-changing variables are a classic source
of programming errors.  In contrast, the examples above are clean and
show their correctness without having to mentally decrypt them.

Scanning through the sort examples in the standard library, I see that
the multi-line, statement form is sometimes further clouded by having a
number of statements in-between.  In SimpleHTTPServer.py, for example:

    list = os.listdir(path)
    . . . (yada, yada)
    list.sort(key=lambda a: a.lower())
    . . . (yada, yada, yada)
    for name in list:
        . . .

You see other examples using os.environ and such.

The forces working against introducing an in-line sort are:
* the time to copy the list (which Alex later showed to be irrelevant),
* having two list methods with a similar purpose, and
* the proposed method names are less than sublime

If someone could come up with a name more elegant than "copysort",
the idea would be much more appetizing.

Raymond Hettinger

From martin at v.loewis.de  Sun Oct 19 13:36:04 2003
From: martin at v.loewis.de (Martin v.
=?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 13:36:51 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031019140638.GA23157@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: Gustavo Niemeyer writes: > You may have lost it. I haven't. Very good. > There's a single case of single recursion, that's why it can't be 1: > when SRE_COUNT() is called from the main loop, and then it calls > SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never* > recurse again, unless there's a serious bug in the expression compiler > (which is not the case right now). I see. So there is a guarantee that level will never be larger than 2? > USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already > discussed this in other messages. Maybe we have, but it was not clear to me. It even still isn't: Wouldn't it be possible to leave USE_RECURSION in, and remove USE_RECURSION_LIMIT, and the level argument? > > Certainly. However, I was hoping that we have better means of finding > > out whether the code still does what it is supposed to do than > > testing. Perhaps that is an illusion. > > I'm shocked. Do you really belive that I've done all the changes and > past fixes in SRE without knowing how it works? I thought my > credibility was a little higher. I was relying on your credibility, so I was surprised that you are interested to leave the old code in - that suggests that you feel there are problems with your code. I'm trying to find out what you think these problems are. However, getting all trust in the SRE code from the trust that I have in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing to do with you personally. I feel bad if important code is so unmaintainable that only a single person understands it. I made the remark as a comment to you saying "let's please wait a little bit to see the new code working?" 
which suggested that we actually have to *see* how the code works, in
order to determine whether it works.

Regards,
Martin

From aahz at pythoncraft.com  Sun Oct 19 14:40:16 2003
From: aahz at pythoncraft.com (Aahz)
Date: Sun Oct 19 14:40:57 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook>
Message-ID: <20031019184016.GA21377@panix.com>

On Sun, Oct 19, 2003, Martin v. Löwis wrote:
> Gustavo Niemeyer writes:
>>
>> I'm shocked. Do you really believe that I've done all the changes and
>> past fixes in SRE without knowing how it works? I thought my
>> credibility was a little higher.
>
> I was relying on your credibility, so I was surprised that you are
> interested to leave the old code in - that suggests that you feel
> there are problems with your code. I'm trying to find out what you
> think these problems are.
>
> However, getting all trust in the SRE code from the trust that I have
> in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing
> to do with you personally. I feel bad if important code is so
> unmaintainable that only a single person understands it. I made the
> remark as a comment to you saying
>
> "let's please wait a little bit to see the new code working?"
>
> which suggested that we actually have to *see* how the code works, in
> order to determine whether it works.

As a datapoint, it sounded to me more like Gustavo saying, "I'm pretty
sure I know what I'm doing here, but this is hairy code and I'd like to
keep the old code around for a bit as a cross-check in case it turns
out I'm wrong."  But if Gustavo's code already passes the regex
regression suite, I would say it's just him being suspenders-and-belt
-- which in my book as a tech support-type person is a Good Thing.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From skip at pobox.com Sun Oct 19 14:48:47 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 19 14:49:02 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: <16274.56463.542862.174939@montanaro.dyndns.org> Martin> I feel bad if important code is so unmaintainable that only a Martin> single person understands it. I've never looked at sre or any of the other regular expression engines which have made their way into or been considered for inclusion in Python over the years, but my impression has been that for most there's never more than a small handful of people -- and sometimes only one person -- who truly understands it at any given time. Skip From niemeyer at conectiva.com Sun Oct 19 15:11:44 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sun Oct 19 15:12:57 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: <20031019191144.GA27007@ibook> > > There's a single case of single recursion, that's why it can't be 1: > > when SRE_COUNT() is called from the main loop, and then it calls > > SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never* > > recurse again, unless there's a serious bug in the expression compiler > > (which is not the case right now). > > I see. So there is a guarantee that level will never be larger than 2? Yep. > > USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already > > discussed this in other messages. > > Maybe we have, but it was not clear to me. 
> It even still isn't:
> Wouldn't it be possible to leave USE_RECURSION in, and remove
> USE_RECURSION_LIMIT, and the level argument?

Of course we can. It depends totally on what we want to do. We can
remove it, and then if we enable USE_RECURSION, it may blow up the
stack without being caught.

> I was relying on your credibility, so I was surprised that you are
> interested to leave the old code in - that suggests that you feel
> there are problems with your code. I'm trying to find out what you
> think these problems are.

I don't think there are problems.

> However, getting all trust in the SRE code from the trust that I have
> in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing
> to do with you personally. I feel bad if important code is so
> unmaintainable that only a single person understands it. I made the
> remark as a comment to you saying
>
> "let's please wait a little bit to see the new code working?"
>
> which suggested that we actually have to *see* how the code works, in
> order to determine whether it works.

It's the first time in my entire life I'm being blamed for being careful.

--
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de  Sun Oct 19 15:33:57 2003
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Oct 19 15:34:08 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: <20031019191144.GA27007@ibook>
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> <20031019191144.GA27007@ibook>
Message-ID: <3F92E725.5030308@v.loewis.de>

Gustavo Niemeyer wrote:
> Of course we can. It depends totally on what we want to do. We can
> remove it, and then if we enable USE_RECURSION, it may blow up the
> stack without being caught.

Right.
If the intended usage of USE_RECURSION is only to activate it when
looking into SRE bug reports (to see if the bug goes away when this is
activated), it might not matter that it blows up the stack.

The question at hand is what to do with patch 813391: accept, reject,
defer? I was hoping that we can reject it as outdated, but perhaps
it is not outdated.

Regards,
Martin

From aleaxit at yahoo.com  Sun Oct 19 15:40:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 19 15:40:48 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <004701c39665$bd6ff440$e841fea9@oemcomputer>
References: <004701c39665$bd6ff440$e841fea9@oemcomputer>
Message-ID: <200310192140.43084.aleaxit@yahoo.com>

On Sunday 19 October 2003 07:23 pm, Raymond Hettinger wrote:
   ...
> The forces working against introducing an in-line sort are:
> * the time to copy the list (which Alex later showed to be irrelevant),
> * having two list methods with a similar purpose, and
> * the proposed method names are less than sublime

Good summary (including the parts I snipped).

> If someone could come up with a name more elegant than "copysort",
> the idea would be much more appetizing.

I still think that having it in some module is a bit better than having
it as a method of lists.  The BDFL has already Pronounced that it's too
narrow in applicability for a builtin (and he's right -- as usual), and
that we won't have "grab-bag" module of shortcuts that don't fit well
anywhere else (ditto), and seems very doubtful despite your urgings to
reconsider his stance against adding it as a list method (the
two-methods-similar-purpose issue seems quite relevant).

So, back to what I see as a key issue: a module needs to be "about"
something reasonably specific, such as a data type.  Built-in data
types have no associated module except the builtin one, which is
crowded and needs VERY high threshold for any addition.
So, if I came up with an otherwise wonderful function that works on
sets, arrays, ..., I'd be in clover (there's an obvious module to house
it)... but if the function worked on lists, dicts, files, ..., I'd be
hosed.  Note that module string STILL exists, and still is the ONLY way
to call maketrans, an important function that was deemed inappropriate
as a string method; a function just as important, and just as
inappropriate as a method, that worked on lists, or dicts, or files, or
slices, or ... would be "homeless" and might end up just not entering
the standard library.  In a way this risks making built-in types
"second-class citizens" when compared to types offered by other modules
in the standard library!

I think we SHOULD have modules corresponding to built-in types, if
there are important functions connected with those types but not
appropriate as methods to populate them.  Perhaps we could use the
User*.py modules for the purpose, but making new ones seems better.
Rather than being kept together just by naming conventions, as the
User*.py are, they might be grouped in a package.  Good names are
always a problem, but, say, "tools.lists" might be the module with the
auxiliary tools dealing with lists, if "tools" was the package name --
"tools.dicts", "tools.files", etc, if needed -- "tools.sequences" for
tools equally well suited to all sequences (not just lists) -- of
course, that would retroactively suggest "tools.iters" for your
itertools, oh well, pity, we sure can't rename it without breaking
backwards compatibility:-).

If we had module tools.lists (or utils.lists, whatever) then I think
copysort (by whatever name) would live well there.  copyreverse and
copyextend might perhaps also go there and make Barry happy?-)

Alternatively - we could take a different tack.  copysort is NOT so
much a tool that works on an existing list -- as shown in the code I
posted, thanks to PySequence_List, it's just as easy to make it work on
any sequence (finite iterator or iterable).
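[Editor's note: the module-level helper Alex is arguing a home for is tiny in pure Python. The body is uncontroversial — it mirrors what the C patch does via PySequence_List plus sort — and only its name and location are what the thread is actually debating; 'copysort' is just the thread's working name:]

```python
def copysort(seq, *args, **kwds):
    """Return a new sorted list built from any iterable, leaving the
    original untouched; sort options are passed straight through."""
    new = list(seq)          # accepts any sequence or iterator
    new.sort(*args, **kwds)
    return new
```

(Historically, this is essentially the functionality that landed as the `sorted()` builtin in Python 2.4.)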
So what does it do? It BUILDS a new list object (a sorted one) from any sequence. So -- it's a FACTORY FUNCTION of the list type. Just like, say, dict.fromkeys is a factory function of the dict type. Now, factory functions are "by nature" classmethods of their type object, no? So, we should package THIS factory function just like others -- as a classmethod on list, list.somename, just like dict.fromkeys is a classmethod on dict. In this light, we surely don't want "copy" as a part of the name -- a factory method should be thought of as building a new list, not as copying an old one (particularly because it will work on any sequence as its argument, of course). Maybe list.buildsorted if we want to emphasize the "build" part. Or list.newsorted to emphasize that a new object is returned. Or maybe, like in dict.fromkeys, we don't want to emphasize either the building or the newness, but then I wouldn't know what to suggest except the list.sorted that's already drawn catcalls (though it drew them when it was proposed as an instance method of lists -- maybe as a classmethod it will look better?-) I want the functionality -- any sensible name that might let the functionality into the standard library would be ok by me (so would one putting the functionality in as a builtin or as an instance method of lists, actually, but I _do_ believe those would not be the best places for this functionality, by far). 
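[Editorial aside: the factory-classmethod idea above can be sketched in a few lines. The name "newsorted" is one of the hypothetical candidates discussed, hung off an illustrative list subclass rather than the builtin; the code is shown in runnable form, not as any actual patch. (Python 2.4 eventually settled on a sorted() builtin instead.)]

```python
# Illustrative sketch only: "newsorted" and the xlist subclass are
# hypothetical names, standing in for the proposed list classmethod.
class xlist(list):
    def newsorted(cls, iterable):
        # build a NEW list from any sequence or iterator
        # (the same effect PySequence_List gives at the C level),
        # then sort it in place and return it
        result = cls(iterable)
        result.sort()
        return result
    newsorted = classmethod(newsorted)   # pre-decorator spelling, 2.3-era

print(xlist.newsorted('cba'))           # -> ['a', 'b', 'c']
print(xlist.newsorted(iter((3, 1, 2)))) # works on any iterable, too
```

Like dict.fromkeys, it is called on the type, not on an instance, which is what makes it read as a factory rather than as a copy operation.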
I hope the "tools package" idea and/or the classmethod one find favour...!-) Alex From niemeyer at conectiva.com Sun Oct 19 15:40:25 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sun Oct 19 15:41:36 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <3F92E725.5030308@v.loewis.de> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> <20031019191144.GA27007@ibook> <3F92E725.5030308@v.loewis.de> Message-ID: <20031019194025.GA27237@ibook> > The question at hand is what to do with patch 813391: accept, reject, > defer? I was hoping that we can reject it as outdated, but perhaps > it is not outdated. IMO, reject it as outdated. This problem doesn't exist anymore. -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Sun Oct 19 15:47:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 15:47:37 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <003f01c39661$17d5fd80$e841fea9@oemcomputer> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> Message-ID: <200310192147.30590.aleaxit@yahoo.com> On Sunday 19 October 2003 06:50 pm, Raymond Hettinger wrote: ... > If only a portion needs to be saved, then I use the code from the tee() > example in the itertools documentation: VERY very neat indeed. OK, I've gotta re-read that documentation -- mea culpa for NOT recalling such juicy contents. I'd also _really_ like to have this anything-but-trivial code available for non-copy-and-paste reuse (i.e., in the library rather than just in its docs)...! OK, I retract my requests for snapshottability -- and change them into a request to have this 'tee' in the library!-) Alex From sholden at holdenweb.com Sun Oct 19 15:53:37 2003 From: sholden at holdenweb.com (Steve Holden) Date: Sun Oct 19 15:58:19 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: Message-ID: [Gustavo] > > > > I'm shocked. 
Do you really believe that I've done all the changes and > past fixes in SRE without knowing how it works? I thought my > credibility was a little higher. > [Martin] > I was relying on your credibility, so I was surprised that you are > interested to leave the old code in - that suggests that you feel > there are problems with your code. I'm trying to find out what you > think these problems are. > > However, getting all trust in the SRE code from the trust that I have > in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing > to do with you personally. I feel bad if important code is so > unmaintainable that only a single person understands it. I made the > remark as a comment to you saying > > "let's please wait a little bit to see the new code working?" > > which suggested that we actually have to *see* how the code works, in > order to determine whether it works. > Martin: I suspect that Gustavo is suffering from an excess of care and modesty: after all, with CVS controlling the code it isn't hard to back out a patch if it turns out to be a bad idea. But it won't, will it, Gustavo ;-)? regards -- Steve Holden +1 703 278 8281 http://www.holdenweb.com/ Improve the Internet http://vancouver-webpages.com/CacheNow/ Python Web Programming http://pydish.holdenweb.com/pwp/ Interview with GvR August 14, 2003 http://www.onlamp.com/python/ From aleaxit at yahoo.com Sun Oct 19 16:16:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 16:16:49 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> Message-ID: <200310192216.44849.aleaxit@yahoo.com> On Sunday 19 October 2003 06:30 pm, Guido van Rossum wrote: ... 
> > I have an iterator it whose items, after an arbitrary prefix terminated > > by the first empty item, are supposed to be each 'yes' or 'no'. > > This is a made-up toy example, right? Does it correspond with > something you've had to do in real life? Yes, but I signed an NDA, and thus made irrelevant changes sufficient to completely mask the application area &c (how the prefix's end is found, how the rest of the stream is analyzed to determine how to process it). > But I'm not sure that abstracting this away all the way to an iterator Perhaps I over-abstracted it, but I just love abstracting streams as iterators whenever I can get away with it -- I love the clean, reusable program structure I often get that way, I love the reusable functions it promotes. I guess I'll just build my iterators by suitable factory functions (including "optimized tee-ability" when feasible), tweak Raymond's "tee" to use "optimized tee-ability" when supplied, and tell my clients to build the iterators with my factories if they need memory-optimal tee-ing. As long as I can't share that code more widely, having to use e.g. richiters.iter instead of the built-in iter isn't too bad, anyway. > makes sense. For one, the generic approach to cloning if the iterator > doesn't have __clone__ would be to make a memory copy, but in this app > a disk copy is desirable (I can invent something that overflows to An iterator that knows it's coming from disk or pipe can provide that disk copy (or reuse the existing file) as part of its "optimized tee-ability". > offset), or each clone must keep a file offset, but now you lose the > performance effect of a streaming buffer unless you code up something > extremely hairy with locks etc. ??? when one clone iterates to the end, on a read-only disk file, its seeks (which happen always to be to the current offset) don't remove the benefits of read-ahead done on its behalf by the OS. Maybe you mean something else by "lose the performance effect"? 
As for locks, why? An iterator in general is not thread-safe: if two threads iterate on the same iterator, without providing their own locking, boom. So why should clones imply stricter thread-safety? Alex From skip at pobox.com Sun Oct 19 17:32:06 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 19 17:32:16 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: References: Message-ID: <16275.726.346088.324819@montanaro.dyndns.org> bcannon> Defined macros Py_RETURN_(TRUE|FALSE|NONE) as helper functions bcannon> for returning the specified value. All three Py_INCREF the bcannon> singleton and then return it. ... bcannon> + /* Macro for returning Py_None from a function */ bcannon> + #define Py_RETURN_NONE Py_INCREF(Py_None); return Py_None; ... bcannon> + /* Macros for returning Py_True or Py_False, respectively */ bcannon> + #define Py_RETURN_TRUE Py_INCREF(Py_True); return Py_True; bcannon> + #define Py_RETURN_FALSE Py_INCREF(Py_False); return Py_False; These don't look right to me. First, you have no protection against them being called like this: if (!error) Py_RETURN_TRUE; Second, any time you expect to use a macro in a statement context, I don't think you want to terminate it with a semicolon (the programmer will do that). I would have coded them as #define Py_RETURN_NONE do {Py_INCREF(Py_None); return Py_None;} while (0) #define Py_RETURN_TRUE do {Py_INCREF(Py_True); return Py_True;} while (0) #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) Skip From bac at OCF.Berkeley.EDU Sun Oct 19 17:40:36 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Sun Oct 19 17:41:04 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <16275.726.346088.324819@montanaro.dyndns.org> References: <16275.726.346088.324819@montanaro.dyndns.org> Message-ID: <3F9304D4.70407@ocf.berkeley.edu> Skip Montanaro wrote: > bcannon> Defined macros Py_RETURN_(TRUE|FALSE|NONE) as helper functions > bcannon> for returning the specified value. All three Py_INCREF the > bcannon> singleton and then return it. > > ... > bcannon> + /* Macro for returning Py_None from a function */ > bcannon> + #define Py_RETURN_NONE Py_INCREF(Py_None); return Py_None; > ... > bcannon> + /* Macros for returning Py_True or Py_False, respectively */ > bcannon> + #define Py_RETURN_TRUE Py_INCREF(Py_True); return Py_True; > bcannon> + #define Py_RETURN_FALSE Py_INCREF(Py_False); return Py_False; > > These don't look right to me. First, you have no protection against them > being called like this: > > if (!error) > Py_RETURN_TRUE; > Realized that after my first commit. Already fixed. > Second, any time you expect to use a macro in a statement context, I don't > think you want to terminate it with a semicolon (the programmer will do > that). I would have coded them as > > #define Py_RETURN_NONE do {Py_INCREF(Py_None); return Py_None;} while (0) > #define Py_RETURN_TRUE do {Py_INCREF(Py_True); return Py_True;} while (0) > #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) > Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why bother with the do/while loop? 
-Brett From aleaxit at yahoo.com Sun Oct 19 18:31:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 18:31:39 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <3F9304D4.70407@ocf.berkeley.edu> References: <16275.726.346088.324819@montanaro.dyndns.org> <3F9304D4.70407@ocf.berkeley.edu> Message-ID: <200310200031.32498.aleaxit@yahoo.com> On Sunday 19 October 2003 11:40 pm, Brett C. wrote: ... > #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) > > Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a > curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why Not in C: the extra semicolon is an empty statement. So, for example if(...) { } ; else is a syntax error. > bother with the do/while loop? To let the user put a semicolon after the macro and get correct C code. Alex From bac at OCF.Berkeley.EDU Sun Oct 19 18:40:34 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sun Oct 19 18:41:19 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <200310200031.32498.aleaxit@yahoo.com> References: <16275.726.346088.324819@montanaro.dyndns.org> <3F9304D4.70407@ocf.berkeley.edu> <200310200031.32498.aleaxit@yahoo.com> Message-ID: <3F9312E2.8050807@ocf.berkeley.edu> Alex Martelli wrote: > On Sunday 19 October 2003 11:40 pm, Brett C. wrote: > ... > >>#define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) >> >>Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a >>curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why > > > Not in C: the extra semicolon is an empty statement. So, for example > > if(...) { > } ; else > > is a syntax error. > > >>bother with the do/while loop? > > > To let the user put a semicolon after the macro and get correct C code. > > Nuts. Time for another commit... 
-Brett From greg at cosc.canterbury.ac.nz Sun Oct 19 19:23:43 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:24:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz> Guido: > the proposed notation doesn't return a list. > ... > I don't have a proposal for generator comprehension syntax though, and > [yield ...] has the same problem. How about just leaving off the brackets? gen = yield x*x for x in stuff Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tdelaney at avaya.com Sun Oct 19 19:34:47 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:34:54 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED1A@au3010avexu1.global.avaya.com> > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > I think we SHOULD have modules corresponding to built-in types, > if there are important functions connected with those types but not > appropriate as methods to populate them. Perhaps we could use the > User*.py modules for the purpose, but making new ones seems > better. Well, we already have a precedent for this - the 'Sets' module. So if we use the same naming convention ... 
For discrete types: Lists Dicts Tuples Sets for interfaces: Iterators Iterables and for a catch-all Objects Then we just have to argue over what goes where ;) Tim Delaney From tdelaney at avaya.com Sun Oct 19 19:40:57 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:41:03 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> > From: Steve Holden [mailto:sholden@holdenweb.com] > > > Martin: I suspect that Gustavo is suffering for an excess of care and > modesty: after all, with CVS controlling the code it isn't > hard to back > out a patch if it turns out to be a bad idea. But it won't, will it, > Gustavo ;-)? Perhaps a comment that the patch won't be accepted until the dead code has been removed, but that the dead code is there for ease of regression testing during the initial testing period? Essentially, this is an alpha-level patch. When the dead code is removed it becomes a beta-level patch. Tim Delaney From tdelaney at avaya.com Sun Oct 19 19:44:29 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:44:35 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED2C@au3010avexu1.global.avaya.com> > From: Delaney, Timothy C (Timothy) > > Lists > Dicts > Tuples > Sets And for symmetry with Sets, each module should also provide an import of the type that it is about e.g. from Lists import list OK - so it's Monday morning ... 
;) Tim DElaney From greg at cosc.canterbury.ac.nz Sun Oct 19 19:45:04 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:45:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Message-ID: <200310192345.h9JNj4I23163@oma.cosc.canterbury.ac.nz> Sean Ross generated: > # (3) parentheses > sumofsquares = sum((yield x*x for x in myList)) I think this one illustrates why requiring parentheses around a bare "yield..." would be a bad idea. > # (14) unpacking (*) > sumofsquares = sum(*[x*x for x in myList]) That already has a meaning (you're passing the result of a list comp as a * argument to the function). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sun Oct 19 19:54:27 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:54:37 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <200310192354.h9JNsRK23207@oma.cosc.canterbury.ac.nz> Guido: > but perhaps we can make this work: > > sum(x for x in S) But if "x for x in S" were a legal expression on its own, returning a generator, then [x for x in S] would have to be a 1-element list containing a generator. Unless you're suggesting that it should be a special feature of the function call syntax? That would be bizarre... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sun Oct 19 20:08:31 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:08:45 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017151235.034fad20@mail.telecommunity.com> Message-ID: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> "Phillip J. Eby" : > If you look at it this way, then you can consider [x for x in S] to be > shorthand syntax for list(x for x in S), as they would both produce the > same result. However, IIRC, the current listcomp implementation actually > binds 'x' in the current local namespace, whereas the generator version > would not. Are we sure about that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 20:23:12 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 20:23:21 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 12:50:24 EDT." <003f01c39661$17d5fd80$e841fea9@oemcomputer> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> Message-ID: <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> > So far, all of my needs for re-iteration have been met by storing some > of the iterator's data. If all of it needs to be saved, I use list(it). 
> If only a portion needs to be saved, then I use the code from the tee() > example in the itertools documentation: > > def tee(iterable): > "Return two independent iterators from a single iterable" > def gen(next, data={}, cnt=[0]): > dpop = data.pop > for i in itertools.count(): > if i == cnt[0]: > item = data[i] = next() > cnt[0] += 1 > else: > item = dpop(i) > yield item > next = iter(iterable).next > return (gen(next), gen(next)) Ouch. That required hard work to understand! :-) And it doesn't generalize straightforwardly to three or more iterators. This approach is nice if you expect the two iterators to remain close together. But if they go far apart (without degenerating to the list(it) case like Alex's example) I imagine that a different data structure than a dict would be more efficient to hold the queue. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sun Oct 19 20:34:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:35:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Message-ID: <200310200034.h9K0YsQ23385@oma.cosc.canterbury.ac.nz> Sean Ross : > # (1) without parentheses: > B(y) for y in A(x) for x in myIterable Er, excuse me, but that had better *not* be equivalent to > # (2) for clarity, we'll add some optional parentheses: > B(y) for y in (A(x) for x in myIterable) because the former ought to be a single iterator expression with two nested loops (albeit an erroneous one, since x is being used before it's bound). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 20:40:37 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 20:40:43 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 22:16:44 +0200." <200310192216.44849.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> Message-ID: <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> > > > I have an iterator it whose items, after an arbitrary prefix > > > terminated by the first empty item, are supposed to be each > > > 'yes' or 'no'. > > > > This is a made-up toy example, right? Does it correspond with > > something you've had to do in real life? > > Yes, but I signed an NDA, and thus made irrelevant changes > sufficient to completely mask the application area &c (how is the > prefix's end is found, how the rest of the stream is analyzed to > determine how to process it). OK, but that does make it harder to judge its value for making the case for iterator cloning, because you're not saying anything about the (range of) characteristics of the input iterator. > > But I'm not sure that abstracting this away all the way to an iterator > > Perhaps I over-abstracted it, but I just love abstracting streams as > iterators whenever I can get away with it -- I love the clean, > reusable program structure I often get that way, I love the reusable > functions it promotes. But when you add more behavior to the iterator protocol, a lot of the cleanliness goes away; any simple transformation of an iterator using a generator function loses all the optional functionality. 
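[Editorial aside: Guido's point -- that any simple generator-based transformation strips an iterator of optional protocol extensions -- is easy to demonstrate with a toy iterator. The clone() method below is hypothetical, standing in for the proposed __clone__; the code is shown in present-day syntax.]

```python
class CloneableRange:
    # toy iterator sporting a hypothetical clone() extension
    def __init__(self, n, pos=0):
        self.n, self.pos = n, pos
    def __iter__(self):
        return self
    def __next__(self):
        if self.pos >= self.n:
            raise StopIteration
        self.pos += 1
        return self.pos - 1
    def clone(self):
        # a clone resumes from the current position, independently
        return CloneableRange(self.n, self.pos)

def doubled(it):
    # the most natural transformation: a plain generator function...
    for x in it:
        yield 2 * x

it = CloneableRange(3)
hasattr(it, 'clone')            # True
hasattr(doubled(it), 'clone')   # False -- the extension is gone
```

The generator object returned by doubled() speaks only the bare iterator protocol, so any tee/clone optimization the wrapped iterator offered is unreachable downstream.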
> I guess I'll just build my iterators by suitable factory functions > (including "optimized tee-ability" when feasible), tweak Raymond's > "tee" to use "optimized tee-ability" when supplied, and tell my > clients to build the iterators with my factories if they need > memory-optimal tee-ing. As long as I can't share that code more > widely, having to use e.g. richiters.iter instead of the built-in > iter isn't too bad, anyway. But you can't get the for-loop to use richiters.iter (you'd have to add an explicit call to it). And you can't use any third party or standard library code for manipulating iterators; you'd have to write your own clone of itertools. > > makes sense. For one, the generic approach to cloning if the > > iterator doesn't have __clone__ would be to make a memory copy, > > but in this app a disk copy is desirable (I can invent something > > that overflows to > > An iterator that knows it's coming from disk or pipe can provide > that disk copy (or reuse the existing file) as part of its > "optimized tee-ability". At considerable cost. > > offset), or each clone must keep a file offset, but now you lose > > the performance effect of a streaming buffer unless you code up > > something extremely hairy with locks etc. > > ??? when one clone iterates to the end, on a read-only disk file, > its seeks (which happen always to be to the current offset) don't > remove the benefits of read-ahead done on its behalf by the OS. > Maybe you mean something else by "lose the performance effect"? I wasn't thinking about the OS read-ahead, I was thinking of stdio buffering, and the additional buffering done by file.next(). (See readahead_get_line_skip() in fileobject.c.) This buffering has made "for line in file" in 2.3 faster than any way of iterating over the lines of a file previously available. Also, on many systems, every call to fseek() drops the stdio buffer, even if the seek position is not actually changed by the call. 
It could be done, but would require incredibly hairy code. > As for locks, why? An iterator in general is not thread-safe: if > two threads iterate on the same iterator, without providing their > own locking, boom. So why should clones imply stricter > thread-safety? I believe I was thinking of something else; the various iterators iterating over the same file would somehow have to communicate to each other who used the file last, so that repeated next() calls on the same iterator could know they wouldn't have to call seek() and hence lose the readahead buffer. This doesn't require locking in the thread sense, but feels similar. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sun Oct 19 20:45:00 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:45:24 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> Message-ID: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> Raymond Hettinger : > Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) I could probably live with it, but it would be so much nicer if the "yield" could be dispensed with. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 21:04:14 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 21:04:23 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 17:23:12 PDT." 
<200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> Message-ID: <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com> FWIW, I partially withdraw my observation that reiterability is a special case of cloneability. It is true that if you have cloneability you have reiterability. But I hadn't realized that reiterability is sometimes easier than cloneability! Cloning a generator function at an arbitrary point is not doable; but cloning a generator function at the start would be as easy as saving the function and its arguments. But this doesn't make me any more comfortable with the idea of adding reiterability as an iterator feature (even optional). An iterator represents the rest of the sequence of values it will generate. But if we add reiterability into the mix, an iterator represents two different sequences: its "full" sequence, accessible via its reiter() method (or whatever it would be called), and its "current" sequence. The latter may be different, because when you get passed an iterator, whoever passed it might already have consumed some items; this affects the "current" sequence but not the sequence returned by reiter(). (Cloning doesn't have this problem, but its other problems make up for this.) If you prefer to see a code sample explaining the problem: consider a consumer of a reiterable iterator: def printwice(it): for x in it: print x for x in it.reiter(): print x Now suppose the following code that calls it: r = range(10) it = iter(r) # assume this is reiterable it.next() # skip first item printwice(it) This prints 1...9 followed by 0...9 !!! 
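[Editorial aside: the pitfall is easy to reproduce with a toy reiterable. The reiter() method is the hypothetical API under discussion, not anything Python provides; present-day syntax.]

```python
class ReiterableIter:
    # hypothetical "reiterable" iterator: iteration consumes items,
    # but reiter() always restarts from the full underlying sequence
    def __init__(self, seq):
        self._seq = seq
        self._it = iter(seq)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._it)
    def reiter(self):
        return ReiterableIter(self._seq)

it = ReiterableIter(range(10))
next(it)              # the caller consumes the first item...
list(it)              # "current" sequence: [1, ..., 9]
list(it.reiter())     # "full" sequence: [0, ..., 9] -- the surprise
```

The two passes disagree exactly as described: the iterator is carrying two different sequences at once.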
The solution using cloning wouldn't have this problem: def printwice(it): it2 = it.clone() for x in it: print x for x in it2: print x With reiter() it becomes hard to explain what the input requirements are for the function to work correctly; effectively, it would require a "virginal" (== has never been used :-) reiterable iterator. So we might as well require a container! If you don't have a container but you have a description of a series, Alex's Reiterable can easily fix this: class Reiterable: def __init__(self, func, *args): self.func, self.args = func, args def __iter__(self): return self.func(*self.args) This should be called with e.g. a generator function and an argument list for it. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sun Oct 19 21:26:59 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sun Oct 19 21:27:10 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <200310200000.h9K006h19965@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> <3F9203A5.2030407@ocf.berkeley.edu> <200310191433.h9JEXSL19256@12-236-54-216.client.attbi.com> <3F930373.8010809@ocf.berkeley.edu> <200310200000.h9K006h19965@12-236-54-216.client.attbi.com> Message-ID: <3F9339E3.30605@ocf.berkeley.edu> Guido van Rossum wrote: >>Now, where do the macros get documented? In the Python/C API docs all I >>see is docs for None in 7.1.2 . Is that the proper place to document >>Py_RETURN_NONE? Where are the docs for Py_True and Py_False? > > > Um, maybe Martin has an idea? I've not looked at the doc structure > for years. If Py_True/False aren't documented, maybe they should be > added? Otherwise I suggest you throw this back to python-dev and hope > Fred responds. 
:-) > Argh! Mis-clicked and hit Reply instead of Reply-All. Joys of a new email client. I couldn't find any docs for Py_True/False both in terms of the TOC and the index. This has turned out to be such a crap day in so many ways it is unbelievable. -Brett From greg at cosc.canterbury.ac.nz Sun Oct 19 23:49:49 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 23:50:33 2003 Subject: [Python-Dev] How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: <3F91F743.6090801@ocf.berkeley.edu> Message-ID: <200310200349.h9K3nnY24004@oma.cosc.canterbury.ac.nz> "Brett C." : > So Py_return_None or Py_RETURN_NONE ? PyReturn_None, PyReturn_True, PyReturn_False Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 20 00:44:45 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 20 00:45:32 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> Message-ID: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Guido on iterator comprehensions: > The real issue is whether it adds enough to make it worthwhile to > change the language (again). > > My current opinion is that it isn't Maybe it's time to get back to what started all this, which was a desire for an accumulation syntax. (Actually it was a proposal to abuse a proposed accumulation syntax to get sorting, if I remember correctly, but let's ignore that detail for now...) Most of us seem to agree that having list comprehensions available as a replacement for map() and filter() is a good thing. But what about reduce()? Are there equally strong reasons for wanting an alternative to that, too? 
If not, why not? And if we do, maybe a general iterator comprehension
syntax isn't the best way to go. It seemed that way at first, but that
seems to have led us into a bit of a quagmire.

So, taking the original accumulator display idea, and incorporating
some of the ideas that have come up along the way, such as getting rid
of the square brackets, how about

    sum of x*x for x in xvalues
    average of g for g in grades
    maximum of f(x, y) for x in xrange for y in yrange
    top(10) of humour(joke) for joke in comedy

etc.?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From martin at v.loewis.de Mon Oct 20 01:52:33 2003
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Oct 20 01:52:37 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
References: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
Message-ID: <3F937821.7050908@v.loewis.de>

Greg Ewing wrote:
> How about just leaving off the brackets?
>
> gen = yield x*x for x in stuff

I think this has a dangling else problem:

    gen = yield x*x for x in yield y+y for y in stuff if x > y

In this expression, how would you put parentheses, and why?
Regards, Martin From martin at v.loewis.de Mon Oct 20 01:54:15 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Oct 20 01:54:30 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> Message-ID: <3F937887.3070505@v.loewis.de> Delaney, Timothy C (Timothy) wrote: > Perhaps a comment that the patch won't be accepted until the dead code > has been removed, but that the dead code is there for ease of regression > testing during the initial testing period? OTOH, the patch has been already committed to CVS head. So it is already accepted. Regards, Martin From aleaxit at yahoo.com Mon Oct 20 02:28:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 02:29:02 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> References: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> Message-ID: <200310200828.57856.aleaxit@yahoo.com> On Monday 20 October 2003 02:45 am, Greg Ewing wrote: > Raymond Hettinger : > > Is Phil's syntax acceptable to everyone? > > > > (yield: x*x for x in roots) > > I could probably live with it, but it would be > so much nicer if the "yield" could be dispensed > with. I've changed my mind, too, btw (pondering on Guido's last msg on the subject): mandatory parentheses but no "yield:" would be quite fine. I realized I didn't bother to say so because of Guido's prediction (no pronouncement yet) that this issue will anyway die "like the ternary operator" -- I focused on that one rather than on the detail of _what_ syntax exactly should be NOT adopted:-). 
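With hindsight: the spelling Alex endorses here -- mandatory parentheses,
no "yield" -- is exactly what PEP 289 generator expressions adopted in
Python 2.4. A minimal illustration (modern Python, not part of the
original message):

```python
# PEP 289 generator expressions: parentheses only, no "yield" keyword.
roots = [1, 2, 3]
gen = (x * x for x in roots)   # a generator, not a list
print(list(gen))               # consuming it yields the squares
print(list(gen))               # a generator is one-shot: now exhausted
```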
Alex

From python at rcn.com Mon Oct 20 02:35:49 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 20 02:36:34 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
Message-ID: <001b01c396d4$66a490c0$a426c797@oemcomputer>

[Guido]
> > the proposed notation doesn't return a list.
> > ...
> > I don't have a proposal for generator comprehension syntax though, and
> > [yield ...] has the same problem.

[Greg Ewing]
> How about just leaving off the brackets?
>
> gen = yield x*x for x in stuff

Heck no! Right now, the only way to tell if a function is a generator
is to read through the code looking for a yield. If we do get a
generator comprehension syntax, it *must* be distinctively set off
from the surrounding code.

Brackets accomplish set-off but look too much like lists. Parens
aren't strong enough unless the yield is followed by a colon. Someone
suggested paired angle brackets (lt and gt) but that was promptly shot
down for some reason I can't recall. Curly braces and quotes are
probably out of the question. That leaves only dollar signs and other
perlisms, yuck.

The best so far is (yield: x*x for x in stuff) but someone very
important said they hated it for some reason. Perhaps someone can come
up with some clever, self-explanatory use of --> or some such.

Raymond

From aleaxit at yahoo.com Mon Oct 20 02:46:09 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 02:46:14 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com>
References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com>
Message-ID: <200310200846.09300.aleaxit@yahoo.com>

On Monday 20 October 2003 03:04 am, Guido van Rossum wrote:
> FWIW, I partially withdraw my observation that reiterability is a
> special case of cloneability.
> It is true that if you have
> cloneability you have reiterability. But I hadn't realized that
> reiterability is sometimes easier than cloneability!

Hmmm, I thought I had shown a simple wrapper (holding a callable and
args for it, as you show later) that implied how to wrap, at creation
time, iterators built by iter(sequence) or by generators for
reiterability (but not for cloneability). So, sure, cloneability is
more general (you can use it to implement reiterability, but not VV)
and harder to implement; reiterability IS "a special case" and thus
it's less general but easier to implement.

> But this doesn't make me any more comfortable with the idea of adding
> reiterability as an iterator feature (even optional).

Sure. "Relatively easy to implement" doesn't mean "should be in the
language". Ease of learning, breadth and appropriateness of use, risk
of misuse, ease of substitution if not in the language -- there are so
many considerations!

> With reiter() it becomes hard to explain what the input requirements
> are for the function to work correctly; effectively, it would require
> a "virginal" (== has never been used :-) reiterable iterator. So we

Yes, very good point -- and possibly the explanation of why I never
met a use case for reiterability as such. It's unlikely I want "an
iterator that may already be partly consumed but I can restart from an
unknown-to-me ``previous'' point in its lifetime" -- then I probably
just want an iterable, just as you say.

> might as well require a container! If you don't have a container but
> you have a description of a series, Alex's Reiterable can easily fix
> this:
>
>     class Reiterable:
>         def __init__(self, func, *args):
>             self.func, self.args = func, args
>         def __iter__(self):
>             return self.func(*self.args)
>
> This should be called with e.g. a generator function and an argument
> list for it.
Yes, or the function that needs a callable + args anyway might require the callable and the args as its own arguments instead of wanting them packaged up as an iterable (an iterable's probably better if the typical use case is passing e.g. a list, though -- asking for the "iterator factory" callable might help when the typical use case is passing a generator). Yet another nail in the coffin of "reiterable as a concept in the language", methinks. Alex From aleaxit at yahoo.com Mon Oct 20 03:40:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 03:40:49 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> Message-ID: <200310200940.43021.aleaxit@yahoo.com> On Monday 20 October 2003 02:40 am, Guido van Rossum wrote: ... > > Perhaps I over-abstracted it, but I just love abstracting streams as > > iterators whenever I can get away with it -- I love the clean, > > reusable program structure I often get that way, I love the reusable > > functions it promotes. > > But when you add more behavior to the iterator protocol, a lot of the > cleanliness goes away; any simple transformation of an iterator using > a generator function loses all the optional functionality. It loses the optimization on clonability, only, as far as I can see; i.e. cloning becomes potentially memory-expensive if what I'm cloning (tee-ing, whatever) can't give me an optimized way. I can still code the higher levels based on clean "tee-able streams", and possibly optimize some iterator-factories later if profiling shows they're needed (yet another case where one dreams of a way of profiling MEMORY use, oh well:-). 
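As an aside for later readers: the "tee"-ing being discussed here did
land in the standard library as itertools.tee in Python 2.4. A minimal
sketch of the semantics under debate -- including the internal
buffering whose memory cost Alex worries about for long streams:

```python
from itertools import tee

def squares(n):
    # a one-shot generator: without tee, a second pass would see nothing
    for i in range(n):
        yield i * i

a, b = tee(squares(4))
print(list(a))  # the first branch consumes the underlying stream...
print(list(b))  # ...and the second still sees every item, from tee's buffer
```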
BTW, playing around with some of this it seems to me that the
inability to just copy.copy (or copy.deepcopy) anything produced by
iter(sequence) is more of a bother -- quite apart from clonability (a
similar but separate concept), couldn't those iterators be copy'able
anyway? I.e. just expose underlying sequence and index as their state
for getting and setting? Otherwise to get copyable iterators I have to
reimplement iter "by hand":

    class Iter(object):
        def __init__(self, seq):
            self.seq = seq
            self.idx = 0
        def __iter__(self):
            return self
        def next(self):
            try:
                result = self.seq[self.idx]
            except IndexError:
                raise StopIteration
            self.idx += 1
            return result

and I don't understand the added value of requiring the user to code
this no-added-value, slow-things-down boilerplate.

> > I guess I'll just build my iterators by suitable factory functions
> > (including "optimized tee-ability" when feasible), tweak Raymond's
> > "tee" to use "optimized tee-ability" when supplied, and tell my
> > clients to build the iterators with my factories if they need
> > memory-optimal tee-ing. As long as I can't share that code more
> > widely, having to use e.g. richiters.iter instead of the built-in
> > iter isn't too bad, anyway.
>
> But you can't get the for-loop to use richiters.iter (you'd have to
> add an explicit call to it). And you can't use any third party or

No problem, as the iterator built by the for loop is not exposed in a
way that would ever let me try to tee it anyway.

> standard library code for manipulating iterators; you'd have to write
> your own clone of itertools.

For those itertools functions that may preserve "cheap tee-ability"
only, yes.

> > > makes sense.
For one, the generic approach to cloning if the > > > iterator doesn't have __clone__ would be to make a memory copy, > > > but in this app a disk copy is desirable (I can invent something > > > that overflows to > > > > An iterator that knows it's coming from disk or pipe can provide > > that disk copy (or reuse the existing file) as part of its > > "optimized tee-ability". > > At considerable cost. I'm not sure I see that cost, yet. > > > offset), or each clone must keep a file offset, but now you lose > > > the performance effect of a streaming buffer unless you code up > > > something extremely hairy with locks etc. > > > > ??? when one clone iterates to the end, on a read-only disk file, > > its seeks (which happen always to be to the current offset) don't > > remove the benefits of read-ahead done on its behalf by the OS. > > Maybe you mean something else by "lose the performance effect"? > > I wasn't thinking about the OS read-ahead, I was thinking of stdio > buffering, and the additional buffering done by file.next(). (See > readahead_get_line_skip() in fileobject.c.) This buffering has made > "for line in file" in 2.3 faster than any way of iterating over the Ah, if you're iterating by LINE, yes. I was iterating by fixed-size blocks on binary files in my tests, so I didn't see that effect. > lines of a file previously available. Also, on many systems, every > call to fseek() drops the stdio buffer, even if the seek position is > not actually changed by the call. It could be done, but would require > incredibly hairy code. 
The call to fseek probably SHOULD drop the buffer in a typical C implementation _on a R/W file_, because it's used as the way to signal the file that you're moving from reading to writing or VV (that's what the C standard says: you need a seek between an input op and an immediately successive output op or viceversa, even a seek to the current point, else, undefined behavior -- which reminds me, I don't know if the _Python_ wrapper maintains that "clever" requirement for ITS R/W files, but I think it does). I can well believe that for simplicity a C-library implementor would then drop the buffer on a R/O file too, needlessly but understandably. So, hmmm, wouldn't it suffice to guard the seek call with a condition that the current point in the file isn't already what we want...? [testing, testing...] nope, even just the guard slows things down a LOT. Hmmm, I think .tell IS implemented by a "dummy" .seek, isn't it? So, yes, quite some hairiness (credible or not;-) would be needed to make an iterated-by-lines file good for optimized tee-ability. > > As for locks, why? An iterator in general is not thread-safe: if > > two threads iterate on the same iterator, without providing their > > own locking, boom. So why should clones imply stricter > > thread-safety? > > I believe I was thinking of something else; the various iterators > iterating over the same file would somehow have to communicate to each > other who used the file last, so that repeated next() calls on the > same iterator could know they wouldn't have to call seek() and hence > lose the readahead buffer. This doesn't require locking in the thread > sense, but feels similar. Interesting intuition. 
The "who used this last" code doesn't feel similar to a lock, to me:
i.e., just transforming a plain iterator

    class Lines1(object):
        def __init__(self, f):
            self.f = f
        def __iter__(self):
            return self
        def next(self):
            line = self.f.next()
            return line

into a somewhat more complicated one:

    class Lines(object):
        wholast = {}
        def __init__(self, f):
            self.f = f
            self.wp = f.tell()
        def __iter__(self):
            return self
        def next(self):
            if self.wholast.get(self.f) is not self:
                self.f.seek(self.wp)
                self.wholast[self.f] = self
            line = self.f.next()
            self.wp += len(line)
            return line

(assuming seek "resyncs"). However, a loop using Lines (over
/usr/share/dict/words) [though twice as fast as my previous attempt
using tell each time] is over twice as slow as one with Lines1, which
in turn is 3 times as slow as with a tiny generator:

    def Lines2(flob):
        for line in flob:
            yield line

The deuced "for line in flob:" is so deucedly optimized that trying to
compete with it, even with something as apparently trivial as Lines1,
is apparently a lost cause;-).

OK, then I guess that an iterator by lines on a textfile can't easily
be optimized for teeability by these "share the file object"
strategies; rather, the best way to tee such a disk file would seem to
be:

    def tee_diskfile(f):
        result = file(f.name, f.mode)
        result.seek(f.tell())
        return f, result

Alex

From aleaxit at yahoo.com Mon Oct 20 03:44:30 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 03:44:36 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
Message-ID: <200310200944.30482.aleaxit@yahoo.com>

On Monday 20 October 2003 02:08 am, Greg Ewing wrote:
> "Phillip J. Eby" :
> > If you look at it this way, then you can consider [x for x in S] to be
> > shorthand syntax for list(x for x in S), as they would both produce the
> > same result.
> > However, IIRC, the current listcomp implementation actually
> > binds 'x' in the current local namespace, whereas the generator version
> > would not.
>
> Are we sure about that?

We are indeed sure (sadly) that list comprehensions leak control
variable names. We can hardly be sure of what iterator comprehensions
would be defined to do, given they don't exist, but surely we can HOPE
that in an ideal world where iterator comprehensions were part of
Python they would not be similarly leaky:-).
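Postscript for later readers: this is how things actually played out --
generator expressions (Python 2.4) got their own scope from the start,
and Python 3 list comprehensions stopped leaking too. A quick check
under Python 3 (not part of the original message):

```python
# In Python 3, the comprehension control variable no longer leaks
# into the enclosing scope.
squares = [x * x for x in range(5)]
try:
    x                      # the loop variable should not exist out here
    leaked = True
except NameError:
    leaked = False
print(squares, "leaked:", leaked)
```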
Alex From aleaxit at yahoo.com Mon Oct 20 04:07:46 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 04:08:00 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Message-ID: <200310201007.46829.aleaxit@yahoo.com> On Monday 20 October 2003 06:44 am, Greg Ewing wrote: ... > So, taking the original accumulator display idea, and > incorporating some of the ideas that have come up along > the way, such as getting rid of the square brackets, > how about > > sum of x*x for x in xvalues > average of g for g in grades > maximum of f(x, y) for x in xrange for y in yrange > top(10) of humour(joke) for joke in comedy Wow. I'm speechless. [later, having recovered speech] IF (big if) we could pull THAT off, it WOULD be well worth making 'of' a keyword (and thus requiring a "from __future__ import"). It's SO beautiful, SO pythonic, the only risk I can see is that we'd have newbie people coding: sum of the_values rather than: sum(the_values) or: sum of x for x in the_values We could (and hopefully will) quibble about the corresponding semantics (particularly for the top(10) example, implicitly requiring some "underlying sequence" to be made available while all other uses require no such black magic). But this is the first proposed new syntax I've seen in a long time -- not just on this thread -- that is SO pretty it makes me want it in the language FOR ITSELF -- to reinforce the "Python is executable pseudocode" idea!!! -- rather than just as a means to the end of having the underlying semantics available. I can but hope others share my fascination with it... in any case, whatever happens to it, *BRAVO*, Greg!!! 
Alex From Paul.Moore at atosorigin.com Mon Oct 20 05:47:38 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 20 05:48:25 2003 Subject: [Python-Dev] Re: Reiterability Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > Basically, by exposing suitable methods an iterator could "make its > abilities know" to functions that may or may not need to wrap it in > order to achieve certain semantics -- so the functions can build > only those wrappers which are truly indispensable for the purpose. > Roughly the usual "protocol" approach -- functions use an object's > ability IF that object exposes methods providing that ability, and > otherwise fake it on their own. I'm glad you pointed this out. This whole thing was starting to sound very like the sort of thing that the adaptation PEP was intended to cover. Can the people who need this get the capability via a suitable adaptation approach? I'm not familiar enough with the technique to be sure. If so, wouldn't that be a more general technique (as well as being already available in 3rd party modules like PyProtocols). Paul. From ncoghlan at iinet.net.au Mon Oct 20 08:58:35 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Oct 20 08:58:42 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <004701c39665$bd6ff440$e841fea9@oemcomputer> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> Message-ID: <3F93DBFB.3010507@iinet.net.au> Raymond Hettinger strung bits together to say: > Remember, list.copysort() isn't about chaining or even "saving a line or > two". It is about using an expression instead of a series of > statements. > That makes it possible to use it wherever expressions are allowed, > including function call arguments and list comprehensions. 
> > Here are some examples taken from the patch comments: > > genhistory(date, events.copysort(key=incidenttime)) > > todo = [t for t in tasks.copysort() if due_today(t)] 'chain' may be a bad name then, since all that function really does is take an arbitrary bound method, execute it and then return the object that the method was bound to. If we used a name like 'method_as_expr' (instead of 'chain'), then the above examples would be: genhistory(date, method_as_expr(list(events).sort, key=incidenttime)) todo = [t for t in method_as_expr(list(tasks).sort) if due_today(t)] Granted, it's not quite as clear (some might say it's positively arcane!), but it also isn't using anything that's not already in the language/standard library. > The forces working against introducing an in-line sort are: > * the time to copy the list (which Alex later showed to be irrelevant), > * having two list methods with a similar purpose, and > * the proposed method names are less than sublime > > If someone could come-up with a name more elegant than "copysort", I > the idea would be much more appetizing. Would something like 'sortedcopy' be an improvement? Although Alex's suggestion of a class method like dict.fromkeys() also sounded good - naming it is still an issue, though. I'm not entirely opposed to the idea (the 'method_as_expr' approach feels like something of a hack, even to me) - but the object method just doesn't seem to fit cleanly into the design of the basic types. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." 
From aleaxit at yahoo.com Mon Oct 20 09:17:07 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 09:17:16 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310201517.07902.aleaxit@yahoo.com> On Monday 20 October 2003 11:47 am, Moore, Paul wrote: > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > > Basically, by exposing suitable methods an iterator could "make its > > abilities know" to functions that may or may not need to wrap it in > > order to achieve certain semantics -- so the functions can build > > only those wrappers which are truly indispensable for the purpose. > > Roughly the usual "protocol" approach -- functions use an object's > > ability IF that object exposes methods providing that ability, and > > otherwise fake it on their own. > > I'm glad you pointed this out. This whole thing was starting to sound > very like the sort of thing that the adaptation PEP was intended to > cover. Darn -- one more underground attempt to foist adaptation into Python foiled by premature discovery... must learn to phrase things less overtly, the people around here are too clever!!! > Can the people who need this get the capability via a suitable > adaptation approach? I'm not familiar enough with the technique to > be sure. If so, wouldn't that be a more general technique (as well > as being already available in 3rd party modules like PyProtocols). Yes, it would be more general and perfectly adequate for this task too, but would still require SOME level of cooperation from built-in types, such as the iterators returned by built-in iter. Adaptation is no black magic, just a systematic, clean, general way to use some capabilities if a type offers them and perhaps kludge them up with a wrapper if a type doesn't offer them but such a wrapper is possible. 
If an iterator built by iter(sequence) just won't let me know about what sequence it's iterating on and what its current index on it is, in SOME way or other, there's no way I can prise that information "by force" out of it -- I must treat it just like any other iterator that only exposes a .next() method and nothing more (because that's what it DOES expose). Alex From ncoghlan at iinet.net.au Mon Oct 20 09:22:21 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Oct 20 09:22:27 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201007.46829.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201007.46829.aleaxit@yahoo.com> Message-ID: <3F93E18D.5010708@iinet.net.au> Alex Martelli strung bits together to say: > On Monday 20 October 2003 06:44 am, Greg Ewing wrote: > ... > >>So, taking the original accumulator display idea, and >>incorporating some of the ideas that have come up along >>the way, such as getting rid of the square brackets, >>how about >> >> sum of x*x for x in xvalues >> average of g for g in grades >> maximum of f(x, y) for x in xrange for y in yrange >> top(10) of humour(joke) for joke in comedy > > > Wow. > > I'm speechless. > > [later, having recovered speech] IF (big if) we could pull THAT off, it > WOULD be well worth making 'of' a keyword (and thus requiring a > "from __future__ import"). It's SO beautiful, SO pythonic, the only > risk I can see is that we'd have newbie people coding: > sum of the_values > rather than: > sum(the_values) > or: > sum of x for x in the_values Except, if it was defined such that you wrote: sum of [x*x for x in the_values] then: sum of the_values would actually be a valid expression, and Greg's examples would become: sum of xvalues average of grades maximum of [f(x, y) for x in xrange for y in yrange] top(10) of [humour(joke) for joke in comedy] Either way, that's some seriously pretty executable psuedocode he has happening! 
And a magic method "__of__" that takes a list as an argument might be enough to do the trick, too. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From aleaxit at yahoo.com Mon Oct 20 10:01:08 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 10:01:15 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93E18D.5010708@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201007.46829.aleaxit@yahoo.com> <3F93E18D.5010708@iinet.net.au> Message-ID: <200310201601.08440.aleaxit@yahoo.com> On Monday 20 October 2003 03:22 pm, Nick Coghlan wrote: ... > >> sum of x*x for x in xvalues > >> average of g for g in grades > >> maximum of f(x, y) for x in xrange for y in yrange > >> top(10) of humour(joke) for joke in comedy ... > > "from __future__ import"). It's SO beautiful, SO pythonic, the only > > risk I can see is that we'd have newbie people coding: > > sum of the_values > > rather than: > > sum(the_values) > > or: > > sum of x for x in the_values > > Except, if it was defined such that you wrote: > sum of [x*x for x in the_values] > > then: > sum of the_values > > would actually be a valid expression, and Greg's examples would become: Yes, you COULD extend the syntax from Greg's NAME 'of' listmaker to _also_ accept NAME 'of' test or thereabouts (in the terms of dist/src/Grammar/Grammar of course), I don't think it would have any ambiguity. As to whether it's worth it, I dunno. > sum of xvalues Nope, he's summing the _squares_ -- sum of x*x for x in xvalues it says. > average of grades Yes, this one would then work. > maximum of [f(x, y) for x in xrange for y in yrange] Yes, you could put brackets there, but why? > top(10) of [humour(joke) for joke in comedy] Ditto -- and it doesn't do the job unless the magic becomes even blacker. 
top(N) is supposed to return jokes, not their humor values; so it needs to get an iterable or iterator of (humor(joke), joke) PAIRS -- I think it would DEFINITELY be better to have this spelled out, and in fact I'd prefer: top(10, key=humour) of comedy or top(10, key=humour) of joke for joke in comedy using the same neat syntax "key=" just sprouted by lists' sort method. > Either way, that's some seriously pretty executable psuedocode he has > happening! And a magic method "__of__" that takes a list as an argument > might be enough to do the trick, too. Agreed on the prettiness. I would prefer to have the special method be defined to receive "an iterator or iterable" -- so we can maybe put together a prototype where we just make and pass it a list, BUT keep the door open to passing it an "iterator comprehension" in the future. Or maybe make it always an iterator (in the prototype we can just build the list and call iter on it anyway... so it's not any harder to get started playing with it). Oh BTW, joining another still-current thread -- for x in sorted_copy of mylist: ... now doesn't THAT read just wonderfully, too...?-) Alex From tim.one at comcast.net Mon Oct 20 10:15:40 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 20 10:15:46 2003 Subject: [Python-Dev] New warnings in _sre.c Message-ID: MSVC complains when a signed int is compared to an unsigned int. I'm glad it does, because the compiler silently casts the signed int to unsigned, which doesn't do what the author probably intended if the signed int is less than 0: #include void main() { int i = -1; unsigned int j = 0; printf("%d\n", i < j); } That prints 0, i.e. it is not the case that -1 < 0U. 
_sre.c(852) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1021) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1035) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1109) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1131) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1192) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1230) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1267) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1285) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1287) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1294) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1314) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1344) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1362) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1384) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1476) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1492) : warning C4018: '<' : signed/unsigned mismatch From guido at python.org Mon Oct 20 10:30:37 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 10:30:50 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 09:44:30 +0200." <200310200944.30482.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> Message-ID: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> > We are indeed sure (sadly) that list comprehensions leak control variable > names. But they shouldn't. It can be fixed by renaming them (e.g. numeric names with a leading dot). > We can hardly be sure of what iterator comprehensions would be > defined to do, given they don't exist, but surely we can HOPE that > in an ideal world where iterator comprehensions were part of Python > they would not be similarly leaky:-). 
It's highly likely that the implementation will have to create a
generator function under the hood, so they will be safely contained in
that frame.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From andrew-pythondev at puzzling.org Mon Oct 20 10:30:56 2003
From: andrew-pythondev at puzzling.org (Andrew Bennetts)
Date: Mon Oct 20 10:31:04 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com>
	<200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
Message-ID: <20031020143056.GE28665@frobozz>

Greg Ewing wrote:
> how about
>
> sum of x*x for x in xvalues
> average of g for g in grades
> maximum of f(x, y) for x in xrange for y in yrange
> top(10) of humour(joke) for joke in comedy

I've thought about this, and I don't think I like it. "of" just seems like
a new and confusingly different way to spell a function call. E.g., if I
read this

    max([f(x,y) for x in xrange for y in yrange])

out loud, I'd say: "the maximum of f of x and y for x in xrange, and y in
yrange". So perhaps that third example should be spelt:

    maximum of f of x, y for x in xrange for y in yrange

This particularly struck me when I read Alex's comment:

> for x in sorted_copy of mylist:
> ...
>
> now doesn't THAT read just wonderfully, too...?-)

Actually, that strikes me as an odd way of spelling:

    for x in sorted_copy(mylist):
        ...

I think the lazy iteration syntax approach was probably a better idea. I
don't like the proposed use of "yield" to signify it, though -- "yield" is
a flow control statement, so the examples using it in this thread look odd
to me. Perhaps it would be best to simply use the keyword "lazy" -- after
all, that's the key distinguishing feature. I think my preferred syntax
would be:

    sum([lazy x*x for x in sequence])

But use of parens instead of brackets, and/or a colon to make the keyword
stand out (and look reminiscent of a lambda!
which *is* a related concept, in a way -- it also defers evaluation), e.g.:

    sum((lazy: x*x for x in sequence))

Would be fine with me as well.

-Andrew.

From FBatista at uniFON.com.ar Mon Oct 20 10:34:08 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 20 10:35:28 2003
Subject: [Python-Dev] Re: prePEP: Money data type
Message-ID:

#- From the prePEP it's not clear (for me) the purpose of
#- currencySymbol.
#- If it's intended for localisation, then prefix isn't enough,
#- some countries use suffix or even such format

The idea is to keep currencySymbol, thousandSeparator and decimalSeparator
separate, in such a way that if you want to change one of those, you just
subclass Money and change it.

In a money amount shown as $1,234.56, '$' is the currencySymbol, ',' is the
thousandSeparator, and '.' is the decimalSeparator. These three elements
are useful when working with strings: not only when showing the amount with
str(), they're also important when parsing at creation time:

    #standard creation
    m = Money('12.35')

    #subclassing
    class MyMoney(Money):
        decimalSeparator = ','

    #wrong!
    m = MyMoney('12.35')

    #right...
    m = MyMoney('12,35')

#- Money(123.45, 2) --> 123 FF 45 GG
#-
#- where FF is suffix1 and GG is suffix2.

This maybe could be addressed by having a currencyPrefix and a
currencySuffix (the latter's default would be '') instead of just one
currencySymbol.

From ncoghlan at iinet.net.au Mon Oct 20 10:37:48 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 20 10:37:58 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310201601.08440.aleaxit@yahoo.com>
References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
	<200310201007.46829.aleaxit@yahoo.com> <3F93E18D.5010708@iinet.net.au>
	<200310201601.08440.aleaxit@yahoo.com>
Message-ID: <3F93F33C.9070702@iinet.net.au>

Alex Martelli strung bits together to say:
> On Monday 20 October 2003 03:22 pm, Nick Coghlan wrote:
> Yes, you COULD extend the syntax from Greg's
>
> NAME 'of' listmaker
>
> to _also_ accept
>
> NAME 'of' test
>
> or thereabouts (in the terms of dist/src/Grammar/Grammar of course), I don't
> think it would have any ambiguity. As to whether it's worth it, I dunno.

Actually, I was suggesting that if 'of' is simply designated as taking a
list* on the right hand side, then you can just write a list comprehension
there, without needing the parser to understand the 'for' syntax in that
case. But I don't know enough about the parser to really know if that
would be a saving worth making.
(* a list is what I was thinking, but as you point out, an iterable would
be better)

>> sum of xvalues
>
> Nope, he's summing the _squares_ --
> sum of x*x for x in xvalues
> it says.

D'oh - and I got that one right higher up, too. Ah, well.

>> maximum of [f(x, y) for x in xrange for y in yrange]
>
> Yes, you could put brackets there, but why?

I thought it would be easier on the parser (only accepting a list/iterable
on the right hand side). I don't know if that's actually true, though.

>> top(10) of [humour(joke) for joke in comedy]
>
> Ditto -- and it doesn't do the job unless the magic becomes even blacker.
> top(N) is supposed to return jokes, not their humor values; so it needs to
> get an iterable or iterator of (humor(joke), joke) PAIRS -- I think it would
> DEFINITELY be better to have this spelled out, and in fact I'd prefer:
>
> top(10, key=humour) of comedy
>
> or
>
> top(10, key=humour) of joke for joke in comedy
>
> using the same neat syntax "key=" just sprouted by lists' sort
> method.

Yes, that would make it a lot clearer what was going on.

> Agreed on the prettiness. I would prefer to have the special method be
> defined to receive "an iterator or iterable" -- so we can maybe put together
> a prototype where we just make and pass it a list, BUT keep the door open to
> passing it an "iterator comprehension" in the future. Or maybe make it always
> an iterator (in the prototype we can just build the list and call iter on it
> anyway... so it's not any harder to get started playing with it).

Well, I think we've established that at least two people on the planet love
this idea. . . and agreed on the iterator/iterable vs lists, too. I only
thought of that distinction after I'd already hit send :)

> Oh BTW, joining another still-current thread --
>
> for x in sorted_copy of mylist:
> ...
>
> now doesn't THAT read just wonderfully, too...?-)

Not to mention:

    for x in sorted_copy of reversed_copy of my_list:
        ...

    for x in sorted_copy(key=len) of my_list:
        ...
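For what it's worth, Python 2.4 did grow builtins that spell these chains directly -- sorted() and reversed() (PEP 322), both of which accept any iterable -- so Nick's examples can be approximated without 'of'. A small sketch (my_list is just illustrative data):

```python
# The hypothetical "of" chains map onto builtins Python actually grew
# shortly after this thread: reversed() (PEP 322) and sorted(), both
# new in 2.4 and both accepting any iterable.
my_list = ["pear", "fig", "banana", "kiwi"]

# "sorted_copy of reversed_copy of my_list" -- though, as Alex notes
# below, sorting makes the inner reversal irrelevant:
assert sorted(reversed(my_list)) == sorted(my_list)

# "sorted_copy(key=len) of my_list" (stable sort, so ties keep order):
assert sorted(my_list, key=len) == ["fig", "pear", "kiwi", "banana"]

# "sorted_copy(reverse=True) of my_list":
assert sorted(my_list, reverse=True) == ["pear", "kiwi", "fig", "banana"]
```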
Indeed, _that_ is a solution that looks truly Pythonic!

Hmm, just had a strange thought:

    y = copy of x

How would that be for executable pseudocode? It's entirely possible to do
all the iterator related things without having this last example work. But
what if it did?

Cheers, Nick.
__of__: just a single-argument function call?

--
Nick Coghlan | Brisbane, Australia
ICQ#: 68854767 | ncoghlan@email.com
Mobile: 0409 573 268 | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From FBatista at uniFON.com.ar Mon Oct 20 10:41:23 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 20 10:42:15 2003
Subject: [Python-Dev] prePEP: Money data type
Message-ID:

#- FWIW, Rogue Wave's Money class lets you specify _either_ rounding
#- approach -- ROUND_PLAIN specifies EU-rules-compliant rounding,
#- ROUND_BANKERS specifies round-to-even, for exactly in-between
#- amounts. Offhand, it would seem impossible to write an accounting
#- program that respects the law in Europe AND the praxis you mention
#- at the same time, unless you somehow tell it what rule to use.
#-
#- Sad, and seems weird to go to such trouble for a cent, but accountants
#- live and die by such minutiae: I think it would not be wise to ignore
#- them, PARTICULARLY if we name the type so as to make it appear to the
#- uninitiated that it "will do the right thing" regarding rounding...
#- when there isn't ONE right thing, it depends on locale &c:-(.

Seems to me that the best would be to have two functions (I like the names
roundPlain and roundBankers), with the behaviour to be specified by the
user. But here I found two approaches:

- By argument: Redefine the syntax as Money(value, [precision], [round]),
  having a specified default for round.

- By subclassing: Just make:

    class MyMoney(Money):
        moneyround = roundPlain

The first is better in that you use Money directly, but you *always* need
to specify the rounding.
In the second way you have to subclass it once, but then all the work is
done (anyway, maybe you were already subclassing Money to change its
decimalSeparator or something).

Personally, I go for the second choice.

.	Facundo

From mwh at python.net Mon Oct 20 11:02:29 2003
From: mwh at python.net (Michael Hudson)
Date: Mon Oct 20 11:02:32 2003
Subject: [Python-Dev] Re: itertools, was RE: list.sort
In-Reply-To: <200310180143.36999.aleaxit@yahoo.com> (Alex Martelli's message
	of "Sat, 18 Oct 2003 01:43:36 +0200")
References: <003201c39500$9006a8c0$e841fea9@oemcomputer>
	<200310180143.36999.aleaxit@yahoo.com>
Message-ID: <2my8vgosu2.fsf@starship.python.net>

Alex Martelli writes:
> On Saturday 18 October 2003 12:46 am, Raymond Hettinger wrote:
> ...
>> My misgivings about drop() and take() are, firstly, that they
>> are expressible in terms of islice() so they don't really add
>> any new capability. Secondly, the number of tools needs to be
>
> True. I gotta remember that -- I find it unintuitive, maybe it's
> islice's odious range-like ordering of arguments.

Yes, that rubs me the wrong way too. That and I always read it is-lice
(and imap always makes me think of mail...).

Cheers,
mwh

--
Need to Know is usually an interesting UK digest of things that happened
last week or might happen next week. [...] This week, nothing happened,
and we don't care. -- NTK Now, 2000-12-29, http://www.ntk.net/

From aleaxit at yahoo.com Mon Oct 20 11:16:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 11:16:45 2003
Subject: [Python-Dev] Re: itertools, was RE: list.sort
In-Reply-To: <2my8vgosu2.fsf@starship.python.net>
References: <003201c39500$9006a8c0$e841fea9@oemcomputer>
	<200310180143.36999.aleaxit@yahoo.com> <2my8vgosu2.fsf@starship.python.net>
Message-ID: <200310201716.36611.aleaxit@yahoo.com>

On Monday 20 October 2003 05:02 pm, Michael Hudson wrote:
...
> > islice's odious range-like ordering of arguments.
>
> Yes, that rubs me the wrong way too.
That and I always read it > is-lice (and imap always makes me think of mail...). is-lice might be useful in a debugger, though. Alex From mwh at python.net Mon Oct 20 11:23:09 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 20 11:23:12 2003 Subject: [Python-Dev] Re: itertools, was RE: list.sort In-Reply-To: <200310201716.36611.aleaxit@yahoo.com> (Alex Martelli's message of "Mon, 20 Oct 2003 17:16:36 +0200") References: <003201c39500$9006a8c0$e841fea9@oemcomputer> <200310180143.36999.aleaxit@yahoo.com> <2my8vgosu2.fsf@starship.python.net> <200310201716.36611.aleaxit@yahoo.com> Message-ID: <2mptgsorvm.fsf@starship.python.net> Alex Martelli writes: > On Monday 20 October 2003 05:02 pm, Michael Hudson wrote: > ... >> > islice's odious range-like ordering of arguments. >> >> Yes, that rubs me the wrong way too. That and I always read it >> is-lice (and imap always makes me think of mail...). > > is-lice might be useful in a debugger, though. *groan* -- ZAPHOD: OK, so ten out of ten for style, but minus several million for good thinking, eh? -- The Hitch-Hikers Guide to the Galaxy, Episode 2 From aleaxit at yahoo.com Mon Oct 20 11:26:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 11:27:06 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310201726.57890.aleaxit@yahoo.com> On Monday 20 October 2003 04:41 pm, Batista, Facundo wrote: > #- FWIW, Rogue Wave's Money class lets you specify _either_ rounding > #- approach -- ROUND_PLAIN specifies EU-rules-compliant rounding, > #- ROUND_BANKERS specifies round-to-even, for exactly in-between ... > #- isn't ONE right thing, it depends on locale &c:-(. > > Seems to me that the best would be to have two functions (liked the names > roundPlain and roundBankers), and the behaviour to be specified by the Sure, rounding IS best set by function, though you may want more than two (roundForbid to raise exceptions when rounding tries to happen, roundTruncate, etc). 
> user. But here I found two approaches:
>
> - By argument: Redefine the syntax as Money(value, [precision],
> [round]), having a specified default for round.
>
> - By subclassing: Just make:
> class MyMoney(Money):
> moneyround = roundPlain
>
> The first is better in that you use Money directly, but you need to
> specify *always* the rounding.

They're not at all incompatible!

    class Money:
        round = staticmethod(roundWhateverDefault)
        precision = someDefaultPrecision
        def __init__(self, value, precision=None, round=None):
            self.value = value
            if precision is not None:
                self.precision = precision
            if round is not None:
                self.round = round

then use self.precision and self.round in all further methods -- they'll
correctly go to either the INSTANCE attribute, if specifically set, or the
CLASS attribute, if no instance attribute is set. A useful part of how
Python works, btw. So you can subclass Money and change the default
rounding without any problem whatsoever.

> once, but then all the work is done (anyway, maybe you were already
> subclassing Money to change its decimalSeparator or something).

I do NOT think any advanced formatting should be part of the
responsibilities of class Money itself. I would focus on correct and
complete arithmetic with good handling of exact precision and rounding
rules: I contend THAT is the really necessary part. One can always
subclass Money to ADD data and methods (e.g. with appropriately designed
mix-ins), but remember subclassing cannot REMOVE capabilities: so, avoid
the "fat base class" syndrome, a well-recognized anti-pattern, and make
sure what you put in a base class is what's needed for ALL uses of it.
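Alex's sketch runs as-is once the placeholder names are filled in. Here is a runnable version -- roundPlain and roundBankers are simplistic stand-ins (not the prePEP's actual rounding rules); the point is only the instance-then-class attribute lookup he describes:

```python
def roundPlain(x):
    # stand-in for "round half up" (EU-style) rounding
    return int(x + 0.5)

def roundBankers(x):
    # stand-in: Python 3's round() already does round-half-to-even
    return round(x)

class Money:
    round = staticmethod(roundPlain)   # class-level default
    precision = 2                      # class-level default

    def __init__(self, value, precision=None, round=None):
        self.value = value
        if precision is not None:
            self.precision = precision  # instance attribute shadows class
        if round is not None:
            self.round = round

default = Money(1.5)
custom = Money(2.5, precision=4, round=roundBankers)

assert default.round(2.5) == 3   # falls back to the class default (half up)
assert custom.round(2.5) == 2    # instance override (half to even)
assert (default.precision, custom.precision) == (2, 4)
```

Note that the instance-level override stores a plain function, so it is called without an implicit self -- which is exactly why the class-level default needs the staticmethod() wrapper.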
Alex From aleaxit at yahoo.com Mon Oct 20 11:41:19 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 11:41:26 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93F33C.9070702@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201601.08440.aleaxit@yahoo.com> <3F93F33C.9070702@iinet.net.au> Message-ID: <200310201741.19295.aleaxit@yahoo.com> On Monday 20 October 2003 04:37 pm, Nick Coghlan wrote: ... > Well, I think we've established that at least two people on the planet love Right, hopefully 3 with Greg (though it's not unheard of for posters to this list to change their minds about their own proposals. So I told myself I should stay out of the thread to let others voice their opinion, BUT...: > for x in sorted_copy of reversed_copy of my_list: Ooops -- sorting a reversed copy of my_list is just like sorting my_list... I think for x in sorted_copy(reverse=True) of my_list: ... (again borrowing brand-new keyword syntax from lists' sort method) is likely to work better...:-) > Hmm, just had a strange thought: > > y = copy of x > > How would that be for executable pseudocode? It's entirely possible to do Awesomely pseudocoder (what a comparative...!-) wrt the current "y = copy.copy(x)". You WOULD need to "from copy import copy" first, presumably, but still... > all the iterator related things without having this last example work. But > what if it did? Then the special method would have to be passed the right-hand operand verbatim, NOT an iterator on it, for the "NAME 'of' test" case; otherwise, this would be a terrible "attractive nuisance" in such cases as x = copy of my_dict (if the hypothetical special method was passed iter(my_dict), it would only get the KEYS -- shudder -- so x would presumably end up as a list -- a trap for the unwary, and one I wouldn't want to have to explain to newbies!-). 
However, if I had to choose, I would forego this VERY attractive syntax
sugar, and go for Greg's original suggestion -- 'of' for iterator
comprehensions only. Syntax sugar is all very well (at least in this case),
but if it _only_ amounts to a much neater-looking way of doing what is
already quite possible, it's a case of "more-than-one-way-to-do-itis".

[Just to make sure I argue both sides: introducing "if key in mydict:" as a
better way to express "if mydict.has_key(key):" was a HUGE win, and so was
letting "if needle in haystack:" be used as a better way to express
"haystack.find(needle) >= 0" for substring checks -- so, 'mere' syntax
sugar DOES sometimes make an important difference...]

Alex

From aleaxit at yahoo.com Mon Oct 20 11:45:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 11:46:01 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
	<200310200944.30482.aleaxit@yahoo.com>
	<200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
Message-ID: <200310201745.36226.aleaxit@yahoo.com>

On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote:
> > We are indeed sure (sadly) that list comprehensions leak control variable
> > names.
>
> But they shouldn't. It can be fixed by renaming them (e.g. numeric
> names with a leading dot).

Hmmm, sorry?

    >>> [.2 for .2 in range(3)]
    SyntaxError: can't assign to literal

I think I don't understand what you mean.

> > We can hardly be sure of what iterator comprehensions would be
> > defined to do, given they don't exist, but surely we can HOPE that
> > in an ideal world where iterator comprehensions were part of Python
> > they would not be similarly leaky:-).
>
> It's highly likely that the implementation will have to create a
> generator function under the hood, so they will be safely contained in
> that frame.
And there will be much rejoicing...!-) Alex From Paul.Moore at atosorigin.com Mon Oct 20 11:51:04 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 20 11:51:50 2003 Subject: [Python-Dev] Re: accumulator display syntax Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] >> Hmm, just had a strange thought: >> >> y = copy of x >> >> How would that be for executable pseudocode? It's entirely possible to do > Awesomely pseudocoder (what a comparative...!-) wrt the current "y = > copy.copy(x)". You WOULD need to "from copy import copy" first, presumably, > but still... Did I miss April 1st? We seem to be discussing the merits of f of arg as an alternative form of f(arg) While I'm sure Cobol had some good points, I don't believe that this was one of them... If there is any merit to this proposal, it's very rapidly being lost in examples of rewriting things which are simple function calls. Paul. From pje at telecommunity.com Mon Oct 20 12:11:30 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 12:12:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201745.36226.aleaxit@yahoo.com> References: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> Message-ID: <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> At 05:45 PM 10/20/03 +0200, Alex Martelli wrote: >On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: > > > We are indeed sure (sadly) that list comprehensions leak control variable > > > names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > >Hmmm, sorry? > > >>> [.2 for .2 in range(3)] >SyntaxError: can't assign to literal > >I think I don't understand what you mean. 
He was talking about having the bytecode compiler generate "hidden" names for the variables... ones that can't be used from Python. There's one drawback there, however... If you're stepping through the listcomp generation with a debugger, you won't be able to print the current item in the list, as (I believe) is possible now. From aleaxit at yahoo.com Mon Oct 20 12:23:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 12:23:51 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310201823.45392.aleaxit@yahoo.com> On Monday 20 October 2003 05:51 pm, Moore, Paul wrote: ... > Did I miss April 1st? We seem to be discussing the merits of > > f of arg > > as an alternative form of > > f(arg) > > While I'm sure Cobol had some good points, I don't believe that this was > one of them... I may disagree, but it's sure too late to redesign Python today in that respect;-). > If there is any merit to this proposal, it's very rapidly being lost in > examples of rewriting things which are simple function calls. Agreed, and I pointed that out in my latest msg to this thread -- just like e.g. rewriting the simple function call mydict.has_key(k) as the cool, readable "k in mydict", quite identically rewriting the simple function call sum(numbers) as the cool, readable "sum of numbers" would be mere syntax sugar, "more than one way to do it", etc. So, limiting the discussion to Greg's original idea of using 'of' for iterator comprehensions will be wiser and more prudent (just like one would never dare suggesting 'in' as an alternative to calling has_key, say:-). 
That 'of' thingy is just SO pretty it's making some of us lose their heads, that's all...!-) Alex From guido at python.org Mon Oct 20 12:37:17 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 12:37:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 17:45:36 +0200." <200310201745.36226.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> Message-ID: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> > On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: > > > We are indeed sure (sadly) that list comprehensions leak control variable > > > names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > Hmmm, sorry? > > >>> [.2 for .2 in range(3)] > SyntaxError: can't assign to literal > > I think I don't understand what you mean. I meant that the compiler should rename it. Just like when you use a tuple argument: def f(a, (b, c), d): ... this actually defines a function of three (!) arguments whose second argument is named '.2'. And the body starts with something equivalent to b, c = .2 For list comps, the compiler could maintain a mapping for the listcomp control variables so that if you write [x for x in range(3)] it knows to generate bytecode as if x was called '.7'; at the bytecode level there's no requirement for names to follow the identifier syntax. 
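Python 3 eventually settled on exactly this kind of containment -- a list comprehension compiles to a hidden function scope of its own -- so the control variable no longer leaks. A quick check:

```python
# In Python 3 a list comprehension runs in its own (hidden) function
# scope, so its control variable does not leak into the enclosing
# namespace -- the containment Guido describes above.
x = "outer"
squares = [x * x for x in range(3)]

assert squares == [0, 1, 4]
assert x == "outer"   # untouched by the comprehension

# Under Python 2.3, by contrast, x would now be 2.
```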
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com Mon Oct 20 12:38:51 2003
From: skip at pobox.com (Skip Montanaro)
Date: Mon Oct 20 12:38:58 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
	<200310200944.30482.aleaxit@yahoo.com>
	<200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
Message-ID: <16276.3995.177704.754136@montanaro.dyndns.org>

>> We can hardly be sure of what iterator comprehensions would be
>> defined to do, given they don't exist, but surely we can HOPE that in
>> an ideal world where iterator comprehensions were part of Python they
>> would not be similarly leaky:-).

Guido> It's highly likely that the implementation will have to create a
Guido> generator function under the hood, so they will be safely
Guido> contained in that frame.

Which suggests they aren't likely to be a major performance win over list
comprehensions. If nothing else, they would push the crossover point
between list comprehensions and iterator comprehensions toward much longer
lists.

Is performance the main reason this addition is being considered? They
don't seem any more expressive than list comprehensions to me.

Skip

From guido at python.org Mon Oct 20 12:40:04 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 20 12:41:13 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Mon, 20 Oct 2003 16:51:04 BST."
	<16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
Message-ID: <200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>

> Did I miss April 1st? We seem to be discussing the merits of
>
> f of arg
>
> as an alternative form of
>
> f(arg)
>
> While I'm sure Cobol had some good points, I don't believe that this was one
> of them...
> > If there is any merit to this proposal, it's very rapidly being lost in > examples of rewriting things which are simple function calls. Amen. *If* we were to introduce 'of' as an operator, at least it should introduce some as-yet-unsupported parameter passing semantics, like call-by-name. :-) And in fact, I think that sum(x for x in range(10)) reads *better* than sum of x for x in range(10) and certainly better than sum of x for x in range of 10 because when you squint, it just becomes a series of undistinguished words, like xxx xx x xxx x xx xxxxx xx xx --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 12:43:15 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 12:43:21 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 12:11:30 EDT." <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> References: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> Message-ID: <200310201643.h9KGhFM21321@12-236-54-216.client.attbi.com> > There's one drawback there, however... If you're stepping through the > listcomp generation with a debugger, you won't be able to print the current > item in the list, as (I believe) is possible now. Good point. But this could be addressed in many ways; the debugger could grow a way to quote nonstandard variable names, or it could know about the name mapping, or we could use a different name-mangling scheme (e.g. prefix the original name with an underscore, and optionally append _1 or _2 etc. as needed to distinguish it from a real local with the same name). Or we could simply state this as a deficiency (I'm not sure I've ever needed to debug that situation). 
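The bracket-free spelling Guido prefers here is essentially what PEP 289 later standardized as generator expressions in Python 2.4:

```python
# PEP 289 (Python 2.4) adopted this spelling: the parentheses of a call
# double as the generator expression's own, so no brackets are needed
# when it is the sole argument.
total = sum(x * x for x in range(10))
assert total == 285

# Elsewhere -- e.g. on the right of an assignment -- the expression
# needs its own parentheses:
gen = (x * x for x in range(10))
assert list(gen) == [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```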
--Guido van Rossum (home page: http://www.python.org/~guido/)

From eppstein at ics.uci.edu Mon Oct 20 13:03:50 2003
From: eppstein at ics.uci.edu (David Eppstein)
Date: Mon Oct 20 13:03:54 2003
Subject: [Python-Dev] Re: accumulator display syntax
References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
	<200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>
Message-ID:

In article <200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>,
Guido van Rossum wrote:

> And in fact, I think that
>
> sum(x for x in range(10))
>
> reads *better* than
>
> sum of x for x in range(10)
>
> and certainly better than
>
> sum of x for x in range of 10

I also think

    sum(x for x in range(10))

reads much better than

    sum(yield x for x in range(10))
    sum(yield: x for x in range(10))

or even

    sum([x for x in range(10)])

(The yield-based syntaxes also have the problem of confusing the reader
into thinking the function containing them might be a generator.) It is
enough better that the "tuple comprehension" issue is a non-problem for me.

I'm assuming this syntax would need surrounding parens inside lists,
tuples, and dicts (to avoid confusion with list/dict comprehensions and for
the same reason [x,x for x in S] is currently invalid syntax) but avoiding
the extra parens in other contexts like function calls looks like a win.

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

From guido at python.org Mon Oct 20 13:08:45 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 20 13:08:53 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Mon, 20 Oct 2003 15:17:07 +0200."
<200310201517.07902.aleaxit@yahoo.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> Message-ID: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> > Darn -- one more underground attempt to foist adaptation into Python > foiled by premature discovery... must learn to phrase things less > overtly, the people around here are too clever!!! :-) I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly because I expect that it will have such a big impact on coding practices. I want to have a better feel for what that impact is and whether it is altogether healthy. IOW I'm a bit worried that adaptation might become too attractive of a hammer for all sorts of problems, whether or not there are better-suited solutions. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 13:21:10 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:21:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 11:38:51 CDT." <16276.3995.177704.754136@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <16276.3995.177704.754136@montanaro.dyndns.org> Message-ID: <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> > Guido> It's highly likely that the implementation will have to create a > Guido> generator function under the hood, so they will be safely > Guido> contained in that frame. [Skip] > Which suggests they aren't likely to be a major performance win over > list comprehensions. If nothing else, they would push the crossover > point between list comprehensions and iterator comprehensions toward > much longer lists. > > Is performance is the main reason this addition is being considered? 
> They don't seem any more expressive than list comprehensions to me. They are more expressive in one respect: you can't use a list comprehension to express an infinite sequence (that's truncated by the consumer). They are more efficient in a related situation: a list comprehension buffers all its items before the next processing step begins; an iterator comprehension doesn't need to do any buffering. So iterator comprehensions win if you're pipelining operations just like Unix pipes are a huge win over temporary files in some situations. This is particularly important when the consumer is some accumulator like 'average' or 'sum'. Whether there is an actual gain in speed depends on how large the list is. You should be able to time examples like sum([x*x for x in R]) vs. def gen(R): for x in R: yield x*x sum(gen(R)) for various lengths of R. (The latter would be a good indication of how fast an iterator generator could run.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 20 13:31:35 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 20 13:31:39 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <20031020173134.GA29040@panix.com> On Mon, Oct 20, 2003, Guido van Rossum wrote: >Alex Martelli: >> >> We are indeed sure (sadly) that list comprehensions leak control variable >> names. > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > names with a leading dot). ?!?! When listcomps were introduced, you were strongly against any changes that would make it difficult to switch back and forth between a listcomp and its corresponding equivalent for loop. Are you changing your position or are you suggesting that for loops should grow private names? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan From guido at python.org Mon Oct 20 13:43:25 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:43:36 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Mon, 20 Oct 2003 09:40:43 +0200." <200310200940.43021.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> Message-ID: <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> > BTW, playing around with some of this it seems to me that the > inability to just copy.copy (or copy.deepcopy) anything produced by > iter(sequence) is more of a bother -- quite apart from clonability > (a similar but separate concept), couldn't those iterators be > copy'able anyway? I.e. just expose underlying sequence and index as > their state for getting and setting? I'm not sure why you say it's separate from cloning; it seems to me that copy.copy(iter(range(10))) should return *exactly* what we'd want the proposed clone operation to return. > Otherwise to get copyable > iterators I have to reimplement iter "by hand": > > class Iter(object): > def __init__(self, seq): > self.seq = seq > self.idx = 0 > def __iter__(self): return self > def next(self): > try: result = self.seq[self.idx] > except IndexError: raise StopIteration > self.idx += 1 > return result > > and I don't understand the added value of requiring the user to > code this no-added-value, slow-things-down boilerplate. I see this as a plea to add __copy__ and __deepcopy__ methods to all standard iterators for which it makes sense. (Or maybe only __copy__ -- I'm not sure what value __deepcopy__ would add.) I find this a reasonable request for the iterators belonging to standard containers (list, tuple, dict). I guess that some of the iterators in itertools might also support this easily.
Perhaps this would be the road to supporting iterator cloning? > > > An iterator that knows it's coming from disk or pipe can provide > > > that disk copy (or reuse the existing file) as part of its > > > "optimized tee-ability". > > > > At considerable cost. > > I'm not sure I see that cost, yet. Mostly complexity of the code to implement it, and things like making sure that the disk file is deleted (not an easy problem cross-platform!). > > lines of a file previously available. Also, on many systems, > > every call to fseek() drops the stdio buffer, even if the seek > > position is not actually changed by the call. It could be done, > > but would require incredibly hairy code. > > The call to fseek probably SHOULD drop the buffer in a typical > C implementation _on a R/W file_, because it's used as the way > to signal the file that you're moving from reading to writing or VV > (that's what the C standard says: you need a seek between an > input op and an immediately successive output op or viceversa, > even a seek to the current point, else, undefined behavior -- which > reminds me, I don't know if the _Python_ wrapper maintains that > "clever" requirement for ITS R/W files, but I think it does). Yes it does: file_seek() calls drop_readahead(). > I can well believe that for simplicity a C-library implementor would > then drop the buffer on a R/O file too, needlessly but > understandably. For any stdio implementation supporting fileno(), fseek() is also used to synch up the seek positions maintained by stdio and by the underlying OS or file descriptor implementation. > The deuced "for line in flob:" is so deucedly optimized that trying > to compete with it, even with something as apparently trivial as > Lines1, is apparently a lost cause;-). 
OK, then I guess that an > iterator by lines on a textfile can't easily be optimized for teeability > by these "share the file object" strategies; rather, the best way to > tee such a disk file would seem to be: > def tee_diskfile(f): > result = file(f.name, f.mode) > result.seek(f.tell()) > return f, result Right, except you might want to change the mode to a read-only mode (without losing the 'b' or 'U' property). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 13:48:00 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:48:08 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 13:31:35 EDT." <20031020173134.GA29040@panix.com> References: <20031020173134.GA29040@panix.com> Message-ID: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> > >> We are indeed sure (sadly) that list comprehensions leak control > >> variable names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > ?!?! When listcomps were introduced, you were strongly against any > changes that would make it difficult to switch back and forth between a > listcomp and its corresponding equivalent for loop. I don't recall what I said then. Did I say it was a feature that L = [x for x in R] print x would print the last item of R? > Are you changing your position or are you suggesting that for loops > should grow private names? No, only list comps. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 20 13:52:30 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 20 13:52:35 2003 Subject: [Python-Dev] listcomps vs. 
for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <20031020175230.GA7307@panix.com> On Mon, Oct 20, 2003, Guido van Rossum wrote: >Aahz: >> >> ?!?! When listcomps were introduced, you were strongly against any >> changes that would make it difficult to switch back and forth between a >> listcomp and its corresponding equivalent for loop. > > I don't recall what I said then. Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? What I remember you saying was that it was an unfortunate but necessary consequence so that it would work the same as L = [] for x in R: L.append(x) print x You didn't want to have different semantics for two such similar constructs ("there's only one way"). You also didn't want to push a stack frame for listcomps. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From mcherm at mcherm.com Mon Oct 20 14:06:52 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Oct 20 14:06:58 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <1066673212.3f94243c2c03c@mcherm.com> Alex: > We are indeed sure (sadly) that list comprehensions leak control > variable names. Guido: > But they shouldn't. It can be fixed by renaming them (e.g. numeric > names with a leading dot). Aahz: > ?!?! When listcomps were introduced, you were strongly against [...] > Are you changing your position[...]? Guido: > Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? Well, I don't care much about the history of what you may have said... let's get it out in the open: The fact that listcomps leak their variable (thus providing a handy name-binding expression for the evil-minded among us) is a BAD THING. 
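[The leak being objected to is easy to demonstrate. A minimal sketch: under the Python 2.x semantics discussed in this thread the control variable escaped the comprehension, which is exactly the behavior Python 3 later removed by giving comprehensions their own scope.]

```python
# Under the 2.x listcomp semantics discussed here, the control variable
# leaked into the enclosing scope:
#
#     L = [x for x in range(5)]
#     print x        # printed 4: x escaped the comprehension
#
# Python 3 eventually gave comprehensions their own scope, so the same
# probe now reports no leak:
L = [x for x in range(5)]
try:
    x
    leaked = True
except NameError:
    leaked = False
```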
I'd love to see that (mis)feature removed someday. I'd love to have that made possible by Guido's _immediately_ and _officially_ declaring it to be an unsupported (and deprecated) feature. Then maybe *someday* we could get rid of them. Even now, people are writing code that (ab)uses this, and making it ever harder to ever change this in the future. -- Michael Chermside From guido at python.org Mon Oct 20 14:08:07 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:08:22 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 13:52:30 EDT." <20031020175230.GA7307@panix.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> Message-ID: <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> > What I remember you saying was that it was an unfortunate but necessary > consequence so that it would work the same as > > L = [] > for x in R: > L.append(x) > print x > > You didn't want to have different semantics for two such similar > constructs ("there's only one way"). You also didn't want to push a > stack frame for listcomps. Then I guess I *have* changed my mind. I guess I didn't think of the renaming solution way back when. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:15:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:15:33 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 17:44:45 +1300." <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Message-ID: <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> > Most of us seem to agree that having list comprehensions > available as a replacement for map() and filter() is a good > thing. But what about reduce()? 
Are there equally strong > reasons for wanting an alternative to that, too? If not, > why not? If anything, the desire there is *more* pressing. Except for operator.add, expressions involving reduce() are notoriously hard to understand (except to experienced APL or Scheme hackers :-). Things like sum, max, average etc. are expressed very elegantly with iterator comprehensions. I think the question is more one of frequency of use. List comps have nothing over e.g. result = [] for x in S: result.append(x**2) except compactness of exprssion. How frequent is result = 0.0 for x in S: result += x**2 ??? (I've already said my -1 about your 'sum of ...' proposal.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:22:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:22:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Sun, 19 Oct 2003 21:40:42 +0200." <200310192140.43084.aleaxit@yahoo.com> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> <200310192140.43084.aleaxit@yahoo.com> Message-ID: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> > Or maybe, like in dict.fromkeys, we don't want to emphasize > either the building or the newness, but then I wouldn't know what > to suggest except the list.sorted that's already drawn catcalls > (though it drew them when it was proposed as an instance > methods of lists -- maybe as a classmethod it will look better?-) list.sorted as a list factory looks fine to me. Maybe whoever pointed out the problem with l.sorted() vs. l.sort() for non-native-English speakers can shed some light on how list.sorted(x) fares compared to x.sort()? But the argument that it wastes a copy still stands (even though that's only O(N) vs. O(N log N) for the sort). 
> I want the functionality -- any sensible name that might let the > functionality into the standard library would be ok by me (so > would one putting the functionality in as a builtin or as an instance > method of lists, actually, but I _do_ believe those would not be > the best places for this functionality, by far). I hope the "tools > package" idea and/or the classmethod one find favour...!-) I'm still unclear why this is so important to have in the library when you can write it yourself in two lines. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:23:15 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:23:24 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 11:06:52 PDT." <1066673212.3f94243c2c03c@mcherm.com> References: <1066673212.3f94243c2c03c@mcherm.com> Message-ID: <200310201823.h9KINF921648@12-236-54-216.client.attbi.com> > Alex: > > We are indeed sure (sadly) that list comprehensions leak control > > variable names. > > Guido: > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > Aahz: > > ?!?! When listcomps were introduced, you were strongly against [...] > > Are you changing your position[...]? > > Guido: > > Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? > > Well, I don't care much about the history of what you may have said... > let's get it out in the open: The fact that listcomps leak their > variable (thus providing a handy name-binding expression for the evil-minded > among us) is a BAD THING. > > I'd love to see that (mis)feature removed someday. I'd love to have that > made possible by Guido's _immediately_ and _officially_ declaring it to be > an unsupported (and deprecated) feature. Make it so. > Then maybe *someday* we could
Even now, people are writing code that (ab)uses this, > and making it ever harder to ever change this in the future. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Oct 20 14:31:24 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 14:31:22 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031020141447.021bc7b0@telecommunity.com> At 10:08 AM 10/20/03 -0700, Guido van Rossum wrote: >I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly >because I expect that it will have such a big impact on coding >practices. I want to have a better feel for what that impact is and >whether it is altogether healthy. IOW I'm a bit worried that >adaptation might become too attractive of a hammer for all sorts of >problems, whether or not there are better-suited solutions. FWIW, it occurred to me recently that other languages/systems (e.g CLOS, Dylan) solve the problems that adaptation solves by using generic functions. So, by analogy, one could simply ask whether generic functions are too attractive a hammer in those types of languages. :) The other comparison that might be made is to downcast operations in e.g. Java, or conversion constructors (is that the right name?) in C++. In some ways, adaptation seems more Pythonic to me than generic functions, because it results in objects that support an interface. To do the same with generic functions, one would have to curry in the "self". OTOH, generic functions in CLOS and Dylan support multiple dispatch, which is certainly better for implementing binary (or N-ary) operations. So there are tradeoffs either way. Sometimes, when I define an interface with just one method in it, it looks like it would be cleaner as a generic function. 
But when there's more than one method, I tend to prefer interface+adaptation. I don't have a generic function implementation I'm happy with at present, though, so I stick with adaptation for now. One other issue with generic functions is that languages with generic functions usually have open type systems that allow e.g. union types or predicate types. Python doesn't have that, so it's hard to e.g. "adapt from one interface to another" with generic functions. It can be done, certainly, it's just hard to do it declaratively in a manner open to extension. From skip at pobox.com Mon Oct 20 14:32:13 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 20 14:32:22 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <16276.10797.81884.996776@montanaro.dyndns.org> >> ?!?! When listcomps were introduced, you were strongly against any >> changes that would make it difficult to switch back and forth between >> a listcomp and its corresponding equivalent for loop. Guido> I don't recall what I said then. Did I say it was a feature that Guido> L = [x for x in R] Guido> print x Guido> would print the last item of R? I suspect the lack of a PEP at the time list comprehensions were added to the language allowed this to slip through. PEP 202 was mostly written after list comprehensions were checked into CVS I think (opened 2000-07-13, marked final 2001-08-14, yes 2001!). At just 84 lines it's one of the shortest PEPs. The patch I opened on SF (#400654, opened 2000-06-28, closed 2000-08-14) was essentially Greg Ewing's experimental patch, which relied heavily on the existing for loop code generation. Had there been a PEP with the usual fanfare, I suspect we'd have caught (or at least considered) variable leakage, and perhaps suppressed it. 
I don't recall the topic ever coming up until after list comps were part of the language. It certainly seems to be the most controversial aspect, after one accepts the idea of adding them to the language. Missing such an obvious point of contention is perhaps one of the strongest arguments for the current PEP process. Skip From lists at webcrunchers.com Mon Oct 20 14:46:17 2003 From: lists at webcrunchers.com (John D.) Date: Mon Oct 20 14:46:28 2003 Subject: [Python-Dev] dbm bugs? Message-ID: #!/usr/local/bin/python #2003-10-19. Feedback import dbm print """ Python dbm bugs summary: 1. Long strings cause weirdness. 2. Long keys fail without returning error. This demonstrates serious bugs in the Python dbm module. Present in OpenBSD versions 2.2, 2.3, and 2.3.2c1. len(key+string)>61231 results in the item being 'lost', without warning. If the key or string is one character shorter, it is fine. Writing multiple long strings causes unpredictable results (none, some, or all of the items are lost without warning). Curiously, keys of length 57148 return an error, but longer keys fail without warning (sounds like an = instead of a > somewhere). 
""" mdb=dbm.open("mdb","n") print "Writing 1 item to database, but upon reading," k='k' v='X'*61230 #Long string mdb[k]=v mdb.close() md=dbm.open("mdb","r") print "database contains %i items"%len(md.keys()) md.close() From ianb at colorstudy.com Mon Oct 20 15:14:48 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Mon Oct 20 15:14:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> Message-ID: On Monday, October 20, 2003, at 01:22 PM, Guido van Rossum wrote: >> I want the functionality -- any sensible name that might let the >> functionality into the standard library would be ok by me (so >> would one putting the functionality in as a builtin or as an instance >> method of lists, actually, but I _do_ believe those would not be >> the best places for this functionality, by far). I hope the "tools >> package" idea and/or the classmethod one find favour...!-) > > I'm still unclear why this so important to have in the library when > you can write it yourself in two lines. Probably "there should only be one way to do something." It's something that is recreated over and over, mostly the same way but sometimes with slight differences (e.g., copy-and-sort versus sort-in-place). Like dict() growing keyword arguments, a copy/sort method (function, classmethod, whatever) will standardize something that is very commonly reimplemented. Another analogs might be True and False (which before being built into Python may have been spelled true/false, TRUE/FALSE, or just 0/1). These don't add any real features, but they standardize these simplest of idioms. I think I've seen people in this thread say that they've written Big Python Programs, and they didn't have any problem with this -- but this is a feature that's most important for Small Python Programs. Defining a sort() function becomes boilerplate when you write small programs. 
Or alternatively you create some util module that contains these little functions, which becomes like a site.py only somewhat more explicit. A util module feels like boilerplate as well, because it is a module without any conceptual integrity, shared between projects only for convenience, or not shared as it grows organically. "from util import sort" just feels like cruft-hiding, not real modularity. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From guido at python.org Mon Oct 20 15:24:33 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 15:24:47 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 14:14:48 CDT." References: Message-ID: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> > > I'm still unclear why this so important to have in the library when > > you can write it yourself in two lines. > > Probably "there should only be one way to do something." It's > something that is recreated over and over, mostly the same way but > sometimes with slight differences (e.g., copy-and-sort versus > sort-in-place). Like dict() growing keyword arguments, a copy/sort > method (function, classmethod, whatever) will standardize something > that is very commonly reimplemented. Another analogs might be True and > False (which before being built into Python may have been spelled > true/false, TRUE/FALSE, or just 0/1). These don't add any real > features, but they standardize these simplest of idioms. > > I think I've seen people in this thread say that they've written Big > Python Programs, and they didn't have any problem with this -- but this > is a feature that's most important for Small Python Programs. Defining > a sort() function becomes boilerplate when you write small programs. > Or alternatively you create some util module that contains these little > functions, which becomes like a site.py only somewhat more explicit. 
A > util module feels like boilerplate as well, because it is a module > without any conceptual integrity, shared between projects only for > convenience, or not shared as it grows organically. "from util import > sort" just feels like cruft-hiding, not real modularity. That's one of the best ways I've seen this formulated. If Alex's proposal to have list.sorted() as a factory function is acceptable to the non-English-speaking crowd, I think we can settle on that. (Hm, an alternative would be to add a "sort=True" keyword argument to list()...) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Mon Oct 20 15:30:59 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon Oct 20 15:31:51 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> Message-ID: <16276.14323.383711.943996@grendel.zope.com> Guido van Rossum writes: > (Hm, an alternative would be to add a "sort=True" keyword > argument to list()...) My immediate expectation on seeing that would be that the keyword args for l.sort() would also be present. It feels better to isolate that stuff; keeping list.sorted(...) makes more sense I think. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From Martin.McGreal at anheuser-busch.com Mon Oct 20 15:34:35 2003 From: Martin.McGreal at anheuser-busch.com (McGreal, Martin P.) Date: Mon Oct 20 15:34:43 2003 Subject: [Python-Dev] to the maintainer of python's configure script Message-ID: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> Hello, I need to make these modifications to the configure script every time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume 5.1 as well), so I figured I might as well submit them to you. Everything works fine without my changes except for the readline detection.
To get readline detection to work I must... 1. AIX doesn't have a termcap library, so any reference to -ltermcap must be changed to -lcurses. 2. The prototype in the sample code at line 18237 is different from the prototype in <readline/readline.h>, so it should simply be removed from the sample code. 3. The sample code header doesn't include <stdio.h>, so both it and <readline/readline.h> should be included. 34d33 < $as_unset ENV MAIL MAILPATH 18222c18221 < LIBS="-lreadline -ltermcap $LIBS" --- > LIBS="-lreadline -lcurses $LIBS" 18225a18225,18226 > #include <stdio.h> > #include <readline/readline.h> 18237d18237 < char rl_pre_input_hook (); 18286c18286 < LIBS="-lreadline -ltermcap $LIBS" --- > LIBS="-lreadline -lcurses $LIBS" 18929d18928 < $as_unset ENV MAIL MAILPATH My configure command is ../configure -C --includedir=/usr/local/include --with-libs=-L/usr/local/lib --disable-ipv6 --with-threads My readline is version 4.3, and is installed under /usr/local: # find /usr/local/include -type f |egrep "readline|history" /usr/local/include/readline/chardefs.h /usr/local/include/readline/history.h /usr/local/include/readline/keymaps.h /usr/local/include/readline/readline.h /usr/local/include/readline/rlconf.h /usr/local/include/readline/rlstdc.h /usr/local/include/readline/rltypedefs.h /usr/local/include/readline/tilde.h # find /usr/local/lib -type f |egrep "readline|history" /usr/local/lib/libhistory.a /usr/local/lib/libreadline.a If I do not make the changes in the configure script for the readline checks, the following errors are produced: [rl_pre_input_hook check before changing -ltermcap to -lcurses]: configure:18215: checking for rl_pre_input_hook in -lreadline configure:18246: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -ltermcap -L/usr/local/lib -ldl >&5 ld: 0706-006 Cannot find or open library file: -l termcap ld:open(): No such file or directory [rl_completion_matches check before changing -ltermcap to -lcurses] configure:18279: checking for rl_completion_matches in -lreadline configure:18310: cc_r -o conftest -g
-I/usr/local/include conftest.c -lreadline -ltermcap -L/usr/local/lib -ldl >&5 ld: 0706-006 Cannot find or open library file: -l termcap ld:open(): No such file or directory configure:18313: $? = 255 [rl_pre_input_hook check after changing -ltermcap to -lcurses]: configure:18215: checking for rl_pre_input_hook in -lreadline configure:18246: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -lcurses -L/usr/local/lib -ldl >&5 ld: 0711-317 ERROR: Undefined symbol: .rl_pre_input_hook ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. configure:18249: $? = 8 [rl_completion_matches check is ok after changing -ltermcap to -lcurses] [rl_pre_input_hook check after adding <stdio.h> and <readline/readline.h> to example code] configure:18215: checking for rl_pre_input_hook in -lreadline configure:18249: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -lcurses -L/usr/local/lib -ldl >&5 "configure", line 18233.9: 1506-236 (W) Macro name _ALL_SOURCE has been redefined. "configure", line 18233.9: 1506-358 (I) "_ALL_SOURCE" is defined on line 129 of /usr/include/standards.h. "configure", line 18432.6: 1506-343 (S) Redeclaration of rl_pre_input_hook differs from previous declaration on line 526 of "/usr/local/include/readline/readline.h". "configure", line 18432.6: 1506-382 (I) The type "unsigned char()" of identifier rl_pre_input_hook differs from previous type "int(*)()". configure:18252: $? = 1 [rl_pre_input_hook check is ok after deleting redeclaration of rl_pre_input_hook] This output was produced on an H50 running AIX 5.2 ML1. The same output can be produced on AIX 4.3.3 ML11 (tested on an S7A), except that there is a libtermcap, so -ltermcap doesn't have to be changed to -lcurses (and consequently, the rl_completion_matches check goes right the first time).
And on 4.3.3 for some reason the --includedir=/usr/local/include doesn't work so instead I had to use CPPFLAGS="-I/usr/local/include" ./configure -C --with-libs=-L/usr/local/lib --disable-ipv6 --with-threads Thanks! Martin McGreal PS: I also remove the unsetting of ENV from lines 34 and 18929 because on our systems ENV is readonly, which makes the configure script choke. From pje at telecommunity.com Mon Oct 20 15:42:20 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 15:42:29 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: Message-ID: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> At 12:24 PM 10/20/03 -0700, Guido van Rossum wrote: > > > > [Ian cites "preferably only one obvious way to do it" to justify a sort > idiom] > >That's one of the best ways I've seen this formulated. Does this extend by analogy to other requests for short functions that are commonly reimplemented? Not that any spring to mind at the moment; it just seems to me that inline sorting is one of a set of perennially requested such functions or methods, where the current standard answer is "but you can do it yourself in only X lines!". >If Alex's proposal to have list.sorted() as a factory function is >acceptable to the non-English-speaking crowd, I think we can settle on >that. (Hm, an alternative would be to add a "sort=True" keyword >argument to list()...) Wouldn't it need to grow key and cmpfunc, too? 
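[Presumably yes: a factory would forward the same keyword arguments as sort(). A minimal sketch of the behavior under discussion, spelled as a plain function since list.sorted() did not exist at the time; note that the key and reverse keywords themselves only reached list.sort() in Python 2.4, and this is essentially what later shipped as the sorted() builtin.]

```python
def list_sorted(iterable, key=None, reverse=False):
    # accept any iterable, return a new sorted list; the input
    # (if it is a list) is left untouched
    result = list(iterable)
    result.sort(key=key, reverse=reverse)
    return result

print(list_sorted("cab"))                    # ['a', 'b', 'c']
print(list_sorted([3, 1, 2], reverse=True))  # [3, 2, 1]
```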
From nas-python at python.ca Mon Oct 20 15:51:38 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Mon Oct 20 15:50:40 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> Message-ID: <20031020195138.GA30478@mems-exchange.org> On Mon, Oct 20, 2003 at 12:24:33PM -0700, Guido van Rossum wrote: > (Hm, an alternative would be to add a "sort=True" keyword argument > to list()...) Yuck. -1. Neil From martin at v.loewis.de Mon Oct 20 16:14:25 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:15:11 2003 Subject: [Python-Dev] New warnings in _sre.c In-Reply-To: References: Message-ID: "Tim Peters" writes: > MSVC complains when a signed int is compared to an unsigned int. I'm glad > it does, because the compiler silently casts the signed int to unsigned, > which doesn't do what the author probably intended if the signed int is less > than 0: FWIW, gcc complains about the same thing. Regards, Martin From martin at v.loewis.de Mon Oct 20 16:15:46 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:17:07 2003 Subject: [Python-Dev] dbm bugs? In-Reply-To: References: Message-ID: "John D." writes: > #!/usr/local/bin/python > #2003-10-19. Feedback Can you please submit bug report to sf.net/projects/python? Thanks, Martin From guido at python.org Mon Oct 20 16:17:23 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 16:17:31 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 15:42:20 EDT." <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> References: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> Message-ID: <200310202017.h9KKHNU21889@12-236-54-216.client.attbi.com> > >That's one of the best ways I've seen this formulated. 
> > Does this extend by analogy to other requests for short functions > that are commonly reimplemented? Not that any spring to mind at the > moment; it just seems to me that inline sorting is one of a set of > perennially requested such functions or methods, where the current > standard answer is "but you can do it yourself in only X lines!". Only if there's some quirk to reimplementing them correctly, and only if the need is truly common. Most recently we did this for sum(). > >If Alex's proposal to have list.sorted() as a factory function is > >acceptable to the non-English-speaking crowd, I think we can settle on > >that. (Hm, an alternative would be to add a "sort=True" keyword > >argument to list()...) > > Wouldn't it need to grow key and cmpfunc, too? Yes, but list.sorted() would have to support these too. It might become slightly inelegant because we'd probably have to say that sorted defaults to False except it defaults to True if either of cmp and key is specified. Note that reverse=True would not imply sorting, so that list(range(10), reverse=True) would yield [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] But Raymond has a different proposal in mind for that (he still needs to update PEP 322 though). So maybe list.sorted() is better because it doesn't lend itself to such generalizations (mostly because of the TOOWTDI rule). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Oct 20 16:19:14 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:19:43 2003 Subject: [Python-Dev] to the maintainer of python's configure script In-Reply-To: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> References: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> Message-ID: "McGreal, Martin P."
writes: > I need to make these modifications to the configure script every > time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume > 5.1 as well), so I figured I might as well submit them to you. Dear Martin, Please understand that the patches are likely ignored if sent to python-dev. Instead, please submit them to sf.net/projects/python. It would be good if a) you could send unified (-u) or context (-c) diffs, and b) the patches would be generally applicable to all systems, or, if this is not feasible, c) patches specific to AIX would not harm operation of other systems Regards, Martin From Martin.McGreal at anheuser-busch.com Mon Oct 20 16:21:15 2003 From: Martin.McGreal at anheuser-busch.com (McGreal, Martin P.) Date: Mon Oct 20 16:21:30 2003 Subject: [Python-Dev] to the maintainer of python's configure script Message-ID: <09C096BBD0CB2244B0B176300BEDD65A029141E5@STLEXGUSR32.abc.corp.anheuser-busch.com> Ok, will do. Thanks! -----Original Message----- From: Martin v. Löwis [mailto:martin@v.loewis.de] Sent: Monday, October 20, 2003 3:19 PM To: python-dev@python.org Cc: McGreal, Martin P. Subject: Re: [Python-Dev] to the maintainer of python's configure script "McGreal, Martin P." writes: > I need to make these modifications to the configure script every > time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume > 5.1 as well), so I figured I might as well submit them to you. Dear Martin, Please understand that the patches are likely ignored if sent to python-dev. Instead, please submit them to sf.net/projects/python.
It would be good if a) you could send unified (-u) or context (-c) diffs, and b) the patches would be generally applicable to all systems, or, if this is not feasible, c) patches specific to AIX would not harm operation of other systems Regards, Martin From marktrussell at btopenworld.com Mon Oct 20 17:11:02 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Mon Oct 20 17:13:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> <200310192140.43084.aleaxit@yahoo.com> <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> Message-ID: <1066684262.17163.24.camel@straylight> On Mon, 2003-10-20 at 19:22, Guido van Rossum wrote: > But the argument that it wastes a copy still stands (even though > that's only O(N) vs. O(N log N) for the sort). That would be irrelevant in most of the cases where I would use it - typically sorting short lists or dicts where the overhead is unmeasurable. > I'm still unclear why this is so important to have in the library when > you can write it yourself in two lines. For little standalone scripts it gets a bit tedious to write this again and again. It doesn't take much code to write dict.fromkeys() manually, but I'm glad that it's there. I'd say list.sorted (or whatever it gets called) has at least as much claim to exist. Mark Russell From tdelaney at avaya.com Mon Oct 20 17:34:20 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:34:29 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF59@au3010avexu1.global.avaya.com> > From: "Martin v.
Löwis" [mailto:martin@v.loewis.de] > > Delaney, Timothy C (Timothy) wrote: > > > Perhaps a comment that the patch won't be accepted until > the dead code > > has been removed, but that the dead code is there for ease > of regression > > testing during the initial testing period? > > OTOH, the patch has been already committed to CVS head. So it > is already accepted. True. Too many different bug tracking and source control systems ... I think it would be very useful (and important) to document this requirement though - perhaps a separate bug report, with a comment on the patch pointing to it? Tim Delaney From tdelaney at avaya.com Mon Oct 20 17:35:48 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:35:56 2003 Subject: [Python-Dev] RE: modules for builtin types (was Re: copysort patch) Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5B@au3010avexu1.global.avaya.com> > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > Which is actually "sets" (lowercase leading s). You're right ... I had a brain fart, thinking we used:

    from Sets import set

but of course it's:

    from sets import Set

Damn. There goes a beautifully-crafted proposal ;) Tim Delaney From python at rcn.com Mon Oct 20 17:43:59 2003 From: python at rcn.com (Raymond Hettinger) Date: Mon Oct 20 17:44:47 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> Message-ID: <000301c39753$45a18980$e841fea9@oemcomputer> Let's see what the use cases look like under the various proposals:

todo = [t for t in tasks.copysort() if due_today(t)]
todo = [t for t in list.sorted(tasks) if due_today(t)]
todo = [t for t in list(tasks, sorted=True) if due_today(t)]

genhistory(date, events.copysort(key=incidenttime))
genhistory(date, list.sorted(events, key=incidenttime))
genhistory(date, list(events, sorted=True, key=incidenttime))

for f in os.listdir().copysort(): . . .
for f in list.sorted(os.listdir()): . . .
for f in list(os.listdir(), sorted=True): . . . To my eye, the first form reads much better in every case. It still needs a better name though. [Phillip J. Eby in a separate note] > Wouldn't it need to grow key and cmpfunc, too? Now that "key" and "reverse" are available, there is no need for "cmp" in any new methods. [Guido in a separate note] > list(range(10), reverse=True) > >would yield > > [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] > > But Raymond has a different proposal in mind for that (he still needs to > > update PEP 322 though). I'll get to it soon; there won't be any surprises. Raymond Hettinger From tdelaney at avaya.com Mon Oct 20 17:55:05 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:55:13 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > I'd love to see that (mis)feature removed someday. I'd love > to have that > > made possible by Guido's _immediately_ and _officially_ > declaring it to be > > an unsupported (and deprecated) feature. > > Make it so. Should someone raise a bug report against the docs for this then?
Tim Delaney From aleaxit at yahoo.com Mon Oct 20 17:56:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 17:56:36 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> References: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <200310202356.30050.aleaxit@yahoo.com> On Monday 20 October 2003 11:43 pm, Raymond Hettinger wrote: > Let's see what the use cases look like under the various proposals: > > todo = [t for t in tasks.copysort() if due_today(t)] > todo = [t for t in list.sorted(tasks) if due_today(t)] > todo = [t for t in list(tasks, sorted=True) if due_today(t)] > > genhistory(date, events.copysort(key=incidenttime)) > genhistory(date, list.sorted(events, key=incidenttime)) > genhistory(date, list(events, sorted=True, key=incidenttime)) > > for f in os.listdir().copysort(): . . . > for f in list.sorted(os.listdir()): . . . > for f in list(os.listdir(), sorted=True): . . . > > To my eye, the first form reads much better in every case. > It still needs a better name though. You're forgetting the cases in which (e.g.) tasks is not necessarily a list, but any finite sequence (iterable or iterator). Then. e.g. the first job becomes: todo = [t for t in list(tasks).copysort() if due_today(t)] todo = [t for t in list.sorted(tasks) if due_today(t)] todo = [t for t in list(tasks, sorted=True) if due_today(t)] and I think you'll agree that the first construct isn't that good then (quite apart from the probably negligible overhead of an unneeded copy -- still, we HAVE determined that said small overhead needs to be paid sometimes, and needing to code list(x).copysort() when x is not a list or you don't KNOW if x is a list adds one copy then). > [Phillip J. Eby in a separate note] > > > Wouldn't it need to grow key and cmpfunc, too? > > Now, that "key" and "reverse" are available, > there is no need for "cmp" in any new methods. 
Sorry, but much as I dislike cmpfunc it's still opportune at times, e.g. I'd rather code: def Aup_Bdown(x, y): return cmp(x.A, y.A) or cmp(y.B, x.B) for a in list.sorted(foo, cmp=Aup_Bdown): ... than for a in list.sorted( list.sorted(foo, key=lambda x:x.B, reverse=True), key=lambda x: x.A): ... or even for a in list(foo).copysort(key=lambda x:x.B, reverse=True ).copysort(key=lambda x: x.A): ... Alex From marktrussell at btopenworld.com Mon Oct 20 18:04:12 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Mon Oct 20 18:06:49 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> References: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <1066687451.16391.45.camel@straylight> On Mon, 2003-10-20 at 22:43, Raymond Hettinger wrote: > Let's see what the use cases look like under the various proposals: > > [1] todo = [t for t in tasks.copysort() if due_today(t)] > [2] todo = [t for t in list.sorted(tasks) if due_today(t)] > [3] todo = [t for t in list(tasks, sorted=True) if due_today(t)] Well, #3 is (I hope) a non-starter, given the need for the extra sort keyword arguments. And the instance method is less capable - it can't sort a non-list iterable (except via list(xxx).copysort()). So I would definitely prefer #2, especially as I would tend to put: sort = list.sorted at the top of my modules where needed. Then I'd have: todo = [t for t in sort(tasks) if due_today(t)] genhistory(date, sort(events, key=incidenttime)) for f in sort(os.listdir()): . . . which to me looks enough like pseudocode that I'm happy. This might seem like an argument for having sort() as a builtin, but I think it's still better as a list constructor. Adding "sort = list.sorted" to the modules that need it is a small price to pay in boilerplate for the big win of not cluttering the builtin namespace. 
Mark Russell From aleaxit at yahoo.com Mon Oct 20 18:09:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:09:43 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> Message-ID: <200310210009.39256.aleaxit@yahoo.com> On Monday 20 October 2003 07:43 pm, Guido van Rossum wrote: ... > I'm not sure why you say it's separate from cloning; it seems to me > that copy.copy(iter(range(10))) should return *exactly* what we'd want > the proposed clone operation to return. I'd be tickled pink if it did, but I would have expected a shallow copy to return an iterator that's not necessarily independent from the starting one. Maybe I have a bad mental model of the "depth" (indirectness) of iterators? > I see this as a plea to add __copy__ and __deepcopy__ methods to all > standard iterators for which it makes sense. (Or maybe only __copy__ > -- I'm not sure what value __deepcopy__ would add.) Hmmm, copy the underlying sequence as well? Don't have any use case for it, but that's what would feel unsurprising to me (as I have already mentioned I may not have the right mental model of an iterator...). > I find this a reasonable request for the iterators belonging to > standard containers (list, tuple, dict). I guess that some of the > iterators in itertools might also support this easily. Perhaps this > would be the road to supporting iterator cloning? It would surely do a lot to let me clone things, yes, and in fact doing it with the existing __copy__ protocol sounds much better than sprouting a new one. (Thanks for the confirmations and clarifications on file internals. btw, any news of the new experimental file things you were playing with back at PythonUK...?)
Alex From aleaxit at yahoo.com Mon Oct 20 18:15:23 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:15:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> Message-ID: <200310210015.23591.aleaxit@yahoo.com> On Monday 20 October 2003 07:21 pm, Guido van Rossum wrote: ... > 'average' or 'sum'. Whether there is an actual gain in speed depends > on how large the list is. You should be able to time examples like > > sum([x*x for x in R]) > > vs. > > def gen(R): > for x in R: > yield x*x > sum(gen(R)) > > for various lengths of R. (The latter would be a good indication of > how fast an iterator generator could run.)

with a.py having:

def asum(R):
    sum([ x*x for x in R ])

def gen(R):
    for x in R: yield x*x
def gsum(R, gen=gen):
    sum(gen(R))

I measure:

[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)'
10000 loops, best of 3: 96 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)'
10000 loops, best of 3: 60 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)'
1000 loops, best of 3: 930 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)'
1000 loops, best of 3: 590 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)'
100 loops, best of 3: 1.28e+04 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)'
100 loops, best of 3: 8.4e+03 usec per loop

not sure why gsum's advantage ratio over asum seems to be roughly constant, but, this IS what I measure!-)

Alex From aleaxit at yahoo.com Mon Oct 20 18:24:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:24:09 2003 Subject: [Python-Dev]
Re: Reiterability In-Reply-To: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> Message-ID: <200310210024.04660.aleaxit@yahoo.com> On Monday 20 October 2003 07:08 pm, Guido van Rossum wrote:

> > Darn -- one more underground attempt to foist adaptation into Python
> > foiled by premature discovery... must learn to phrase things less
> > overtly, the people around here are too clever!!!
> >
> :-)
>
> I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly
> because I expect that it will have such a big impact on coding
> practices. I want to have a better feel for what that impact is and
> whether it is altogether healthy. IOW I'm a bit worried that

Wise as usual. I suspect adaptation should enter Python when interfaces or protocols or however we wanna call them do, and I remember your explanations about wanting to see real-world experience with that stuff, because there will be ONE chance to get them into Python "right".

> adaptation might become too attractive of a hammer for all sorts of
> problems, whether or not there are better-suited solutions.

Well, OO has that problem too -- I see people (mostly coming from Java:-) STARTING with designing a class, by reflex, even when a couple of functions are more suitable. It generally doesn't take ALL that much to wean them from such "premature complexity" if they work with some non-OObsessed Pythonistas. Protocol adaptation is "an attractive hammer" much like OO is, without the further issue of there being very popular "protocol adaptation oriented languages" around:-), so I don't think the worry is really justified.
I've seen another poster use a similar analogy with generic functions and multimethods (which btw we DO have in pypy as an implementation strategy, see http://codespeak.net/ and browse or download at will), and perhaps that's equally suitable too. Alex From tjreedy at udel.edu Mon Oct 20 18:42:18 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 20 18:42:26 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com><200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <20031020143056.GE28665@frobozz> Message-ID: "Andrew Bennetts" wrote in message ... > I think the lazy iteration syntax approach was probably a better idea. I > don't like the proposed use of "yield" to signify it, though -- "yield" is a > flow control statement, so the examples using it in this thread look odd to > me. Same here. > Perhaps it would be best to simply use the keyword "lazy" -- after all, > that's the key distinguishing feature. I think my preferred syntax would > be: > > sum([lazy x*x for x in sequence]) I like this the best of suggestions so far. Easy to understand, easy to teach: [lazy ...] = iter([...]) but produced more efficiently > But use of parens instead of brackets, and/or a colon to make the keyword > stand out (and look reminisicent to a lambda! which *is* a related concept, > in a way -- it also defers evaluation), e.g.: > > sum((lazy: x*x for x in sequence)) I prefer sticking with [...] for 'make a (possibly virtual) list'. Having removed ':' when abbreviating _[] for i in seq: _.append[expr] as an expression, it seems odd to bring it back for a special case. I wish ':' could have also been removed from the lambda abbreviation of def. Terry J. 
Reedy From tjreedy at udel.edu Mon Oct 20 18:47:49 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 20 18:47:54 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310201815.h9KIFM821583@12-236-54-216.client.attbi.com... > > Most of us seem to agree that having list comprehensions > > available as a replacement for map() and filter() is a good > > thing. But what about reduce()? Are there equally strong > > reasons for wanting an alternative to that, too? If not, > > why not? > > If anything, the desire there is *more* pressing. Except for > operator.add, expressions involving reduce() are notoriously hard to > understand (except to experienced APL or Scheme hackers :-). > > Things like sum, max, average etc. are expressed very elegantly with > iterator comprehensions. > > I think the question is more one of frequency of use. List comps have > nothing over e.g. > > result = [] > for x in S: > result.append(x**2) > > except compactness of expression. How frequent is > > result = 0.0 > for x in S: > result += x**2 > > ??? > > (I've already said my -1 about your 'sum of ...' proposal.) > > --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 18:49:41 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 18:49:50 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 23:04:12 BST."
<1066687451.16391.45.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <1066687451.16391.45.camel@straylight> Message-ID: <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> > I would tend to put: > > sort = list.sorted > > at the top of my modules where needed. Really? That would seem to just obfuscate things for the reader (who would have to scroll back potentially many pages to find the one-line definition of sort). Why be so keen on saving 7 keystrokes? How many calls to list.sorted do you expect to have in your average module? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 18:51:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 18:51:32 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 07:55:05 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> Message-ID: <200310202251.h9KMpMn22142@12-236-54-216.client.attbi.com> > > From: Guido van Rossum [mailto:guido@python.org] > > > > > I'd love to see that (mis)feature removed someday. I'd love to > > > have that made possible by Guido's _immediately_ and > > > _officially_ declaring it to be an unsupported (and deprecated) > > > feature. > > > > Make it so. > > Should someone raise a bug report against the docs for this then? Please do (unless you can check the fix in yourself :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Mon Oct 20 19:44:48 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 19:44:55 2003 Subject: [Python-Dev] listcomps vs. 
for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEFBE@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > > From: Guido van Rossum [mailto:guido@python.org] > > > > > > > I'd love to see that (mis)feature removed someday. I'd love to > > > > have that made possible by Guido's _immediately_ and > > > > _officially_ declaring it to be an unsupported (and deprecated) > > > > feature. > > > > > > Make it so. > > > > Should someone raise a bug report against the docs for this then? > > Please do (unless you can check the fix in yourself :-). Raised request 827209: http://sourceforge.net/tracker/index.php?func=detail&aid=827209&group_id=5470&atid=105470 http://tinyurl.com/ro6g I'd have a go at it, but we're in crunch mode here at the moment - hoping to do a release candidate this week - so I don't have time to set up my environment or anything :( Tim Delaney From FBatista at uniFON.com.ar Mon Oct 20 16:54:55 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Oct 20 21:57:55 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID:

#- Sure, rounding IS best set by function, though you may want more than
#- two (roundForbid to raise exceptions when rounding tries to happen,
#- roundTruncate, etc).

So far, there are 4 different kinds:

roundPlain
roundBanker
roundTruncate
roundForbid

#- class Money:
#-     round = staticmethod(roundWhateverDefault)
#-     precision = someDefaultPrecision
#-     def __init__(self, value, precision=None, round=None):
#-         self.value = value
#-         if precision is not None: self.precision = precision
#-         if round is not None: self.round = round
#-
#- then use self.precision and self.round in all further
#- methods -- they'll
#- correctly go to either the INSTANCE attribute, if
#- specifically set, or
#- the CLASS attribute, if no instance attribute is set. A
#- useful part of
#- how Python works, btw.

Wow!
This is the difference between a python newbie and a python guru, :)

#- I do NOT think any advanced formatting should be part of the
#- responsibilities
#- of class Money itself. I would focus on correct and
#- complete arithmetic with
#- good handling of exact precision and rounding rules: I
#- contend THAT is the
#- really necessary part.

Trimming formatting, adding different types of rounding, and allowing strings with engineering notation. Maybe it is better to build a Decimal class (kind of FixedPoint) easily subclassable to make a Money one. From guido at python.org Mon Oct 20 23:44:30 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 23:44:51 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Tue, 21 Oct 2003 00:09:39 +0200." <200310210009.39256.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> Message-ID: <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> > On Monday 20 October 2003 07:43 pm, Guido van Rossum wrote: > ... > > I'm not sure why you say it's separate from cloning; it seems to me > > that copy.copy(iter(range(10))) should return *exactly* what we'd want > > the proposed clone operation to return. [Alex] > I'd be tickled pink if it did, but I would have expected a shallow > copy to return an iterator that's not necessarily independent from > the starting one. Maybe I have a bad mental model of the "depth" > (indirectness) of iterators? Hm.
Let's consider a Python implementation of a sequence iterator (I think you've given a similar class before):

class SeqIter:
    def __init__(self, seq, i=0):
        self.seq = seq
        self.i = i
    def __iter__(self):
        return self # Obligatory self-returning __iter__
    def next(self):
        try:
            x = self.seq[self.i]
        except IndexError:
            raise StopIteration
        else:
            self.i += 1
            return x

All we care about really is that this is an instance with two instance variables, seq and i. A shallow copy creates a new instance (with a new __dict__!) with the same two instance variable names, referencing the same two objects. Since i is immutable, the copy/clone is independent from the original iterator; but both reference the same underlying sequence object. Clearly this is the copy semantics that would be expected from a sequence iterator object implemented in C. Ditto for the dict iterator. Now if someone wrote a tree class with a matching iterator class (which might keep a stack of nodes visited in a list), the default copy.copy() semantics might not be right, but such a class could easily provide a __copy__ method that did the right thing (making a shallow copy of the stack). > > I see this as a plea to add __copy__ and __deepcopy__ methods to all > > standard iterators for which it makes sense. (Or maybe only __copy__ > > -- I'm not sure what value __deepcopy__ would add.) > > Hmmm, copy the underlying sequence as well? Don't have any use > case for it, but that's what would feel unsurprising to me (as I have > already mentioned I may not have the right mental model of an > iterator...). Right. I have no use case for this either, although it's close to pickling, and who knows if someday it might be useful to be able to pickle iterators along with their containers. > > I find this a reasonable request for the iterators belonging to > > standard containers (list, tuple, dict). I guess that some of the > > iterators in itertools might also support this easily.
Perhaps this > > would be the road to supporting iterator cloning? > > It would surely do a lot to let me clone things, yes, and in fact > doing it with the existing __copy__ protocol sounds much better > than sprouting a new one. Right, so that's settled. We don't need an iterator cloning protocol, we can just let iterators support __copy__. (There's no C-level slot for this.) > (Thanks for the confirmations and clarifications on file internals. > btw, any news of the new experimental file things you were > playing with back at PythonUK...?) No; I donated it to pypy and I think it's in their subversion depot. I haven't had time to play with it further. It would be great to do a rewrite from the ground up of the file object without using stdio, but it would be a lot of work to get it right on all platforms; I guess an engineering team of volunteers should be formed to tackle this issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 00:09:11 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 00:09:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 00:15:23 +0200." 
<200310210015.23591.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> Message-ID: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> > with a.py having: > def asum(R): > sum([ x*x for x in R ]) > > def gen(R): > for x in R: yield x*x > def gsum(R, gen=gen): > sum(gen(R)) > > I measure: > > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)' > 10000 loops, best of 3: 96 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)' > 10000 loops, best of 3: 60 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)' > 1000 loops, best of 3: 930 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)' > 1000 loops, best of 3: 590 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)' > 100 loops, best of 3: 1.28e+04 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)' > 100 loops, best of 3: 8.4e+03 usec per loop > > not sure why gsum's advantage ratio over asum seems to be roughly > constant, but, this IS what I measure!-) Great! This is a plus for iterator comprehensions (we need a better term BTW). I guess that building up a list using repeated append() calls slows things down more than the frame switching used by generator functions; I knew the latter was fast but this is a pleasant result. BTW, if I use a different function that calculates list() instead of sum(), the generator version is a few percent slower than the list comprehension. But that's because list(a) has a shortcut in case a is a list, while sum(a) always uses PyIter_Next(). So this is actually consistent: despite the huge win of the shortcut, the generator version is barely slower. 
I think the answer lies in the bytecode: >>> def lc(a): return [x for x in a] >>> import dis >>> dis.dis(lc) 2 0 BUILD_LIST 0 3 DUP_TOP 4 LOAD_ATTR 0 (append) 7 STORE_FAST 1 (_[1]) 10 LOAD_FAST 0 (a) 13 GET_ITER >> 14 FOR_ITER 16 (to 33) 17 STORE_FAST 2 (x) 20 LOAD_FAST 1 (_[1]) 23 LOAD_FAST 2 (x) 26 CALL_FUNCTION 1 29 POP_TOP 30 JUMP_ABSOLUTE 14 >> 33 DELETE_FAST 1 (_[1]) 36 RETURN_VALUE 37 LOAD_CONST 0 (None) 40 RETURN_VALUE >>> def gen(a): for x in a: yield x >>> dis.dis(gen) 2 0 SETUP_LOOP 18 (to 21) 3 LOAD_FAST 0 (a) 6 GET_ITER >> 7 FOR_ITER 10 (to 20) 10 STORE_FAST 1 (x) 13 LOAD_FAST 1 (x) 16 YIELD_VALUE 17 JUMP_ABSOLUTE 7 >> 20 POP_BLOCK >> 21 LOAD_CONST 0 (None) 24 RETURN_VALUE >>> The list comprehension executes 7 bytecodes per iteration; the generator version only 5 (this could be more of course if the expression was more complicated than 'x'). The YIELD_VALUE does very little work; falling out of the frame is like falling off a log; and gen_iternext() is pretty sparse code too. On the list comprehension side, calling the list's append method has a bunch of overhead. (Some of which could be avoided if we had a special-purpose opcode which called PyList_Append().) But the executive summary remains: the generator wins because it doesn't have to materialize the whole list. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Tue Oct 21 03:27:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:27:20 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: Message-ID: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> > > sum([lazy x*x for x in sequence]) > > I like this the best of suggestions so far. Easy to understand, easy > to teach: > [lazy ...] = iter([...]) but produced more efficiently -1. An iterator is not a lazy list. A lazy list would support indexing, slicing, etc. while calculating its items on demand. 
An iterator is inherently sequential and single-use -- a different concept. But maybe some other keyword could be added to ease any syntactic problems, such as "all" or "every": sum(all x*x for x in xlist) sum(every x*x for x in xlist) The presence of the extra keyword would then distinguish an iterator comprehension from the innards of a list comprehension. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 21 03:43:06 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:43:15 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> Guido: > But the executive summary remains: the generator wins because it > doesn't have to materialize the whole list. But what would happen if the generator were replaced with in-line code that computes the values and feeds them to an accumulator object, such as might result from an accumulator syntax that gets inline-expanded in the same way as a list comp? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 21 03:43:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 03:43:38 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> References: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> Message-ID: <200310210943.31574.aleaxit@yahoo.com> On Tuesday 21 October 2003 09:27 am, Greg Ewing wrote: ... > But maybe some other keyword could be added to ease any > syntactic problems, such as "all" or "every": > > sum(all x*x for x in xlist) > sum(every x*x for x in xlist) > > The presence of the extra keyword would then distinguish > an iterator comprehension from the innards of a list > comprehension. Heh, you ARE a volcano of cool syntactic ideas these days, Greg. As between them, to me 'all' sort of connotes 'all at once' while 'every' connotes 'one by one' (so would a third possibility, 'each'); so 'all' is the one I like least. Besides accumulators &c we should also think of normal loops: for a in all x*x for x in xlist: ... for a in every x*x for x in xlist: ... for a in each x*x for x in xlist: ... Of these three, 'every' looks best to me, personally. Alex From greg at cosc.canterbury.ac.nz Tue Oct 21 03:55:23 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:55:43 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310210755.h9L7tNH02963@oma.cosc.canterbury.ac.nz> > Did I miss April 1st? We seem to be discussing the merits of > > f of arg > > as an alternative form of > > f(arg) > > While I'm sure Cobol had some good points, I don't believe that this was one > of them... No, some people were *abusing* my suggested accumulator syntax for things that could have been done more directly using a function call. 
It was not meant to be used for copying or sorting! I may have misled people a bit by using "sum" in one of the examples, since there is currently a function by that name, which wouldn't be directly usable that way. Just to be clear, y = accum of f(x) for x in seq would be equivalent to something like a = accum() for x in seq: a.__consume__(f(x)) y = a.__result__() which, as you can see, is rather more than just a function call. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 21 04:01:29 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 04:01:47 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93E18D.5010708@iinet.net.au> Message-ID: <200310210801.h9L81Tk02971@oma.cosc.canterbury.ac.nz> > Except, if it was defined such that you wrote: > sum of [x*x for x in the_values] I don't think that would be a good idea, because the square brackets make it look less efficient than it really is, and leave you wondering why you shouldn't just write a function call with a listcomp as argument instead. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 21 05:13:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 05:13:47 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310211113.41658.aleaxit@yahoo.com> On Monday 20 October 2003 10:54 pm, Batista, Facundo wrote: ... 
> Triming formatting, adding different types of rounding, and allowing > strings with engineering notation. Maybe is better to build a Decimal class > (kind of FixedPoint) easily subclassable to make a Money one. Sure, arithmetic (including rounding) is what we most need. If we call it Decimal or whatever, that may be preferable to Money, I don't know. Alex From aleaxit at yahoo.com Tue Oct 21 06:02:26 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 06:02:35 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> Message-ID: <200310211202.26677.aleaxit@yahoo.com> On Tuesday 21 October 2003 05:44, Guido van Rossum wrote: ... > All we care about really is that this is an instance with two instance > variables, seq and i. A shallow copy creates a new instance (with a > new __dict__!) with the same two instance variable names, referencing > the same two objects. Since i is immutable, the copy/clone is Ah -- *right*! The index can be taken as IMMUTABLE -- so the fact that the copy is shallow, so gets "the same index object", is a red herring -- as soon as either the copy or the original "increment" the index, they're in fact creating a new index object for themselves while still leaving their brother's index object unchanged. I get it now -- I was thinking too abstractly and generally, in terms of a more general "index" which might be mutable, thus shared (including its changes) after a shallow copy. > Now if someone wrote a tree class with a matching iterator class > (which might keep a stack of nodes visited in a list), the default > copy.copy() semantics might not be right, but such a class could > easily provide a __copy__ method that did the right thing (making a > shallow copy of the stack). 
Yes, if we specify an iter's __copy__ makes an independent iterator, which is surely the most useful semantics for it, then any weird iterator whose index is in fact mutable can copy not-quite-shallowly and offer the same useful semantics. I'm not sure where that leaves generator made iterators, which don't really know which parts of the state in their saved frame are "index", but having them just punt and refuse to copy themselves shallowly might be ok. > > > I see this as a plea to add __copy__ and __deepcopy__ methods to all > > > standard iterators for which it makes sense. (Or maybe only __copy__ > > > -- I'm not sure what value __deepcopy__ would add.) > > > > Hmmm, copy the underlying sequence as well? Don't have any use > > case for it, but that's what would feel unsurprising to me (as I have > > already mentioned I may not have the right mental model of an > > iterator...). > > Right. I have no use case for this either, although it's close to > pickling, and who knows if someday it might be useful to be able to > pickle iterators along with their containers. Sure, it might. Perhaps the typical use case would be one in which an iterator gets deepcopied "incidentally" as part of the deepcopy of some other object which "happens" to hold an iterator; if iterators knew how to deepcopy themselves that would save some work on the part of the other object's author. No huge win, sure. But once the copy gets deep, generator-made iterators should also have no problem actually doing it, and that may be another middle-size win. > > > would be the road to supporting iterator cloning? > > > > It would surely do a lot to let me clone things, yes, and in fact > > doing it with the existing __copy__ protocol sounds much better > > than sprouting a new one. > > Right, so that's settled. We don't need an iterator cloning protocol, > we can just let iterators support __copy__. (There's no C-level slot > for this.) Right. 
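(The point about shallow copies and an immutable index, as a toy Python class rather than the C-level iterator — a hypothetical sketch; the iterator method was spelled next() in the Python of 2003:)

```python
import copy

class seqiter:
    """Toy sequence iterator; copy.copy() yields an independent iterator."""
    def __init__(self, seq):
        self.seq = seq  # shared with copies: only ever read, never mutated
        self.i = 0      # an int is immutable, so rebinding it never leaks to a copy
    def __iter__(self):
        return self
    def __next__(self):  # spelled next() in 2003-era Python
        if self.i >= len(self.seq):
            raise StopIteration
        value = self.seq[self.i]
        self.i += 1
        return value

it = seqiter("abc")
next(it)               # consume 'a'
clone = copy.copy(it)  # default shallow copy: same seq, same (immutable) i
assert next(it) == 'b' and next(clone) == 'b'  # independent from here on
```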
> > (Thanks for the confirmations and clarifications on file internals. > > btw, any news of the new experimental file things you were > > playing with back at PythonUK...?) > > No; I donated it to pypy and I think it's in their subversion depot. > I haven't had time to play with it further. It would be great to do a > rewrite from the ground up of the file object without using stdio, but > it would be a lot of work to get it right on all platforms; I guess an > engineering team of volunteers should be formed to tackle this issue. Right, and doing so as part of pypy is surely right, since it's one of the many things pypy definitely needs to become fully self-hosting. Alex From aleaxit at yahoo.com Tue Oct 21 06:03:49 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 06:03:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> Message-ID: <200310211203.49395.aleaxit@yahoo.com> On Monday 20 October 2003 18:37, Guido van Rossum wrote: ... > > >>> [.2 for .2 in range(3)] > > > > SyntaxError: can't assign to literal > > > > I think I don't understand what you mean. > > I meant that the compiler should rename it. Just like when you use a <> I'm being rather thick these days, I guess. Thanks for clarifying! 
Alex From marktrussell at btopenworld.com Tue Oct 21 06:31:28 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Tue Oct 21 06:34:18 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> References: <000301c39753$45a18980$e841fea9@oemcomputer> <1066687451.16391.45.camel@straylight> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> Message-ID: <1066732288.18847.17.camel@straylight> On Mon, 2003-10-20 at 23:49, Guido van Rossum wrote: > Really? That would seem to just obfuscate things for the reader (who > would have to scroll back potentially many pages to find the one-line > definition of sort). I think most readers would probably be able to guess what for key in sort(d.keys()): would do. If not then it's no worse than a user-defined function. It's also a matter of proportion -- the important thing about the code above is that it's walking over a dictionary. In most of my uses, the sort() is just a detail to ensure reproducible behaviour. In a new language I think you could make a case for the default behaviour for dict iteration to be sorted, with a walk-in-unspecified-order method for the cases where the speed really does matter. Back in the real world, how about: for key, value in d.sort(): (i.e. adding a sort() instance method to dict equivalent to: def sort(d, cmp=None, key=None, reverse=False): l = list(d.items()) l.sort(cmp, key, reverse) return l ). At least there's no question of an in-place sort for dicts! > Why be so keen on saving 7 keystrokes? It's not totally trivial - for me a list comprehension is noticeably less readable when split over more than one line. > How many calls to list.sorted do you expect to have in your average > module? Er, about 0.3 :-) In the project I'm working on, there are 52 sortcopy() calls in 162 modules (about 18K LOC). Not enough to justify a built-in sort(), but enough I think to make list.sorted() worthwhile. 
Mark Russell From aleaxit at yahoo.com Tue Oct 21 07:00:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 07:00:59 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066732288.18847.17.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> <1066732288.18847.17.camel@straylight> Message-ID: <200310211300.42187.aleaxit@yahoo.com> On Tuesday 21 October 2003 12:31 pm, Mark Russell wrote: > On Mon, 2003-10-20 at 23:49, Guido van Rossum wrote: > > Really? That would seem to just obfuscate things for the reader (who > > would have to scroll back potentially many pages to find the one-line > > definition of sort). > > I think most readers would probably be able to guess what > > for key in sort(d.keys()): > > would do. If not then it's no worse than a user-defined function. Incidentally, for k in list.sorted(d): will be marginally faster, e.g. (using the copysort I posted here, without The Trick -- it should be just about identical to the list.sorted classmethod): import copysort x = dict.fromkeys(map(str,range(99999))) def looponkeys(x, c=copysort.copysort): for k in c(x.keys()): pass def loopondict(x, c=copysort.copysort): for k in c(x): pass [alex@lancelot ext]$ timeit.py -c -s'import t' 't.loopondict(t.x)' 10 loops, best of 3: 2.84e+05 usec per loop [alex@lancelot ext]$ timeit.py -c -s'import t' 't.looponkeys(t.x)' 10 loops, best of 3: 2.67e+05 usec per loop i.e., about 10% better for this size of list and number of runs (quite a few, eyeball x.keys()...:-). Nothing crucial, of course, but still. Moreover, "list.sorted(d)" and "sort(d.keys())" are the same length, and the former is conceptually simpler (one [explicit] method call, vs one method call and one function call). Of course, if each keystroke count, you may combine both "abbreviations" and just use "sort(d)". > for key, value in d.sort(): > > (i.e. 
adding a sort() instance method to dict equivalent to: Why should multiple data types acquire separate .sort methods with subtly different semantics (one works in-place and returns None, one doesn't mutate the object and returns a list, ...) when there's no real added value wrt ONE classmethod of list...? Particularly with cmp, key, and reverse on each, seems cumbersome to me. Truly, is list.sorted(d.iteritems()) [or d.items() if you'd rather save 4 chars than a small slice of time:-)] SO "unobvious"? I just don't get it. Alex From marktrussell at btopenworld.com Tue Oct 21 07:18:16 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Tue Oct 21 07:21:04 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310211300.42187.aleaxit@yahoo.com> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> <1066732288.18847.17.camel@straylight> <200310211300.42187.aleaxit@yahoo.com> Message-ID: <1066735096.18849.33.camel@straylight> On Tue, 2003-10-21 at 12:00, Alex Martelli wrote: > Why should multiple data types acquire separate .sort methods with > subtly different semantics (one works in-place and returns None, one > doesn't mutate the object and returns a list, ...) when there's no real > added value wrt ONE classmethod of list...? I agree that the different semantics for lists and dicts are a strike against this. The argument for it is that walking over a dictionary in sorted order is (at least to me) a missing idiom in python. Does this never come up when you're teaching the language? I wouldn't advocate adding this to other types (e.g. Set) because they're much less commonly used than dicts, so I don't think there's a danger of a creeping plague of sort methods. Not a big deal though - list.sorted() is the real win. 
Mark Russell PS: I'm really not an anal-retentive keystroke counter :-) From mwh at python.net Tue Oct 21 07:41:00 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 07:41:06 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Mon, 20 Oct 2003 10:48:00 -0700") References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <2mad7uq0mr.fsf@starship.python.net> Guido van Rossum writes: > I don't recall what I said then. Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? A problem with such code irrespective of anything else is that it fails when R is empty. Cheers, mwh -- Whaaat? That is the most retarded thing I have seen since, oh, yesterday -- Kaz Kylheku, comp.lang.lisp From aleaxit at yahoo.com Tue Oct 21 07:55:02 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 07:55:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066735096.18849.33.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310211300.42187.aleaxit@yahoo.com> <1066735096.18849.33.camel@straylight> Message-ID: <200310211355.02326.aleaxit@yahoo.com> On Tuesday 21 October 2003 01:18 pm, Mark Russell wrote: > On Tue, 2003-10-21 at 12:00, Alex Martelli wrote: > > Why should multiple data types acquire separate .sort methods with > > subtly different semantics (one works in-place and returns None, one > > doesn't mutate the object and returns a list, ...) when there's no real > > added value wrt ONE classmethod of list...? > > I agree that the different semantics for lists and dicts are a strike > against this. The argument for it is that walking over a dictionary in > sorted order is (at least to me) a missing idiom in python. Does this It's a frequently used idiom (actually more than one) -- it's not "missing".
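(The "more than one" idiom in question — walking a dict sorted by key versus by value — in the sorted()/key= spelling that Python eventually settled on in 2.4; a sketch, not code from the thread:)

```python
freq = {"the": 12, "cat": 3, "sat": 3, "mat": 1}

# alphabetically, by key
by_key = [(w, freq[w]) for w in sorted(freq)]

# most frequent first, by value -- using the key= construct
by_freq = sorted(freq.items(), key=lambda item: item[1], reverse=True)

print(by_key[0])   # first word alphabetically
print(by_freq[0])  # most frequent word
```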
> never come up when you're teaching the language? Sure, and I have a good time explaining that half the time you want to sort on KEYS and half the time on VALUES. An example I often use is building and displaying a word-frequency index: now it's pretty obvious that you may want to display it just as easily by frequency (most frequent words first) OR alphabetically. The key= construct IS a huge win, btw. I just wish there WAS an easier way to express the TYPICAL keys one wants to use than lambda x: x[N] for some N or lambda x: x.A for some A. getattr and operator.getitem are no use, alas, even when curried, because they take x first:-(. I'd rather not teach lambda (at least surely not early on!) so I'll end up with lots of little def's (whose need had sharply decreased with list comprehensions, as map and filter moved into a corner to gather dust). Ah well. > I wouldn't advocate adding this to other types (e.g. Set) because > they're much less commonly used than dicts, so I don't think there's a Actually, I was thinking of presenting them BEFORE dicts next time I have an opportunity of teaching Python from scratch. They ARE simpler and more fundamental, after all. > danger of a creeping plague of sort methods. Not a big deal though - > list.sorted() is the real win. I concur. > Mark Russell > > PS: I'm really not an anal-retentive keystroke counter :-) OK, sorry for the digs, it just _looked_ that way for a sec;-).
Alex From mwh at python.net Tue Oct 21 07:59:50 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 07:59:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Mon, 20 Oct 2003 09:37:17 -0700") References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> Message-ID: <2m65iipzrd.fsf@starship.python.net> Guido van Rossum writes: >> On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: >> > > We are indeed sure (sadly) that list comprehensions leak control variable >> > > names. >> > >> > But they shouldn't. It can be fixed by renaming them (e.g. numeric >> > names with a leading dot). >> >> Hmmm, sorry? >> >> >>> [.2 for .2 in range(3)] >> SyntaxError: can't assign to literal >> >> I think I don't understand what you mean. > > I meant that the compiler should rename it. Implementing this might be entertaining. In particular what happens if the iteration variable is a local in the frame anyway? I presume that would inhibit the renaming, but then there's a potentially confusing dichotomy as to whether renaming gets done. Of course you could *always* rename, but then code like def f(x): r = [x+1 for x in range(x)] return r, x becomes even more incomprehensible (and changes in behaviour). And what about horrors like [([x for x in range(10)],x) for x in range(10)] vs: [([x for x in range(10)],y) for y in range(10)] ? I suppose you could make a case for throwing out (or warning about) all these cases at compile time, but that would require significant effort as well (I think). Cheers, mwh -- This song is for anyone ... fuck it. Shut up and listen. 
-- Eminem, "The Way I Am" From ncoghlan at iinet.net.au Tue Oct 21 09:33:20 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 21 09:33:25 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201741.19295.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201601.08440.aleaxit@yahoo.com> <3F93F33C.9070702@iinet.net.au> <200310201741.19295.aleaxit@yahoo.com> Message-ID: <3F9535A0.9060803@iinet.net.au> Alex Martelli strung bits together to say: > On Monday 20 October 2003 04:37 pm, Nick Coghlan wrote: >> for x in sorted_copy of reversed_copy of my_list: > > Ooops -- sorting a reversed copy of my_list is just like sorting my_list... > I think > for x in sorted_copy(reverse=True) of my_list: > ... > (again borrowing brand-new keyword syntax from lists' sort method) is > likely to work better...:-) (slightly OT for this thread, but. . .) I got the impression that: l.sort(reverse=True) was stable w.r.t. items that sort equivalently, while: l.reverse() l.sort() was not. I.e. the "reverse" in the sort arguments refers to reversing the order of the arguments to the comparison operation, rather than to reversing the list. > However, if I had to choose, I would forego this VERY attractive syntax > sugar, and go for Greg's original suggestion -- 'of' for iterator > comprehensions only. Syntax sugar is all very well (at least in this case), > but if it _only_ amounts to a much neater-looking way of doing what is already > quite possible, it's a "more-than-one-way-to-do-itis". Yes - quite pretty, but ultimately confusing, I think (as a few people have pointed out). However, getting back to Greg's original point - that our goal is to find a syntax that does for "reduce" what list comprehensions did for "map" and "filter", I realised last night that this "of" syntax isn't it. The "of" syntax requires us to have an existing special operator to handle the accumulation (e.g. 
sum or max), whereas what reduce does is let us take an existing binary function (e.g. operator.add), and feed it a sequence element-by-element, accumulating the result. If we already have a method that can extract the result we want from a sequence, then list comprehensions and method calls are perfectly adequate. (starts thinking about this from the basics of what the goal is) So what we're really talking about is syntactic sugar for: y = 0 for x in xvalues: if (x > 0): y = y + (x*x) We want to be able to specify the object to iterate over, the condition for which elements to consider (filter), an arbitrary function involving the element (map), and the method we want to use to accumulate the elements (reduce). If we had a list comprehension: squares_of_positives = [x*x for x in xvalues if x > 0] the original unrolled form would have been: squares_of_positives = [] for x in xvalues: if (x > 0): squares_of_positives.append(x*x) So list comprehensions just fix the accumulation method (appending to the result list). So what we need is a way to describe how to accumulate the result, as well as the ability to initialise the cumulative result: y = y + x*x from y = 0 for x in xvalues if x > 0 Yuck. Looks like an assignment, but is actually an accumulation expression. Ah, how about: y + x*x from y = 0 for x in xvalues if x > 0 The 'from' clause identifies the accumulation variable, in just the same way the 'for' clause identifies the name of the current value from the iterable. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."
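(What the proposed clause abbreviates, next to the reduce() spelling it would sugar — an illustrative sketch, not code from the thread:)

```python
from functools import reduce  # reduce was a builtin in the Python of 2003

xvalues = [3, -1, 4, -1, 5]

# the explicit loop the proposed syntax would abbreviate
y = 0
for x in xvalues:
    if x > 0:
        y = y + x * x

# the same accumulation via reduce over the filtered/mapped sequence
y2 = reduce(lambda acc, x: acc + x * x, [x for x in xvalues if x > 0], 0)

assert y == y2
print(y)  # 9 + 16 + 25 = 50
```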
From ncoghlan at iinet.net.au Tue Oct 21 09:41:39 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 21 09:41:44 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> Message-ID: <3F953793.1000208@iinet.net.au> Guido van Rossum strung bits together to say: > except compactness of expression. How frequent is > > result = 0.0 > for x in S: > result += x**2 > > ??? > > (I've already said my -1 about your 'sum of ...' proposal.) Just so this suggestion doesn't get buried in the part of the thread where I was getting rather carried away about Greg's 'of' syntax (sorry!). What about: result + x**2 from result = 0.0 for x in S Essentially short for: result = 0.0 for x in S: result = result + x**2 Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From skip at pobox.com Tue Oct 21 09:43:41 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 09:43:51 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <20031020143056.GE28665@frobozz> Message-ID: <16277.14349.844999.220166@montanaro.dyndns.org> Terry> "Andrew Bennetts" wrote in message >> ... I think the lazy iteration syntax approach was probably a better >> idea. I don't like the proposed use of "yield" to signify it, though >> -- >> "yield" is a flow control statement, so the examples using it in this >> thread look odd to me. Terry> Same here. And probably contributed to my initial confusion about what the proposed construct was supposed to do.
(I'm still not keen on it, but at least I understand it better.) Skip From skip at pobox.com Tue Oct 21 09:57:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 09:57:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <16277.15186.392757.583785@montanaro.dyndns.org> >> [Alex measures speed improvements] Guido> Great! This is a plus for iterator comprehensions (we need a Guido> better term BTW). Here's an alternate suggestion. Instead of inventing new syntax, why not change the semantics of list comprehensions to be lazy? They haven't been in use that long, and while they are popular, the semantic tweakage would probably cause minimal disruption. In situations where laziness wasn't wanted, the most that a particular use would have to change (I think) is to pass it to list(). Skip From exarkun at intarweb.us Tue Oct 21 10:28:16 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Tue Oct 21 10:28:36 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <2mad7uq0mr.fsf@starship.python.net> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: <20031021142816.GA25455@intarweb.us> On Tue, Oct 21, 2003 at 12:41:00PM +0100, Michael Hudson wrote: > Guido van Rossum writes: > > > I don't recall what I said then. Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? > > A problem with such code irrespective of anything else is that it > fails when R is empty. > Not when x is properly initialized. 
Anyway, this is no different from the problem of: for x in R: ... print x In any case, are there plans to also have the compiler emit warnings about potential reliance on this feature? Jp -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/31c2666d/attachment.bin From tjreedy at udel.edu Tue Oct 21 10:29:32 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Oct 21 10:29:40 2003 Subject: [Python-Dev] Re: listcomps vs. for loops References: <20031020173134.GA29040@panix.com><200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: "Michael Hudson" wrote in message news:2mad7uq0mr.fsf@starship.python.net... > Guido van Rossum writes: > > > I don't recall what I said then. Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? Someone more-or-less did -- in the tutorial. See bottom below. > A problem with such code irrespective of anything else is that it > fails when R is empty. Same would be true of for loops, except that typical after-for usage, such as searching for item in list, has else clause to set control variable to default in 'not found' cases, which include empty lists. The Ref Manual currently says nothing about leakage or overwriting. That should make leakage fair game for plugging. On the other hand, Tutorial 5.1.4 List Comprehensions says: ''' To make list comprehensions match the behavior of for loops, assignments to the loop variable remain visible outside of the comprehension: >>> x = 100 # this gets overwritten >>> [x**3 for x in range(5)] [0, 1, 8, 27, 64] >>> x # the final value for range(5) 4 ''' (Pointed out by John Roth in response to my c.l.py posting.) I have added note to SF 827209. Terry J. 
Reedy From mwh at python.net Tue Oct 21 10:45:00 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 10:45:08 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <20031021142816.GA25455@intarweb.us> (Jp Calderone's message of "Tue, 21 Oct 2003 10:28:16 -0400") References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> Message-ID: <2mwuayodjn.fsf@starship.python.net> Jp Calderone writes: > On Tue, Oct 21, 2003 at 12:41:00PM +0100, Michael Hudson wrote: >> Guido van Rossum writes: >> >> > I don't recall what I said then. Did I say it was a feature that >> > >> > L = [x for x in R] >> > print x >> > >> > would print the last item of R? >> >> A problem with such code irrespective of anything else is that it >> fails when R is empty. >> > > Not when x is properly initialized. Obviously. > Anyway, this is no different from the > problem of: > > for x in R: > ... > print x Well, yes. I still think it's dubious code. > In any case, are there plans to also have the compiler emit warnings about > potential reliance on this feature? I would hope that we wouldn't make changes without emitting such a warning. I'm not sure how hard it would be to implement, tho'. (It would be /nice/ to implement a warning whenever there's a possibility of the UnboundLocalError exception, but that *definitely* requires control flow analysis and that is *definitely* a heap of work, unless the ast-branch gets some attention). Cheers, mwh -- We did requirements and task analysis, iterative design, and user testing. You'd almost think programming languages were an interface between people and computers. 
-- Steven Pemberton (one of the designers of Python's direct ancestor ABC) From aleaxit at yahoo.com Tue Oct 21 11:49:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 11:49:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.15186.392757.583785@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> Message-ID: <200310211749.21152.aleaxit@yahoo.com> On Tuesday 21 October 2003 03:57 pm, Skip Montanaro wrote: > >> [Alex measures speed improvements] > > Guido> Great! This is a plus for iterator comprehensions (we need a > Guido> better term BTW). > > Here's an alternate suggestion. Instead of inventing new syntax, why not > change the semantics of list comprehensions to be lazy? They haven't been > in use that long, and while they are popular, the semantic tweakage would > probably cause minimal disruption. In situations where laziness wasn't > wanted, the most that a particular use would have to change (I think) is to > pass it to list(). Well, yes, the _most_ one could ever have to change is move from [ ... ] to list([ ... ]) to get back today's semantics. But any use NOT so changed may break, in general; any perfectly correct program coded with Python 2.1 to Python 2.3 -- several years' worth of "current Python", by the time 2.4 comes out -- might break. I think we should keep the user-observable semantics as now, BUT maybe an optimization IS possible if all the user code does with the LC is loop on it (or anyway just get its iter(...) and nothing else). Perhaps a _variant_ of "The Trick" MIGHT be practicable (since I don't believe the "call from C holding just one ref" IS a real risk here). Again it would be based on reference-count being 1 at a certain point. The LC itself _might_ just build a generator and wrap it in a "pseudolist" object.
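A rough pure-Python sketch of that idea follows; the real proposal lives at the C level (the tp_iter slot plus a reference count of one, which pure Python cannot safely observe), so the class and method names here are purely illustrative:

```python
class PseudoList:
    """Wrap a generator; build the real list only when some operation
    actually needs list behaviour."""

    def __init__(self, gen):
        self._gen = gen
        self._list = None

    def _unfold(self):
        # Materialize once; every later operation reuses the real list.
        if self._list is None:
            self._list = list(self._gen)
        return self._list

    def __iter__(self):
        # The C-level trick would hand out the raw generator only when
        # the refcount proves nobody else holds the wrapper; that test
        # has no safe pure-Python equivalent, so we always unfold.
        return iter(self._unfold())

    def __len__(self):
        return len(self._unfold())

    def __getitem__(self, i):
        return self._unfold()[i]


x = PseudoList(a * a for a in [1, 2, 3, 4])
first = [y for y in x]    # iter(x) must not throw the data away...
second = [z for z in x]   # ...because x is iterated again here
assert first == second == [1, 4, 9, 16]
```

Note the sketch materializes the list on every operation, including iteration; the refcount-of-one test that would let a single iteration stay lazy is exactly the part that cannot be expressed above the C level.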
Said pseudolist object, IF reacting to a tp_iter when its reference count is one, NEED NOT "unfold" itself. But for ANY other operation, it must generate the real list and "get out of the way" as much as possible. Note that this includes a tp_iter WITH rc>1. For example: x = [ a.strip().upper() for a in thefile if len(a)>7 ] for y in x: blah(y) for z in x: bluh(z) the first 'for' implicitly calls iter(x) but that must NOT be allowed to "consume" thefile in a throwaway fashion -- because x can be used again later (e.g. in the 2nd for). This works fine today and has worked for years, and I would NOT like it to break in 2.4... if LC's had been lazy from the start (just as they are in Haskell), that would have been wonderful, but, alas, we didn't have the iterator protocol then...:-( As to whether the optimization is worth this complication, I dunno. I'd rather have "iterator literals", I think -- simpler and more explicit. That way when I see [x.bah() for x in someiterator] I KNOW the iterator is consumed right then and there, I don't need to look at the surrounding context... context-dependent semantics is not Python's most normal and usual approach, after all... Alex From pje at telecommunity.com Tue Oct 21 11:59:19 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 11:59:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.15186.392757.583785@montanaro.dyndns.org> References: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> At 08:57 AM 10/21/03 -0500, Skip Montanaro wrote: > >> [Alex measures speed improvements] > > Guido> Great!
This is a plus for iterator comprehensions (we need a > Guido> better term BTW). > >Here's an alternate suggestion. Instead of inventing new syntax, why not >change the semantics of list comprehensions to be lazy? They haven't been >in use that long, and while they are popular, the semantic tweakage would >probably cause minimal disruption. In situations where laziness wasn't >wanted, the most that a particular use would have to change (I think) is to >pass it to list(). If you make it a list that's lazy, it doesn't lose the memory allocation overhead for the list. If I understand Alex's benchmarks, making a lazy list would end up being *slower* than list comprehension is now. I previously proposed a different solution earlier in this thread, where you get a pseudo-list that, if iterated, runs the underlying generator function. But there were issues with possible side-effects (not to mention reiterability) of the underlying iterator on which the comprehension was based. From theller at python.net Tue Oct 21 12:26:13 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 12:26:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Fri, 17 Oct 2003 11:40:53 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> <4qy7lnuc.fsf@python.net> <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: [about making _socket a builtin module instead of an extension] >> > Long ago, when I first set up the VC5 project, there were still some >> > target systems out there that didn't have a working winsock DLL, and >> > "import socket" or "import select" would fail there for that reason. >> > If this is no longer a problem, I'm +1 on this. >> >> Not on the sytems that I work on. 
To be double sure, _socket could be >> rewritten to load the winsock dll dynamically. And maybe this becomes >> an issue again if IPv6 is compiled in. > > I'd rather not have more Windows-specific cruft in the socket and > select module source code -- they are bad enough already. Dynamically > loading winsock probably would mean that every call into it has to be > coded differently, right? Yes. Yet another approach would be to use the delay_load feature of MSVC, it allows dynamic loading of the dlls at runtime, ideally without changing the source code. So far I have never tried that, does anyone know if this really works? Thomas From fdrake at acm.org Tue Oct 21 12:29:42 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 21 12:30:01 2003 Subject: [Python-Dev] Expat 1.95.7 in Python 2.3.x? Message-ID: <16277.24310.600396.856699@grendel.zope.com> I released Expat 1.95.7 yesterday, and updated the Python and PyXML projects to use the new version. It fixes a number of bugs in Expat as well as cleaning up some build issues that caused Python and PyXML to ship a slightly modified version. (It may also prove a little faster in some applications, since it's now using a string hash function based on Python's.) I'd be interested in hearing if there are any objections to updating the Python 2.3.x maintenance tree to also use the new version. I think it's safe (no new features) and allows us to ship an unmodified Expat. Please let me know if you know of any objections. -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From skip at pobox.com Tue Oct 21 12:34:24 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 12:34:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211749.21152.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> Message-ID: <16277.24592.805548.835843@montanaro.dyndns.org> >> Here's an alternate suggestion. Instead of inventing new syntax, why >> not change the semantics of list comprehensions to be lazy? Alex> Well, yes, the _most_ one could ever have to change is move from [ Alex> ... ] to list([ ... ]) to get back today's semantics. But any use Alex> NOT so changed may break, in general; any perfectly correct Alex> program coded with Python 2.1 to Python 2.3 -- several years' Alex> worth of "current Python", by the time 2.4 comes out -- might Alex> break. I understand all that. Still, the "best" syntax for these so-called iterator comprehensions might have been the current list comprehension syntax. I don't know how hard it would be to fix existing code, probably not a massive undertaking, but the bugs lazy list comprehensions introduced would probably be a bit subtle. Let's perform a little thought experiment. We already have the current list comprehension syntax and the people thinking about lazy list comprehensions seem to be struggling a bit to find syntax for them which doesn't appear cobbled together. Direct your attention to Python 3.0 where one of the things Guido has said he would like to do is to eliminate some bits of the language he feels are warts. Given two similar language constructs implementing two similar sets of semantics, I'd have to think he would like to toss one of each.
The list comprehension syntax seems the more obvious (to me) syntax to keep while it would appear there are some advantages to the lazy list comprehension semantics (enumerate (parts of) infinite sequences, better memory usage, some performance improvements). I don't know when 3.0 alpha will (conceptually) become the CVS trunk. Guido may not know either, but it is getting nearer every day. Unless he likes one of the proposed new syntaxes well enough to conclude now that he will keep both syntaxes and both sets of semantics in 3.0, I think we should look at other alternatives which don't introduce new syntax, including morphing list comprehensions into lazy list comprehensions or leaving lazy list comprehensions out of the language, at least in 2.x. As I think people learned when considering ternary operators and switch statements, adding constructs to the language in a Pythonic way is not always possible, no matter how compelling the feature might be. In those situations it makes sense to leave the construct out for now and see if syntax restructuring in 3.0 will make addition of such desired features possible. Anyone for [x for x in S]L ? Skip From aleaxit at yahoo.com Tue Oct 21 12:41:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:41:52 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F953793.1000208@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> Message-ID: <200310211841.45711.aleaxit@yahoo.com> On Tuesday 21 October 2003 03:41 pm, Nick Coghlan wrote: --- > What about: > > result + x**2 from result = 0.0 for x in S > > Essentially short for: > result = 0.0 > for x in S: > result = result + x**2 Not bad, but I'm not sure I like the strict limitation to "A = A + f(x)" forms (possibly with some other operator in lieu of + etc, of course). 
Say I want to make a sets.Set out of the iterator, for example: result.union([ x**2 ]) from result = sets.Set() for x in theiter now that's deucedly _inefficient_, consarn it!, because it maps to a loop of: result = result.union([ x**2 ]) so I may be tempted to try, instead: real_result = sets.Set() real_result.union_update([ x**2 ]) from fake_result = None for x in theiter and hoping the N silly rebindings of fake_result to None cost me less than not having to materialize a list from theiter would cost if I did real_result = sets.Set([ x**2 for x in theiter ]) I don't think we should encourage that sort of thing with the "implicit assignment" in accumulation. So, if it's an accumulation syntax we're going for, I'd much rather find ways to express whether we want [a] no assignment at all (as e.g. for union_update), [b] plain assignment, [c] augmented assignment such as += or whatever. Sorry, no good idea comes to my mind now, but I _do_ think we'd want all three possibilities... Alex From aleaxit at yahoo.com Tue Oct 21 12:50:25 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.24592.805548.835843@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <200310211850.25376.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:34 pm, Skip Montanaro wrote: ... > would like to toss one of each. The list comprehension syntax seems the > more obvious (to me) syntax to keep while it would appear there are some > advantages to the lazy list comprehension semantics (enumerate (parts of) > infinite sequences, better memory usage, some performance improvements). Yes to both points. Hmmm...
> should look at other alternatives which don't introduce new syntax, > including morphing list comprehensions into lazy list comprehensions or ...as long as this can be done WITHOUT breaking a ton of my code... > leaving lazy list comprehensions out of the language, at least in 2.x. As Eeek. Maybe. Sigh. 3 years or so (best case, assuming 2.4 is the last of the 2.*'s) before I can teach and deploy lazy comprehensions?-( Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-) Alex From guido at python.org Tue Oct 21 12:53:38 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:53:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 17:49:21 +0200." <200310211749.21152.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> Message-ID: <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> [Skip] > > Here's an alternate suggestion. Instead of inventing new syntax, > > why not change the semantics of list comprehensions to be lazy? > > They haven't been in use that long, and while they are popular, > > the semantic tweakage would probably cause minimal disruption. In > > situations where laziness wasn't wanted, the most that a > > particular use would have to change (I think) is to pass it to > > list(). Sorry, too late. You're hugely underestimating the backwards compatibility issues. And they have been in use at least since 2000 (they were introduced in 2.0). [Alex] > I think we should keep the user-observable semantics as now, BUT > maybe an optimization IS possible if all the user code does with the > LC is loop on it (or anyway just get its iter(...) and nothing else). But that's not very common, so I don't see the point of putting in the effort, plus it's not safe. 
Using a LC as the sequence of a for loop is ugly, and usually for x in [y for y in S if P(y)]: ... means the same as for x in S: if P(x): ... except when it doesn't, and then making the list comprehension lazy can be a mistake: the following example for key in [k for k in d if d[k] is None]: del d[key] is *not* the same as for key in d: if d[key] is None: del d --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 12:56:31 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:56:41 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 15:45:00 BST." <2mwuayodjn.fsf@starship.python.net> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> Message-ID: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> > > Anyway, this is no different from the > > problem of: > > > > for x in R: > > ... > > print x > > Well, yes. I still think it's dubious code. > > > In any case, are there plans to also have the compiler emit > > warnings about potential reliance on this feature? > > I would hope that we wouldn't make changes without emitting such a > warning. I'm not sure how hard it would be to implement, tho'. Warning about what? I have no intent to make the example quoted above illegal; a regular for loop control variable's scope will extend beyond the loop. It's only list comprehensions where I plan to remove x from the scope after the comprehension is finished. Do you need a warning for that change too? Code that relies on it is pretty sick IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 12:57:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:58:02 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> Message-ID: <200310211857.57783.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:26 pm, Thomas Heller wrote: ... > Yet another approach would be to use the delay_load feature of MSVC, it > allows dynamic loading of the dlls at runtime, ideally without changing > the source code. > > So far I have never tried that, does anyone know if this really works? Yes, back when I was in think3 we experimented extensively as soon as it was available (still in a beta of -- I don't recall if it was the SDK or VStudio 6), and except for a few specific libraries that gave some trouble (MSVCRT.DLL and the MFC one, only -- I think because they did something to the memory allocation mechanisms, MSVCRT having it and MFC changing it -- perhaps it was because we were ALSO using other memory-related tools in DLL's, e.g. leak-detectors), it always worked smoothly and "spread" the load, making the app startup faster. So we set the two DLL's that gave us trouble for load and startup and the rest for delayed load and lived happily ever after (I don't even recall exactly HOW we did that, it WAS years ago...). Alex From guido at python.org Tue Oct 21 12:58:23 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:58:34 2003 Subject: [Python-Dev] Re: listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 10:29:32 EDT." References: <20031020173134.GA29040@panix.com><200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: <200310211658.h9LGwN124275@12-236-54-216.client.attbi.com> > > > I don't recall what I said then. Did I say it was a feature that > > > > > > L = [x for x in R] > > > print x > > > > > > would print the last item of R? > > Someone more-or-less did -- in the tutorial. See bottom below. Oh bah! > > A problem with such code irrespective of anything else is that it > > fails when R is empty. 
> > Same would be true of for loops, except that typical after-for usage, > such as searching for item in list, has else clause to set control > variable to default in 'not found' cases, which include empty lists. The regular for loop won't change. > The Ref Manual currently says nothing about leakage or overwriting. > That should make leakage fair game for plugging. Unfortunately the Ref Manual is notoriously incomplete. > On the other hand, Tutorial 5.1.4 List Comprehensions says: > ''' > To make list comprehensions match the behavior of for loops, > assignments to the loop variable remain visible outside of the > comprehension: > > >>> x = 100 # this gets overwritten > >>> [x**3 for x in range(5)] > [0, 1, 8, 27, 64] > >>> x # the final value for range(5) > 4 > ''' > (Pointed out by John Roth in response to my c.l.py posting.) > I have added note to SF 827209. Sigh. What a bummer to put this in a tutorial. :-( But it won't stop me from deprecating the feature. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 13:04:14 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 13:05:57 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:41:39 +1000." <3F953793.1000208@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> Message-ID: <200310211704.h9LH4E324322@12-236-54-216.client.attbi.com> > What about: > > result + x**2 from result = 0.0 for x in S > > Essentially short for: > result = 0.0 > for x in S: > result = result + x**2 You're kidding right? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 13:24:25 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 13:24:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> Message-ID: <200310211924.25711.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:53 pm, Guido van Rossum wrote: ... > > maybe an optimization IS possible if all the user code does with the > > LC is loop on it (or anyway just get its iter(...) and nothing else). > > But that's not very common, so I don't see the point of putting in the It IS common, at least in the code I write, e.g.: d = dict([ (f(a), g(a)) for a in S ]) s = sets.Set([ a*a for a in S ]) totsq = sum([ x*x for x in S ]) etc. I detest the look of those ([ ... ]), but that's the closest I get to dict comprehensions, set comprehensions, etc. > except when it doesn't, and then making the list comprehension lazy > can be a mistake: the following example > > for key in [k for k in d if d[k] is None]: > del d[key] > > is *not* the same as > > for key in d: > if d[key] is None: > del d Well, no, but even if that last statement was "del d[key]" you'd still be right:-). Even in a situation where the list comp is only looped over once, code MIGHT still be relying on the LC having "snapshotted" and/or exhausted iterators IT uses. I was basically thinking of passing the LC as argument to something -- the typical cases where I use LC now and WISH they were lazy, as above -- rather about for loops. And even when the LC _is_ an argument there might be cases where its current strict (nonlazy) semantics are necessary. Oh well! 
Alex From exarkun at intarweb.us Tue Oct 21 13:31:25 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Tue Oct 21 13:31:44 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> Message-ID: <20031021173125.GA27127@intarweb.us> On Tue, Oct 21, 2003 at 09:56:31AM -0700, Guido van Rossum wrote: > > > Anyway, this is no different from the > > > problem of: > > > > > > for x in R: > > > ... > > > print x > > > > Well, yes. I still think it's dubious code. > > > > > In any case, are there plans to also have the compiler emit > > > warnings about potential reliance on this feature? > > > > I would hope that we wouldn't make changes without emitting such a > > warning. I'm not sure how hard it would be to implement, tho'. > > Warning about what? > > I have no intent to make the example quoted above illegal; a regular > for loop control variable's scope will extend beyond the loop. > Sorry, my ordering could have been a little more clear. I only meant a warning for the list comprehension case. > [snip] > > Do you need a warning for that change too? Code that relies on it is > pretty sick IMO. > I agree, and I try never to write such code. But having Python point out any places I foolishly did so makes the job of fixing any bugs this change introduces into my code that much easier. It also serves to point out to people who *don't* realize how sick this construct is that a potentially large chunk of their software will break in Python X.Y (3.0?), where it will break, and why it will break. Jp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/59d36242/attachment.bin From aleaxit at yahoo.com Tue Oct 21 13:39:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 13:39:50 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> References: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> Message-ID: <200310211939.45800.aleaxit@yahoo.com> On Tuesday 21 October 2003 05:59 pm, Phillip J. Eby wrote: ... > If you make it a list that's lazy, it doesn't lose the memory allocation > overhead for the list. If I understand Alex's benchmarks, making a lazy > list would end up being *slower* than list comprehension is now. No, my benchmarks show that NOT having to "incarnate" the list, when all you do is loop on it, is a modest but repeatable win (20%-30% or so). Alex From fdrake at acm.org Tue Oct 21 14:04:38 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 21 14:04:52 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: References: Message-ID: <16277.30006.67329.572905@grendel.zope.com> fdrake@users.sourceforge.net writes: > Modified Files: > libplatform.tex > Log Message: > - make this section format > - start cleaning up the markup for consistency > - comment out the reference to a MS KnowledgeBase article that doesn't > seem to be present at msdn.microsoft.com; hopefully someone can > point out an alternate source for the relevant information I forgot to mention in the checkin message that this is *not* ready to be backported to the 2.3.x maintenance branch yet. I hope to make another substantial pass through this later this week. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From guido at python.org Tue Oct 21 14:08:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:08:31 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 11:13:41 +0200." <200310211113.41658.aleaxit@yahoo.com> References: <200310211113.41658.aleaxit@yahoo.com> Message-ID: <200310211808.h9LI8Kt24464@12-236-54-216.client.attbi.com> > Sure, arithmetic (including rounding) is what we most need. If we call > it Decimal or whatever, that may be preferable to Money, I don't know. Remember, a Decimal implementation following the IEEE 854 specs and Mike Cowlishaw's design and tests exists in the nondist part of the Python source tree, thanks to Eric Pierce (and some early work by Aahz, and encouraging words by Tim Peters). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 14:09:33 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:09:56 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 09:43:31 +0200." <200310210943.31574.aleaxit@yahoo.com> References: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> <200310210943.31574.aleaxit@yahoo.com> Message-ID: <200310211809.h9LI9XV24477@12-236-54-216.client.attbi.com> > On Tuesday 21 October 2003 09:27 am, Greg Ewing wrote: > ... > > But maybe some other keyword could be added to ease any > > syntactic problems, such as "all" or "every": > > > > sum(all x*x for x in xlist) > > sum(every x*x for x in xlist) > > > > The presence of the extra keyword would then distinguish > > an iterator comprehension from the innards of a list > > comprehension. > > Heh, you ARE a volcano of cool syntactic ideas these days, Greg. > > As between them, to me 'all' sort of connotes 'all at once' while > 'every' connotes 'one by one' (so would a third possibility, 'each'); > so 'all' is the one I like least. 
> > Besides accumulators &c we should also think of normal loops: > > for a in all x*x for x in xlist: ... > > for a in every x*x for x in xlist: ... > > for a in each x*x for x in xlist: ... > > Of these three, 'every' looks best to me, personally. > > > Alex I'd rather reserve these keywords for conditions using quantifiers, like in ABC. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 14:11:39 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:11:46 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 20:43:06 +1300." <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> References: <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> Message-ID: <200310211811.h9LIBdD24492@12-236-54-216.client.attbi.com> > > But the executive summary remains: the generator wins because it > > doesn't have to materialize the whole list. > > But what would happen if the generator were replaced with > in-line code that computes the values and feeds them to > an accumulator object, such as might result from an > accumulator syntax that gets inline-expanded in the > same way as a list comp? I'd worry that writing an accumulator would become much less natural. The cool thing of iterators and generators is that you can write both the source (generator) and the destination (iterator consumer) as a simple loop, which is how you usually think about it. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Tue Oct 21 14:25:12 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 14:26:07 2003 Subject: [Python-Dev] buildin vs.
shared modules In-Reply-To: <200310211857.57783.aleaxit@yahoo.com> (Alex Martelli's message of "Tue, 21 Oct 2003 18:57:57 +0200") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> Message-ID: Alex Martelli writes: > On Tuesday 21 October 2003 06:26 pm, Thomas Heller wrote: > ... >> Yet another approach would be to use the delay_load feature of MSVC, it >> allows dynamic loading of the dlls at runtime, ideally without changing >> the source code. >> >> So far I have never tried that, does anyone know if this really works? > > Yes, back when I was in think3 we experimented extensively as soon as > it was available (still in a beta of -- I don't recall if it was the SDK or > VStudio 6), and except for a few specific libraries that gave some trouble > (MSVCRT.DLL and the MFC one, only -- I think because they did > something to the memory allocation mechanisms, MSVCRT having it > and MFC changing it -- perhaps it was because we were ALSO using > other memory-related tools in DLL's, e.g. leak-detectors), it always worked > smoothly and "spread" the load, making the app startup faster. So we > set the two DLL's that gave us trouble for load and startup and the rest > for delayed load and lived happily ever after (I don't even recall exactly > HOW we did that, it WAS years ago...). After installing MSVC6 on a win98 machine, where I could rename wsock32.dll away (which was not possible on XP due to file system protection), I was able to change socketmodule.c to use delay loading of the winsock dll. I had to wrap up the WSAStartup() call inside a __try {} __except {} block to catch the exception thrown. With this change, _socket (and maybe also select) could then also be converted into builtin modules. Guido, what do you think? Thomas PS: Here's the exception raised when loading of wsock32.dll fails: >>> import _socket Traceback (most recent call last): File "<stdin>", line 1, in ?
ImportError: WSAStartup failed: error code -1066598274 and here's the tiny patch: Index: socketmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/socketmodule.c,v retrieving revision 1.271.6.5 diff -c -r1.271.6.5 socketmodule.c *** socketmodule.c 20 Oct 2003 14:34:47 -0000 1.271.6.5 --- socketmodule.c 21 Oct 2003 18:21:39 -0000 *************** *** 3381,3387 **** WSADATA WSAData; int ret; char buf[100]; ! ret = WSAStartup(0x0101, &WSAData); switch (ret) { case 0: /* No error */ Py_AtExit(os_cleanup); --- 3381,3391 ---- WSADATA WSAData; int ret; char buf[100]; ! __try { ! ret = WSAStartup(0x0101, &WSAData); ! } __except (ret = GetExceptionCode(), EXCEPTION_EXECUTE_HANDLER) { ! ; ! } switch (ret) { case 0: /* No error */ Py_AtExit(os_cleanup); From aahz at pythoncraft.com Tue Oct 21 14:57:49 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 21 14:57:54 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> Message-ID: <20031021185748.GA18869@panix.com> On Tue, Oct 21, 2003, Guido van Rossum wrote: > > It's only list comprehensions where I plan to remove x from the scope > after the comprehension is finished. > > Do you need a warning for that change too? Code that relies on it is > pretty sick IMO. Yes, it's sick, but since you made clear previously that listcomps have semantics equivalent to the corresponding for loop, I wouldn't be surprised to discover that someone converted a for loop to a listcomp without fixing that sickness. So yes, it needs a warning.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Tue Oct 21 14:58:55 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 14:59:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <004601c39805$60dd9a60$e841fea9@oemcomputer> [Skip Montanaro] > I understand all that. Still, the "best" syntax for these so-called > iterator comprehensions might have been the current list comprehension > syntax. Skip is right about returning to the basics. Before considering some of the wild syntaxes that have been submitted, I suggest re-examining the very first proposal with brackets and yield. At one time, I got a lot of feedback on this from comp.lang.python. Just about everyone found the brackets to be helpful and not misleading, the immediate presence of "yield" was more than enough to signal that an iterator was being returned instead of a list: g = [yield (len(line),line) for line in file if len(line)>5] This syntax is instantly learnable from existing knowledge about list comprehensions and generators. The advantage of a near zero learning curve should not be easily dismissed. Also, this syntax makes it trivially easy to convert an existing list comprehension into an iterator comprehension if needed to help the application scale-up or to improve performance.
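The bracketed-yield form quoted above never became legal syntax, but its intended meaning can be sketched with an ordinary generator function. This is a hypothetical expansion runnable on a modern Python; the sample lines are made up:

```python
def iter_comp(file):
    # Hypothetical expansion of the proposed (never-adopted) syntax:
    #   g = [yield (len(line), line) for line in file if len(line) > 5]
    for line in file:
        if len(line) > 5:
            yield (len(line), line)

lines = ["ok\n", "a longer line\n"]  # made-up sample data
g = iter_comp(lines)                 # nothing is computed until g is iterated
result = list(g)                     # only the long line survives the filter
```

The scale-up argument in the message holds for the sketch too: nothing is materialized until the consumer asks for it.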
Raymond Hettinger From FBatista at uniFON.com.ar Tue Oct 21 15:07:12 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 15:08:13 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Guido van Rossum wrote: #- Remember, a Decimal implementation following the IEEE 854 specs and #- Mike Cowlishaw's design and tests exists in the nondist part of the #- Python source tree, thanks to Eric Pierce (and some early work by #- Aahz, and encouraging words by Tim Peters). Meaning that I should extend/finish it or meaning that Money should not repeat that work and get specific with money issues? Can't find it in the CVS, specific path? Thank you! . Facundo WARNING The information contained in this message and any file attached to it is for the exclusive use of the addressee and may contain confidential or proprietary information, the disclosure of which is punishable by law. If you are not one of the named addressees or the person responsible for delivering this message to them, you are not authorized to disclose, copy, distribute, or retain the information (or any part of it) contained in this message. Please notify us by replying to the sender, then delete the original message and any copies (printed or stored on any magnetic medium) you may have made of it.
All opinions contained in this mail are the author's own and do not necessarily coincide with those of Telefónica Comunicaciones Personales S.A. or any associated company. Electronic messages can be altered, for which reason Telefónica Comunicaciones Personales S.A. will not accept any liability whatever the outcome of this message. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/c3d57d21/attachment-0001.html From ianb at colorstudy.com Tue Oct 21 15:08:30 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Oct 21 15:08:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: On Tuesday, October 21, 2003, at 01:58 PM, Raymond Hettinger wrote: > At one time, I got a lot of feedback on this from comp.lang.python. > Just about everyone found the brackets to be helpful and not > misleading, > the immediate presence of "yield" was more than enough to signal that > an iterator was being returned instead of a list: > > g = [yield (len(line),line) for line in file if len(line)>5] FWIW, that g is an iterator is *far* less surprising than the fact that yield turns a function into a generator. If it's okay that a yield in the body of a function change the function, why can't a yield in the body of a list comprehension change the list comprehension? It's a lot more noticeable, and people should know that "yield" signals something a little more tricky is going on. Also has good symmetry with the current meaning of yield. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From fdrake at acm.org Tue Oct 21 15:21:02 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Oct 21 15:21:14 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <16277.34590.271610.511919@grendel.zope.com> Batista, Facundo writes: > Can't find it in the CVS, specific path? The main body of the Python sources is in CVS as python/dist/src/; the rest is in python/nondist/. The decimal package is in python/nondist/sandbox/decimal/. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido at python.org Tue Oct 21 15:30:49 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:31:04 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 16:07:12 -0300." References: Message-ID: <200310211930.h9LJUnS24625@12-236-54-216.client.attbi.com> > #- Remember, a Decimal implementation following the IEEE 854 specs and > #- Mike Cowlishaw's design and tests exists in the nondist part of the > #- Python source tree, thanks to Eric Pierce (and some early work by > #- Aahz, and encouraging words by Tim Peters). > > Meaning that I should extend/finish it or meaning that Money should not > repeat that work and get specific with money issues? Meaning that you should use it if possible rather than reinventing that particular wheel. And yes, if the Decimal class still needs work, if you want to help fix it that would be great! --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 21 15:31:10 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 15:31:17 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Message-ID: [Guido] > Remember, a Decimal implementation following the IEEE 854 specs and > Mike Cowlishaw's design and tests exists in the nondist part of the > Python source tree, thanks to Eric Pierce ... s/Pierce/Price/ [Batista, Facundo] > Meaning that I should extend/finish it or meaning that Money should > not repeat that work and get specific with money issues?
Meaning that there's an existing body of work that's already been informed by years of design debate (IBM's proposed decimal standard), and an involved Python implementation of that. What happens next depends on who can make time to do something next. > Can't find it in the CVS, specific path? IBM's proposed standard: http://www2.hursley.ibm.com/decimal/ Eric's implementation: http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ From guido at python.org Tue Oct 21 15:36:40 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:36:53 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 14:58:55 EDT." <004601c39805$60dd9a60$e841fea9@oemcomputer> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> > Skip is right about returning to the basics. Before considering > some of the wild syntaxes that have been submitted, I suggest > re-examining the very first proposal with brackets and yield. > > At one time, I got a lot of feedback on this from comp.lang.python. > Just about everyone found the brackets to be helpful and not misleading, > the immediate presence of "yield" was more than enough to signal that > an iterator was being returned instead of a list: > > g = [yield (len(line),line) for line in file if len(line)>5] > > This syntax is instantly learnable from existing knowledge about > list comprehensions and generators. The advantage of a near zero > learning curve should not be easily dismissed. > > Also, this syntax makes it trivially easy to convert an existing > list comprehension into an iterator comprehension if needed to > help the application scale-up or to improve performance. -1. I expect that most iterator comprehensions (we need a better term!) are not stored in a variable but passed as an argument to something that takes an iterable, e.g.
sum(len(line) for line in file if line.strip()) I find that in such cases, the 'yield' distracts from what is going on by focusing attention on the generator (which is really just an implementation detail). We can quibble about whether double parentheses are needed, but this syntax is just so much clearer than the version with square brackets and yield, that there is no contest IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:37:42 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:37:52 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 14:57:49 EDT." <20031021185748.GA18869@panix.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> <20031021185748.GA18869@panix.com> Message-ID: <200310211937.h9LJbg624656@12-236-54-216.client.attbi.com> > > It's only list comprehensions where I plan to remove x from the scope > > after the comprehension is finished. > > > > Do you need a warning for that change too? Code that relies on it is > > pretty sick IMO. > > Yes, it's sick, but since you made clear previously that listcomps > semantics equivalent to the corresponding for loop, I wouldn't be > surprised to discover that someone converted a for loop to a listcomp > without fixing that sickness. So yes, it needs a warning. OK, fair enough. Someone update the doc bug report for this. Initially, we're just going to document it as deprecated behavior (or maybe "despised" behavior :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:40:56 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:41:04 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: Your message of "Tue, 21 Oct 2003 20:25:12 +0200." References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> Message-ID: <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> > After installing MSVC6 on a win98 machine, where I could rename > wsock32.dll away (which was not possible on XP due to file system > protection), I was able to change socketmodule.c to use delay loading of > the winsock dll. I had to wrap up the WSAStartup() call inside a > __try {} __except {} block to catch the exception thrown. > > With this change, _socket (and maybe also select) could then also be > converted into builtin modules. > > Guido, what do you think? I think now is a good time to try this in 2.4. I don't think I'd want to do this (or any of the proposed reorgs) in 2.3 though. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:46:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:46:52 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: Your message of "Tue, 21 Oct 2003 14:04:38 EDT." <16277.30006.67329.572905@grendel.zope.com> References: <16277.30006.67329.572905@grendel.zope.com> Message-ID: <200310211946.h9LJkKP24720@12-236-54-216.client.attbi.com> > > - comment out the reference to a MS KnowledgeBase article that doesn't > > seem to be present at msdn.microsoft.com; hopefully someone can > > point out an alternate source for the relevant information Bizarre. It seems MS has removed all traces of that article; I found lots of pointers to it in Google but they all point to the same dead link. Google's cache is your best bet... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From FBatista at uniFON.com.ar Tue Oct 21 15:48:39 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 15:49:35 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Guido van Rossum wrote: #- > Meaning that I should extend/finish it or meaning that #- Money should not #- > repeat that work and get specific with money issues? #- #- Meaning that you should use it if possible rather than reinventing #- that particular wheel. I'll study it and see if I can subclass it or something. #- And yes, if the Decimal class still needs work, if you want to help #- fix it that would be great! I'll do my best. . Facundo
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/6f710a96/attachment.html From guido at python.org Tue Oct 21 15:50:03 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:24:25 +0200." <200310211924.25711.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> <200310211924.25711.aleaxit@yahoo.com> Message-ID: <200310211950.h9LJo3i24741@12-236-54-216.client.attbi.com> > On Tuesday 21 October 2003 06:53 pm, Guido van Rossum wrote: > ... > > > maybe an optimization IS possible if all the user code does with the > > > LC is loop on it (or anyway just get its iter(...) and nothing else). > > > > But that's not very common, so I don't see the point of putting in the > > It IS common, at least in the code I write, e.g.: > > d = dict([ (f(a), g(a)) for a in S ]) > > s = sets.Set([ a*a for a in S ]) > > totsq = sum([ x*x for x in S ]) > > etc. I detest the look of those ([ ... ]), but that's the closest I get to dict comprehensions, set comprehensions, etc. OK, but you have very little hope of optimizing the incarnation away by the compiler (especially since our attempts at warning about surreptitious changes to builtins had to be withdrawn before 2.3 went out). > > except when it doesn't, and then making the list comprehension lazy > > can be a mistake: the following example > > > > for key in [k for k in d if d[k] is None]: > > del d[key] > > > > is *not* the same as > > > > for key in d: > > if d[key] is None: > > del d > > Well, no, but even if that last statement was "del d[key]" you'd still be right:-).
:-( > Even in a situation where the list comp is only looped over once, > code MIGHT still be relying on the LC having "snapshotted" and/or > exhausted iterators IT uses. I was basically thinking of passing the LC > as argument to something -- the typical cases where I use LC now and > WISH they were lazy, as above -- rather than about for loops. And even > when the LC _is_ an argument there might be cases where its current > strict (nonlazy) semantics are necessary. Oh well! Yes, this is why iterator comprehensions (we need a better term!!!) would be so cool to have (I think much cooler than conditional expressions). --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Tue Oct 21 15:52:23 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 15:52:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> At 12:36 PM 10/21/03 -0700, Guido van Rossum wrote: >I expect that most iterator comprehensions (we need a better term!) >are not stored in a variable but passed as an argument to something >that takes an iterable, e.g. Iterator expression? From guido at python.org Tue Oct 21 15:53:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:53:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 18:50:25 +0200." <200310211850.25376.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310211850.25376.aleaxit@yahoo.com> Message-ID: <200310211953.h9LJrHg24771@12-236-54-216.client.attbi.com> > Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-) Not until I can quit my job at ES and spend a year or so on PSF funds on it.
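The eager-versus-lazy hazard in the earlier del-while-iterating exchange can be demonstrated directly. A minimal check on a modern Python, using the generator-expression spelling that 2.4 eventually adopted for the lazy case:

```python
# Eager list comprehension: the matching keys are snapshotted before
# the loop body runs, so mutating the dict is safe.
d1 = {"a": None, "b": 1, "c": None}
for key in [k for k in d1 if d1[k] is None]:
    del d1[key]

# Lazy generator expression: the dict is still being iterated while the
# loop body deletes from it, which CPython rejects at runtime.
d2 = {"a": None, "b": 1, "c": None}
raised = False
try:
    for key in (k for k in d2 if d2[k] is None):
        del d2[key]
except RuntimeError:  # "dictionary changed size during iteration"
    raised = True
```

This is exactly why making existing list comprehensions silently lazy would have broken code that relies on the snapshot.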
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Oct 21 15:56:08 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 15:56:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: <16277.36696.353007.168363@montanaro.dyndns.org> Guido> I expect that most iterator comprehensions (we need a better Guido> term!) You didn't like "lazy list comprehensions"? Guido> We can quibble about whether double parentheses are needed, ... You haven't convinced me that you're not going to want to toss out one of the two comprehension syntaxes and only retain the lazy semantics in Py3k. If that's the case and the current list comprehension syntax is better than the current crop of proposals, why even add (lazy list|iterator) comprehensions now? Just make do without them until Py3k and make all list comprehensions lazy at that point. There will be enough other bullets to bite that this shouldn't be a big deal (many programs will probably require significant rewriting anyway). Skip From python at rcn.com Tue Oct 21 16:00:46 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 16:01:50 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310211703.h9LH3X124310@12-236-54-216.client.attbi.com> Message-ID: <000f01c3980e$047d7200$e841fea9@oemcomputer> It is clear now that tee() is a fundamental building block and that a C implementation has decisive advantages over its pure python counterpart. So ... tee() will become a standard itertools function in Py2.4. Raymond Hettinger From guido at python.org Tue Oct 21 16:02:01 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:02:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 11:34:24 CDT." 
<16277.24592.805548.835843@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> [Skip] > I understand all that. Still, the "best" syntax for these so-called > iterator comprehensions might have been the current list > comprehension syntax. I don't know how hard it would be to fix > existing code, probably not a massive undertaking, but the bugs lazy > list comprehensions introduced would probably be a bit subtle. > > Let's perform a little thought experiment. We already have the > current list comprehension syntax and the people thinking about lazy > list comprehensions seem to be struggling a bit to find syntax > for them which doesn't appear cobbled together. Direct your > attention to Python 3.0 where one of the things Guido has said he > would like to do is to eliminate some bits of the language he feels > are warts. Given two similar language constructs implementing two > similar sets of semantics, I'd have to think he would like to toss > one of each. The list comprehension syntax seems the more obvious > (to me) syntax to keep while it would appear there are some > advantages to the lazy list comprehension semantics (enumerate > (parts of) infinite sequences, better memory usage, some performance > improvements). > > I don't know when 3.0 alpha will (conceptually) become the CVS > trunk. Guido may not know either, but it is getting nearer every > day. Not necessarily. Maybe the time machine's stuck.
:-) > Unless he likes one of the proposed new syntaxes well enough to > conclude now that he will keep both syntaxes and both sets of > semantics in 3.0, I think we should look at other alternatives which > don't introduce new syntax, including morphing list comprehensions > into lazy list comprehensions or leaving lazy list comprehensions > out of the language, at least in 2.x. As I think people learned > when considering ternary operators and switch statements, adding > constructs to the language in a Pythonic way is not always possible, > no matter how compelling the feature might be. In those situations > it makes sense to leave the construct out for now and see if syntax > restructuring in > 3.0 will make addition of such desired features possible. > > Anyone for > > [x for x in S]L > > ? Thanks for trying to bang some sense into this. Personally, I still like the idea best to make (x for x in S) be an iterator comprehension and [x for x in S] syntactic sugar for the common operation list((x for x in S)) I'm not 100% sure about requiring the double parentheses, but I certainly want to require extra parentheses if there's a comma on either side, so that if we want to pass a 2-argument function a list comprehension, it will have to be parenthesized, e.g. foo((x for x in S), 42) bar(42, (x for x in S)) This makes me think that it's probably fine to also require sum((x for x in S)) Of course, multiple for clauses and if clauses are still supported just like in current list comprehensions; they add no new syntactic issues, except if we also were to introduce conditional expressions. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Tue Oct 21 16:03:33 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 16:03:48 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Tue, 21 Oct 2003 12:40:56 -0700") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> After installing MSVC6 on a win98 machine, where I could rename >> wsock32.dll away (which was not possible on XP due to file system >> protection), I was able to change socketmodule.c to use delay loading of >> the winsock dll. I had to wrap up the WSAStartup() call inside a >> __try {} __except {} block to catch the exception thrown. >> >> With this change, _socket (and maybe also select) could then also be >> converted into builtin modules. >> >> Guido, what do you think? > > I think now is a good time to try this in 2.4. I don't think I'd want > to do this (or any of the proposed reorgs) in 2.3 though. Yes, I understood this already. Thomas From fincher.8 at osu.edu Tue Oct 21 17:05:15 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Tue Oct 21 16:06:50 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310211705.15094.fincher.8@osu.edu> On Tuesday 21 October 2003 03:31 pm, Tim Peters wrote: > Eric's implementation: > > http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ Just out of curiosity, why isn't this distributed with Python? Jeremy From guido at python.org Tue Oct 21 16:06:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:07:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 12:59:50 BST." 
<2m65iipzrd.fsf@starship.python.net> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> Message-ID: <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> > > I meant that the compiler should rename it. > > Implementing this might be entertaining. In particular what happens > if the iteration variable is a local in the frame anyway? I presume > that would inhibit the renaming, but then there's a potentially > confusing dichotomy as to whether renaming gets done. Of course > you could *always* rename, but then code like > > def f(x): > r = [x+1 for x in range(x)] > return r, x > > becomes even more incomprehensible (and changes in behaviour). Here's the rule I'd propose for iterator comprehensions, which list comprehensions would inherit: [<expression> for <vars> in <iterable>] The variables in <vars> should always be simple variables, and their scope only extends to <expression>. If there's a variable with the same name in an outer scope (including the function containing the comprehension) it is not accessible (at least not by name) in <expression>. <iterable> is not affected. In comprehensions you won't be able to do some things you can do with regular for loops: a = [1,2] for a[0] in range(10): print a > And what about horrors like > > [([x for x in range(10)],x) for x in range(10)] > > vs: > > [([x for x in range(10)],y) for y in range(10)] > > ? > > I suppose you could make a case for throwing out (or warning about) > all these cases at compile time, but that would require significant > effort as well (I think). I think the semantics are crisply defined, users who write these deserve what they get (confusion and the wrath of their readers).
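The scoping rule proposed here is, in substance, what Python 3 eventually implemented: the comprehension gets its own scope, so the loop variable no longer leaks into (or clobbers) the surrounding function. A quick check on a modern interpreter:

```python
x = "outer"
squares = [x * x for x in range(5)]  # this x is local to the comprehension
leaked = x                           # still the outer binding, untouched
```

Under the pre-3.0 rules being debated in this thread, `x` would have been rebound to 4 after the comprehension ran.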
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Oct 21 16:16:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 16:16:28 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: <200310211705.15094.fincher.8@osu.edu> References: <200310211705.15094.fincher.8@osu.edu> Message-ID: <16277.37890.616759.734494@montanaro.dyndns.org> >> http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ Jeremy> Just out of curiosity, why isn't this distributed with Python? 'cuz it's still in the sandbox (the place where people can play with code ideas). Skip From FBatista at uniFON.com.ar Tue Oct 21 16:16:41 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 16:17:38 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Jeremy Fincher wrote: #- On Tuesday 21 October 2003 03:31 pm, Tim Peters wrote: #- > Eric's implementation: #- > #- > #- http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ #- #- Just out of curiosity, why isn't this distributed with Python? Nice question! I think testDecimal.py is not finishing well. At least that's where I'll start after studying the class itself. . Facundo
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/a810f090/attachment.html From tim.one at comcast.net Tue Oct 21 16:19:33 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:19:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: [Guido] > I expect that most iterator comprehensions (we need a better term!) > ... Well, calling it an iterator Aussonderungsaxiom would continue emphasizing the wrong thing <wink>. "Set comprehensions" in a programming language originated with SETL, and are named in honor of the set-theoretic Axiom of Comprehension (Aussonderungsaxiom). In its well-behaved form, that says roughly that given a set X, then for any predicate P(x), there exists a subset of X whose elements consist of exactly those elements x of X for which P(x) is true (in its ill-behaved form, it leads directly to Russell's Paradox -- the set of all sets that don't contain themselves). So "comprehension" emphasizes the "if" part of list comprehension syntax, which often isn't the most interesting thing. More interesting more often are (a) the computation done on the objects gotten from the for-iterator, and (b) that the results are generated one at a time.
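The one-result-at-a-time behavior Tim emphasizes is easy to see with the spelling Python 2.4 eventually adopted under the very name proposed here, "generator expression" (modern syntax shown):

```python
gen = (n * n for n in range(4))  # a generator expression: no squares computed yet
first = next(gen)                # each result is produced only on demand
rest = list(gen)                 # the remaining results, generated one at a time
```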
Put that all in a pot and stir, and the name "generator expression" seems natural and useful to me. In the Icon language, *all* expressions are generators, so maybe I'm biased by that. OTOH, "the results are generated one at a time" is close to plain English, and "generator expression" then brings to my mind an expression capable of delivering a sequence of results. Or you could call it an Orlijn flourish. From pf_moore at yahoo.co.uk Tue Oct 21 16:20:48 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Tue Oct 21 16:19:55 2003 Subject: [Python-Dev] Re: buildin vs. shared modules References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> Message-ID: <7k2ywden.fsf@yahoo.co.uk> Guido van Rossum writes: >> After installing MSVC6 on a win98 machine, where I could rename >> wsock32.dll away (which was not possible on XP due to file system >> protection), I was able to change socketmodule.c to use delay loading of >> the winsock dll. I had to wrap up the WSAStartup() call inside a >> __try {} __except {} block to catch the exception thrown. >> >> With this change, _socket (and maybe also select) could then also be >> converted into builtin modules. >> >> Guido, what do you think? > > I think now is a good time to try this in 2.4. I don't think I'd want > to do this (or any of the proposed reorgs) in 2.3 though. One (very mild) point - this is highly MSVC-specific. I don't know if there is ever going to be any interest in (for example) getting Python to build with Mingw/gcc on Windows, but there's no equivalent of this in Mingw (indeed, Mingw doesn't, as far as I know, support __try/__except either). But in the absence of anyone who is working on a Mingw build, this is pretty much irrelevant... Paul. 
-- This signature intentionally left blank
From tim.one at comcast.net Tue Oct 21 16:22:11 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:22:18 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: <200310211705.15094.fincher.8@osu.edu> Message-ID:

>> http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/

[Jeremy Fincher]
> Just out of curiosity, why isn't this distributed with Python?

Because it's not finished. Finish it, then ask again <wink>.
From skip at pobox.com Tue Oct 21 16:23:15 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 16:23:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> Message-ID: <16277.38323.854588.570453@montanaro.dyndns.org>

Guido> Here's the rule I'd propose for iterator comprehensions, which list
Guido> comprehensions would inherit:
Guido>     [ <expression> for <vars> in <iterable> ]
Guido> The variables in <vars> should always be simple variables, and
Guido> their scope only extends to <expression>. If there's a variable with
Guido> the same name in an outer scope (including the function
Guido> containing the comprehension) it is not accessible (at least not
Guido> by name) in <expression>. <iterable> is not affected.

I thought the definition for list comprehension syntax was something like

    '[' <expression>
        for <vars> in <iterable>
        [ for <vars> in <iterable> ] *
        [ if <condition> ] *
    ']'

The loop <vars> in an earlier for clause should be visible in all nested for clauses and conditional clauses, not just in the first <expression>.
Skip
From guido at python.org Tue Oct 21 16:33:13 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:33:22 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Tue, 21 Oct 2003 12:02:26 +0200." <200310211202.26677.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> <200310211202.26677.aleaxit@yahoo.com> Message-ID: <200310212033.h9LKXDk24952@12-236-54-216.client.attbi.com>

> Yes, if we specify an iter's __copy__ makes an independent iterator,
> which is surely the most useful semantics for it, then any weird iterator
> whose index is in fact mutable can copy not-quite-shallowly and offer
> the same useful semantics. I'm not sure where that leaves generator
> made iterators, which don't really know which parts of the state in their
> saved frame are "index", but having them just punt and refuse to copy
> themselves shallowly might be ok.

I thought we already established before that attempting to guess which parts of a generator function to copy and which parts to share is hopeless. Generator-made iterators won't be __copy__-able, period. I think this is the weakness of this cloning business, because it either makes generators second-class iterators, or it makes cloning a precarious thing to attempt when generators are used. (You can make a non-cloneable iterator cloneable by wrapping it into something that buffers just those items that are still reachable by clones, but this can still require arbitrary amounts of buffer space.) The problem is that using a generator as a filter in a pipeline of iterators makes the result non-cloneable, even if the underlying iterator is cloneable.
I'm thinking of situations like

    def odd(it):
        while True:
            it.next()
            yield it.next()

    it = odd(range(1000))
    it2 = clone(it)

Here we'd wish the result could be the same as that of

    tmp = range(1000)
    it = odd(tmp)
    it2 = odd(tmp)

but that can't be realized.

> Sure, it might. Perhaps the typical use case would be one in which
> an iterator gets deepcopied "incidentally" as part of the deepcopy
> of some other object which "happens" to hold an iterator; if
> iterators knew how to deepcopy themselves that would save some work
> on the part of the other object's author. No huge win, sure. But
> once the copy gets deep, generator-made iterators should also have
> no problem actually doing it, and that may be another middle-size
> win.

I still don't think deep-copying stack frames is a business I'd like to be in. Too many tricky issues. --Guido van Rossum (home page: http://www.python.org/~guido/)
From mcherm at mcherm.com Tue Oct 21 16:38:33 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 21 16:38:39 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <1066768713.3f959949dc764@mcherm.com>

Skip writes:
> I thought the definition for list comprehension syntax was something like
>
> '[' <expression>
>     for <vars> in <iterable>
>     [ for <vars> in <iterable> ] *
>     [ if <condition> ] *
> ']'

Nope:

>>> [x*y for x in 'aBcD' if x.islower() for y in range(4) if y%2]
['a', 'aaa', 'c', 'ccc']

-- Michael Chermside
From guido at python.org Tue Oct 21 16:46:42 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:47:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 15:52:23 EDT." <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> Message-ID: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com>

> Iterator expression?

Better.

> Or perhaps generator expression?
> To maintain the link with generator functions, since the underlying
> mechanism *will* be mostly the same.

Yes, I like that even better. BTW, while Alex has shown that a generator function with no free variables runs quite fast, a generator expression that uses variables from the surrounding scope will have to use the nested scopes machinery to access those, unlike a list comprehension; not only does this run slower, but it also slows down all other uses of that variable in the surrounding scope (because it becomes a "cell" throughout the scope). Someone could time how well

    y = 1
    sum([x*y for x in R])

fares compared to

    y = 1
    def gen():
        for x in R: yield y*y
    sum(gen())

for R in (range(N) for N in (100, 1000, 10000)). --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 16:50:16 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 14:56:08 CDT." <16277.36696.353007.168363@montanaro.dyndns.org> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> <16277.36696.353007.168363@montanaro.dyndns.org> Message-ID: <200310212050.h9LKoGM25025@12-236-54-216.client.attbi.com>

> Guido> I expect that most iterator comprehensions (we need a better
> Guido> term!)
>
> You didn't like "lazy list comprehensions"?

No, because list comprehensions are no longer the fundamental building blocks. Generator expression sounds good to me now.

> Guido> We can quibble about whether double parentheses are needed, ...
>
> You haven't convinced me that you're not going to want to toss out
> one of the two comprehension syntaxes and only retain the lazy
> semantics in Py3k.

Too many double negatives. :-) Right now I feel like keeping both syntaxes, but declaring list comprehensions syntactic sugar for list(generator expression).
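The timing experiment Guido sketches above can be written out directly in today's Python; here is a minimal stand-alone version (the function names are invented, and both variants compute x*y here, where the message as archived has the generator yield y*y, presumably a transcription slip):

```python
import timeit

R = list(range(100))

def with_listcomp():
    y = 1
    return sum([x * y for x in R])

def with_genfunc():
    y = 1
    def gen():
        # y is a free variable: accessed through the nested-scopes machinery
        for x in R:
            yield x * y
    return sum(gen())

# Both forms compute the same total.
assert with_listcomp() == with_genfunc() == sum(R)

# Rough relative timing; absolute numbers vary by machine.
t_lc = timeit.timeit(with_listcomp, number=2000)
t_gen = timeit.timeit(with_genfunc, number=2000)
print("listcomp %.4fs  genfunc %.4fs" % (t_lc, t_gen))
```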
> If that's the case and the current list comprehension syntax is
> better than the current crop of proposals, why even add (lazy
> list|iterator) comprehensions now? Just make do without them until
> Py3k and make all list comprehensions lazy at that point. There
> will be enough other bullets to bite that this shouldn't be a big
> deal (many programs will probably require significant rewriting
> anyway).

It's likely that generator expressions won't make it into Python 2.x for any x, just because of the effort to get the community to accept new syntax in general. --Guido van Rossum (home page: http://www.python.org/~guido/)
From pf_moore at yahoo.co.uk Tue Oct 21 16:30:41 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Tue Oct 21 16:50:48 2003 Subject: [Python-Dev] Re: prePEP: Money data type References: Message-ID: <3cdmwcy6.fsf@yahoo.co.uk>

"Tim Peters" writes:
> Meaning that there's an existing body of work that's already been informed
> by years of design debate (IBM's proposed decimal standard), and an involved
> Python implementation of that. What happens next depends on who can make
> time to do something next.

While I'm little more than an interested bystander, I'm not clear what *could* happen next. Can what's in nondist simply (!) be documented and migrated to the standard library? Is there a need for a C implementation (much like datetime started in Python but became C before release)? The module TODO comment just mentions "cleanup, hunt and kill bugs". So it certainly sounds like it's nearly there... Paul. -- This signature intentionally left blank
From guido at python.org Tue Oct 21 16:52:00 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:52:11 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 17:05:15 EDT."
<200310211705.15094.fincher.8@osu.edu> References: <200310211705.15094.fincher.8@osu.edu> Message-ID: <200310212052.h9LKq1G25047@12-236-54-216.client.attbi.com> > > http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ > > Just out of curiosity, why isn't this distributed with Python? Because it's not seen any actual usage, is AFAIK undocumented, and one can have quibbles about the API. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 21 16:55:01 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:55:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > BTW, while Alex has shown that a generator function with no free > variables runs quite fast, a generator expression that uses variables > from the surrounding scope will have to use the nested scopes > machinery to access those, unlike a list comprehension; not only does > this run slower, but it also slows down all other uses of that > variable in the surrounding scope (because it becomes a "cell" > throughout the scope). The implementation could synthesize a generator function abusing default arguments to give the generator's frame locals with the same names. From guido at python.org Tue Oct 21 16:55:37 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:55:51 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:19:33 EDT." References: Message-ID: <200310212055.h9LKtcC25068@12-236-54-216.client.attbi.com> > Well, calling it an iterator Aussonderungsaxiom would continue emphasizing > the wrong thing . > > "Set comprehensions" in a programming language originated with SETL, > and are named in honor of the set-theoretic Axiom of Comprehension > (Aussonderungsaxiom). 
In its well-behaved form, that says roughly > that given a set X, then for any predicate P(x), there exists a > subset of X whose elements consist of exactly those elements x of X > for which P(x) is true (in its ill-behaved form, it leads directly > to Russell's Paradox -- the set of all sets that don't contain > themselves). > > So "comprehension" emphasizes the "if" part of list comprehension > syntax, which often isn't the most interesting thing. More > interesting more often are (a) the computation done on the objects > gotten from the for-iterator, and (b) that the results are generated > one at a time. > > Put that all in a pot and stir, and the name "generator expression" > seems natural and useful to me. In the Icon language, *all* > expressions are generators, so maybe I'm biased by that. OTOH, "the > results are generated one at a time" is close to plain English, and > "generator expression" then brings to my mind an expression capable > of delivering a sequence of results. Thanks for an independent validation of "generator expressions"! It's a perfect term. > Or you could call it an Orlijn flourish. No, that term is already reserved for something else (the details of which I'll spare you, as they involve intimate details about toddler hygiene :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 16:57:03 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:57:16 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 15:23:15 CDT." 
<16277.38323.854588.570453@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> <16277.38323.854588.570453@montanaro.dyndns.org> Message-ID: <200310212057.h9LKv3425092@12-236-54-216.client.attbi.com>

> I thought the definition for list comprehension syntax was something like
>
> '[' <expression>
>     for <vars> in <iterable>
>     [ for <vars> in <iterable> ] *
>     [ if <condition> ] *
> ']'
>
> The loop <vars> in an earlier for clause should be visible in all nested for
> clauses and conditional clauses, not just in the first <expression>.

Absolutely, good point! --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 17:04:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 17:04:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:55:01 EDT." References: Message-ID: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com>

> [Guido]
> > ...
> > BTW, while Alex has shown that a generator function with no free
> > variables runs quite fast, a generator expression that uses variables
> > from the surrounding scope will have to use the nested scopes
> > machinery to access those, unlike a list comprehension; not only does
> > this run slower, but it also slows down all other uses of that
> > variable in the surrounding scope (because it becomes a "cell"
> > throughout the scope).

[Tim]
> The implementation could synthesize a generator function abusing default
> arguments to give the generator's frame locals with the same names.

Yes, I think that could work -- I see no way that something invoked by the generator expression could possibly modify a variable binding in the surrounding scope.
Argh, someone *could* pass around a copy of locals() and make an assignment into that. But I think we're already deprecating non-read-only use of locals(), so I'd like to ban that as abuse. --Guido van Rossum (home page: http://www.python.org/~guido/)
From aleaxit at yahoo.com Tue Oct 21 17:16:22 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 17:16:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> References: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> Message-ID: <200310212316.22749.aleaxit@yahoo.com>

On Tuesday 21 October 2003 10:46 pm, Guido van Rossum wrote:
> y = 1
> sum([x*y for x in R])
>
> fares compared to
>
> y = 1
> def gen():
>     for x in R: yield y*y
> sum(gen())

module a.py being:

    R = [range(N) for N in (10, 100, 10000)]

    def lc(R):
        y = 1
        sum([x*y for x in R])

    def gen1(R):
        y = 1
        def gen():
            for x in R: yield y*y
        sum(gen())

    def gen2(R):
        y = 1
        def gen(R=R, y=y):
            for x in R: yield y*y
        sum(gen())

I measure:

for N=10:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[0])'
100000 loops, best of 3: 12.3 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[0])'
100000 loops, best of 3: 10.4 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[0])'
100000 loops, best of 3: 9.7 usec per loop

for N=100:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[1])'
10000 loops, best of 3: 93 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[1])'
10000 loops, best of 3: 59 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[1])'
10000 loops, best of 3: 55 usec per loop

for N=10000:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[2])'
100 loops, best of 3: 9.4e+03 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[2])'
100 loops, best of 3: 5.6e+03 usec per loop
[alex@lancelot bo]$ timeit.py -c
-s'import a' 'a.gen2(a.R[2])'
100 loops, best of 3: 5.2e+03 usec per loop

I think it's well worth overcoming some "community resistance to new syntax" to get this kind of advantage easily. The trick of binding outer-scope variables as default args is neat but buys less than the pure idea of just using a generator rather than a list comprehension. Alex
From pedronis at bluewin.ch Tue Oct 21 17:33:30 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 17:31:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com> References: Message-ID: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch>

At 14:04 21.10.2003 -0700, Guido van Rossum wrote:
> > [Guido]
> > > ...
> > > BTW, while Alex has shown that a generator function with no free
> > > variables runs quite fast, a generator expression that uses variables
> > > from the surrounding scope will have to use the nested scopes
> > > machinery to access those, unlike a list comprehension; not only does
> > > this run slower, but it also slows down all other uses of that
> > > variable in the surrounding scope (because it becomes a "cell"
> > > throughout the scope).
>
>[Tim]
> > The implementation could synthesize a generator function abusing default
> > arguments to give the generator's frame locals with the same names.
>
>Yes, I think that could work -- I see no way that something invoked by
>the generator expression could possibly modify a variable binding in
>the surrounding scope.
so this, if I understand:

    def h():
        y = 0
        l = [1,2]
        it = (x+y for x in l)
        y = 1
        for v in it:
            print v

will print 1,2 and not 2,3 unlike:

    def h():
        y = 0
        l = [1,2]
        def gen(S):
            for x in S:
                yield x+y
        it = gen(l)
        y = 1
        for v in it:
            print v

From allison at sumeru.stanford.EDU Tue Oct 21 17:33:49 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Tue Oct 21 17:34:06 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: <200310211946.h9LJkKP24720@12-236-54-216.client.attbi.com> Message-ID:

Or Brewster Kahle's web archive, www.archive.org

On Tue, 21 Oct 2003, Guido van Rossum wrote:
> > > - comment out the reference to a MS KnowledgeBase article that doesn't
> > > seem to be present at msdn.microsoft.com; hopefully someone can
> > > point out an alternate source for the relevant information
>
> Bizarre. It seems MS has removed all traces of that article; I found
> lots of pointers to it in Google but they all point to the same dead
> link. Google's cache is your best bet...
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com Tue Oct 21 17:57:39 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 17:57:42 2003 Subject: [Python-Dev] locals() (was Re: accumulator display syntax) In-Reply-To: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com> References: Message-ID: <5.1.1.6.0.20031021174739.01f60e00@telecommunity.com>

At 02:04 PM 10/21/03 -0700, Guido van Rossum wrote:
>Argh, someone *could* pass around a copy of locals() and make an
>assignment into that.

Not when the locals() is that of a CPython function, and I expect the same is true of Jython functions.
> But I think we're already deprecating
>non-read-only use of locals(), so I'd like to ban that as abuse.

FWIW, both Zope 3 and PEAK currently make use of 'locals()' (actually, sys._getframe()) to modify locals of a class or module scope (i.e. non-functions). For both class and module scopes, it seems to be implied by the language definition that the local namespace is the __dict__ of the corresponding object. So, is this deprecated usage for class and module objects too?
From python at rcn.com Tue Oct 21 17:59:56 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 18:00:43 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: <001001c3981e$aa78f340$e841fea9@oemcomputer>

[Guido]
> I expect that most iterator comprehensions (we need a better term!)
> are not stored in a variable but passed as an argument to something
> that takes an iterable, e.g.
>
> sum(len(line) for line in file if line.strip())

That is somewhat beautiful. So, I drop my request for bracketed yields and throw my tiny weight behind this idea for an iterator expression.

> We can quibble about whether double parentheses are needed

I vote for not requiring the outer parentheses unless there is an adjacent comma. Requiring them would unnecessarily complicate the simple, elegant proposal. Otherwise, I would anticipate frequent questions to the help list or tutor list on why something coded like your example doesn't work. Also, the double paren form just looks funny, like there is something wrong with it but you can't tell what.

Timing
------

Based on the extensive comp.lang.python discussions when I first floated a PEP on the subject, I conclude that the user community will very much accept the new form and that there is no reason to not include it in Py2.4.
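Guido's sum() example above runs unchanged under the syntax that was eventually adopted; a small self-contained check (the lines list is invented sample data standing in for the open file):

```python
# Sample data standing in for the open file in the example.
lines = ["first line\n", "   \n", "second\n", "\n", "third line here\n"]

# Total length of the non-blank lines, via a generator expression
# passed directly to sum() -- no extra parentheses needed.
total = sum(len(line) for line in lines if line.strip())
assert total == len("first line\n") + len("second\n") + len("third line here\n")
```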
If there is any doubt on that score, I would be happy to update the PEP to match the current proposal for iterator expressions and solicit more community feedback. Raymond Hettinger
From barry at python.org Tue Oct 21 18:07:22 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 18:08:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <001001c3981e$aa78f340$e841fea9@oemcomputer> References: <001001c3981e$aa78f340$e841fea9@oemcomputer> Message-ID: <1066774041.5750.255.camel@anthem>

On Tue, 2003-10-21 at 17:59, Raymond Hettinger wrote:
> [Guido]
> > I expect that most iterator comprehensions (we need a better term!)
> > are not stored in a variable but passed as an argument to something
> > that takes an iterable, e.g.
> >
> > sum(len(line) for line in file if line.strip())
>
> That is somewhat beautiful.

Indeed, as is the term "generator expression" and the relegation to syntactic sugar of list comprehensions.

> > We can quibble about whether double parentheses are needed
>
> I vote for not requiring the outer parentheses unless there is an
> adjacent comma.

I like that too. It mirrors other situations where the parentheses aren't needed except to disambiguate syntax. In the above example, there's no ambiguity. -Barry
From guido at python.org Tue Oct 21 18:11:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:11:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:16:22 +0200."
<200310212316.22749.aleaxit@yahoo.com> References: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> <200310212316.22749.aleaxit@yahoo.com> Message-ID: <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com>

> I think it's well worth overcoming some "community resistance
> to new syntax" to get this kind of advantage easily. The trick
> of binding outer-scope variables as default args is neat but
> buys less than the pure idea of just using a generator rather
> than a list comprehension.

Thanks for the measurements! Is someone interested in writing up a PEP and taking it to the community? Or do I have to do it myself (and risk another newsgroup meltdown)? --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 18:14:19 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:14:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:33:30 +0200." <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com>

> >[name withheld]
> > > The implementation could synthesize a generator function abusing default
> > > arguments to give the generator's frame locals with the same names.

[Guido]
> >Yes, I think that could work -- I see no way that something invoked by
> >the generator expression could possibly modify a variable binding in
> >the surrounding scope.

[Samuele]
> so this, if I understand:
>
>     def h():
>         y = 0
>         l = [1,2]
>         it = (x+y for x in l)
>         y = 1
>         for v in it:
>             print v
>
> will print 1,2 and not 2,3
>
> unlike:
>
>     def h():
>         y = 0
>         l = [1,2]
>         def gen(S):
>             for x in S:
>                 yield x+y
>         it = gen(l)
>         y = 1
>         for v in it:
>             print v

Argh. Of course.

No, I think it should use the actual value of y, just like a nested function.

Never mind that idea then.
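For the record, PEP 289 as eventually accepted sided with the behavior of Samuele's second snippet: only the outermost iterable of a generator expression is evaluated eagerly, while free variables such as y are looked up when the generator runs. A quick check in modern Python (with print v replaced by returning a list):

```python
def h():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # l is evaluated here; y is looked up lazily
    y = 1
    return list(it)

# The generator sees the rebound y == 1, matching the nested-function version.
assert h() == [2, 3]
```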
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 18:18:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:18:37 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 17:59:56 EDT." <001001c3981e$aa78f340$e841fea9@oemcomputer> References: <001001c3981e$aa78f340$e841fea9@oemcomputer> Message-ID: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> > I vote for not requiring the outer parentheses unless there is an > adjacent comma. That would unnecessarily complicate the simple, > elegant proposal. > > Otherwise, I would anticipate frequent questions to the help list > or tutor list on why something coded like your example doesn't work. > > Also, the double paren form just looks funny, like there is something > wrong with it but you can't tell what. OK. I think I can pull it off in the Grammar. > Timing > ------ > > Based on the extensive comp.lang.python discussions when I first > floated a PEP on the subject, I conclude that the user community > will very much accept the new form and that there is no reason > to not include it in Py2.4. > > If there is any doubt on that score, I would be happy to update > the PEP to match the current proposal for iterator expressions > and solicit more community feedback. Wonderful! Rename PEP 289 to "generator expressions" and change the contents to match this proposal. Thanks for being the fall guy! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 18:25:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:25:44 2003 Subject: [Python-Dev] locals() (was Re: accumulator display syntax) In-Reply-To: Your message of "Tue, 21 Oct 2003 17:57:39 EDT." 
<5.1.1.6.0.20031021174739.01f60e00@telecommunity.com> References: <5.1.1.6.0.20031021174739.01f60e00@telecommunity.com> Message-ID: <200310212225.h9LMPSv25371@12-236-54-216.client.attbi.com> > >Argh, someone *could* pass around a copy of locals() and make an > >assignment into that. > > Not when the locals() is that of a CPython function, and I expect the same > is true of Jython functions. Well, the effect is undefined; there may be things you can do that would force the changes out to the real local variables. > > But I think we're already deprecating > >non-read-only use of locals(), so I'd like to ban that as abuse. > > FWIW, both Zope 3 and PEAK currently make use of 'locals()' > (actually, sys._getframe()) to modify locals of a class or module > scope (i.e. non-functions). For both class and module scopes, it > seems to be implied by the language definition that the local > namespace is the __dict__ of the corresponding object. > > So, is this deprecated usage for class and module objects too? It isn't. I'm not sure it shouldn't be; at some point it might be attractive to lock down the namespace of certain modules and classes, and in fact new-style classes already attempt to lock down their __dict__. Fortunately the __dict__ you see when executing a function during the class definition phase is not the class dict; the class dict is a copy of it taken by the class creation code. 
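Guido's closing observation (that the namespace a class body executes in is copied, not shared, at class-creation time) still holds in modern CPython; a minimal sketch:

```python
class C:
    x = 1
    ns = locals()  # the namespace the class body is executing in

# The class dict is a snapshot taken when the class is created, so
# mutating the namespace object the body saw does not affect the class.
C.ns["y"] = 2
assert C.x == 1
assert not hasattr(C, "y")
```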
--Guido van Rossum (home page: http://www.python.org/~guido/)
From walter at livinglogic.de Tue Oct 21 18:28:51 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 18:29:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> Message-ID: <3F95B323.9010405@livinglogic.de>

Guido van Rossum wrote:
> [...]
> Thanks for trying to bang some sense into this.
>
> Personally, I still like the idea best to make
>
> (x for x in S)
>
> be an iterator comprehension
>
> and
>
> [x for x in S]
>
> syntactic sugar for the common operation
>
> list((x for x in S))

Would this mean: [x for x in S] is a list comprehension and [(x for x in S)] is a list containing one generator expression? Bye, Walter Dörwald
From guido at python.org Tue Oct 21 18:31:58 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:32:11 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 00:28:51 +0200." <3F95B323.9010405@livinglogic.de> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> <3F95B323.9010405@livinglogic.de> Message-ID: <200310212231.h9LMVwR25409@12-236-54-216.client.attbi.com>

> Would this mean:
> [x for x in S] is a list comprehension and
> [(x for x in S)] is a list containing one generator expression?

Yes.
(Raymond, you might mention this in the PEP.) --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Tue Oct 21 18:43:49 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 18:41:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> At 15:14 21.10.2003 -0700, Guido van Rossum wrote: > > >[name withheld] > > > > The implementation could synthesize a generator function abusing > default > > > > arguments to give the generator's frame locals with the same names. > >[Guido] > > >Yes, I think that could work -- I see no way that something invoked by > > >the generator expression could possibly modify a variable binding in > > >the surrounding scope. > >[Samuele] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > >Argh. Of course. > >No, I think it should use the actual value of y, just like a nested >function. > >Never mind that idea then. this is a bit OT and too late, but given that our closed over variables are read-only, I'm wondering whether, having a 2nd chance, using cells and following mutations in the enclosing scopes is really worth it, we kind of mimic Scheme and relatives but there outer scope variables are also rebindable. Maybe copying semantics not using cells for our closures would not be too insane, and people would not be burnt by trying things like this: for msg in msgs: def onClick(e): print msg panel.append(Button(msg,onClick=onClick)) which obviously doesn't do what one could expect today. 
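For the record, the behaviour Guido argues for here is what PEP 289 finally specified: only the outermost iterable is evaluated immediately, and free variables such as y are looked up when the generator actually runs. In today's Python, Samuele's first h() therefore behaves exactly like his second:

```python
def h():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)   # l is evaluated now; y is looked up lazily
    y = 1
    return list(it)

print(h())    # [2, 3]: the rebinding of y is seen, just as with a nested function
```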
OTOH as for general mutability, using a mutable object (list,...) would allow for mutability when one really need it (rarely). From pedronis at bluewin.ch Tue Oct 21 18:49:23 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 18:47:08 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> References: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031022004621.027cd230@pop.bluewin.ch> At 00:43 22.10.2003 +0200, Samuele Pedroni wrote: >this is a bit OT and too late, but given that our closed over variables >are read-only, I'm wondering whether, having a 2nd chance, using cells and >following mutations in the enclosing scopes is really worth it, we kind of >mimic Scheme and relatives but there outer scope variables are also >rebindable. Maybe copying semantics not using cells for our closures would >not be too insane, and people would not be burnt by trying things like this: > >for msg in msgs: > def onClick(e): > print msg > panel.append(Button(msg,onClick=onClick)) > >which obviously doesn't do what one could expect today. OTOH as for >general mutability, using a mutable object (list,...) would allow for >mutability when one really need it (rarely). of course OTOH cells make it easier to cope with recursive references: def g(): def f(x): ... f refers to f ... return f but this seem more an implementation detail, although not using cells would make this rather trickier to support. From guido at python.org Tue Oct 21 18:51:59 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:52:07 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 00:43:49 +0200." 
<5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> Message-ID: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> [Changing the subject.] [Samuele] > this is a bit OT and too late, but given that our closed over > variables are read-only, I'm wondering whether, having a 2nd chance, > using cells and following mutations in the enclosing scopes is > really worth it, we kind of mimic Scheme and relatives but there > outer scope variables are also rebindable. Maybe copying semantics > not using cells for our closures would not be too insane, and people > would not be burnt by trying things like this: > > for msg in msgs: > def onClick(e): > print msg > panel.append(Button(msg,onClick=onClick)) > > which obviously doesn't do what one could expect today. OTOH as for > general mutability, using a mutable object (list,...) would allow > for mutability when one really need it (rarely). It was done this way because not everybody agreed that closed-over variables should be read-only, and the current semantics allow us to make them writable (as in Scheme, I suppose?) if we can agree on a syntax to declare an "intermediate scope" global. Maybe "global x in f" would work? 
def outer(): x = 1 def intermediate(): x = 2 def inner(): global x in outer x = 42 inner() print x # prints 2 intermediate() print x # prints 42 --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 19:05:07 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:05:22 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> References: <200310212316.22749.aleaxit@yahoo.com> <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> Message-ID: <200310220105.08017.aleaxit@yahoo.com> On Wednesday 22 October 2003 00:11, Guido van Rossum wrote: > > I think it's well worth overcoming come "community resistance > > to new syntax" to get this kind of advantage easily. The trick > > of binding outer-scope variables as default args is neat but > > buys less than the pure idea of just using a generator rather > > than a list comprehension. > > Thanks for the measurements! > > Is someone interested in writing up a PEP and taking it to the > community? Or do I have to do it myself (and risk another newsgroup > meltdown)? I'm interested, if it can wait until next week (in a few hours I'm flying off for a trip and I won't even have my laptop along). What's the procedure for requesting a PEP number, again? Alex From walter at livinglogic.de Tue Oct 21 19:06:30 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:06:38 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <3F95BBF6.1090900@livinglogic.de> Guido van Rossum wrote: > [...] > Maybe "global x in f" would work? 
> > def outer(): > x = 1 > def intermediate(): > x = 2 > def inner(): > global x in outer > x = 42 > inner() > print x # prints 2 > intermediate() > print x # prints 42 Why not make local variables attributes of the function, i.e. replace: def inner(): global x in outer x = 42 with: def inner(): outer.x = 42 Global variables could then be assigned via: global.x = 42 Could this be made backwards compatible? Bye, Walter Dörwald From guido at python.org Tue Oct 21 19:07:43 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:07:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 01:05:07 +0200." <200310220105.08017.aleaxit@yahoo.com> References: <200310212316.22749.aleaxit@yahoo.com> <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> <200310220105.08017.aleaxit@yahoo.com> Message-ID: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> > > Is someone interested in writing up a PEP and taking it to the > > community? Or do I have to do it myself (and risk another newsgroup > > meltdown)? > > I'm interested, if it can wait until next week (in a few hours I'm > flying off for a trip and I won't even have my laptop along). What's > the procedure for requesting a PEP number, again? Raymond is going to give PEP 289 an overhaul. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:09:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:09:37 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:06:30 +0200." <3F95BBF6.1090900@livinglogic.de> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <3F95BBF6.1090900@livinglogic.de> Message-ID: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> > Why not make local variables attributes of the function, i.e.
> replace: > > def inner(): > global x in outer > x = 42 > > with: > > def inner(): > outer.x = 42 Because this already means something! outer.x refers to the attribute x of function outer. That's quite different than local variable x of the most recent invocation of outer on the current thread's call stack! > Global variables could then be assigned via: > global.x = 42 This has a tiny bit of appeal, but not enough to bother. --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Tue Oct 21 19:11:44 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:11:52 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> > From: Jp Calderone [mailto:exarkun@intarweb.us.avaya.com] > > Not when x is properly initialized. Anyway, this is no > different from the > problem of: > > for x in R: > ... > print x For which reason I propose that Python 3.0 have the control name in any for expression be "local" to the expression ;) Hmm - actually this does raise another issue. >>> x = 1 >>> y = [1, 2, 3] >>> y = [x for x in y] Using the current semantics: >>> print x 3 Using the new semantics: >>> print x 1 Is this a problem? Are the new semantics going to cause confusion? Tim Delaney From pje at telecommunity.com Tue Oct 21 19:13:43 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 21 19:13:49 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> At 03:14 PM 10/21/03 -0700, Guido van Rossum wrote: >[Samuele] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > >Argh. Of course. > >No, I think it should use the actual value of y, just like a nested >function. Why? >Never mind that idea then. Actually, I consider Samuele's example a good argument in *favor* of the idea. Because of the similarity between listcomps and generator expressions (gen-X's? ;) ) it seems late binding of locals would lead to people thinking the behavior is a bug. Since a genex is not a function (at least in form) a late binding would be very non-obvious and counterintuitive relative to other kinds of expressions. From tdelaney at avaya.com Tue Oct 21 19:15:46 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:15:51 2003 Subject: [Python-Dev] Re: buildin vs. shared modules Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF21E@au3010avexu1.global.avaya.com> > From: Paul Moore [mailto:pf_moore@yahoo.co.uk] > > But in the absence of anyone who is working on a Mingw build, this is > pretty much irrelevant... Well, Gerhard has periodically worked on getting Mingw to work. I've had a quick go myself, but don't know the ins and outs enough. I would like Mingw to work, as I don't have access to MSVC at home, and don't have time to work on Python at work :( Since this is definitely MSVC-specific, I think it should be in an #ifdef block. 
Other Windows implementations (Mingw, etc) would not get the delay loading. Tim Delaney From aleaxit at yahoo.com Tue Oct 21 19:21:52 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:22:00 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <200310220121.52789.aleaxit@yahoo.com> On Wednesday 22 October 2003 00:51, Guido van Rossum wrote: ... > Maybe "global x in f" would work? Actually, I would rather like to DO AWAY with the anomalous 'global' statement and its weird anomalies such as: x = 23 def f1(u): if u: global x x = 45 def f2(): if 0: global x x = 45 print x f2() print x f1(0) print x "if u:" when u is 0, and "if 0:", should have the same effect to avoid violating the least-astonishment rule -- but when the if's body has a global in it, they don't. Eeek. Plus. EVERY newbie makes the mistake of taking "global" to mean "for ALL modules" rather than "for THIS module", uselessly using global in toplevel, etc. It's a wart and I'd rather work to remove it than to expand it, even though I _would_ like rebindable outers. I'd rather have a special name that means "this module" available for import (yes, I can do that with an import hook today). Say that __this_module__ was deemed acceptable for this. Then, import __this_module__ __this_module__.x = 23 lets me rebind the global-to-this-module variable x without 'global' and its various ills. Yeah, the name isn't _too_ cool. But I like the idea, and when I bounced it experimentally in c.l.py a couple weeks ago the reaction was mildly positive and without flames. Making globals a TAD less handy to rebind from within a function would not be exactly bad, either. (Of course 'global' would stay until 3.0 at least, but having an alternative I could explain it as obsolescent:-). 
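Alex's `__this_module__` needs no import hook in modern Python: `sys.modules` already maps `__name__` to the module object currently executing, and attribute assignment on it rebinds a module global. A sketch of the idea:

```python
import sys

x = 23

def rebind():
    this_module = sys.modules[__name__]   # the module currently executing
    this_module.x = 45                    # rebinds module-level x, no 'global' needed

rebind()
print(x)    # 45
```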
Extending this idea (perhaps overstretching it), some other name "special for import" might indicate outer scopes. Though reserving the whole family of names __outer___ is probably overdoing it... plus, the object thus 'imported' would not be a module and would raise errors if you tried setattr'ing in it a name that's NOT a local variable of (the import itself would fail if you were not lexically nested inside a function called ). Thus this would allow *re-binding* existing local outer names but not *adding* new ones, which feels just fine to me (but maybe not to all). OK, this is 1/4-baked for the closure issue. BUT -- I'd STILL love to gradually ease 'global' out, think the "import __this_module__" idea is 3/4-baked (lacks a good special name...), and would hate to see 'global' gain a new lease of life for sophisticated uses...;-) Alex From aleaxit at yahoo.com Tue Oct 21 19:26:58 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:27:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> References: <200310220105.08017.aleaxit@yahoo.com> <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> Message-ID: <200310220126.59005.aleaxit@yahoo.com> On Wednesday 22 October 2003 01:07, Guido van Rossum wrote: > > > Is someone interested in writing up a PEP and taking it to the > > > community? Or do I have to do it myself (and risk another newsgroup > > > meltdown)? > > > > I'm interested, if it can wait until next week (in a few hours I'm > > flying off for a trip and I won't even have my laptop along). What's > > the procedure for requesting a PEP number, again? > > Raymond is going to give PEP 289 an overhaul. Wonderful! Much the best idea. Alex From guido at python.org Tue Oct 21 19:27:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:27:42 2003 Subject: [Python-Dev] listcomps vs. 
for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 09:11:44 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> Message-ID: <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> > > for x in R: > > ... > > print x > > For which reason I propose that Python 3.0 have the control name in > any for expression be "local" to the expression ;) What expression? If you're talking about making x = None for x in R: pass print x # last item of R illegal, forget it. That's too darn useful. > Hmm - actually this does raise another issue. > > >>> x = 1 > >>> y = [1, 2, 3] > >>> y = [x for x in y] > > Using the current semantics: > > >>> print x > 3 > > Using the new semantics: > > >>> print x > 1 > > Is this a problem? Are the new semantics going to cause confusion? No, and no; we already went over this (but I don't blame you for not reading every msg in this thread :-). It does mean that we have to start issuing proper deprecation warnings, and maybe we won't be able to properly fix the LC scope thing before 3.0. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:30:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:31:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:13:43 EDT." <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> Message-ID: <200310212330.h9LNUop25640@12-236-54-216.client.attbi.com> > Actually, I consider Samuele's example a good argument in *favor* of > the idea. Because of the similarity between listcomps and generator > expressions (gen-X's? ;) ) it seems late binding of locals would > lead to people thinking the behavior is a bug. 
Since a genex is not > a function (at least in form) a late binding would be very > non-obvious and counterintuitive relative to other kinds of > expressions. Hm. We do late binding of globals. Why shouldn't we do late binding of locals? There are lots of corners of the language where if you expect something else the actual behavior feels like a bug, until someone explains it to you. That's no reason to compromise. It's an opportunity for education about scopes! --Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Tue Oct 21 19:39:10 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 21 19:38:01 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <20031021233910.GA2091@mems-exchange.org> Guido: > Personally, I still like the idea best to make > > (x for x in S) > > be an iterator comprehension > > and > > [x for x in S] > > syntactic sugar for the common operation > > list((x for x in S)) FWIW, that's enough to switch my vote for generator expressions from -0 to +0. If they work this way then there is essentially no extra complexity in the language. It's important to look at things from the perspective of a new Python programmer, I think. Another nice thing is that we have tuple and dict comprehensions for free: tuple(x for x in S) dict((k, v) for k, v in S) Set(x for x in S) Aside from the bit of syntactic sugar, everything is nice and regular.
Neil From walter at livinglogic.de Tue Oct 21 19:38:55 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:39:00 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <3F95BBF6.1090900@livinglogic.de> <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> Message-ID: <3F95C38F.4040201@livinglogic.de> Guido van Rossum wrote: >>Why not make local variables attributes of the function, i.e. >>replace: >> >> def inner(): >> global x in outer >> x = 42 >> >>with: >> >> def inner(): >> outer.x = 42 > > > Because this already means something! outer.x refers to the attribute > x of function outer. That's quite different than local variable x of > the most recent invocation of outer on the current thread's call stack! I guess unifying them both (somewhat like the instance attribute lookup rule) won't work. >>Global variables could then be assigned via: >> global.x = 42 > > > This has a tiny bit of appeal, but not enough to bother. Bye, Walter Dörwald From tdelaney at avaya.com Tue Oct 21 19:39:03 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:39:12 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > > for x in R: > > > ... > > > print x > > > > For which reason I propose that Python 3.0 have the control name in > > any for expression be "local" to the expression ;) > > What expression? Sorry - I meant statement. > If you're talking about making > > x = None > for x in R: pass > print x # last item of R > > illegal, forget it. That's too darn useful.
Note the winking smiley above :) Although I do find the scope limiting in: for (int i=0; i < 10; ++i) { } to be a nice feature of C++ (good god - did I just say that?) and hate that the implementation in MSVC is broken and the control variable leaks. > No, and no; we already went over this (but I don't blame you for not > reading every msg in this thread :-). It does mean that we have to > start issuing proper deprecation warnings, and maybe we won't be able > to properly fix the LC scope thing before 3.0. Yeah - I realised later that the discussion was hidden in the accumulator syntax thread. I definitely wouldn't find it confusing, but I've been a proponent of not leaking the control variable all along :) Tim Delaney From guido at python.org Tue Oct 21 19:40:34 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:40:47 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:21:52 +0200." <200310220121.52789.aleaxit@yahoo.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> Message-ID: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> > Actually, I would rather like to DO AWAY with the anomalous 'global' > statement and its weird anomalies such as: > > x = 23 > > def f1(u): > if u: > global x > x = 45 > > def f2(): > if 0: > global x > x = 45 > > print x > f2() > print x > f1(0) > print x > > "if u:" when u is 0, and "if 0:", should have the same effect to avoid > violating the least-astonishment rule -- but when the if's body has > a global in it, they don't. Eeek. Eek. Global statement inside flow control should be deprecated, not abused to show that global is evil. :-) > Plus. EVERY newbie makes the mistake of taking "global" to mean > "for ALL modules" rather than "for THIS module", Only if they've been exposed to languages that have such globals. 
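The anomaly Alex and Guido are discussing is still present today: `global` is a compile-time declaration that covers the whole function body regardless of control flow, so an `if 0:` branch makes no difference. A minimal sketch:

```python
x = 23

def f2():
    if 0:
        global x    # never reached at run time, yet it still applies to all of f2
    x = 45

f2()
print(x)    # 45, because the assignment in f2 rebound the module-level x
```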
> uselessly using global in toplevel, Which the parser should reject. > etc. It's a wart and I'd rather work to remove it than to expand > it, even though I _would_ like rebindable outers. > > I'd rather have a special name that means "this module" available > for import (yes, I can do that with an import hook today). Say that > __this_module__ was deemed acceptable for this. Then, > import __this_module__ > __this_module__.x = 23 > lets me rebind the global-to-this-module variable x without 'global' > and its various ills. Yeah, the name isn't _too_ cool. But I like the > idea, and when I bounced it experimentally in c.l.py a couple weeks > ago the reaction was mildly positive and without flames. Making > globals a TAD less handy to rebind from within a function would > not be exactly bad, either. (Of course 'global' would stay until 3.0 > at least, but having an alternative I could explain it as obsolescent:-). I think it's not unreasonable to want to replace global with attribute assignment of *something*. I don't think that "something" should have to be imported before you can use it; I don't even think it deserves to have leading and trailing double underscores. Walter suggested 'global.x = 23' which looks reasonable; unfortunately my parser can't do this without removing the existing global statement from the Grammar: after seeing the token 'global' it must be able to make a decision about whether to expand this to a global statement or an assignment without peeking ahead, and that's impossible. > Extending this idea (perhaps overstretching it), some other name > "special for import" might indicate outer scopes. Though reserving > the whole family of names __outer___ is probably overdoing > it... plus, the object thus 'imported' would not be a module and would > raise errors if you tried setattr'ing in it a name that's NOT a local > variable of (the import itself would fail if you were not lexically > nested inside a function called ). 
Thus this would allow > *re-binding* existing local outer names but not *adding* new ones, > which feels just fine to me (but maybe not to all). > > OK, this is 1/4-baked for the closure issue. BUT -- I'd STILL love > to gradually ease 'global' out, think the "import __this_module__" > idea is 3/4-baked (lacks a good special name...), and would hate > to see 'global' gain a new lease of life for sophisticated uses...;-) If we removed global from the language, how would you spell assignment to a variable in an outer function scope? Remember, you can *not* use 'outer.x' because that already refers to a function attribute. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:42:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:42:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:39:10 PDT." <20031021233910.GA2091@mems-exchange.org> References: <20031021233910.GA2091@mems-exchange.org> Message-ID: <200310212342.h9LNgKa25725@12-236-54-216.client.attbi.com> > FWIW, that's enough to switch my vote for generator expressions from > -0 to +0. Thanks for the support! I value your judgement. > If they work this way then there is essentially no extra > complexity in the language. It's important to look at things from > the perspective of a new Python programmer, I think. > > Another nice thing is that we have tuple and dict comprehensions > for free: > > tuple(x for x in S) > dict((k, v) for k, v in S) > Set(x for x in S) Yes, this is nice. > Aside from the bit of syntactic sugar, everything is nice an > regular. Exactly. We should thank Peter Norvig for starting this discussion! 
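Neil's constructor-based spellings all work in today's Python, and two of them later grew dedicated literal syntax: set and dict comprehensions, added in Python 3.0 and backported to 2.7. A sketch:

```python
S = [("a", 1), ("b", 2)]

t = tuple(x for x in S)       # tuple() consuming a generator expression
d = {k: v for k, v in S}      # dict comprehension literal
s = {v for k, v in S}         # set comprehension literal

print(t)             # (('a', 1), ('b', 2))
print(d)             # {'a': 1, 'b': 2}
print(s == {1, 2})   # True
```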
--Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Tue Oct 21 19:42:41 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 19:43:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066735096.18849.33.camel@straylight> Message-ID: <200310212342.h9LNgfb10069@oma.cosc.canterbury.ac.nz> Mark Russell : > The argument for it is that walking over a dictionary in sorted order > is (at least to me) a missing idiom in python. Does this never come > up when you're teaching the language? Maybe dicts should have a .sortedkeys() method? The specialised method name would help stave off any temptation to add varied sort methods to other types. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Tue Oct 21 19:47:58 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 19:48:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212330.h9LNUop25640@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> Message-ID: <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> At 04:30 PM 10/21/03 -0700, Guido van Rossum wrote: > > Actually, I consider Samuele's example a good argument in *favor* of > > the idea. Because of the similarity between listcomps and generator > > expressions (gen-X's? ;) ) it seems late binding of locals would > > lead to people thinking the behavior is a bug. Since a genex is not > > a function (at least in form) a late binding would be very > > non-obvious and counterintuitive relative to other kinds of > > expressions. > >Hm. We do late binding of globals. 
Why shouldn't we do late binding >of locals? Wha? Oh, you mean in a function. But that's what I'm saying, it's *not* a function. Sure, it's implemented as one under the hood, but it doesn't *look* like a function. In any normal (non-lambda) expression, whether a variable is local or global, its value is retrieved immediately. Also, even though there's a function under the hood, that function is *called* and its value returned immediately. This seems consistent with an immediate binding of parameters. > There are lots of corners or the language where if you >expect something else the actual behavior feels like a bug, until >someone explains it to you. That's no reason to compromise. It's an >opportunity for education about scopes! So far, I haven't seen you say any reason why the "arguments" approach is bad, or why the "closure" approach is good. Both are certainly Pythonic in some circumstances, but why do you feel that one is better than the other, here? I will state one pragmatic reason for using the default arguments approach: code converted from using a listcomp to a genex can immediately have bugs as a result of rebinding a local. Those bugs won't happen if rebinding the local has no effect on the genex's evaluation. (Obviously, an aliasing problem can still be created if one modifies a mutable used in the genex, but there's no way to remove that possibility and still end up with a lazy iterator.) Given that one of the big arguments in favor of genexes is to make "upgrading" from listcomps easy, it shouldn't fail so quickly and obviously. E.g., converting from: x = {} for i in range(10): x[i] = [y^i for y in range(10)] to: x = {} for i in range(10): x[i] = (y^i for y in range(10)) Shouldn't result in all of x's elements iterating over the same values! 
From FBatista at uniFON.com.ar Tue Oct 21 15:50:32 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 19:49:37 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Tim Peters wrote: #- Meaning that there's an existing body of work that's already #- been informed #- by years of design debate (IBM's proposed decimal standard), #- and an involved #- Python implementation of that. What happens next depends on #- who can make #- time to do something next. I'm urged to have a Money data type, but I'll see if I can get it through Decimal, improving/fixing/extending Decimal and saving effort at the same time. . Facundo From greg at cosc.canterbury.ac.nz Tue Oct 21 19:49:48 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 19:50:53 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <2m65iipzrd.fsf@starship.python.net> Message-ID: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> Michael Hudson : > In particular what happens if the iteration variable is a local in the > frame anyway? I presume that would inhibit the renaming Why? > but then code like > > def f(x): > r = [x+1 for x in range(x)] > return r, x > > becomes even more incomprehensible (and changes in behaviour). Anyone who writes code like that *deserves* to have the behaviour changed on them! If this is really a worry, an alternative would be to simply forbid using a name for the loop variable that's used for anything else outside the loop. That could break existing code too, but at least it would break it in a very obvious way by making it fail to compile. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From walter at livinglogic.de Tue Oct 21 19:51:05 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:51:10 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <3F95C669.4080706@livinglogic.de> Guido van Rossum wrote: > [...] > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. Couldn't this be solved by making 'global.' a token? Should {get|has}attr(global, 'foo') be possible? Bye, Walter Dörwald From tdelaney at avaya.com Tue Oct 21 19:53:01 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:53:10 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF246@au3010avexu1.global.avaya.com> > From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz] > > Maybe dicts should have a .sortedkeys() method? The specialised > method name would help stave off any temptation to add varied sort > methods to other types. -1. I think that: d = {1: 2, 3: 4} for i in list.sorted(d): print i or d = {1: 2, 3: 4} for i in list.sorted(d.iterkeys()): print i looks very clean and unambiguous. And is even better at staving off temptation to add varied sort methods to other types.
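[Editor's sketch: what eventually shipped, as the builtin sorted() in Python 2.4 and later rather than a list classmethod, covers this dict use case directly.]

```python
d = {3: 4, 1: 2}

# sorted() accepts any iterable and returns a new list, so a dict's
# keys come out sorted without needing a .sortedkeys() method.
assert sorted(d) == [1, 3]

# Key/value pairs can be sorted the same way.
assert sorted(d.items()) == [(1, 2), (3, 4)]
```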
Tim Delaney From walter at livinglogic.de Tue Oct 21 19:57:20 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:57:25 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <3F95C7E0.4030608@livinglogic.de> Guido van Rossum wrote: > [...] > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. Another idea: We could replace the function globals() with an object that provides __call__ for backwards compatibility, but also has a special __setattr__. Then global assignment would be 'globals.x = 23'. Would this be possible? Bye, Walter Dörwald From aleaxit at yahoo.com Tue Oct 21 19:58:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:58:27 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <200310220158.21389.aleaxit@yahoo.com> On Wednesday 22 October 2003 01:40, Guido van Rossum wrote: ... > Eek. Global statement inside flow control should be deprecated, not > abused to show that global is evil. :-) OK, let's (deprecate them), shall we...? > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > "for ALL modules" rather than "for THIS module", > > Only if they've been exposed to languages that have such globals.
Actually, I've seen that happen to complete newbies too. "global" is a VERY strong word -- or at least perceived as such. > > uselessly using global in toplevel, > > Which the parser should reject. Again: can we do that in 2.4? > I think it's not unreasonable to want to replace global with attribute > assignment of *something*. I don't think that "something" should have > to be imported before you can use it; I don't even think it deserves > to have leading and trailing double underscores. Using attribute assignment is my main drive here. I was doing it via import only to be able to experiment with that in today's Python;-). > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. So it can't be global, as it must stay a keyword for backwards compatibility at least until 3.0. What about: this_module current_module sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] __module__ ...? > If we removed global from the language, how would you spell assignment > to a variable in an outer function scope? Remember, you can *not* use > 'outer.x' because that already refers to a function attribute. scope(outer).x , making 'scope' a suitable built-in factory function. I do think this deserves a built-in. If we have this, maybe scope could also be reused as e.g. scope(global).x = 23 ? I think the reserved keyword 'global' SHOULD give the parser no problem in this one specific use (but, I'm guessing...!). Alex From guido at python.org Tue Oct 21 20:19:40 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:19:53 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:51:05 +0200." 
<3F95C669.4080706@livinglogic.de> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C669.4080706@livinglogic.de> Message-ID: <200310220019.h9M0JeS25829@12-236-54-216.client.attbi.com> > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > my parser can't do this without removing the existing global statement > > from the Grammar: after seeing the token 'global' it must be able to > > make a decision about whether to expand this to a global statement or > > an assignment without peeking ahead, and that's impossible. > > Couldn't this be solved by making 'global.' a token? > > Should {get|has}attr(global, 'foo') be possible? Yes, I think if we go this path, global should behave as a predefined variable. Maybe we should call it __globals__ after all, consistent with __file__ and __name__ (it would create a cycle, but we have plenty of those already). Though I still wish it didn't need underscores. Maybe 'globals' could sprout __getattribute__ and __setattr__ methods that would delegate to the current global module? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:20:15 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:20:38 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:57:20 +0200." 
<3F95C7E0.4030608@livinglogic.de> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> Message-ID: <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> > Another idea: We could replace the function globals() with an object > that provides __call__ for backwards compatibility, but also has a > special __setattr__. Then global assignment would be 'globals.x = 23'. > Would this be possible? Yes, I just proposed this in my previous response. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:23:48 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:23:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:47:58 EDT." <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> Message-ID: <200310220023.h9M0Nmg25868@12-236-54-216.client.attbi.com> > > > Actually, I consider Samuele's example a good argument in *favor* of > > > the idea. Because of the similarity between listcomps and generator > > > expressions (gen-X's? ;) ) it seems late binding of locals would > > > lead to people thinking the behavior is a bug. Since a genex is not > > > a function (at least in form) a late binding would be very > > > non-obvious and counterintuitive relative to other kinds of > > > expressions. > > > >Hm. We do late binding of globals. Why shouldn't we do late binding > >of locals? > > Wha? Oh, you mean in a function. No, everywhere. 
Globals in generator expressions also have late binding: A = 1 def f(): return (x+A for x in range(3)) g = f() A = 2 print list(g) # prints [2, 3, 4]; not [1, 2, 3] > But that's what I'm saying, it's *not* a > function. Sure, it's implemented as one under the hood, but it doesn't > *look* like a function. In any normal (non-lambda) expression, whether a > variable is local or global, its value is retrieved immediately. That's because the expression is evaluated immediately. When passing generator expressions around that reference free variables (whether global or from a function scope), the expression is evaluated when it is requested. Note that even under your model, A = [] g = (A for x in range(3)) A.append(42) print list(g) # prints [[42], [42], [42]] > Also, even though there's a function under the hood, that function > is *called* and its value returned immediately. This seems > consistent with an immediate binding of parameters. But it's a generator function, and the call suspends immediately, and continues to execute only when the next() method on the result is called. > > There are lots of corners of the language where if you > >expect something else the actual behavior feels like a bug, until > >someone explains it to you. That's no reason to compromise. It's an > >opportunity for education about scopes! > > So far, I haven't seen you say any reason why the "arguments" > approach is bad, or why the "closure" approach is good. Both are > certainly Pythonic in some circumstances, but why do you feel that > one is better than the other, here? Unified semantic principles. I want to be able to explain generator expressions as a shorthand for defining and calling generator functions. Invoking default argument semantics makes the explanation less clean: we would have to go through the trouble of finding all references to free variables. Do you want globals to be passed via default arguments as well? And what about builtins?
(Note that the compiler currently doesn't know the difference.) > I will state one pragmatic reason for using the default arguments > approach: code converted from using a listcomp to a genex can > immediately have bugs as a result of rebinding a local. Those bugs > won't happen if rebinding the local has no effect on the genex's > evaluation. (Obviously, an aliasing problem can still be created if > one modifies a mutable used in the genex, but there's no way to > remove that possibility and still end up with a lazy iterator.) > > Given that one of the big arguments in favor of genexes is to make > "upgrading" from listcomps easy, it shouldn't fail so quickly and > obviously. E.g., converting from: > > x = {} > for i in range(10): > x[i] = [y^i for y in range(10)] > > to: > > x = {} > for i in range(10): > x[i] = (y^i for y in range(10)) > > Shouldn't result in all of x's elements iterating over the same values! Hm. I think most generator expressions should be finished before moving on to the next line, as in for n in range(4): print sum(x**n for x in range(1, 11)) Saving a generator expression for later use should be something you rarely do, and you should really think of it as a shorthand for a generator function just as lambda is a shorthand for a regular function. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:42:05 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:42:17 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 01:58:21 +0200." <200310220158.21389.aleaxit@yahoo.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> Message-ID: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> (Changing the subject yet again) > > > Plus. 
EVERY newbie makes the mistake of taking "global" to mean > > > "for ALL modules" rather than "for THIS module", > > > > Only if they've been exposed to languages that have such globals. > > > > Actually, I've seen that happen to complete newbies too. "global" is > > a VERY strong word -- or at least perceived as such. We can't expect everybody to guess the rules of the language purely based on the symbols used. But I appreciate the argument; 'global' comes from ABC's SHARE, but ABC doesn't have modules. (It does have workspaces, but AFAIR there is no communication at all between workspaces, so it isn't unreasonable that a SHAREd name in one workspace isn't visible in another workspace.) > > > uselessly using global in toplevel, > > > > Which the parser should reject. > > Again: can we do that in 2.4? Submit a patch. It'll probably break plenty of code though (I bet you including Zope :-), so you'll have to start with a warning in 2.4. > > I think it's not unreasonable to want to replace global with > > attribute assignment of *something*. I don't think that > > "something" should have to be imported before you can use it; I > > don't even think it deserves to have leading and trailing double > > underscores. > > Using attribute assignment is my main drive here. I was doing it > via import only to be able to experiment with that in today's Python;-). You could have written an import hook that simply inserted __globals__ in each imported module. :-) > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > my parser can't do this without removing the existing global statement > > from the Grammar: after seeing the token 'global' it must be able to > > make a decision about whether to expand this to a global statement or > > an assignment without peeking ahead, and that's impossible. > > So it can't be global, as it must stay a keyword for backwards compatibility > at least until 3.0.
What about: > this_module > current_module > sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] > __module__ > ...? __module__ can't work because it has to be a string. (I guess it could be a str subclass but that would be too perverse.) Walter and I both suggested hijacking the 'globals' builtin. What do you think of that? > > If we removed global from the language, how would you spell assignment > > to a variable in an outer function scope? Remember, you can *not* use > > 'outer.x' because that already refers to a function attribute. > > scope(outer).x , making 'scope' a suitable built-in factory > function. I do think this deserves a built-in. Hm. I want it to be something that the compiler can know about reliably, and a built-in function doesn't work (yet). The compiler currently knows enough about nested scopes so that it can implement locals that are shared with inner functions differently (using cells). It's also too asymmetric -- *using* x would continue to be just x. Hmm. That's also a problem I have with changing global assignment -- I think the compiler should know about it, just like it knows about *using* globals. And it's not just the compiler. I think it requires more mental gymnastics of the human reader to realize that def outer(): def f(): scope(outer).x = 42 print x return f outer()() prints 42 rather than being an error. But how does the compiler know to reserve space for x in outer's scope? Another thing is that your proposed scope() is too dynamic -- it would require searching the scopes that (statically) enclose the call for a stack frame belonging to the argument. But there's no stack by the time f gets called in the last example! (The current machinery for nested scopes doesn't reference stack frames; it only passes cells.) > If we have this, maybe scope could also be reused as e.g. > scope(global).x = 23 > ?
I think the reserved keyword 'global' SHOULD give the parser no > problem in this one specific use (but, I'm guessing...!). I don't want to go there. :-) (If it wasn't clear, I'm struggling with this subject -- I think there are good reasons for why I'm resisting your proposal, but I haven't found them yet. The more I think about it, the less I like 'globals.x = 42'.) --Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Tue Oct 21 20:44:45 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Oct 21 20:44:48 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <20031021233910.GA2091@mems-exchange.org> Message-ID: In article <20031021233910.GA2091@mems-exchange.org>, Neil Schemenauer wrote: > Another nice thing is that we have tuple and dict comprehensions > for free: > > tuple(x for x in S) > dict((k, v) for k, v in S) > Set(x for x in S) Who cares about tuple comprehensions, but I would like similar syntactic sugar for dict comprehensions as for lists: {k:v for k,v in S} (PEP 274). -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tim.one at comcast.net Tue Oct 21 21:06:31 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 21:06:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: [Samuele Pedroni] > so this, if I understand: > > def h(): > y = 0 > l = [1,2] > it = (x+y for x in l) > y = 1 > for v in it: > print v > > will print 1,2 and not 2,3 That is what I had in mind, and that if the first assignment to "y" were commented out, the assignment to "it" would raise UnboundLocalError.
> unlike: > > def h(): > y = 0 > l = [1,2] > def gen(S): > for x in S: > yield x+y > it = gen(l) > y = 1 > for v in it: > print v Yes, but like it if you replaced the "def gen" and the line following it with: def gen(y=y, l=l): for x in l: yield x+y it = gen() This is worth some thought. My intuition is that we *don't* want "a closure" here. If generator expressions were reiterable, then (probably obnoxiously) clever code could make some use of tricking them into using different inherited bindings on different (re)iterations. But they're one-shot things, and divorcing the values actually used from the values in force at the definition site sounds like nothing but trouble to me (error-prone and surprising). They look like expressions, after all, and after x = 5 y = x**2 x = 10 print y it would be very surprising to see 100 get printed. In the rare cases that's desirable, creating an explicit closure is clear(er): x = 5 y = lambda: x**2 x = 10 print y() I expect creating a closure instead would bite hard especially when building a list of generator expressions (one of the cases where delaying generation of the results is easily plausible) in a loop. The loop index variable will probably play some role (directly or indirectly) in the intended operation of each generator expression constructed, and then you severely want *not* for each generator expression to see "the last" value of the index variable.
BTW, Icon can give no guidance here: in that language, the generation of a generator's result sequence is inextricably bound to the lexical occurrence of the generator. The question arises in Python because definition site and generation can be divorced. From barry at python.org Tue Oct 21 21:21:51 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 21:21:57 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <20031021233910.GA2091@mems-exchange.org> Message-ID: <1066785710.5750.333.camel@anthem> On Tue, 2003-10-21 at 20:44, David Eppstein wrote: > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} > (PEP 274). +1 :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/63df6654/attachment.bin From pedronis at bluewin.ch Tue Oct 21 21:27:14 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 21:24:51 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> Message-ID: <5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> At 17:42 21.10.2003 -0700, Guido van Rossum wrote: >(Changing the subject yet again) > > > > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > > > "for ALL modules" rather than "for THIS module", > > > > > > Only if they've been exposed to languages that have such globals. > > > > Actually, I've seen that happen to complete newbies too. "global" is > > a VERY strong word -- or at least perceived as such. 
> >We can't expect everybody to guess the rules of the language purely >based on the symbols used. > >But I appreciate the argument; 'global' comes from ABC's SHARE, but >ABC doesn't have modules. (It does have workspaces, but AFAIR there >is no communication at all between workspaces, so it isn't >unreasonable that a SHAREd name in one workspace isn't visible in >another workspace.) > > > > > uselessly using global in toplevel, > > > > > > Which the parser should reject. > > > > Again: can we do that in 2.4? > >Submit a patch. It'll probably break plenty of code though (I bet you >including Zope :-), so you'll have to start with a warning in 2.4. > > > > I think it's not unreasonable to want to replace global with > > > attribute assignment of *something*. I don't think that > > > "something" should have to be imported before you can use it; I > > > don't even think it deserves to have leading and trailing double > > > underscores. > > > > Using attribute assignment is my main drive here. I was doing it > > via import only to be able to experiment with that in today's Python;-). > >You could have writen an import hook that simply inserted __globals__ >in each imported module. :-) > > > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > > my parser can't do this without removing the existing global statement > > > from the Grammar: after seeing the token 'global' it must be able to > > > make a decision about whether to expand this to a global statement or > > > an assignment without peeking ahead, and that's impossible. > > > > So it can't be global, as it must stay a keyword for backwards > compatibility > > at least until 3.0. What about: > > this_module > > current_module > > sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] > > __module__ > > ...? > >__module__ can't work because it has to be a string. (I guess it >could be a str subclass but that would be too perverse.) 
> >Walter and I both suggested hijacking the 'globals' builtin. What do >you think of that? > > > > If we removed global from the language, how would you spell assignment > > > to a variable in an outer function scope? Remember, you can *not* use > > > 'outer.x' because that already refers to a function attribute. > > > > scope(outer).x , making 'scope' a suitable built-in factory > > function. I do think this deserves a built-in. > >Hm. I want it to be something that the compiler can know about >reliably, and a built-in function doesn't work (yet). The compiler >currently knows enough about nested scopes so that it can implement >locals that are shared with inner functions differently (using >cells). It's also too asymmetric -- *using* x would continue to be >just x. > >Hmm. That's also a problem I have with changing global assignment -- >I think the compiler should know about it, just like it knows about >*using* globals. > >And it's not just the compiler. I think it requires more mental >gymnastics of the human reader to realize that > > def outer(): > def f(): > scope(outer).x = 42 > print x > return f > outer()() > >prints 42 rather than being an error. But how does the compiler know >to reserve space for x in outer's scope? > >Another thing is that your proposed scope() is too dynamic -- it >would require searching the scopes that (statically) enclose the call >for a stack frame belonging to the argument. But there's no stack by >the time f gets called in the last example! (The curernt machinery >for nested scopes doesn't reference stack frames; it only passes >cells.) > > > If we have this, maybe scope could also be reused as e.g. > > scope(global).x = 23 > > ? I think the reserved keyword 'global' SHOULD give the parser no > > problem in this one specific use (but, I'm guessing...!). > >I don't want to go there. 
:-) > >(If it wasn't clear, I'm struggling with this subject -- I think there >are good reasons for why I'm resisting your proposal, but I haven't >found them yet. The more I think about it, the less I like >'globals.x = 42' . . suggests runtime, for compile time then maybe global::x=42 module::x=42 outer::x=42 (I don't like those, and personally I don't see the need to get rebinding for closed-over variables but anyway) another possibility is that today is a syntax error, so maybe global x = 42 or module x = 42 they would not be statements, this for symmetry would also be legal: y = module x + 1 then outer x = 42 and also y = g x + 1 the problems are also clear, in some other languages x y is function application, etc.. From kiko at async.com.br Tue Oct 21 21:43:46 2003 From: kiko at async.com.br (Christian Robottom Reis) Date: Tue Oct 21 21:43:57 2003 Subject: [Python-Dev] Re: Be Honest about LC_NUMERIC In-Reply-To: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> References: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> Message-ID: <20031022014346.GI2977@async.com.br> On Sun, Oct 19, 2003 at 12:22:59AM +0200, Martin v. Löwis wrote: > What happened to this PEP? I can't find it in the PEP list. Sorry, I've been completely distracted by real life, lately. I can put some effort into the text this week, but I'm not sure what I should do beyond sending the PEP to the list and waiting for comments (which were pretty sparse!) > Personally, I am satisfied with the patch that evolved from the > discussion (#774665), and I would be willing to apply it even without > a PEP. I would really appreciate a comment from Tim outlining his opinion (so I'm adding him to the To: list). Just to recapitulate, the patch Gustavo has posted doesn't use the thread-safe glibc functions, which means that we won't be safe from runtime locale switching.
I suppose I should also point out that runtime locale switching is useful in certain obscure situations; for instance, formatting a number with periods grouping the thousands can be done by setting to the da_DK locale temporarily. Whether this hack is to be encouraged or shelved for something better is yet unknown to me, though. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 From barry at python.org Tue Oct 21 21:55:36 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 21:55:48 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> Message-ID: <1066787735.5750.343.camel@anthem> On Tue, 2003-10-21 at 20:20, Guido van Rossum wrote: > > Another idea: We could replace the function globals() with an object > > that provides __call__ for backwards compatibility, but also has a > > special __setattr__. Then global assignment would be 'globals.x = 23'. > > Would this be possible? > > Yes, I just proposed this in my previous response. :-) So maybe the idea of using function attributes isn't totally nuts, if you use a special name. E.g. outer.__locals__.x and outer.__globals__.x -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/6088ac50/attachment.bin From greg at cosc.canterbury.ac.nz Tue Oct 21 23:14:56 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 23:15:19 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> Message-ID: <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz> Guido: > > def inner(): > > outer.x = 42 > > Because this already means something! Hmmm, maybe x of outer = 42 Determined-to-get-an-'of'-into-the-language-somehow-ly, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jeremy at zope.com Tue Oct 21 23:08:10 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Tue Oct 21 23:15:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> Message-ID: <1066792089.19270.29.camel@localhost.localdomain> On Tue, 2003-10-21 at 20:42, Guido van Rossum wrote: > (If it wasn't clear, I'm struggling with this subject -- I think there > are good reasons for why I'm resisting your proposal, but I haven't > found them yet. The more I think about it, the less I like > 'globals.x = 42' . I think it's good that attribute assignment and variable assignment look different. An object's attributes are more dynamic than the variables in a module. I don't see much benefit to conflating two distinct concepts.
Jeremy From python at rcn.com Tue Oct 21 23:17:26 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 23:18:13 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Message-ID: <000001c3984b$052cd820$e841fea9@oemcomputer> > Neil Schemenauer wrote: > > Another nice thing is that we have tuple and dict comprehensions > > for free: > > > > tuple(x for x in S) > > dict((k, v) for k, v in S) > > Set(x for x in S) [David Eppstein] > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} > (PEP 274). -1 Let's keep just one way to do it. That construct saves a few characters just to get a little cuteness and another special case to remember and maintain. Once you have iterator expressions, you've already gotten 99% of the benefits of PEP 274. List comprehensions, on the other hand, already exist, so they *have* to be supported. Raymond From tim_one at email.msn.com Tue Oct 21 23:03:14 2003 From: tim_one at email.msn.com (Tim Peters) Date: Tue Oct 21 23:21:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: [Guido] > Urgh, we need this sorted out before Raymond can rewrite PEP 289 and > present it to c.l.py... That would be good. I don't feel a sense of urgency, though, and will be out of town the rest of the week. I sure *expect* that most generator expressions will "get consumed" immediately, at their definition site, so that there's no meaningful question to answer then (as in, e.g., the endless sum(generator_expression) examples, assuming the builtin sum). That means people have to think of plausible use cases where evaluation is delayed.
There are some good examples of lists-of-generators in test_generators.py, and I'll just note that they use the default-arg mechanism to force a particular loop-variant non-local value, or use an instance variable, and/or use lexical scoping but know darned well that the up-level binding will never change over the life of each generator. That's all the concrete stuff I have to stare at now (& recalling that the question can't be asked in Icon -- no "divorce" possible there, and no lexical nesting even if it were possible to delay generation). > ... > So, do you want *all* free variables to be passed using the > default-argument trick (even globals and builtins), or only those that > correspond to variables in the immediately outer scope, or only those > corresponding to function scopes (as opposed to globals)? All or none make sense to me, as semantic models (not ruling out that a clever implementation may take shortcuts). I'm not having a hard time imagining that "all" will be useful; I haven't yet managed to dream up a plausible use case where "none" actually helps. > n = 0 > def f(): > global n > n += 1 > return n > print list(n+f() for x in range(10)) Like I just said . There's no question that semantics can differ between "all" and "none" (and at several points between to boot). Stick a "global f" inside f() and rebind f based on the current value of n too, if you like. I'm having a hard time imagining something *useful* coming out of such tricks combined with "none". Under "all", I look at the print and think "f is f, and n is 0, and that's it". I'm not sure it's "a feature" that print [n+f() for x in range(10)] looks up n and f anew on each iteration -- if I saw a listcomp that actually relied on this, I'd be eager to avoid inheriting any of author's code. 
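The default-arg mechanism Tim mentions is easy to demonstrate. A minimal sketch in present-day syntax (the helper names are invented for illustration): a closure looks up its free variable when it runs, while a default argument freezes the value at definition time.

```python
def make_gens_late():
    """Each generator closes over the loop variable i (late lookup)."""
    gens = []
    for i in range(3):
        def gen():
            yield i * 10  # i is looked up only when the generator runs
        gens.append(gen)
    return [next(g()) for g in gens]  # i is 2 by now, so every gen sees 2

def make_gens_pinned():
    """The default-arg trick captures the value of i at definition time."""
    gens = []
    for i in range(3):
        def gen(i=i):
            yield i * 10  # i is the frozen default, not the loop variable
        gens.append(gen)
    return [next(g()) for g in gens]

print(make_gens_late())    # [20, 20, 20]
print(make_gens_pinned())  # [0, 10, 20]
```

Every generator in the first list sees the final value of i; the default-argument trick in the second gives each generator its own frozen copy, which is exactly the idiom test_generators.py relies on.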
From tjreedy at udel.edu Tue Oct 21 23:22:09 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Oct 21 23:22:16 2003 Subject: [Python-Dev] Re: closure semantics References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch><200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com... > Eek. Global statement inside flow control should be deprecated, not > abused to show that global is evil. :-) Is there any good reason to ever use globals anywhere other than as the first statement (after doc string) of a function? If not, could its usage be so restricted (like __future__ import)? > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > "for ALL modules" rather than "for THIS module", Part of my brain still thinks that, and another part has to say, 'no, just modular or mod_vars()'. > Only if they've been exposed to languages that have such globals. Like Python with __builtins__? which I think of as the true globals. Do C or Fortran count as such a source of 'infection'? > > uselessly using global in toplevel, > > Which the parser should reject. Good. The current nonrejection sometimes leads beginners astray because they think it must be doing something. While I use global/s() just fine, I still don't like the names. I decided a while ago that they must predate import, when the current module scope would have been 'global'. >[from another post] But I appreciate the argument; 'global' comes from ABC's >SHARE, but ABC doesn't have modules. Aha! Now I can use this explanation as fact instead of speculation. Terry J.
Reedy From eppstein at ics.uci.edu Tue Oct 21 23:27:05 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Oct 21 23:27:11 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <000001c3984b$052cd820$e841fea9@oemcomputer> References: <000001c3984b$052cd820$e841fea9@oemcomputer> Message-ID: <13803476.1066768024@[192.168.1.101]> On 10/21/03 11:17 PM -0400 Raymond Hettinger wrote: > -1 > > Let's keep just one way to do it. > > That construct saves a few characters just to get a little > cuteness and another special case to remember and maintain. > > Once you have iterator expressions, you've already gotten 99% of > the benefits of PEP 274. Currently, I am using expressions like pos2d = dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2])) for s in positions]) Once I have iterator expressions, I can simplify it by dropping a whole two characters (the brackets) and get an unimportant time savings. But with PEP 274, I could write pos2d = {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2]) for s in positions} Instead of five levels of nested parens+brackets, I would need only three, and each level would be a different type of paren or bracket, which I think together with the shorter overall length would contribute significantly to readability. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From jeremy at alum.mit.edu Tue Oct 21 23:00:43 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Oct 21 23:29:54 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <1066791643.19270.25.camel@localhost.localdomain> On Tue, 2003-10-21 at 18:51, Guido van Rossum wrote: > [Samuele] > > this is a bit OT and too late, but given that our closed over > > variables are read-only, I'm wondering whether, having a 2nd chance, > > using cells and following mutations in the enclosing scopes is > > really worth it, we kind of mimic Scheme and relatives but there > > outer scope variables are also rebindable. Maybe copying semantics > > not using cells for our closures would not be too insane, and people > > would not be burnt by trying things like this: > > > > for msg in msgs: > > def onClick(e): > > print msg > > panel.append(Button(msg,onClick=onClick)) > > > > which obviously doesn't do what one could expect today. OTOH as for > > general mutability, using a mutable object (list,...) would allow > > for mutability when one really need it (rarely). I think copying semantics would be too surprising. > It was done this way because not everybody agreed that closed-over > variables should be read-only, and the current semantics allow us to > make them writable (as in Scheme, I suppose?) if we can agree on a > syntax to declare an "intermediate scope" global. > > Maybe "global x in f" would work? Woo hoo. I'm happy to hear you've had a change of heart on this topic. I think a simple, declarative statement would be clearer than assigning to an attribute of a special object. 
If a special object, like __global__, existed, could you create an alias, like: surprise = __global__ surprise.x = 1 print __global__.x ? It would apparently also allow you to use a local and global variable with the same name in the same scope. That's odd, although I suppose it would be clear from context whether the local or global was intended. > def outer(): > x = 1 > def intermediate(): > x = 2 > def inner(): > global x in outer > x = 42 > inner() > print x # prints 2 > intermediate() > print x # prints 42 I would prefer to see a separate statement similar to global that meant "look for the nearest enclosing binding." Rather than specifying that you want to use x from outer, you could only say you don't want x to be local. That means you'd always get intermediate. I think this choice is more modular. If you can re-bind a non-local variable, then the name of the function where it is initially bound isn't that interesting. It would be safe, for example, to move it to another place in the function hierarchy without affecting the semantics of the program -- except that in the case of "global x in outer" you'd have to change all the referring global statements. Or would the semantics be to create a binding for x in outer, even if it didn't already exist? Jeremy From pje at telecommunity.com Tue Oct 21 22:05:55 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 23:31:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220023.h9M0Nmg25868@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> Message-ID: <5.1.0.14.0.20031021214942.026ca320@mail.telecommunity.com> At 05:23 PM 10/21/03 -0700, Guido van Rossum wrote: >Unified semantic principles. I want to be able to explain generator >expressions as a shorthand for defining and calling generator >functions. 
For a technical explanation, I would say, "any name that is not defined by the generator expression itself has the binding that was in effect for that name at the time the generator expression occurs." (Note that this statement is equally true for any other non-lambda expression.) For a non-technical explanation, I wouldn't say anything, because I don't think anybody who doesn't already have the mental model that "this is a shortcut for a generator function" is going to assume the late-binding behavior. IOW, the issue I see here is that if somebody runs into the problem, they need to learn about the free variables and closures concept in order to understand why their code is breaking. But, if it doesn't break, then why do they need to learn that? >Invoking default argument semantics makes the explanation >less clean: we would have to go through the trouble of finding all >references to free variables. Do you want globals to be passed via >default arguments as well? And what about builtins? (Note that the >compiler currently doesn't know the difference.) This sounds like "if the implementation is hard to explain" grounds, which I agree with in principle. I'm not positive it's that hard to explain, though, mainly because I don't see how anyone would *question* it in the first place. I find it hard to imagine somebody *wanting* changes to the variable bindings to affect an iterator expression, and thus the issue of why that doesn't work should be *much* rarer than the other way around. Past this point I think I'll be duplicating either my or Tim's arguments for this, so I'll leave off now.
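The rule Python eventually shipped (in 2.4, with the final PEP 289) is a hybrid of the positions argued in this thread: the outermost iterable is evaluated immediately when the generator expression is defined, while every other free variable is looked up late, when the generator runs. Samuele's example can be checked directly in present-day Python (the function names here are invented for illustration):

```python
def late_y():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # l is evaluated here; y is not
    y = 1                    # rebinding y *does* affect the generator
    return list(it)

def captured_iterable():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # the outermost iterable l is captured now
    l = [10, 20]             # rebinding l has no effect on it
    y = 1
    return list(it)

print(late_y())             # [2, 3]
print(captured_iterable())  # [2, 3] -- still iterates the original [1, 2]
```

So late_y() prints 2 and 3, not 1 and 2: the generator sees y == 1, exactly the late-binding outcome Samuele's example illustrates.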
From barry at python.org Tue Oct 21 23:34:50 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 23:35:01 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <1066791643.19270.25.camel@localhost.localdomain> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> Message-ID: <1066793689.5750.376.camel@anthem> On Tue, 2003-10-21 at 23:00, Jeremy Hylton wrote: > I would prefer to see a separate statement similar to global that meant > "look for the nearest enclosing binding." Rather than specifying that > you want to use x from outer, you could only say you don't want x to be > local. That means you'd always get intermediate. Would those "up" bindings chain? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/eaa90213/attachment.bin From jeremy at alum.mit.edu Tue Oct 21 23:46:00 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Oct 21 23:48:32 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <1066793689.5750.376.camel@anthem> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> <1066793689.5750.376.camel@anthem> Message-ID: <1066794359.19270.31.camel@localhost.localdomain> On Tue, 2003-10-21 at 23:34, Barry Warsaw wrote: > On Tue, 2003-10-21 at 23:00, Jeremy Hylton wrote: > > > I would prefer to see a separate statement similar to global that meant > > "look for the nearest enclosing binding." 
Rather than specifying that > > you want to use x from outer, you could only say you don't want x to be > > local. That means you'd always get intermediate. > > Would those "up" bindings chain? Yes. If a block had an up declaration and it contained a nested block with an up declaration for the same variable, both blocks would refer to an outer binding. Jeremy From guido at python.org Tue Oct 21 22:15:15 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:08:01 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 21:55:36 EDT." <1066787735.5750.343.camel@anthem> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> <1066787735.5750.343.camel@anthem> Message-ID: <200310220215.h9M2FFc26081@12-236-54-216.client.attbi.com> > So maybe the idea of using function attributes isn't totally nuts, if > you use a special name. E.g. outer.__locals__.x and outer.__globals__.x -1. Way too ugly. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 22:23:07 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:08:13 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 21:06:31 EDT." References: Message-ID: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Urgh, we need this sorted out before Raymond can rewrite PEP 289 and present it to c.l.py... 
> [Samuele Pedroni] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > That is what I had in mind, and that if the first assignment to "y" were > commented out, the assignment to "it" would raise UnboundLocalError. > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > > Yes, but like it if you replaced the "def gen" and the line following it > with: > > def gen(y=y, l=l): > for x in l: > yield x+y > it = gen() > > This is worth some thought. My intuition is that we *don't* want "a > closure" here. If generator expressions were reiterable, then (probably > obnoxiously) clever code could make some of use of tricking them into using > different inherited bindings on different (re)iterations. But they're > one-shot things, and divorcing the values actually used from the values in > force at the definition site sounds like nothing but trouble to me > (error-prone and surprising). They look like expressions, after all, and > after > > x = 5 > y = x**2 > x = 10 > print y > > it would be very surprising to see 100 get printed. In the rare cases > that's desirable, creating an explicit closure is clear(er): > > x = 5 > y = lambda: x**2 > x = 10 > print y() > > I expect creating a closure instead would bite hard especially when building > a list of generator expressions (one of the cases where delaying generation > of the results is easily plausible) in a loop. The loop index variable will > probably play some role (directly or indirectly) in the intended operation > of each generator expression constructed, and then you severely want *not* > for each generator expression to see "the last" value of the index vrlbl. Right. 
> For concreteness, test_generators.Queens.__init__ creates a list of rowgen() > generators, and rowgen uses the default-arg trick to give each generator a > different value for rowuses; it would be an algorithmic disaster if they all > used the same value. > > Generator expressions are too limited to do what rowgen() does (it needs to > create and undo side effects as backtracking proceeds), so it's not > perfectly relevant as-is. I *suspect* that if people work at writing > concrete use cases, though, a similar thing will hold. > > BTW, Icon can give no guidance here: in that language, the generation of a > generator's result sequence is inextricably bound to the lexical occurrence > of the generator. The question arises in Python because definition site and > generation can be divorced. So, do you want *all* free variables to be passed using the default-argument trick (even globals and builtins), or only those that correspond to variables in the immediately outer scope, or only those corresponding to function scopes (as opposed to globals)? n = 0 def f(): global n n += 1 return n print list(n+f() for x in range(10)) --Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Wed Oct 22 00:11:24 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 00:11:31 2003 Subject: [Python-Dev] Re: closure semantics References: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz> Message-ID: In article <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz>, Greg Ewing wrote: > Guido: > > > > def inner(): > > > outer.x = 42 > > > > Because this already means something! > > Hmmm, maybe > > x of outer = 42 > > Determined-to-get-an-'of'-into-the-language-somehow-ly, scope(outer).x = 42 Almost implementable now by using the inspect module to find the first matching scope, except that inspect can't change the local variable values, only look at them. 
-- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tdelaney at avaya.com Wed Oct 22 00:44:13 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Wed Oct 22 00:44:21 2003 Subject: [Python-Dev] Re: accumulator display syntax Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6A8FF@au3010avexu1.global.avaya.com> > From: David Eppstein [mailto:eppstein@ics.uci.edu] > > Once I have iterator expressions, I can simplify it by > dropping a whole two > characters (the brackets) and get an unimportant time > savings. But with > PEP 274, I could write > > pos2d = > {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*posi > tions[s][2]) > for s in positions} Don't be evil ... Tim Delaney From guido at python.org Wed Oct 22 00:48:44 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:48:53 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 23:00:43 EDT." <1066791643.19270.25.camel@localhost.localdomain> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> Message-ID: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> [Guido] > > Maybe "global x in f" would work? [Jeremy] > Woo hoo. I'm happy to hear you've had a change of heart on this topic. > I think a simple, declarative statement would be clearer than assigning > to an attribute of a special object. Right. > If a special object, like __global__, existed, could you create an > alias, like: > > surprise = __global__ > surprise.x = 1 > print __global__.x > > ? > > It would apparently also allow you to use a local and global variable > with the same name in the same scope. That's odd, although I suppose it > would be clear from context whether the local or global was intended. 
I don't care about that argument; it's no more confusing to have globals.x and x than it is to have self.x and x, and the latter happens all the time. > > def outer(): > > x = 1 > > def intermediate(): > > x = 2 > > def inner(): > > global x in outer > > x = 42 > > inner() > > print x # prints 2 > > intermediate() > > print x # prints 42 > > I would prefer to see a separate statement similar to global that meant > "look for the nearest enclosing binding." Rather than specifying that > you want to use x from outer, you could only say you don't want x to be > local. That means you'd always get intermediate. That would be fine; I think that code where you have a choice of more than one outer variable with the same name is seriously insane. An argument for naming the outer function is that explicit is better than implicit, and it might help the reader if there is more than one level; OTOH it is a pain if you decide to rename the outer function (easily caught by the parser, but creates unnecessary work). I admit that I chose this mostly because the syntax 'global x in outer' reads well and doesn't require new keywords. > I think this choice is more modular. If you can re-bind a non-local > variable, then the name of the function where it is initially bound > isn't that interesting. It would be safe, for example, to move it to > another place in the function hierarchy without affecting the semantics > of the program I'm not sure what you mean here. Move x around, or move outer around? In both cases I can easily see how the semantics *would* change, in general. > -- except that in the case of "global x in outer" you'd > have to change all the referring global statements. Yes, that's the main downside. > Or would the semantics be to create a binding for x in outer, even > if it didn't already exist?
That would be the semantics, right; just like the current global statement doesn't care whether the global variable already exists in the module or not; it will create it if necessary. But a relative global statement would be fine too; it would be an error if there's no definition of the given variable in scope. But all this is moot unless someone comes up with a way to spell this that doesn't require a new keyword or change the meaning of 'global x' even if there's an x at an intermediate scope (i.e. you can't change 'global x' to mean "search for the next outer scope that defines x"). And we still have to answer Alex's complaint that newbies misinterpret the word 'global'. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 01:00:21 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:00:38 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 23:22:09 EDT." References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch><200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <200310220500.h9M50Ln26397@12-236-54-216.client.attbi.com> > Is there any good reason to ever use globals anywhere other than as > the first statement (after doc string) of a function? If the use of the global is fairly localized, I sometimes like to have the global declaration immediately precede the first use, assuming all other uses are in the same indented block. (This means that I sometimes *do* have global inside flow control, but then all uses are also inside the same branch.) But I'm not sure this is a *good* reason. > If not, could its usage be so restricted (like __future__ import)? This would break way too much stuff. It would have been a good idea for 0.1.
But then I was trying to keep the grammar small while keeping syntactic checks out of the compilation phase if at all possible, and I thought "screw it -- if import can go anywhere, so can global." > > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > > "for ALL modules" rather than "for THIS module", > > Part of my brain still thinks that, and another part has to say, > 'no, just modular or mod_vars()'. > > > Only if they've been exposed to languages that have such globals. > > Like Python with __builtins__? which I think of as the true globals. Hardly, since they aren't normally thought of as variables. > Do C or Fortran count as such a source of 'infection'? C, definitely -- it has the concept and the terminology. In Fortran, it's called common blocks (similar in idea to ABC's SHARE). > > > uselessly using global in toplevel, > > > > Which the parser should reject. > > Good. The current nonrejection sometimes leads beginners astray > because they think it must be doing something. Just like x + 1 I suppose. I'm sure PyChecker catches this. > While I use global/s() just fine, I still don't like the names. I > decided awhile ago that they must predate import, when the current > module scoop would have been 'global'. No, they were both there from day one. Frankly, I don't think in this case newbie confusion is enough of a reason to switch from global to some other keyword or mechanism. Yes, this means I'm retracting my support for Alex's "replace-global-with-attribute-assignment" proposal -- Jeremy's objection made me realize why I don't like it much. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 01:02:56 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:03:11 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 03:27:14 +0200."
<5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> Message-ID: <200310220503.h9M52um26419@12-236-54-216.client.attbi.com> [Samuele] > . suggests runtime, for compile time then maybe Right, that's what I don't like about it. > global::x=42 > module::x=42 > > outer::x=42 > > (I don't like those, and personally I don't see the need to get rebinding > for closed-over variables but anyway) I don't like these either. > another possibility is that today is a syntax error, so maybe > > global x = 42 or > module x = 42 > > they would not be statements, this for symmetry would also be legal: > > y = module x + 1 > > then > > outer x = 42 > > and also > > y = g x + 1 > > the problems are also clear, in some other languages x y is function > application, etc.. Juxtaposition of names opens a whole lot of cans of worms -- for one, it makes many more typos pass the parser. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Wed Oct 22 01:16:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 01:20:07 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Message-ID: <200310220516.h9M5GXV11525@oma.cosc.canterbury.ac.nz> David Eppstein : > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} If you have *that*, as well as generator expressions, someone is going to want k:v for k,v in S as a bare expression to be some sort of generator. What exactly it would generate isn't clear... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From python at rcn.com Wed Oct 22 01:19:58 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 01:20:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> Message-ID: <000a01c3985c$2345dc60$e841fea9@oemcomputer> [Raymond] > > If there is any doubt on that score, I would be happy to update > > the PEP to match the current proposal for iterator expressions > > and solicit more community feedback. [Guido] > Wonderful! Rename PEP 289 to "generator expressions" and change the > contents to match this proposal. Thanks for being the fall guy! Here is a rough draft of the resurrected PEP. I'm sure it contains many flaws and I welcome suggested amendments. In particular, the following needs attention:

* Precise specification of the syntax, including the edge cases with commas where enclosing parentheses are required.
* Making sure the acknowledgements are correct and complete.
* Verifying my understanding of the issues surrounding late binding, modification of locals, and returning generator expressions.
* Clear articulation of the expected benefits. There are so many, it was difficult to keep it focused.

Raymond Hettinger

----------------------------------------------------------------------

PEP: 289
Title: Generator Expressions
Version: $Revision: 1.2 $
Last-Modified: $Date: 2003/08/30 23:57:36 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003

Abstract

    This PEP introduces generator expressions as a high performance, memory efficient generalization of list comprehensions and generators.

Rationale

    Experience with list comprehensions has shown their wide-spread utility throughout Python. However, many of the use cases do not need to have a full list created in memory.
    Instead, they only need to iterate over the elements one at a time. For instance, the following dictionary constructor code will build a full item list in memory, iterate over that item list, and, when the reference is no longer needed, delete the list:

        d = dict([(k, func(k)) for k in keylist])

    Time, clarity, and memory are conserved by using a generator expression instead:

        d = dict((k, func(k)) for k in keylist)

    Similar benefits are conferred on the constructors for other container objects:

        s = Set(word for line in page for word in line.split())

    Having a syntax similar to list comprehensions makes it easy to switch to a generator expression when scaling up an application. Generator expressions are especially useful in functions that reduce an iterable input to a single value:

        sum(len(line) for line in file if len(line) > 5)

    Accordingly, generator expressions are expected to partially eliminate the need for reduce(), which is notorious for its lack of clarity. And, there are additional speed and clarity benefits from writing expressions directly instead of using lambda. List comprehensions greatly reduced the need for filter() and map(). Likewise, generator expressions are expected to minimize the need for itertools.ifilter() and itertools.imap(). In contrast, the utility of other itertools will be enhanced by generator expressions:

        dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

BDFL Pronouncements

    The previous version of this PEP was REJECTED. The bracketed yield syntax left something to be desired; the performance gains had not been demonstrated; and the range of use cases had not been shown. After much discussion on the python-dev list, the PEP has been resurrected in its present form. The impetus for the discussion was an innovative proposal from Peter Norvig.
The Gory Details

    1) In order to achieve a performance gain, generator expressions need to be run in the local stackframe; otherwise, the improvement in cache performance gets offset by the time spent switching stackframes. The upshot of this is that generator expressions need to be both created and consumed within the context of a single stackframe. Accordingly, the generator expression cannot be returned to another function:

        return (k, func(k)) for k in keylist

    2) The loop variable is not exposed to the surrounding function. This both facilitates the implementation and makes typical use cases more reliable. In some future version of Python, list comprehensions will also hide the induction variable from the surrounding code (and, in Py2.4, warnings will be issued for code accessing the induction variable).

    3) Variable references in generator expressions will exhibit late binding just like other Python code. In the following example, the iterator runs *after* the value of y is set to one:

        def h():
            y = 0
            l = [1,2]
            def gen(S):
                for x in S:
                    yield x+y
            it = gen(l)
            y = 1
            for v in it:
                print v

    4) List comprehensions will remain unchanged. So, [x for x in S] is a list comprehension and [(x for x in S)] is a list containing one generator expression.

    5) It is prohibited to use locals() for other than read-only use in generator expressions. This simplifies the implementation and precludes a certain class of obfuscated code.

Acknowledgements:

    Peter Norvig resurrected the discussion with his proposal for "accumulation displays". Alex Martelli provided critical measurements that proved the performance benefits of generator expressions. Samuele Pedroni provided the example of late binding. Guido van Rossum suggested the bracket-free, yield-free syntax. Raymond Hettinger first proposed "generator comprehensions" in January 2002.
References [1] PEP 255 Simple Generators http://python.sourceforge.net/peps/pep-0255.html [2] PEP 202 List Comprehensions http://python.sourceforge.net/peps/pep-0202.html [3] Peter Norvig's Accumulation Display Proposal http:///www.norvig.com/pyacc.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil fill-column: 70 End: From guido at python.org Wed Oct 22 01:27:42 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:27:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:03:14 EDT." References: Message-ID: <200310220527.h9M5Rgr26465@12-236-54-216.client.attbi.com> [Tim] > I'm not sure it's "a feature" that > > print [n+f() for x in range(10)] > > looks up n and f anew on each iteration -- if I saw a listcomp that > actually relied on this, I'd be eager to avoid inheriting any of > author's code. It's just a direct consequence of Python's general rule for name lookup in all contexts: variables are looked up when used, not before. (Note: lookup is different from scope determination, which is done mostly at compile time. Scope determination tells you where to look; lookup gives you the actual value of that location.) If n is a global and calling f() changes n, f()+n differs from n+f(), and both are well-defined due to the left-to-right rule. That's not good or bad, that's just *how it is*. Despite having some downsides, the simplicity of the rule is good; I'm sure we could come up with downsides of other rules too. Despite the good case that's been made for what would be most useful, I'm loathe to drop the evaluation rule for convenience in one special case. Next people may argue that in Python 3.0 lambda should also do this; arguably it's more useful than the current semantics there too. And then what next -- maybe all nested functions should copy their free variables? 
Oh, and then maybe outermost functions should copy their globals into locals too -- that will speed up a lot of code. :-) There are other places in Python where some rule is applied to "all free variables of a given piece of code" (the distinction between locals and non-locals in functions is made this way). But there are no other places where implicit local *copies* of all those free variables are taken. I'd need to find a unifying principle to warrant doing that beyond utility. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Wed Oct 22 01:38:26 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 01:38:39 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: <200310220538.h9M5cQn11555@oma.cosc.canterbury.ac.nz> Guido van Rossum : > So, do you want *all* free variables to be passed using the > default-argument trick (even globals and builtins), or only those that > correspond to variables in the immediately outer scope, or only those > corresponding to function scopes (as opposed to globals)? And what about foo = (f(x) for x in stuff) def f(x): ... for blarg in foo: ... ? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Wed Oct 22 02:02:28 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 02:02:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 01:19:58 EDT." <000a01c3985c$2345dc60$e841fea9@oemcomputer> References: <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> > Here is a rough draft on the resurrected PEP. Thanks -- that was quick! 
> I'm sure it contains many flaws and I welcome suggested amendments. > In particular, the follow needs attention: > > * Precise specification of the syntax including the edge cases > with commas where enclosing parentheses are required. > > * Making sure the acknowledgements are correct and complete. > > * Verifying my understanding of the issues surrounding late binding, > modification of locals, and returning generator expressions. > > * Clear articulation of the expected benefits. There are so many, > it was difficult to keep it focused. > > > Raymond Hettinger > > ---------------------------------------------------------------------- > > PEP: 289 > Title: Generator Expressions > Version: $Revision: 1.2 $ > Last-Modified: $Date: 2003/08/30 23:57:36 $ > Author: python@rcn.com (Raymond D. Hettinger) > Status: Active > Type: Standards Track > Created: 30-Jan-2002 > Python-Version: 2.3 > Post-History: 22-Oct-2003 > > > Abstract > > This PEP introduces generator expressions as a high performance, > memory efficient generalization of list expressions and > generators. Um, please change "list expressions" back to "list comprehensions" everywhere. Global substitute gone awry? :-) > Rationale > > Experience with list expressions has shown their wide-spread > utility throughout Python. However, many of the use cases do > not need to have a full list created in memory. Instead, they > only need to iterate over the elements one at a time. 
> > For instance, the following dictionary constructor code will > build a full item list in memory, iterate over that item list, > and, when the reference is no longer needed, delete the list: > > d = dict([(k, func(v)) for k in keylist]) I'd prefer to use the example sum([x*x for x in range(10)]) > Time, clarity, and memory are conserved by using an generator > expession instead: > > d = dict((k, func(v)) for k in keylist) which becomes sum(x*x for x in range(10)) (I find the dict constructor example sub-optimal because it starts with two parentheses, and visually finding the match for the second of those is further complicated by the use of func(v) for the value.) > Similar benefits are conferred on the constructors for other > container objects: (Here you can use the dict constructor example.) > s = Set(word for line in page for word in line.split()) > > Having a syntax similar to list comprehensions makes it easy to > switch to an iterator expression when scaling up application. ^^^^^^^^ generator > Generator expressions are especially useful in functions that reduce > an iterable input to a single value: > > sum(len(line) for line.strip() in file if len(line)>5) ^^^^^^^^^^^^ That's not valid syntax; my example was something like sum(len(line) for line in file if line.strip()) > Accordingly, generator expressions are expected to partially > eliminate the need for reduce() which is notorious for its lack > of clarity. And, there are additional speed and clarity benefits > from writing expressions directly instead of using lambda. > > List expressions greatly reduced the need for filter() and > map(). Likewise, generator expressions are expected to minimize > the need for itertools.ifilter() and itertools.imap(). In > contrast, the utility of other itertools will be enhanced by > generator expressions: > > dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector)) > > > BDFL Pronouncements > > The previous version of this PEP was REJECTED. 
The bracketed > yield syntax left something to be desired; the performance gains > had not been demonstrated; and the range of use cases had not > been shown. After, much discussion on the python-dev list, the > PEP has been resurrected its present form. The impetus for the > discussion was an innovative proposal from Peter Norvig. > > > The Gory Details > > 1) In order to achieve a performance gain, generator expressions need > to be run in the local stackframe; otherwise, the improvement in > cache performance gets offset by the time spent switching > stackframes. The upshot of this is that generator expressions > need to be both created and consumed within the context of a > single stackframe. Accordingly, the generator expression cannot > be returned to another function: > > return (k, func(v)) for k in keylist Heh? Did you keep this from the old PEP? Performance tests show that a generator function is already faster than a list comprehension, and the semantics are now defined as equivalent to creating an anonymous generator function and calling it. (There's still discussion about whether that generator function should copy the current value of all free variables into default arguments.) We need a Gory Detail item explaining the exact syntax. I propose that a generator expression always needs to be inside a set of parentheses and cannot have a comma on either side. Unfortunately this is different from list comprehensions; while [1, x for x in R] is illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)]. With reference to the file Grammar/Grammar in CVS, I think these changes are suitable: (1) The rule atom: '(' [testlist] ')' changes to atom: '(' [listmaker1] ')' where listmaker1 is almost the same as listmaker, but only allows a single test after 'for' ... 'in'. (2) The rule for arglist is similarly changed so that it can be either a bunch of arguments possibly followed by *xxx and/or **xxx, or a single generator expression. 
This is even hairier, so I'm not going to present the exact changes here; I'm confident that it can be done though using the same kind of breakdown as used for listmaker. Yes, maybe the compiler may have to work a little harder to distinguish all the cases. :-) > 2) The loop variable is not exposed to the surrounding function. > This both facilates the implementation and makes typical use > cases more reliable. In some future version of Python, list > comprehensions will also hide the induction variable from the > surrounding code (and, in Py2.4, warnings will be issued for > code accessing the induction variable). > > 3) Variables references in the generator expressions will > exhibit late binding just like other Python code. In the > following example, the iterator runs *after* the value of y is > set to one: > > def h(): > y = 0 > l = [1,2] > def gen(S): > for x in S: > yield x+y > it = gen(l) > y = 1 > for v in it: > print v There is still discussion about this one. > 4) List comprehensions will remain unchanged. > So, [x for x in S] is a list comprehension and > [(x for x in S)] is a list containing one generator expression. > > 5) It is prohibited to use locals() for other than read-only use > in generator expressions. This simplifies the implementation and > precludes a certain class of obfuscated code. I wouldn't mention this. assigning into locals() has an undefined effect anyway. > Acknowledgements: > > Peter Norvig resurrected the discussion proposal for "accumulation > displays". Can you do inline URLs in the final version? Maybe an opportunity to learn reST. :-) Or else at least add [3] to the text. > Alex Martelli provided critical measurements that proved the > the performance benefits of generator expressions. And also argued with great force that this was a useful thing to have (as have several others). > Samuele Pedroni provided the example of late binding. (But he wanted generator expressions *not* to use late binding!) 
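Samuele's late-binding example, quoted flattened above, laid out as runnable code (the printed values are collected into a list here so the result is easy to inspect; the behavior shown is exactly the late binding the draft describes):

```python
def h():
    y = 0
    l = [1, 2]
    def gen(S):
        for x in S:
            yield x + y   # y is looked up each time the generator runs
    it = gen(l)
    y = 1                 # y is rebound *before* iteration begins
    return list(it)

# Late binding means the generator sees y == 1, so h() returns [2, 3],
# not [1, 2].
```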
> Guido van Rossum suggested the bracket free, yield free syntax. I don't need credits, and I wouldn't be surprised if someone else had suggested it first. > Raymond Hettinger first proposed "generator comprehensions" in > January 2002. Phillip Eby suggested "iterator expressions" as the name and subsequently Tim Peters suggested "generator expressions". > References > > [1] PEP 255 Simple Generators > http://python.sourceforge.net/peps/pep-0255.html > > [2] PEP 202 List Comprehensions > http://python.sourceforge.net/peps/pep-0202.html > > [3] Peter Norvig's Accumulation Display Proposal > http:///www.norvig.com/pyacc.html I'd point to the thread in python-dev too. BTW I think the idea of having some iterators support __copy__ as a way to indicate they can be cloned is also PEPpable; we've pretty much reached closure on that one. PEP 1 explains how to get a PEP number. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Wed Oct 22 03:31:31 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 03:31:59 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <7k2ywden.fsf@yahoo.co.uk> (Paul Moore's message of "Tue, 21 Oct 2003 21:20:48 +0100") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> Message-ID: <65ihlodo.fsf@python.net> [Thomas] >>> After installing MSVC6 on a win98 machine, where I could rename >>> wsock32.dll away (which was not possible on XP due to file system >>> protection), I was able to change socketmodule.c to use delay loading of >>> the winsock dll. I had to wrap up the WSAStartup() call inside a >>> __try {} __except {} block to catch the exception thrown. >>> >>> With this change, _socket (and maybe also select) could then also be >>> converted into builtin modules. >>> >>> Guido, what do you think? 
>> [Guido] >> I think now is a good time to try this in 2.4. I don't think I'd want >> to do this (or any of the proposed reorgs) in 2.3 though. > [Paul] > One (very mild) point - this is highly MSVC-specific. I don't know if > there is ever going to be any interest in (for example) getting Python > to build with Mingw/gcc on Windows, but there's no equivalent of this > in Mingw (indeed, Mingw doesn't, as far as I know, support > __try/__except either). The whole delayload/__try/__except stuff may be unneeded in 2.4, because it will most probably be compiled with MSVC7.1, installed via an msi installer, and all systems where the msi actually could be installed would already have a winsock (or winsock2) dll. At least that is my impression on what I hear about systems older than (or including?) win98SE these days. Thomas From python at rcn.com Wed Oct 22 03:57:45 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 03:58:33 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> Message-ID: <000001c39872$2dece760$e841fea9@oemcomputer> Guido, thanks for the quick edits of the first draft. Here is a link to the second: http://users.rcn.com/python/download/pep-0289.html The reST version is attached. [Guido] > BTW I think the idea of having some iterators support __copy__ as a > way to indicate they can be cloned is also PEPpable; we've pretty much > reached closure on that one. PEP 1 explains how to get a PEP number. That one sounds like a job for Alex. Raymond Hettinger ------------------------------------------------------------------ PEP: 289 Title: Generator Expressions Version: $Revision: 1.3 $ Last-Modified: $Date: 2003/08/30 23:57:36 $ Author: python@rcn.com (Raymond D. 
Hettinger)
Status: Active
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003


Abstract
========

This PEP introduces generator expressions as a high performance,
memory efficient generalization of list comprehensions [1]_ and
generators [2]_.


Rationale
=========

Experience with list comprehensions has shown their wide-spread
utility throughout Python.  However, many of the use cases do not
need to have a full list created in memory.  Instead, they only
need to iterate over the elements one at a time.

For instance, the following summation code will build a full list of
squares in memory, iterate over those values, and, when the reference
is no longer needed, delete the list::

    sum([x*x for x in range(10)])

Time, clarity, and memory are conserved by using a generator
expression instead::

    sum(x*x for x in range(10))

Similar benefits are conferred on constructors for container
objects::

    s = Set(word for line in page for word in line.split())
    d = dict( (k, func(v)) for k in keylist)

Generator expressions are especially useful in functions that reduce
an iterable input to a single value::

    sum(len(line) for line in file if line.strip())

Accordingly, generator expressions are expected to partially
eliminate the need for reduce() which is notorious for its lack
of clarity.  And, there are additional speed and clarity benefits
from writing expressions directly instead of using lambda.

List comprehensions greatly reduced the need for filter() and
map().  Likewise, generator expressions are expected to minimize
the need for itertools.ifilter() and itertools.imap().  In
contrast, the utility of other itertools will be enhanced by
generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

Having a syntax similar to list comprehensions also makes it easy
to convert existing code into a generator expression when scaling
up an application.
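As a side note, the memory contrast in the rationale can be observed directly using the generator-function form that the PEP defines generator expressions in terms of. This is an illustrative sketch, not part of the draft; the names are invented:

```python
import sys

# A list comprehension materializes every element up front:
squares = [x * x for x in range(1000)]
list_bytes = sys.getsizeof(squares)           # grows with the input size

# The generator-function equivalent holds only the iteration state:
def gen_squares(n):
    for x in range(n):
        yield x * x

gen_bytes = sys.getsizeof(gen_squares(1000))  # small, fixed overhead
total = sum(gen_squares(1000))                # same result, no list built
```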
BDFL Pronouncements
===================

The previous version of this PEP was REJECTED.  The bracketed
yield syntax left something to be desired; the performance gains
had not been demonstrated; and the range of use cases had not
been shown.  After much discussion on the python-dev list, the
PEP has been resurrected in its present form.  The impetus for
the discussion was an innovative proposal from Peter Norvig [3]_.


The Gory Details
================

1. The semantics of a generator expression are equivalent to
   creating an anonymous generator function and calling it.
   There's still discussion about whether that generator function
   should copy the current value of all free variables into
   default arguments.

2. The syntax requires that a generator expression always needs
   to be inside a set of parentheses and cannot have a comma on
   either side.  Unfortunately, this is different from list
   comprehensions.  While [1, x for x in R] is illegal,
   [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)].

   With reference to the file Grammar/Grammar in CVS, two rules
   change:

   a) The rule::

        atom: '(' [testlist] ')'

      changes to::

        atom: '(' [listmaker1] ')'

      where listmaker1 is almost the same as listmaker, but only
      allows a single test after 'for' ... 'in'.

   b) The rule for arglist needs similar changes.

3. The loop variable is not exposed to the surrounding function.
   This facilitates the implementation and makes typical use cases
   more reliable.  In some future version of Python, list
   comprehensions will also hide the induction variable from the
   surrounding code (and, in Py2.4, warnings will be issued for
   code accessing the induction variable).

4. There is still discussion about whether variables referenced
   in generator expressions will exhibit late binding just like
   other Python code.  In the following example, the iterator runs
   *after* the value of y is set to one::

       def h():
           y = 0
           l = [1,2]
           def gen(S):
               for x in S:
                   yield x+y
           it = gen(l)
           y = 1
           for v in it:
               print v

5.
List comprehensions will remain unchanged::

       [x for x in S]    # This is a list comprehension.
       [(x for x in S)]  # This is a list containing one generator
                         # expression.


Acknowledgements
================

* Raymond Hettinger first proposed the idea of "generator
  comprehensions" in January 2002.

* Peter Norvig resurrected the discussion in his proposal for
  Accumulation Displays [3]_.

* Alex Martelli provided critical measurements that proved the
  performance benefits of generator expressions.  He also provided
  strong arguments that they were a desirable thing to have.

* Phillip Eby suggested "iterator expressions" as the name.

* Subsequently, Tim Peters suggested the name "generator
  expressions".

* Samuele Pedroni argued against late binding and provided the
  example shown above.


References
==========

.. [1] PEP 202 List Comprehensions
       http://python.sourceforge.net/peps/pep-0202.html

.. [2] PEP 255 Simple Generators
       http://python.sourceforge.net/peps/pep-0255.html

.. [3] Peter Norvig's Accumulation Display Proposal
       http://www.norvig.com/pyacc.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

From fincher.8 at osu.edu  Wed Oct 22 05:45:48 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Wed Oct 22 04:47:25 2003
Subject: [Python-Dev] PEP 289: Generator Expressions (second draft)
In-Reply-To: <000001c39872$2dece760$e841fea9@oemcomputer>
References: <000001c39872$2dece760$e841fea9@oemcomputer>
Message-ID: <200310220545.49319.fincher.8@osu.edu>

On Wednesday 22 October 2003 03:57 am, Raymond Hettinger wrote:
> Accordingly, generator expressions are expected to partially eliminate
> the need for reduce() which is notorious for its lack of clarity.  And,
> there are additional speed and clarity benefits from writing expressions
> directly instead of using lambda.
I probably missed it in this monster of a thread, but how do generator
expressions do this?  It seems that they'd only make reduce more
efficient, but it would still be just as needed as before.

Jeremy

From mwh at python.net  Wed Oct 22 07:03:18 2003
From: mwh at python.net (Michael Hudson)
Date: Wed Oct 22 07:03:21 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> (Greg
	Ewing's message of "Wed, 22 Oct 2003 12:49:48 +1300 (NZDT)")
References: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz>
Message-ID: <2moew9o7pl.fsf@starship.python.net>

Greg Ewing writes:

> Michael Hudson :
>
>> In particular what happens if the iteration variable is a local in the
>> frame anyway?  I presume that would inhibit the renaming
>
> Why?

Well, because then you have the same name for two different bindings.

>> but then code like
>>
>>   def f(x):
>>       r = [x+1 for x in range(x)]
>>       return r, x
>>
>> becomes even more incomprehensible (and changes in behaviour).
>
> Anyone who writes code like that *deserves* to have the
> behaviour changed on them!

This was not my impression of the Python way.  I know I'd be pretty
pissed if this broke my app.  I have no objection to breaking the
above code, just to breaking it silently!  Having code *silently
change in behaviour* (not die with an exception, post a warning at
compile time or fail to compile at all) is about as evil a change as
it's possible to contemplate, IMO.

> If this is really a worry, an alternative would be to
> simply forbid using a name for the loop variable that's
> used for anything else outside the loop.  That could
> break existing code too, but at least it would break
> it in a very obvious way by making it fail to compile.

This would be infinitely preferable!

Cheers,
mwh

-- 
  I like silliness in a MP skit, but not in my APIs.
:-) -- Guido van Rossum, python-dev From gmccaughan at synaptics-uk.com Wed Oct 22 07:15:27 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Wed Oct 22 07:16:10 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <200310221215.27570.gmccaughan@synaptics-uk.com> Tim Peters wrote: > "Set comprehensions" in a programming language originated with SETL, > and are named in honor of the set-theoretic Axiom of Comprehension > (Aussonderungsaxiom). In its well-behaved form, that says roughly that > given a set X, then for any predicate P(x), there exists a subset of X whose > elements consist of exactly those elements x of X for which P(x) is true (in > its ill-behaved form, it leads directly to Russell's Paradox -- the set of > all sets that don't contain themselves). "Aussonderungsaxiom" is the axiom of *separation*[1], which is a weakened version of the (disastrous) axiom of *comprehension*. In terms of Python's listcomps, comprehension would be [x if P(x)] and separation [x for x in S if P(x)]. So we should be calling them "list separations", really :-). [1] Hence the name; compare English "sunder". For the record, I like "generator expressions" too, or "iterator expressions". -- Gareth McCaughan From pedronis at bluewin.ch Wed Oct 22 08:12:35 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Wed Oct 22 08:10:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> References: <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <5.2.1.1.0.20031022140258.027b3d80@pop.bluewin.ch> At 23:02 21.10.2003 -0700, Guido van Rossum wrote: > > Samuele Pedroni provided the example of late binding. > >(But he wanted generator expressions *not* to use late binding!) > > to be honest no, I was just arguing for coherent behavior between generator expressions and closures, Tim and Phillip J. Eby argued (are arguing) against late binding. 
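The closure behavior Samuele wants kept coherent with generator expressions is the familiar late-binding one; a minimal sketch (names invented for illustration):

```python
def make_adders():
    adders = []
    for i in range(3):
        # i is a free variable here: it is looked up at call time,
        # not captured at the moment the lambda is created.
        adders.append(lambda x: x + i)
    return adders

# All three closures see the final value of i (2), not 0, 1, 2:
results = [f(10) for f in make_adders()]   # [12, 12, 12]
```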
It is true that successively, in an OT way, I mildly proposed
non-late-binding semantics, but for _all_ closures wrt free variables
apart from globals.  But I got that a fraction of people still would
like rebinding support for closed-over vars (something I don't miss
personally), and there are subtle issues wrt recursive references,
which while solvable would make the semantics rather DWIMish, not a
good thing.

Samuele.

From skip at pobox.com  Wed Oct 22 08:42:08 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 08:42:17 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com>
References: <200310212316.22749.aleaxit@yahoo.com>
	<200310212211.h9LMBH925278@12-236-54-216.client.attbi.com>
	<200310220105.08017.aleaxit@yahoo.com>
	<200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com>
Message-ID: <16278.31520.908436.64862@montanaro.dyndns.org>

    Guido> Raymond is going to give PEP 289 an overhaul.

Since you rejected PEP 289 at one point, it might be worth having a
short explanation of why you've changed your mind.

Skip

From skip at pobox.com  Wed Oct 22 08:48:52 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 08:49:04 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <20031021233910.GA2091@mems-exchange.org>
References: <20031021233910.GA2091@mems-exchange.org>
Message-ID: <16278.31924.243308.981142@montanaro.dyndns.org>

    Neil> Another nice thing is that we have tuple and dict comprehensions
    Neil> for free:

    Neil> tuple(x for x in S)
    Neil> dict((k, v) for k, v in S)
    Neil> Set(x for x in S)

    Neil> Aside from the bit of syntactic sugar, everything is nice and
    Neil> regular.

Maybe in 3.0 the syntactic sugar for list comprehensions should
disappear then.
Skip From ncoghlan at iinet.net.au Wed Oct 22 09:02:52 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Oct 22 09:02:56 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310211841.45711.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> Message-ID: <3F967FFC.6040507@iinet.net.au> Alex Martelli strung bits together to say: > I don't think we should encourage that sort of thing with the "implicit > assignment" in accumulation. > > So, if it's an accumulation syntax we're going for, I'd much rather find > ways to express whether we want [a] no assignment at all (as e.g for > union_update), [b] plain assignment, [c] augmented assignment such > as += or whatever. Sorry, no good idea comes to my mind now, but > I _do_ think we'd want all three possibilities... I had a similar thought about 5 minutes after turning my computer off last night. The alternative I came up with was: y = (from result = 0.0 do result += x**2 for x in values if x > 0) The two extra clauses (from & do) are pretty much unavoidable if we want to be able to express both the starting point, and the method of accumulation. And hopefully those clauses would be enough to disambiguate this from the new syntax for generator expressions. The 'from' clause would allow a single plain assignment statement. It names the accumulation variable, and also gives it an initial value (if you don't want an initial value, an explicit assignment to None should suffice) The 'do' clause would allow single plain or augmented assignment statements, as well as allowing any expression. 'from' is already a keyword (thanks to 'from ... import ...') and it might be possible to avoid making 'do' a keyword (in the same way that 'as' is not a keyword despite its use in 'from ... import ... 
as ...')

(And I'll add my vote to pointing out that generator expressions don't
magically eliminate the use of the reduce function or accumulation
loops any more than list comprehensions did.  We still need the
ability to express the starting value and the accumulation method).

Cheers,
Nick.

P.S. I'm heading off to Canberra early tomorrow morning, so I won't be
catching up on this discussion until the weekend.

-- 
Nick Coghlan           |  Brisbane, Australia
ICQ#: 68854767         |  ncoghlan@email.com
Mobile: 0409 573 268   |  http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From skip at pobox.com  Wed Oct 22 09:09:48 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 09:09:57 2003
Subject: [Python-Dev] Time for py3k@python.org or a Py3K Wiki?
Message-ID: <16278.33180.5190.95094@montanaro.dyndns.org>

These various discussions are moving along a bit too rapidly for me to
keep up.  We have been discussing language issues which are going to
impact Python 3.0, either by deprecating current language constructs
which can't be eliminated until then (e.g., the global statement) or
by tossing around language construct ideas which will have to wait
until then for their implementation (other mechanisms for variable
access in outer scopes).  Unfortunately, I'm afraid these things are
going to get lost in the sea of other python-dev topics and be
forgotten about when the time is ripe.

Maybe this would be a good time to create a py3k@python.org mailing
list with more restrictions than python-dev (posting by members only?
membership by invitation?) so we can more easily separate these ideas
from shorter term issues and keep track of them in a separate Mailman
archive.  I'd suggest starting a Wiki, but that seems a bit too
"global".  You can restrict Wiki mods in MoinMoin to users who are
logged in, but I'm not sure you can restrict signups very well.
I also think Guido wants to make a significant leap on his own at
Python 3.0, but that is going to require a considerable amount of
uninterrupted full-time available for that effort.  Given that my
21-year old only recently fled the nest and my 20-year old keeps
returning, I'd say Guido's going to have to wait for quite awhile for
the "uninterrupted" qualifier to become unconditionally true. ;-)  In
the meantime, a mailing list archive or Wiki would provide a good
place to keep notes which he could refer to.

Skip

From skip at pobox.com  Wed Oct 22 09:15:16 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 09:15:24 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com>
References: <200310220121.52789.aleaxit@yahoo.com>
	<200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com>
	<200310220158.21389.aleaxit@yahoo.com>
	<200310220042.h9M0g5225903@12-236-54-216.client.attbi.com>
Message-ID: <16278.33508.677499.127119@montanaro.dyndns.org>

    Guido> (If it wasn't clear, I'm struggling with this subject -- I think
    Guido> there are good reasons for why I'm resisting your proposal, but I
    Guido> haven't found them yet.  The more I think about it, the less I
    Guido> like 'globals.x = 42'.

How about __.x = 42 ?

Skip

From ark-mlist at att.net  Wed Oct 22 09:24:15 2003
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed Oct 22 09:22:03 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310212033.h9LKXDk24952@12-236-54-216.client.attbi.com>
Message-ID: <009401c3989f$cb269030$6402a8c0@arkdesktop>

> I thought we already established before that attempting to guess which
> parts of a generator function to copy and which parts to share is
> hopeless.  generator-made iterators won't be __copy__-able, period.

> I think this is the weakness of this cloning business, because it
> either makes generators second-class iterators, or it makes cloning a
> precarious thing to attempt when generators are used.
(You can make a > non-cloneable iterator cloneable by wrapping it into something that > buffers just those items that are still reachable by clones, but this > can still require arbitrary amounts of buffer space. However, the buffering can be done in a way that uses only as much buffer space as is truly needed. Just maintain the buffer as a singly linked list in which new elements are inserted at the *tail* of the list. Then whenever the head becomes unreachable (e.g. because no iterators refer to it), it will be garbage collected. From skip at pobox.com Wed Oct 22 09:26:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Wed Oct 22 09:26:19 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <13803476.1066768024@[192.168.1.101]> References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> Message-ID: <16278.34154.245725.959203@montanaro.dyndns.org> >>>>> "David" == David Eppstein writes: David> Currently, I am using expressions like David> pos2d = David> dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s David> ][2])) David> for s in positions]) which I would have written something like pos2d = dict([(s,(positions[s][0]+dx*positions[s][2], positions[s][1]+dy*positions[s][2])) for s in positions]) so that I could see the relationship between the two tuple elements. [ skipping the avoidance of listcomp syntactic sugar ] David> But with PEP 274, I could write David> pos2d = David> {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2]) David> for s in positions} David> Instead of five levels of nested parens+brackets, I would need David> only three, and each level would be a different type of paren or David> bracket, which I think together with the shorter overall length David> would contribute significantly to readability.
which I would still find unreadable and would recast in a more obvious (to me) way as pos2d = {s: (positions[s][0]+dx*positions[s][2], positions[s][1]+dy*positions[s][2]) for s in positions} The extra characters required today are less of a problem if the expression is laid out sensibly. Skip From aahz at pythoncraft.com Wed Oct 22 09:49:13 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 22 09:49:18 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> Message-ID: <20031022134913.GA21755@panix.com> On Tue, Oct 21, 2003, Guido van Rossum wrote: > > If you're talking about making > > x = None > for x in R: pass > print x # last item of R > > illegal, forget it. That's too darn useful. Not illegal, but perhaps for 3.0 we should consider making that print display "None". The question is to what extent Python should continue having unified semantics across constructs. While I agree that listcomps should definitely have a local scope ("expressions should not have side-effects"), I think that there would be advantages to the control variable in a for loop also having local scope that are magnified by having compatible semantics between listcomps and for loops. In other words, consider x = None [x for x in R] print x Why should the two behave differently? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From guido at python.org Wed Oct 22 10:42:26 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 10:42:38 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: Your message of "Wed, 22 Oct 2003 09:31:31 +0200." 
<65ihlodo.fsf@python.net> References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> > The whole delayload/__try/__except stuff may be unneeded in 2.4, because > it will most probably be compiled with MSVC7.1, installed via an msi > installer, Is anyone working on that? I have the VC7.1 compiler too, but haven't tried to use it yet. Maybe someone should check in a project (separate from the VC6 project, so people don't *have to* switch yet)? Are the tools needed to build an MSI installer included in VC7.1? If not, are they a free download? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 11:02:09 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:02:15 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Wed, 22 Oct 2003 09:24:15 EDT." <009401c3989f$cb269030$6402a8c0@arkdesktop> References: <009401c3989f$cb269030$6402a8c0@arkdesktop> Message-ID: <200310221502.h9MF29U27337@12-236-54-216.client.attbi.com> > > I thought we already established before that attempting to guess which > > parts of a generator function to copy and which parts to share is > > hopeless. generator-made iterators won't be __copy__-able, period. > > > I think this is the weakness of this cloning business, because it > > either makes generators second-class iterators, or it makes cloning a > > precarious thing to attempt when generators are used. (You can make a > > non-cloneable iterator cloneable by wrapping it into something that > > buffers just those items that are still reachable by clones, but this > > can still require arbitrary amounts of buffer space. > > However, the buffering can be done in a way that uses only as much > buffer space as is truly needed.
Just maintain the buffer as a > singly linked list in which new elements are inserted at the *tail* > of the list. Then whenever the head becomes unreachable > (e.g. because no iterators refer to it), it will be garbage > collected. Correct. For this reason, Raymond will make a leak-proof version of his tee() function part of itertools. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 11:06:43 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:07:01 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 09:49:13 EDT." <20031022134913.GA21755@panix.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> Message-ID: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> > > If you're talking about making > > > > x = None > > for x in R: pass > > print x # last item of R > > > > illegal, forget it. That's too darn useful. > > Not illegal, but perhaps for 3.0 we should consider making that print > display "None". The question is to what extent Python should continue > having unified semantics across constructs. While I agree that listcomps > should definitely have a local scope ("expressions should not have > side-effects"), I think that there would be advantages to the control > variable in a for loop also having local scope that are magnified by > having compatible semantics between listcomps and for loops. In other > words, consider > > x = None > [x for x in R] > print x > > Why should the two behave differently? The variable of a for *statement* must be accessible after the loop because you might want to break out of the loop with a specific value. This is a common pattern that I have no intent of breaking. So it can't introduce a new scope; then it might as well keep the last value assigned to it. 
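The "break out of the loop with a specific value" pattern Guido is defending can be sketched like this (an editorial illustration, not code from the thread; the function name is made up):

```python
def first_negative(values):
    x = None
    for x in values:
        if x < 0:
            break          # leave the loop; x keeps the matching value
    else:
        x = None           # loop finished without a break: no match
    return x               # usable here only because 'for' adds no scope

print(first_negative([3, 1, -4, 5]))   # -4
```

If the for statement introduced its own scope, this common idiom would stop working, which is exactly the breakage Guido says he has no intent of allowing.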
List comprehensions and generator expressions don't have 'break'. (You could cause an exception and catch it, but it's not a common pattern to use the control variable afterwards -- only the debugger would need access somehow.) --Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Wed Oct 22 11:08:16 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 22 11:07:10 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <000a01c3985c$2345dc60$e841fea9@oemcomputer> References: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <20031022150816.GA4161@mems-exchange.org> On Wed, Oct 22, 2003 at 01:19:58AM -0400, Raymond Hettinger wrote: > Experience with list expressions has shown their wide-spread > utility throughout Python. However, many of the use cases do > not need to have a full list created in memory. Instead, they > only need to iterate over the elements one at a time. I see generator expressions as making the iterator guts of list comprehensions available as a first-class object. The list() call is not always wanted. > 1) In order to achieve a performance gain, generator expressions need > to be run in the local stackframe [...] > Accordingly, the generator expression cannot be returned > to another function: That would be unacceptable, IMHO. Generator expressions should be first class. Luckily, generator functions are speedy little buggers. :-) Neil From guido at python.org Wed Oct 22 11:07:50 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:08:08 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Wed, 22 Oct 2003 05:45:48 EDT."
<200310220545.49319.fincher.8@osu.edu> References: <000001c39872$2dece760$e841fea9@oemcomputer> <200310220545.49319.fincher.8@osu.edu> Message-ID: <200310221507.h9MF7od27394@12-236-54-216.client.attbi.com> > I probably missed it in this monster of a thread, but how do > generator expressions do this? It seems that they'd only make > reduce more efficient, but it would still be just as needed as > before. All we need is more standard accumulator functions like sum(). There are many useful accumulator functions that aren't easily expressed as a binary operator but are easily done with an explicit iterator argument, so I am hopeful that the need for reduce will disappear. 99% of use cases for reduce were with operator.add, and that's replaced by sum() already. --Guido van Rossum (home page: http://www.python.org/~guido/) From mcherm at mcherm.com Wed Oct 22 11:11:08 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 22 11:11:18 2003 Subject: [Python-Dev] closure semantics Message-ID: <1066835468.3f969e0c7b3a2@mcherm.com> Guido writes: > But all this is moot unless someone comes up with a way to spell this > that doesn't require a new keyword or change the meaning of 'global x' > even if there's an x at an intermediate scope (i.e. you can't change > 'global x' to mean "search for the next outer scope that defines x"). > > And we still have to answer Alex's complaint that newbies misinterpret > the word 'global'. I've always thought that "global" statements ought to resemble "import" statements. Let me explain. Python doesn't have an "import" statement. It has SEVERAL import statements. There's "import m", "import m as n", "from m import n", "from m import n as n2", even the dreaded "import *". However, this profusion of different statements is NOT usually confusing to people for three reasons: (1) All have the same primary keyword "import", suggesting that they're all related. (2) All are ultimately concerned with doing the same thing...
ensuring a module is in sys.modules and binding a name in the current environment so it can be used. (3) All of these read like English. Now, it seems to me that "global" is an ideal candidate for similar treatment. Rather like "import", there is a single thing we want to control... specifying, when we use the unadorned variable name, which namespace we wish it to refer to. There are several things we might want. First, what we already have: (a) Refers to local namespace. This is the most commonly used version, and should be (and is!) the default when no "global" statement is used. (b) Refers to the module-global namespace. This is the second-most commonly used scope, and so I'd say it deserves the simplest form of the "global" statement (rather like "import m" is the simplest form of "import"). That would be "global x", and that's already how Python works. And a few others we might want: (c) Refers to nearest enclosing nested-scope namespace in which a binding of that name already exists. (d) Refers to the getattr() namespace (normally __dict__) of the first argument of the function. This is for the "don't like typing 'self.'" crowd. (e) Refers to a truly-global (across all modules) namespace (built-ins I suppose). This is what Alex says newbies guess that "global" means. (f) Refers to a specific enclosing nested-scope namespace, in cases where the nearest nested-scope namespace isn't the one you want. Personally, I have no use for (d), (e), and (f), and I'd vote c:+1, d:-0 e:-1, f:-1 on including these. But my point is, that a slightly different form of the "global" statement would satisfy both readability AND parsability. I'm not feeling particularly creative, so please try to improve on these phrasings (some aren't parsable... we need forms that are parsable AND read well): (c) -- "global x in def" (d) -- "global x in " (e) -- "global global x" (f) -- "x is global in " Okay... all four of those are lousy.
But I still think seeking some alternate "phrases" or "forms" for the global statement has merit. -- Michael Chermside From guido at python.org Wed Oct 22 12:02:43 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 12:03:08 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 08:15:16 CDT." <16278.33508.677499.127119@montanaro.dyndns.org> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> <16278.33508.677499.127119@montanaro.dyndns.org> Message-ID: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> > How about > > __.x = 42 Too much line-noise, so too Perlish. :-) I don't like to use a mysterious symbol like __ for a common thing like a global variable. I also don't think I want global variable assignments to look like attribute assignments. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 12:05:58 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 12:07:03 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 23:02:52 +1000." <3F967FFC.6040507@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> <3F967FFC.6040507@iinet.net.au> Message-ID: <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> > I had a similar thought about 5 minutes after turning my computer off last > night. 
The alternative I came up with was: > y = (from result = 0.0 do result += x**2 for x in values if x > 0) I think you're aiming for the wrong thing here; I really see no reason why you'd want to avoid writing this out as a real for loop if you don't have an existing accumulator function (like sum()) to use. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Wed Oct 22 12:11:37 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Wed Oct 22 12:11:42 2003 Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_... Message-ID: <20031022161137.96353.qmail@web40513.mail.yahoo.com> I'm traveling, but I did manage to briefly peek at python-dev -- and, with some luck, I think I'll manage to post from here too. It's probably gonna be the weekend before I get mail access again, but, in the meantime...: I think we're trying to stretch the meaning of a keyword, "global" (that wasn't a particularly appropriate one for Python anyway, as opposed to ABC), "beyond reason". Yes, the temptation is huge, because adding a keyword is problematic. But this is like C's extending "static" to mean "private to this module as opposed to visible from other modules too" and later C++ further stretching it to mean "pertaining to the class rather than to the instance". Seriously, I weep inside whenever I have to explain "staticmethod" in terms of "once upon a time, there was a language (which has just about nothing to do with Python) which stretched the meaning of a word, which an older language had already stretched for a vaguely related purpose, and ..." :-( Well, nothing we can do about THAT -- that particular abuse of "static" is widely enough engrained in too many _programmers'_ minds, so newbies will just have to put up with it. But, "global" _isn't_ similarly engrained.
It sits oddly in Python -- a statement that doesn't actually DO things, but rather "flags" something for the compiler's benefit; in other words, _a declaration_ -- the one and only (expletive deleted) declaration we have in Python, though we call it "a statement" probably in an attempt at decency:-). I've seen it called "a declarative statement" on this thread, which is something of an oxymoron to me -- statements DO things. All but "global", that is. And it *rubs in* the "compile-time" vs "runtime" distinction that Python is mostly SO good at seamlessly hiding...! And all for what purpose? To let that one particular case of "assignment to bare name" actually mean setattr on some object (the current module) "blackmagically". I've seen it argued on this thread that "variables in a module and attributes in an object are different things" -- but they *AREN'T*! Just like, say, in a class. Inside a class C's body ("toplevel" in it, not nested inside a def &c) I can write x = 23 and it means C.x = 23 (for a normal metaclass). Once the class object C is created, if I want to tweak that attribute of C I have to write e.g. C.x = 42 after getting ahold of some reference to C (by its name, or say in a method of C by self.__class__.x = 42, etc). Inside a module M's body ("toplevel" in it, not nested inside a def &c) I can write x = 23 and it means M.x = 23 (unconditionally). Once the module object M is created, if I want to tweak that attribute of M I have to write e.g. M.x = 42 after getting ahold of some reference to M (say by an "import M", or say in a function of M by sys.modules[__name__].x = 42, etc). *OR* in the second case I can alternatively use the magical "compile-time declarative statement" ``global x'' -- a weird "special case" with a weird keyword going with it. Why? The two cases aren't all that different; bright learners catch on to them most easily when I draw the parallels explicitly (before going on to explain the differences, of course).
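The parallel Alex draws between class bodies and module bodies can be checked directly (an editorial sketch; `this_module` is just an illustrative name):

```python
import sys

class C:
    x = 23                     # bare-name assignment in the class body: C.x

C.x = 42                       # later rebinding goes through the object

x = 23                         # bare-name assignment at module top level: M.x
this_module = sys.modules[__name__]
this_module.x = 42             # later rebinding likewise via the module object

print(C.x, x)                  # 42 42
```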
If "assigning to bare names not in the current scope" _must_ be supported by a dedicated keyword, I think that keyword should be 'scope'. NOT 'global'. Let's not do the "see how far we can stretch 'static' and then some" gig, please... Alternatively, assigning to an attribute of some particular object still feels a better approach to me -- no new kwd, no stretching of bad old ones, actually a chance to let bad old 'global' fade slowly into the sunset. If there's any chance to salvage THAT approach -- if it only needs a good neat syntax to get that "particular object" -- I'll be glad to participate in brainstorming to find it. But before we spend energy on that, I'd like to know it's not sure to be wasted, because it just 'must' be a keyword; this subdebate is warranted only if it's _conceivable_ that attribute assignment be again deemed acceptable for this purpose. If it HAS to be a keyword, I think it should be 'scope' AND it should not be a STATEMENT but rather an "operator". E.g., a "bare name" COULD be "x scope foo" or "(x in module scope)" or some such construct (I think it could syntactically resemble attribute assignment and that would be USEFUL, e.g. scope(foo).x, even if scope was not a function but a keyword). But _this_ subdebate, I think, is warranted only if it's _conceivable_ that a keyword could be added for this (be it 'scope' or something even better that I haven't thought of). If both conditions fail -- it MUST be a keyword, and that keyword MUST be 'global' no matter how horrid that is, then once that is etched in stone I don't think there is much worth debating, because I don't think the result can be decent -- it seems an overconstrained system. So, in just the spirit of "print>>bah, x", I would suggest: global>>outer, x There!
What could be better?-) Alex From eppstein at ics.uci.edu Wed Oct 22 12:23:43 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 12:23:47 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> <16278.34154.245725.959203@montanaro.dyndns.org> Message-ID: In article <16278.34154.245725.959203@montanaro.dyndns.org>, Skip Montanaro wrote: [ re a long expression of mine ] ... > pos2d = dict([(s,(positions[s][0]+dx*positions[s][2], > positions[s][1]+dy*positions[s][2])) > for s in positions]) ... > pos2d = {s: (positions[s][0]+dx*positions[s][2], > positions[s][1]+dy*positions[s][2]) > for s in positions} > > The extra characters required today are less of a problem if the expression > is laid out sensibly. I have to admit, your indentation is better than mine, even if you ignore the problems caused by my using lines wider than 80 characters. But I still feel the second of your two alternatives more clearly expresses the intent of the expression. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From jeremy at zope.com Wed Oct 22 12:54:20 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Wed Oct 22 12:54:22 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> Message-ID: > But all this is moot unless someone comes up with a way to spell this > that doesn't require a new keyword or change the meaning of 'global x' > even if there's an x at an intermediate scope (i.e. you can't change > 'global x' to mean "search for the next outer scope that defines x"). > > And we still have to answer Alex's complaint that newbies misinterpret > the word 'global'. I'm not averse to introducing a new keyword, which would address both concerns. 
yield was introduced with apparently little problem, so it seems possible to add a keyword without causing too much disruption. If we decide we must stick with global, then it's very hard to address Alex's concern about global being a confusing word choice. Jeremy From dave at pythonapocrypha.com Wed Oct 22 13:07:38 2003 From: dave at pythonapocrypha.com (Dave Brueck) Date: Wed Oct 22 13:07:43 2003 Subject: [Python-Dev] replacing 'global' References: <200310220121.52789.aleaxit@yahoo.com><200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com><200310220158.21389.aleaxit@yahoo.com><200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> <16278.33508.677499.127119@montanaro.dyndns.org> <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> Message-ID: <1b5501c398be$ff1832d0$891e140a@YODA> > > How about > > > > __.x = 42 > > Too much line-noise, so too Perlish. :-) > > I don't like to use a mysterious symbol like __ for a common thing > like a global variable. I also don't think I want global variable > assignments to look like attribute assignments. Go easy on me for piping up here, but aren't they attribute assignments or at least used as such? After reading the other posts in this thread I wonder if it would be helpful to have more information on how "global" is used in practice (and maybe some of those practices will be deemed bad, who knows). From my (a user of Python) perspective, "global" has two uses: 1) Attaching objects to the module, so that other modules do a module.name to get at the object 2) Putting objects in some namespace visible to the rest of the module. Now whether or not #1 is "good" or "bad" - I don't know, but it sure looks like attribute assignment to me.
Again, please regard this as just feedback from a user, but especially outside of the module it looks and acts like attribute assignment, so I would expect the same to be true inside the module, and any distinction would seem arbitrary or artificial (consider, for example, that it is not an uncommon practice to write a module instead of a class if the class would be a singleton). As for #2, I personally don't use global at all because it just rubs me the wrong way (the same way it would if you removed "self." in a method and made bind-to-instance implicit like in C++). Instead, many of my modules have this at the top: class GV: someVar1 = None someVar2 = 5 (where GV = "global variables") I felt _really_ guilty doing this the first few times and I continue to think it's yucky, but I don't know of a better alternative, and this approach reads better, especially compared to: global foo foo = 10 Seeing GV.foo = 10 adds a lot to readability. From this user's perspective, both problems #1 and #2 would be solved by an object named "module" that refers to this module (but please don't name it "global" or "globals" - that word has a different expected meaning). Shutting up, -Dave From guido at python.org Wed Oct 22 13:21:15 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:21:22 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Sat, 09 Dec 2000 14:27:37 EST." References: Message-ID: <200310221721.h9MHLFl27628@12-236-54-216.client.attbi.com> [Guido] > > But all this is moot unless someone comes up with a way to spell this > > that doesn't require a new keyword or change the meaning of 'global x' > > even if there's an x at an intermediate scope (i.e. you can't change > > 'global x' to mean "search for the next outer scope that defines x"). > > > > And we still have to answer Alex's complaint that newbies misinterpret > > the word 'global'. [Jeremy] > I'm not averse to introducing a new keyword, which would address both > concerns.
yield was introduced with apparently little problem, so it seems > possible to add a keyword without causing too much disruption. > > If we decide we must stick with global, then it's very hard to address > Alex's concern about global being a confusing word choice . OK, the tension is mounting. Which keyword do you have in mind? And would you use the same keyword for module-globals as for outer-scope variables? --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Oct 22 13:21:56 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Oct 22 13:22:02 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> Message-ID: <3F96BCB4.1000002@livinglogic.de> Guido van Rossum wrote: > [...] > The variable of a for *statement* must be accessible after the loop > because you might want to break out of the loop with a specific > value. This is a common pattern that I have no intent of breaking. > So it can't introduce a new scope; then it might as well keep the last > value assigned to it. > > List comprehensions and generator expressions don't have 'break'. > (You could cause an exception and catch it, but it's not a common > pattern to use the control variable afterwards -- only the debugger > would need access somehow.) How about an until keyword in generator expressions: sum(len(line) for line in file if not line.startswith("#") until not line.strip()) Would the order of "if" and "until" be significant? 
And we could have accumulators first() and last(): def first(it): return it.next() def last(it): for value in it: pass return value first(line for line in file if line.startswith("#")) if not last(file): # last line not terminated Bye, Walter Dörwald From guido at python.org Wed Oct 22 13:30:09 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:30:22 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 12:03:18 BST." <2moew9o7pl.fsf@starship.python.net> References: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> <2moew9o7pl.fsf@starship.python.net> Message-ID: <200310221730.h9MHU9w27692@12-236-54-216.client.attbi.com> > >> def f(x): > >> r = [x+1 for x in range(x)] > >> return r, x > >> > >> becomes even more incomprehensible (and changes in behaviour). > > > > Anyone who writes code like that *deserves* to have the > > behaviour changed on them! > > This was not my impression of the Python way. I know I'd be pretty > pissed if this broke my app. > > I have no objection to breaking the above code, just to breaking it > silently! Having code *silently change in behaviour* (not die with an > exception, post a warning at compile time or fail to compile at all) > is about as evil a change as it's possible to contemplate, IMO. > > > If this is really a worry, an alternative would be to > > simply forbid using a name for the loop variable that's > > used for anything else outside the loop. That could > > break existing code too, but at least it would break > > it in a very obvious way by making it fail to compile. > > This would be infinitely preferable! Not so fast. We introduced nested scopes, which could similarly subtly change the meaning of code without giving an error. Instead, we had at least one release that *warned* about situations that would change meaning silently under the new semantics. The same release also implemented the new semantics if you used a __future__ import.
We should do that here too (both the warning and the __future__). I don't want this kind of code to cause an error; it's not Pythonic to flag an error when a variable name in an inner scope shields a variable of the same name in an outer scope. --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Wed Oct 22 13:32:56 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Wed Oct 22 13:30:59 2003 Subject: [Python-Dev] closure semantics In-Reply-To: References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> Message-ID: <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> At 14:27 09.12.2000 -0500, Jeremy Hylton wrote: > > But all this is moot unless someone comes up with a way to spell this > > that doesn't require a new keyword or change the meaning of 'global x' > > even if there's an x at an intermediate scope (i.e. you can't change > > 'global x' to mean "search for the next outer scope that defines x"). > > > > And we still have to answer Alex's complaint that newbies misinterpret > > the word 'global'. > >I'm not averse to introducing a new keyword, which would address both >concerns. yield was introduced with apparently little problem, so it seems >possible to add a keyword without causing too much disruption. > >If we decide we must stick with global, then it's very hard to address >Alex's concern about global being a confusing word choice . why exactly do we want write access to outer scopes? for completeness, to avoid the overhead of introducing a class here and there, to facilitate people using Scheme textbooks with Python? so far I have not been missing it, I don't find: def accgen(n): def acc(i): global n in accgen n += i return n return acc particularly more compelling than: class accgen: def __init__(self, n): self.n = n def __call__(self, i): self.n += i return self.n I'm not asking in order to polemicize, I just would like to see the rationale spelled out. regards.
From mcherm at mcherm.com Wed Oct 22 13:34:19 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 22 13:34:21 2003 Subject: [Python-Dev] closure semantics Message-ID: <1066844059.3f96bf9b1240f@mcherm.com> [Jeremy] > I'm not averse to introducing a new keyword, which would address both > concerns. yield was introduced with apparently little problem, so it seems > possible to add a keyword without causing too much disruption. > > If we decide we must stick with global, then it's very hard to address > Alex's concern about global being a confusing word choice . [Guido] > OK, the tension is mounting. Which keyword do you have in mind? And > would you use the same keyword for module-globals as for outer-scope > variables? Surely the most appropriate keyword is "scope", right? As in scope a is global scope b is nested scope c is self scope d is myDict Okay... maybe I'm getting too ambitious with the last couple... -- Michael Chermside From guido at python.org Wed Oct 22 13:47:42 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:48:07 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 19:21:56 +0200." <3F96BCB4.1000002@livinglogic.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> Message-ID: <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> > How about an until keyword in generator expressions: New keywords are not on the table for generator expressions. You could do this with 'while' (which is just 'until not' -- note that your example uses that :-) but I'd be against making this part of the syntax more complex. You can do that with itertools.takewhile or dropwhile anyway. 
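[Guido's takewhile/dropwhile alternative, sketched on a small made-up input (the list of lines here stands in for the file object in the thread):]

```python
from itertools import takewhile, dropwhile

lines = ["# header\n", "alpha\n", "beta\n", "\n", "trailing\n"]

# Take lines until the first blank one, then filter out the comments.
body = [ln for ln in takewhile(lambda l: l.strip(), lines)
        if not ln.startswith("#")]
print(body)  # ['alpha\n', 'beta\n']

# dropwhile is the mirror image: skip the leading comment block.
rest = list(dropwhile(lambda l: l.startswith("#"), lines))
print(rest)  # ['alpha\n', 'beta\n', '\n', 'trailing\n']
```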
> And we could have accumulators first() and last(): > > def first(it): > return it.next() This begs for using a plain old loop statement with a 'break'. > def last(it): > for value in it: > pass > return value What if it is empty? > first(line for line in file if line.startswith("#")) > > if not last(file): > # last line not terminated The comment is incorrect. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 13:57:18 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:57:26 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 19:32:56 +0200." <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com> [Samuele] > why exactly do we want write access to outer scopes? > > for completeness, to avoid the overhead of introducing a class here > and there, to facilitate people using Scheme textbooks with Python? Probably the latter; I think Jeremy Hylton does know more Scheme than I do. :-) > so far I have not been missing it, > > I don't find: > > def accgen(n): > def acc(i): > global n in accgen > n += i > return n > return acc > > particulary more compelling than: > > class accgen: > def __init__(self, n): > self.n = n > > def __call__(self, i): > self.n += i > return self.n Some people have "fear of classes". Some people think that a function's scope can be cheaper than an object (someone should time this). 
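[The "someone should time this" aside is easy to act on with `timeit`; a rough sketch comparing instance state against closure state (absolute numbers are machine-dependent, so none are claimed here):]

```python
import timeit

setup = """
class Acc:
    def __init__(self):
        self.n = 0
    def __call__(self, i):
        self.n += i
        return self.n

def make_acc():
    # Closure version: state kept in a mutable cell.
    cell = [0]
    def acc(i):
        cell[0] += i
        return cell[0]
    return acc

obj = Acc()
fun = make_acc()
"""

t_obj = timeit.timeit("obj(1)", setup=setup, number=200000)
t_fun = timeit.timeit("fun(1)", setup=setup, number=200000)
print("instance: %.4fs  closure: %.4fs" % (t_obj, t_fun))
```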
Looking at the last example in the itertools docs: def tee(iterable): "Return two independent iterators from a single iterable" def gen(next, data={}, cnt=[0]): dpop = data.pop for i in count(): if i == cnt[0]: item = data[i] = next() cnt[0] += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) This would have been clearer if the author didn't have to resort to representing his counter variable as a list of one element. Using 'global* x' to mean 'find x in an outer scope', and also moving data into the outer scope, again to emphasize that it is shared between multiple calls of gen() without abusing default arguments, it would become: def tee(iterable): "Return two independent iterators from a single iterable" data = {} cnt = 0 def gen(next): global* cnt dpop = data.pop for i in count(): if i == cnt: item = data[i] = next() cnt += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) which is IMO more readable. But in 2.4 this will become a real object implemented in C. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Oct 22 14:05:12 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Oct 22 14:05:41 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> Message-ID: <3F96C6D8.8040507@livinglogic.de> Guido van Rossum wrote: > [...] > >How about an until keyword in generator expressions: > > > New keywords are not on the table for generator expressions. 
You > could do this with 'while' (which is just 'until not' -- note that > your example uses that :-) You're right, using while would be better. > but I'd be against making this part of the > syntax more complex. You can do that with itertools.takewhile or > dropwhile anyway. But sum(len(line) for line in file if not line.startswith("#") while line.strip()) looks simpler than sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file if not line.startswith("#"))) >>def last(it): >> for value in it: >> pass >> return value > > What if it is empty? This should raise an exception. (It does, but not the correct one! ;)) >>first(line for line in file if line.startswith("#")) >> >>if not last(file): >> # last line not terminated > > > The comment is incorrect. That should have been: if not last(file).endswith("\n"): Bye, Walter Dörwald From python at rcn.com Wed Oct 22 14:30:53 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 14:31:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <005501c398ca$a07a6f20$e841fea9@oemcomputer> Did the discussion of a sort() expression get resolved? The last I remember was that the list.sorted() classmethod had won the most support because it accepted the broadest range of inputs. I could live with that though I still prefer the more limited (list-only) copysort() method. Raymond Hettinger > Let's see what the use cases look like under the various proposals: > > todo = [t for t in tasks.copysort() if due_today(t)] > todo = [t for t in list.sorted(tasks) if due_today(t)] > todo = [t for t in list(tasks, sorted=True) if due_today(t)] > > genhistory(date, events.copysort(key=incidenttime)) > genhistory(date, list.sorted(events, key=incidenttime)) > genhistory(date, list(events, sorted=True, key=incidenttime)) > > for f in os.listdir().copysort(): . . . > for f in list.sorted(os.listdir()): . . .
> for f in list(os.listdir(), sorted=True): . . . > > To my eye, the first form reads much better in every case. > It still needs a better name though. From theller at python.net Wed Oct 22 14:38:18 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 14:38:39 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Wed, 22 Oct 2003 07:42:26 -0700") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> Message-ID: <4qy1qfs5.fsf@python.net> Guido van Rossum writes: >> The whole delayload/__try/__except stuff may be unneeded in 2.4, because >> it will most probably be compiled with MSVC7.1, installed via an msi >> installer, > > Is anyone working on that? I have the VC7.1 compiler too, but haven't > tried to use it yet. Maybe someone should check in a project > (separate from the VC6 project, so people don't *have to* switch yet)? No, nobody is working on that AFAIK. VC7 can convert VC6 workspace and project files into its own format, but there is no way back. You cannot use VC7 files (they are called solution instead of workspace) in VC6 anymore. MvL suggested to convert the files once and then deprecate using the VC6 workspace. > Are the tools needed to build an MSI installer included in VC7.1? If > not, are they a free download? Yes, there are tools included. A colleague of mine tried to use them, and we quickly switched to Wise for Windows Installer (this is not the same as the Wise version used in Python 2.3) which does also create msi files. But this also has its own range of problems.
MvL again has the idea to create the msi (which is basically a database) programmatically with Python - either via COM, a custom Python extension or maybe ctypes. Thomas From guido at python.org Wed Oct 22 14:45:47 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 14:45:57 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 20:05:12 +0200." <3F96C6D8.8040507@livinglogic.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> Message-ID: <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com> > sum(len(line) for line in file if not line.startswith("#") while > line.strip()) > > looks simpler than > > sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file > if not line.startswith("#"))) I think both are much harder to read and understand than n = 0 for line in file: if not line.strip(): break if not line.startswith("#"): n += len(line) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 14:53:21 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 14:53:31 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Wed, 22 Oct 2003 14:30:53 EDT." <005501c398ca$a07a6f20$e841fea9@oemcomputer> References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> Message-ID: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com> > Did the discussion of a sort() expression get resolved? > > The last I remember was that the list.sorted() classmethod had won the > most support because it accepted the broadest range of inputs.
> > I could live with that though I still prefer the more limited > (list-only) copysort() method. list.sorted() has won, but we are waiting for feedback from the person who didn't like having both sort() and sorted() as methods, to see if his objection still holds when one is a method and the other a factory function. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Wed Oct 22 15:33:05 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 15:33:33 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: (fdrake@users.sourceforge.net's message of "Tue, 21 Oct 2003 13:02:38 -0700") References: Message-ID: fdrake@users.sourceforge.net writes: > Update of /cvsroot/python/python/dist/src/Modules/expat > In directory sc8-pr-cvs1:/tmp/cvs-serv7002 > > Modified Files: > Tag: release23-maint > asciitab.h expat.h iasciitab.h internal.h latin1tab.h > utf8tab.h winconfig.h xmlparse.c xmlrole.c xmltok.c > xmltok_impl.c > Added Files: > Tag: release23-maint > macconfig.h > Removed Files: > Tag: release23-maint > expat.h.in > Log Message: > Update to Expat 1.95.7; there are no changes to the Expat sources. I'm getting compile errors on Windows (in the release23-maint branch, haven't tried in the trunk yet): C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error C1189: #error : memmove does not exist on this platform, nor is a substitute available Thomas From fdrake at acm.org Wed Oct 22 15:44:41 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed Oct 22 15:45:38 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: References: Message-ID: <16278.56873.730705.729001@grendel.zope.com> Thomas Heller writes: > I'm getting compile errors on Windows (in the release-23maint branch, > haven't tried in the trunk yet): I'll bet they match. ;-) > C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error > C1189: #error : memmove does not exist on this platform, nor is a > substitute available Hmm. I see PC\pyconfig.h doesn't define HAVE_MEMMOVE; this gets defined in the configure-generated pyconfig.h for the Linux systems I tested this on. Doesn't Windows always have memmove()? (I *think* it does based on a quick look at msdn.microsoft.com, but who knows for sure...) I'm not sure how extension building works on Windows; if setup.py is used, you should be able to define HAVE_MEMMOVE in PC\pyconfig.h, otherwise you can define it in the relevant .dsp file. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From theller at python.net Wed Oct 22 15:57:12 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 15:57:49 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: <16278.56873.730705.729001@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 22 Oct 2003 15:44:41 -0400") References: <16278.56873.730705.729001@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > I'm getting compile errors on Windows (in the release-23maint branch, > > haven't tried in the trunk yet): > > I'll bet they match. ;-) > > > C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error > > C1189: #error : memmove does not exist on this platform, nor is a > > substitute available > > Hmm. I see PC\pyconfig.h doesn't define HAVE_MEMMOVE; this gets > defined in the configure-generated pyconfig.h for the Linux systems I > tested this on. > > Doesn't Windows always have memmove()? (I *think* it does based on a > quick look at msdn.microsoft.com, but who knows for sure...) Windows? MSVC has it. > I'm not sure how extension building works on Windows; if setup.py is > used, you should be able to define HAVE_MEMMOVE in PC\pyconfig.h, > otherwise you can define it in the relevant .dsp file. setup.py isn't used, and PC\pyconfig.h is manually maintained. So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. I don't know anything about watcom, borland, or other compilers. 
Let's add it in the file and see what happens ;-) Thomas From fdrake at acm.org Wed Oct 22 16:07:03 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 22 16:07:24 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: References: <16278.56873.730705.729001@grendel.zope.com> Message-ID: <16278.58215.241781.151437@grendel.zope.com> Thomas Heller writes: > setup.py isn't used, and PC\pyconfig.h is manually maintained. > So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. > I don't know anything about watcom, borland, or other compilers. > Let's add it in the file and see what happens ;-) Not quite, I think. The setup.py script will load it from the pyconfig.h file and pass it along for Expat, but if that isn't used, it needs to be added to the .dsp used to build pyexpat.pyd. Not sure what to do about other C runtimes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tim.one at comcast.net Wed Oct 22 16:10:10 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 22 16:10:42 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1. In-Reply-To: Message-ID: memmove is a standard ANSI C function so can be used freely (Python requires ANSI C). 
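[Tim's point — that `memmove` is part of ANSI C — is what lets Expat rely on it; its defining feature is that it tolerates overlapping source and destination, which `memcpy` does not. A small illustration of the C function's semantics via `ctypes` (purely for demonstration; this is not how Expat calls it):]

```python
import ctypes

buf = ctypes.create_string_buffer(b"abcdef")
# Overlapping move: shift the first five bytes one position right.
# With memcpy this overlap would be undefined behaviour; memmove
# is required by the C standard to handle it correctly.
ctypes.memmove(ctypes.byref(buf, 1), buf, 5)
print(buf.raw)  # b'aabcde\x00'
```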
From theller at python.net Wed Oct 22 16:11:16 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 16:11:41 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: <16278.58215.241781.151437@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 22 Oct 2003 16:07:03 -0400") References: <16278.56873.730705.729001@grendel.zope.com> <16278.58215.241781.151437@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > setup.py isn't used, and PC\pyconfig.h is manually maintained. > > So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. > > I don't know anything about watcom, borland, or other compilers. > > Let's add it in the file and see what happens ;-) > > Not quite, I think. > > The setup.py script will load it from the pyconfig.h file and pass it > along for Expat, but if that isn't used, it needs to be added to the > .dsp used to build pyexpat.pyd. Ah, you mean pyconfig.h is not included by the expat files? Ok, in this case it will have to go into the .dsp. > Not sure what to do about other C runtimes. Neither do I. Thomas From fdrake at acm.org Wed Oct 22 16:43:31 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 22 16:43:48 2003 Subject: [Python-Dev] memmove() in Expat In-Reply-To: References: <16278.56873.730705.729001@grendel.zope.com> <16278.58215.241781.151437@grendel.zope.com> Message-ID: <16278.60403.851100.38638@grendel.zope.com> Tim Peters writes: > memmove is a standard ANSI C function so can be used freely (Python requires > ANSI C). Cool; thanks! 
Thomas Heller writes: > Ah, you mean pyconfig.h is not included by the expat files? > Ok, in this case it will have to go into the .dsp. That's right; the problem isn't pyexpat.c, which imports pyconfig.h via Python.h. The #error is in the Expat sources, which we're using unmodified, and Expat is more tolerant of non-ANSI platforms. I'd rather see HAVE_MEMMOVE added to the .dsp than use modified Expat sources. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From eppstein at ics.uci.edu Wed Oct 22 19:03:06 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 19:03:14 2003 Subject: [Python-Dev] Re: closure semantics References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: In article <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch>, Samuele Pedroni wrote: > why exactly do we want write access to outer scopes? > > for completeness, to avoid the overhead of introducing a class here and there, > to facilitate people using Scheme textbooks with Python? I am currently working on implementing an algorithm with the following properties: - It is an algorithm, not a data structure; that is, you run it, it returns an answer, and it doesn't leave any persistent state afterwards. - It is sufficiently complex that I prefer to break it into several different functions or methods. - These functions or methods need to share various state variables. If I implement it as a collection of separate functions, then there's a lot of unnecessary code complexity involved in passing the state variables from one function to the next, returning the changes to the variables, etc. Also, it doesn't present a modular interface to the rest of the project -- code outside this algorithm is not prevented from calling the internal subroutines of the algorithm. 
If I implement it as a collection of methods of an object, I then have to include a separate function which creates an instance of the object and immediately destroys it. This seems clumsy and also doesn't fit with my intuition about what objects are for (representing persistent structure). Also, again, modularity is violated -- outside code should not be making instances of this object or accessing its methods. What I would like to do is to make an outer function, which sets up the state variables, defines inner functions, and then calls those functions. Currently, this sort of works: most of the state variables consist of mutable objects, so I can mutate them without rebinding them. But some of the state is immutable (in this case, an int) so I need to somehow encapsulate it in mutable objects, which is again clumsy. Write access to outer scopes would let me avoid this encapsulation problem. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From greg at cosc.canterbury.ac.nz Wed Oct 22 19:20:28 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 19:21:25 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310221215.27570.gmccaughan@synaptics-uk.com> Message-ID: <200310222320.h9MNKSh18932@oma.cosc.canterbury.ac.nz> Gareth McCaughan : > "Aussonderungsaxiom" is the axiom of *separation*[1], which is > a weakened version of the (disastrous) axiom of *comprehension*. > In terms of Python's listcomps, comprehension would be [x if P(x)] Actually, my original implementation of list comps *did* allow you to write that -- although it didn't try to loop over all possible values of x, fortunately. :-) It was Guido who (probably fairly wisely, even though I didn't agree at the time) decided there had to be a "for" in there. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Wed Oct 22 19:59:05 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 19:59:28 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 16:03:06 PDT." References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <200310222359.h9MNx6L28417@12-236-54-216.client.attbi.com> > I am currently working on implementing an algorithm with the following > properties: > - It is an algorithm, not a data structure; that is, you run it, > it returns an answer, and it doesn't leave any persistent state > afterwards. > - It is sufficiently complex that I prefer to break it into several > different functions or methods. > - These functions or methods need to share various state variables. > > If I implement it as a collection of separate functions, then there's a > lot of unnecessary code complexity involved in passing the state > variables from one function to the next, returning the changes to the > variables, etc. Also, it doesn't present a modular interface to the > rest of the project -- code outside this algorithm is not prevented from > calling the internal subroutines of the algorithm. > > If I implement it as a collection of methods of an object, I then have > to include a separate function which creates an instance of the object > and immediately destroys it. This seems clumsy and also doesn't fit > with my intuition about what objects are for (representing persistent > structure). Also, again, modularity is violated -- outside code should > not be making instances of this object or accessing its methods. 
> > What I would like to do is to make an outer function, which sets up the > state variables, defines inner functions, and then calls those > functions. Currently, this sort of works: most of the state variables > consist of mutable objects, so I can mutate them without rebinding them. > But some of the state is immutable (in this case, an int) so I need to > somehow encapsulate it in mutable objects, which is again clumsy. > Write access to outer scopes would let me avoid this encapsulation > problem. I know the problem, I've dealt with this many times. Personally I would much rather define a class than a bunch of nested functions. I'd have a separate master function that creates the instance, calls the main computation, and then extracts and returns the result. Yes, the class may be accessible at the toplevel in the module. I don't care: I just add a comment explaining that it's not part of the API, or give it a name starting with "_". My problem with the nested functions is that it is much harder to get a grasp of what the shared state is -- any local variable in the outer function *could* be part of the shared state, and the only way to tell for sure is by inspecting all the subfunctions. With the class, there's a strong convention that all state is initialized in __init__(), so __init__() is self-documenting. --Guido van Rossum (home page: http://www.python.org/~guido/)
From greg at cosc.canterbury.ac.nz Wed Oct 22 20:15:48 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 20:16:02 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310222359.h9MNx6L28417@12-236-54-216.client.attbi.com> Message-ID: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> Guido: > My problem with the nested functions is that it is much harder to get > a grasp of what the shared state is -- any local variable in the outer > function *could* be part of the shared state, and the only way to tell > for sure is by inspecting all the subfunctions. That would be solved if, instead of marking variables in inner scopes that refer to outer scopes, it were the other way round, and variables in the outer scope were marked as being rebindable in inner scopes. def f(): rebindable x def inc_x_by(i): x += i # rebinds outer x x = 39 inc_x_by(3) return x Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 20:56:24 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 20:56:59 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F967FFC.6040507@iinet.net.au> Message-ID: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
The only way out of this I can see (short of dropping the whole idea) is to cut out some of the degrees of freedom by restrict ourselves to targeting the most common cases. Thinking about the way this works in APL, where you can say things like total = + / numbers one reason it's so compact is that the system knows what the identity is for each operator, so you don't have to specify the starting value explicitly. Another is the use of a binary operator. So if we postulate a "reducing protocol" that requires function objects to have a __div__ method that performs reduction with a suitable identity, then we can write total = operator.add / numbers Does that look succinct enough? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Wed Oct 22 21:02:43 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 22 21:02:54 2003 Subject: [Python-Dev] Time for py3k@python.org or a Py3K Wiki? In-Reply-To: <16278.33180.5190.95094@montanaro.dyndns.org> References: <16278.33180.5190.95094@montanaro.dyndns.org> Message-ID: <3F9728B3.2070809@ocf.berkeley.edu> Skip Montanaro wrote: > These various discussions are moving along a bit too rapidly for me to keep > up. We have been discussing language issues which are going to impact > Python 3.0, either by deprecating current language constructs which can't be > eliminated until then (e.g., the global statement) or by tossing around > language construct ideas which will have to wait until then for their > implementation (other mechanisms for variable access in outer scopes). > Unfortunately, I'm afraid these things are going to get lost in the sea of > other python-dev topics and be forgotten about then the time is ripe. 
> > Maybe this would be a good time to create a py3k@python.org mailing list > with more restrictions than python-dev (posting by members only? membership > by invitation?) so we can more easily separate these ideas from shorter term > issues and keep track of them in a separate Mailman archive. I'd suggest > starting a Wiki, but that seems a bit too "global". You can restrict Wiki > mods in MoinMoin to users who are logged in, but I'm not sure you can > restrict signups very well. > I would support doing *something*. My personal hell that is all of these threads seems to be getting deeper and deeper. But making it invitation-only might stifle some ideas. But then again it might be a while before Python 3 has to be worried about so maybe that is not that big of an issue now. God I hope Raymond has PEP 289 updated before the next summary is due. -Brett From tim_one at email.msn.com Wed Oct 22 21:18:42 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 21:18:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: I had a large file today, and needed to find lines matching several patterns simultaneously. It seemed a natural application for generator expressions, so let's see how that looks. Generalized a bit: Given: "source", an iterable producing elements (like a file producing lines) "predicates", a sequence of one-argument functions, mapping element to truth (like a regexp search returning a match object or None) Create: a generator producing the elements of source for which each predicate is true This is-- or should be --an easy application for pipelining generator expressions. Like so: pipe = source for p in predicates: # add a filter over the current pipe, and call that the new pipe pipe = e for e in pipe if p(e) Now I hope that for e in pipe: print e prints the desired elements. 
It will if the "p" and "pipe" in the generator expression use the
bindings in effect at the time the generator expression is assigned to
pipe. If the generator expression is instead a closure, it's a subtle
disaster.

You can play with this today like so:

    pipe = source
    for p in predicates:
        # pipe = e for e in pipe if p(e)
        def g(pipe=pipe, p=p):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()

    for e in pipe:
        print e

Those are the semantics for which "it works".

If "p=p" is removed (so that the implementation of the generator
expression acts like a closure wrt p), the effect is to ignore all but
the last predicate. Instead predicates[-1] is applied to source, and
then applied redundantly to the survivors len(predicates)-1 times
each. It's not obvious then that the result is wrong, and for some
inputs may even be correct.

If "pipe=pipe" is removed instead, it should produce a "generator
already executing" exception, since the "pipe" in the final for-loop
is bound to the same object as the "pipe" inside g then (all of the
g's, but only the last g matters).

From greg at cosc.canterbury.ac.nz Wed Oct 22 21:36:41 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Oct 22 21:37:48 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
Message-ID: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>

I have just had the experience of writing a bunch of expressions of
the form

    "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params

and found myself getting quite confused by all the parentheses and "s"
suffixes. I would *really* like to be able to write this as

    "create index %{table}_lid1_idx on %{table}(%{lid1})" % params

which I find to be much easier on the eyes.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | A citizen of NewZealandCorp, a |
Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. |
greg@cosc.canterbury.ac.nz +--------------------------------------+

From fincher.8 at osu.edu Wed Oct 22 22:37:50 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Wed Oct 22 21:40:33 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To:
References:
Message-ID: <200310222237.50142.fincher.8@osu.edu>

On Wednesday 22 October 2003 09:18 pm, Tim Peters wrote:
> Those are the semantics for which "it works".

I'm convinced; not only that free variables should be frozen as if
they'd been passed into a generator function as keyword arguments, but
of the utility of generator expressions as a whole -- that code is
just beautiful :)

Jeremy

From raymond.hettinger at verizon.net Wed Oct 22 21:43:05 2003
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Oct 22 21:43:56 2003
Subject: [Python-Dev] product()
Message-ID: <002401c39907$0176f5a0$e841fea9@oemcomputer>

In the course of writing up PEP 289, it became clear that the future
has a number of accumulator functions in store. Each of these is
useful with iterators of all stripes and each helps eliminate a reason
for using reduce().

Some like average() and stddev() will likely end up in a statistics
module. Others like nbiggest(), nsmallest(), anytrue(), alltrue(), and
such may end up somewhere else.

The product() accumulator is the one destined to be a builtin. Though
it is not nearly as common as sum(), it does enjoy some popularity.
Having it available will help dispense with reduce(operator.mul, data, 1).

Would there be any objections to my adding product() to Py2.4? The
patch was simple and it is ready to go unless someone has some major
issue with it.
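Concretely, the proposed builtin would behave like the reduce() spelling it replaces; a minimal sketch follows, where the name and signature are assumptions for illustration, not the final API.

```python
import operator
from functools import reduce  # reduce() was a builtin in the 2.x line

def product(iterable, start=1):
    # Same result as reduce(operator.mul, data, 1), including an
    # empty iterable, which yields the multiplicative identity.
    return reduce(operator.mul, iterable, start)
```

Like sum(), it accepts any iterable, so it composes naturally with generator expressions.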
Raymond Hettinger

From greg at cosc.canterbury.ac.nz Wed Oct 22 21:49:45 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Oct 22 21:50:20 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To:
Message-ID: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz>

Tim Peters :
> It will if the "p" and "pipe" in the generator expression use the
> bindings in effect at the time the generator expression is assigned to
> pipe.

Lying awake thinking about this sort of thing last night, I found
myself wondering if there should be a way of explicitly requesting
that a name be evaluated at closure creation time, e.g.

    pipe = source
    for p in predicates:
        pipe = e for e in pipe if ^p(e)

where the ^ means that p is evaluated in the enclosing scope when the
closure is created, and bound to a slot which behaves like a
default-argument slot (but is separate from the default arguments).

This would allow the current delayed-evaluation semantics to be kept
as the default, while eliminating any need for using the
default-argument hack when you don't want delayed evaluation.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | A citizen of NewZealandCorp, a |
Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. |
greg@cosc.canterbury.ac.nz +--------------------------------------+

From tim_one at email.msn.com Wed Oct 22 22:07:48 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Wed Oct 22 22:07:56 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310220527.h9M5Rgr26465@12-236-54-216.client.attbi.com>
Message-ID:

[Tim]
>> I'm not sure it's "a feature" that
>>
>>     print [n+f() for x in range(10)]
>>
>> looks up n and f anew on each iteration -- if I saw a listcomp that
>> actually relied on this, I'd be eager to avoid inheriting any of
>> author's code.
[Guido] > It's just a direct consequence of Python's general rule for name > lookup in all contexts: variables are looked up when used, not before. > (Note: lookup is different from scope determination, which is done > mostly at compile time. Scope determination tells you where to look; > lookup gives you the actual value of that location.) If n is a global > and calling f() changes n, f()+n differs from n+f(), and both are > well-defined due to the left-to-right rule. That's not good or bad, > that's just *how it is*. Despite having some downsides, the > simplicity of the rule is good; I'm sure we could come up with > downsides of other rules too. Sorry, but none of that follows unless you first insist that a listcomp is semantically equivalent to a particular for-loop. Which we did do at the start, and which is now being abandoned in part ("well, except for the for target(s) -- well, OK, they still work like exactly like the for-loop would work if the target(s) were renamed in a particular 'safe' way"). I don't mind the renaming trick there, but by the same token there's nothing to stop explaining the meaning of a generator expression as a particular way of writing a generator function either. It's hardly a conceptual strain to give the function default arguments, or even to eschew that technical implementation trick and just say the generator's frame gets some particular initialized local variables (which is the important bit, not the trick used to get there). > Despite the good case that's been made for what would be most useful, I don't see that any good case had been made for or against it: the only cases I care about are real use cases. A thing stands or falls by that, purity be damned. 
I have since posted the first plausible use case that occurred to me while thinking about real work, and "closure semantics" turned out to be disastrous in that example (see other email), while "capture the current binding" semantics turned out to be exactly right in that example. I suspected that would be so, but I still want to see more not-100%-fabricated examples. > I'm loathe to drop the evaluation rule for convenience in one special > case. Next people may argue that in Python 3.0 lambda should also do > this; arguably it's more useful than the current semantics there too. It's not analogous: when I'm writing a lambda, I can *choose* which bindings to capture at lambda definition time, and which to leave free. Unless generator expressions grow more hair, I have no choice when writing one of those, so the implementation-forced choice had better be overwhelmingly most useful most often. I can't judge the latter without plausible use cases, though. > And then what next -- maybe all nested functions should copy their > free variables? Same objection as to the lambda example. > Oh, and then maybe outermost functions should copy their globals into > locals too -- that will speed up a lot of code. :-) It would save Jim a lot of thing=thing arglist typing in Zope code too . > There are other places in Python where some rule is applied to "all > free variables of a given piece of code" (the distinction between > locals and non-locals in functions is made this way). But there are > no other places where implicit local *copies* of all those free > variables are taken. I didn't suggest to copy anything, just to capture the bindings in use at the time a generator expression is evaluated. This is easy to explain, and trivial to explain for people familiar with the default-argument trick. 
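The default-argument trick referred to here, in its classic form; this is a contrived minimal example rather than code from the thread. Default values are evaluated once, at function-definition time, so a default argument snapshots the current binding, while a plain free variable is looked up only when the function is finally called.

```python
# Each lambda in late_bound shares the loop variable i as a free
# variable; each lambda in frozen gets its own snapshot via a default.
late_bound = [lambda: i for i in range(3)]      # i looked up at call time
frozen     = [lambda i=i: i for i in range(3)]  # i captured per lambda

late_results = [f() for f in late_bound]    # every f sees the final i
frozen_results = [f() for f in frozen]      # each f keeps its own i
```

The first list calls all report the last value of i; the second preserves the value in effect when each lambda was defined, which is the behavior being argued for in generator expressions.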
Whenever I've written a list-of-generators, or in the recent example a
generator pipeline, I have found it semantically necessary, without
exception so far, to capture the bindings of the variables whose
bindings wouldn't otherwise be invariant across the life of the
generator. If it turns out that this is always, or almost always, the
case, across future examples too, then it would just be goofy not to
implement generator expressions that way ("well, yes, the
implementation does do a wrong thing in every example we had, but what
you're not seeing is that the explanation would have been a line
longer had the implementation done a useful thing instead" ).

> I'd need to find a unifying principle to warrant doing that beyond
> utility.

No you don't -- you just think you do .

From pje at telecommunity.com Wed Oct 22 22:12:09 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 22 22:11:32 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz>
References:
Message-ID: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>

At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
>This would allow the current delayed-evaluation semantics
>to be kept as the default, while eliminating any need
>for using the default-argument hack when you don't
>want delayed evaluation.

Does anybody actually have a use case for delayed evaluation? Why
would you ever *want* it to be that way? (Apart from the BDFL's desire
to have the behavior resemble function behavior.)

And, if there's no use case for delayed evaluation, why make people
jump through hoops to get the immediate binding?
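For what it's worth, one everyday pattern does depend on late lookup: a closure that is meant to track a setting rebound after the closure is created. A contrived sketch, not from the thread:

```python
# With late binding, the helper sees the *current* value of threshold
# each time it is called; freezing the binding at definition time
# would silently pin the original value instead.
threshold = 10
over = lambda x: x > threshold  # threshold looked up at call time

before = over(15)   # compares against 10
threshold = 20
after = over(15)    # compares against the rebound 20
```

Whether that pattern ever arises inside a generator *expression*, as opposed to an ordinary closure, is exactly the open question here.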
From tim_one at email.msn.com Wed Oct 22 22:19:06 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:19:12 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Lying awake thinking about this sort of thing last night, > I found myself wondering if there should be a way of > explicitly requesting that a name be evaluated at closure > creation time, e.g. > > pipe = source > for p in predicates: > pipe = e for e in pipe if ^p(e) > > where the ^ means that p is evaluated in the enclosing > scope when the closure is created, and bound to a slot > which behaves like a default-argument slot (but is > separate from the default arguments). As explained in the original email, the example is also a disaster if pipe's binding isn't captured at creation-time too. > This would allow the current delayed-evaluation semantics > to be kept as the default, while eliminating any need > for using the default-argument hack when you don't > want delayed evaluation. Well, I have yet to see an example where delayed evaluation is of any use in a generator expression, except for a 100%-contrived example that simply illustrated that the semantics can in fact differ (which I hope isn't something anyone questioned to begin with ). Try writing a real example. If it needs delayed evaluation in a plausible way, great. I'm still batting 0 at trying to find such a thing; I confess I wasn't moved by the it = f(x) for x in whatever def f(x): blah example (there being no apparent need to contort the order of the assignments except, again, to illustrate that semantics have consequences). From guido at python.org Wed Oct 22 22:28:59 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:29:29 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 14:36:41 +1300." 
<200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> Message-ID: <200310230229.h9N2SxA28642@12-236-54-216.client.attbi.com> > I have just had the experience of writing a bunch > of expressions of the form > > "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params > > and found myself getting quite confused by all the parentheses > and "s" suffixes. I would *really* like to be able to write > this as > > "create index %{table}_lid1_idx on %{table}(%{lid1})" % params > > which I find to be much easier on the eyes. Wouldn't this be even better? "create index ${table}_lid1_idx on $table($lid1)" % params --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Oct 22 22:30:05 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:30:11 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310221215.27570.gmccaughan@synaptics-uk.com> Message-ID: [Gareth McCaughan] > > "Aussonderungsaxiom" is the axiom of *separation*[1], which is > a weakened version of the (disastrous) axiom of *comprehension*. Ya, sez you . Seriously, I don't think the usage is as consistent as you would have us believe here. When listcomps were introduced, I suggested at the time that "list separations" would be a better name for them (for the reason you gave), but the historical precedent set by SETL, and carried over into Haskell, means "comprehension" will stick forever in this context. I don't think the distinction is consistent across math texts either. > In terms of Python's listcomps, comprehension would be [x if P(x)] > and separation [x for x in S if P(x)]. So we should be > calling them "list separations", really :-). Yes, we should. SETL and Haskell also required specifying a base set (or list) from which elements are chosen, so they also should have called them separations. > [1] Hence the name; compare English "sunder". 
> > For the record, I like "generator expressions" too, or "iterator expressions". > Good! Guido has decided you love the former, and I agree . From guido at python.org Wed Oct 22 22:34:08 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:34:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 14:49:45 +1300." <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> References: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> Message-ID: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> > Lying awake thinking about this sort of thing last night, > I found myself wondering if there should be a way of > explicitly requesting that a name be evaluated at closure > creation time, e.g. > > pipe = source > for p in predicates: > pipe = e for e in pipe if ^p(e) > > where the ^ means that p is evaluated in the enclosing > scope when the closure is created, and bound to a slot > which behaves like a default-argument slot (but is > separate from the default arguments). Bah. Arbitrary semantics bound to line-noise characters. Guess what that reminds me of. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Oct 22 22:40:23 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:40:33 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> Message-ID: [Guido] > Bah. Arbitrary semantics bound to line-noise characters. Guess what > that reminds me of. :-) I sure hope the answer isn't "Python 3"! well-you-*did*-move-to-california-ly y'rs - tim From guido at python.org Wed Oct 22 22:48:40 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:48:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 13:56:24 +1300." 
<200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
References: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
Message-ID: <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com>

> Thinking about the way this works in APL, where you can say things
> like
>
>     total = + / numbers
>
> one reason it's so compact is that the system knows what the identity
> is for each operator, so you don't have to specify the starting value
> explicitly. Another is the use of a binary operator.
>
> So if we postulate a "reducing protocol" that requires function
> objects to have a __div__ method that performs reduction with a
> suitable identity, then we can write
>
>     total = operator.add / numbers
>
> Does that look succinct enough?

It still suffers from my main problem with reduce(), which is not its
verbosity (far from it) but that except for some special cases (mainly
sum and product) I have to stand on my head to understand what it
does. This is even the case for examples like

    reduce(lambda x, y: x + y.foo, seq)

which is hardly the epitome of complexity. Who here knows for sure it
shouldn't rather be

    reduce(lambda x, y: x.foo + y, seq)

without going through an elaborate step-by-step execution? This is
inherent in the definition of reduce, and no / notation makes it go
away for me.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From sean at datamage.net Wed Oct 22 22:54:32 2003
From: sean at datamage.net (Sean Legassick)
Date: Wed Oct 22 23:00:53 2003
Subject: [Python-Dev] Re: listcomps vs. for loops
References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com>
Message-ID:

In message
<338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com>,
"Delaney, Timothy C (Timothy)"
>Note the winking smiley above :) Although I do find the scope limiting in:
>
> for (int i=0; i < 10; ++i)
> {
> }
>
>to be a nice feature of C++ (good god - did I just say that?)
and hate >that the implementation in MSVC is broken and the control variable >leaks. Me too, but then that's because it's so much more maintainable to be able to repeat such for loops ad nauseum using the same loop var name without removing the 'int' type declarator. And happily that's not an issue in Python. (Hmmm, jumping out of lurk mode with a comment concerning C++. Apologies for the bad form but I am somewhat of a Python newbie, albeit an increasingly addicted one). Sean -- Sean Legassick sean@datamage.net http://www.informage.net - bloggin' along From greg at cosc.canterbury.ac.nz Wed Oct 22 23:32:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:32:48 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230229.h9N2SxA28642@12-236-54-216.client.attbi.com> Message-ID: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> Guido: > Wouldn't this be even better? > > "create index ${table}_lid1_idx on $table($lid1)" % params I wouldn't object to that. I'd have expected *you* to object to it, though, since it re-defines the meaning of "$" in an interpolated string. I was just trying to suggest something that would be backward-compatible. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 19:35:52 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:34:03 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> Message-ID: <200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> Guido: > The variable of a for *statement* must be accessible after the loop > because you might want to break out of the loop with a specific > value. 
This is a common pattern that I have no intent of breaking. It wouldn't be a great hardship if the loop variable weren't accessible after the break, because you can always write for x in stuff: if meets_condition(x): result = x break do_something_with(result) which is arguably a clearer way to write it, anyway. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 20:36:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:34:11 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310221507.h9MF7od27394@12-236-54-216.client.attbi.com> Message-ID: <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> Guido: > > I probably missed it in this monster of a thread, but how do > > generator expressions do this? It seems that they'd only make > > reduce more efficient, but it would still be just as needed as > > before. > > All we need is more standard accumulator functions like sum(). There > are many useful accumulator functions that aren't easily expressed as > a binary operator but are easily done with an explicit iterator > argument, so I am hopeful that the need for reduce will disappear. But this would still be true even if we introduced such functions *without* generator expressions, i.e. given some new standard accumulator foo_accumulator which accumulates using foo_function, you can write r = foo_accumulator(some_seq) instead of r = reduce(foo_function, some_seq) regardless of whether some_seq is a regular list or a generator expression. So it seems to me that generator expressions have *no* effect on the need or otherwise for reduce, and any suggestion to that effect should be removed from the PEP as misleading and confusing. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 23:37:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:37:45 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> Message-ID: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> > > pipe = source > > for p in predicates: > > pipe = e for e in pipe if ^p(e) > > Bah. Arbitrary semantics bound to line-noise characters. Guess what > that reminds me of. :-) If anyone can think of anything less line-noisy, I'm open to suggestions. The important thing is the idea of explicitly capturing an enclosing binding, however it's expressed. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From eppstein at ics.uci.edu Wed Oct 22 23:59:46 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 23:59:51 2003 Subject: [Python-Dev] Re: product() References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: In article <002401c39907$0176f5a0$e841fea9@oemcomputer>, "Raymond Hettinger" wrote: > In the course of writing up Pep 289, it became clear that > the future has a number of accumulator functions in store. > Each of these is useful with iterators of all stripes and > each helps eliminate a reason for using reduce(). Maybe it would be useful to get some feeling for how much other functions get used in reduce? 
I took a look through some of my own code, and found: - three loops with |= and &= that could have been done as a reduction on a generator expression (but for now will stay loops) - one call reduce(f,...) where f is not known until run time - no products. My guess is that, after sum, the functions used in reduce get a lot more diverse, and that trying to replace all of them with builtins is not feasible. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tjreedy at udel.edu Thu Oct 23 00:03:29 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 00:03:35 2003 Subject: [Python-Dev] Re: closure semantics References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com><5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: "David Eppstein" wrote in message news:eppstein-567571.16030622102003@sea.gmane.org... > If I implement it as a collection of methods of an object, I then have > to include a separate function which creates an instance of the object > and immediately destroys it. This seems clumsy and also doesn't fit > with my intuition about what objects are for (representing persistent > structure). Also, again, modularity is violated -- outside code should > not be making instances of this object or accessing its methods. So why not define the class inside the master function to keep it private? For a complex algorithm, re-setup time should be relatively negligible. Terry J. Reedy From greg at cosc.canterbury.ac.nz Thu Oct 23 00:07:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 00:07:43 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com> Message-ID: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> > I have to stand on my head to understand what it > does. 
This is even the case for examples like > > reduce(lambda x, y: x + y.foo, seq) It occurs to me that, with generator expressions, such cases could be rewritten as reduce(lambda x, y: x + y, (z.foo for z in seq)) i.e. any part of the computation that only depends on the right argument can be factored out into the generator. So I might have to take back some of what I said earlier about generator comprehensions being independent of reduce. But if I understand you correctly, what you're saying is that the interesting cases are the ones where there isn't a ready-made binary function that does what you want, in which case you're going to have to spell everything out explicitly anyway one way or another. In that case, the most you could gain from a reduce syntax would be that it's an expression rather than a sequence of statements. But the same could be said of list comprehensions -- and *was* said quite loudly by many people in the early days, if I recall correctly. What's the point, people asked, when writing out a set of nested loops is just about as easy? Somehow we came to the conclusion that being able to write a list comprehension as an expression was a valuable thing to have, even if it wasn't significantly shorter or clearer. What about reductions? Do we feel differently? If so, why? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 00:16:19 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:15:55 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 16:32:19 +1300." 
<200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> Message-ID: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> > > Wouldn't this be even better? > > > > "create index ${table}_lid1_idx on $table($lid1)" % params > > I wouldn't object to that. I'd have expected *you* to > object to it, though, since it re-defines the meaning > of "$" in an interpolated string. I was just trying > to suggest something that would be backward-compatible. Correct, my proposal can't be backward-compatible. :-( But somehow I think that, for various cultural reasons (not just Perl :-) $ is a better character to use for interpolation than % -- this is pretty arbitrary, but it seems that $foo is just much more common than %foo as a substitution indicator, across various languages. (% is more common for C-style format strings of course.) There have been many proposals in this area, even a PEP (PEP 215, which I don't like that much, despite its use of $). Many people have also implemented something along these lines, using a function to request interpolation (or using template files etc.), and using various things (from dicts to namespaces) as the source for names. Anyway, I think this is something that can wait until 3.0, and I'd rather not have too many discussions here at once, so I'd rather unhelpfully punt than take this on for real (also for the benefit of Brett, who has to sort through all of this for his python-dev summary). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:17:28 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:17:05 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Thu, 23 Oct 2003 12:35:52 +1300." 
<200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> References: <200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> Message-ID: <200310230417.h9N4HSn01539@12-236-54-216.client.attbi.com> > Guido: > > The variable of a for *statement* must be accessible after the loop > > because you might want to break out of the loop with a specific > > value. This is a common pattern that I have no intent of breaking. [Greg] > It wouldn't be a great hardship if the loop variable > weren't accessible after the break, because you can > always write > > for x in stuff: > if meets_condition(x): > result = x > break > do_something_with(result) > > which is arguably a clearer way to write it, anyway. I don't know. It seems to add clutter. I don't see the big urge to limit the scope of loop control variables. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:20:30 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:19:58 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Thu, 23 Oct 2003 13:36:20 +1300." <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> References: <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> Message-ID: <200310230420.h9N4KUl01570@12-236-54-216.client.attbi.com> > Guido: > > > I probably missed it in this monster of a thread, but how do > > > generator expressions do this? It seems that they'd only make > > > reduce more efficient, but it would still be just as needed as > > > before. > > > > All we need is more standard accumulator functions like sum(). There > > are many useful accumulator functions that aren't easily expressed as > > a binary operator but are easily done with an explicit iterator > > argument, so I am hopeful that the need for reduce will disappear. > > But this would still be true even if we introduced such functions > *without* generator expressions, i.e. 
given some new standard > accumulator foo_accumulator which accumulates using foo_function, you > can write > > r = foo_accumulator(some_seq) > > instead of > > r = reduce(foo_function, some_seq) > > regardless of whether some_seq is a regular list or a generator > expression. > > So it seems to me that generator expressions have *no* effect on the > need or otherwise for reduce, and any suggestion to that effect should > be removed from the PEP as misleading and confusing. After some thinking, I agree. The only (indirect) link is that generator expressions make it more attractive to start writing accumulator functions, and having more accumulator functions available eliminates the need for reduce(). I'll update the PEP as needed (Raymond already toned down its mention of reduce()). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:25:49 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:25:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 16:37:11 +1300." <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> References: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> Message-ID: <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com> > > > pipe = source > > > for p in predicates: > > > pipe = e for e in pipe if ^p(e) > > > > Bah. Arbitrary semantics bound to line-noise characters. Guess what > > that reminds me of. :-) > > If anyone can think of anything less line-noisy, I'm > open to suggestions. The important thing is the idea of > explicitly capturing an enclosing binding, however it's > expressed. I think that no matter what notation you invent, this will remain an unpythonic thing. I can't quite explain why I feel that way. Maybe it's because it feels very strongly like a directive to the compiler -- Python's compiler likes to stay out of the way and not need help. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:29:13 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:28:34 2003 Subject: [Python-Dev] Re: product() In-Reply-To: Your message of "Wed, 22 Oct 2003 20:59:46 PDT." References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <200310230429.h9N4TD301617@12-236-54-216.client.attbi.com> > My guess is that, after sum, the functions used in reduce get a lot more > diverse, and that trying to replace all of them with builtins is not > feasible. That matches my intuition. I figure even if we just started deprecating reduce() without offering a replacement there wouldn't be many complaints. reduce() just doesn't get enough mileage. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Thu Oct 23 00:41:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 00:41:28 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> Guido: > Many people have also implemented something along these lines, using a > function to request interpolation (or using template files etc.), and > using various things (from dicts to namespaces) as the source for > names. I'm not asking for interpolation out of the current namespace or anything like that -- just a simple extension to the current set of formats for interpolating from a dict, that could be done right now without affecting anything. I'd be willing to supply a patch if it has some chance of being accepted. I agree that the more esoteric proposals are best left until later. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 00:50:43 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:50:07 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 17:41:11 +1300." <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> References: <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> Message-ID: <200310230450.h9N4ohh01673@12-236-54-216.client.attbi.com> > I'm not asking for interpolation out of the current namespace > or anything like that -- just a simple extension to the current > set of formats for interpolating from a dict, that could be > done right now without affecting anything. I'd be willing to > supply a patch if it has some chance of being accepted. > > I agree that the more esoteric proposals are best left until > later. But adding to % interpolation makes it less likely that a radically different (and better) approach will be implemented, because the status quo will be closer to "good enough" without being "right". --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Oct 23 01:11:01 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:11:11 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz> <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310230248.h9N2meb01254@12-236-54-216.client.attbi.com... > It still suffers from my main problem with reduce(), which is not its > verbosity (far from it) but that except for some special cases (mainly > sum and product) I have to stand on my head to understand what it > does. This is even the case for examples like > > reduce(lambda x, y: x + y.foo, seq) > > which is hardly the epitome of complexity.
Who here knows for sure it > shouldn't rather be > > reduce(lambda x, y: x.foo + y, seq) > > without going through an elaborate step-by-step execution? I do and Raymond Hettinger should. Doc bug 821701 addressed this confusion. I suggested the addition of "The first (left) argument is the accumulator; the second (right) is the update value from the sequence. The accumulator starts as the initializer, if given, or as seq[0]. " but don't know yet what Raymond actually did. For remembering, the arg order corresponds to left associativity: ...(((a op b) op c) op d) ... . For clarity, the updater should be written with real arg names: lambda sum, item: sum + item.foo Now sum.foo + item is pretty obviously wrong. I think it a mistake to make the two args of the update function look symmetric when they are not. Even if the same type, the first represents a cumulation of several values (and the last return value) while the second is just one (new) value. Terry J. Reedy From tjreedy at udel.edu Thu Oct 23 01:25:25 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:26:20 2003 Subject: [Python-Dev] Re: Re: buildin vs. shared modules References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com><200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com><7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: "Thomas Heller" wrote in message news:65ihlodo.fsf@python.net... > The whole delayload/__try/__except stuff may be unneeded in 2.4, because > it will most probably be compiled with MSVC7.1, installed via an msi > installer, and all systems where the msi actually could be installed > would already have a winsock (or winsock2) dll. At least that is my > impression on what I hear about systems older than (or including?) > win98SE these days. There are a *lot* of Win98 systems that are not officially 'SE', although a lot of SE stuff has been added thru Windows Update. 
They are both newer and more numerous, I believe, than some of the other OSes supported. I would hate for Python to cease working on them. (I have one, and my wife three or four.) So I would hope that a C7.1 build is tested on such before an irrevocable commitment is made. Terry J. Reedy From guido at python.org Thu Oct 23 01:43:40 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 01:43:24 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 17:07:03 +1300." <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> Message-ID: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com> [Guido] > > I have to stand on my head to understand what it > > does. This is even the case for examples like > > > > reduce(lambda x, y: x + y.foo, seq) [Greg] > It occurs to me that, with generator expressions, > such cases could be rewritten as > > reduce(lambda x, y: x + y, (z.foo for z in seq)) > > i.e. any part of the computation that only depends on > the right argument can be factored out into the > generator. So I might have to take back some of what > I said earlier about generator comprehensions being > independent of reduce. > > But if I understand you correctly, what you're saying > is that the interesting cases are the ones where there > isn't a ready-made binary function that does what > you want, in which case you're going to have to spell > everything out explicitly anyway one way or another. (And then spelling it out so that it works with reduce() reduces clarity.) > In that case, the most you could gain from a reduce > syntax would be that it's an expression rather than > a sequence of statements. > > But the same could be said of list comprehensions -- > and *was* said quite loudly by many people in the early > days, if I recall correctly. What's the point, people > asked, when writing out a set of nested loops is just > about as easy? 
Some people still hate LC's for this reason. > Somehow we came to the conclusion that being able to > write a list comprehension as an expression was a > valuable thing to have, even if it wasn't significantly > shorter or clearer. What about reductions? Do we feel > differently? If so, why? IMO LC's *are* significantly clearer because the notation lets you focus on what goes into the list (e.g. the expression "x**2") and under what conditions (e.g. the condition "x%2 == 1") rather than how you get it there (i.e. the initializer "result = []" and the call "result.append(...)"). This is an incredibly common idiom in the use of loops; for experienced programmers the boilerplate disappears when they read the code, but for less experienced readers it takes more time to recognize the idiom. I think this is at least in part due to the fact that there are more details that can be written differently, e.g. the name of the result variable, and exactly at which point it is initialized. I think that for reductions the gains are less clear. The initializer for the result variable and the call that updates it are no longer boilerplate, because they vary for each use; plus the name of the result variable should be chosen carefully because it indicates what kind of result it is (e.g. a sum or product). So, leaving out the condition for now, the pattern or idiom is:

    <result> = <initializer>
    for <variable> in <sequence>:
        <result> = <expression>

(Where <expression> uses <result> and <variable>.) If we think of this as a template with parameters, there are five parameters! (A LC without a condition only has 3: <expression>, <variable> and <sequence>.) No matter how hard you try, a macro with 5 parameters will have a hard time conveying the meaning of each without being at least as verbose as the full template. We could reduce the number of template parameters to 4 by leaving <result> anonymous; we could then refer to it by e.g. "_" in <expression>, which is more concise and perhaps acceptable, but makes certain uses more strained (e.g. mean() below).
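Written out as an ordinary higher-order function rather than new syntax, the five-parameter template above boils down to something like the following sketch (essentially reduce() with a mandatory initializer; the name reduction and the examples are illustrative):

```python
def reduction(initializer, update, sequence):
    # <result> = <initializer>
    # for <variable> in <sequence>: <result> = <expression>
    result = initializer
    for item in sequence:
        result = update(result, item)
    return result

S = [1, 2, 3]
sum_sq = reduction(0, lambda _, x: _ + x**2, S)           # 1 + 4 + 9 == 14
horner = reduction(0, lambda _, c: _ * 2 + c, [6, 3, 4])  # 6*2**2 + 3*2 + 4 == 34
```

The `<variable>` parameter of the template disappears here because the update callable names it itself; that is exactly the "leave `<result>` anonymous" trade-off discussed above, applied one step further.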
Just for fun, let me try to propose a macro syntax:

    reduction(<initializer>, <expression>, <variable>, <sequence>)

(I think it's better to have <initializer> as the first parameter, but you can quibble.) For example:

    reduction(0, _+x**2, x, S)

Lavishly sprinkle syntactic sugar, and perhaps it can become this ('reduction' would have to be a reserved word):

    reduction(0, _+x**2 for x in S)

A few more examples using this notation:

    # product(S), if Raymond's product() builtin is accepted
    reduction(1, _*x for x in S)

    # mean of f(x); uses result tuple and needs result postprocessing
    total, n = reduction((0, 0), (_[0]+f(x), _[1]+1) for x in S)
    mean = total/n

    # horner(S, x): evaluate a polynomial over x: [6, 3, 4] => 6*x**2 + 3*x + 4
    reduction(0, _*x + c for c in S)

In each of these cases I have the same gut response as to writing these using reduce(): the notation is too "concentrated", I have to think so hard before I understand what it does that I wouldn't mind having it spread over three lines. Compare the above four examples to:

    sum = 0
    for x in S:
        sum += x**2

    product = 1
    for x in S:
        product *= x

    total, n = 0, 0
    for x in S:
        total += f(x)
        n += 1
    mean = total/n

    horner = 0
    for c in S:
        horner = horner*x + c

I find that these cause much less strain on the eyes. (BTW the horner example shows that insisting on augmented assignment would reduce the power.) Concluding, I think the reduce() pattern is doomed -- the template is too complex to capture in special syntax. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 01:48:17 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 01:47:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 22:07:48 EDT." References: Message-ID: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> (This is drawing to a conclusion. Summary: Tim has convinced me.)
> > There are other places in Python where some rule is applied to "all > > free variables of a given piece of code" (the distinction between > > locals and non-locals in functions is made this way). But there are > > no other places where implicit local *copies* of all those free > > variables are taken. > > I didn't suggest to copy anything, just to capture the bindings in use at > the time a generator expression is evaluated. Sorry, I meant a pointer copy, not an object copy. That's a binding capture. > This is easy to explain, and trivial to explain for people familiar > with the default-argument trick. Phillip Eby already recommended not bothering with that; the default-argument rule is actually confusing for newbies (they think the defaults are evaluated at call time) so it's best not to bring this into the picture. > Whenever I've written a list-of-generators, or in the recent example > a generator pipeline, I have found it semantically necessary, > without exception so far, to capture the bindings of the variables > whose bindings wouldn't otherwise be invariant across the life of > the generator. If it turns out that this is always, or nearly > always, the case, across future examples too, then it would > just be goofy not to implement generator expressions that way > ("well, yes, the implementation does do a wrong thing in every > example we had, but what you're not seeing is that the explanation > would have been a line longer had the implementation done a useful > thing instead" ). > > > I'd need to find a unifying principle to warrant doing that beyond > > utility. > > No you don't -- you just think you do . OK, I got it now. I hope we can find another real-life example; but there were some other early toy examples that also looked quite convincing. I'll take a pass at updating the PEP.
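For the record, the semantics eventually specified in PEP 289 went the other way: only the outermost iterable of a generator expression is evaluated immediately, and other free variables remain late-bound. A minimal sketch (illustrative names) of the late-binding behavior Tim's list-of-generators examples are about:

```python
def make_gens():
    gens = []
    for i in range(3):
        gens.append(i * x for x in (1, 2))  # i is a free variable, looked up lazily
    return gens

# By the time the generators run, i is 2 in make_gens's scope,
# so all three generators produce the same values.
results = [list(g) for g in make_gens()]
```

This is exactly the case where capturing the binding of `i` at creation time (or using the default-argument trick) would have given three distinct generators.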
--Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Oct 23 01:51:45 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:51:50 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: "Alex Martelli" wrote in message news:20031022161137.96353.qmail@web40513.mail.yahoo.com... > Inside a module M's body ("toplevel" in it, not nested inside > a def &c) I can write > x = 23 > and it means M.x = 23 (unconditionally). Once the module > object M is created, if I want to tweak that attribute > of M I have to write e.g. M.x = 42 after getting ahold of > some reference to M (say by an "import M", or say in a function > of M by sys.modules[__name__].x = 42, etc). I somehow overlooked that this would work inside modules also.

    >>> import __main__ as m # I know, not general, just for trial
    >>> m.c=3
    >>> c
    3
    >>> def e():
    ...     m.x='ha'
    ...
    >>> e()
    >>> x
    'ha'

So I really *don't* need global. Perhaps a new builtin

    def me():
        import sys
        return sys.modules[__name__]

or an addition to my template.py file. Terry J. Reedy From guido at python.org Thu Oct 23 02:42:12 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 02:41:29 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Wed, 22 Oct 2003 21:20:30 PDT."
Message-ID: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> I've checked in an update to Raymond's PEP 289 which (I hope) clarifies a lot of things, and settles the capturing of free variables. Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) I'm sure there will be plenty of misunderstandings in the discussion there. If these are due to lack of detail or clarity in the PEP, feel free to update the PEP. If there are questions that need us to go back to the drawing board or requiring BDFL pronouncement, take it back to python-dev. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Oct 23 02:41:00 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 02:42:43 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <4qy1qfs5.fsf@python.net> References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> <4qy1qfs5.fsf@python.net> Message-ID: Thomas Heller writes: > VC7 can convert VC6 workspace and project files into its own format, > but there is no way back. You cannot use VC7 files (they are called > solution instead of workspace) in VC6 anymore. MvL suggested to convert > the files once and then deprecate using the VC6 workspace. Indeed: Conversion works fairly well, but we (as python-devers) should agree on using a single compiler - otherwise, conflicting changes will occur. So I propose to actually move the VC6 project files elsewhere; anybody who wants to continue to use them would need to copy them back. I could implement that very quickly; I just need agreement that we should do so. We would also need agreement on whether to use VC7 (Studio .NET) or VC 7.1 (Studio .NET 2003); I propose to use the latter. 
> MvL again has the idea to create the msi (which is basically a database) > programmatically with Python - either via COM, a custom Python extension > or maybe ctypes. I haven't made much progress with that, though. Initially I plan to use the MSI COM interface, and I'm fairly certain that this can be done, but it also takes some effort. On the plus side, anybody could then do the packaging - you would only need PythonWin installed. That requirement could be dropped by using the C API to installer. To build necessary extension module, you would need to have the Installer SDK installed (which comes with the platform SDK); I haven't checked whether VC 7.1 ships with the necessary libraries (in which case there would be no additional prerequisites). Regards, Martin From bac at OCF.Berkeley.EDU Thu Oct 23 02:48:03 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 23 02:48:13 2003 Subject: [Python-Dev] setjmp/longjmp exception handling (was: More informative error messages) In-Reply-To: References: Message-ID: <3F9779A3.7000504@ocf.berkeley.edu> Tim Peters wrote: > An internal PyExc_AttributeError isn't the same as a user-visible > AttributeError, though -- a class instance isn't created unless and until > PyErr_NormalizeException() gets called because the exception needs to be > made user-visible. If the latter never happens, setting and clearing > exceptions internally is pretty cheap (a pointer to the global > PyExc_AttributeError object is stuffed into the thread state). OTOH, almost > every call to a C API function has to test+branch for an error-return value, > and I've often wondered whether a setjmp/longjmp-based hack might allow for > cleaner and more optimizable code (hand-rolled "real exception handling"). > For some odd reason (maybe because of all the code touch-ups I did to Python/ast.c in the AST branch), the idea of doing exception handling in C using setjmp/longjmp really appealed to me. 
So, being a programmer with an itch that needed to be scratched, I came up with a possible solution. Even if the idea would work (which I don't know if it will just because I am not sure how thread-safe it is nor if the code will work; this was a mental exercise that doesn't compile because of casting of jmp_buf and such), I doubt it will ever be incorporated into Python just because it would require so much change to the C code. But hey, who knows. The basic idea is to keep a stack of jmp_buf points. They are pushed on to the stack when a chunk of code wants to handle an exception. The basic code is in the function try_except(); have an 'if' that calls a function that pushes on to the stack a new jmp_buf and register it in the conditional check. When an exception is raised a function is called (makejmp()) that pops the stack and jumps to the jmp_buf that is popped. Continue until the last item on the stack is reached which should be PyErr_NormalizeException() (I think that is the function that exposes an exception to Python code). I have no clue how much performance benefit/loss there would be from this, but code would be cleaner since you wouldn't have to do constant ``if (fxn() == NULL) return NULL;`` checks for raised exceptions. 
But in case anyone cares, here is the *very* rough C code:

#include <setjmp.h>
#include <stdlib.h>
#include <string.h>  /* for memmove() */

/* Basically just a stack item */
typedef struct jmp_stack_item_struct {
    jmp_buf jmp_point;
    struct jmp_stack_item_struct *previous;
} jmp_stack_item;

/* Global stack of jmp points to exception handlers */
jmp_stack_item *jmp_stack;

void try_except(void)
{
    jmp_stack = NULL;
    /* try: */
    if (!setjmp(allocjmp())) {
        ;
    }
    /* except: */
    else {
        ;
    }
}

/* returning jmp_buf like this makes gcc unhappy since it is an array */
jmp_buf allocjmp(void)
{
    /* malloc jmp_buf and put on top of stack */
    /* return malloc'ed jmp_buf */
    jmp_stack_item *new_jmp = (jmp_stack_item *) malloc(sizeof(jmp_stack_item));
    if (!jmp_stack) {
        new_jmp->previous = NULL;
    }
    else {
        new_jmp->previous = jmp_stack;
    }
    jmp_stack = new_jmp;
    return new_jmp->jmp_point;
}

void raise(void)
{
    /* Exception set; now call... */
    makejmp();
}

void makejmp(void)
{
    jmp_stack_item *top_jmp = jmp_stack;
    jmp_buf jmp_to;
    if (!jmp_stack->previous)
        longjmp(jmp_stack->jmp_point, 1);
    else {
        memmove(jmp_to, top_jmp->jmp_point, sizeof(jmp_to));
        jmp_stack = top_jmp->previous;
        free(top_jmp);
        longjmp(jmp_to, 1);
    }
}

From guido at python.org Thu Oct 23 02:49:37 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 02:48:55 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: Your message of "23 Oct 2003 08:41:00 +0200." References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> <4qy1qfs5.fsf@python.net> Message-ID: <200310230649.h9N6nbS02025@12-236-54-216.client.attbi.com> > > VC7 can convert VC6 workspace and project files into its own format, > > but there is no way back. You cannot use VC7 files (they are called > > solution instead of workspace) in VC6 anymore.
MvL suggested to convert > > the files once and then deprecate using the VC6 workspace. > > Indeed: Conversion works fairly well, but we (as python-devers) should > agree on using a single compiler - otherwise, conflicting changes will > occur. So I propose to actually move the VC6 project files elsewhere; > anybody who wants to continue to use them would need to copy them back. > > I could implement that very quickly; I just need agreement that we > should do so. We would also need agreement on whether to use VC7 > (Studio .NET) or VC 7.1 (Studio .NET 2003); I propose to use the > latter. Right. Microsoft donated 10 copies of VC7.1 to various key Python developers (including me, Tim Peters and Jeremy Hylton). > > MvL again has the idea to create the msi (which is basically a database) > > programmatically with Python - either via COM, a custom Python extension > > or maybe ctypes. > > I haven't made much progress with that, though. Initially I plan to > use the MSI COM interface, and I'm fairly certain that this can be > done, but it also takes some effort. > > On the plus side, anybody could then do the packaging - you would only > need PythonWin installed. That requirement could be dropped by using > the C API to installer. To build necessary extension module, you would > need to have the Installer SDK installed (which comes with the > platform SDK); I haven't checked whether VC 7.1 ships with the > necessary libraries (in which case there would be no additional > prerequisites). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Oct 23 02:45:19 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 02:57:43 2003 Subject: [Python-Dev] Re: Re: buildin vs. 
shared modules In-Reply-To: References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: "Terry Reedy" writes: > So I would hope that a C7.1 build is tested on such before an > irrevocable commitment is made. That will happen only if there are volunteers to test it. Those volunteers would need to be very active while the transition occurs, i.e. build from CVS instead of just trying out installable packages (because initially, there would not be any installable packages). That said, I'm quite confident that a VC7.1-built-MSI-packaged application could be installed even on Win95 - you would have to install installer first, though (by means of the four-files packaging approach). Regards, Martin From mcherm at mcherm.com Thu Oct 23 03:29:48 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 23 03:29:49 2003 Subject: [Python-Dev] product() Message-ID: <1066894188.3f97836cbb106@mcherm.com> [Raymond, recently]: > The product() accumulator is the one destined to be a builtin. > > Though it is not nearly as common as sum(), it does enjoy > some popularity. Having it available will help dispense > with reduce(operator.mul, data, 1). > > Would there be any objections to my adding product() to > Py2.4? The patch was simple and it is ready to go unless > someone has some major issue with it. Just wanted to bring you a blast from the past: [http://mail.python.org/pipermail/python-dev/2003-April/034784.html] [Alex Martelli:] > I think I understand the worry that introducing 'sum' would be the start > of a slippery slope leading to requests for 'prod' (I can't think of other > bulk operations that would be at all popular -- perhaps bulk and/or, but > I think that's stretching it). But I think it's a misplaced worry in this > case. 
"Adding up a bunch of numbers" is just SO much more common > than "Multiplying them up" (indeed the latter's hardly idiomatic English, > while "adding up" sure is), that I believe normal users (as opposed to > advanced programmers with a keenness on generalization) wouldn't > have any problem at all with 'sum' being there and 'prod' missing... I have nothing to add... Alex said it much better than I could. -- Michael Chermside From barry at python.org Thu Oct 23 07:59:11 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 07:59:15 2003 Subject: [Python-Dev] product() In-Reply-To: <002401c39907$0176f5a0$e841fea9@oemcomputer> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <1066910350.11634.7.camel@anthem> On Wed, 2003-10-22 at 21:43, Raymond Hettinger wrote: > In the course of writing up Pep 289, it became clear that > the future has a number of accumulator functions in store. In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at resurrecting PEP 274, what if we made dict (and maybe tuple) accumulator functions too? Then if something like dict(genex) would work, how hard would it be to add some syntactic sugar for that in {genex}? Aren't we kind of close already? >>> from __future__ import generators >>> def a(): ... for x in 'hello world': ... yield x ... >>> dict([(c, c) for c in a()]) {' ': ' ', 'e': 'e', 'd': 'd', 'h': 'h', 'l': 'l', 'o': 'o', 'r': 'r', 'w': 'w'} Okay, I promise, I'll shut up now about PEP 274. pass-the-joe-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/628f09a2/attachment.bin From pinard at iro.umontreal.ca Thu Oct 23 08:31:53 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 23 08:32:08 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <20031023123153.GA20072@alcyon.progiciels-bpi.ca> [Guido van Rossum] > > > Wouldn't this be even better? > > > "create index ${table}_lid1_idx on $table($lid1)" % params "Better" because it uses `$' instead of `%'? It is really a matter of taste and aesthetics, more than being "better" on technical grounds. Technically, the multiplication of aspects and paradigms goes against some unencumbrance and simplicity, which made Python attractive to start with. We would lose something probably not worth the gain. > it seems that $foo is just much more common than > %foo as a substitution indicator, across various languages. Python has the right of being culturally distinct on some details. I see it as an advantage: when languages are too similar, some confusion arises between differences. The distinction actually helps. > Anyway, I think this is something that can wait until 3.0, and I'd > rather not have too many discussions here at once,
In-Reply-To: References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: <1066915360.11634.11.camel@anthem> On Thu, 2003-10-23 at 01:51, Terry Reedy wrote: > So I really *don't* need global. Perhaps a new builtin > > def me(): > import sys > return sys.modules[__name__] +1, or just "import __me__" I've often wanted a convenient way to get a hold of the current module object. I use something like def me(), but it's a bit ugly and magical looking. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/5aa0a928/attachment-0001.bin From skip at pobox.com Thu Oct 23 09:55:22 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 09:55:53 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> Message-ID: <16279.56778.309781.129469@montanaro.dyndns.org> Greg> I would *really* like to be able to write this as Greg> "create index %{table}_lid1_idx on %{table}(%{lid1})" % params Greg> which I find to be much easier on the eyes. What if lid1 is a float which you want to display with two digits past the decimal point? I think we've been around the block on this one a few times. While %{foo} might be a convenient shorthand for %(foo)s, I don't think it saves enough space (one character) or stands out that much more ("{...}" instead of "(...)s") to make the addition worthwhile. In addition, you'd have to retain the current construct in cases where something other than simple string interpolation was required, in which case you also have the problem of having two almost identical ways to do dictionary interpolation. 
Skip From skip at pobox.com Thu Oct 23 10:08:11 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:08:19 2003 Subject: [Python-Dev] Re: product() In-Reply-To: References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <16279.57547.169388.138165@montanaro.dyndns.org> David> Maybe it would be useful to get some feeling for how much other David> functions get used in reduce? Looking at my own code collection I found five instances of reduce(), all used either a defined sum function or the equivalent lambda. There are probably many other contexts where I might have used reduce, but where it either didn't occur to me or didn't make the code easier to read or faster. I'd be happy if you deprecated reduce() today. ;-) Skip From skip at pobox.com Thu Oct 23 10:16:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:16:10 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: <16279.58018.40303.136992@montanaro.dyndns.org> >>> import __main__ as m # I know, not general, just for trial >>> m.c=3 Isn't (in 3.0) the notion of being able to modify another module's globals supposed to get restricted to help out (among other things) the compiler? If so, this use, even though it's not really modifying a global in another module, might not work forever. Skip From ark-mlist at att.net Thu Oct 23 10:18:48 2003 From: ark-mlist at att.net (Andrew Koenig) Date: Thu Oct 23 10:18:55 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <009301c39970$94247530$6402a8c0@arkdesktop> > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) One thought: If we eventually adopt the notation that {a, b, c} is a set, there is a potential ambiguity in expressions such as {x**2 for x in range(n)}. 
Which is it, a set comprehension or a set with one element that is a generator expression? It would have to be the former, of course, by analogy with [x**2 for x in range(n)], which means that if we introduce generator expressions, and we later introduce set literals, we will have to introduce set comprehensions at the same time. Either that or prohibit generator expressions as set-literal elements unless parenthesized -- i.e. {(x**2 for x in range(n))}. From barry at python.org Thu Oct 23 10:46:48 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 10:46:57 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <1066920408.11634.89.camel@anthem> On Thu, 2003-10-23 at 00:16, Guido van Rossum wrote: > There have been many proposals in this area, even a PEP (PEP 215, > which I don't like that much, despite its use of $). And PEP 292, which I probably should update. I should mention that $string substitutions are optional in Mailman 2.1, but they will be the only way to do it in Mailman 3. I've played a lot with various implementations of this idea, and below is the one I've currently settled on. Not all of the semantics may be perfect for core Python (i.e. never throw a KeyError), but this is all doable in modern Python, and for user-exposed templates, gets a +1000 in my book. 
>>> s = dstring('${person} lives in $where and owes me $$${amount}')
>>> d = safedict(person='Guido', where='California', amount='1,000,000')
>>> print s % d
Guido lives in California and owes me $1,000,000
>>> d = safedict(person='Tim', amount=.13)
>>> print s % d
Tim lives in ${where} and owes me $0.13

-Barry

import re

# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

EMPTYSTRING = ''

class dstring(unicode):
    def __new__(cls, ustr):
        ustr = ustr.replace('%', '%%')
        parts = dre.split(ustr)
        for i in range(1, len(parts), 4):
            if parts[i] is not None:
                parts[i] = '$'
            elif parts[i+1] is not None:
                parts[i+1] = '%(' + parts[i+1] + ')s'
            else:
                parts[i+2] = '%(' + parts[i+2] + ')s'
        return unicode.__new__(cls, EMPTYSTRING.join(filter(None, parts)))

class safedict(dict):
    """Dictionary which returns a default value for unknown keys."""
    def __getitem__(self, key):
        try:
            return super(safedict, self).__getitem__(key)
        except KeyError:
            return '${%s}' % key

-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/2684c83b/attachment.bin From barry at python.org Thu Oct 23 10:53:15 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 10:53:22 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <20031023123153.GA20072@alcyon.progiciels-bpi.ca> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> <20031023123153.GA20072@alcyon.progiciels-bpi.ca> Message-ID: <1066920795.11634.96.camel@anthem> On Thu, 2003-10-23 at 08:31, François Pinard wrote: > [Guido van Rossum] > > > > Wouldn't this be even better?
> > > > "create index ${table}_lid1_idx on $table($lid1)" % params > > "Better" because it uses `$' instead of `%'? It is really a matter of > taste and aesthetics, more than being "better" on technical grounds. > Technically, the multiplication of aspects and paradigms goes against > some unencumbrance and simplicity, which made Python attractive to > start with. We would lose something probably not worth the gain. Better because the trailing type specifier on %-strings is extremely error prone (#1 cause of bugs for Mailman translators is/was leaving off the trailing 's'). Better because the rules for $-strings are simple and easy to explain. Better because the enclosing braces are optional, and unnecessary in the common case, making for much more readable template strings. And yes, better because it uses $ instead of %; it just seems that more people grok that $foo is a placeholder. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/81a3c016/attachment.bin From guido at python.org Thu Oct 23 10:56:27 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 10:57:58 2003 Subject: [Python-Dev] product() In-Reply-To: Your message of "Thu, 23 Oct 2003 07:59:11 EDT." <1066910350.11634.7.camel@anthem> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <1066910350.11634.7.camel@anthem> Message-ID: <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> > In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at > resurrecting PEP 274, what if we made dict (and maybe tuple) > accumulator functions too? There's nothing magical about accumulator functions; they're just functions taking an iterable. We have tons of these today, and tuple() and dict() are among them.
Once the syntax works, dict((k,k) for k,k in "hello") will work without changes to dict. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Oct 23 10:56:31 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:58:05 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: References: Message-ID: <16279.60447.29714.759275@montanaro.dyndns.org> fred> - add "Why is Python installed on my computer?" as a documentation fred> FAQ since this gets asked at the docs at python.org address a fred> lot And I thought only webmaster@python.org got asked that question all the time. Does it get asked at other addresses as well? I don't recall ever seeing it on python-list. Skip From python at rcn.com Thu Oct 23 10:58:02 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 10:58:54 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <002e01c39976$0f130040$e841fea9@oemcomputer> [Guido] > I've checked in an update to Raymond's PEP 289 which (I hope) > clarifies a lot of things, and settles the capturing of free > variables. Nice edits. I'm unclear on the meaning of the last line in detail #3, "(Loop variables may also use constructs like x[i] or x.a; this form may be deprecated.)" Does this mean that "(x.a for x in mylist)" will initially be valid but will someday break? If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed. > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) Will do. Raymond Hettinger From barry at python.org Thu Oct 23 11:02:16 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:02:23 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
In-Reply-To: <16279.56778.309781.129469@montanaro.dyndns.org> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> Message-ID: <1066921335.11634.103.camel@anthem> On Thu, 2003-10-23 at 09:55, Skip Montanaro wrote: > What if lid1 is a float which you want to display with two digits past the > decimal point? BTW, I should mention that IMO, $-strings are great for end-user editable string templates, such as (in Mailman) things like translatable strings or message footer templates. But I also think the existing %-strings are just fine for programmers. I would definitely be opposed to complicating $-strings with any of the specialized and fine-grained control you have with %-strings. KISS and you'll have a great 99% solution, as long as you accept that the two substitution formats are aimed at different audiences. Then again, see my last post. I'm not sure anything needs to be added to core Python to support useful $-strings. Or maybe it can be implemented as a library module (or part of a 'textutils' package). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/b1b7b4bf/attachment.bin From guido at python.org Thu Oct 23 11:03:35 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:03:50 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Thu, 23 Oct 2003 10:18:48 EDT." <009301c39970$94247530$6402a8c0@arkdesktop> References: <009301c39970$94247530$6402a8c0@arkdesktop> Message-ID: <200310231503.h9NF3Zr02681@12-236-54-216.client.attbi.com> > If we eventually adopt the notation that {a, b, c} is a set, there is a > potential ambiguity in expressions such as {x**2 for x in range(n)}. 
Which > is it, a set comprehension or a set with one element that is a generator > expression? > > It would have to be the former, of course, by analogy with > [x**2 for x in range(n)], which means that if we introduce generator > expressions, and we later introduce set literals, we will have to introduce > set comprehensions at the same time. Either that or prohibit generator > expressions as set-literal elements unless parenthesized -- i.e. > {(x**2 for x in range(n))}. Don't worry. The current proposal *always* requires parentheses around generator expressions (but it may be the only argument to a function), so your example would be illegal. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Thu Oct 23 11:04:41 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:04:48 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <009301c39970$94247530$6402a8c0@arkdesktop> References: <009301c39970$94247530$6402a8c0@arkdesktop> Message-ID: <1066921481.11634.106.camel@anthem> On Thu, 2003-10-23 at 10:18, Andrew Koenig wrote: > > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) > > One thought: > > If we eventually adopt the notation that {a, b, c} is a set, there is a > potential ambiguity in expressions such as {x**2 for x in range(n)}. Which > is it, a set comprehension or a set with one element that is a generator > expression? > > It would have to be the former, of course, by analogy with > [x**2 for x in range(n)], which means that if we introduce generator > expressions, and we later introduce set literals, we will have to introduce > set comprehensions at the same time. Either that or prohibit generator > expressions as set-literal elements unless parenthesized -- i.e. > {(x**2 for x in range(n))}. Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension. 
okay-/now/-i'll-shut-up-about-them-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/31b49715/attachment-0001.bin From barry at python.org Thu Oct 23 11:05:57 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:06:19 2003 Subject: [Python-Dev] product() In-Reply-To: <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <1066910350.11634.7.camel@anthem> <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> Message-ID: <1066921556.11634.108.camel@anthem> On Thu, 2003-10-23 at 10:56, Guido van Rossum wrote: > > In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at > > resurrecting PEP 274, what if we made dict (and maybe tuple) > > accumulator functions too? > > There's nothing magical about accumulator functions; they're just > functions taking an iterable. We have tons of these today, and > tuple() and dict() are among them. Once the syntax works, > > dict((k,k) for k,k in "hello") > > will work without changes to dict. Cool! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/cd65cd86/attachment.bin From fdrake at acm.org Thu Oct 23 11:09:09 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu Oct 23 11:09:22 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <002e01c39976$0f130040$e841fea9@oemcomputer> References: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> <002e01c39976$0f130040$e841fea9@oemcomputer> Message-ID: <16279.61205.953516.442124@grendel.zope.com> Raymond Hettinger writes: > Does this mean that "(x.a for x in mylist)" will initially be valid but > will someday break? If so, I can't imagine why. Or does it mean that > the induction variable can be in that form, "(x for x.a in mylist)". > Surely, this would never be allowed. The latter. There's bound to be some seriously evil stuff out there, just waiting to pop up... ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From ark-mlist at att.net Thu Oct 23 11:12:07 2003 From: ark-mlist at att.net (Andrew Koenig) Date: Thu Oct 23 11:12:13 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <1066921481.11634.106.camel@anthem> Message-ID: <00b401c39978$074ba8b0$6402a8c0@arkdesktop> > Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension. No, it's a set comprehension where the set elements are pairs. The dict comprehension would be {x: x**2 for x in range(n)} Or would that be a single-element dict whose key is x and value is a generator expression? :-) From fdrake at acm.org Thu Oct 23 11:18:36 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 23 11:18:52 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <1066921335.11634.103.camel@anthem> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> Message-ID: <16279.61772.978424.304106@grendel.zope.com> Barry Warsaw writes: > Then again, see my last post. I'm not sure anything needs to be added > to core Python to support useful $-strings.
Or maybe it can be > implemented as a library module (or part of a 'textutils' package). +1 on adding this as a module. I've managed to implement this a few times, and it would be nice to just import the same implementation from everywhere I needed it. One note: calling this "interpolation" (at least when describing it to end users) is probably a mistake; "substitution" makes more sense to people not ingrained in communities where it's called interpolation. It might be ok to call it interpolation for programmers, but... there's no need for two different names for it. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip at pobox.com Thu Oct 23 11:22:40 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 11:22:50 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <1066921335.11634.103.camel@anthem> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> Message-ID: <16279.62016.628120.971560@montanaro.dyndns.org> Barry> On Thu, 2003-10-23 at 09:55, Skip Montanaro wrote: >> What if lid1 is a float which you want to display with two digits >> past the decimal point? Barry> BTW, I should mention that IMO, $-strings are great for end-user Barry> editable string templates, such as (in Mailman) things like Barry> translatable strings or message footer templates. ... Barry> Then again, see my last post. I'm not sure anything needs to be Barry> added to core Python to support useful $-strings. Or maybe it Barry> can be implemented as a library module (or part of a 'textutils' Barry> package). +1. If it's not something programmers will use (most of the time, anyway) there's no need to build it into the language. If programmers like it, it's only another module to import. In addition, I'm fairly certain such a module could be made compatible with Python as far back as 1.5.2 without a lot of effort. 
You also have the freedom to make it much more flexible (use of templates and so forth) if it's in a separate module. Skip From guido at python.org Thu Oct 23 11:35:56 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:36:51 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Thu, 23 Oct 2003 10:58:02 EDT." <002e01c39976$0f130040$e841fea9@oemcomputer> References: <002e01c39976$0f130040$e841fea9@oemcomputer> Message-ID: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> > I'm unclear on the meaning of the last line in detail #3, "(Loop > variables may also use constructs like x[i] or x.a; this form may be > deprecated.)" > > Does this mean that "(x.a for x in mylist)" will initially be valid but > will someday break? No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, and consequently (because they all share the same syntax) this is also allowed in list comprehensions and generator expressions. All uses should be disallowed. > If so, I can't imagine why. Or does it mean that > the induction variable can be in that form, "(x for x.a in mylist)". > Surely, this would never be allowed. We can prevent it for generator expressions, but it's too late for list comprehensions and regular for loops -- we'll have to go deprecate it there. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 11:38:18 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:38:26 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 10:22:40 CDT."
<16279.62016.628120.971560@montanaro.dyndns.org> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> <16279.62016.628120.971560@montanaro.dyndns.org> Message-ID: <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> I have too much on my plate (spent too much on generator expressions lately :-). I am bowing out of the variable substitution discussion after noting that putting it in a module would be a great start (like for sets). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Oct 23 11:48:15 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 11:48:26 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> References: <002e01c39976$0f130040$e841fea9@oemcomputer> <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> Message-ID: <16279.63551.557669.791100@montanaro.dyndns.org> Guido> No, I meant that "for x.a in mylist: ..." is valid but shouldn't Guido> be, Valid? I'll buy that, but it had never occurred to me. Useful? That's not immediately obvious:

>>> class Foo:
...     def __init__(self):
...         self.a = 42
...
>>> lst = [Foo() for i in range(4)]
>>> lst
[<__main__.Foo instance at 0x752760>, <__main__.Foo instance at 0x7529e0>,
 <__main__.Foo instance at 0x752df0>, <__main__.Foo instance at 0x752dc8>]
>>> [x for x.a in lst]
[Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.]
Skip From ws-news at gmx.at Thu Oct 23 11:47:24 2003 From: ws-news at gmx.at (Werner Schiendl) Date: Thu Oct 23 11:51:35 2003 Subject: [Python-Dev] Re: PEP 289: Generator Expressions (second draft) References: <1066921481.11634.106.camel@anthem> <00b401c39978$074ba8b0$6402a8c0@arkdesktop> Message-ID: Hello, this is my first post to this list, but I have followed it passively for quite some time. I had a thought about the list-with-one-iterator vs. list-comprehension issue that has not appeared (at least to my eyes) yet. Why not take the same approach as is already used for tuples? Just as (5) is just the value 5 and (5,) is a 1-tuple containing the value 5, I thought it would be intuitive to have

[x**2 for x in range(n)]   # be a list comprehension like it currently is
[x**2 for x in range(n),]  # a list with 1 iterator in it

> No, it's a set comprehension where the set elements are pairs. The dict > comprehension would be > > {x: x**2 for x in range(n)} > > Or would that be a single-element dict whose key is x and value is a > generator expression? :-) in this case the same could be applied

{x: x**2 for x in range(n)}   # dict comprehension
{x: x**2 for x in range(n),}  # dict with 1 iterator

(but "x" is probably not a valid name, is it?) best regards Werner From Paul.Moore at atosorigin.com Thu Oct 23 11:54:19 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 23 11:55:06 2003 Subject: [Python-Dev] PEP 289: Generator Expressions Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0991B@UKDCX001.uk.int.atosorigin.com> From: Skip Montanaro [mailto:skip@pobox.com] > Guido> No, I meant that "for x.a in mylist: ..." is valid but shouldn't > Guido> be, > Valid? I'll buy that, but it had never occurred to me. Useful? That's not > immediately obvious: Well, I'll certainly give you "not obviously useful", but...

>>> class Dummy:
...     def __init__(self):
...         self.a = 12
...
>>> d = Dummy()
>>> d.a
12
>>> [9 for d.a in range(4)]
[9, 9, 9, 9]
>>> d.a
3

Paul From python at rcn.com Thu Oct 23 11:55:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 11:57:12 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> Message-ID: <001101c3997e$094742e0$e841fea9@oemcomputer> [Guido] > No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, > and consequently (because they all share the same syntax) this is also > allowed in list comprehensions and generator expressions. All uses > should be disallowed. > > > If so, I can't imagine why. Or does it mean that > > the induction variable can be in that form, "(x for x.a in mylist)". > > Surely, this would never be allowed. > > We can prevent it for generator expressions, but it's too late for > list comprehensions and regular for loops -- we'll have to go > deprecate it there. Since the issue is not unique to generator expressions, I recommend leaving it out of the PEP and separately dealing with all for-constructs at one time. It's harder to win support for proposals that use the word "deprecate". Raymond Hettinger From barry at python.org Thu Oct 23 11:58:08 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:58:15 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
> > +1 on adding this as a module. Wasn't there talk of a textutils package around the time of textwrap.py? Maybe add that for Py2.4? > I've managed to implement this a few times, and it would be nice to > just import the same implementation from everywhere I needed it. > > One note: calling this "interpolation" (at least when describing it to > end users) is probably a mistake; "substitution" makes more sense to > people not ingrained in communities where it's called interpolation. > It might be ok to call it interpolation for programmers, > but... there's no need for two different names for it. ;-) Again +1 isn't strong enough. :) End users understand "substitution", they don't understand "interpolation". If started to use the former everywhere now. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/765c511d/attachment.bin From barry at python.org Thu Oct 23 12:01:04 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 12:01:55 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> <16279.62016.628120.971560@montanaro.dyndns.org> <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> Message-ID: <1066924863.11634.159.camel@anthem> On Thu, 2003-10-23 at 11:38, Guido van Rossum wrote: > I have too much on my plate (spent too much on generator expressions > lately :-). > > I am bowing out of the variable substitution discussion after noting > that putting it in a module would be a great start (like for sets). 
I don't have time to do it, but once Someone figures out where to situate it, feel free to use my posted code, either verbatim or as a starting point. PSF donation, blah, blah, blah. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/6c6e568d/attachment.bin From python at rcn.com Thu Oct 23 12:04:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 12:07:02 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <16279.63551.557669.791100@montanaro.dyndns.org> Message-ID: <001801c3997f$524bc6e0$e841fea9@oemcomputer> [Skip Montanaro] > Valid? I'll buy that, but it had never occurred to me. It had not occurred to me either. A moment's reflection on the implementation reveals that any lvalue will work, even a[:]. Rather than twist ourselves into knots trying to find ways to disallow it, I think it should be left in the realm of things that never occur to anyone and have never been a real problem. don't-ask-don't-tell-ly yours, Raymond From guido at python.org Thu Oct 23 13:04:16 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 13:04:24 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: Your message of "Thu, 23 Oct 2003 09:16:02 CDT." <16279.58018.40303.136992@montanaro.dyndns.org> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <16279.58018.40303.136992@montanaro.dyndns.org> Message-ID: <200310231704.h9NH4Gw03094@12-236-54-216.client.attbi.com> > >>> import __main__ as m # I know, not general, just for trial > >>> m.c=3 > > Isn't (in 3.0) the notion of being able to modify another module's globals > supposed to get restricted to help out (among other things) the compiler?
> If so, this use, even though it's not really modifying a global in another > module, might not work forever. That's one reason why I'd rather continue to use 'global' than some attribute assignment. To the compiler, module globals are more special than class variables etc. because they can shadow builtins. Therefore the compiler would like to know about *all* assignments to module globals. Similarly, assignment to locals in outer scopes needs to be known to the compiler because it must make sure that all locals referenced by inner scopes are implemented as cells. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 13:09:05 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 13:09:12 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 13:15:48 +1300." <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> Message-ID: <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> > Guido: > > My problem with the nested functions is that it is much harder to get > > a grasp of what the shared state is -- any local variable in the outer > > function *could* be part of the shared state, and the only way to tell > > for sure is by inspecting all the subfunctions. > > That would be solved if, instead of marking variables > in inner scopes that refer to outer scopes, it were > the other way round, and variables in the outer scope > were marked as being rebindable in inner scopes. [Greg]

> def f():
>     rebindable x
>     def inc_x_by(i):
>         x += i  # rebinds outer x
>     x = 39
>     inc_x_by(3)
>     return x

This would only apply to *assignment* from inner scopes, not to *use* from inner scopes, right? (Otherwise it would be seriously backwards incompatible.) I'm not sure I like it much, because it gives outer scopes (some) control over inner scopes.
One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jmarshal at mathworks.com Thu Oct 23 13:42:23 2003 From: jmarshal at mathworks.com (Joshua Marshall) Date: Thu Oct 23 13:42:29 2003 Subject: [Python-Dev] closure semantics Message-ID: <7224B63940F10F40A48AC423597ADE57012DC7BA@MESSAGE-AH.ad.mathworks.com> > [Jeremy] > > I'm not averse to introducing a new keyword, which would address both > > concerns. yield was introduced with apparently little problem, so it > > seems possible to add a keyword without causing too much disruption. > > > > If we decide we must stick with global, then it's very hard to address > > Alex's concern about global being a confusing word choice . [Guido] > OK, the tension is mounting. Which keyword do you have in > mind? And would you use the same keyword for module-globals > as for outer-scope variables? I'd like to suggest "outer v" for this. The behavior could be to scan outward for the first definition of v. If the only outer-scope variable is at module-level, then the behavior would be the same as "global v". Or if everyone is comfortable enough re-using the keyword "global", then I also like "global v in f". 
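None of the proposed spellings existed at the time; the only way to get the effect of Greg's inc_x_by example was to mutate a shared container instead of rebinding. A sketch of that workaround (not of any of the proposals above):

```python
def f():
    box = [39]          # one-element list stands in for a rebindable outer x

    def inc_x_by(i):
        box[0] += i     # mutates the container; no rebinding, so no declaration

    inc_x_by(3)
    return box[0]

print(f())  # 42
```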
From skip at pobox.com Thu Oct 23 13:51:16 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 13:51:30 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> Message-ID: <16280.5396.284178.989033@montanaro.dyndns.org> >> That would be solved if, instead of marking variables in inner scopes >> that refer to outer scopes, it were the other way round, and >> variables in the outer scope were marked as being rebindable in inner >> scopes. ... Guido> This would only apply to *assignment* from inner scopes, not to Guido> *use* from inner scopes, right? (Otherwise it would be seriously Guido> backwards incompatible.) Given that the global keyword or something like it is here to stay (being preferable over some attribute-style access) and that global variable writes need to be known to the compiler for future efficiency reasons, I think we need to consider modifications of the current global statement. The best thing I've seen so far (I forget who proposed it) is 'global' vars [ 'in' named_scope ] where named_scope can only be the name of a function which encloses the function containing the declaration. In Greg's example of inc_x_by nested inside f, he'd have declared "global x in f" in inc_x_by. The current global statement (without a scoping clause) would continue to refer to the outermost scope of the module. This should be compatible with existing usage. The only problem I see is whether the named_scope needs to be known at compile time or if it can be deferred until run time.
For example, should this import random def outer(a): x = a def inner(a): x = 42 def innermost(r): if r < 0.5: global x in inner else: global x in outer x = r print " inner, x @ start:", x innermost(random.random()) print " inner, x @ end:", x print "outer, x @ start:", x inner(a) print "outer, x @ end:", x outer(12.73) be valid? My thought is that it shouldn't. Skip From tim at zope.com Thu Oct 23 14:44:24 2003 From: tim at zope.com (Tim Peters) Date: Thu Oct 23 14:45:33 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <001801c3997f$524bc6e0$e841fea9@oemcomputer> Message-ID: FYI, some of the implementations of the backtracking conjoin() operator in test_generators.py make heavy use of for values[i] in gs[i](): style for-loops. That style is often useful when generating vectors representing combinatorial objects. I could live without it, but so far haven't needed to prove that . From fdrake at acm.org Thu Oct 23 14:51:58 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 23 14:53:23 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: <16279.60447.29714.759275@montanaro.dyndns.org> References: <16279.60447.29714.759275@montanaro.dyndns.org> Message-ID: <16280.9038.338759.771647@grendel.zope.com> Skip Montanaro writes: > And I thought only webmaster@python.org got asked that question all the > time. Does it get asked at other addresses as well? I don't recall ever > seeing it on python-list. I wouldn't expect to see it on python-list. Aren't the people who ask generally people who *aren't* in the Python community? They're going to look for the easiest ways to ask, so that generally means googling for "Python" and using whatever contact address is on one of the first pages they find. The first two Google's showing me now are: http://www.python.org/ ( webmaster at python.org ) http://www.python.org/doc/ ( docs at python.org ) Wanna guess where the questions are going to go? 
-Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Thu Oct 23 16:30:16 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 16:31:07 2003 Subject: [Python-Dev] setjmp/longjmp exception handling (was: More informative error messages) In-Reply-To: <3F9779A3.7000504@ocf.berkeley.edu> References: <3F9779A3.7000504@ocf.berkeley.edu> Message-ID: "Brett C." writes: > The basic idea is to keep a stack of jmp_buf points. This is an old implementation strategy for exceptions in C++; e.g. GNU g++ uses it with the -fsjlj-exceptions option. It is generally discouraged as it is *really* expensive: it requires a lot of memory per jmpbuf, and it requires that the memory is filled. In addition, for Python, there would be no simplification: each stack frame needs to perform "all" DECREFs. To convert this to exception handling, you would get very many nested try-catch blocks, as each allocation of some object would need to be followed with a try-catch block. So if you have 5 objects allocated in a function, you would need a nesting of 5 levels - i.e. up to column 40. Regards, Martin From pete at shinners.org Thu Oct 23 16:22:59 2003 From: pete at shinners.org (Pete Shinners) Date: Thu Oct 23 16:31:58 2003 Subject: [Python-Dev] random.choice IndexError on empty list Message-ID: (this should potentially go on sourceforge's bug tracker? alas i have no account right now) Another user and I were scratching our heads over why random.choice() was raising "IndexError: list index out of range". For a while we were thinking random.random() must have been returning >= 1.0. It turns out an empty list was being passed. I would suggest either a ValueError is raised, or a different exception message, perhaps more like the results of >>> [].pop() Traceback (most recent call last): File "", line 1, in ? IndexError: pop from empty list The patch is trivial, but I can provide it if there is an agreed-upon response. 
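Pete's report is easy to reproduce; the sketch below shows the bare IndexError and a hypothetical checked_choice wrapper (not a stdlib function, just an illustration of the ValueError suggestion). Later Python 3 releases took the other route suggested here, keeping IndexError but with the clearer message "Cannot choose from an empty sequence".

```python
import random

def checked_choice(seq):
    # Hypothetical wrapper illustrating the suggestion above: raise
    # an explicit ValueError rather than letting the empty sequence
    # surface as "IndexError: list index out of range".
    if not seq:
        raise ValueError("cannot choose from an empty sequence")
    return random.choice(seq)

try:
    random.choice([])  # choice() indexes the sequence, so this raises IndexError
except IndexError as exc:
    print("random.choice([]) raised", type(exc).__name__)

print(checked_choice(["a", "b", "c"]) in ("a", "b", "c"))  # True
```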
From jrw at pobox.com Thu Oct 23 17:40:33 2003 From: jrw at pobox.com (John Williams) Date: Thu Oct 23 17:40:43 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> Message-ID: <3F984AD1.5040306@pobox.com> Skip Montanaro wrote: > Given that the global keyword or something like it is here to stay (being > preferable over some attribute-style access) and that global variable writes > needs to be known to the compiler for future efficiency reasons, I think we > need to consider modifications of the current global statement. The best > thing I've seen so far (I forget who proposed it) is > > 'global' vars [ 'in' named_scope ] ... > This should be compatible with existing usage. The only problem I see is > whether the named_scope needs to be known at compile time or if it can be > deferred until run time. How about (to abuse a keyword that's gone unmolested for too long) global foo from def to declare that foo refers to a variable in a lexically enclosing function definition? This avoids the need to name a specific function (which IMHO is just a source of confusion over the semantics of strange cases) while still having some mnemonic value (foo "comes from" an enclosing function definition). 
jw From skip at pobox.com Thu Oct 23 17:46:33 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 17:46:43 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <3F984AD1.5040306@pobox.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> Message-ID: <16280.19513.571537.789185@montanaro.dyndns.org> John> How about (to abuse a keyword that's gone unmolested for too long) John> global foo from def John> to declare that foo refers a variable in a lexically enclosing John> function definition? This avoids to need to name a specific John> function (which IMHO is just a source of confusion over the John> semantics of strange cases) while still having some mnemonic value John> (foo "comes from" an enclosing function definition). How do you indicate the particular scope to which foo will be bound (there can be many lexically enclosing function definitions)? Using my example again: def outer(a): x = a def inner(a): x = 42 def innermost(r): global x from def # <--- your notation x = r print " inner, x @ start:", x innermost(random.random()) print " inner, x @ end:", x print "outer, x @ start:", x inner(a) print "outer, x @ end:", x how do you tell Python that x inside innermost is to be associated with the x in inner or the x in outer? 
Skip From zack at codesourcery.com Thu Oct 23 17:58:54 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Thu Oct 23 17:58:59 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16280.19513.571537.789185@montanaro.dyndns.org> (Skip Montanaro's message of "Thu, 23 Oct 2003 16:46:33 -0500") References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> Message-ID: <87k76vehup.fsf@egil.codesourcery.com> Skip Montanaro writes: > John> How about (to abuse a keyword that's gone unmolested for too long) > > John> global foo from def > > John> to declare that foo refers a variable in a lexically enclosing > John> function definition? This avoids to need to name a specific > John> function (which IMHO is just a source of confusion over the > John> semantics of strange cases) while still having some mnemonic value > John> (foo "comes from" an enclosing function definition). > > How do you indicate the particular scope to which foo will be bound (there > can be many lexically enclosing function definitions)? Using my example > again: > > def outer(a): > x = a > def inner(a): > x = 42 > def innermost(r): > global x from def # <--- your notation > x = r > print " inner, x @ start:", x > innermost(random.random()) > print " inner, x @ end:", x > print "outer, x @ start:", x > inner(a) > print "outer, x @ end:", x > > how do you tell Python that x inside innermost is to be associated with the > x in inner or the x in outer? Maybe "global foo from " ? Or, "from function_name global foo" is consistent with import, albeit somewhat weird. I would never use this feature; I avoid nested functions entirely. 
However, as long as we're talking about this stuff, I wish I could write "global foo" at module scope and have that mean "this variable is to be treated as global in all functions in this module". zw From guido at python.org Thu Oct 23 18:06:58 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:07:06 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 12:51:16 CDT." <16280.5396.284178.989033@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> Message-ID: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> [Skip] > Given that the global keyword or something like it is here to stay > (being preferable over some attribute-style access) (Actually I expect more pushback from Alex once he's back from his trip. He seems to feel strongly about this. :-) > and that global variable writes needs to be known to the compiler > for future efficiency reasons, I think we need to consider > modifications of the current global statement. The best thing I've > seen so far (I forget who proposed it) is > > 'global' vars [ 'in' named_scope ] > > where named_scope can only be the name of a function which encloses > the function containing the declaration. That was my first suggestion earlier this week. The main downside (except from propagating 'global' :-) is that if you rename the function defining the scope you have to fix all global statements referring to it. I saw a variant where the syntax was 'global' vars 'in' 'def' which solves that concern (though not particularly elegantly). > In Greg's example of inc_x_by nested inside f, he'd have declared: > > global x in f > > in inc_x_by. The current global statement (without a scoping > clause) would continue to refer to the outermost scope of the > module. > > This should be compatible with existing usage. 
The only problem I > see is whether the named_scope needs to be known at compile time or > if it can be deferred until run time. Definitely compile time. 'f' has to be a name of a lexically enclosing 'def'; it's not an expression. The compiler needs to know which scope it refers to so it can turn the correct variable into a cell. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 18:08:58 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:09:07 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 14:58:54 PDT." <87k76vehup.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> Message-ID: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> > However, as long as we're talking about this stuff, I wish I could > write "global foo" at module scope and have that mean "this variable > is to be treated as global in all functions in this module". This is similar to Greg Ewing's proposal to have 'rebindable x' at an outer function scope. My problem with it remains: It gives outer scopes (some) control over inner scopes. One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From zack at codesourcery.com Thu Oct 23 18:27:01 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Thu Oct 23 18:27:05 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Thu, 23 Oct 2003 15:08:58 -0700") References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> Message-ID: <87brs7egju.fsf@egil.codesourcery.com> Guido van Rossum writes: >> However, as long as we're talking about this stuff, I wish I could >> write "global foo" at module scope and have that mean "this variable >> is to be treated as global in all functions in this module". > > This is similar to Greg Ewing's proposable to have 'rebindable x' at > an outer function scope. My problem with it remains: > > It gives outer scopes (some) control over inner scopes. One of the > guidelines is that a name defined in an inner scope should always > shadow the same name in an outer scope, to allow evolution of the > outer scope without affecting local details of inner scope. (IOW if > an inner function defines a local variable 'x', the outer scope > shouldn't be able to change that.) Frankly, I wish Python required one to write explicit declarations for all variables in the program: var x, y, z # module scope class bar: classvar I, J, K # class variables var i, j, k # instance variables def foo(...): var a, b, c # function scope ... It's extra bondage and discipline, yeah, but it's that much more help comprehending the program six months later, and it also gets rid of the "how was this variable name supposed to be spelled again?" question. 
zw From jrw at pobox.com Thu Oct 23 18:31:48 2003 From: jrw at pobox.com (John Williams) Date: Thu Oct 23 18:32:08 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> Message-ID: <3F9856D4.3000404@pobox.com> Skip Montanaro wrote: > > John> global foo from def > > How do you indicate the particular scope to which foo will be bound (there > can be many lexically enclosing function definitions)? Using my example > again: > > def outer(a): > x = a > def inner(a): > x = 42 > def innermost(r): > global x from def # <--- your notation > x = r > print " inner, x @ start:", x > innermost(random.random()) > print " inner, x @ end:", x > print "outer, x @ start:", x > inner(a) > print "outer, x @ end:", x > > how do you tell Python that x inside innermost is to be associated with the > x in inner or the x in outer? I can think of two reasonable possibilities--either it refers to the innermost possible variable, or the compiler rejects this case outright. Either way the problem is easy to solve by renaming one of the variables. Sorry I wasn't clear--I really only meant to propose a new syntax for the already-proposed "global foo in def". For some reason I can't quite put my finger on, "in def" looks to me like it's referring to the function where the statement occurs, but "from def" looks like it refers to some other function. jw From raymond.hettinger at verizon.net Thu Oct 23 18:38:10 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu Oct 23 18:39:01 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete Message-ID: <001101c399b6$56d67a20$e841fea9@oemcomputer> Was there a reason for leaving this out of the API or should it be added? 
Is the right way to simulate a pop something like this: n = PyList_GET_SIZE(outbasket); if (n == 0) { PyErr_SetString(PyExc_IndexError, "Pop from an empty list."); return NULL; } result = PyList_GetItem(outbasket, n-1); if (result == NULL) return NULL; Py_INCREF(result); empty_list = PyList_New(0); if (empty_list == NULL) { Py_DECREF(result); return NULL; } err = PyList_SetSlice(outbasket, n-1, n, empty_list); Py_DECREF(empty_list); if (err == -1) { Py_DECREF(result); return NULL; } return result; /* Whew, that was a lot of code just to have a popped result */ Raymond Hettinger From guido at python.org Thu Oct 23 18:42:50 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:43:14 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete In-Reply-To: Your message of "Thu, 23 Oct 2003 18:38:10 EDT." <001101c399b6$56d67a20$e841fea9@oemcomputer> References: <001101c399b6$56d67a20$e841fea9@oemcomputer> Message-ID: <200310232242.h9NMgoG03818@12-236-54-216.client.attbi.com> > Was there a reason for leaving this out of the API It is much newer than that set of API functions, and I guess nobody thought about it. > or should it be added? Unclear -- how often does one need this? Can't you call it using one of the higher-level method-calling helpers? > Is the right way to simulate a pop something like this: No time to check, it should do the same as listpop(). 
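As a reference point for the C above, the semantics a PyList_Pop() would need to match — what Guido means by "the same as listpop()" — are just these (pop_like is an illustrative Python rendering, not an existing API):

```python
def pop_like(lst):
    # Reference semantics of list.pop() with no argument: remove and
    # return the last item, raising IndexError on an empty list.
    if not lst:
        raise IndexError("pop from empty list")
    result = lst[-1]
    del lst[-1:]  # the slice deletion the C snippet does via PyList_SetSlice
    return result

items = [1, 2, 3]
print(pop_like(items), items)  # 3 [1, 2]
```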
--Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Thu Oct 23 19:08:38 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 23 19:07:03 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> At 10:57 22.10.2003 -0700, Guido van Rossum wrote: > def tee(iterable): > "Return two independent iterators from a single iterable" > data = {} > cnt = 0 > def gen(next): > global* cnt > dpop = data.pop > for i in count(): > if i == cnt: > item = data[i] = next() > cnt += 1 > else: > item = dpop(i) > yield item > next = iter(iterable).next > return (gen(next), gen(next)) > >which is IMO more readable. it's a subtle piece of code. I wouldn't mind a more structured syntax with both the outer function declaring that is ok for some inner function to rebind some of its locals, and the inner function declaring that a local is coming from an outer scope: def tee(iterable): "Return two independent iterators from a single iterable" data = {} # cnt = 0 here would be ok share cnt = 0: # the assignment is opt, # inner functions in the suite can rebind cnt def gen(next): use cnt # OR outer cnt dpop = data.pop for i in count(): if i == cnt: item = data[i] = next() cnt += 1 else: item = dpop(i) yield item # cnt = 0 here would be ok next = iter(iterable).next return (gen(next), gen(next)) yes it's heavy and unpythonic, but it makes very clear that something special is going on with cnt. no time to add anything else to the thread. regards. From guido at python.org Thu Oct 23 19:22:44 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 19:22:54 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 01:08:38 +0200." 
<5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> Message-ID: <200310232322.h9NNMiA03864@12-236-54-216.client.attbi.com> > > def tee(iterable): > > "Return two independent iterators from a single iterable" > > data = {} > > cnt = 0 > > def gen(next): > > global* cnt > > dpop = data.pop > > for i in count(): > > if i == cnt: > > item = data[i] = next() > > cnt += 1 > > else: > > item = dpop(i) > > yield item > > next = iter(iterable).next > > return (gen(next), gen(next)) > > > >which is IMO more readable. > > it's a subtle piece of code. I wouldn't mind a more structured syntax with > both the outer function declaring that is ok for some inner function to > rebind some of its locals, and the inner function declaring that a local is > coming from an outer scope: > > def tee(iterable): > "Return two independent iterators from a single iterable" > data = {} > > # cnt = 0 here would be ok > > share cnt = 0: # the assignment is opt, > # inner functions in the suite can rebind cnt > def gen(next): > use cnt # OR outer cnt > dpop = data.pop > for i in count(): > if i == cnt: > item = data[i] = next() > cnt += 1 > else: > item = dpop(i) > yield item > > # cnt = 0 here would be ok > > next = iter(iterable).next > return (gen(next), gen(next)) > > yes it's heavy and unpythonic, but it makes very clear that something > special is going on with cnt. Might as well declare a class then. :-) > no time to add anything else to the thread. Ditto. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Thu Oct 23 19:48:55 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 19:49:18 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? 
In-Reply-To: <16279.56778.309781.129469@montanaro.dyndns.org> Message-ID: <200310232348.h9NNmti28349@oma.cosc.canterbury.ac.nz> Skip Montanaro : > I think we've been around the block on this one a few times. While %{foo} > might be a convenient shorthand for %(foo)s, I don't think it saves enough > space (one character) or stands out that much more ("{...}" instead of > "(...)s") to make the addition worthwhile. I disagree strongly -- I think it *does* stand out more clearly. The "s" on the end of "%(name)s" too easily gets mixed up with other alphanumeric stuff nearby. If it were just "%(name)" *without* the trailing "s" it wouldn't be nearly as bad, but unfortunately it can't be left off and remain backwards compatible. > What if lid1 is a float which you want to display with two digits > past the decimal point? Then I would use the existing construct -- I'm not suggesting that it be removed. > in which case you also have the problem of having two almost identical > ways to do dictionary interpolation. I don't see that as a big problem. To my mind, practicality beats purity here -- "%(name)s" is too awkward to be practical for routine use. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 23 19:56:59 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 19:57:18 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <00b401c39978$074ba8b0$6402a8c0@arkdesktop> Message-ID: <200310232356.h9NNuxP28383@oma.cosc.canterbury.ac.nz> Andrew Koenig : > The dict > comprehension would be > > {x: x**2 for x in range(n)} > > Or would that be a single-element dict whose key is x and value is a > generator expression? 
:-) According to the parentheses rule, no, because that would have to be {x: (x**2 for x in range(n))} (Parentheses)-(are)-(so)-(handy)-(ly), Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tdelaney at avaya.com Thu Oct 23 20:07:06 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Thu Oct 23 20:07:15 2003 Subject: [Python-Dev] Re: closure semantics Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> > From: John Williams [mailto:jrw@pobox.com] > > I can think of two reasonable possibilities--either it refers to the > innermost possible variable, or the compiler rejects this > case outright. > Either way the problem is easy to solve by renaming one of > the variables. Going on the principle of least surprise, I have to say that I think explicitly naming the scope in which a variable is to be used is the best approach. My concern with the other proposal is that introducing code between scopes could silently change the semantics of a piece of code. I'll use the 'outer' proposal since it's the shortest and least confusing to me ... def func1(): x = 1 def func2(): def func3(): outer x x += 2 return func3 return func2() print func1() should print: 3 Now, if we change it to: def func1(): x = 1 def func2(): x = 2 def func3(): outer x x += 2 return func3 return func2() print func1() it would now print: 4 OTOH, specifying the scope prevents this type of error: def func1(): x = 1 def func2(): def func3(): global x in func1 x += 2 return func3 return func2() print func1() and def func1(): x = 1 def func2(): x = 2 def func3(): global x in func1 x += 2 return func3 return func2() print func1() should both print 3 'global x in func1' is also a *lot* easier to explain. 
I think these two points should weigh heavily in any decision. I think the need to rename the target scope is of lesser importance. Tim Delaney From guido at python.org Thu Oct 23 20:11:36 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 20:11:46 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 10:07:06 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> Message-ID: <200310240011.h9O0Bav03963@12-236-54-216.client.attbi.com> [Tim Delaney] > Going on the principle of least surprise, I have to say that I think > explicitly naming the scope in which a variable is to be used is the > best approach. [...] > 'global x in func1' is also a *lot* easier to explain. > > I think these two points should weigh heavily in any decision. I > think the need to rename the target scope is of lesser importance. I have to concur. EIBTI. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Oct 23 20:26:34 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 23 20:26:53 2003 Subject: [Python-Dev] setjmp/longjmp exception handling In-Reply-To: References: <3F9779A3.7000504@ocf.berkeley.edu> Message-ID: <3F9871BA.3060502@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>The basic idea is to keep a stack of jmp_buf points. > > > This is an old implementation strategy for exceptions in C++; e.g. GNU > g++ uses it with the -fsjlj-exceptions option. It is generally discouraged > as it is *really* expensive: it requires a lot of memory per jmpbuf, > and it requires that the memory is filled. > Figures. Oh well. At least it was interesting to figure out. 
From skip at pobox.com Thu Oct 23 21:56:26 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 22:17:32 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <87k76vehup.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> Message-ID: <16280.34506.469575.79716@montanaro.dyndns.org> Zack> Maybe "global foo from " ? Sounds just about like the "global foo in named_scope" (where "named_scope" means enclosing function) that I described earlier. I like "in" better than "from" because it tells you more clearly that you are messing with the variable in-place, not making a copy of it into the local scope. Zack> Or, "from function_name global foo" is consistent with import, Zack> albeit somewhat weird. That reads a bit weird to me. The nice thing about the other way is that "global foo" without any qualifiers means the same thing it does today. There's also no reason to use the from form as "global foo in function" doesn't imply that you will refer to foo as "function.foo". Zack> I would never use this feature; I avoid nested functions entirely. Zack> However, as long as we're talking about this stuff, I wish I could Zack> write "global foo" at module scope and have that mean "this Zack> variable is to be treated as global in all functions in this Zack> module". I've never actually used nested scopes either, nor have I ever felt the urge. Maybe it has something to do with not having done much recent programming in a language before Python which supported them. (Pascal does, but my last Pascal experience was nearly 20 years ago.) 
Skip From skip at pobox.com Thu Oct 23 22:02:59 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 22:17:51 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> Message-ID: <16280.34899.211589.786953@montanaro.dyndns.org> >> 'global' vars [ 'in' named_scope ] >> >> where named_scope can only be the name of a function which encloses >> the function containing the declaration. Guido> That was my first suggestion earlier this week. The main Guido> downside (except from propagating 'global' :-) is that if you Guido> rename the function defining the scope you have to fix all global Guido> statements referring to it. Well, the listed variables are "global" to the current local scope. I find the rename argument a bit specious. If I rename a function I have to change all the references to it today. This is just one more. Since "global" is a declarative statement, the compiler can tell you immediately that it can't find the old function name. Guido> I saw a variant where the syntax was Guido> 'global' vars 'in' 'def' Guido> which solves that concern (though not particularly elegantly). I don't see how that can work though. What does 'def' mean in this case? There can be multiple lexically enclosing functions, any of which have the same local variable x which you might want modify. >> This should be compatible with existing usage. The only problem I >> see is whether the named_scope needs to be known at compile time or >> if it can be deferred until run time. Guido> Definitely compile time. 'f' has to be a name of a lexically Guido> enclosing 'def'; it's not an expression. 
The compiler needs to Guido> know which scope it refers to so it can turn the correct variable Guido> into a cell. Okay, that was easily settled. ;-) Skip From greg at cosc.canterbury.ac.nz Thu Oct 23 23:19:56 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 23:20:05 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <200310240319.h9O3JuX29277@oma.cosc.canterbury.ac.nz> > I've checked in an update to Raymond's PEP 289 which (I hope) > clarifies a lot of things, and settles the capturing of free > variables. I had another early-morning idea about how to deal with the free variable issue, which could also be used when you have another form of closure (lambda, def) and you want to capture some of its free variables. Suppose there were a special form of assignment new x = expr If x is not used in any nested scope, this is the same as a regular assignment. But if it is, and consequently x is kept in a cell, instead of replacing the contents of the cell, this creates a *new* cell which replaces the previous one in the current scope. But any previously created closure will still be holding on to the old cell with its old value. If you do this in a loop, you will end up with a series of incarnations of the variable, each of which lives in its own little scope. Using this, Tim's pipeline example would become pipe = source for new p in predicates: new pipe = e for e in pipe if p(e) For generator expressions, Tim's idea of just always capturing the free variables is probably better, since it doesn't require recognising a subtle problem and then applying a furtherly-subtle solution. But it seemed like a stunningly brilliant idea at 3:27am this morning, so I thought I'd share it with you. 
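The cell-sharing problem Greg's hypothetical "new" assignment attacks can be seen with ordinary closures created in a loop: they all share one binding, and the usual workaround is to capture the current value explicitly, e.g. with a default argument:

```python
def make_adders_shared(ns):
    # Every lambda closes over the single loop variable 'n',
    # so after the loop they all see its final value.
    return [lambda x: x + n for n in ns]

def make_adders_captured(ns):
    # Capturing n's value at each iteration -- the effect
    # "for new n in ns" is after: each closure gets its own cell.
    return [lambda x, n=n: x + n for n in ns]

print([f(0) for f in make_adders_shared([1, 10, 100])])    # [100, 100, 100]
print([f(0) for f in make_adders_captured([1, 10, 100])])  # [1, 10, 100]
```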
:-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 23 23:26:34 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 23:26:43 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <7224B63940F10F40A48AC423597ADE57012DC7BA@MESSAGE-AH.ad.mathworks.com> Message-ID: <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> Joshua Marshall : > I'd like to suggest "outer v" for this. We've been assuming all along that the semantics of a plain "global" statement have to remain exactly as they are, but is that strictly necessary? How much hardship would it cause, really, if "global" were simply redefined to mean "the next scope out where it's bound"? It would only break something if "global" were used in a nested function *and* there were a variable with the same name in some intermediate scope. That sounds like a rather rare set of conditions to me. Not significantly more common than "yield" being used as a variable name, surely? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 23:40:52 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 23:41:21 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 21:02:59 CDT." 
<16280.34899.211589.786953@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> <16280.34899.211589.786953@montanaro.dyndns.org> Message-ID: <200310240340.h9O3eqL04334@12-236-54-216.client.attbi.com> > Well, the listed variables are "global" to the current local scope. > I find the rename argument a bit specious. If I rename a function I > have to change all the references to it today. This is just one > more. Since "global" is a declarative statement, the compiler can > tell you immediately that it can't find the old function name. Right, I tend to agree. > Guido> I saw a variant where the syntax was > Guido> 'global' vars 'in' 'def' > Guido> which solves that concern (though not particularly elegantly). > > I don't see how that can work though. What does 'def' mean in this > case? There can be multiple lexically enclosing functions, any of > which have the same local variable x which you might want modify. Yeah, but usually that's not a problem. The compiler knows about all those x-es, and uses the innermost (nearest) one. This matches what it does when *referencing* a non-local variable, which doesn't need a global statement. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 23:44:51 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 23:45:36 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 16:26:34 +1300." <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> References: <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> Message-ID: <200310240344.h9O3ipI04351@12-236-54-216.client.attbi.com> > We've been assuming all along that the semantics of a > plain "global" statement have to remain exactly as they > are, but is that strictly necessary? 
>
> How much hardship would it cause, really, if "global"
> were simply redefined to mean "the next scope out where
> it's bound"?
>
> It would only break something if "global" were used in
> a nested function *and* there were a variable with the
> same name in some intermediate scope. That sounds like
> a rather rare set of conditions to me. Not significantly
> more common than "yield" being used as a variable name,
> surely?

Reasonable assumption. We'd have to do a survey.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tdelaney at avaya.com  Fri Oct 24 00:26:57 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 00:27:04 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6ACF1@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>
> > We've been assuming all along that the semantics of a
> > plain "global" statement have to remain exactly as they
> > are, but is that strictly necessary?
> >
> > How much hardship would it cause, really, if "global"
> > were simply redefined to mean "the next scope out where
> > it's bound"?
>
> Reasonable assumption. We'd have to do a survey.

It would break any unadorned 'global x' in a nested scope if the name
did not exist anywhere. I'm not saying this would be good form -
personally I think anyone who did this would deserve it - but it would
definitely break.

One option would be to have an "if the name doesn't exist, it is
created in module scope". But all this creates too many exceptions to
what would otherwise be a simple rule IMO:

    global <name> [in <scope>]

where <scope> defaults to the current module.
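The limitation driving this whole thread can be shown in a short sketch (hypothetical function names; in current Python the assignment simply makes the name local to the inner function, and Python later grew `nonlocal` via PEP 3104 for exactly this case):

```python
def f():
    cnt = 0
    def bump():
        # The assignment makes cnt local to bump(), so the read on the
        # right-hand side fails: rebinding an enclosing-scope name is
        # exactly what the proposed declarations would permit.
        cnt = cnt + 1
        return cnt
    try:
        return bump()
    except UnboundLocalError:
        return "UnboundLocalError"

result = f()
print(result)  # UnboundLocalError
```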
Tim Delaney

From tim_one at email.msn.com  Fri Oct 24 00:46:59 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Fri Oct 24 00:47:11 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com>
Message-ID: 

[Tim]
>> This is easy to explain, and trivial to explain for people familiar
>> with the default-argument trick.

[Guido]
> Phillip Eby already recommended not bothering with that; the
> default-argument rule is actually confusing for newbies (they think
> the defaults are evaluated at call time) so it's best not to bring
> this into the picture.

Of course it works equally well to pass regular (non-default)
arguments, it just makes a precise explanation a little longer to type
(because the arglist needs to be typed out in two places).

> ..
> OK, I got it now. I hope we can find another real-life example; but
> there were some other early toy examples that also looked quite
> convincing.

I expect we will find more, although I haven't had more time to think
about it (and today was devoted to puzzling over excessive rates of
ZODB conflict errors, where generator expressions didn't seem
immediately applicable ).

I do think it's related to non-reiterability. If generator expressions
were reiterable, then a case could be made for them capturing a
parameterized computation, reusable for different things by varying
the bindings of the free variables. Like, say, you wanted to plot the
squares of various functions at a set of points, and then:

    squares = (f(x)**2 for x in inputs)  # assuming reiterability here
    for f in math.sin, math.cos, math.tan:
        plot(squares)

But that doesn't make sense for a one-shot (not reiterable) generator,
and even if it were reiterable I can't think of a real example that
would want the bindings of free variables to change *during* a single
pass over the results. For that matter, if it were reiterable, the
"control by obscure side effect" style of the example is hard to like
anyway.
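Tim's one-shot point is easy to check. A sketch with `plot()` replaced by `list()`, since the plotting itself is beside the point:

```python
import math

inputs = [0.1, 0.2, 0.3]
f = math.sin
squares = (f(x) ** 2 for x in inputs)

first = list(squares)   # consumes the generator
second = list(squares)  # already exhausted, so nothing comes out
print(len(first), len(second))  # 3 0
```

Rebinding `f` after the first pass changes nothing, because the generator cannot be restarted; each function needs a freshly written generator expression.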
From greg at cosc.canterbury.ac.nz  Fri Oct 24 01:04:47 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri Oct 24 01:04:58 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6ACF1@au3010avexu1.global.avaya.com>
Message-ID: <200310240504.h9O54lQ29545@oma.cosc.canterbury.ac.nz>

> It would break any unadorned 'global x' in a nested scope if the name
> did not exist anywhere.
>
> One option would be to have an "if the name doesn't exist, it is
> created in module scope".

What would be wrong with that? It's what I had in mind.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com  Fri Oct 24 02:25:52 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 02:25:59 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
>
> > It would break any unadorned 'global x' in a nested scope
> > if the name did not exist anywhere.
> >
> > One option would be to have an "if the name doesn't exist, it is
> > created in module scope".
>
> What would be wrong with that? It's what I had in mind.

It's complex. Can you explain the complete semantics of 'outer' as
simply as:

    global <name> [in <scope>]

Binds and uses <name> in another scope. If 'in <scope>' is omitted
then the name is bound and used in the scope of the current module.

My understanding of 'outer' is (and I'm not sure about this):

    outer <name>

Binds and uses <name> in the innermost scope containing the current
scope that already has <name> bound. If <name> is not bound in any
containing scope then it is bound into the scope of the current module
if <name> is used or bound while executing in the current scope.

Or something like that.
Tim Delaney

From Paul.Moore at atosorigin.com  Fri Oct 24 04:02:15 2003
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Fri Oct 24 04:03:04 2003
Subject: [Python-Dev] closure semantics
Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com>

From: Delaney, Timothy C (Timothy) [mailto:tdelaney@avaya.com]
> It would break any unadorned 'global x' in a nested scope
> if the name did not exist anywhere.
> I'm not saying this would be good form - personally I think
> anyone who did this would deserve it - but it would definitely break.
> One option would be to have an "if the name doesn't exist, it is
> created in module scope". But all this creates too many exceptions
> to what would otherwise be a simple rule IMO:
>
>     global <name> [in <scope>]
>
> where <scope> defaults to the current module.

This made me think. What should be the effect of

    def f():
        x = 12
        def g():
            global y in f
            y = 12
        g()
        print locals()

I suspect the answer is "it's illegal". But by extension from the
current behaviour of "global", it should create a local variable in f.

Paul

From tdelaney at avaya.com  Fri Oct 24 04:11:01 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 04:11:13 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com>

> From: Moore, Paul [mailto:Paul.Moore@atosorigin.com]
>
> > From: Delaney, Timothy C (Timothy) [mailto:tdelaney@avaya.com]
> >
> >     global <name> [in <scope>]
> >
> > where <scope> defaults to the current module.
>
> This made me think. What should be the effect of
>
>     def f():
>         x = 12
>         def g():
>             global y in f
>             y = 12
>         g()
>         print locals()
>
> I suspect the answer is "it's illegal". But by extension from the
> current behaviour of "global", it should create a local variable in f.

My understanding of (all) the proposals, and what I would expect, is
identical semantics to the current 'global', but with the affected
scope changed. So yes, the above should create a local name `y` in `f`.
The local name `y` would be allocated at compile time, just like any other local name. Likewise, the following should be illegal: def f(): x = 12 y = 1 def g(): global y in f y = 12 g() print locals() because the global statement occurs after a local binding of the name. Tim Delaney From theller at python.net Fri Oct 24 05:21:16 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 24 05:21:50 2003 Subject: [Python-Dev] cleanup order Message-ID: Is the cleanup order at Python shutdown documented somewhere? The only thing I found is the (old) essay http://www.python.org/doc/essays/cleanup.html Thomas From mwh at python.net Fri Oct 24 07:27:22 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 24 07:27:30 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete In-Reply-To: <001101c399b6$56d67a20$e841fea9@oemcomputer> (Raymond Hettinger's message of "Thu, 23 Oct 2003 18:38:10 -0400") References: <001101c399b6$56d67a20$e841fea9@oemcomputer> Message-ID: <2mu15ynaed.fsf@starship.python.net> "Raymond Hettinger" writes: > Was there a reason for leaving this out of the API or should it be > added? Is the right way to simulate a pop something like this: Well, there's always PyEval_CallMethod... Cheers, mwh -- [3] Modem speeds being what they are, large .avi files were generally downloaded to the shell server instead[4]. [4] Where they were usually found by the technical staff, and burned to CD. 
-- Carlfish, asr From ncoghlan at iinet.net.au Fri Oct 24 08:18:14 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Oct 24 08:18:16 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> <3F967FFC.6040507@iinet.net.au> <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> Message-ID: <3F991886.3090309@iinet.net.au> Guido van Rossum strung bits together to say: >>I had a similar thought about 5 minutes after turning my computer off last >>night. The alternative I came up with was: >> >> y = (from result = 0.0 do result += x**2 for x in values if x > 0) > > > I think you're aiming for the wrong thing here; I really see no reason > why you'd want to avoid writing this out as a real for loop if you > don't have an existing accumulator function (like sum()) to use. One interesting thing is that I later realised that iterator comprehensions combined with the sum function would actually cover 90% of the accumulation functions I would ever write. So Raymond turns out to be correct when he suggests that generator expressions may limit the need for reduce functions and accumulation loops. With the sum() built in around, they will cover a large number of the reduction operations encountered in real life. Previously, sum() was not available, and even if it had been the cost of generating the entire list to be summed may have been expensive (if the values to be summed are a function of the stored values, rather than a straight sum). So while I think a concise reduction syntax was worth aiming for, I'm also willing to admit that it seems to be basically impossible to manage without violating Python's maxim of "one obvious way to do it". 
The combination of generator expressions and the various builtins that operate on iterables (especially sum()) is a superior solution. Still, I learned a few interesting things I didn't know last week :) Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From skip at pobox.com Fri Oct 24 08:37:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 24 08:37:52 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: References: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> Message-ID: <16281.7442.783253.814142@montanaro.dyndns.org> Tim> squares = (f(x)**2 for x in inputs) # assuming reiterability here Tim> for f in math.sin, math.cos, math.tan: Tim> plot(squares) How much more expensive would this be than for f in math.sin, math.cos, math.tan: squares = (f(x)**2 for x in inputs) plot(squares) which would work without reiterability, right? The underlying generator function could still be created at compile-time and it (or its code object?) stored in the current function's constants. 'f' is simply an argument to it when the iterator is instantiated. Skip From skip at pobox.com Fri Oct 24 08:41:17 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 24 08:41:27 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> Message-ID: <16281.7661.936250.901160@montanaro.dyndns.org> Tim> Likewise, the following should be illegal: Tim> def f(): Tim> x = 12 Tim> y = 1 Tim> def g(): Tim> global y in f Tim> y = 12 Tim> g() Tim> print locals() Tim> because the global statement occurs after a local binding of the Tim> name. You meant def f(): x = 12 y = 1 def g(): y = 12 global y in f g() print locals() right? 
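Nick's observation above — that `sum()` plus a generator expression covers most hand-written accumulation loops — can be sketched with illustrative values (`reduce` lives in `functools` in today's Python):

```python
from functools import reduce

values = [1.5, -2.0, 3.25, -0.5]

# The explicit accumulation loop...
loop_total = 0.0
for x in values:
    if x > 0:
        loop_total += x ** 2

# ...collapses to a generator expression fed to sum()...
gen_total = sum(x ** 2 for x in values if x > 0)

# ...which makes the reduce() spelling largely unnecessary.
reduce_total = reduce(lambda acc, x: acc + x ** 2,
                      (x for x in values if x > 0), 0.0)

print(loop_total == gen_total == reduce_total)  # True
```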
Skip From arigo at tunes.org Fri Oct 24 08:46:06 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 24 08:49:57 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <20031024124606.GB3853@vicky.ecs.soton.ac.uk> Hello Guido, On Wed, Oct 08, 2003 at 08:45:45PM -0700, Guido van Rossum wrote: > Py_INCREF(Py_True); > return Py_True; > > takes less time than > > return PyBool_FromLong(1); > > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? Sorry if this was already suggested and hastily rejected, but why do we care at all about the reference counter of the few heavily-used immortal objects of CPython? I guess allowing their counter not to be carefully maintained ventures to the slippery slopes of bad code. Anyway, my two cents for a (very) slightly faster and shorter code would be to be allowed never to do Py_INCREF or Py_DECREF when we know that the object is Py_None, Py_False or Py_True. These three would have a dummy tp_dealloc that just resets the reference counter to some large value if it ever reaches zero. 
Armin From pedronis at bluewin.ch Fri Oct 24 09:53:57 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri Oct 24 09:51:40 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310232322.h9NNMiA03864@12-236-54-216.client.attbi.com> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch> At 16:22 23.10.2003 -0700, Guido van Rossum wrote: > > > def tee(iterable): > > > "Return two independent iterators from a single iterable" > > > data = {} > > > cnt = 0 > > > def gen(next): > > > global* cnt > > > dpop = data.pop > > > for i in count(): > > > if i == cnt: > > > item = data[i] = next() > > > cnt += 1 > > > else: > > > item = dpop(i) > > > yield item > > > next = iter(iterable).next > > > return (gen(next), gen(next)) > > > > > >which is IMO more readable. > > > > it's a subtle piece of code. I wouldn't mind a more structured syntax with > > both the outer function declaring that is ok for some inner function to > > rebind some of its locals, and the inner function declaring that a > local is > > coming from an outer scope: > > > > def tee(iterable): > > "Return two independent iterators from a single iterable" > > data = {} > > > > # cnt = 0 here would be ok > > > > share cnt = 0: # the assignment is opt, > > # inner functions in the suite can rebind cnt > > def gen(next): > > use cnt # OR outer cnt > > dpop = data.pop > > for i in count(): > > if i == cnt: > > item = data[i] = next() > > cnt += 1 > > else: > > item = dpop(i) > > yield item > > > > # cnt = 0 here would be ok > > > > next = iter(iterable).next > > return (gen(next), gen(next)) > > > > yes it's heavy and unpythonic, but it makes very clear that something > > special is going on with cnt. > >Might as well declare a class then. 
:-)

well, no, it's probably that I expect rebindable closed-over vars to
be introduced by some kind of structured construct instead of the
usual Python freeform. I think for this kind of situation I miss the
Lisp-y 'let'.

    def counter(startval):
        share cnt = startval:
            def inc(i):
                use cnt
                cnt += i
                return cnt
            def dec(i):
                use cnt
                cnt -= i
                return cnt
        return inc, dec

vs.

    def counter(startval):
        cnt = startval
        def inc(i):
            global cnt in counter
            cnt += i
            return cnt
        def dec(i):
            global cnt in counter
            cnt -= i
            return cnt
        return inc, dec

vs.

    def counter(startval):
        class Counter:
            def __init__(self, startval):
                self.cnt = startval
            def inc(self, i):
                self.cnt += i
                return self.cnt
            def dec(self, i):
                self.cnt -= i
                return self.cnt
        newcounter = Counter(startval)
        return newcounter.inc, newcounter.dec

vs.

    (defun counter (startval)
      (let ((cnt startval))
        (flet ((inc (i) (incf cnt i))
               (dec (i) (decf cnt i)))
          (values #'inc #'dec))))

From ncoghlan at iinet.net.au  Fri Oct 24 10:01:55 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Oct 24 10:02:00 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
Message-ID: <3F9930D3.5060103@iinet.net.au>

Phillip J. Eby strung bits together to say:
> At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
>
>> This would allow the current delayed-evaluation semantics
>> to be kept as the default, while eliminating any need
>> for using the default-argument hack when you don't
>> want delayed evaluation.
>
> Does anybody actually have a use case for delayed evaluation? Why would
> you ever *want* it to be that way? (Apart from the BDFL's desire to
> have the behavior resemble function behavior.)
>
> And, if there's no use case for delayed evaluation, why make people jump
> through hoops to get the immediate binding?
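For comparison with Samuele's counter variants above: in today's Python the rebinding can already be avoided by closing over a mutable object instead of a bare name (a sketch, not one of the proposed syntaxes):

```python
def counter(startval):
    cell = [startval]  # shared mutable state; no closed-over name is rebound
    def inc(i):
        cell[0] += i
        return cell[0]
    def dec(i):
        cell[0] -= i
        return cell[0]
    return inc, dec

inc, dec = counter(10)
a, b = inc(5), dec(3)
print(a, b)  # 15 12
```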
The other thing to consider is that if generator expressions provide immediate evaluation, then anyone who wants delayed evaluation semantics still has the option of writing an actual generator function - at which point, it ceases to be an expression, and becomes a function. Which seems to fit with the way Python works at the moment: This displays '1': x = 0 y = x + 1 x = 1 print y This displays '2': x = 0 y = lambda: x + 1 x = 1 print y (I think someone already gave a similar example) Actually, the exact same no-argument-lambda trick used above would be enough to get you late binding of all of the elements in your generator expression. Being selective still requires writing a real generator function, though. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From python at rcn.com Fri Oct 24 10:06:11 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 24 10:07:01 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <20031024124606.GB3853@vicky.ecs.soton.ac.uk> Message-ID: <001001c39a37$fb221720$e841fea9@oemcomputer> [Guido van Rossum] > > Py_INCREF(Py_True); > > return Py_True; > > > > takes less time than > > > > return PyBool_FromLong(1); > > > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? [Armin Rigo] > Sorry if this was already suggested and hastily rejected, but why do we > care > at all about the reference counter of the few heavily-used immortal > objects of > CPython? > > I guess allowing their counter not to be carefully maintained ventures to > the > slippery slopes of bad code. Anyway, my two cents for a (very) slightly > faster and shorter code would be to be allowed never to do Py_INCREF or > Py_DECREF when we know that the object is Py_None, Py_False or Py_True. 
> These three would have a dummy tp_dealloc that just resets the
> reference counter to some large value if it ever reaches zero.

Hmm, how about having the macros do the increments in debug builds but
skip them in production code. That would keep the quality controls in,
not break existing leak detection methods, and save the microseconds
for incrementing.

Raymond

From FBatista at uniFON.com.ar  Fri Oct 24 10:42:00 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Fri Oct 24 10:43:04 2003
Subject: [Python-Dev] prePEP: Money data type
Message-ID: 

Nick Coghlan wrote:

#- > I'm urged to have a Money data type, but I'll see if I can
#- get it through
#- > Decimal, improving/fixing/extending Decimal and saving
#- effort at the same
#- > time.
#-
#- And there is always the "class Money(Decimal):" option, as well.

Sure, I think that's the way to do it. But first I need to know what
problems Decimal has (if it isn't in the standard library, it surely
needs work).

Anyway, I'm burning my mind with the IBM specification of Decimal
Arithmetic and studying the class itself. Then I'll work out the test
cases, see what must be done, *do it* (if I can), and theeeeeeeeeeen
start thinking again about Money.

. Facundo

WARNING: The information contained in this message and any file
attached to it is for the exclusive use of the addressee and may
contain confidential or proprietary information, whose disclosure is
penalized by law. If you are not one of the named recipients, or the
person responsible for delivering this message to the named
recipients, you are not authorized to disclose, copy, distribute or
retain the information (or any part of it) contained in this message.
Please notify us by replying to the sender, delete the original
message and delete any copies (printed or stored on any magnetic
medium) that you may have made of it. All opinions contained in this
mail are the author's own and do not necessarily coincide with those
of Telefónica Comunicaciones Personales S.A. or any associated
company. Electronic messages may be altered, for which reason
Telefónica Comunicaciones Personales S.A. will not accept any
liability whatsoever resulting from this message. Thank you very much.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031024/ac5c896d/attachment.html

From gtalvola at nameconnector.com  Fri Oct 24 11:07:35 2003
From: gtalvola at nameconnector.com (Geoffrey Talvola)
Date: Fri Oct 24 11:07:42 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
Message-ID: <61957B071FF421419E567A28A45C7FE59AF738@mailbox.nameconnector.com>

Greg Ewing wrote:
> Guido:
>
>> Wouldn't this be even better?
>>
>>     "create index ${table}_lid1_idx on $table($lid1)" % params
>
> I wouldn't object to that. I'd have expected *you* to
> object to it, though, since it re-defines the meaning
> of "$" in an interpolated string. I was just trying
> to suggest something that would be backward-compatible.

$ is currently unused in Python AFAIK. So why not:

    "create index ${table}_lid1_idx on $table($lid1)" $ params

No backward compatibility problems at all.
- Geoff From ncoghlan at iinet.net.au Fri Oct 24 10:45:26 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Oct 24 11:44:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F9930D3.5060103@iinet.net.au> References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com> <3F9930D3.5060103@iinet.net.au> Message-ID: <3F993B06.2080106@iinet.net.au> Nick Coghlan strung bits together to say: > This displays '2': > x = 0 > y = lambda: x + 1 > x = 1 > print y D'oh! That last line should be "print y()"! Regards, Nick. Still has to reinstall Python on new OS installation. . . -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From guido at python.org Fri Oct 24 11:45:17 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:46:09 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 09:02:15 BST." <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310241545.h9OFjHx05183@12-236-54-216.client.attbi.com> > This made me think. What should be the effect of > > def f(): > x = 12 > def g(): > global y in f > y = 12 > g() > print locals() > > I suspect the answer is "it's illegal". But by extension from the > current behaviour of "global", it should create a local variable in > f. I see no reason why it should be illegal; it should indeed create y in f. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 24 11:46:35 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:46:43 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 18:11:01 +1000." 
<338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> Message-ID: <200310241546.h9OFkZ905194@12-236-54-216.client.attbi.com> > Likewise, the following should be illegal: > > def f(): > x = 12 > y = 1 > def g(): > global y in f > y = 12 > g() > print locals() > > because the global statement occurs after a local binding of the name. Huh? The placement of a global statement is irrelevant -- it can occur anywhere in the scope. This should certainly work. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 24 11:50:48 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:50:58 2003 Subject: [Python-Dev] cleanup order In-Reply-To: Your message of "Fri, 24 Oct 2003 11:21:16 +0200." References: Message-ID: <200310241550.h9OFonK05219@12-236-54-216.client.attbi.com> > Is the cleanup order at Python shutdown documented somewhere? Yes, in the source. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From jmarshal at mathworks.com Fri Oct 24 11:56:21 2003 From: jmarshal at mathworks.com (Joshua Marshall) Date: Fri Oct 24 11:58:11 2003 Subject: [Python-Dev] closure semantics Message-ID: <7224B63940F10F40A48AC423597ADE57012DC7BB@MESSAGE-AH.ad.mathworks.com> [Timothy] > It would break any unadorned 'global x' in a nested scope if the > name did not exist anywhere. > > One option would be to have an "if the name doesn't exist, it it > created in module scope". [Greg Ewing] > What would be wrong with that? It's what I had in mind. I believe the <<- operator in the statistical language R behaves exactly like this. While not a compelling reason in itself to adopt this behavior, it's useful to consider constructs in other successful languages. 
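For reference, the current behavior under debate: `global` always targets the module scope, skipping any same-named binding in an intermediate function — the very behavior Greg proposes to redefine. A minimal sketch:

```python
x = "module"

def outer():
    x = "outer"          # intermediate binding with the same name
    def inner():
        global x         # today this always means the module-level x
        x = "rebound"
    inner()
    return x             # outer's own local is untouched

result = outer()
print(result, x)  # outer rebound
```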
From guido at python.org  Fri Oct 24 11:59:04 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 11:59:11 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245
In-Reply-To: Your message of "Fri, 24 Oct 2003 13:46:06 BST." <20031024124606.GB3853@vicky.ecs.soton.ac.uk>
References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <20031024124606.GB3853@vicky.ecs.soton.ac.uk>
Message-ID: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>

> > Py_INCREF(Py_True);
> > return Py_True;
> >
> > takes less time than
> >
> > return PyBool_FromLong(1);
> >
> > Maybe a pair of macros Py_return_True and Py_return_False would make
> > sense?
>
> Sorry if this was already suggested and hastily rejected, but why do
> we care at all about the reference counter of the few heavily-used
> immortal objects of CPython?

It was never discussed; I don't recall that it has ever occurred to me.

> I guess allowing their counter not to be carefully maintained
> ventures to the slippery slopes of bad code. Anyway, my two cents
> for a (very) slightly faster and shorter code would be to be allowed
> never to do Py_INCREF or Py_DECREF when we know that the object is
> Py_None, Py_False or Py_True. These three would have a dummy
> tp_dealloc that just resets the reference counter to some large
> value if it ever reaches zero.

I think there are debugging modes where this would upset some counters
that maintain a balance of the total number of references in the
world. I also don't think that the performance gain would be
measurable. Maybe the slight code size decrease would have some
benefits. I'm worried that there would be a negative effect in terms
of people copying the pattern for other objects.
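The reference counts under discussion can be inspected from Python itself; the exact numbers are implementation details that vary by version (recent CPython makes these objects truly immortal with a huge fixed count), so this sketch only checks that they are large:

```python
import sys

# None and True are referenced throughout the interpreter, so a stray
# INCREF/DECREF pair could never realistically push them to zero.
print(sys.getrefcount(None) > 100)  # True on CPython
print(sys.getrefcount(True) > 10)   # True on CPython
```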
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Oct 24 12:05:09 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:06:25 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 15:53:57 +0200." <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
Message-ID: <200310241605.h9OG59C05317@12-236-54-216.client.attbi.com>

> well, no, it's probably that I expect rebindable closed-over vars to
> be introduced by some kind of structured construct instead of the
> usual Python freeform.

Why does rebindability make a difference here?  Local vars are already
visible in inner scopes, and if they are mutable, they are already
being modified from inner scopes (just not rebound, but to most
programmers that's an annoying detail).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org Fri Oct 24 12:08:27 2003
From: arigo at tunes.org (Armin Rigo)
Date: Fri Oct 24 12:12:25 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com>
References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com>
Message-ID: <20031024160827.GA20721@vicky.ecs.soton.ac.uk>

Hello,

On Fri, Oct 17, 2003 at 07:46:31AM -0700, Guido van Rossum wrote:
> > If the pretty academic subject of graph isomorphisms is well-worn
> > enough to be sent to the trash, I'll submit a patch that just
> > removes all this code and instead use the existing
> > sys.recursionlimit counter to catch infinite recursions and throw
> > the usual RuntimeError.

Patch http://www.python.org/sf/825639.
Rationale
---------

Adding a list to itself is a nice first-time example of Python
features, but it is quite uncommon in practice.  It introduces a few
problems for the CPython interpreter, which must explicitly detect and
avoid infinite recursions not only to please the user (e.g. for a
nicer str() representation) but because infinite C-level recursions
crash it.  The naive definition of comparison between lists is
recursive, and thus suffers from this problem.

"Bisimulation" is one of the possible mathematically clean definitions
of what it means for two recursive structures to be equal; this is
what CPython currently implements.  However, I argue that this
behavior is unexpected (and undocumented), and it masks bugs in
erroneous user code: structures may be considered equal by error.
Triggering an explicit "infinite recursion" exception would have
clearly pointed out the problem.

The current implementation of equality is to return True if comparison
of two containers recursively reaches the same pair of containers.
This is arguably the same as if the following code:

    def f():
        return f()

returned None instead of looping indefinitely, because some dark magic
in CPython decides that returning None is a good idea (and returning
None is consistent: f() can return None if the nested f() call returns
None too.  Of course, returning anything else would be consistent too,
but then for the equality we decide to return True whereas returning
False would be consistent too, and would just make fewer structures be
considered equal).

Workarounds
-----------

Under the proposed patch, applications that relied on equality to
compare recursive structures will receive a RuntimeError: maximum
recursion depth exceeded in cmp.  This error does not come with a long
traceback, unlike the normal "maximum recursion depth exceeded" error,
unless user-defined (pure Python) comparison operators are involved in
the infinite loop.
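[A minimal illustration of the behavior change described above. This sketch assumes the patched behavior, which is what modern CPython versions exhibit, with the limit surfacing as RecursionError, a RuntimeError subclass:]

```python
# Two structurally identical but distinct cycles.
a = []; a.append(a)
b = []; b.append(b)

print(a == a)    # True: the identity shortcut needs no recursion

try:
    a == b       # bisimulation answered True here; without it, this recurses
except RecursionError:
    print("maximum recursion depth exceeded in comparison")
```

Comparing `a` with itself still succeeds because element comparison short-circuits on identity; only the cross-comparison of distinct cycles hits the recursion limit.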
It is easy to write the bisimulation algorithm in Python if one needs
it, but it is harder and quite unnatural to do the converse: work
around CPython's implementation of equality to turn off the
bisimulation behavior.  Three approaches can be taken to port
applications:

- structural equality can often be replaced by explicit structural
  tests.  This is what the patch does for *all* the tests in Lib/test
  that relied on recursive equality.  For example, if you want to
  check that an object is really a list that contains itself and
  nothing else, you can easily check that "isinstance(a, list) and
  len(a) == 1 and a[0] is a".  This is more precise than the
  now-deprecated variants "a==a[0]" or "b=[]; b.append(b); a==b"
  because the latter would also succeed if a is [c] and c is [a], for
  example.

- among the rare cases where we really want bisimulation, cyclic
  structures involving user-defined objects with a custom notion of
  equality are probably the most common case.  If so, then it is
  straightforward to add a cache to the __eq__ operator:

      def __eq__(self, other):
          if id(other) in self.assumedequal:
              return True
          try:
              self.assumedequal[id(other)] = True
              #...recursive comparisons...
          finally:
              del self.assumedequal[id(other)]

  This typically only needs to be done for one of the classes involved
  -- as long as all cycles you are interested in will involve an
  instance of this class.

- finally, to compare cyclic structures made only from built-in
  containers, an explicit "global" algorithm will do the trick.
Here is a non-recursive one for lists:

    def bisimilar_lists(a, b):
        def consider(a, b):
            key = id(a), id(b)
            if key not in bisim:
                bisim[key] = True
                pending.append((a, b))
        bisim = {}
        pending = []
        consider(a, b)
        for a, b in pending:          # a, b are the lists to compare
            if len(a) != len(b):      # different length
                return False
            for a1, b1 in zip(a, b):
                if type(a1) != type(b1):   # elements of different types
                    return False
                if isinstance(a1, list):
                    consider(a1, b1)       # add the two sub-lists to 'pending'
                elif a1 != b1:             # else compare non-lists directly
                    return False
        return True

This could easily be extended to provide support for dictionaries.
The complete equivalent of the current CPython implementation is
harder to achieve, but in the improbable case where the user really
needs it (as opposed to one of the above solutions), he could define
custom special methods, say __bisimilar__().  He would then extend the
above algorithm to call this method in preference to __eq__() when it
exists.  Alternatively, he could define a global dictionary mapping
types to bisimulation algorithms, with a registration mechanism for
new types.  (This is similar to copy.py and copy_reg.py.  It could be
added to the standard library.)

Patch info
----------

The proposed patch adds two functions to the C API:

int Py_EnterRecursiveCall(char *where)
    Marks a point where a recursive C-level call is about to be
    performed.  'where' should be a string " in xyz" to be
    concatenated to the RuntimeError message caused by the recursion
    depth limit.

void Py_LeaveRecursiveCall()
    Ends a Py_EnterRecursiveCall().  Must be called once for each
    *successful* invocation of Py_EnterRecursiveCall().
These functions are used to simplify the code of the following:

- eval_frame()
- PyObject_Compare()
- PyObject_RichCompare()
- instance_call()

The idea to make these two functions part of the public API is to have
a well-tested and PyOS_CheckStack()-issuing way to perform safe
recursive calls at the C level, both in the core and in extension
modules.  For example, cPickle.c has its own notion of recursion depth
limit, but it does not check the OS stack; instead, it should probably
use Py_EnterRecursiveCall() as well (which I did not do yet).

Note that Py_EnterRecursiveCall() does the same checks as eval_frame()
used to do, whereas Py_LeaveRecursiveCall() is actually a
single-instruction macro.

There is a performance degradation for the comparison of large
non-cyclic lists, which I measure to be about 6-7% slower with the
patch.  Possibly, extra work could be done to tune
Py_EnterRecursiveCall().

Another problem that Py_EnterRecursiveCall() could be enhanced to also
address is that a long, non-recursive comparison cannot currently be
interrupted by Ctrl-C.  For example:

>>> a = [5] * 1000
>>> b = [a] * 1000
>>> c = [b] * 1000
>>> c == c

-=- Armin

From guido at python.org Fri Oct 24 12:12:30 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:12:40 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: Your message of "Fri, 24 Oct 2003 07:37:38 CDT." <16281.7442.783253.814142@montanaro.dyndns.org>
References: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: <200310241612.h9OGCUw05359@12-236-54-216.client.attbi.com>

> The underlying generator function could still be created at
> compile-time and it (or its code object?) stored in the current
> function's constants.

No, the code object would be stored in the constants; the function
object would be created each time around the loop.

Good thing it came from an example that Tim himself didn't like.
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Oct 24 12:31:06 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:31:13 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: Your message of "Fri, 24 Oct 2003 17:08:27 BST." <20031024160827.GA20721@vicky.ecs.soton.ac.uk>
References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com> <20031024160827.GA20721@vicky.ecs.soton.ac.uk>
Message-ID: <200310241631.h9OGV6W05482@12-236-54-216.client.attbi.com>

You've convinced me.  It should be noted in the NEWS file that it may
break some apps; I'm sure there are a bunch of clever folks out there
who liked the bisimulation approach enough to depend on it :-).

Anyone else not in favor, please speak up over the weekend so Armin
can check it in on Monday.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at bluewin.ch Fri Oct 24 12:46:42 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Oct 24 12:44:17 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <200310241605.h9OG59C05317@12-236-54-216.client.attbi.com>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
Message-ID: <5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>

At 09:05 24.10.2003 -0700, Guido van Rossum wrote:
> > well, no, it's probably that I expect rebindable closed-over vars to
> > be introduced by some kind of structured construct instead of the
> > usual Python freeform.
>
>Why does rebindability make a difference here?  Local vars are already
>visible in inner scopes, and if they are mutable, they are already
>being modified from inner scopes (just not rebound, but to most
>programmers that's an annoying detail).
most Python programmers or most Python programmers using closures?

Well, it's a gut feeling, let's try to articulate it.  Because

a) parametrizing a closure with some read-only variable
b) possibly shared mutable state with indefinite extent

are very different things.  I think that people should resort to b)
instead of using classes sparingly and make it clear when they do so.

b) can feel like global variables with their problems, I think that's
why I would prefer a syntax that still points out: this is some state
and these are functions to manipulate it.  Classes are fine for that,
and knowing that it is common style/idiom in Lisp variants this is
also fine there:

(let ... introduces vars
  ... function defs)

I think it is also about expectations when reading some code.  Right
now, reading Python code I expect at most to encounter a), although b)
can be obtained using mutable objects, but also in that case IMHO an
explicit uniform idiom would be preferable, like some Ref object
inspired by ML references.

I can live with all solutions, although I'm still unconvinced apart
from the Scheme textbook argument (which was serious) that this
addition is really necessary.

regards.

From tjreedy at udel.edu Fri Oct 24 13:07:39 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 13:07:44 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com>
Message-ID:

"Delaney, Timothy C (Timothy)" wrote in message
> I think these two points [consistency and teachability]
>should weigh heavily in any decision.

Agree also

>I think the need to rename the target scope is of lesser importance.

If you mean the need to sync the inner global-in statement with an
outer function name change, that is less onerous than doing the same
for variable name changes (which might require changes to several
lines in the inner function).  Function name mismatches would, I
presume, be caught as compile-time syntax errors.
But what about name mismatches?  Global statements allow functions to
create 'new' variables in the module scope and not just 'existing'
ones.  What about for in-between scopes?

#start of fresh interpreter session
def f():
    global xf
    xf = 1
    def g():
        global xg
        xg = 2
        global xgf in f
        xgf = 3

does this compile and run?  or choke on third global at compile time?
or choke on third assignment at runtime?

Terry J. Reedy

From tjreedy at udel.edu Fri Oct 24 13:16:41 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 13:16:47 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com>
Message-ID:

"Terry Reedy" wrote in message news:bnbm8s$p6h$1@sea.gmane.org...
>
> But what about name mismatches?  Global statements allow functions to
> create 'new' variables in the module scope and not just 'existing'
> ones.  What about for in-between scopes?
>
> #start of fresh interpreter session
> def f():
>     global xf
>     xf = 1
>     def g():
>         global xg
>         xg = 2
>         global xgf in f
>         xgf = 3
>
> does this compile and run?  or choke on third global at compile time?
> or choke on third assignment at runtime?

Whoops.  Paul Moore asked the same question and Guido answered compile
and run.

TJR

From python at rcn.com Fri Oct 24 13:19:55 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 13:20:46 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245
In-Reply-To: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
Message-ID: <004e01c39a53$0b5838c0$e841fea9@oemcomputer>

> I also don't think that the performance gain would be measurable.

The more I think about it, the more I'm sure that it cumulatively will
never save as much time as it took to write this email.
Raymond

From guido at python.org Fri Oct 24 13:34:04 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 13:34:17 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245
In-Reply-To: Your message of "Fri, 24 Oct 2003 13:19:55 EDT." <004e01c39a53$0b5838c0$e841fea9@oemcomputer>
References: <004e01c39a53$0b5838c0$e841fea9@oemcomputer>
Message-ID: <200310241734.h9OHY4L05621@12-236-54-216.client.attbi.com>

> > I also don't think that the performance gain would be measurable.
>
> The more I think about it, the more I'm sure that it cumulatively will
> never save as much time as it took to write this email.

So stop thinking about it already! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org Fri Oct 24 13:31:02 2003
From: arigo at tunes.org (Armin Rigo)
Date: Fri Oct 24 13:34:54 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245
In-Reply-To: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <20031024124606.GB3853@vicky.ecs.soton.ac.uk> <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
Message-ID: <20031024173102.GA29094@vicky.ecs.soton.ac.uk>

Hello Guido,

On Fri, Oct 24, 2003 at 08:59:04AM -0700, Guido van Rossum wrote:
> > Sorry if this was already suggested and hastily rejected, but why do
> > we care at all about the reference counter of the few heavily-used
> > immortal objects of CPython?
>
> It was never discussed; I don't recall that it has ever occurred to
> me.

Just tried, and indeed I can't measure a difference.
Armin

From python at rcn.com Fri Oct 24 14:12:57 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 14:13:49 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <2mu15ynaed.fsf@starship.python.net>
Message-ID: <005401c39a5a$73aa9880$e841fea9@oemcomputer>

> > Was there a reason for leaving this out of the API or should it be
> > added?  Is the right way to simulate a pop something like this:
>
> Well, there's always PyEval_CallMethod...

I ended up using:

    PyObject_CallMethod(to->outbasket, "pop", NULL);

The bummer is that this call is effectively used in a loop and runs
once for every data element in an iterable.  Something like pop() has
such a tiny granularity that its runtime is overwhelmed by the lookup
time to call it this way.  For this reason, I think PyList_Pop()
warrants inclusion in the API much more than low granularity methods
like PyList_Reverse() or PyList_Sort().

Raymond Hettinger

From guido at python.org Fri Oct 24 14:24:52 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 14:26:47 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: Your message of "Fri, 24 Oct 2003 14:12:57 EDT." <005401c39a5a$73aa9880$e841fea9@oemcomputer>
References: <005401c39a5a$73aa9880$e841fea9@oemcomputer>
Message-ID: <200310241824.h9OIOr105765@12-236-54-216.client.attbi.com>

> I ended up using:
>
>     PyObject_CallMethod(to->outbasket, "pop", NULL);
>
> The bummer is that this call is effectively used in a loop and runs once
> for every data element in an iterable.  Something like pop() has such a
> tiny granularity that its runtime is overwhelmed by the lookup time to
> call it this way.  For this reason, I think PyList_Pop() warrants
> inclusion in the API much more than low granularity methods like
> PyList_Reverse() or PyList_Sort().
But it's easy to simulate a pop, writing the C equivalent of

    x = lst[len(lst)-1]
    del lst[len(lst)-1 : len(lst)]

IOW:

PyObject *
listpop(PyObject *lst)
{
    PyObject *x;
    int n;

    n = PyList_GET_SIZE(lst);
    if (n == 0)
        return NULL;
    x = PyList_GET_ITEM(lst, n-1);
    Py_INCREF(x);
    PyList_SetSlice(lst, n-1, n, NULL);
    return x;
}

I see no need to add this to the public API just yet (it would have to
be more flexible to allow lst.pop(n), do more arg checks, etc.).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From oren-py-d at hishome.net Fri Oct 24 14:26:29 2003
From: oren-py-d at hishome.net (Oren Tirosh)
Date: Fri Oct 24 14:27:08 2003
Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
Message-ID: <20031024182629.GA34310@hishome.net>

On Wed, Oct 22, 2003 at 09:11:37AM -0700, Alex Martelli wrote:
...
> Alternatively, assigning to an attribute of some particular
> object still feels a better approach to me -- no new kwd,
> no stretching of bad old ones, actually a chance to let bad
> old 'global' fade slowly into the sunset.  If there's any
> chance to salvage THAT approach -- if it only needs a good
> neat syntax to get that "particular object" -- I'll be glad
> to participate in brainstorming to find it.

How about using the word 'global' to get the current module object?
A precedent for this is None which is on its way to becoming a keyword
to get that "particular object".

A bit of parser magic would be required so global can still work
as a declaration for compatibility.

>>> global is sys.modules[__name__]
True
>>> global.__dict__ is globals()
True

Oren

From guido at python.org Fri Oct 24 14:30:49 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 14:31:01 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 18:46:42 +0200."
<5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch> <5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>
Message-ID: <200310241830.h9OIUnq05803@12-236-54-216.client.attbi.com>

[Samuele]
> > > well, no, it's probably that I expect rebindable closed-over vars to
> > > be introduced by some kind of structured construct instead of the
> > > usual Python freeform.

[Guido]
> >Why does rebindability make a difference here?  Local vars are already
> >visible in inner scopes, and if they are mutable, they are already
> >being modified from inner scopes (just not rebound, but to most
> >programmers that's an annoying detail).

[Samuele]
> most Python programmers or most Python programmers using closures?

I meant both categories.

> Well, it's a gut feeling, let's try to articulate it.  Because
>
> a) parametrizing a closure with some read-only variable
> b) possibly shared mutable state with indefinite extent
>
> are very different things.  I think that people should resort to b) instead
> of using classes sparingly and make it clear when they do so.

Raymond's tree() example is an unfortunate one in this category.
(Unfortunately because it is obfuscated code for speed reasons and
because it appears in an examples section of official docs.)

> b) can feel like global variables with their problems, I think that's why I
> would prefer a syntax that still points out: this is some state and these are
> functions to manipulate it.  Classes are fine for that, and knowing that it
> is common style/idiom in Lisp variants this is also fine there:
>
> (let ... introduces vars
>   ... function defs)
>
> I think it is also about expectations when reading some code.
> Right now,
> reading Python code I expect at most to encounter a), although b) can be
> obtained using mutable objects, but also in that case IMHO an explicit
> uniform idiom would be preferable, like some Ref object inspired by ML
> references.
>
> I can live with all solutions, although I'm still unconvinced apart from the
> Scheme textbook argument (which was serious) that this addition is really
> necessary.

I don't think the Scheme textbook argument should weigh much, since
that's such a small audience.

My original approach has been to discourage (b) by not allowing
rebinding.  Maybe this should stay the way it is.

But the use of 'global x in f' might be enough to tip the reader off
-- not quite at the start of f, when x is defined, but at least at the
start of the inner function that declares x global in f.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com Fri Oct 24 14:31:37 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 14:32:28 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: <200310241631.h9OGV6W05482@12-236-54-216.client.attbi.com>
Message-ID: <005d01c39a5d$0f970f60$e841fea9@oemcomputer>

[Guido]
> You've convinced me.  It should be noted in the NEWS file that it may
> break some apps; I'm sure there are a bunch of clever folks out there
> who liked the bisimulation approach enough to depend on it :-).
>
> Anyone else not in favor, please speak up over the weekend so Armin
> can check it in on Monday.

Armin is working on speeding up the patch.  I recommend holding off
until we can measure the performance impact of a revised patch.  If it
only affects cyclic structures, it's no big deal.  But if it impacts
normal equality tests, that warrants a little more discussion.

Another thought is that it would be prudent to see how much breakage
can be expected.  For example, perhaps the patch can be tried on an
older python to see if Zope can deal with it.
Otherwise, the patch is elegant and simplifies the code quite a bit.

Also, Armin's well-written proposal ought to be preserved somewhere
(like Tim's listsort.txt file).

Raymond Hettinger

From oren-py-d at hishome.net Fri Oct 24 14:48:50 2003
From: oren-py-d at hishome.net (Oren Tirosh)
Date: Fri Oct 24 14:48:55 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
In-Reply-To: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>
References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>
Message-ID: <20031024184850.GB34310@hishome.net>

On Thu, Oct 23, 2003 at 02:36:41PM +1300, Greg Ewing wrote:
> I have just had the experience of writing a bunch
> of expressions of the form
>
>   "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params
>
> and found myself getting quite confused by all the parentheses
> and "s" suffixes.  I would *really* like to be able to write
> this as
>
>   "create index %{table}_lid1_idx on %{table}(%{lid1})" % params
>
> which I find to be much easier on the eyes.

A while ago I proposed the following syntax for embedded expressions
in strings, parsed at compile-time:

  "create index \{table}_lid1_idx on \{table}(\{lid1})"

And the equivalent runtime parsed version:

  r"create index \{table}_lid1_idx on \{table}(\{lid1})".cook(params)

testing-the-water-to-see-if-it's-PEP-time-ly yours,

   Oren

From python at rcn.com Fri Oct 24 15:01:09 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 15:02:05 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <200310241824.h9OIOr105765@12-236-54-216.client.attbi.com>
Message-ID: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>

> -----Original Message-----

[Raymond]
> > The bummer is that this call is effectively used in a loop and runs once
> > for every data element in an iterable.  Something like pop() has such a
> > tiny granularity that its runtime is overwhelmed by the lookup time to
> > call it this way.
> > For this reason, I think PyList_Pop() warrants
> > inclusion in the API much more than low granularity methods like
> > PyList_Reverse() or PyList_Sort().

> But it's easy to simulate a pop, writing the C equivalent of
>
>     x = lst[len(lst)-1]
>     del lst[len(lst)-1 : len(lst)]
. . .
>     PyList_SetSlice(lst, n-1, n, NULL);

There's the new piece of information.  I didn't know that the final
argument could be NULL, and creating/destroying an empty list for the
arg was unpleasant.  I'll add that info to the API docs.

> I see no need to add this to the public API just yet (it would have to
> be more flexible to allow lst.pop(n), do more arg checks, etc.).

Yes.  See if more requestors come along.  Surely, I was not the first
to want to use PyLists as append/pop stacks for PyObjects.

Thanks,

Raymond

From guido at python.org Fri Oct 24 15:08:59 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 15:09:22 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: Your message of "Fri, 24 Oct 2003 15:01:09 EDT." <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
References: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
Message-ID: <200310241908.h9OJ8xO05942@12-236-54-216.client.attbi.com>

> Surely, I was not the first to want to use PyLists as append/pop stacks
> for PyObjects.

Yes, but most of them write Python code, not C code. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Fri Oct 24 15:18:49 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 15:18:56 2003
Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: <20031024182629.GA34310@hishome.net>
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <20031024182629.GA34310@hishome.net>
Message-ID: <200310242118.49335.aleaxit@yahoo.com>

On Friday 24 October 2003 08:26 pm, Oren Tirosh wrote:
...
> How about using the word 'global' to get the current module object?
> A precedent for this is None which is on its way to becoming a keyword
> to get that "particular object".
>
> A bit of parser magic would be required so global can still work
> as a declaration for compatibility.

Unfortunately I think Guido clarified in a previous post the amount of
parser magic needed for that is excessive -- no lookahead allowed.

If we managed to tweak the parser, we'd still have the issue of
keyword inappropriateness -- and further stretching if we also want to
use it to allow rebinding of outer-scope variables that AREN'T
"global" in any sense whatsoever.

So I'd much rather have 'scope' and no issue with parser magic...

Alex

From aahz at pythoncraft.com Fri Oct 24 16:00:57 2003
From: aahz at pythoncraft.com (Aahz)
Date: Fri Oct 24 16:01:01 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
Message-ID: <20031024200057.GA19184@panix.com>

On Wed, Oct 22, 2003, Guido van Rossum wrote:
>Raymond Hettinger:
>>
>> Did the discussion of a sort() expression get resolved?
>>
>> The last I remember was that the list.sorted() classmethod had won the
>> most support because it accepted the broadest range of inputs.
>>
>> I could live with that though I still prefer the more limited
>> (list-only) copysort() method.
>
> list.sorted() has won, but we are waiting for feedback from the
> person who didn't like having both sort() and sorted() as methods, to
> see if his objection still holds when one is a method and the other a
> factory function.

Actually, I was another person who expressed dislike for "sorted()"
causing confusion, but previous calls for feedback were restricted to
those who felt comfortable expressing opinions for non-English
speakers.
I'm still -1 on sorted() compared to copysort(), but because it's a
different context, I'm no longer actively opposed (which is why I
didn't bother speaking up earlier).  I still think that a purely
grammatical change in spelling is not appropriate to indicate meaning,
particularly when it's still the same part of speech (both verbs).  To
my mind, sorted() implies a Boolean predicate.
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From aleaxit at yahoo.com Fri Oct 24 16:17:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 16:17:48 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <87brs7egju.fsf@egil.codesourcery.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com>
Message-ID: <200310242217.42857.aleaxit@yahoo.com>

On Friday 24 October 2003 12:27 am, Zack Weinberg wrote:
...
> Frankly, I wish Python required one to write explicit declarations for
> all variables in the program:
>
> var x, y, z    # module scope
>
> class bar:
>     classvar I, J, K    # class variables

Seems like a great way to get uninitialized variables to me.  Might as
well mandate initialization, getting a hard-to-read

    classvar I=2.3, J=(2,3), K=23

or to force more readability one might say only one name per classvar
statement

    classvar I=2.3
    classvar J=(2,3)
    classvar K=23

But then what added value is that 'classvar' boilerplate dirtying
things up?  Might as well take it off and get

    I = 2.3
    J = (2, 3)
    K = 23

which is just what we have now.
> It's extra bondage and discipline, yeah, but it's that much more help
> comprehending the program six months later, and it also gets rid of

There is absolutely no help (not one minute later, not six months
later) "comprehending" the program just because some silly language
mandates redundancy, such as a noiseword 'classvar' in front of the
assignments.

> the "how was this variable name supposed to be spelled again?"
> question.

I disagree that the 'classvar' boilerplate would provide any help with
that question.  Just put the initializing assignment there and it's
only clearer for NOT being obscured by that 'classvar' thingy.
Document with docstrings or comments, not by changing the language.

A language which, I suspect, MIGHT let you do exactly what you want,
is Ruby.  I don't know for sure that you can tweak Ruby into giving
(at least) warnings for assignment to symbols outside of a certain
set, but I suspect you might; you _can_ change the language's
semantics pretty deeply.  Yet in most other ways it's close enough to
Python that the two are almost equivalent.

I do believe (and hope!) you stand very little chance of ever getting
into Python something as alien to its tradition and principles as
variable declarations, so, if they're important to you, considering
ruby might be a more productive option for you.

Alex

From aahz at pythoncraft.com Fri Oct 24 16:20:51 2003
From: aahz at pythoncraft.com (Aahz)
Date: Fri Oct 24 16:20:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
Message-ID: <20031024202051.GB19184@panix.com>

On Wed, Oct 22, 2003, Guido van Rossum wrote:
>
> I think that for reductions the gains are less clear.
> The initializer for the result variable and the call that updates it
> are no longer boilerplate, because they vary for each use; plus the
> name of the result variable should be chosen carefully because it
> indicates what kind of result it is (e.g. a sum or product). So,
> leaving out the condition for now, the pattern or idiom is:
>
>     <result> = <initializer>
>     for <variable> in <iterable>:
>         <result> = <expression>
>
> (Where <expression> uses <variable> and <result>.)

Actually, even that doesn't quite capture the expressiveness needed,
because <expression> needs in some cases to be a sequence of statements,
and there needs to be an opportunity for a finalizer to run after the
for loop (e.g. average()).
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From zack at codesourcery.com Fri Oct 24 16:39:39 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Fri Oct 24 16:39:46 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310242217.42857.aleaxit@yahoo.com> (Alex Martelli's message of "Fri, 24 Oct 2003 22:17:42 +0200")
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com>
Message-ID: <87llrabcac.fsf@egil.codesourcery.com>

Alex Martelli writes:

> On Friday 24 October 2003 12:27 am, Zack Weinberg wrote:
> ...
>> Frankly, I wish Python required one to write explicit declarations for
>> all variables in the program:
>>
>>         var x, y, z    # module scope
>>
>>         class bar:
>>            classvar I, J, K    # class variables
>
> Seems like a great way to get uninitialized variables to me.

No, they get a magic cookie value that triggers an exception on use.
Which, incidentally, disambiguates the present UnboundLocalError -
is that a typo, or is that failure to initialize the variable on this
code path? Consider, eg.

    def foo(x):
        s = 2
        if x:
            a = 1
        return a
...
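For what it's worth, the ambiguity Zack describes is easy to reproduce with his foo() as the language stands today (a quick sketch; the failing call and the `outcome` bookkeeping are added here purely for illustration):

```python
def foo(x):
    s = 2
    if x:
        a = 1
    return a            # 'a' is compiled as a local name

assert foo(True) == 1   # the branch binds 'a', so this works

# When the branch is skipped, 'a' is local but never bound,
# and the lookup fails at runtime rather than at compile time:
try:
    foo(False)
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
assert outcome == "UnboundLocalError"
```

The traceback cannot tell you whether the author mistyped the name or simply missed a code path, which is exactly the ambiguity being discussed.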
> But then what added value is that 'classvar' boilerplate dirtying
> things up? Might as well take it off and get
>
>     I = 2.3
>     J = (2, 3)
>     K = 23
>
> which is just what we have now.
...
> There is absolutely no help (not one minute later, not six months later)
> "comprehending" the program just because some silly language mandates
> redundancy, such as a noiseword 'classvar' in front of the assignments.

Understand that I do almost all my programming in typed languages, where
that keyword isn't noise, it's a critical part of the declaration. I see
where you're coming from with regard to noisewords. There are plausible
alternatives, although they're all more complicated to implement and
explain, compared to

    var a, b = 2, c = foo()  # a throws UninitializedLocalError if used
                             # before set
    ...
    d                        # throws UnboundLocalError
    e = 1                    # ALSO throws UnboundLocalError

But in this domain, I am mostly content with the language as is. I
think there really *is* a language deficiency with regard to declaring
class versus instance variables.

    class foo:
        A = 1  # these are class variables
        B = 2
        C = 3

        def __init__(self):
            self.a = 4  # these are instance variables
            self.b = 5
            self.c = 6

I find this imperative syntax for declaring instance variables
profoundly unintuitive. Further, on my first exposure to Python, I
thought A, B, C were instance variables, although it wasn't hard to
understand why they aren't.

People like to rag on the popularity of __slots__ (for reasons which
are never clearly spelled out, but never mind) -- has anyone considered
that it's popular because it's a way of declaring the set of instance
variables, and there is no other way in the language?
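As a side note, the class foo snippet above can be probed directly; this sketch (reusing names from the example) shows why A, B, C read like instance variables even though they live on the class:

```python
class foo:
    A = 1               # lives in the class namespace

    def __init__(self):
        self.a = 4      # lives in each instance's namespace

x, y = foo(), foo()
assert x.A == 1         # a class attribute is reachable through instances
x.A = 99                # but assigning through an instance...
assert foo.A == 1       # ...leaves the class attribute untouched,
assert y.A == 1         # so other instances still see the old value,
assert x.A == 99        # and x now has a shadowing instance attribute
assert x.a == 4 and 'a' not in foo.__dict__
```

Lookup falls back from the instance namespace to the class namespace, which is what makes the two kinds of "variable" easy to confuse on first exposure.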
zw From aleaxit at yahoo.com Fri Oct 24 16:48:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 24 16:48:39 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> Message-ID: <200310242248.32981.aleaxit@yahoo.com> On Friday 24 October 2003 12:06 am, Guido van Rossum wrote: > [Skip] > > > Given that the global keyword or something like it is here to stay > > (being preferable over some attribute-style access) > > (Actually I expect more pushback from Alex once he's back from his > trip. He seems to feel strongly about this. :-) I do: I dislike "declarative statements" and I also dislike "global" as a spelling for anything that isn't actually global. But after a 3-day Bologna->Munich->Gothenburg->Stockholm->Amsterdam->Bologna whirl I'm just too bushed -- and have too many hundreds of msgs to go through (backwards as usual) -- to be very effective;-). With luck, I may be able to do better in the weekend...:-). > That was my first suggestion earlier this week. The main downside > (except from propagating 'global' :-) is that if you rename the > function defining the scope you have to fix all global statements > referring to it. I seem to have seen many others say that the "renaming the function" downside is not a serious problem, and I concur with them; you're just as likely to rename e.g. the variable (where you have to hunt down and change every assignment and access as well as the "declarative stmt", AND get no compiler support for errors) as the function (where you only need to fix the "declarative stmts" AND the compiler will tell you if you miss some) Alex From bac at OCF.Berkeley.EDU Fri Oct 24 16:59:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Oct 24 17:00:48 2003
Subject: [Python-Dev] cleanup order
In-Reply-To: References:
Message-ID: <3F9992CD.8030201@ocf.berkeley.edu>

Thomas Heller wrote:

> Is the cleanup order at Python shutdown documented somewhere?
>
> The only thing I found is the (old) essay
> http://www.python.org/doc/essays/cleanup.html

The summarized history of python-dev to the rescue (thanks to Google's
restricted domain searching and "python-dev Summary" as a keyword). =)

http://www.python.org/dev/summary/2003-04-01_2003-04-15.html
http://www.python.org/dev/summary/2003-09-16_2003-09-30.html

Just search in these docs for "shutdown" and "cleanup". Most of it is
over threads not being terminated before shutdown begins, but the basic
order and such is discussed and spelled out. And the April one has it
told as if it was being explained to a grade school class (one of my
more creative, quirky summaries if I do say so myself).

-Brett

From aleaxit at yahoo.com Fri Oct 24 17:01:16 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 17:01:22 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com>
Message-ID: <200310242301.16445.aleaxit@yahoo.com>

On Friday 24 October 2003 12:08 am, Guido van Rossum wrote:
> > However, as long as we're talking about this stuff, I wish I could
> > write "global foo" at module scope and have that mean "this variable
> > is to be treated as global in all functions in this module".
>
> This is similar to Greg Ewing's proposal to have 'rebindable x' at
> an outer function scope. My problem with it remains:
>
> It gives outer scopes (some) control over inner scopes.
One of the > guidelines is that a name defined in an inner scope should always > shadow the same name in an outer scope, to allow evolution of the > outer scope without affecting local details of inner scope. (IOW if > an inner function defines a local variable 'x', the outer scope > shouldn't be able to change that.) I must be missing something, because I don't understand the value of that guideline. I see outer and inner functions as tightly coupled anyway; it's not as if they could be developed independently -- not even lexically, surely not semantically. I do prefer to have the reminder "this is _assigning_ a NON-local variable" _closer_ to the assignment -- and I DO think it would be great if such rebinding HAD to be an assignment, not some kind of "side effect" from statements such as def, class, for, btw. (Incidentally, we'd get the latter for free if the nonlocal was "an attribute of some object" -- outer.x = 23 YES, "def outer.x():..." NO. But i'd still feel safer, even with a deuced 'declarative statement', if it could somehow be allowed to rebind nonlocals ONLY with an explicit assignment). So, anyway, the closer to the assignment the reminder, the better, so if it has to be a "declarative statement" I'd rather have it in the inner function than in the outer one. But for reasons very different from that guideline which I don't grasp... (probably just sleepiness and tiredness on my part...). Alex From pje at telecommunity.com Fri Oct 24 17:10:03 2003 From: pje at telecommunity.com (Phillip J. 
Eby)
Date: Fri Oct 24 17:12:05 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <87llrabcac.fsf@egil.codesourcery.com>
References: <200310242217.42857.aleaxit@yahoo.com> <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com>
Message-ID: <5.1.1.6.0.20031024170245.03260160@telecommunity.com>

At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
>class foo:
>    A = 1  # these are class variables
>    B = 2
>    C = 3
>
>    def __init__(self):
>        self.a = 4  # these are instance variables
>        self.b = 5
>        self.c = 6
>
>I find this imperative syntax for declaring instance variables
>profoundly unintuitive. Further, on my first exposure to Python, I
>thought A, B, C were instance variables, although it wasn't hard to
>understand why they aren't.

A, B, and C *are* instance variables. Why do you think they aren't?

>People like to rag on the popularity of __slots__ (for reasons which
>are never clearly spelled out, but never mind) -- has anyone
>considered that it's popular because it's a way of declaring the set
>of instance variables,

What good does declaring the set of instance variables *do*? This seems
to be more of a mental comfort thing than anything else. I've spent
most of my career in declaration-free languages, though, so I really
don't understand why people get so emotional about being able to
declare their variables.

> and there is no other way in the language?

Actually, there are a great many ways to implement such a thing.
One way might be something like:

    class RestrictedVars:
        vars = ()
        def __setattr__(self, attr, value):
            if attr not in self.vars:
                raise AttributeError("No such attribute", attr)
            self.__dict__[attr] = value

    class SomeClass(RestrictedVars):
        vars = 'a', 'b', 'c'

From zack at codesourcery.com Fri Oct 24 17:16:27 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Fri Oct 24 17:16:33 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <5.1.1.6.0.20031024170245.03260160@telecommunity.com> (Phillip J. Eby's message of "Fri, 24 Oct 2003 17:10:03 -0400")
References: <200310242217.42857.aleaxit@yahoo.com> <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <87ad7qbal0.fsf@egil.codesourcery.com>

"Phillip J. Eby" writes:

> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
>>class foo:
>>    A = 1  # these are class variables
>>    B = 2
>>    C = 3
>>
>>    def __init__(self):
>>        self.a = 4  # these are instance variables
>>        self.b = 5
>>        self.c = 6
>>
>>I find this imperative syntax for declaring instance variables
>>profoundly unintuitive. Further, on my first exposure to Python, I
>>thought A, B, C were instance variables, although it wasn't hard to
>>understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

You prove my point! I got it wrong! This is a confusing part of the
language!

> What good does declaring the set of instance variables *do*? This
> seems to be more of a mental comfort thing than anything else. I've
> spent most of my career in declaration-free languages, though, so I
> really don't understand why people get so emotional about being able
> to declare their variables.

Yeah, it's a mental comfort thing. Mental comfort is important.
Having the computer catch your fallible human mistakes is also
important.
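For comparison with the RestrictedVars sketch, a new-style class gets the same mistake-catching declaratively via __slots__ (Point is an invented example, not from the thread):

```python
class Point(object):
    __slots__ = ('x', 'y')    # the declared set of instance variables

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(2, 3)
assert (p.x, p.y) == (2, 3)

try:
    p.z = 5                   # mistyped attribute: caught at assignment
    caught = False
except AttributeError:
    caught = True
assert caught
```

Since slotted instances have no __dict__, any attribute outside the declared set fails at the point of assignment, which is precisely the "computer catches your mistakes" property under discussion.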
zw

From guido at python.org Fri Oct 24 17:32:22 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 17:33:09 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 23:01:16 +0200." <200310242301.16445.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <200310242301.16445.aleaxit@yahoo.com>
Message-ID: <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>

[Guido]
> > It gives outer scopes (some) control over inner scopes. One of the
> > guidelines is that a name defined in an inner scope should always
> > shadow the same name in an outer scope, to allow evolution of the
> > outer scope without affecting local details of inner scope. (IOW if
> > an inner function defines a local variable 'x', the outer scope
> > shouldn't be able to change that.)

[Alex]
> I must be missing something, because I don't understand the value
> of that guideline. I see outer and inner functions as tightly coupled
> anyway; it's not as if they could be developed independently -- not
> even lexically, surely not semantically.

It's the same as the reason why name lookup (whether at compile time or
at run-time) always goes from inner scope to outer. While you and I see
nested functions as small amounts of closely-knit code, some people will
go overboard and write functions hundreds of lines long containing
dozens of inner functions, which may be categorized into several
functional groups. A decision to share a variable 'foo' between one
group of inner functions shouldn't mean that none of the other inner
functions can have a local variable 'foo'.

Anyway, I hope you'll have a look at my reasons for why the compiler
needs to know about rebinding variables in outer scopes from inside an
inner scope.
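As the language stands in 2.3, an inner function can read an outer local but cannot rebind it; the usual workaround for the rebinding being debated here is a mutable cell (a sketch, not proposed syntax):

```python
def counter():
    count = [0]            # a one-element list acts as a shared cell

    def increment():
        count[0] += 1      # mutates the cell; 'count' itself is never rebound
        return count[0]

    return increment

inc = counter()
assert inc() == 1
assert inc() == 2
assert counter()() == 1    # each call to counter() gets its own cell
```

The inner function only ever *reads* the name `count`, so no declaration is needed; all the mutation happens inside the object the name refers to.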
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Oct 24 17:53:56 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Oct 24 17:54:08 2003
Subject: [Python-Dev] 2nd draft of "How Py is Developed" essay
Message-ID: <3F999F74.7040706@ocf.berkeley.edu>

OK, so using the feedback from the first draft I made a few changes.
One is a paragraph on what to do if you want to add or change a file on
a patch item if you are not the original submitter. I also added a
two-sentence conclusion to the whole essay. Lastly, I changed the title
to better reflect how Python is ultimately developed. =)

As before, any comments and corrections are welcome. If you think this
sucker is done, please say so! If I get enough people saying they think
this is good enough to go out to the world I will post it to
python-announce and python-list and then add it to python.org/dev/ .
Then you can all hear me discuss it again at PyCon (assuming it gets
accepted). =)

----------------------------

Guido, Some Guys, and a Mailing List: How Python is Developed
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Introduction
============

Software does not make itself. Code does not spontaneously come from the
ether of the universe. Python_ is no exception to this rule. Since
Python made its public debut back in 1991, people beyond the BDFL
(Benevolent Dictator For Life, `Guido van Rossum`_) have contributed
time and energy to making Python what it is today: a powerful, simple
programming language available to all. But it has not been a random
process of people doing whatever they wanted to Python. Over the years a
process for developing Python has emerged within the group that heads
Python's growth and maintenance, `python-dev`_. This document is an
attempt to write this process down in hopes of lowering any barriers
that might prevent people from contributing to the development of
Python.

.. _Python: http://www.python.org/
..
_Guido van Rossum: http://www.python.org/~guido/
.. _python-dev: http://mail.python.org/mailman/listinfo/python-dev

Tools Used
==========

To help facilitate the development of Python, certain tools are used.
Beyond the obvious ones such as a text editor and email client, two
tools are very pervasive in the development process.

SourceForge_ is used by python-dev to keep track of feature requests,
reported bugs, and contributed patches. A detailed explanation on how to
use SourceForge is covered later in `General SourceForge Guidelines`_.

CVS_ is a networked file versioning system that stores all of the files
that make up Python. It allows the developers to have a single
repository for the files along with being able to keep track of any and
all changes to every file. The basic commands and uses can be found in
the `dev FAQ`_ along with a multitude of tutorials spread across the
web.

.. _SourceForge: http://sourceforge.net/projects/python/
.. _CVS: http://www.cvshome.org/
.. _dev FAQ: http://www.python.org/dev/devfaq.html

Communicating
=============

Python development is not just programming. It requires a great deal of
communication between people. This communication is not just between the
members of python-dev; communication within the greater Python community
also helps with development. Several mailing lists and newsgroups are
used to help organize all of these discussions.

In terms of Python development, the primary location for communication
is the `python-dev`_ mailing list. This is where the members of
python-dev hash out ideas and iron out issues. It is an open list;
anyone can subscribe to the mailing list. While the discussion can get
quite technical, it is not at all out of the reach of a novice and thus
should not discourage anyone from joining the list. Please realize,
though, this list is **only** for the discussion of the development of
Python; all other questions should be directed somewhere else, such as
`python-list`_.
When the greater Python community is involved in a discussion, it always
ends up on `python-list`_. This mailing list is a gateway to the
newsgroup `comp.lang.python`_. This is also a good place to go when you
have a question about Python that does not pertain to the actual
development of the language.

Using CVS_ allows the development team to know who made a change to a
file and when they made their change. But unless one wants to
continuously update their local checkout of the repository, the best way
to stay on top of changes to the repository is to subscribe to
`Python-checkins`_. This list sends out an email for each and every
change to a file in Python. This list can generate a large amount of
traffic since even changing a typo in some text will trigger an email to
be sent out. But if you wish to be kept abreast of all changes to Python
then this is a good way to do so.

The Patches_ mailing list sends out an email for all changes to patch
items on SourceForge_. This list, just like Python-checkins, can
generate a large amount of email traffic. It is in general useful to
people who wish to help out with the development of Python by knowing
about all new submitted patches as well as any new developments on
preexisting ones.

`Python-bugs-list`_ functions much like the Patches mailing list except
it is for bug items on SourceForge. If you find yourself wanting to help
to close and remove bugs in Python this is the right list to subscribe
to if you can handle the volume of email.

.. _python-list: http://mail.python.org/mailman/listinfo/python-list
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _Patches: http://mail.python.org/mailman/listinfo/patches
..
_Python-bugs-list: http://mail.python.org/mailman/listinfo/python-bugs-list

The Actual Development
======================

Developing Python is not all just conversations about neat new language
features (although those neat conversations do come up and there is a
process to it). Developing Python also involves maintaining it by
eliminating discovered bugs, adding and changing features, and various
other jobs that are not necessarily glamorous but are just as important
to the language as anything else.

General SourceForge Guidelines
------------------------------

Since a good amount of Python development involves using SourceForge_,
it is important to follow some guidelines when handling a tracker item
(bug, patch, etc.). Probably one of the most important things you can do
is make sure to set the various options in a new tracker item properly.
The submitter should make sure that the Data Type, Category, and Group
are all set to reasonable values. The remaining values (Assigned To,
Status, and Resolution) should in general be left to Python developers
to set. The exception to this rule is when you want to retract a patch;
then "close" the patch by setting Status to "closed" and Resolution to
whatever is appropriate.

Do a cursory check to make sure whatever you are submitting was not
previously submitted by someone else. Duplication just uses up valuable
time.

And **please** do not post feature requests, bug reports, or patches to
the python-dev mailing list. If you do you will be instructed to create
an appropriate SourceForge tracker item. When in doubt as to whether you
should bring something to python-dev's attention, you can always ask on
`comp.lang.python`_; Python developers actively participate there and
will move the conversation over if it is deemed reasonable.

Feature Requests
----------------

`Feature requests`_ are for features that you wish Python had but you
have no plans on actually implementing by writing a patch.
On occasion people do go through the feature requests (also called RFEs
on SourceForge) to see if there is anything there that they think should
be implemented and actually do the implementation. But in general do not
expect something put here to be implemented without some participation
on your part.

The best way to get something implemented is to campaign for it in the
greater Python community. `comp.lang.python`_ is the best place to
accomplish this. Post to the newsgroup with your idea and see if you can
either get support or convince someone to implement it. It might even
end up being added to `PEP 42`_ so that the idea does not get lost in
the noise as time passes.

.. _feature requests: http://sourceforge.net/tracker/?group_id=5470&atid=355470
.. _PEP 42: http://www.python.org/peps/pep-0042.html

Bug Reports
-----------

Think you found a bug? Then submit a `bug report`_ on SourceForge. Make
sure you clearly specify what version of Python you are using, what OS,
and under what conditions the bug was triggered. The more information
you can give the faster the bug can be fixed since time will not be
wasted requesting more information from you.

.. _bug report: http://sourceforge.net/tracker/?group_id=5470&atid=105470

Patches
-------

Create a patch_ tracker item on SourceForge for any code you think
should be applied to the Python CVS tree. For practically any change to
Python's functionality the documentation and testing suite will need to
be changed as well. Doing this in the first place speeds things up
considerably.

Please make sure your patch is against the CVS repository. If you don't
know how to use it (basics are covered in the `dev FAQ`_), then make
sure you specify what version of Python you made your patch against.

In terms of coding standards, `PEP 8`_ covers Python code while
`PEP 7`_ covers C code. Always try to maximize your code reuse; it makes
maintenance much easier. For C code make sure to limit yourself to ANSI
C code as much as possible.
If you must use non-ANSI C code then see if what you need is checked for
by looking in pyconfig.h. You can also look in Include/pyport.h for more
helpful C code. If what you need is still not there but it is in general
available, then add a check in configure.in for it (don't forget to run
autoreconf for the changes to take effect). And if that *still* doesn't
fit your needs then code up a solution yourself. The reason for all of
this is to limit the dependence on external code that might not be
available for all OSs that Python runs on.

Be aware of intellectual property when handling patches. Any code with
no copyright will fall under the copyright of the `Python Software
Foundation`_. If you have no qualms with that, wonderful; this is the
best solution for Python. But if you feel the need to include a
copyright then make sure that it is compatible with the copyright used
on Python (i.e., BSD-style). The best solution, though, is to sign the
copyright over to the Python Software Foundation.

.. _patch: http://sourceforge.net/tracker/?group_id=5470&atid=305470
.. _dev FAQ: http://www.python.org/dev/devfaq.html
.. _PEP 7: http://www.python.org/peps/pep-0007.html
.. _PEP 8: http://www.python.org/peps/pep-0008.html
.. _Python Software Foundation: http://www.python.org/psf/

Changing the Language
=====================

You understand how to file a patch. You think you have a great idea on
how Python should change. You are ready to write code for your change.
Great, but you need to realize that certain things must be done for a
change to be accepted. Changes fall into two categories: changes to the
standard library (referred to as the "stdlib") and changes to the
language proper.

Changes to the stdlib
---------------------

Changes to the stdlib can consist of adding functionality or changing
existing functionality.
Adding minor functionality (such as a new function or method) requires
convincing a member of python-dev that the addition of code caused by
implementing the feature is worth it. A big addition such as a module
tends to require more support than just a single member of python-dev.
As always, getting community support for your addition is a good idea.
With all additions, make sure to write up documentation for your new
functionality. Also make sure that proper tests are added to the testing
suite.

If you want to add a module, be prepared to be called upon for any bug
fixes or feature requests for that module. Getting a module added to the
stdlib makes you by default its maintainer. If you can't take that level
of responsibility and commitment and cannot get someone else to take it
on for you then your battle will be very difficult; when there is no
specific maintainer of code python-dev takes responsibility, and thus
your code must be useful to them or else they will reject the module.

Changing existing functionality can be difficult to do if it breaks
backwards-compatibility. If your code will break existing code, you must
provide a legitimate reason why making the code act in a non-compatible
way is better than the status quo. This requires python-dev as a whole
to agree to the change.

Changing the Language Proper
----------------------------

Changing Python the language is taken **very** seriously. Python is
often heralded for its simplicity and cleanliness. Any additions to the
language must continue this tradition and view. Thus any changes must go
through a long process.

First, you must write a PEP_ (Python Enhancement Proposal). This is
basically just a document that explains what you want, why you want it,
what could be bad about the change, and how you plan on implementing the
change. It is best to get feedback on PEPs on `comp.lang.python`_ and
from python-dev.
Once you feel the document is ready, you can request a PEP number and
have it added to the official list of PEPs in `PEP 0`_.

Once you have a PEP, you must then convince python-dev and the BDFL that
your change is worth it. Expect to be bombarded with questions and
counter-arguments. It can drag on for over a month, easy. If you are not
up for that level of discussion then do not bother with trying to get
your change in.

If you manage to convince a majority of python-dev and the BDFL (or most
of python-dev; that can lead to the BDFL changing his mind) then your
change can be applied. As with all new code, make sure you also have
appropriate documentation patches along with tests for the new
functionality.

.. _PEP: http://www.python.org/peps/pep-0001.html
.. _PEP 0: http://www.python.org/peps/pep-0000.html

Helping Out
===========

Many people say they wish they could help out with the development of
Python but feel they are not up to writing code. There are plenty of
things one can do, though, that do not require you to write code.
Regardless of your coding abilities, there is something for everyone to
help with.

For feature requests, adding a comment about what you think is helpful.
State whether or not you would like to see the feature. You can also
volunteer to write the code to implement the feature if you feel up to
it.

For bugs, stating whether or not you can reproduce the bug yourself can
be extremely helpful. If you can write a fix for the bug that is very
helpful as well; start a patch item and reference it in a comment in the
bug item.

For patches, apply the patch and run the testing suite. You can do a
code review on the patch to make sure that it is good, clean code. If
the patch adds a new feature, comment on whether you think it is worth
adding. If it changes functionality then comment on whether you think it
might break code; if it does, say whether you think it is worth the cost
of breaking existing code.
Help add to the patch if it is missing documentation patches or needed
regression tests.

A special mention about adding a file to a tracker item. Only official
developers and the creator of the tracker item can add a file. This
means that if you want to add a file and you are neither of the types of
people just mentioned you have to do an extra step or two. One thing you
can do is post the file you want added somewhere else online and
reference the URL in a comment. You can also create a new patch item if
you feel the change is thorough enough and cross-reference between both
patches in the comments. Be wary of this last option, though, since some
people might be offended; it might come off as if you think their code
is bad and yours is better. The best solution of all is to work with the
original poster if they are receptive to help. But if they do not
respond or are not friendly then go ahead with one of the other two
suggestions.

For language changes, make your voice heard. Comment about any PEPs on
`comp.lang.python`_ so that the general opinion of the community can be
assessed.

If there is nothing specific you find you want to work on but still feel
like contributing nonetheless, there are several things you can do. The
documentation can always use fleshing out. Adding more tests to the
testing suite is always useful. Contribute to discussions on python-dev
or `comp.lang.python`_. Just helping out in the community by spreading
the word about Python or helping someone with a question is helpful.

If you really want to get knee-deep in all of this, join python-dev.
Once you have been actively participating for a while and are generally
known on python-dev you can request to have checkin rights on the CVS
tree. It is a great way to learn how to work in a large, distributed
group along with how to write great code.
And if all else fails, give money; the `Python Software Foundation`_ is
a non-profit organization that accepts donations that are tax-deductible
in the United States. The funds are used for various things, from
lawyers for handling the intellectual property of Python to funding
PyCon_. But the PSF could do a lot more if they had the funds. One goal
is to have enough money to fund having Guido work on Python for a full
year full-time; this would bring about Python 3. Every dollar does help,
so please contribute if you can.

.. _PyCon: http://www.python.org/pycon/

Conclusion
==========

If you get any message from this document, it should be that *anyone*
can help with the development of Python. All help is greatly appreciated
and keeps the language the wonderful piece of software that it is.

From gward-work at python.net Fri Oct 24 18:07:03 2003
From: gward-work at python.net (Greg Ward)
Date: Fri Oct 24 18:07:10 2003
Subject: [Python-Dev] Kernel panic writing to /dev/dsp with cmpci driver
Message-ID: <20031024220703.GA2267@intelerad.com>

[cc'ing python-dev because there might be something funny in the
ossaudiodev module -- but some of you already know that!]

I've just upgraded to Linux 2.4.23-pre8 + RML's preemptible kernel
patch, and I have a pretty reproducible panic when writing to /dev/dsp.
Here's what lspci reports about the sound hardware:

  02:03.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)

I'm using the cmpci driver. Oddly, the panic only happens when using
Python 2.3's ossaudiodev module, which is a fairly thin wrapper around
the OSS API.
Here's a script that crashes my machine every time:

"""
#!/usr/bin/python2.3
import sys
import ossaudiodev

random = open("/dev/urandom", "r")
dsp = ossaudiodev.open("w")
while 1:
    sys.stdout.write("."); sys.stdout.flush()
    dsp.write(random.read(4096))
"""

(I'm quite sure that the panic has nothing to do with /dev/urandom, since I discovered it by playing Ogg Vorbis files, not by playing white noise.) The crash happens after about 10-12 dots have appeared, i.e. 10-12 4k blocks have been written.

Here's a C version of that script that does *not* crash my system:

"""
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

#define BUF_SIZE 4096

int main(int argc, char ** argv)
{
    int nbytes;
    char data[BUF_SIZE];
    int source, dsp;                /* input, output FDs */

    source = open("/dev/urandom", O_RDONLY);
    dsp = open("/dev/dsp", O_WRONLY);
    printf("source fd=%d, dsp fd=%d\n", source, dsp);
    while (1) {
        printf("."); fflush(stdout);
        nbytes = read(source, data, BUF_SIZE);
        write(dsp, data, nbytes);
    }
}
"""

Just wondering if anyone else has seen something like this in 2.4.23-pre8, either with or without the preemptible kernel patch. I'm going to try backing out that patch to see if the problem persists; if so, I'll report back here with more details on the panic. Oh yeah, this is a Red Hat 9 system -- the sound driver worked perfectly with Red Hat's 2.4.20-20.9 kernel (which, from the source RPM, appears to be 2.4.21-pre3 plus a bunch of Red Hat patches).

Greg

From gward at intelerad.com Fri Oct 24 18:58:41 2003
From: gward at intelerad.com (Greg Ward)
Date: Fri Oct 24 18:58:47 2003
Subject: [Python-Dev] Re: Kernel panic writing to /dev/dsp with cmpci driver
In-Reply-To: <20031024220703.GA2267@intelerad.com>
References: <20031024220703.GA2267@intelerad.com>
Message-ID: <20031024225841.GA1915@intelerad.com>

On 24 October 2003, I said:
> I'm going to try backing out that patch to see if the problem
> persists; if so, I'll report back here with more details on the panic.
OK, I tried it with a vanilla 2.4.23-pre8. The panic is still there, and now I can reproduce it with my C program. (However, I had to run it twice. I'm guessing that if I had run it twice under the preemptible kernel, it would have crashed then too.) So it looks like this is definitely a kernel bug, the Python ossaudiodev driver is not doing anything too perverse, and RML's preemptible kernel patch is not to blame. So here's the ksymoops output:

"""
ksymoops 2.4.9 on i686 2.4.23-pre8-gw2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.23-pre8-gw2/ (default)
     -m /boot/System.map-2.4.23-pre8-gw2 (specified)

c01124d3
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010013
eax: 72756f2e   ebx: 40017000   ecx: 00000000   edx: 72756f2e
esi: df52c4ac   edi: 00000003   ebp: dd9add58   esp: dd9add3c
ds: 0018   es: 0018   ss: 0018
Process crasher (pid: 1846, stackpage=dd9ad000)
Stack:
Call trace: [] [] [] [] [] [] []
Code: 8b 01 85 c6 75 19 8b 02 89 d3 89 c2 0f 18 00 39 f3 75 ea ff

>>EIP; c01124d3 <__wake_up+33/80>   <=====
>>esi; df52c4ac <_end+1f215128/204fecdc>
>>ebp; dd9add58 <_end+1d6969d4/204fecdc>
>>esp; dd9add3c <_end+1d6969b8/204fecdc>

Trace; c0108945
Trace; c0108ac4
Trace; c010b168
Trace; c01a38dc
Trace; c01a3bbe
Trace; c01360b3
Trace; c010740f

Code;  c01124d3 <__wake_up+33/80>
00000000 <_EIP>:
Code;  c01124d3 <__wake_up+33/80>   <=====
   0:   8b 01          mov    (%ecx),%eax   <=====
Code;  c01124d5 <__wake_up+35/80>
   2:   85 c6          test   %eax,%esi
Code;  c01124d7 <__wake_up+37/80>
   4:   75 19          jne    1f <_EIP+0x1f>
Code;  c01124d9 <__wake_up+39/80>
   6:   8b 02          mov    (%edx),%eax
Code;  c01124db <__wake_up+3b/80>
   8:   89 d3          mov    %edx,%ebx
Code;  c01124dd <__wake_up+3d/80>
   a:   89 c2          mov    %eax,%edx
Code;  c01124df <__wake_up+3f/80>
   c:   0f 18 00       prefetchnta (%eax)
Code;  c01124e2 <__wake_up+42/80>
   f:   39 f3          cmp    %esi,%ebx
Code;  c01124e4 <__wake_up+44/80>
  11:   75 ea          jne    fffffffd <_EIP+0xfffffffd>
Code;  c01124e6
<__wake_up+46/80>
  13:   ff 00          incl   (%eax)
"""

(Err, the "-gw2" version number is a red herring -- this really is an unpatched 2.4.23-pre8, I swear!)

Is that enough info for a real kernel hacker to track this down? I'm not very experienced with kernel panics, so I'm not sure if this is all you need. Let me know if I can provide more info.

Greg

From tjreedy at udel.edu Fri Oct 24 20:14:32 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 20:14:40 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <200310242217.42857.aleaxit@yahoo.com><200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz><200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com><87brs7egju.fsf@egil.codesourcery.com><200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: 

"Phillip J. Eby" wrote in message news:5.1.1.6.0.20031024170245.03260160@telecommunity.com...
> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
> >class foo:
> >    A = 1  # these are class variables
> >    B = 2
> >    C = 3
> >
> >    def __init__(self):
> >        self.a = 4  # these are instance variables
> >        self.b = 5
> >        self.c = 6
> >
> >I find this imperative syntax for declaring instance variables
> >profoundly unintuitive. Further, on my first exposure to Python, I
> >thought A, B, C were instance variables, although it wasn't hard to
> >understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

What? They are class attributes that live in the class dictionary, not the instance dictionary. They can be directly accessed as foo.A, etc., while foo.a, etc., don't work. While they *may* serve as default or backup same-for-all-instances values for when there is no instance-specific value of the same name, that's not the same thing, which is why they are defined differently. And a class attribute like number_of_instances would, conceptually, only be a class variable. Let's not confuse Zack further.
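[Terry's distinction shows up directly in the two dictionaries; a minimal runnable sketch -- the class and attribute names here are illustrative only, not from the thread:]

```python
class Foo(object):
    A = 1                # class attribute: stored in Foo.__dict__

    def __init__(self):
        self.a = 4       # instance attribute: stored in self.__dict__

f = Foo()
assert 'A' in Foo.__dict__ and 'A' not in f.__dict__
assert 'a' in f.__dict__ and 'a' not in Foo.__dict__
assert f.A == 1          # lookup falls back from instance to class
Foo.A = 2                # rebinding on the class is seen by all instances
assert f.A == 2
```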
Terry J. Reedy

From eppstein at ics.uci.edu Fri Oct 24 20:31:12 2003
From: eppstein at ics.uci.edu (David Eppstein)
Date: Fri Oct 24 20:31:20 2003
Subject: [Python-Dev] Re: closure semantics
References: <200310242217.42857.aleaxit@yahoo.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: 

In article , "Terry Reedy" wrote:

> > A, B, and C *are* instance variables. Why do you think they aren't?
>
> What? They are class attributes that live in the class dictionary,
> not the instance dictionary.

They are instance variables on the class object, which is an instance of type 'class'.

-- 
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

From bac at OCF.Berkeley.EDU Fri Oct 24 21:16:41 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Oct 24 21:16:58 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: 
References: <200310242217.42857.aleaxit@yahoo.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <3F99CEF9.5040304@ocf.berkeley.edu>

David Eppstein wrote:
> In article ,
> "Terry Reedy" wrote:
>
>>> A, B, and C *are* instance variables. Why do you think they aren't?
>>
>> What? They are class attributes that live in the class dictionary,
>> not the instance dictionary.
>
> They are instance variables on the class object, which is an instance of
> type 'class'.

I think the confusion that is brewing here is how Python masks class attributes when you do an assignment on an instance::

>>> class foo(object):
...     A = 42
...
[12213 refs]
>>> bar = foo()
[12218 refs]
>>> bar.A
42
[12220 refs]
>>> bar.A = 13
[12223 refs]
>>> foo.A
42
[12223 refs]
>>> bar.A
13

Python's resolution order checks the instance first and then the class (this is ignoring a data descriptor somewhere in this chain; for the details read Raymond's essay on descriptors @ http://users.rcn.com/python/download/Descriptor.htm#invoking-descriptors ).

-Brett

From tim_one at email.msn.com Fri Oct 24 23:25:48 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Fri Oct 24 23:26:17 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: 

[Tim]
> squares = (f(x)**2 for x in inputs)  # assuming reiterability here
> ...
> for f in math.sin, math.cos, math.tan:
>     plot(squares)

[Skip Montanaro]
> How much more expensive

Stop right there. I must have been unclear. The only point of the example was semantic, not cost: even if generator expressions used closure semantics, the example *still* wouldn't work the way it appears to read, because generator expressions aren't reiterable. What the example would do under closure semantics:

1. Plot the square of math.sin(x), for each x in inputs.

then

2. Probably nothing more than that. The "squares" GE is exhausted after #1 completes, and no matter how often it's run again it's simply going to raise StopIteration at once each time it's tried. A reasonable plot() would probably do nothing when fed an exhausted iterable, but maybe it would raise an exception. That's up to plot(). What it *won't* do under any scheme here is go on to plot the squares of math.cos(x) and math.tan(x) over the inputs too.

The lack of reiterability (which is fine by me!) thus seems to make a plausible use for closure semantics hard to imagine. The example was one where closure semantics suck despite misleading appearance.
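[Tim's exhaustion point is easy to check in a few lines -- a minimal sketch, with arbitrary sample inputs:]

```python
import math

inputs = [0.0, 0.5, 1.0]
squares = (math.sin(x) ** 2 for x in inputs)

first_pass = list(squares)    # consumes the generator expression
second_pass = list(squares)   # already exhausted: StopIteration at once

assert len(first_pass) == 3
assert second_pass == []      # nothing left for a second plot() call
```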
Closures are very often used (in languages other than Python, and in Python too by people who haven't yet learned to write Python <0.9 wink>) to hold mutable state in outer scopes, for the use of functions in inner scopes, very much like an instance's data attributes hold mutable state for the use of methods defined in the instance's class. In those common cases, the power comes from being able to run functions (methods) more than once, or to reuse the mutable state among functions (methods). But generator expressions are always one-shot computations (you get to run a GE to completion no more than once). There may be some use for closure semantics in a collection of GEs that reference each other (similar to instance data being visible to multiple methods), but so far I've failed to dream up a plausible case of that nature either.

> would this be than
>
>     for f in math.sin, math.cos, math.tan:
>         squares = (f(x)**2 for x in inputs)
>         plot(squares)

Despite the similar appearance, that does something very different, plotting all 3 functions (not just math.sin), and regardless of whether closure or capture semantics are used. I expect the body of the loop in real life would be a one-liner, though:

    plot(f(x)**2 for x in inputs)

> which would work without reiterability, right?

Yup.

> The underlying generator function could still be created at compile-time
> and it (or its code object?) stored in the current function's constants.
> 'f' is simply an argument to it when the iterator is instantiated.

Guido expanded on that already. The code is compiled only once (at "compile time"), and there's a small runtime cost per outer-loop iteration to build a function object from the (pre-compiled) code object, and a possibly larger runtime cost per outer-loop iteration to start the GE.
Passing 'f' and 'inputs' may be part of either of those costs, depending on how it's implemented -- but giving the synthesized generator function some initialized locals is the least of the runtime costs.

From aleaxit at yahoo.com Sat Oct 25 03:21:40 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 03:21:48 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
References: <200310242217.42857.aleaxit@yahoo.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <200310250921.40413.aleaxit@yahoo.com>

On Friday 24 October 2003 23:10, Phillip J. Eby wrote:
> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
> >class foo:
> >    A = 1  # these are class variables
> >    B = 2
> >    C = 3
...
> >thought A, B, C were instance variables, although it wasn't hard to
> >understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

They're _accessible AS_ instance attributes (self.B will be 2 in a method), but they have the same value in all instances and to _rebind_ them you need to do so on the class object (you can bind an instance variable with the same name to shadow each and any of them, of course).

> What good does declaring the set of instance variables *do*? This seems

It decreases productivity -- that's the empirical result of Prechelt's study and the feeling of people who have ample experience with both kinds of language (cfr Robert Martin's well-known blog for an authoritative one, but my own experience is quite similar). If you subscribe to the popular fallacy known as "lump of labour" -- there is a total fixed amount of work that needs to be done -- it would follow that diminishing productivity increases the number of jobs available. Any economist would be appalled, of course, but, what do THEY know?-)

> to be more of a mental comfort thing than anything else.
I've spent most > of my career in declaration-free languages, though, so I really don't > understand why people get so emotional about being able to declare their > variables. Most of MY work has been with mandatory-declaration languages, and my theory is that a "Stockholm Syndrome" is in effect (google for a few tens of thousands of explanations of that syndrome). > > and there is no other way in the language? > > Actually, there are a great many ways to implement such a thing. One way For instance variables, yes. Fewer for class variables (you need a custom metaclass). None for module variables (also misleadingly known as 'global' ones) nor for local variables. Alex From aleaxit at yahoo.com Sat Oct 25 03:43:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 03:43:40 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <87llrabcac.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> Message-ID: <200310250943.34682.aleaxit@yahoo.com> On Friday 24 October 2003 22:39, Zack Weinberg wrote: ... > > There is absolutely no help (not one minute later, not six months > > later) "comprehending" the program just because some silly language > > mandates redundancy, such as a noiseword 'classvar' in front of the > > assignments. > > Understand that I do almost all my programming in typed languages, > where that keyword isn't noise, it's a critical part of the declaration. I have a vast experience of typed languages, and, even there, the mandatory redundancy of declarations is just a cop-out. A _well-designed_ strictly typed language, such as Haskell or ML, lets the compiler infer all types, so you don't _have_ to provide declarations -- you can, much in the spirit as you can use assert in Python, but you need not. > I think there really *is* a language deficiency with regard to > declaring class versus instance variables. 
I don't: there is no declaration at all (save for the accursed 'global'), only _statements_. They DO things, and what they do is simple and obvious.

> I find this imperative syntax for declaring instance variables
> profoundly unintuitive. Further, on my first exposure to Python, I

That's because you keep thinking of "declaring". Don't. There is no such thing. There is documenting (docstrings, comments) and actions. Period. Entities must not be multiplied beyond need: we don't NEED enforced redundancy. We don't WANT it: if we did, we could choose among a huge host of languages imposing it in a myriad of ways -- but we've chosen Python exactly BECAUSE it has no such redundancy. When I write in some scope

    x = 1

I am saying: x is a name in this scope and it refers to value 1. I have said all that is needed by either the compiler, or a reader who knows the language, to understand everything perfectly. Forcing me to say AGAIN "and oh by the way x is REALLY a name in this scope, I wasn't kidding, honest" is abhorrent. If you really like that why stop at ONE forced useless redundancy? Why not force me to provide a THIRD redundant "I really REALLY truly mean it, please DO believe me!!!", or a fourth one, or...? *ONCE, AND ONLY ONCE*. A key principle of agile programming.

> thought A, B, C were instance variables, although it wasn't hard to
> understand why they aren't.

Reducing the productivity of all language users to (perhaps) help a few who hadn't yet understood one "not hard to understand" detail would be a disastrous trade-off.

> People like to rag on the popularity of __slots__ (for reasons which
> are never clearly spelled out, but never mind) -- has anyone
> considered that it's popular because it's a way of declaring the set
> of instance variables, and there is no other way in the language?

Yes, or more precisely, at least it looks that way, and it's efficient (saves some per-instance memory).
Much the same way as "type(x) is int" looks like a way to "declare a type" and so does isinstance(x, int) later on in one's study of the language (though no saving accrues there). But then, "Extraordinary Popular Delusions and the Madness of Crowds" IS quite deservedly a best-seller for the last 160+ years. Fortunately, Python need not pander to such madness and delusions, however popular:-). Alex From aleaxit at yahoo.com Sat Oct 25 04:07:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 04:07:51 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> Message-ID: <200310251007.36871.aleaxit@yahoo.com> On Friday 24 October 2003 23:32, Guido van Rossum wrote: ... > or at run-time) always goes from inner scope to outer. While you and > I see nested functions as small amounts of closely-knit code, some > people will go overboard and write functions of hundred lines long > containing dozens of inner functions, which may be categorized into This doesn't look like a legitimate use case to me; i.e., I see no need to distort the language if the benefit goes to such "way overboard" uses. I think they will have serious maintainability problems anyway. Fortunately, I don't think of placing the "indication to the compiler" as close to the assignment-to-outer-variable as a distortion;-) > Anyway, I hope you'll have a look at my reasons for why the compiler > needs to know about rebinding variables in outer scopes from inside > an inner scope. Sure! I do understand this. What I don't understand is why, syntactically, the reserved word that indicates this to the compiler should have to be a "statement that does nothing" -- the ONLY "declaration" in the language -- rather than e.g. an _operator_ which specifically flags such uses. 
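[The compile-time nature of the decision Guido describes is easy to demonstrate -- a minimal sketch with illustrative names; in the Python of this thread an inner function simply cannot rebind an outer local (the `nonlocal` statement that later addressed this came with PEP 3104):]

```python
def outer():
    x = 0

    def inner():
        # Reading x is fine: the compiler sees no assignment to x in
        # inner, so x resolves to the enclosing scope.
        return x + 1

    def rebinder():
        # An assignment ANYWHERE in the body makes x local to rebinder
        # for the whole function -- so reading it here raises
        # UnboundLocalError. This is why the compiler must be told,
        # somehow, when a rebinding is meant for an outer scope.
        x = x + 1
        return x

    ok = inner()
    try:
        rebinder()
        failed = False
    except UnboundLocalError:
        failed = True
    return ok, failed

assert outer() == (1, True)
```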
Assume for the sake of argument that we could make 'scope' a reserved word. Now, what are the tradeoffs of using a "declaration"

    scope x in outer

which makes all rebindings of x act in the scope of containing function outer (including 'def x():', 'class x:', 'import x', ...); versus an "operator" that must be used to indicate "which x" when specifically assigning it (no "side effect rebinding" via def &c allowed -- I think it helps the reader of code a LOT to require specific assignment!), e.g.

    scope(outer).x = 23

Don't think of scope as a built-in function, but as a keyword in either case (and we could surely have other syntax for the "scope operator", e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE where x is being assigned). So the compiler can catch on to the info just as effectively. The tradeoffs are:

-- we can keep thinking of Python as declaration-free and by gradually deprecating the global statement make it more so

-- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations

-- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused

-- no casual rebinding of nonlocals via def, class, import

-- once we solve the general problem of allowing non-bare-names as iteration variables in 'for', nonlocals benefit from that resolution automatically, since nonlocals are never assigned-to as bare-names

I see this as the pluses. The minus is, we need a new keyword; but I think we do, because stretching 'global' to mean something that ISN'T global in any sense is such a hack. Cutting both ways is the fact that this allows using the same name from more than one scope (since each use is explicitly qualified as coming from a specific scope).
That's irrelevant for small compact uses of nesting, but it may be seen as offering aid and succour to those wanting to "go overboard" as you detail earlier (bad); OTOH, if I ever need to maintain such "overboard" code written by others, and refactoring it is not feasible right now, it may be helpful. In any case, the confusion is reduced by having the explicit qualification on assignment. Similarly for _accesses_ rather than rebindings -- access to the barename will keep using the same rules as today, of course, but I think the same syntax that MUST be used to assign nonlocals should also be optionally usable to access them -- not important either way in small compact functions, but more regular and offering a way to make code non-ambiguous in large ones. I don't see having two ways to access a name -- barename x or qualified scope(foo).x -- as a problem, just like today from inside a method we may access a classvariable as "self.x" OR "self.__class__.x" indifferently -- the second form is needed for rebinding and may be chosen for clarity in some cases where the first simpler ("barer") one would suffice. Alex From aleaxit at yahoo.com Sat Oct 25 04:44:18 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 04:44:25 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: <16279.60447.29714.759275@montanaro.dyndns.org> References: <16279.60447.29714.759275@montanaro.dyndns.org> Message-ID: <200310251044.18365.aleaxit@yahoo.com> On Thursday 23 October 2003 16:56, Skip Montanaro wrote: > fred> - add "Why is Python installed on my computer?" as a > documentation fred> FAQ since this gets asked at the docs at python.org > address a fred> lot > > And I thought only webmaster@python.org got asked that question all the > time. Does it get asked at other addresses as well? I don't recall ever > seeing it on python-list. It's quite common on help@python.org too. 
People who ask it probably don't know enough to post to the ng/python-list, but look for the simplest way to ask. Having a FAQ to point them to will be helpful, anyway.

Alex

From aleaxit at yahoo.com Sat Oct 25 04:58:05 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 04:58:12 2003
Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: 
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
Message-ID: <200310251058.05704.aleaxit@yahoo.com>

On Thursday 23 October 2003 07:51, Terry Reedy wrote:
...
> So I really *don't* need global. Perhaps a new builtin
>
> def me():
>     import sys
>     return sys.modules[__name__]

Or, we can make the _compiler_ aware of what is going on (and get just the same semantics as global) by accepting either a non-statement keyword (scope, as I suggested elsewhere) or a magicname for import, e.g.

    import __me__

as Barry suggested. Then __me__.x=23 can have just the same semantics as today "x=23" has if there is some "global x" somewhere around, and indeed it could be compiled into the same bytecode if __me__ was sufficiently special to the compiler. [[ If __me__ was assigned to other objects, subjected to setattr, etc, it would lose all special powers, and become restricted to whatever restrictions may apply now or in the future to "setting stuff in other modules". ]]

We'd get more clarity _for human readers_ by thus flagging every assignment-to-module-level-name *in the very spot it's happening* and avoiding the inappropriate term "global" -- to the compiler it's all the same, but humans are important, too.
Alex

From aleaxit at yahoo.com Sat Oct 25 05:32:04 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 05:32:10 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
Message-ID: <200310251132.04686.aleaxit@yahoo.com>

On Thursday 23 October 2003 07:43, Guido van Rossum wrote:
...
>     <accumulator> = <initial value>
>     for <variable> in <iterable>:
>         <accumulator> = <expression>
...
> Concluding, I think the reduce() pattern is doomed -- the template is
> too complex to capture in special syntax.

I concur, particularly because the assignment in the pattern sketched above is too limiting. You point out that forcing augmented assignment would lose power (e.g., Horner's polynomials need bare assignment), but the inability to use it would imply inefficiencies -- e.g.,

    flatlist = []
    for sublist in listoflists:
        flatlist += sublist

or flatlist.extend(sublist) is better than forcing a "flatlist = flatlist + sublist" as the loop body. Indeed, that's a spot where even 'sum' can be a performance trap; consider the following z.py:

    lol = [ [x] for x in range(1000) ]

    def flat1(lol=lol):
        return sum(lol, [])

    def flat2(lol=lol):
        result = []
        for sub in lol: result.extend(sub)
        return result

and the measurements:

    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
    100 loops, best of 3: 8.5e+03 usec per loop

    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
    1000 loops, best of 3: 940 usec per loop

sum looks cooler, but it can be an order of magnitude slower than the humble loop of result.extend calls. We could fix this specific performance trap by specialcasing in sum those cases where the result has a += method -- hmmm... would a patch for this performance bug be accepted for 2.3.* ...?
(I understand and approve that we're keen on avoiding adding functionality in any 2.3.*, but fixed-functionality performance enhancements should be just as ok as fixes to functionality bugs, right?)

Anyway, there's a zillion other cases that sum doesn't cover (well, unless we extend dict to have a += synonym for update, which might be polymorphically useful:-), such as

    totaldict = {}
    for subdict in listofdicts:
        totaldict.update(subdict)

Indeed, given the number of "modify in place and return None" methods of both built-in and user-coded types, I think the variant of "accumulation pattern" which simply calls such a method on each item of an iterator is about as prevalent as the variant with assignment "result = ..." as the loop body.

Alex

From aleaxit at yahoo.com Sat Oct 25 05:35:01 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 05:35:46 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com>
References: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com>
Message-ID: <200310251135.01779.aleaxit@yahoo.com>

On Thursday 23 October 2003 06:25, Guido van Rossum wrote:
...
> it's because it feels very strongly like a directive to the compiler
> -- Python's compiler likes to stay out of the way and not need help.

*YES*!!! So, what about that 'declarative statement' g****l, hmm...?-)

Alex

From aleaxit at yahoo.com Sat Oct 25 06:32:55 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 06:33:00 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
Message-ID: <200310251232.55044.aleaxit@yahoo.com>

I guess it had been a while since I ran 'make test' on the 2.4 cvs... can't find this bug in the bugs db and I'd just like a quick sanity check (if the bug's already there or if I'm doing something weird) before I add it.
Linux Mandrake 9.1, gcc 3.2.2, Berkeley DB 4.1.25 installed in /usr/local/BerkeleyDB.4.1 -- "make test" runs fine all the way to test_bsddb and blocks there. Digging further shows it runs fine all the way to test_pop and blocks specifically in test_popitem. Digging yet further with print and printf shows that when trying to delete the first key ('e') it gets all the way to entering the call

    err = self->db->del(self->db, txn, key, 0);

in _DB_delete -- and never gets out of that call. Ctrl-C does nothing; I have to Ctrl-Z then kill %1 to get out. Previous deletes done in the course of the unit-test give no problems (e.g., test_clear also starts by deleting that 'e' key and just works fine). So, I'm nonplussed...

Alex

From aleaxit at yahoo.com Sat Oct 25 07:52:13 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 07:52:18 2003
Subject: [Python-Dev] tests expecting but not finding errors due to bug fixes
Message-ID: <200310251352.13266.aleaxit@yahoo.com>

Switching to the 2.3 maintenance branch (where test_bsddb runs just fine), I got "make test" failures on test_re.py. Turns out that the 2.3-branch test_re.py was apparently not updated when the RE recursion bug was fixed -- it still expects a couple of exceptions to be raised, and they don't get raised any more because the bugfix itself WAS backported. On general principles, in cases of this ilk, IS it all right to just backport the corrected unit-test (from the 2.4 to the 2.3 branch) and commit the fix, or should one be more circumspect about it...?

Alex

From aleaxit at yahoo.com Sat Oct 25 08:30:51 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:30:57 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
Message-ID: <200310251430.51930.aleaxit@yahoo.com>

On Thursday 23 October 2003 04:12 am, Phillip J.
Eby wrote:
> At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
> >This would allow the current delayed-evaluation semantics
> >to be kept as the default, while eliminating any need
> >for using the default-argument hack when you don't
> >want delayed evaluation.
>
> Does anybody actually have a use case for delayed evaluation? Why would
> you ever *want* it to be that way? (Apart from the BDFL's desire to have
> the behavior resemble function behavior.)

I have looked far and wide over my code, present and desired, and can only find one example that seems perhaps tangentially relevant -- and I don't think it's a _good_ example. Anyway, here it comes:

    def smooth(N, sequence):
        def fifo(N):
            window = []
            while 1:
                if len(window) < N:
                    yield None
                else:
                    yield window
                    window.pop(0)
                window.append(item)
        latest = iter(fifo(N)).next
        for item in sequence:
            window = latest()
            if window is None: continue
            yield sum(window) / N

as I said, I don't like it one bit; the non-transparent "argument passing" of item from the loop "down into" the generator is truly yecchy. There are MUCH better ways to do this, such as

    def fifo(N, sequence):
        it = iter(sequence)
        window = list(itertools.islice(it, N))
        while 1:
            yield window
            window.pop(0)
            window.append(it.next())

    def smooth(N, sequence):
        for window in fifo(N, sequence):
            yield sum(window) / N

It's not clear that this would generalize to generator expressions, anyway. But I could imagine it might, e.g. IF we had "closure semantics" rather than "snapshot-binding" somebody COULD be tempted to such murky cases of "surreptitious argument passing down into genexprs"... and we're better off without any such possibility, IMHO.

> And, if there's no use case for delayed evaluation, why make people jump
> through hoops to get the immediate binding?

I understand Guido's position that simplicity and regularity of the rules count (a LOT).
But in this case I think Tim's insistence on practicality should count
for more: the "bind everything at start time" semantics are NOT a weird
special case, and the "look everything up each time around the loop"
semantics don't seem to yield any non-weird use...

Alex

From aleaxit at yahoo.com  Sat Oct 25 08:39:12 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:39:17 2003
Subject: [Python-Dev] product()
In-Reply-To: <002401c39907$0176f5a0$e841fea9@oemcomputer>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
Message-ID: <200310251439.12449.aleaxit@yahoo.com>

On Thursday 23 October 2003 03:43 am, Raymond Hettinger wrote:
> In the course of writing up PEP 289, it became clear that
> the future has a number of accumulator functions in store.
> Each of these is useful with iterators of all stripes and
> each helps eliminate a reason for using reduce().
>
> Some like average() and stddev() will likely end up in a
> statistics module.  Others like nbiggest(), nsmallest(),
> anytrue(), alltrue(), and such may end up somewhere else.
>
> The product() accumulator is the one destined to be a builtin.
>
> Though it is not nearly as common as sum(), it does enjoy
> some popularity.  Having it available will help dispense
> with reduce(operator.mul, data, 1).
>
> Would there be any objections to my adding product() to
> Py2.4?  The patch was simple and it is ready to go unless
> someone has some major issue with it.

Michael has already quoted my April opinion on the subject.  I think
these "useful accumulator functions" should all be in some separate
module[s]: none of them is anywhere near popular enough to warrant being
a built-in, IMHO.
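For reference, the reduce() spelling that product() was meant to replace
still runs today -- with the caveat that reduce moved into functools in
Python 3 (and a math.prod builtin-module function did eventually arrive,
in Python 3.8).  A minimal sketch:

```python
import functools
import operator

def product(data):
    # Equivalent of the proposed builtin: multiply everything together,
    # starting from the multiplicative identity 1 (so product of an
    # empty iterable is 1, mirroring sum's default of 0).
    return functools.reduce(operator.mul, data, 1)
```

Usage: product([2, 3, 4]) gives 24, and product([]) gives 1.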
If any were, it might be "alltrue" and "anytrue" -- the
short-circuiting ones, returning the first false or true item found,
respectively, as in:

def alltrue(seq):
    for x in seq:
        if not x:
            return x
    else:
        return True

def anytrue(seq):
    for x in seq:
        if x:
            return x
    else:
        return False

these seem MUCH more generally useful than 'product' (but, I still
opine, not quite enough to warrant being built-ins).

Alex

From aleaxit at yahoo.com  Sat Oct 25 08:57:05 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:57:12 2003
Subject: [Python-Dev] fixing sum's performance bug
In-Reply-To: <200310251132.04686.aleaxit@yahoo.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
 <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
 <200310251132.04686.aleaxit@yahoo.com>
Message-ID: <200310251457.05711.aleaxit@yahoo.com>

On Saturday 25 October 2003 11:32 am, Alex Martelli wrote:
   ...
> Indeed, that's a spot where even 'sum' can be a performance trap;
> consider the following z.py:
>
> lol = [ [x] for x in range(1000) ]
>
> def flat1(lol=lol):
>     return sum(lol, [])
>
> def flat2(lol=lol):
>     result = []
>     for sub in lol: result.extend(sub)
>     return result
>
> and the measurements:
>
> [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
> 100 loops, best of 3: 8.5e+03 usec per loop
>
> [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
> 1000 loops, best of 3: 940 usec per loop
>
> sum looks cooler, but it can be an order of magnitude slower
> than the humble loop of result.extend calls.  We could fix this
> specific performance trap by special-casing in sum those cases
> where the result has a += method -- hmmm... would a patch for
> this performance bug be accepted for 2.3.* ...?  (I understand and
> approve that we're keen on avoiding adding functionality in any
> 2.3.*, but fixed-functionality performance enhancements should
> be just as ok as fixes to functionality bugs, right?)
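The quoted trap can be shown in miniature: repeated list concatenation
copies the whole accumulated result each time (quadratic total work),
while extending in place is linear.  A sketch in modern Python -- the
function names here are illustrative, not from the thread:

```python
def flat_by_add(lol):
    # What sum(lol, []) effectively did before the fix: each +
    # builds a brand-new list, so total work grows quadratically
    # with the size of the output.
    result = []
    for sub in lol:
        result = result + sub
    return result

def flat_by_extend(lol):
    # The humble loop from the quote: amortized linear total work,
    # same observable result for lists.
    result = []
    for sub in lol:
        result.extend(sub)
    return result
```

Both produce the same flattened list; only the asymptotics differ, which
is why switching sum's internals to in-place addition closes the gap.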
Ah well -- it's the most trivial fix one can possibly think of, just
changing PyNumber_Add to PyNumber_InPlaceAdd -- so the semantics are
_guaranteed_ to be equal in all _sane_ cases, i.e. excepting only weird
user-coded types that have an __iadd__ with a weirdly different semantic
than __add__ -- and it DOES make sum's CPU time drop to 490 usec in the
above (making it roughly twice as fast as the loop, as it generally
tends to be in typical cases of summing lots of numbers).  So I went
ahead and committed the tiny change on both the 2.4 and 2.3 maintenance
branches (easy enough to revert if the "insane" cases must keep working
in the same [not sane:-)] way in 2.3.*)...

Alex

From aleaxit at yahoo.com  Sat Oct 25 09:08:15 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:08:20 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer>
 <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
Message-ID: <200310251508.15634.aleaxit@yahoo.com>

On Wednesday 22 October 2003 08:53 pm, Guido van Rossum wrote:
> > Did the discussion of a sort() expression get resolved?
> >
> > The last I remember was that the list.sorted() classmethod had won the
> > most support because it accepted the broadest range of inputs.
> >
> > I could live with that though I still prefer the more limited
> > (list-only) copysort() method.
>
> list.sorted() has won, but we are waiting for feedback from the
> person who didn't like having both sort() and sorted() as methods, to
> see if his objection still holds when one is a method and the other a
> factory function.

So, if I've followed correctly the lots of python-dev mail over the last
few days, that person (Aahz) is roughly +0 on list.sorted as classmethod
and thus we can go ahead.  Right?
Alex

From aleaxit at yahoo.com  Sat Oct 25 09:18:22 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:18:27 2003
Subject: [Python-Dev] listcomps vs. for loops
In-Reply-To: <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com>
 <3F96C6D8.8040507@livinglogic.de>
 <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com>
Message-ID: <200310251518.22207.aleaxit@yahoo.com>

On Wednesday 22 October 2003 08:45 pm, Guido van Rossum wrote:
> > sum(len(line) for line in file if not line.startswith("#") while
> > line.strip())
> >
> > looks simpler than
> >
> > sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file
> > if not line.startswith("#")))
>
> I think both are much harder to read and understand than
>
> n = 0
> for line in file:
>     if not line.strip():
>         break
>     if not line.startswith("#"):
>         n += len(line)

Yes, but personally I would prefer yet another refactoring, something
like:

def noncomment_lines_until_white(file):
    for line in file:
        if not line.strip():
            break
        if not line.startswith('#'):
            yield line

n = sum(len(line) for line in noncomment_lines_until_white(file))

To me, the concept "get all non-comment lines until the first
all-whitespace one" gets nicely "factored out", this way, from the other
concept of "sum the lengths of all of these lines".  In Guido's version
I have to reconstruct these two concepts "bottom-up" from their entwined
expression in lower-level terms; in Walter's, I have to reconstruct them
by decomposing a very dense equivalent, still full of lower-level
constructs.  It seems to me that, by naming the
"noncomment_lines_until_white" generator, I make the separation of (and
cooperation between) the two concepts most clear.
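The refactoring above runs unchanged in modern Python (any iterable of
lines will do in place of a file); a quick check with illustrative
sample data:

```python
def noncomment_lines_until_white(file):
    # Yield non-comment lines until the first all-whitespace line.
    for line in file:
        if not line.strip():
            break
        if not line.startswith('#'):
            yield line

sample = ["abc\n", "# a comment\n", "de\n", "\n", "never reached\n"]
n = sum(len(line) for line in noncomment_lines_until_white(sample))
# counts len("abc\n") + len("de\n"): the comment is skipped,
# and everything after the blank line is never consumed
```

Here n comes out to 7, and the trailing "never reached" line is indeed
never pulled from the iterable.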
Clearly, people's tastes in named vs unnamed, and lower-level vs
higher-level expression of concepts, differ widely!-)

Alex

From barry at python.org  Sat Oct 25 09:25:47 2003
From: barry at python.org (Barry Warsaw)
Date: Sat Oct 25 09:25:57 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
In-Reply-To: <200310251232.55044.aleaxit@yahoo.com>
References: <200310251232.55044.aleaxit@yahoo.com>
Message-ID: <1067088346.10257.71.camel@anthem>

On Sat, 2003-10-25 at 06:32, Alex Martelli wrote:
> I guess it had been a while since I ran 'make test' on the 2.4 cvs... can't
> find this bug in the bugs db and I'd just like a quick sanity check (if the
> bug's already there or if I'm doing something weird) before I add it.

Jeremy and I have both seen similar hangs in 2.4cvs.

-Barry

From aleaxit at yahoo.com  Sat Oct 25 09:37:57 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:38:04 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com>
 <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch>
 <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com>
Message-ID: <200310251537.57480.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:57 pm, Guido van Rossum wrote:
   ...
> > def accgen(n):
> >     def acc(i):
> >         global n in accgen
> >         n += i
> >         return n
> >     return acc
> >
> > particularly more compelling than:
> >
> > class accgen:
> >     def __init__(self, n):
> >         self.n = n
> >
> >     def __call__(self, i):
> >         self.n += i
> >         return self.n
>
> Some people have "fear of classes".  Some people think that a
> function's scope can be cheaper than an object (someone should time
> this).
I need to simulate the "rebinding name in outer scope" with some kind of
item or attribute, of course, but, given this, here comes.  Given this
b.py:

def accgen_attr(n):
    def acc(i):
        acc.n += i
        return acc.n
    acc.n = n
    return acc

def accgen_item(n):
    n = [n]
    def acc(i):
        n[0] += i
        return n[0]
    return acc

class accgen_clas(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, i):
        self.n += i
        return self.n

def looper(accgen, N=1000):
    acc = accgen(100)
    x = map(acc, xrange(N))
    return x

I measure:

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_attr)'
1000 loops, best of 3: 1.86e+03 usec per loop

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_item)'
1000 loops, best of 3: 1.18e+03 usec per loop

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_clas)'
100 loops, best of 3: 2.1e+03 usec per loop

So, yes, a function IS slightly faster anyway (accgen_attr vs
accgen_clas), AND simulating outer-scope rebinding with a list item is
somewhat faster than doing so with an attr (a class always uses an attr,
and most of its not-too-terrible performance handicap presumably comes
from that fact).

I just don't think such closures would typically be used in bottlenecks
SO tight that a 10%, or even a 40%, extra overhead is going to be
crucial.  So, I find it hard to get excited either way by this
performance issue.

Alex

From aleaxit at yahoo.com  Sat Oct 25 09:56:54 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:56:59 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <1066844059.3f96bf9b1240f@mcherm.com>
References: <1066844059.3f96bf9b1240f@mcherm.com>
Message-ID: <200310251556.54913.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:34 pm, Michael Chermside wrote:
> [Jeremy]
> > I'm not averse to introducing a new keyword, which would address both
> > concerns.  yield was introduced with apparently little problem, so it
> > seems possible to add a keyword without causing too much disruption.
> > If we decide we must stick with global, then it's very hard to address
> > Alex's concern about global being a confusing word choice.
>
> [Guido]
> > OK, the tension is mounting.  Which keyword do you have in mind?  And
> > would you use the same keyword for module-globals as for outer-scope
> > variables?
>
> Surely the most appropriate keyword is "scope", right?

That is my personal vote, yes.

> As in
>
>     scope a is global
>     scope b is nested
>     scope c is self
>     scope d is myDict
>
> Okay... maybe I'm getting too ambitious with the last couple...

If we have to have 'scope' as a statement, I'd slightly prefer it if it
HAD something useful to do, so I understand your ambition.  If somebody
thinks it's useful and important to be able to spell

    themodule.x = 23

as

    x    # e.g. at top of function
    ...many lines in-between obscuring the issue...
    x = 23

then it WOULD no doubt be consistent to be able to do similar things for
other "whatever.x = 23" assignments.

However, that "myDict" still leaves me dubious.  If we want to make it
easy to use attribute setting and access syntax in lieu of dictionary
indexing syntax (and it does look nicer often enough) then it seems to
me that we should rather make available a fast equivalent of a wrapper
such as

class ItemsAsAttrs(object):
    def __init__(self, d):
        object.__setattr__(self, 'd', d)
    def __getattr__(self, n):
        return self.d[n]
    def __setattr__(self, n, v):
        self.d[n] = v

then, "scope xx is ItemsAsAttrs(myDict)" would work if such things as
"scope xx is self" did.

Personally, I'd rather go in the other direction: make all assignments
except those to local variables into something _locally clear without
needing to look for possible declarations who knows where_, rather than
fall for the "convenience trap" of allowing assignment to bare names
(and presumably "side-effect rebindings" such as those in statements
def, class, for, import) to mean something different depending on
"declarative statements".
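The ItemsAsAttrs wrapper sketched above runs unchanged in modern Python;
a quick demonstration (the dictionary and names here are illustrative):

```python
class ItemsAsAttrs(object):
    def __init__(self, d):
        # Bypass our own __setattr__ so 'd' lands in the instance
        # dict instead of inside the wrapped dictionary.
        object.__setattr__(self, 'd', d)
    def __getattr__(self, n):
        # Only called when normal attribute lookup fails,
        # i.e. for everything except 'd' itself.
        return self.d[n]
    def __setattr__(self, n, v):
        self.d[n] = v

myDict = {'x': 1}
wrapper = ItemsAsAttrs(myDict)
wrapper.x = 23      # routed into myDict['x'] via __setattr__
```

After the assignment, both wrapper.x and myDict['x'] are 23: attribute
syntax on the wrapper is item syntax on the underlying dictionary.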
However, I do understand that it would at least be consistent to allow
such "insidious convenience" for many kinds of non-local names, as your
"ambitious proposals" imply/suggest.  If it IS deemed desirable to give
"x = 23" semantics that depend on the possible presence of a
"declarative statement" who-knows-where, it seems consistent to allow
that "convenience" for all kinds of semantics.

Perhaps an idea which I think Samuele suggested might be less insidious
than allowing the "declarative statement" to be just about anywhere
within the current function: make 'scope' a normal compound statement,
as in, e.g.:

    scope x in module, xx in foo, z, t, v in self:
        x = 23
        ...etc etc...

this way, at least, the semantics of "x = 23" depend only on those
declarative statements *it's nested inside of*; better than having to
look all over the function, before AND after the "x = 23" and inside
flow control statements too (!), for the "global x" (or whatever) that
MIGHT make "x = 23" mean something different.

Alex

From aleaxit at yahoo.com  Sat Oct 25 10:03:17 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 10:03:22 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <1b5501c398be$ff1832d0$891e140a@YODA>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <1b5501c398be$ff1832d0$891e140a@YODA>
Message-ID: <200310251603.17845.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:07 pm, Dave Brueck wrote:
   ...
> > like a global variable.  I also don't think I want global variable
> > assignments to look like attribute assignments.
>
> Go easy on me for piping up here, but aren't they attribute assignments or
> at least used as such?  After reading the other posts in this thread I

I entirely agree with this "user of Python" perspective, and I think
it's a pity it's been ignored in the following discussion.

> and any distinction would seem arbitrary or artificial (consider, for

Yes!
If the compiler needs to be aware of global assignments (which IS a good
idea) we can do so by either introducing a new "operator keyword", OR
something like Barry's suggestion of "import __me__" with __me__ as a
magic name recognized by the compiler (hey, if it can recognize
__future__ why not __me__?-).  But to the Python user, making things
look similar when their semantics and use ARE similar is a wonderful
idea.

> example, that it is not an uncommon practice to write a module instead of a
> class if the class would be a singleton).

Indeed, that IS the officially recommended practice (and Guido
emphasized that in rather adamant words after he had recovered from the
shock of seeing the Borg nonpattern presented at a Python-UK
session...:-).

Alex

From skip at pobox.com  Sat Oct 25 10:15:36 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sat Oct 25 10:16:07 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: 
References: <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: <16282.34184.615261.250326@montanaro.dyndns.org>

    Tim> [Skip Montanaro]
    >> How much more expensive

    Tim> Stop right there.

Okay, but I couldn't resist. ;-)

    >> for f in math.sin, math.cos, math.tan:
    >>     squares = (f(x)**2 for x in inputs)
    >>     plot(squares)

    Tim> Despite the similar appearance, that does something very different,
    ...

    >> which would work without reiterability, right?

    Tim> Yup.

I shouldn't have mentioned performance.  The above was really the point
I was getting at.  The mention of performance was simply because I
couldn't understand why reiterability would be necessary in your
example.  I see you were just pointing out that someone not
understanding the underlying nature of the generator would assume your
example would work *and* save cycles because the definition of the
generator expression was hoisted out of the loop.
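The distinction Tim is drawing can be made concrete with the semantics
Python eventually shipped for generator expressions: the outermost
iterable is evaluated immediately, but free variables such as f are
looked up only when the generator is consumed.  A sketch (inputs and
function choices are illustrative):

```python
import math

inputs = [0.1, 0.2]
gens = []
for f in (math.sin, math.cos):
    # The genexpr captures 'inputs' now, but looks up 'f' lazily.
    gens.append(f(x) ** 2 for x in inputs)

# Consumed only after the loop has finished: both generators look up
# f at that point and find math.cos, the loop variable's final value.
late = [list(g) for g in gens]
```

Both entries of late are identical (cos(x)**2 for each input), which is
exactly why hoisting the genexpr out of a consuming loop does not do
what a casual reader might hope.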
Skip

From aleaxit at yahoo.com  Sat Oct 25 10:18:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 10:18:46 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <13803476.1066768024@[192.168.1.101]>
References: <000001c3984b$052cd820$e841fea9@oemcomputer>
 <13803476.1066768024@[192.168.1.101]>
Message-ID: <200310251618.42221.aleaxit@yahoo.com>

On Wednesday 22 October 2003 05:27 am, David Eppstein wrote:
   ...
> Currently, I am using expressions like
>
> pos2d = dict([(s, (positions[s][0]+dx*positions[s][2],
>                    positions[s][1]+dy*positions[s][2]))
>               for s in positions])

I _must_ be getting old -- it would never occur to me to write something
as dense and incomprehensible (and no, removing the "dict([" would not
make it much clearer).  Something like:

pos2d = {}
for s, (x, y, delta) in positions.iteritems():
    pos2d[s] = x+dx*delta, y+dy*delta

seems just SO much clearer and more transparent to me.

Alex

From neal at metaslash.com  Sat Oct 25 10:29:32 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Sat Oct 25 10:29:40 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310251603.17845.aleaxit@yahoo.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <1b5501c398be$ff1832d0$891e140a@YODA>
 <200310251603.17845.aleaxit@yahoo.com>
Message-ID: <20031025142932.GZ5842@epoch.metaslash.com>

On Sat, Oct 25, 2003 at 04:03:17PM +0200, Alex Martelli wrote:
>
> Yes!  If the compiler needs to be aware of global assignments (which IS
> a good idea) we can do so by either introducing a new "operator keyword"

One thing that I've always wondered about: why can't one do

    def reset_foo():
        global foo = []    # declare as global and do assignment

As Alex pointed out in another mail (I'm paraphrasing liberally):
redundancy is bad.  By having to declare foo as global, there's a
guaranteed redundancy of the variable name when foo is also assigned.

I don't know if this solution would make Alex dislike global less.
But it changes global to look more like a statement, rather than a
declaration.

Neal

From aleaxit at yahoo.com  Sat Oct 25 11:05:04 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 11:06:30 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031025142932.GZ5842@epoch.metaslash.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <200310251603.17845.aleaxit@yahoo.com>
 <20031025142932.GZ5842@epoch.metaslash.com>
Message-ID: <200310251705.04439.aleaxit@yahoo.com>

On Saturday 25 October 2003 04:29 pm, Neal Norwitz wrote:
> On Sat, Oct 25, 2003 at 04:03:17PM +0200, Alex Martelli wrote:
> > Yes!  If the compiler needs to be aware of global assignments (which IS
> > a good idea) we can do so by either introducing a new "operator keyword"
>
> One thing that I've always wondered about, why can't one do:
>
>     def reset_foo():
>         global foo = []    # declare as global and do assignment
>
> As Alex pointed out in another mail (I'm paraphrasing liberally):
> redundancy is bad.  By having to declare foo as global, there's
> a guaranteed redundancy of the variable when foo is also assigned.
>
> I don't know if this solution would make Alex dislike global less.
> But it changes global to look more like a statement, rather than
> a declaration.

Indeed, you can see 'global', in this case, as a kind of "operator
keyword", modifying the scope of foo in an assignment statement.

I really have two separate peeves against global (not necessarily in
order of importance, actually):

-- it's the wrong keyword, doesn't really _mean_ "global"

-- it's a "declarative statement", the only one in Python (ecch)
   (leading to weird uncertainty about where it can be placed)

-- "side-effect" assignment to globals, such as in def, class &c
   statements, is quite tricky and error-prone, not useful

Well, OK, _three_ peeves... usual Spanish Inquisition issue...:-)

Your proposal is quite satisfactory wrt solving the second issue, from
my viewpoint.
It would still create a unique-in-Python construct, but not (IMHO) a
problematic one.  As you point out, it _would_ be more concise than
having to separately [a] say foo is global and then [b] assign
something.  It would solve any uncertainty regarding placement of
'global', and syntactically impede using global variables in "rebinding
as side-effect" cases such as def &c, so the third issue disappears.
The first issue, of course, is untouched:-).  It can't be touched
without choosing a different keyword, anyway.  So, with 2 resolutions
out of 3, I do like your idea.

However, I don't think we can get there from here.  Guido has explained
that the parser must be able to understand a statement that starts with
'global' without look-ahead; I don't know if it can keep accepting, for
bw compat and with a warning, the old

    global xx

while also accepting the new and improved

    global xx = 23

But perhaps it's not quite as hard as the "global.xx = 23" would be.  I
find Python's parser too murky & mysterious to feel sure.

Other side issues: if you rebind a module-level xx in half a dozen
places in your function f, right now you only need ONE "global xx"
somewhere in f (just about anywhere); with your proposal, you'd need to
flag "global xx = 23" at each of the several assignments to that xx.
Now, _that suits me just fine_: indeed, I LOVE the fact that a bare
"xx = 23" is KNOWN to set a local, and you don't have to look all over
the place for declarative statements that might affect its semantics
(hmmm, perhaps a 4th peeve vs global, but I see it as part and parcel of
peeve #2:-).  But globals-lovers might complain that it makes using
globals a TAD less convenient.  (Personally, I would not mind THAT at
all, either: if as a result people use 10% fewer globals and replace
them with arguments or classes etc, I think that will enhance their
programs anyway;-).

So -- +1, even though we may need a different keyword to solve [a] the
problem of getting there from here AND [b] my peeve #1 ...:-).
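For the record, the merged "global foo = []" form was never adopted (it
is still a syntax error), while the outer-function-scope rebinding
running through the other branch of this thread eventually got its own
keyword, nonlocal, in Python 3.0 (PEP 3104).  A sketch of both spellings
as they stand today:

```python
count = 0

def bump():
    global count     # declaration first...
    count += 1       # ...then, separately, the assignment

def accgen(n):
    def acc(i):
        nonlocal n   # PEP 3104: rebind n in accgen's scope
        n += i
        return n
    return acc

bump()
acc = accgen(100)
first, second = acc(1), acc(2)
```

The two-line global dance is exactly the redundancy Neal's proposal
targeted; nonlocal simply reused the same declarative shape for
enclosing-function scopes.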
Alex

From pf_moore at yahoo.co.uk  Sat Oct 25 11:49:30 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sat Oct 25 11:49:31 2003
Subject: [Python-Dev] Re: closure semantics
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
 <200310242301.16445.aleaxit@yahoo.com>
 <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
 <200310251007.36871.aleaxit@yahoo.com>
Message-ID: 

Alex Martelli writes:

> Assume for the sake of argument that we could make 'scope' a reserved
> word.  Now, what are the tradeoffs of using a "declaration"
>     scope x in outer
> which makes all rebindings of x act in the scope of containing function
> outer (including 'def x():', 'class x:', 'import x', ...); versus an
> "operator" that must be used to indicate "which x" when specifically
> assigning it (no "side effect rebinding" via def &c allowed -- I think it
> helps the reader of code a LOT to require specific assignment!), e.g.
>     scope(outer).x = 23
>
> Don't think of scope as a built-in function, but as a keyword in either
> case (and we could surely have other syntax for the "scope operator",
> e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE
> where x is being assigned).  So the compiler can catch on to the info
> just as effectively.

I'm skimming this, so I apologise if I've missed something obvious.
However, one significant issue with your notation

    scope(outer).x = 23

is that, although scope(outer) *looks like* a function call, it isn't --
precisely because scope is a keyword.  I think that, if you're using a
keyword, you need something syntactically distinct.  Now maybe you can
make something like (x in f scope) work as an expression (I've
deliberately used "f" not "outer" to highlight the fact that it may not
always look as "nice" as your example), but I'm not sure it's as
intuitive as you imply.

But then again, I've no problem with "global x in f".
Paul -- This signature intentionally left blank From eppstein at ics.uci.edu Sat Oct 25 12:03:01 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Sat Oct 25 12:03:04 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> <200310251618.42221.aleaxit@yahoo.com> Message-ID: In article <200310251618.42221.aleaxit@yahoo.com>, Alex Martelli wrote: > On Wednesday 22 October 2003 05:27 am, David Eppstein wrote: > ... > > Currently, I am using expressions like > > > > pos2d = > > dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s > > ][2])) > > for s in positions]) > > I _must_ be getting old -- it would never occur to me to write something > as dense and incomprehensible (and no, removing the "dict([" would not > make it much clearer). Something like: > > pos2d = {} > for s, (x, y, delta) in positions.iteritems(): > pos2d[s] = x+dx*delta, y+dy*delta > > seems just SO much clearer and more transparent to me. I like the comprehension syntax so much that I push it harder than I guess I should. If I'm building a dictionary by performing some transformation on the items of another dictionary, I prefer to write it in a way that avoids sequencing the items one by one; I don't think of that sequencing as an inherent part of the loop. Put another way, I prefer declarative to imperative when possible. Let's try to spread it out a little and use intermediate variable names: pos2d = dict([(s, (x + dx*z, y + dy*z)) for s,(x,y,z) in positions.items()]) Better? -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From guido at python.org Sat Oct 25 12:40:25 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 12:40:55 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Sat, 25 Oct 2003 10:07:36 +0200." 
 <200310251007.36871.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
 <200310242301.16445.aleaxit@yahoo.com>
 <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
 <200310251007.36871.aleaxit@yahoo.com>
Message-ID: <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com>

> > or at run-time) always goes from inner scope to outer.  While you and
> > I see nested functions as small amounts of closely-knit code, some
> > people will go overboard and write functions of hundred lines long
> > containing dozens of inner functions, which may be categorized into
>
> This doesn't look like a legitimate use case to me; i.e., I see no need
> to distort the language if the benefit goes to such "way overboard" uses.
> I think they will have serious maintainability problems anyway.

One person here (maybe David Eppstein) brought up that they used this
approach for coding up extensive algorithms that are functional in
nature but have a lot of state referenced *during* the computation.
Whoever it was didn't like using classes because the internal state
would persist past the lifetime of the calculation.

When I visited Google I met one person who was advocating the same
coding style -- he was adamant that if he revealed any internal details
of his algorithm then the users of his library would start using them,
and he wouldn't be able to change the details in another revision.

AFAICT these were both very experienced Python developers who had
thought about the issue and chosen to write large nested functions.  So
I don't think you can dismiss this so easily.

> Fortunately, I don't think of placing the "indication to the
> compiler" as close to the assignment-to-outer-variable as a
> distortion;-)
>
> > Anyway, I hope you'll have a look at my reasons for why the compiler
> > needs to know about rebinding variables in outer scopes from inside
> > an inner scope.
>
> Sure!  I do understand this.
> What I don't understand is why,
> syntactically, the reserved word that indicates this to the compiler
> should have to be a "statement that does nothing" -- the ONLY
> "declaration" in the language -- rather than e.g. an _operator_
> which specifically flags such uses.

Maybe because I haven't seen such an operator proposed that I liked. :)

And in its normal usage, I don't find 'global x' offensive; that it can
be abused and sometimes misunderstood doesn't matter to me, that's the
case for sooooo many language constructs...

> Assume for the sake of argument that we could make 'scope' a reserved
> word.  Now, what are the tradeoffs of using a "declaration"
>     scope x in outer
> which makes all rebindings of x act in the scope of containing function
> outer (including 'def x():', 'class x:', 'import x', ...); versus an
> "operator" that must be used to indicate "which x" when specifically
> assigning it (no "side effect rebinding" via def &c allowed -- I think it
> helps the reader of code a LOT to require specific assignment!), e.g.
>     scope(outer).x = 23
>
> Don't think of scope as a built-in function, but as a keyword in either
> case (and we could surely have other syntax for the "scope operator",
> e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE
> where x is being assigned).  So the compiler can catch on to the info
> just as effectively.

What bugs me tremendously about this is that it isn't symmetric with
usage: you can *use* the x from the outer scope without using all that
verbiage, but you must *assign* to it with a special construct.  This
would be particularly confusing if x is used on the right-hand side of
the assignment, e.g.:

    scope(outer).x = x.lower()

> The tradeoffs are:
> -- we can keep thinking of Python as declaration-free and by gradually
>    deprecating the global statement make it more so

Somehow I don't see "declaration-free" as an absolute goal, where 100%
is better than 99%.
> -- the reader of code KNOWS what's being assigned to without having
>    to scroll up "hundreds of lines" looking for possible declarations

Yeah, but you can still *use* a variable that was set "hundreds of
lines" before, so it's not a full solution (and will never be --
allowing *use* of nonlocals is clearly a much-wanted and very useful
feature).

> -- assignment to nonlocals is made less casually convenient by just the
>    right amount to ensure it won't be overused

If we don't add "global x in f" or some equivalent, you can't assign to
nonlocals except for module globals, where I don't see a problem.

> -- no casual rebinding of nonlocals via def, class, import

I don't think that's a real issue.

> -- once we solve the general problem of allowing non-bare-names as
>    iteration variables in 'for', nonlocals benefit from that
>    resolution automatically, since nonlocals are never
>    assigned-to as bare-names

This is obscure -- most readers here didn't even know you could do that,
and all except Tim (whom I cut a certain amount of slack because he's
from Wisconsin) said they considered it bad style.  So again the
argument is weak.

> I see this as the pluses.  The minus is, we need a new keyword; but I
> think we do, because stretching 'global' to mean something that ISN'T
> global in any sense is such a hack.

Well, if for some reason the entire Python community suddenly leaned on
me to allow assignment to non-locals, with a syntactic construct to be
used in every assignment to a non-local, I would much favor the C++
style of ::.

> Cutting both ways is the fact that this allows using the same name from
> more than one scope (since each use is explicitly qualified as coming
> from a specific scope).  That's irrelevant for small compact uses of
> nesting, but it may be seen as offering aid and succour to those wanting
> to "go overboard" as you detail earlier (bad);

There is no need for this even among those folks; a simple renaming
allows access to all variables they need.
(My earlier argument wasn't about this, it was about accidental shadowing when there was *no* need to share.) > OTOH, if I ever need to maintain such "overboard" code written by > others, and refactoring it is not feasible right now, it may be > helpful. In any case, the confusion is reduced by having the > explicit qualification on assignment. Similarly for _accesses_ > rather than rebindings -- access to the barename will keep using the > same rules as today, of course, but I think the same syntax that > MUST be used to assign nonlocals should also be optionally usable to > access them -- not important either way in small compact functions, > but more regular and offering a way to make code non-ambiguous in > large ones. I don't see having two ways to access a name -- > barename x or qualified scope(foo).x -- as a problem, just like > today from inside a method we may access a classvariable as "self.x" > OR "self.__class__.x" indifferently -- the second form is needed for > rebinding and may be chosen for clarity in some cases where the > first simpler ("barer") one would suffice. Actually, self.__class__.x is probably a mistake, usually one should name the class explicitly. But I don't see that as the same, because the name isn't bare in either case. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Sat Oct 25 13:15:29 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 13:15:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310251508.15634.aleaxit@yahoo.com> References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com> <200310251508.15634.aleaxit@yahoo.com> Message-ID: <20031025171529.GA18617@panix.com> On Sat, Oct 25, 2003, Alex Martelli wrote: > > So, if I've followed correctly the lots of python-dev mail over the last > few days, that person (Aahz) is roughly +0 on list.sorted as classmethod > and thus we can go ahead. 
Right? I'm not the person who objected on non-English speaking grounds, and I'm -0 because I don't like using grammatical tense as the differentiator; as I said, I'd expect sorted() to be a predicate. If we're doing this (and it seems we are), I still prefer copysort() for clarity. But I'm not objecting to sorted(). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From eppstein at ics.uci.edu Sat Oct 25 14:05:38 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Sat Oct 25 14:05:43 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> <200310251007.36871.aleaxit@yahoo.com> <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> Message-ID: In article <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com>, Guido van Rossum wrote: > > > or at run-time) always goes from inner scope to outer. While you and > > > I see nested functions as small amounts of closely-knit code, some > > > people will go overboard and write functions of hundred lines long > > > containing dozens of inner functions, which may be categorized into > > > > This doesn't look like a legitimate use case to me; i.e., I see no need > > to distort the language if the benefit goes to such "way overboard" uses. > > I think they will have serious maintainability problems anyway. > > One person here brought up (maybe David Eppstein) that they used this > approach for coding up extensive algorithms that are functional in > nature but have a lot of state referenced *during* the computation. > Whoever it was didn't like using classes because the internal state > would persist past the lifetime of the calculation. Yes, that was me. You recommended refactoring the stateful part of the algorithm as an object despite its lack of persistence. 
It worked and my code is much improved thereby. Specifically, I recognized that one of the outer level functions of my code was appending to a sequence of strings, so I turned that function into the next() method of an iterator object, and the other nested functions became other methods of the same object. I'm not sure how much of the improvement was due to using an object-oriented architecture and how much was due to the effort of refactoring in general, but you convinced me that using an object to represent shared state explicitly rather than doing it implicitly by nested function scoping can be a good idea. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From pedronis at bluewin.ch Sat Oct 25 14:40:36 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat Oct 25 14:38:11 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> <200310251007.36871.aleaxit@yahoo.com> Message-ID: <5.2.1.1.0.20031025194109.0284ac80@pop.bluewin.ch> At 09:40 25.10.2003 -0700, Guido van Rossum wrote: > > > or at run-time) always goes from inner scope to outer. While you and > > > I see nested functions as small amounts of closely-knit code, some > > > people will go overboard and write functions of hundred lines long > > > containing dozens of inner functions, which may be categorized into > > > > This doesn't look like a legitimate use case to me; i.e., I see no need > > to distort the language if the benefit goes to such "way overboard" uses. > > I think they will have serious maintainability problems anyway. 
> One person here brought up (maybe David Eppstein) that they used this
> approach for coding up extensive algorithms that are functional in
> nature but have a lot of state referenced *during* the computation.
> Whoever it was didn't like using classes because the internal state
> would persist past the lifetime of the calculation.

[seen David Eppstein's post, discarded obsolete comment]

> When I visited Google I met one person who was advocating the same
> coding style -- he was adamant that if he revealed any internal
> details of his algorithm then the users of his library would start
> using them, and he wouldn't be able to change the details in another
> revision.

I must be missing the details; it seems someone is unhappy with the
encapsulation support in Python, wanting it backward using closures.
Yes, closures can be used to get strong encapsulation.  If Python
wanted again to support directly some form of sandboxed execution,
then better support for encapsulation would very likely play a role.
But as I said I must be missing something: if the point is stronger
encapsulation, I would add it to the OO part of the language.  The
schizophrenic split -- use objects, but if you want encapsulation use
closures -- seems odd.

Aside: I have the maybe mistaken impression that having fully complete
functional programming support in Python was not the point, but that
the addition of generators has increased the interest in more
functional programming support.

> AFAICT these were both very experienced Python developers who had
> thought about the issue and chosen to write large nested functions.

They seem to want to import idioms that before 2.1 were not even
imaginable, and maybe I'm wrong, but idioms that come from somewhere
else.  Personally, e.g., I would like multi-method support in Python,
and I know where they come from.  Every experienced Python developer
probably knows some other language, and misses or would like something
from there.
Sometimes I have the impression that seeing the additions
incrementally, and already knowing the language well, makes it hard to
consider the learning curve for someone encountering the language for
the first time.  I think that evaluating whether an addition really
enhances expressivity, or makes the language more uniform, vs the
ref-man growth is very important.  IMHO generators were a clear win;
generator expressions seem an add/subtract thing, because list
comprehension explanation becomes just list(gen expr).

regards.

From skip at pobox.com  Sat Oct 25 12:09:55 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sat Oct 25 15:51:39 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310251007.36871.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
	<200310242301.16445.aleaxit@yahoo.com>
	<200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
	<200310251007.36871.aleaxit@yahoo.com>
Message-ID: <16282.41043.939103.536103@montanaro.dyndns.org>

    [Alex]
    Assume for the sake of argument that we could make 'scope' a
    reserved word.  Now, what are the tradeoffs of using a "declaration"

        scope x in outer

    which makes all rebindings of x act in the scope of containing
    function outer (including 'def x():', 'class x:', 'import x', ...);
    versus an "operator" that must be used to indicate "which x" when
    specifically assigning it (no "side effect rebinding" via def &c
    allowed -- I think it helps the reader of code a LOT to require
    specific assignment!), e.g.

        scope(outer).x = 23

I don't see how either of your scope statements is really any better
than "global".  If I say

    global x in outer

I am declaring to the compiler that x is global to the current
function, and in particular I want you to bind x to the x which is
local to the function outer.  Maybe "global" isn't perfect, but it
seems to suit the situation fairly well and avoids a new keyword to
boot.
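None of these spellings exists, of course; with the language as it stands, the usual workaround for rebinding a name in an enclosing function is a mutable container (a sketch, with illustrative names):

```python
def counter():
    count = [0]          # a one-element list: mutating it needs no rebinding

    def increment():
        count[0] += 1    # mutates the shared list; 'count' itself is never rebound
        return count[0]

    return increment

inc = counter()
inc()
inc()
print(inc())  # -> 3
```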
With the "scope(outer).x = 23" notation you are mixing apples and oranges (declaration and execution). It looks like an executable statement but it's really a declaration to the compiler. Guido has already explained why the binding has to occur at compile time. The tradeoffs are: -- we can keep thinking of Python as declaration-free and by gradually deprecating the global statement make it more so How do you propose to subsume the current global statement's functionality? -- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations As he would with an extension of the current global statement. I presume you mean for your scope pseudo function to be used at the "point of attack", so there would likely be less separation between the declaration and the assignment. Of course, using your argument about redundancy against you, would I have to use scope(outer).x = ... each time I wanted to change the value of x? What if I rename outer? -- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused I don't see this as a big problem now. In my own code I rarely use global, and never use nested functions. I suspect that's true for most people. Skip From guido at python.org Sat Oct 25 16:20:45 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 16:20:55 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Sat, 25 Oct 2003 11:32:04 +0200." <200310251132.04686.aleaxit@yahoo.com> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com> <200310251132.04686.aleaxit@yahoo.com> Message-ID: <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com> > sum looks cooler, but it can be an order of magnitude slower > than the humble loop of result.extend calls. 
> We could fix this
> specific performance trap by specialcasing in sum those cases
> where the result has a += method -- hmmm... would a patch for
> this performance bug be accepted for 2.3.* ...?  (I understand and
> approve that we're keen on avoiding adding functionality in any
> 2.3.*, but fixed-functionality performance enhancements should
> be just as ok as fixes to functionality bugs, right?)

No way.  There's nothing that guarantees that a+=b has the same
semantics as a+b, and in fact for lists it doesn't.

I wouldn't even want this for 2.4.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 17:14:32 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 17:14:41 2003
Subject: [Python-Dev] product()
In-Reply-To: Your message of "Sat, 25 Oct 2003 14:39:12 +0200."
	<200310251439.12449.aleaxit@yahoo.com>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310251439.12449.aleaxit@yahoo.com>
Message-ID: <200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>

> it might be "alltrue" and "anytrue" -- the short-circuiting ones,
> returning the first true or false item found respectively, as in:
>
> def alltrue(seq):
>     for x in seq:
>         if not x: return x
>     else:
>         return True
>
> def anytrue(seq):
>     for x in seq:
>         if x: return x
>     else:
>         return False
>
> these seem MUCH more generally useful than 'product' (but,
> I still opine, not quite enough to warrant being built-ins).

These are close to what ABC does with quantifiers.  There, you can
write

    IF EACH x IN sequence HAS x > 0: ...

ABC has the additional quirk that if there's an ELSE branch, you can
use x in it (as a "counter-example").

In Python, you could write this as

    if alltrue(x > 0 for x in sequence): ...

but the current design doesn't expose x to the else branch.
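The quoted pair runs as-is; a transcription with a couple of probes (the for/else clauses always fire here, since neither loop contains a break):

```python
def alltrue(seq):
    # return the first false item, or True if every item is true
    for x in seq:
        if not x: return x
    else:
        return True

def anytrue(seq):
    # return the first true item, or False if no item is true
    for x in seq:
        if x: return x
    else:
        return False

print(alltrue([1, 2, 0, 3]))    # -> 0, the first false item
print(anytrue([0, '', 'hit']))  # -> hit
```

These foreshadow the all() and any() builtins that eventually arrived in Python 2.5, with the difference that the builtins return strictly True/False rather than the witness item.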
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 17:18:42 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 17:18:57 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: Your message of "Sat, 25 Oct 2003 05:47:11 PDT."
References:
Message-ID: <200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com>

> Modified Files:
>     Tag: release23-maint
>     bltinmodule.c
> Log Message:
> changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but
> fixes performance bug with sum(lotsoflists, []).

I think this ought to be reverted, both in 2.3 and 2.4.  Consider this
code:

    empty = []
    for i in range(10):
        print sum([[x] for x in range(i)], empty)

The output used to be:

    []
    [0]
    [0, 1]
    [0, 1, 2]
    [0, 1, 2, 3]
    [0, 1, 2, 3, 4]
    [0, 1, 2, 3, 4, 5]
    [0, 1, 2, 3, 4, 5, 6]
    [0, 1, 2, 3, 4, 5, 6, 7]
    [0, 1, 2, 3, 4, 5, 6, 7, 8]

But now it is:

    []
    [0]
    [0, 0, 1]
    [0, 0, 1, 0, 1, 2]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 8]

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 18:10:56 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:11:09 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
In-Reply-To: Your message of "Sat, 25 Oct 2003 09:25:47 EDT."
<1067088346.10257.71.camel@anthem>
References: <200310251232.55044.aleaxit@yahoo.com>
	<1067088346.10257.71.camel@anthem>
Message-ID: <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com>

> On Sat, 2003-10-25 at 06:32, Alex Martelli wrote:
> > I guess it had been a while since I ran 'make test' on the 2.4
> > cvs...  can't find this bug in the bugs db and I'd just like a
> > quick sanity check (if the bug's already there or if I'm doing
> > something weird) before I add it.
>
> Jeremy and I have both seen similar hangs in 2.4cvs.
>
> -Barry

Ditto for me on RH9.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 18:20:13 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:21:21 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Sat, 25 Oct 2003 10:29:32 EDT."
	<20031025142932.GZ5842@epoch.metaslash.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
	<1b5501c398be$ff1832d0$891e140a@YODA>
	<200310251603.17845.aleaxit@yahoo.com>
	<20031025142932.GZ5842@epoch.metaslash.com>
Message-ID: <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com>

> One thing that I've always wondered about, why can't one do:
>
> def reset_foo():
>     global foo = []   # declare as global and do assignment

Nothing deep -- it just never occurred to me.  I was mimicking ABC's
"SHARE foo", which doesn't have this because its syntax for assignment
is the more verbose "PUT value IN variable".

I don't think it'll entice Alex though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Sat Oct 25 18:22:38 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sat Oct 25 18:22:45 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com>
Message-ID: <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz>

> It's complex.
> Can you explain the complete semantics of 'outer' as simply as:
>
>     global <name> [in <scope>]
>
>     Binds and uses <name> in another scope.  If 'in <scope>' is omitted
>     then the name is bound and used in the scope of the current module.

    global <name>

        Assignments to <name> rebind it in the next outer scope where
        it is already bound, or in the module scope if there is no
        existing binding.

Seems about the same length as yours.

> <description of the case where the name is bound between the
> current scope and the scope where the programmer was expecting the
> name to be bound>

Such comments belong in warning messages about the change issued
during the transitional phase, not in the language definition.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Sat Oct 25 18:25:22 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:25:39 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Sat, 25 Oct 2003 13:15:29 EDT."
	<20031025171529.GA18617@panix.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer>
	<200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
	<200310251508.15634.aleaxit@yahoo.com>
	<20031025171529.GA18617@panix.com>
Message-ID: <200310252225.h9PMPMT07897@12-236-54-216.client.attbi.com>

[Alex]
> > So, if I've followed correctly the lots of python-dev mail over the last
> > few days, that person (Aahz) is roughly +0 on list.sorted as classmethod
> > and thus we can go ahead.  Right?

[Aahz]
> I'm not the person who objected on non-English speaking grounds, and I'm
> -0 because I don't like using grammatical tense as the differentiator;
> as I said, I'd expect sorted() to be a predicate.  If we're doing this
> (and it seems we are), I still prefer copysort() for clarity.  But I'm
> not objecting to sorted().

Predicates start with 'is'.
For example, s.lower() converts s to lowercase; s.islower() asks if s is lowercase. I'm -1 on list.copysort() as a constructor/factory. Since whoever didn't like sorted() before hasn't piped up now, I think we should go ahead and implement the list.sorted() constructor. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Sat Oct 25 18:38:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 18:38:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com> References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> <20031025142932.GZ5842@epoch.metaslash.com> <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com> Message-ID: <200310260038.32881.aleaxit@yahoo.com> On Sunday 26 October 2003 12:20 am, Guido van Rossum wrote: > > One thing that I've always wondered about, why can't one do: > > > > def reset_foo(): > > global foo = [] # declare as global and do assignment > > Nothing deep -- it just never occurred to me. I was mimicking ABC's > "SHARE foo", which doesn't have this because its syntax for assignment > is the more verbose "PUT value IN variable". > > I don't think it'll entice Alex though. :-) Ah, you haven't seen my answer to it? I think it meets most of my objections -- all but the distaste for the keyword 'global' itself -- and I could definitely live with this more happily than with any other use of 'global'. Please see my direct response to Neal for more details. Alex From guido at python.org Sat Oct 25 18:48:12 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 18:48:26 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Sat, 25 Oct 2003 17:05:04 +0200." 
<200310251705.04439.aleaxit@yahoo.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
	<1b5501c398be$ff1832d0$891e140a@YODA>
	<200310251603.17845.aleaxit@yahoo.com>
	<20031025142932.GZ5842@epoch.metaslash.com>
	<200310251705.04439.aleaxit@yahoo.com>
Message-ID: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com>

[Neal]
> > def reset_foo():
> >     global foo = []   # declare as global and do assignment

[Alex]
> Indeed, you can see 'global', in this case, as a kind of "operator
> keyword", modifying the scope of foo in an assignment statement.
>
> I really have two separate peeves against global (not necessarily
> in order of importance, actually):
>
> -- it's the wrong keyword, doesn't really _mean_ "global"

I haven't heard anyone else in this thread agree with you on that one.
I certainly don't think it's of earth-shattering ugliness.

> -- it's a "declarative statement", the only one in Python (ecch)
>    (leading to weird uncertainty about where it can be placed)

I'd be happy to entertain proposals for reasonable restrictions on
where 'global' can be placed.  (Other placements would have to be
deprecated at first.)

> -- "side-effect" assignment to globals, such as in def, class &c
>    statements, is quite tricky and error-prone, not useful

Agreed; nobody uses these, but again this can be fixed if we want to
(again we'd have to start deprecating existing use first).  Note that
this is also currently allowed and probably shouldn't:

    def f():
        global x
        for x in ...:
            ...

> Well, OK, _three_ peeves... usual Spanish Inquisition issue...:-)
>
> Your proposal is quite satisfactory wrt solving the second issue,
> from my viewpoint.  It would still create a unique-in-Python
> construct, but not (IMHO) a problematic one.

Well, *every* construct is "unique in Python", isn't it?  Because
Python has only one of each construct, in line with the TOOWTDI zen.
Or do you mean "not seen in other languages"?  I'd disagree -- lots of
languages have something similar, e.g.
"int x = 5;" in C or "var x = 5" in JavaScript. IMO, "global x = 5" is sufficiently similar that it will require no time to learn. > As you point out, > it _would_ be more concise than having to separately [a] say > foo is global then [b] assign something. It would solve any > uncertainty regarding placement of 'global', and syntactically > impede using global variables in "rebinding as side-effect" cases > such as def &c, so the third issue disappears. > > The first issue, of course, is untouched:-). It can't be touched > without choosing a different keyword, anyway. > > So, with 2 resolutions out of 3, I do like your idea. I don't think that Neal's proposal solves #3, unless 'global x = ...' becomes the *only* way. Also, I presume that the following: def f(): global x = 21 x *= 2 print x should continue to be value, and all three lines should reference the same variable. But #3 is moot IMO, it can be solved without mucking with global at all, by simply making the parser reject 'class X', 'def X', 'import X' and 'for X' when there's also a 'global X' in effect. Piece of cake. > However, I don't think we can get there from here. Guido has > explained that the parser must be able to understand a statement > that starts with 'global' without look-ahead; I don't know if it can > keep accepting, for bw compat and with a warning, the old > global xx > while also accepting the new and improved > global xx = 23 There is absolutely no problem recognizing this. > But perhaps it's not quite as hard as the "global.xx = 23" would > be. I find Python's parser too murky & mysterious to feel sure. If you can understand what code can be recognized by a pure recursive descent parser with one token lookahead and no backtracking, you can understand what Python's parser can handle. 
> Other side issues: if you rebind a module-level xx in half a > dozen places in your function f, right now you only need ONE > "global xx" somewhere in f (just about anywhere); with your > proposal, you'd need to flag "global xx = 23" at each of the > several assignments to that xx. Now, _that suits me just > fine_: indeed, I LOVE the fact that a bare "xx = 23" is KNOWN > to set a local, and you don't have to look all over the place for > declarative statements that might affect its semantics You may love this for assignments, but for *using* variables there is already no such comfort. Whether "print xx" prints a local or global variable depends on whether there's an assignment to xx anywhere in the same scope. So I don't think that is a very strong argument. > (hmmm, > perhaps a 4th peeve vs global, but I see it as part and parcel > of peeve #2:-). But globals-lovers might complain that it makes > using globals a TAD less convenient. (Personally, I would not > mind THAT at all, either: if as a result people use 10% fewer > globals and replace them with arguments or classes etc, I think > that will enhance their programs anyway;-). > > > So -- +1, even though we may need a different keyword to > solve [a] the problem of getting there from here AND [b] my > peeve #1 ...:-). --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Sat Oct 25 18:50:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Sat Oct 25 18:51:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310252225.h9PMPMT07897@12-236-54-216.client.attbi.com> Message-ID: <003f01c39b4a$70db20c0$e841fea9@oemcomputer> > Since whoever didn't like sorted() before hasn't piped up now, I think > we should go ahead and implement the list.sorted() constructor. Okay, I'll modify the patch to be a classmethod called sorted() and will assign to Alex for second review. 
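The behaviour being signed up for here can be sketched as a plain function: copy the input into a new list, sort that copy in place, and hand it back, leaving the input untouched ('copysort' is Aahz's suggested name; essentially this behaviour later shipped in Python 2.4 as the builtin sorted()):

```python
def copysort(iterable):
    # build a fresh list, sort it in place, return it;
    # the caller's object is never modified
    result = list(iterable)
    result.sort()
    return result

data = (3, 1, 2)
print(copysort(data))  # -> [1, 2, 3]; 'data' is still (3, 1, 2)
```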
Raymond From greg at cosc.canterbury.ac.nz Sat Oct 25 18:51:50 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 18:51:58 2003 Subject: [Python-Dev] Re: Re: closure semantics In-Reply-To: Message-ID: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> > But what about name mismatches? Global statements allows functions to > create 'new' variables in the module scope and not just 'existing' > ones. What about for in-between scopes? It's probably a misfeature of the global statement that it allows that, but if we're going to re-use it in the form of a "global x in scope" statement, we should keep the behaviour the same for nested scopes in the interests of consistency. Maybe this is an argument for introducing an "outer" statement, which requires an existing binding (determined by existence of an assignment at compile time) even for the module scope, and deprecating "global" altogether. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From skip at pobox.com Sat Oct 25 19:03:17 2003 From: skip at pobox.com (Skip Montanaro) Date: Sat Oct 25 19:03:34 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com> <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz> Message-ID: <16283.309.987234.133955@montanaro.dyndns.org> Greg> global Greg> Assignments to rebind it in the next outer scope where it Greg> is already bound, or in the module scope if there is no existing Greg> binding. Greg> Seems about the same length as yours. Is that compatible with current use? I think the current semantics are that global always binds name to an object with that name at module scope. 
I thought the point of this discussion was to allow the programmer to
specify the precise scope of the object to which the variable would be
bound, in the face of possibly multiple occurrences of the name.
Using the existing syntax you have to pick one rather arbitrarily,
either the module scope or the first place you find.  (Again, I have
never used nested functions, so this is more of a pedantic argument
than anything for me.  Still, it seems if you're going to change
things you should make it so any instance of an outer variable can be
specified.)

Skip

From aleaxit at yahoo.com  Sat Oct 25 19:04:18 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 19:04:24 2003
Subject: [Python-Dev] product()
In-Reply-To: <200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310251439.12449.aleaxit@yahoo.com>
	<200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
Message-ID: <200310260104.18806.aleaxit@yahoo.com>

On Saturday 25 October 2003 11:14 pm, Guido van Rossum wrote:
> > it might be "alltrue" and "anytrue" -- the short-circuiting ones,
> > returning the first true or false item found respectively, as in:
> >
> > def alltrue(seq):
> >     for x in seq:
> >         if not x: return x
> >     else:
> >         return True
> >
> > def anytrue(seq):
> >     for x in seq:
> >         if x: return x
> >     else:
> >         return False
> >
> > these seem MUCH more generally useful than 'product' (but,
> > I still opine, not quite enough to warrant being built-ins).
>
> These are close to what ABC does with quantifiers.  There, you can
> write
>
>     IF EACH x IN sequence HAS x > 0: ...
>
> ABC has the additional quirk that if there's an ELSE branch, you can
> use x in it (as a "counter-example").
>
> In Python, you could write this as
>
>     if alltrue(x > 0 for x in sequence): ...
>
> but the current design doesn't expose x to the else branch.
Right -- it would return the condition being tested, x>0, when
non-true, so just a False; there is no natural way for it to get the
underlying object on which it's testing it.

This is somewhat the same problem as Peter Norvig's original Top(10)
accumulator example: if you just pass to it the iterator of the
comparison keys, it can't collect the 10 items with the highest
comparison keys.

Maybe alltrue(sequence, pred=lambda x: x>0) might be better (pred
would default to None, meaning to test the items in the first
argument, the iterator, for true/false directly):

    def alltrue(seq, pred=None):
        if pred is None:
            def pred(x): return x
            def wrap(x): return x
        else:
            class wrap(object):
                def __init__(self, x): self.counterexample = x
                def __nonzero__(self): return False
        for x in seq:
            if not pred(x): return wrap(x)
        else:
            return True

or something like that (I do think we need the wrap class, so that
alltrue can return an object that evaluates to false but still allows
the underlying "counter-example" to be retrieved if needed).  Use, of
course, would have to be something like:

    allpositives = alltrue(sequence, pred=lambda x: x>0)
    if allpositives:
        print "wow, all positives!"
    else:
        print "nope, some nonpositives, e.g.", allpositives.counterexample

Unfortunately, this usage is pushing at TWO not-strengths of Python:
no neat way to pass an unnamed predicate (lambda ain't really all that
neat...) AND no assignment-as-expression.  So, I don't think it would
really catch on all that much.

Alex

From just at letterror.com  Sat Oct 25 19:09:12 2003
From: just at letterror.com (Just van Rossum)
Date: Sat Oct 25 19:09:20 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com>
Message-ID:

It seems no one liked (or remembered) an idea I proposed last
February, but I'm going to repost it anyway: How about adding a
"rebinding" operator, for example spelled ":=":

    a := 2

It would mean: bind the value 2 to the nearest scope that defines 'a'.
Original post:

    http://mail.python.org/pipermail/python-dev/2003-February/032764.html

A better summary by someone else who liked it:

    http://groups.google.com/groups?selm=mailman.1048248875.10571.python-list%40python.org

Advantages: no declarative statement (I don't like global much to begin
with, but much less for scope declarations other than what it means
now).  It's a nice addition to the current scoping rule: an assignment
IS a scope declaration.

Possible disadvantage: you can only rebind to the nearest scope that
defines the name.  If there's a farther scope that also defines that
name you can't reach that.  But that's nicely symmetrical with how
_reading_ values from nested scopes works today, shadowing is nothing
new.

Ideally, augmented assignments would also become "rebinding".  However,
this may have compatibility problems.

Just

From aleaxit at yahoo.com  Sat Oct 25 19:11:51 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 19:11:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
	<200310251132.04686.aleaxit@yahoo.com>
	<200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
Message-ID: <200310260111.51509.aleaxit@yahoo.com>

On Saturday 25 October 2003 10:20 pm, Guido van Rossum wrote:
> > sum looks cooler, but it can be an order of magnitude slower
> > than the humble loop of result.extend calls.  We could fix this
> > specific performance trap by specialcasing in sum those cases
> > where the result has a += method -- hmmm... would a patch for
> > this performance bug be accepted for 2.3.* ...?  (I understand and
> > approve that we're keen on avoiding adding functionality in any
> > 2.3.*, but fixed-functionality performance enhancements should
> > be just as ok as fixes to functionality bugs, right?)
>
> No way.
There's nothing that guarantees that a+=b has the same semantics as
a+b, and in fact for lists it doesn't.

You mean because += is more permissive (accepts any sequence RHS while
+ insists the RHS be specifically a list)?  I don't see how this would
make it bad to use += instead of + -- if we let the user sum up a mix
of (e.g.) strings and tuples, why are we hurting him?

And it seemed to me that cases in which the current semantics of
"a = a + b" would work correctly, while the potentially-faster
"a += b" wouldn't, could be classified as "weird" and ignored in
favour of not letting "sum" be an orders-of-magnitude performance trap
for such cases (see my performance measurements in other posts of mine
to this thread).

Still, you're the boss.  Sorry -- I'll immediately revert the commits
I had made and be less eager in the future.

> I wouldn't even want this for 2.4.

Aye aye, cap'n.  I'll revert the 2.4 commits too, then.  Sorry.

Alex

From guido at python.org  Sat Oct 25 19:16:35 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 19:16:49 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Sun, 26 Oct 2003 01:11:51 +0200."
	<200310260111.51509.aleaxit@yahoo.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
	<200310251132.04686.aleaxit@yahoo.com>
	<200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
	<200310260111.51509.aleaxit@yahoo.com>
Message-ID: <200310252316.h9PNGZc08136@12-236-54-216.client.attbi.com>

> > No way.  There's nothing that guarantees that a+=b has the same
> > semantics as a+b, and in fact for lists it doesn't.
>
> You mean because += is more permissive (accepts any sequence
> RHS while + insists the RHS be specifically a list)?  I don't see how
> this would make it bad to use += instead of + -- if we let the user
> sum up a mix of (e.g.) strings and tuples, why are we hurting him?

We specifically decided that sum() wasn't allowed for strings, because
it's a quadratic algorithm.
Other sequences are just as bad, we just didn't expect that to be a common case. Also see my not-so-far-fetched example of a semantic change. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sat Oct 25 19:21:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 19:23:09 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <20031024184850.GB34310@hishome.net> Message-ID: <200310252321.h9PNLKC05250@oma.cosc.canterbury.ac.nz> > "create index \{table}_lid1_idx on \{table}(\{lid1})" That looks horrible. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz at pythoncraft.com Sat Oct 25 19:23:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 19:23:29 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <16279.58018.40303.136992@montanaro.dyndns.org> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <16279.58018.40303.136992@montanaro.dyndns.org> Message-ID: <20031025232326.GA23772@panix.com> On Thu, Oct 23, 2003, Skip Montanaro wrote: > > >>> import __main__ as m # I know, not general, just for trial > >>> m.c=3 > > Isn't (in 3.0) the notion of being able to modify another module's globals > supposed to get restricted to help out (among other things) the compiler? > If so, this use, even though it's not really modifying a global in another > module, might not work forever. That use had better continue working. What won't work is m.len = my_len() and even there, there's still some debate about ways to structure permitting it for the use of debuggers. 
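The quoted `m.c = 3` trick is plain attribute assignment on a module object. A self-contained sketch, using a synthetic module so it does not depend on actually running under `__main__`:

```python
import types

m = types.ModuleType('demo_mod')   # stands in for 'import __main__ as m'
m.c = 3                            # binds the name c in that module's namespace
# the module's globals dict reflects the new binding
assert m.__dict__['c'] == 3
```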
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From aahz at pythoncraft.com Sat Oct 25 19:25:15 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 19:25:18 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <200310251058.05704.aleaxit@yahoo.com> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <200310251058.05704.aleaxit@yahoo.com> Message-ID: <20031025232515.GB23772@panix.com> On Sat, Oct 25, 2003, Alex Martelli wrote: > > Or, we can make the _compiler_ aware of what is going on (and get just the > same semantics as global) by accepting either a non-statement keyword > (scope, as I suggested elsewhere) or a magicname for import, e.g. > import __me__ as Barry suggested. Then __me__.x=23 can have just the > same semantics as today "x=23" has if there is some "global x" somewhere > around, and indeed it could be compiled into the same bytecode if __me__ > was sufficiently special to the compiler. We've already got ``import __main__``; what does __me__ gain? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From aleaxit at yahoo.com Sat Oct 25 19:31:47 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 19:31:53 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16282.41043.939103.536103@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> <16282.41043.939103.536103@montanaro.dyndns.org> Message-ID: <200310260131.47500.aleaxit@yahoo.com> On Saturday 25 October 2003 06:09 pm, Skip Montanaro wrote: ... > I don't see this as a big problem now. In my own code I rarely use global, > and never use nested functions. I suspect that's true for most people. 
No doubt it's true that most people only care about their own code, and
don't have much to do with teaching and advising others, mentoring them,
maintaining and enhancing code originally written by others, etc.  So,
since my professional activity typically encompasses these weird
activities, not of interest to most people, and that gives me a
different viewpoint from that of most people, I guess it's silly of me
to share it.  Sorry if my past well-meant eagerness caused problems;
it's obviously more sensible for people who never use nested functions
to help shape their syntax and semantics, than for those who DO use
them, after all -- and similarly, for people who only care about their
own code to help determine if 'global' is, or isn't, a cause of problems
out there in the wide world of Python newbies and users far from
python-dev.  Alex  From aleaxit at yahoo.com Sat Oct 25 19:45:23 2003
From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 19:45:39
2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To:
<200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> References:
<200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> Message-ID:
<200310260145.23094.aleaxit@yahoo.com> On Saturday 25 October 2003 11:18
pm, Guido van Rossum wrote: > > Modified Files: > > Tag: release23-maint
> > bltinmodule.c > > Log Message: > > changed builtin_sum to use
PyNumber_InPlaceAdd -- unchanged semantics but > > fixes performance bug
with sum(lotsoflists, []). > > I think this ought to be reverted, both
in 2.3 and 2.4.  Consider this > code: I have reverted it; it's
obviously true that, by causing side effects on the 2nd argument, the
fix as I had committed it could change semantics.  I apologize for not
thinking of this (and adding the missing unit-tests to catch this, see
next paragraph).
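The semantic difference, and the side effect on the caller's start value, are easy to reproduce in pure Python. In this sketch, `sum_iadd` is a hypothetical stand-in for the reverted C patch, not the actual code:

```python
a = [1, 2]
alias = a
a = a + [3]                # builds a new list and rebinds a
assert alias == [1, 2]     # the alias still sees the old list

b = [1, 2]
alias = b
b += [3]                   # mutates the existing list in place
assert alias == [1, 2, 3]

c = [1]
c += (2,)                  # += accepts any iterable RHS...
try:
    c = c + (2,)           # ...but + rejects a non-list RHS
except TypeError:
    pass

def sum_iadd(seq, start):
    # hypothetical sketch mirroring the reverted optimization
    total = start
    for item in seq:
        total += item      # in-place add mutates start itself
    return total

start = []
result = sum_iadd([[1], [2]], start)
assert start == [1, 2]     # the caller's start list was changed
```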
If it was up to me I would still pursue the possibility of using PyNumber_InPlaceAdd, for example by only doing so if the second argument could first be successfully copy.copy'ed into the result and falling back to PyNumber_Add otherwise. The alternative of leaving sum as a performance trap for the unwary -- an "attractive nuisance" in legal terms -- would appear to me to be such a bad situation, as to warrant such effort (including adding unit-tests to ensure sum does not alter its second argument, works correctly with a non-copyable 2nd argument, etc). However, it's YOUR decision, and you have already made it clear in another mail that your objections to remedying this performance bug are such that no possible solution will satisfy them. If a type gives different results for "a = a + b" vs "a += b", there is no way sum can find this out; and while, were it my decision, I would not care to support such weird cases at such a huge performance price, it's not my decision. Similarly for types which let you do "a = copy.copy(b)" but do NOT return a valid copy of b, or return b itself even though it's mutable, and so on weirdly. I'm just very sad that I didn't think of this performance-trap back when the specs of sum were first being defined. Oh well:-(. Can I at least add a warning about this performance trap to the docs at http://www.python.org/doc/current/lib/built-in-funcs.html ? Alex From greg at cosc.canterbury.ac.nz Sat Oct 25 21:10:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 21:10:41 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <16283.309.987234.133955@montanaro.dyndns.org> Message-ID: <200310260110.h9Q1ALH05576@oma.cosc.canterbury.ac.nz> > Is that compatible with current use? I think the current semantics are that > global always binds name to an object with that name at module scope. No, it's not quite compatible, but I don't think it's likely to break anything much in practice. 
> I thought the point of this discussion was to allow the programmer >
to specify the precise scope of the object to which the variable > would
be bound, in the face of possibly multiple occurrences of the > name.
In general the point seems to be simply about finding *some* way to bind
intermediate variables.  Some suggestions have included a way to
explicitly identify the scope, but that seems like an unnecessary
complication to me.  Greg Ewing, Computer Science Dept,
+--------------------------------------+ University of Canterbury, | A
citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned
subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz
+--------------------------------------+ From greg at
cosc.canterbury.ac.nz Sat Oct 25 21:15:01 2003 From: greg at
cosc.canterbury.ac.nz (greg@cosc.canterbury.ac.nz) Date: Sat Oct 25
21:15:09 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort
option In-Reply-To: <20031025171529.GA18617@panix.com> Message-ID:
<200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> > If we're doing
this > (and it seems we are), I still prefer copysort() for clarity.
"copysort" sounds like the name of some weird sorting algorithm to me.
I'd prefer "sortedcopy" (although I suppose that could be read as a
predicate, too -- "is x a sorted copy of y?")  Greg Ewing, Computer
Science Dept, +--------------------------------------+ University of
Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand
| wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sat Oct 25 21:20:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 21:20:41 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Message-ID: <200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> > How about adding a "rebinding" operator, for example spelled ":=": > > a := 2 I expect Guido would object to that on the grounds that it's conferring arbitrary semantics on a symbol. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sat Oct 25 23:26:03 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 23:26:13 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: Your message of "Sun, 26 Oct 2003 01:45:23 +0200." <200310260145.23094.aleaxit@yahoo.com> References: <200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> <200310260145.23094.aleaxit@yahoo.com> Message-ID: <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> > > > changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but > > > fixes performance bug with sum(lotsoflists, []). > > > > I think this ought to be reverted, both in 2.3 and 2.4. Consider this > > code: > > I have reverted it; it's obviously true that, by causing side effects on the > 2nd argument, the fix as I had commited it could change semantics. I > apologize for not thinking of this (and adding the missing unit-tests to > catch this, see next paragraph). 
> > If it was up to me I would still pursue the possibility of using > PyNumber_InPlaceAdd, for example by only doing so if the second > argument could first be successfully copy.copy'ed into the result and falling > back to PyNumber_Add otherwise. The alternative of leaving sum as a > performance trap for the unwary -- an "attractive nuisance" in legal terms -- > would appear to me to be such a bad situation, as to warrant such > effort (including adding unit-tests to ensure sum does not alter its second > argument, works correctly with a non-copyable 2nd argument, etc). > > However, it's YOUR decision, and you have already made it clear in > another mail that your objections to remedying this performance bug are > such that no possible solution will satisfy them. If a type gives different > results for "a = a + b" vs "a += b", there is no way sum can find this out; > and while, were it my decision, I would not care to support such weird > cases at such a huge performance price, it's not my decision. Similarly > for types which let you do "a = copy.copy(b)" but do NOT return a valid > copy of b, or return b itself even though it's mutable, and so on weirdly. > > I'm just very sad that I didn't think of this performance-trap back when > the specs of sum were first being defined. Oh well:-(. Oh, but we all *did* think of it. For strings. :-) > Can I at least add > a warning about this performance trap to the docs at > http://www.python.org/doc/current/lib/built-in-funcs.html ? Definitely. You know, I don't even think that I would consider using sum() if I wanted to concatenate a bunch of lists. Let's use sum() for numbers. Big deal. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 25 23:29:53 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 23:30:09 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Sun, 26 Oct 2003 14:20:33 +1300." 
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> References:
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> Message-ID:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> > > How about
adding a "rebinding" operator, for example spelled ":=": > > > > a := 2
> > I expect Guido would object to that on the grounds that > it's
conferring arbitrary semantics on a symbol.  Hardly arbitrary (I have
fond memories of several languages that used :=).  But what is one to
make of a function that uses both a := 2 and a = 2 ???  --Guido van
Rossum (home page: http://www.python.org/~guido/) From guido at
python.org Sat Oct 25 23:36:20 2003 From: guido at python.org (Guido van
Rossum) Date: Sat Oct 25 23:36:34 2003 Subject: [Python-Dev] product()
In-Reply-To: Your message of "Sun, 26 Oct 2003 01:04:18 +0200."
<200310260104.18806.aleaxit@yahoo.com> References:
<002401c39907$0176f5a0$e841fea9@oemcomputer>
<200310251439.12449.aleaxit@yahoo.com>
<200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
<200310260104.18806.aleaxit@yahoo.com> Message-ID:
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> > > These are
close to what ABC does with quantifiers.  There, you can > > write > > >
> IF EACH x IN sequence HAS x > 0: ... > > > > ABC has the additional
quirk that if there's an ELSE branch, you can > > use x in it (as a
"counter-example"). > > > > In Python, you could write this as > > > >
if alltrue(x > 0 for x in sequence): ... > > > > but the current design
doesn't expose x to the else branch. > Right -- it would return the
condition being tested, x>0, when non-true, > so just a False; there is
no natural way for it to get the underlying > object on which it's
testing it.  This is somewhat the same problem as > Peter Norvig's
original Top(10) accumulator example: if you just pass to > it the
iterator of the comparison keys, it can't collect the 10 items with >
the highest comparison keys.
>
> Maybe
>     alltrue(sequence, pred=lambda x: x>0)
> might be better (pred would default to None meaning to test the items
> in the first argument, the iterator, for true/false directly):
>
> def alltrue(seq, pred=None):
>     if pred is None:
>         def pred(x): return x
>         def wrap(x): return x
>     else:
>         class wrap(object):
>             def __init__(self, x): self.counterexample = x
>             def __nonzero__(self): return False
>     for x in seq:
>         if not pred(x): return wrap(x)
>     else:
>         return True
>
> or something like that (I do think we need the wrap class, so that
> alltrue can return an object that evaluates to false but still allows
> the underlying "counter-example" to be retrieved if needed).
>
> Use, of course, would have to be something like:
>
> allpositives = alltrue(sequence, pred=lambda x: x>0)
> if allpositives: print "wow, all positives!"
> else: print "nope, some nonpositives, e.g.", allpositives.counterexample
>
> Unfortunately, this usage is pushing at TWO not-strengths of Python:
> no neat way to pass an unnamed predicate (lambda ain't really all
> that neat...) AND no assignment-as-expression.  So, I don't think it
> would really catch on all that much.

Yeah.  An explicit for loop sounds much better in cases where we want
to know which x failed the test.  Let alltrue() be as simple as
originally proposed.

Do we need allfalse() and anytrue() and anyfalse() too?  These can all
easily be gotten by judicious use of 'not'.  I think ABC has EACH,
SOME and NO (why not all four?  who knows).
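The "judicious use of 'not'" is mechanical. A sketch using the thread's names (plain boolean variants of the quantifiers; none of these were built-ins):

```python
def alltrue(seq):
    # True if every item is true (short-circuits on the first false one)
    for x in seq:
        if not x:
            return False
    return True

def anytrue(seq):
    # True if at least one item is true (short-circuits on the first)
    for x in seq:
        if x:
            return True
    return False

# the other two quantifiers follow by judicious use of 'not'
def allfalse(seq):
    return alltrue(not x for x in seq)

def anyfalse(seq):
    return anytrue(not x for x in seq)
```

These return plain booleans, which is simpler than the return-the-item variants discussed upthread.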
--Guido van Rossum (home page: http://www.python.org/~guido/) From
aleaxit at yahoo.com Sun Oct 26 04:01:30 2003 From: aleaxit at yahoo.com
(Alex Martelli) Date: Sun Oct 26 04:01:38 2003 Subject: [Python-Dev]
product() In-Reply-To:
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> References:
<002401c39907$0176f5a0$e841fea9@oemcomputer>
<200310260104.18806.aleaxit@yahoo.com>
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> Message-ID:
<200310261001.31072.aleaxit@yahoo.com> On Sunday 26 October 2003 04:36,
Guido van Rossum wrote: ... > > Unfortunately, this usage is pushing at
TWO not-strengths of Python: > > no neat way to pass an unnamed
predicate (lambda ain't really all > > that neat...) AND no
assignment-as-expression.  So, I don't think it > > would really catch
on all that much. > > Yeah.  An explicit for loop sounds much better in
cases where we want > to know which x failed the test.  Let alltrue() be
as simple as > originally proposed.  Yeah, makes sense. > Do we need
allfalse() and anytrue() and anyfalse() too?  These can all > easily be
gotten by judicious use of 'not'.  I think ABC has EACH, > SOME and NO
(why not all four? who knows).  If we were discussing language or
built-ins I would argue for "only one obvious way to do it", but I don't
think this is all that important once we are discussing standard-library
functions (which IS the case here, right?).  Still, I'm not sure I see
the benefits of overlapping functionality in this specific case.  Alex
From greg at cosc.canterbury.ac.nz Sun Oct 26 04:13:29 2003 From: greg
at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 26 04:14:13 2003
Subject: [Python-Dev] replacing 'global' In-Reply-To:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID:
<200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> > Hardly arbitrary
(I have fond memories of several languages that used :=).  But all the
ones I know of use it for ordinary assignment.
We'd be having two kinds of assignment, and there's no prior art to
suggest which should be = and which :=.  That's the "arbitrary" part.

The only language I can remember seeing which had two kinds of
assignment was Simula, which had := for value assignment and :- for
reference assignment (or was it the other way around? :-)  I always
thought that was kind of weird.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From just at letterror.com  Sun Oct 26 04:53:18 2003
From: just at letterror.com (Just van Rossum)
Date: Sun Oct 26 04:53:25 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> Hardly arbitrary (I have fond memories of several languages that used
> :=).

I think augmented assignment should (ideally) also be rebinding, and :=
kind of looks like an augmented assignment, so I don't think it's all
that bad.  I'd be used to it in a snap.

But: let's not get carried away with this particular spelling, the main
question is: "is it a good idea to have a rebinding assignment
operator?" (regardless of how that operator is spelled).  Needless to
say, I think it is.

> But what is one to make of a function that uses both
>
>     a := 2
>
> and
>
>     a = 2
>
> ???

Simple, "a = 2" means 'a' is local to that function, so "a := 2" will
rebind in the same scope.  So the following example will raise
UnboundLocalError:

    def foo():
        a := 3
        a = 2

And this will just work (but is kind of pointless):

    def foo():
        a = 2
        a := 3

And this would be a substitute for the global statement:

    a = 2
    def foo():
        a := 3

(Alex noted in private mail that one disadvantage of this idea is that
it makes using globals perhaps TOO easy...)
Just From aleaxit at yahoo.com Sun Oct 26 05:09:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:10:03 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> Message-ID: <200310261109.56801.aleaxit@yahoo.com> On Sunday 26 October 2003 04:26, Guido van Rossum wrote: ... > > I'm just very sad that I didn't think of this performance-trap back > > when the specs of sum were first being defined. Oh well:-(. > > Oh, but we all *did* think of it. For strings. :-) Yeah, and your decision to forbid them (while my first prototype tried forwarding to ''.join) simplified sum's implementation a lot. Unfortunately we cannot easily distinguish numbers from sequences in the general case, when user-coded classes are in play; so we can't easily forbid sequences and allow numbers. Exactly the same underlying reason as a bug I just opened on SF: if x is an instance of a class X having __mul__ but not __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError (with a message that shows the problem -- x is being taken as a sequence). When X is intended as a number class, this asymmetry between multiplication and (e.g.) addition violates the principle of least surprise. > > Can I at least add > > a warning about this performance trap to the docs at > > http://www.python.org/doc/current/lib/built-in-funcs.html ? > > Definitely. > > You know, I don't even think that I would consider using sum() if I > wanted to concatenate a bunch of lists. Let's use sum() for numbers. > Big deal. Currently the docs say that sum is mostly meant for numbers. By making that observation into a stronger warning, we can at least be somewhat helpful to those few Python users who read the manuals;-). 
If sum just couldn't be used for a list of lists it would indeed not be a big problem. The problem is that it can, it's just (unexpectedly for the naive user) dog-slow, just like a loop of += on a list of strings. And people WILL and DO consider and try and practice _any_ use for a language or library feature. The problem of the += loop on strings is essentially solved by psyco, which has tricks to catch that and make it almost as fast as ''.join; but psyco can't get into a built-in function such as sum, and thus can't help us with the performance trap there. As you've indicated that for 2.4 the risk of semantics changes to sum in weird cases _can_ be at least considered (you're still opposed but open to perhaps being convinced) I hope to get something for that (with a copy.copy of the "accumulator" and in-place addition if that succeeds, falling back to plain addition otherwise) and all the unit tests needed to show it makes sense. An aside...: One common subtheme of this and other recent threads here and on c.l.py is that, as we think of "accumulator functions" to consume iterators, we should not ignore the mutating methods (typically returning None) that are NOT appropriate for list comprehensions just as they weren't for map and friends. A substantial minority of intermediate Python users, knowing or feeling that loops coded in Python aren't as fast as those that happen inside C-coded funcs such as sum, those in itertools, etc, is NOT enthusiastic about coding e.g. "for x in stuff: tot += x". Most often their performance focus is of course inappropriate, but it's hard to uproot it. 
So, in a typical example, we might have:

L = [ [x] for x in xrange(1000) ]

def aloop(L=L):
    tot = []
    for x in L:
        tot += x
    return tot

def asum(L=L):
    return sum(L, [])

def amap(L=L):
    tot = []
    map(tot.extend, L)
    return tot

With the now-regressed fix, this gave:

[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.aloop()'
1000 loops, best of 3: 640 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.asum()'
1000 loops, best of 3: 480 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.amap()'
1000 loops, best of 3: 790 usec per loop

so sum could be touted as "the only obvious solution" -- shortest,
neatest, fastest... IF it were indeed fast!-)  Unfortunately, with the
sum change regressed, d.asum times to 8.4e+03 usec per loop, so it
clearly cannot be considered any more:-).

So, there might be space for an accumulator function patterned on map
but [a] which stops on the shortest sequence like zip and [b] does NOT
build a list of results, meant to be called a bit like map is in the
'amap' example above.  itertools is a great little collection of
producers and manipulators of iterators, but the "accumulator
functions" might provide the "one obvious way" to _consume_ iterators
for common cases; and accumulating by calling an accumulator-object's
mutator method, such as tot.extend above, on all items of an iterator,
clearly is pretty common.
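Such a consumer is tiny to prototype in pure Python. Here `consume` is a hypothetical name chosen for illustration; with several iterables it stops at the shortest, like zip:

```python
def consume(func, *iterables):
    # call func on each item (or item-tuple), discarding results;
    # like map, but builds no result list and stops at the shortest iterable
    if len(iterables) == 1:
        for item in iterables[0]:
            func(item)
    else:
        for items in zip(*iterables):
            func(*items)

L = [[x] for x in range(5)]
tot = []
consume(tot.extend, L)   # the 'amap' pattern, without the useless result list
```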
Alex  From aleaxit at yahoo.com Sun Oct 26 05:20:16 2003 From: aleaxit
at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:20:23 2003 Subject:
[Python-Dev] replacing 'global' In-Reply-To:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> References:
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz>
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID:
<200310261120.16246.aleaxit@yahoo.com> On Sunday 26 October 2003 04:29,
Guido van Rossum wrote: > > > How about adding a "rebinding" operator,
for example spelled ":=": > > > > > > a := 2 > > > > I expect Guido
would object to that on the grounds that > > it's conferring arbitrary
semantics on a symbol. > > Hardly arbitrary (I have fond memories of
several languages that used :=).  Now, operator :=) MIGHT indeed be
worth considering -- "rebinding assignment with a smile"!  Yes, of
course := IS a very popular way to denote assignment. > But what is one
to make of a function that uses both > > a := 2 > > and > > a = 2 What
would astonish me least: the presence of a normal binding would ensure
a is local.  I would prefer, therefore, if the compiler AT LEAST warned
about the presence of := at the same scope, and probably I'd be even
happier if the compiler flagged it as an outright error.  I just can't
think of good use cases for wanting both at the same scope on the same
name.  I can think of a dubious one: a style where = would be used as
"initializing declaration" for a name at function start, and all
further re-bindings of the name systematically always used := -- I can
think of people who might prefer that style, but it might be best for
Python to avoid style variance by forbidding it (since it obviously
can't be _mandated_, thanks be:-).  By forbidding compresence of = and
:= on the same name at the same scope, := becomes an unmistakable yet
unobtrusive symbol saying "this assignment here is to a NON-local
name", and thus amply satisfies my long-debated unease wrt "global".
Alex From aleaxit at yahoo.com Sun Oct 26 05:25:46 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:26:02 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <20031025232515.GB23772@panix.com> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <200310251058.05704.aleaxit@yahoo.com> <20031025232515.GB23772@panix.com> Message-ID: <200310261125.46048.aleaxit@yahoo.com> On Sunday 26 October 2003 01:25, Aahz wrote: > On Sat, Oct 25, 2003, Alex Martelli wrote: > > Or, we can make the _compiler_ aware of what is going on (and get just > > the same semantics as global) by accepting either a non-statement > > keyword (scope, as I suggested elsewhere) or a magicname for import, > > e.g. import __me__ as Barry suggested. Then __me__.x=23 can have just > > the same semantics as today "x=23" has if there is some "global x" > > somewhere around, and indeed it could be compiled into the same > > bytecode if __me__ was sufficiently special to the compiler. > > We've already got ``import __main__``; what does __me__ gain? import __main__ works only if the current module is being imported with the name __main__. Most modules will be using a different name most of the time (i.e. except when they're being used as main scripts, e.g. to run tests on them). Similarly, even if I know a module is named foo.py and am willing to hardcode that into the module's source, import foo as __me__ doesn't always work (submodules of packages, modules being run as main scripts for testing). Furthermore, the compiler cannot do anything special on most imports. __me__ would be designed as special (just like __future__ is) and allow the compiler to recognize the situation and do all it wants or needs, thus obviating the need for "declarative statements". 
Alex From aleaxit at yahoo.com Sun Oct 26 05:34:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:35:03 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261134.56982.aleaxit@yahoo.com> On Sunday 26 October 2003 01:09, Just van Rossum wrote: > It seems no one liked (or remembered) an idea I proposed last February, > but I'm going to repost it anyway: > > How about adding a "rebinding" operator, for example spelled ":=": > > a := 2 > > It would mean: bind the value 2 to the nearest scope that defines 'a'. In the light of the current discussion, this looks beautiful. At least if compresence of := and other bindings (= , class, def, for, import, ...) for the same name at the same scope is flagged as an error. I would also suggest for simplicity that := be only allowed in the simplest form of assignment: to a single bare name -- no packing, unpacking, chaining, nor can the LHS be an indexing, slicing, nor dotted name. > Advantages: no declarative statement (I don't like global much to begin > with, but much less for scope declarations other than what it means > now). It's a nice addition to the current scoping rule: an assignment IS > a scope declaration. Yes. Neat. := becomes an unobtrusive but unmistakable indication "I'm binding this name in NON-local scope" and -- if defined with the restrictions I suggest -- meets all of my issues wrt 'global'. > Possible disadvantage: you can only rebind to the nearest scope that > defines the name. If there's a farther scope that also defines that name > you can't reach that. But that's nicely symmetrical with how _reading_ > values from nested scopes works today, shadowing is nothing new. I agree. Reaching other scopes but the "closest" outer one is not a use case of any overriding importance, IMHO. > Ideally, augmented assignments would also become "rebinding". However, > this may have compatibility problems. Unfortunately yes.
It might have been better to define them that way in the first place, but changing them now is dubious. Besides, we could not load them with the restrictions I think should be put on := to make it simplest, sharpest, and most useful. Alex From aleaxit at yahoo.com Sun Oct 26 05:37:40 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:37:44 2003 Subject: [Python-Dev] Re: Re: closure semantics In-Reply-To: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> References: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> Message-ID: <200310261137.40143.aleaxit@yahoo.com> On Sunday 26 October 2003 00:51, Greg Ewing wrote: > > But what about name mismatches? Global statements allows functions to > > create 'new' variables in the module scope and not just 'existing' > > ones. What about for in-between scopes? > > It's probably a misfeature of the global statement that it allows > that, but if we're going to re-use it in the form of a "global x in > scope" statement, we should keep the behaviour the same for nested > scopes in the interests of consistency. > > Maybe this is an argument for introducing an "outer" statement, > which requires an existing binding (determined by existence of > an assignment at compile time) even for the module scope, and > deprecating "global" altogether. I think Just's proposal of := meets all of these issues, too: it doesn't have to, and won't, propagate global's misfeature of allowing creation of new variables in nonlocal scope, and "requires an existing binding" (and allows deprecating global altogether, with a warning in 2.4 etc) in the most natural manner. Alex From aleaxit at yahoo.com Sun Oct 26 05:42:05 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:42:11 2003 Subject: [Python-Dev] test_bsddb blocks while testing popitem (?) 
In-Reply-To: <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com> References: <200310251232.55044.aleaxit@yahoo.com> <1067088346.10257.71.camel@anthem> <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com> Message-ID: <200310261142.05394.aleaxit@yahoo.com> On Sunday 26 October 2003 00:10, Guido van Rossum wrote: > > On Sat, 2003-10-25 at 06:32, Alex Martelli wrote: > > > I guess it had been a while since I ran 'make test' on the 2.4 > > > cvs... can't find this bug in the bugs db and I'd just like a > > > quick sanity check (if the bug's already there or if I'm doing > > > something weird) before I add it. > > > > Jeremy and I have both seen similar hangs in 2.4cvs. > > > > -Barry > > Ditto for me on RH9. So does anybody have a better idea of what's going on...? I can't see what's different in 2.4cvs vs 2.3cvs bsddb module that makes the former repeatably hang in test_popitem while the latter breezes thru all tests...!-( And neither can diff, neither for bsddbmodule.c nor for test_bsddb.py ... Alex From skip at pobox.com Sun Oct 26 05:42:07 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 05:42:23 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com> Message-ID: <16283.42239.376018.900892@montanaro.dyndns.org> Just> How about adding a "rebinding" operator, for example spelled ":=": Just> a := 2 Just> It would mean: bind the value 2 to the nearest scope that defines Just> 'a'. I see a couple problems: * Would you be required to use := at each assignment or just the first? All the toy examples we pass around are very simple, but it seems that the name would get assigned to more than once, so the programmer might need to remember the same discipline all the time. It seems that use of x := 2 and x = 4 should be disallowed in the same function so that the compiler can flag such mistakes. * This seems like a statement which mixes declaration and execution. 
Everyone seems to abhor the global statement. Perhaps its main saving grace is that it doesn't pretend to mix execution and declaration. I think to narrow the scope of possible alternatives it would be helpful to know if what we're looking for is a way to allow the programmer only bind in the nearest enclosing scope or if she should be able to bind to an arbitrary enclosing scope. The various ideas seem to be falling into those two categories. Guido, do you have a preference or a pronouncement on that idea? Knowing that would eliminate one category of solutions. Skip From aleaxit at yahoo.com Sun Oct 26 05:46:53 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:46:58 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> Message-ID: <200310261146.53044.aleaxit@yahoo.com> On Saturday 25 October 2003 20:05, David Eppstein wrote: ... > > One person here brought up (maybe David Eppstein) that they used this > > approach for coding up extensive algorithms that are functional in > > nature but have a lot of state referenced *during* the computation. ... > refactoring in general, but you convinced me that using an object to > represent shared state explicitly rather than doing it implicitly by > nested function scoping can be a good idea. Great testimony, David -- thanks!!! So, maybe, rather than going out of our way to facilitate coding very large and complicated closures, it might be better to keep focusing on _simple_, small closures as the intended, designed-for use case, and convince users of complicated closures that refactoring, as David has done, into OO terms, can indeed be preferable. 
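[Editor's note: the refactoring David Eppstein's message describes can be sketched as follows. This is a made-up toy example, not his actual code: the same shared mutable state kept first implicitly in a closure (via the mutable-container workaround needed because nested scopes are read-only for rebinding) and then explicitly on an object.]

```python
# Shared state in a closure via a mutable cell, vs. the same state
# held explicitly as object attributes.

def make_averager():
    state = {"total": 0.0, "count": 0}   # mutable cell workaround
    def add(x):
        state["total"] += x              # mutation, not rebinding
        state["count"] += 1
        return state["total"] / state["count"]
    return add

class Averager:
    """Same algorithm; the state is explicit and easy to inspect."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x
        self.count += 1
        return self.total / self.count

avg = make_averager()
avg(1.0)
print(avg(3.0))      # -> 2.0
obj = Averager()
obj.add(1.0)
print(obj.add(3.0))  # -> 2.0
```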
Alex From aleaxit at yahoo.com Sun Oct 26 05:54:55 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:55:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <000001c3984b$052cd820$e841fea9@oemcomputer> <200310251618.42221.aleaxit@yahoo.com> Message-ID: <200310261154.55202.aleaxit@yahoo.com> On Saturday 25 October 2003 18:03, David Eppstein wrote: ... > > > pos2d = > > > dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2])) > > > for s in positions]) ... > > pos2d = {} > > for s, (x, y, delta) in positions.iteritems(): > > pos2d[s] = x+dx*delta, y+dy*delta > > > > seems just SO much clearer and more transparent to me. ... > I like the comprehension syntax so much that I push it harder than I > guess I should. If I'm building a dictionary by performing some > transformation on the items of another dictionary, I prefer to write it > in a way that avoids sequencing the items one by one; I don't think of > that sequencing as an inherent part of the loop. > > Put another way, I prefer declarative to imperative when possible. Hmmm, I see. List comprehensions are in fact fully imperative (in Python), but they may be "thought of" in quasi-declarative terms; I do see the allure of that. Thanks for clarifying! We DO have to keep in mind this source of attractiveness in comprehensions over simple loops, I think. > Let's try to spread it out a little and use intermediate variable names: > pos2d = dict([(s, (x + dx*z, y + dy*z)) > for s,(x,y,z) in positions.items()]) > > Better? Yes, it does seem better to me. And with generator expressions, dropping those slightly intrusive [ ... ] would be another little helpful step. Once you can write: pos2d = dict( (s, (x+dx*z, y+dy*z)) for s,(x,y,z) in positions.items() ) I don't think the further slight added value in clarity in being able to write a "dict comprehension" directly, e.g.
pos2d = { s: (x+dx*z, y+dy*z) for s,(x,y,z) in positions.items() } would be enough to warrant the addition to Python's syntax. Alex From aleaxit at yahoo.com Sun Oct 26 06:01:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:01:36 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> Message-ID: <200310261201.30234.aleaxit@yahoo.com> On Saturday 25 October 2003 17:49, Paul Moore wrote: ... > However, one significant issue with your notation scope(outer).x = 23 > is that, although scope(outer) *looks like* a function call, it isn't > - precisely because scope is a keyword. > > I think that, if you're using a keyword, you need something > syntactically distinct. Now maybe you can make something like Existing operator keywords, such as, e.g., 'not', get away without it. One can use parentheses, write not(x), or not (preferable style); and what's the problem if "not(x)" CAN indeed look like a function call while in fact it's not? It really makes no deep difference here that 'not' is a keyword and not a built-in function (it does matter when it's used with other syntax, of course, such as "x is not y" or "x not in y" or "not x" and so on -- but then, were 'scope' to be introduced, it, too, like other operator keywords, might admit of slightly different syntax uses). Similarly, that 'scope' is a keyword known to the compiler is not deeply important to the user coding scope(f) -- it might as well be a built-in, from the user's viewpoint. It's important to the compiler, it becomes important if the user erroneously tries to rebind "scope = 23", but those cases don't give problems.
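[Editor's note: the dict-building variants compared in the accumulator-display message above can be run side by side. `dx`, `dy` and the `positions` mapping below are made-up sample data (name -> (x, y, delta)); the two spellings are the ones debated in the thread.]

```python
# Explicit loop vs. list-comprehension-fed-to-dict(), built from the
# same sample data; both produce the identical mapping.

dx, dy = 1.0, 2.0
positions = {"a": (0.0, 0.0, 1.0), "b": (1.0, 1.0, 2.0)}

# The explicit loop preferred by one side of the thread:
pos2d_loop = {}
for s, (x, y, delta) in positions.items():
    pos2d_loop[s] = (x + dx * delta, y + dy * delta)

# The comprehension style preferred by the other:
pos2d_comp = dict([(s, (x + dx * z, y + dy * z))
                   for s, (x, y, z) in positions.items()])

print(pos2d_loop == pos2d_comp)  # -> True
```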
Alex From skip at pobox.com Sun Oct 26 05:51:53 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 06:08:01 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310260131.47500.aleaxit@yahoo.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> <16282.41043.939103.536103@montanaro.dyndns.org> <200310260131.47500.aleaxit@yahoo.com> Message-ID: <16283.42825.957517.595315@montanaro.dyndns.org> Alex> Sorry if my past well-meant eagerness caused problems; it's Alex> obviously more sensible for people who never use nested functions Alex> to help shape their syntax and semantics, than for those who DO Alex> use them, after all -- and similarly, for people who only care Alex> about their own code to help determine if 'global' is, or isn't, a Alex> cause of problems out there in the wide world of Python newbies Alex> and users far from python-dev. Pardon me? Just because I don't use a particular feature of the language doesn't mean I have no interest in how the language evolves. I don't believe I ever disrespected your ideas or opinions. Why are you disrespecting mine? Hell, why are you disrespecting me? I would be more than happy if nested scopes weren't in the language. Their absence would also make your teaching, advising, mentoring, maintenance and enhancing simpler. I haven't proposed that they be removed, though that would be a rather clean way to solve this problem. Alex, if a qualification for discussing improvements to Python is that one use every aspect of the language, please pronounce. I'll be happy to butt out of your turf.
Skip From just at letterror.com Sun Oct 26 06:14:58 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:14:59 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > I see a couple problems: > > * Would you be required to use := at each assignment or just the > first? Just the first; "a = 2" still means "a is local to this scope". > All the toy examples we pass around are very simple, but it > seems that the name would get assigned to more than once, so the > programmer might need to remember the same discipline all the time. > It seems that use of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I don't see it as a mistake. := would mean: "bind to whichever scope the name is defined in", and that includes the current scope. I disagree with Alex when he says := should mean "I'm binding this name in NON-local scope". > * This seems like a statement which mixes declaration and execution. How is that different from "regular" assignment? It mixes declaration and execution in the same way. Just From just at letterror.com Sun Oct 26 06:19:26 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:19:25 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Message-ID: Just van Rossum wrote: > > * Would you be required to use := at each assignment or just the > > first? > > Just the first; "a = 2" still means "a is local to this scope". ^^^^^^^^^^^^^^ Whoops, I meant *at each assignment*, obviously. Just From skip at pobox.com Sun Oct 26 06:21:41 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 06:21:53 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: <16283.44613.425664.463009@montanaro.dyndns.org> >> * Would you be required to use := at each assignment or just the >> first?
Just> Just the first; "a = 2" still means "a is local to this scope". That seems like a very subtle error waiting to happen... >> All the toy examples we pass around are very simple, but it seems >> that the name would get assigned to more than once, so the programmer >> might need to remember the same discipline all the time. It seems >> that use of x := 2 and x = 4 should be disallowed in the same >> function so that the compiler can flag such mistakes. Just> I don't see it as a mistake. := would mean: "bind to whichever Just> scope the name is defined in", and that includes the current Just> scope. I disagree with Alex when he says := should mean "I'm Just> binding this name in NON-local scope". Yeah, but if you come back to the code in six months and the nested function is 48 lines long and assigns to x using a variety of ":=" and "=" assignments, it seems to me like it will be hard to tell if there's a problem. >> * This seems like a statement which mixes declaration and execution. Just> How is that different from "regular" assignment? It mixes Just> declaration and execution in the same way. Not in the way of saying, "this is global and here's its value". Skip From just at letterror.com Sun Oct 26 06:35:09 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:35:12 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.44613.425664.463009@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > >> * Would you be required to use := at each assignment or just > >> the first? > > Just> Just the first; "a = 2" still means "a is local to this scope". > > That seems like a very subtle error waiting to happen... Since I said the wrong thing, I'm not sure how to respond to this... Do you still feel the same way with my corrected reply? > >> * This seems like a statement which mixes declaration and > >> execution. > > Just> How is that different from "regular" assignment? It mixes > Just> declaration and execution in the same way. 
> > Not in the way of saying, "this is global and here's its value". In a way := is the opposite of "this is local and here's its value". It says: "this is defined _somewhere_ and here's its new value". Just From aleaxit at yahoo.com Sun Oct 26 06:32:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:36:35 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> References: <200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> Message-ID: <200310261232.41678.aleaxit@yahoo.com> On Sunday 26 October 2003 10:13, Greg Ewing wrote: > > Hardly arbitrary (I have fond memories of several languages that used > > :=). > > But all the ones I know of use it for ordinary assignment. > We'd be having two kinds of assignment, and there's no > prior art to suggest which should be = and > which :=. That's the "arbitrary" part. > > The only language I can remember seeing which had two > kinds of assignment was Simula, which had := for value > assignment and :- for reference assignment (or was it > the other way around? :-) I always thought that was > kind of weird. VB6 had LET x = y for value assignment and SET x = y for reference assignment. Yes, very confusing particularly because the LET keyword could be dropped. Fortunately we're not proposing anything like that;-). Icon had := for irreversible and <- for reversible assignment. (also :=: and <-> for exchanges and different comparisons for == and === so maybe it HAD gone a bit overboard:-). I do recall an obscure language where = was always augmented assignment equivalent to a = a <op> b. But in particular the : operator meant to evaluate two exprs and take the RH one, like comma in C, so a := b did turn out to mean the same as a = b BUT fail if a couldn't first be evaluated, which (sort of randomly) is sort of close to Just's proposal. Unfortunately I don't remember the language's name:-(.
Googling a bit does show other languages distinguishing global from local variable assignments. E.g, in MUF, http://www.muq.org/~cynbe/muq/muf1_24.html , --> (arrow with TWO hyphens) assigns globally, -> (arrow with ONE hyphen) assigns locally. It appears that this approach is slightly less popular than the 'qualification' one I suggested (e.g. in Javascript you can assign window.x to assign the global x; in Beanshell, super.x to assign to x from enclosing scope) which in turn is less popular than declarations. Another not very popular idea is distinguishing locals and globals by name rules, as in Ruby $glob vs loc or KVirc Glob (upper initial) vs loc (lower initial). Alex From aleaxit at yahoo.com Sun Oct 26 06:35:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:37:19 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.42239.376018.900892@montanaro.dyndns.org> References: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com> <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: <200310261235.41107.aleaxit@yahoo.com> On Sunday 26 October 2003 11:42, Skip Montanaro wrote: ... > might need to remember the same discipline all the time. It seems that > use of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I entirely agree with you. There is no good use case that I can see for this mixture, and prohibiting it helps the compiler help the programmer. > * This seems like a statement which mixes declaration and execution. That's actually the PLAIN assignment statement, which mixes assigning a value with telling the compiler "this name is local" (other binding statements such as def, class etc also do that). 
Alex From skip at pobox.com Sun Oct 26 07:04:58 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 07:05:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.44613.425664.463009@montanaro.dyndns.org> Message-ID: <16283.47210.64438.619480@montanaro.dyndns.org> >>>>> "Just" == Just van Rossum writes: Just> Skip Montanaro wrote: >> >> * Would you be required to use := at each assignment or just >> >> the first? >> Just> Just the first; "a = 2" still means "a is local to this scope". >> >> That seems like a very subtle error waiting to happen... Just> Since I said the wrong thing, I'm not sure how to respond to Just> this... Do you still feel the same way with my corrected reply? Nope. Skip From fincher.8 at osu.edu Sun Oct 26 08:20:59 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Sun Oct 26 07:22:34 2003 Subject: [Python-Dev] product() In-Reply-To: <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260104.18806.aleaxit@yahoo.com> <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> Message-ID: <200310260820.59266.fincher.8@osu.edu> On Saturday 25 October 2003 11:36 pm, Guido van Rossum wrote: > Do we need allfalse() and anytrue() and anyfalse() too? These can all > easily be gotten by judicious use of 'not'. I think ABC has EACH, > SOME and NO (why not all four? who knows). There was a recent thread here ("Efficient predicates for the standard library") in which the names "any" and "all" were discussed rather than "anytrue" and "alltrue." Those are at least their common names in the functional programming languages I know, and it easily sidesteps the confusion that might be caused by having an "anytrue" without an "anyfalse" or an "alltrue" without an "allfalse." 
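[Editor's note: the semantics of the predicates under discussion can be sketched in a few lines of pure Python. `any()` and `all()` were in fact later added as builtins, in Python 2.5; the underscored names below just avoid shadowing them.]

```python
# Pure-Python sketch of the "any"/"all" predicates; combined with
# "not" on the predicate, they also cover anyfalse/allfalse.

def all_(iterable):
    for element in iterable:
        if not element:
            return False
    return True          # vacuously true for an empty iterable

def any_(iterable):
    for element in iterable:
        if element:
            return True
    return False

values = [1, 2, 3]
print(all_(x > 0 for x in values))      # alltrue  -> True
print(any_(x > 2 for x in values))      # anytrue  -> True
print(any_(not x > 0 for x in values))  # anyfalse -> False
print(all_(not x > 2 for x in values))  # allfalse -> False
```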
Jeremy From just at letterror.com Sun Oct 26 07:37:02 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 07:37:12 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.47210.64438.619480@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Nope. Ok :). Yet I think I'm starting to agree with you and Alex that := should mean "this name is NON-local". A couple more things: - I think augmented assignments CAN be made "rebinding" without breaking code, since currently a += 1 fails if a is neither local nor global. - Would := be allowed in statements like "self.a := 2"? It makes no sense, but since "(a, b) := (2, 3)" IS meaningful, what about "(a, b, self.c) = (1, 2, 3)"? Just From skip at manatee.mojam.com Sun Oct 26 08:00:50 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 26 08:01:05 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310261300.h9QD0oB3015515@manatee.mojam.com> Bug/Patch Summary ----------------- 547 open / 4276 total bugs (+42) 205 open / 2432 total patches (+7) New Bugs -------- email/Generator.py: Incorrect header output (2003-10-20) http://python.org/sf/826756 Proto 2 pickle vs dict subclass (2003-10-20) http://python.org/sf/826897 wrong error message of islice indexing (2003-10-20) http://python.org/sf/827190 List comprehensions leaking control variable name deprecated (2003-10-20) http://python.org/sf/827209 Bug in dbm - long strings in keys and values (2003-10-21) http://python.org/sf/827760 object.h misdocuments PyDict_SetItemString (2003-10-21) http://python.org/sf/827856 ctime is not creation time (2003-10-21) http://python.org/sf/827902 Registry key CurrentVersion not set (2003-10-21) http://python.org/sf/827963 Idle fails on loading .idlerc if Home path changes. 
(2003-10-22) http://python.org/sf/828049 sdist generates bad MANIFEST on Windows (2003-10-22) http://python.org/sf/828450 bdist_rpm failure when no setup.py (2003-10-23) http://python.org/sf/828743 setattr(obj, BADNAME, value) does not raises exception (2003-10-24) http://python.org/sf/829458 os.makedirs() cannot handle "." (2003-10-24) http://python.org/sf/829532 __mul__ taken as __rmul__ for mul-by-int only (2003-10-25) http://python.org/sf/830261 python-mode.el: py-b-of-def-or-class looks inside strings (2003-10-25) http://python.org/sf/830347 Config parser don't raise DuplicateSectionError when reading (2003-10-26) http://python.org/sf/830449 New Patches ----------- absolute paths cause problems for MSVC (2003-10-21) http://python.org/sf/827386 SimpleHTTPServer directory-indexing "bug" fix (2003-10-21) http://python.org/sf/827559 Allow set swig include dirs in setup.py (2003-10-22) http://python.org/sf/828336 ref. manual talks of 'sequence' instead of 'iterable' (2003-10-23) http://python.org/sf/829073 Fixes smtplib starttls HELO errors (2003-10-24) http://python.org/sf/829951 itertoolsmodule.c: islice error messages (827190) (2003-10-25) http://python.org/sf/830070 python-mode.el: (py-point 'bod) doesn't quite work (2003-10-25) http://python.org/sf/830341 Closed Bugs ----------- asyncore unhandled write event (2002-03-10) http://python.org/sf/528295 missing important curses calls (2003-01-10) http://python.org/sf/665572 Problems with non-greedy match groups (2003-03-01) http://python.org/sf/695688 ncurses/curses on solaris (2003-03-10) http://python.org/sf/700780 sigwinch crashes python with curses (2003-06-14) http://python.org/sf/754455 asyncore with non-default map problems (2003-06-20) http://python.org/sf/758241 HTMLParser chokes on my.yahoo.com output (2003-06-26) http://python.org/sf/761452 minidom.py -- TypeError: object doesn't support slice assig (2003-07-25) http://python.org/sf/777884 xmlrpclib's functions dumps() and loads() not documented. 
(2003-09-19) http://python.org/sf/809174 Support for non-string data in ConfigParser unclear/buggy (2003-09-22) http://python.org/sf/810843 test_tempfile fails on windows if space in install path (2003-09-23) http://python.org/sf/811082 a Py_DECREF() too much (2003-09-25) http://python.org/sf/812353 new.function raises TypeError for some strange reason... (2003-09-28) http://python.org/sf/814266 webbrowser.open hangs under certain conditions (2003-10-02) http://python.org/sf/816810 Need "new style note" (2003-10-04) http://python.org/sf/817742 Ref Man Index: Symbols -- Latex leak (2003-10-08) http://python.org/sf/820344 tarfile exception on large .tar files (2003-10-13) http://python.org/sf/822668 urllib2 digest auth is broken (2003-10-14) http://python.org/sf/823328 code.InteractiveConsole interprets escape chars incorrectly (2003-10-17) http://python.org/sf/825676 reference to Built-In Types section in file() documentation (2003-10-17) http://python.org/sf/825810 Class Problem with repr and getattr on PY2.3.2 (2003-10-18) http://python.org/sf/826013 Closed Patches -------------- Mutable PyCObject (2001-11-02) http://python.org/sf/477441 (?(id/name)yes|no) re implementation (2002-06-23) http://python.org/sf/572936 Fixing recursive problem in SRE (2003-06-19) http://python.org/sf/757624 small fix for setup.py (2003-07-15) http://python.org/sf/772077 make test_fcntl 64bit clean (2003-09-13) http://python.org/sf/805626 NetBSD py_curses.h fix (2003-09-15) http://python.org/sf/806800 Fix many doc typos (2003-09-22) http://python.org/sf/810751 normalize whitespace (2003-09-25) http://python.org/sf/812378 Fix test_tempfile: space in Win32 install path bug #811082 (2003-09-26) http://python.org/sf/813200 _sre stack overflow on FreeBSD/amd64 and /sparc64 (2003-09-26) http://python.org/sf/813391 deprecated modules (2003-09-29) http://python.org/sf/814560 fix import problem(unittest.py) (2003-10-07) http://python.org/sf/819077 Updated .spec file. 
(2003-10-14) http://python.org/sf/823259 Add additional isxxx functions to string object. (2003-10-16) http://python.org/sf/825313 From pedronis at bluewin.ch Sun Oct 26 08:37:01 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun Oct 26 08:34:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> At 10:53 26.10.2003 +0100, Just van Rossum wrote: >But: let's not get carried away with this particular spelling, the main >question is: "is it a good idea to have a rebinding assignment >operator?" (regardless of how that operator is spelled). Needless to >say, I think it is. would you mind trying to express why? everybody is spending a lot of mental energy trying to figure out a sensible way to achieve this but only Guido has made explicit some 3rd party reasons to want it. I would like to read more rationales about why we need it so badly. Thanks.
From aleaxit at yahoo.com Sun Oct 26 09:16:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 09:18:27 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16283.42825.957517.595315@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310260131.47500.aleaxit@yahoo.com> <16283.42825.957517.595315@montanaro.dyndns.org> Message-ID: <200310261511.29081.aleaxit@yahoo.com> On Sunday 26 October 2003 11:51, Skip Montanaro wrote: ... > Just because I don't use a particular feature of the language doesn't > mean I have no interest in how the language evolves. I don't believe I Absolutely true. Any feature added to the language brings some weight to all, even those who will not use it (perhaps not much to those who will not use it AND only care about their own code, but I do believe that most should also care about _others'_ code, even if they don't realize that -- reusing others' code from the net, &c, are still possibilities). > ever disrespected your ideas or opinions. Why are you disrespecting > mine? Hell, why are you disrespecting me? I had no intention of expressing any disrespect to you. If I miscommunicated in this regard, I owe you an apology. As for opinions based on only caring about one's own code, I am, however, fully entitled to meta-opine that such opinions are too narrowly based, and that not considering the coding behavior of others is near-sighted. > I would be more than happy if nested scopes weren't in the language. > Their absence would also make your teaching, advising, mentoring, > maintenance and enhancing simpler. I haven't proposed that they be > removed, though that would be rather clean way to solve this problem. Of course such a proposal would have to wait for 3.0 (i.e.
who knows when) given backwards incompatibility. Personally, I think that would just bring back all the "foo=foo, bar=bar" default-argument abuse that we used to have before nested scopes appeared, and therefore would not make any of my activities substantially simpler nor more productive (even discounting the large work of porting code across such a jump in semantics -- I think that could be eased by tools systematically _introducing_ default-argument abuse, but the semantics of such 'snapshotting' is still far enough from today's nested arguments to require plenty of manual inspection and changing). > Alex, if a qualification for discussing improvements to Python is that > one use every aspect of the language, please pronounce. I'll be happy to > butt out of your turf. You got the wrong guy: I don't get to pronounce, and this ain't my turf. I only get to plead, cajole, whine, argue, entreaty, advocate, propose, appeal, supplicate, contend, suggest, insist, agree, and disagree, just like everybody else. Alex From ncoghlan at iinet.net.au Sun Oct 26 09:39:29 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Oct 26 09:39:36 2003 Subject: [Python-Dev] product() In-Reply-To: <200310260820.59266.fincher.8@osu.edu> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260104.18806.aleaxit@yahoo.com> <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> <200310260820.59266.fincher.8@osu.edu> Message-ID: <3F9BDCA1.5040101@iinet.net.au> Jeremy Fincher strung bits together to say: > On Saturday 25 October 2003 11:36 pm, Guido van Rossum wrote: > >>Do we need allfalse() and anytrue() and anyfalse() too? These can all >>easily be gotten by judicious use of 'not'. I think ABC has EACH, >>SOME and NO (why not all four? who knows). > > There was a recent thread here ("Efficient predicates for the standard > library") in which the names "any" and "all" were discussed rather than > "anytrue" and "alltrue." 
Those are at least their common names in the > functional programming languages I know, and it easily sidesteps the > confusion that might be caused by having an "anytrue" without an "anyfalse" > or an "alltrue" without an "allfalse." >>> if all(pred(x) for x in values): pass # alltrue >>> if any(pred(x) for x in values): pass # anytrue >>> if any(not pred(x) for x in values): pass # anyfalse >>> if all(not pred(x) for x in values): pass # allfalse The names from the earlier thread do read nicely. . . Alternately, getting away with just one function: >>> if all(pred(x) for x in values): pass # alltrue >>> if not all(not pred(x) for x in values): pass # anytrue >>> if not all(pred(x) for x in values): pass # anyfalse >>> if all(not pred(x) for x in values): pass # allfalse I don't know about anyone else, but the double negative required to express "any" in terms of "all" hurts my brain. . . Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From aleaxit at yahoo.com Sun Oct 26 10:23:54 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 10:24:02 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261623.54136.aleaxit@yahoo.com> On Sunday 26 October 2003 13:37, Just van Rossum wrote: > Skip Montanaro wrote: > > Nope. > > Ok :). Yet I think I'm starting to agree with you and Alex that := > should mean "this name is NON-local". The more I think about it, the more I like it in its _simplest_ form. > A couple more things: > > - I think augmented assignments CAN be made "rebinding" without breaking > code, since currently a += 1 fails if a is neither local nor global. You are right about the breaking code, but I would still slightly prefer to eschew this just for simplicity -- see also below. > - Would := be allowed in statements like "self.a := 2"? 
It makes no > sense, but since "(a, b) := (2, 3)" IS meaningful, what about > "(a, b, self.c) = (1, 2, 3)"? I would not allow := in any but the SIMPLEST case: simple assignment to a bare name, no unpacking (I earlier said "no packing" but that's silly and I misspoke there -- "a := 3, 4, 5" WOULD of course be fine), no chaining, no := when the LHS is an indexing, slicing, attribute access. Keeping := Franciscan in its simplicity would make it easiest to implement, easiest to explain, AND avoid all sorts of confusing cases where the distinction between := and = would otherwise be confusingly nonexistent. It would also make it most effective because it always means the same thing -- "assignment to (already-existing) nonlocal". This is much the spirit in which I'd forego the idea of making += etc access nonlocals too, though I guess I'm only -0 on that; it seems simplest and most effective to have the one concept "rebinding a nonlocal name" correspond in a strict 1-1 way to the one notation := . Simplicity and effectiveness feel very Pythonic to me. I think rebinding nonlocals should be rare enough that the fact of having to write e.g. "a := a+1" rather than "a += 1" is a very minor problem. The important use case of += & friends, "xop[flap].quip(glop).nip[zap] += 1", gets no special benefit from += being deemed "rebinding" -- the rebinding concept applies usefully to bare names, and for a bare name writing name := name RHS is no big deal wrt name = RHS If name's a huge list, name.extend(anotherlist) is a fine substitute for name += anotherlist if you want to keep name nonlocal AND get some efficiency gain. Other containers with similar issues should also always supply a more readable synonym to __iadd__ for such uses, e.g. sets do, supplying union_update. So, keeping += &c just like today seems acceptable and preferable. Alex From pje at telecommunity.com Sun Oct 26 10:41:43 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sun Oct 26 10:41:30 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310252316.h9PNGZc08136@12-236-54-216.client.attbi.com> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310251132.04686.aleaxit@yahoo.com> <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com> <200310260111.51509.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> At 04:16 PM 10/25/03 -0700, Guido van Rossum wrote: > > > No way. There's nothing that guarantees that a+=b has the same > > > semantics as a+b, and in fact for lists it doesn't. > > > > You mean because += is more permissive (accepts any sequence > > RHS while + insists the RHS be specifically a list)? I don't see how > > this would make it bad to use += instead of + -- if we let the user > > sum up a mix of (e.g.) strings and tuples, why are we hurting him? > >We specifically decided that sum() wasn't allowed for strings, because >it's a quadratic algorithm. Other sequences are just as bad, we just >didn't expect that to be a common case. > >Also see my not-so-far-fetched example of a semantic change. Maybe I'm confused, but when Alex first proposed this change, I mentally assumed that he meant he would change it so that the *first* addition would use + (in order to ensure getting a "fresh" object) and then subsequent additions would use +=. If this were the approach taken, it seems to me that there could not be any semantic change or side-effects for types that have compatible meaning for + and += (i.e. += is an in-place version of +). Maybe I'm missing something here? 
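The asymmetry between + and += for lists that Guido alludes to is easy to demonstrate; a minimal sketch in modern Python syntax (the variable names are invented for illustration):

```python
alist = [1, 2]

# "+" insists the right-hand side is also a list...
try:
    alist = alist + (3,)
except TypeError:
    pass  # can only concatenate list (not "tuple") to list

# ...while "+=" (list.__iadd__) accepts any iterable, like extend()
alist += (3,)
print(alist)  # [1, 2, 3]
```

So switching sum()'s internals from + to += would silently widen what sum() accepts, which is exactly the semantic change under debate here.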
From aahz at pythoncraft.com Sun Oct 26 10:46:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Sun Oct 26 10:46:29 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310261623.54136.aleaxit@yahoo.com> References: <200310261623.54136.aleaxit@yahoo.com> Message-ID: <20031026154626.GA18564@panix.com> On Sun, Oct 26, 2003, Alex Martelli wrote: > > Keeping := Franciscan in its simplicity would make it easiest to > implement, easiest to explain, AND avoid all sort of confusing cases > where the distinction between := and = would otherwise be confusingly > nonexistent. It would also make it most effective because it always > means the same thing -- "assignment to (already-existing) nonlocal". > This is much the spirit in which I'd forego the idea of making += etc > access nonlocals too, though I guess I'm only -0 on that; it seems > simplest and most effective to have the one concept "rebinding a > nonlocal name" correspond in strict 1-1 way to the one notation := . > Simplicity and effectiveness feel very Pythonic to me. Sounds good to me. Question: what does this do? def f(): def g(x): z := x g(3) print z return g g = f() print z g('foo') print z That is, in the absence of a pre-existing binding, where does the binding for := go? I think it should be equivalent to global, going to the module scope. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From just at letterror.com Sun Oct 26 10:54:30 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 10:54:36 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> Message-ID: Samuele Pedroni wrote: > >But: let's not get carried away with this particular spelling, the > >main question is: "is it a good idea to have a rebinding assignment > >operator?" (regardless of how that operator is spelled). Needless to > >say, I think it is. 
> would you mind trying to express why? everybody is spending a lot of > mental energy trying to figure out a sensible way to achieve this > but only Guido has made explicit some 3rd party reasons to want it. I > would like to read more rationales about why we need it so badly. My question above is misleading with respect to my personal feelings about the issue. It should have been: """*If* we decide we need to be able to assign to names in outer scopes, would it be a good idea to add a rebinding operator?""" I actually don't care much whether the capability is added or not, but *if* we add it, I'd much rather see a rebinding operator than an extension to the global statement or a new declarative statement. Just From just at letterror.com Sun Oct 26 10:58:13 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 10:58:17 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031026154626.GA18564@panix.com> Message-ID: Aahz wrote: > Sounds good to me. Question: what does this do? > > def f(): > def g(x): > z := x > g(3) > print z > return g > g = f() > print z > g('foo') > print z > > That is, in the absence of a pre-existing binding, where does the > binding for := go? I think it should be equivalent to global, going > to the module scope. I think it should raise NameError or UnboundLocalError or a new NameError subclass. "In the face of ambiguity, etc." Just From pje at telecommunity.com Sun Oct 26 11:06:10 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Oct 26 11:05:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> Message-ID: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com> At 04:54 PM 10/26/03 +0100, Just van Rossum wrote: >My question above is misleading with respect to my personal feelings >about the issue.
It should have been: > >"""*If* we decide we need to be able to assign to names in outer scopes, >would it be a good idea to add a rebinding operator?""" > >I actually don't care much whether the capability is added or not, but >*if* we add it, I'd much rather see a rebinding operator than an >extension to the global statement or a new declarative statement. If we have a rebinding operator, I'd rather it be something considerably more visible than the presence or absence of a ':' on an assignment statement. So far, all the examples have been downright scary in the invisibility of what's happening. Mostly, I can imagine some poor sap trying to debug a program that uses := and is missing one somewhere or has one where it's not intended -- and hoping that poor sap won't be me. :) I've mostly stayed out of this discussion, but so far something like the scope(function).variable proposals, with perhaps a special case for scope(global) or scope(globals) seems to me like the way to go. It seems very Pythonic, in that it is explicit and calls attention to the fact that something special is going on, in a way that ':=' does not. And 'scope' can be looked up in a manual more easily than ':=' can. Last, but not least, ':=' looks enough like normal assignment in other languages, that somebody just plain might not notice that they *need* to look it up. From aleaxit at yahoo.com Sun Oct 26 11:18:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:19:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> References: <200310260111.51509.aleaxit@yahoo.com> <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> Message-ID: <200310261718.48377.aleaxit@yahoo.com> On Sunday 26 October 2003 04:41 pm, Phillip J. Eby wrote: > At 04:16 PM 10/25/03 -0700, Guido van Rossum wrote: > > > > No way.
There's nothing that guarantees that a+=b has the same > > > > semantics as a+b, and in fact for lists it doesn't. ... > assumed that he meant he would change it so that the *first* addition would > use + (in order to ensure getting a "fresh" object) and then subsequent > additions would use +=. A better architecture than the initial copy.copy I was now thinking of -- thanks. But it doesn't solve Guido's objection as above shown. > If this were the approach taken, it seems to me that there could not be any > semantic change or side-effects for types that have compatible meaning for > + and += (i.e. += is an in-place version of +). > > Maybe I'm missing something here? Only the fact that "there's nothing that guarantees" this, as Guido says. alist = alist + x only succeeds if x is also a list, while alist += x succeeds also for tuples and other sequences, for example. Personally, I don't think this would be a problem, but it's not my decision. Alex From skip at pobox.com Sun Oct 26 11:19:32 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 11:19:48 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.47210.64438.619480@montanaro.dyndns.org> Message-ID: <16283.62484.474269.27181@montanaro.dyndns.org> Just> - Would := be allowed in statements like "self.a := 2"? It makes Just> no sense, but since "(a, b) := (2, 3)" IS meaningful, what about Just> "(a, b, self.c) = (1, 2, 3)"? Ummm... This doesn't seem to be strengthening your argument. ;-) Skip From aleaxit at yahoo.com Sun Oct 26 11:20:16 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:20:42 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261720.16457.aleaxit@yahoo.com> On Sunday 26 October 2003 04:58 pm, Just van Rossum wrote: > Aahz wrote: > > Sounds good to me. Question: what does this do?
> > def f(): > > def g(x): > > z := x > > g(3) > > print z > > return g > > g = f() > > print z > > g('foo') > > print z > > > > That is, in the absence of a pre-existing binding, where does the > > binding for := go? I think it should be equivalent to global, going > > to the module scope. > > I think it should raise NameError or UnboundLocalError or a new > NameError subclass. "In the face of ambiguity, etc." Absolute agreement here. I think a new subclass of NameError would be best. The simpler and more limited the functionality of :=, the more effective I think it will be. Alex From pf_moore at yahoo.co.uk Sun Oct 26 11:21:26 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sun Oct 26 11:21:23 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> <200310261109.56801.aleaxit@yahoo.com> Message-ID: Alex Martelli writes: > Unfortunately, with the sum change regressed, d.asum times to > 8.4e+03 usec per loop, so it clearly cannot be considered any > more:-). So, there might be space for an accumulator function > patterned on map but [a] which stops on the shortest sequence > like zip and [b] does NOT build a list of results, meant to be called > a bit like map is in the 'amap' example above. itertools is a great > little collection of producers and manipulators of iterators, but the > "accumulator functions" might provide the "one obvious way" to > _consume_ iterators for common cases; and accumulating by > calling an accumulator-object's mutator method, such as > tot.extend above, on all items of an iterator, clearly is pretty common. I *think* I see what you're getting at here, but I'm struggling to follow in the absence of concrete use cases.
As we're talking about library functions, I'd suggest that your suggested "accumulator functions" start their life as an external module - maybe even in Python, although I take your point about the speed advantages of C. With a bit of "real life" use, migration into the standard library might be more of an obvious step. Paul. -- This signature intentionally left blank From aleaxit at yahoo.com Sun Oct 26 11:23:20 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:23:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031026154626.GA18564@panix.com> References: <200310261623.54136.aleaxit@yahoo.com> <20031026154626.GA18564@panix.com> Message-ID: <200310261723.20026.aleaxit@yahoo.com> On Sunday 26 October 2003 04:46 pm, Aahz wrote: > On Sun, Oct 26, 2003, Alex Martelli wrote: ... > > nonexistent. It would also make it most effective because it always > > means the same thing -- "assignment to (already-existing) nonlocal". ... > Sounds good to me. Question: what does this do? > > def f(): > def g(x): > z := x ... > That is, in the absence of a pre-existing binding, where does the > binding for := go? I think it should be equivalent to global, going to > the module scope. I think it should raise some subclass of NameError, because it's not an assignment to an _already-existing_ nonlocal, as per my text quoted above. It does not seem to me that "nested functions able to rebind module-level names" has compelling use cases, so I would prefer the simplicity of forbidding this usage.
Alex From aleaxit at yahoo.com Sun Oct 26 11:24:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:24:59 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.62484.474269.27181@montanaro.dyndns.org> References: <16283.47210.64438.619480@montanaro.dyndns.org> <16283.62484.474269.27181@montanaro.dyndns.org> Message-ID: <200310261724.56194.aleaxit@yahoo.com> On Sunday 26 October 2003 05:19 pm, Skip Montanaro wrote: > Just> - Would := be allowed in statements like "self.a := 2"? It makes > Just> no sense, but since "(a, b) := (2, 3)" IS meaningful, what > about Just> "(a, b, self.c) = (1, 2, 3)"? > > Ummm... This doesn't seem to be strengthening your argument. ;-) Indeed, I think the argument is stronger -- and := is more useful -- if a, b := 2, 3 and self.a := 2 and all such non-elementary variations of assignment are NOT allowed with := . Alex From just at letterror.com Sun Oct 26 11:25:23 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 11:25:50 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > If we have a rebinding operator, I'd rather it be something > considerably more visible than the presence or absence of a ':' on an > assignment statement. I don't know, but somehow I don't have a problem spotting augmented assignments, so I don't think := will be as hard to miss as you suggest. > So far, all the examples have been downright scary in the > invisibility of what's happening. Mostly, I can imagine some poor > sap trying to debug a program that uses := and is missing one > somewhere or has one where it's not intended -- and hoping that poor > sap won't be me. :) How is that different from a '-=' that should have been a plain '='? Also, if := is disallowed to rebind in the _same_ scope, this problem would be spotted by the compiler. 
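None of the rebinding spellings debated in this thread existed in the Python of the time; the working idiom for rebinding an enclosing function's name was to mutate a container instead, so that no declaration or new operator was needed. A minimal sketch of that workaround, in modern syntax:

```python
def make_counter():
    count = [0]          # a one-element list stands in for a rebindable cell
    def bump():
        count[0] += 1    # mutation, not rebinding: no declaration required
        return count[0]
    return bump

bump = make_counter()
print(bump(), bump(), bump())  # 1 2 3
```

The closure never rebinds `count` itself, only the object it refers to, which is exactly the distinction the := proposals were trying to make unnecessary.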
> I've mostly stayed out of this discussion, but so far something like > the scope(function).variable proposals, with perhaps a special case > for scope(global) or scope(globals) seems to me like the way to go. > It seems very Pythonic, in that it is explicit and calls attention to > the fact that something special is going on, in a way that ':=' does > not. The reverse argument can be made, too: := calls attention to the fact that something is happening right there, whereas a declaration may be many lines away. > And 'scope' can be looked up in a manual more easily than ':=' > can. Last, but not least, ':=' looks enough like normal assignment > in other languages, that somebody just plain might not notice that > they *need* to look it up. That's a good point. Just From arigo at tunes.org Sun Oct 26 11:23:25 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun Oct 26 11:27:20 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: <200310261109.56801.aleaxit@yahoo.com> References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> <200310261109.56801.aleaxit@yahoo.com> Message-ID: <20031026162325.GA4113@vicky.ecs.soton.ac.uk> Hello, On Sun, Oct 26, 2003 at 11:09:56AM +0100, Alex Martelli wrote: > > Oh, but we all *did* think of it. For strings. :-) > ... When X is intended as a number class, this > asymmetry between multiplication and (e.g.) addition violates > the principle of least surprise. I must admit I was a bit surprised when I first tested sum(), without first reading its doc, because I thought I knew what it should do. I expected it to be a fast equivalent to: def sum(seq, start=0): for item in seq: start = start + item return start or: reduce(operator.add, seq, start) I immediately tried it with strings and lists. I immediately thought about lists because of their use of "+" for concatenation.
So it seems that neither strings nor lists are properly supported, nor tuples by the way, and my opinion on this is that it strongly contradicts the principle of least surprise. I would not object to an implementation of sum() that special-cases lists, tuples and strings for efficiency. (by which I mean I can contribute a patch) > language or library feature. The problem of the += loop on strings is > essentially solved by psyco, which has tricks to catch that and make > it almost as fast as ''.join; but psyco can't get into a built-in function > such as sum, and thus can't help us with the performance trap there. Supporting sum() in Psyco is no big issue, and it could help the same way as it does for str.__add__. (It is not explicitly supported yet, but it could be added.) Still I believe that getting the complexity right in CPython is important, when it can be done. Armin From fincher.8 at osu.edu Sun Oct 26 13:10:27 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Sun Oct 26 12:12:15 2003 Subject: [Python-Dev] product() In-Reply-To: <3F9BDCA1.5040101@iinet.net.au> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260820.59266.fincher.8@osu.edu> <3F9BDCA1.5040101@iinet.net.au> Message-ID: <200310261310.27950.fincher.8@osu.edu> On Sunday 26 October 2003 09:39 am, Nick Coghlan wrote: > >>> if all(pred(x) for x in values): pass # alltrue > >>> if any(pred(x) for x in values): pass # anytrue Yeah, that does read nicely, which is why I think it's pretty common in FPLs.
> >>> if any(not pred(x) for x in values): pass # anyfalse I've always expressed this as: if not all(pred(x) for x in values): pass > >>> if all(not pred(x) for x in values): pass # allfalse And this as: if not any(pred(x) for x in values): pass It's slightly more efficient (only one negation), and it seems to better maintain the pseudocode-like aspect that we so much adore in Python :) Jeremy From arigo at tunes.org Sun Oct 26 12:11:48 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun Oct 26 12:15:49 2003 Subject: [Python-Dev] PyPy: sprint and news Message-ID: <20031026171148.GC16738@vicky.ecs.soton.ac.uk> PyPy Sprint announcement & news from the project ================================================ We are coming close to a first experimental release of PyPy, a more flexible Python implementation written in Python. The sprint to make this happen will take place in Amsterdam, a city known to be reachable by cheap flights :-) This is 1) the announcement for the sprint; 2) news about the current state of PyPy; 3) some words about a proposal we recently submitted to the European Union. Amsterdam Sprint Details ------------------------ The Sprint will take place from the 14th of December to the 21st of December at the "Vrije Universiteit in Amsterdam", 14th-21st Dec 2003, thanks to Etienne Posthumus, who helps us to organize the event. The main goal will be a complete C translation of PyPy, probably still using a hacked Pyrex-variant as an intermediate layer and using CPython's runtime. We also plan to work on some fun frontends to PyPy like one based on pygame or a web browser to visualize interactions between interpreter and objectspace. If you want to participate in the sprint, please subscribe here http://codespeak.net/mailman/listinfo/pypy-sprint and list yourself on this wiki page http://codespeak.net/moin/pypy/moin.cgi/AmsterdamSprint where you will also find more information as the sprint date approaches.
If you are just interested but don't know whether you will come, then only subscribe to the mailing list. State of the PyPy project -------------------------- PyPy works pretty well but still on top of CPython. The double interpretation penalty makes it - as expected - incredibly slow :-) In the Berlin sprint we have thus started to work on the "translation" part, i.e. how this code should be translated into C. We can now translate simple functions to C-like code including some type annotations. For convenience, we are reusing a modified subset of Pyrex to generate the low-level C code. Thanks to Seo (who joined the project from South Korea) we also have a Lisp backend to fuel the endless c.l.py threads about python versus lisp :-) The goal of the next sprint is to complete this work so that we can translate the complete PyPy source into a huge Pyrex module, and then a big CPython extension module. True, the result is not independent from CPython, as it runs reusing its runtime environment. But it's probably an interesting enough state to make a public release from. The translation is done by generating a control flow of functions by means of abstract interpretation. IOW, we run the PyPy interpreter with a custom object space ("flowobjspace") which generates a control flow graph (including the elementary operations) which is then used to generate low-level code for backends. We also have preliminary type inference on the graphs, which can be used by the Pyrex generator to emit some C type declarations. Writing transformations and analysis of these graphs and displaying them with GraphViz's 'dot' is great fun! We certainly have a greater need than ever for graphical interactive tools to see, understand and debug all these graph manipulations and run tests of them. Currently it is a bit difficult to write a test that checks that a transformed graph "looks right"!
What we expect from the Amsterdam sprint is thus: - a big not-too-slow "cpypy.so" extension module for CPython, where at least integer arithmetic is done efficiently - interactive tools to display and debug and test PyPy, visualizing control flow, call-graphs and state models. - improving and rewriting our testing tools to give us more control over the testing process, and to allow more interactive testing sessions. Other interesting News ---------------------- Before mid October, we also had a quite different Sprint. It was an approximately 10-day effort towards submitting a proposal to the EU. If it is accepted we will have resources to fund some developers working full- or part-time on the project. However, our "sprint driven development" will continue to play the central role for development of PyPy. There are especially two technical sections of the proposal which you might find interesting to read: "Scientific and technological objectives": http://codespeak.net/pypy/index.cgi?doc/funding/B1.0 "Detailed implementation plan" http://codespeak.net/pypy/index.cgi?doc/funding/B6.0 Maybe you want to read the whole proposal for other reasons, too, like making an EU project of your own or competing with us. Actually, with our sprints there is usually a lot of room for cooperation :-) Anyway, here is the PDF-url: http://codespeak.net/svn/pypy/trunk/doc/funding/proposal/part_b.pdf Everybody who thinks that he/she could help on the project is invited to join!
Btw, the latest discussions about our sprint goals usually take place on the pypy-dev list: http://codespeak.net/mailman/listinfo/pypy-dev have fun, Armin & Holger From just at letterror.com Sun Oct 26 12:31:22 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 12:31:22 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310261623.54136.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > > - I think augmented assignments CAN be made "rebinding" without > > breaking code, since currently a += 1 fails if a is neither local > > nor global. > > You are right about the breaking code, but I would still slightly > prefer to eschew this just for simplicity -- see also below. [ ... ] > I think rebinding nonlocals should be rare enough that the fact of > having to write e.g. "a := a+1" rather than "a += 1" is a very minor > problem. [ ... ] Minor, sure, but I think it's an unnecessary restriction, just like many people think Python's current inability to assign to outer scopes is unnecessary. If we have a rebinding operator, it'll be very surprising if augmented assignment ISN'T rebinding. It's just such a natural fit. Just From python at rcn.com Sun Oct 26 12:34:55 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 26 12:37:25 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310261718.48377.aleaxit@yahoo.com> Message-ID: <000001c39be7$90fc2460$d4b8958d@oemcomputer> > > If this were the approach taken, it seems to me that there could not be > any > > semantic change or side-effects for types that have compatible meaning > for > > + and += (i.e. += is an in-place version of +). > > > > Maybe I'm missing something here? > > Only the fact that "there's nothing that guarantees" this, as Guido says. > alist = alist + x only succeeds if x is also a list, while alist += x > succeeds > also for tuples and other sequences, for example. > > Personally, I don't think this would be a problem, but it's not my > decision.
In the context of sum(), I think it would be nice to allow iterables to be added together: sum(['abc', range(3), ('do', 're', 'me')], []) This fits in well with the current thinking that the prohibition of adding sequences of unlike types be imposed only on operators and not on functions or methods. For instance, in sets.py, a|b requires both a and b to be Sets; however, a.union(b) allows b to be any iterable. That matches the distinction between list.__iadd__() and list.extend() where the former requires a list argument and the latter does not. Raymond Hettinger From aleaxit at yahoo.com Sun Oct 26 12:48:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 12:48:35 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: References: <200310261109.56801.aleaxit@yahoo.com> Message-ID: <200310261848.30366.aleaxit@yahoo.com> On Sunday 26 October 2003 05:21 pm, Paul Moore wrote: ... > I *think* I see what you're getting at here, but I'm struggling to > follow in the absence of concrete use cases. As we're talking about Assuming the simplest definition, equivalent to: def loop(bound_method, it): for item in it: bound_method(item) typical simple use cases might be, e.g.: Merge a stream of dictionaries into one dict: merged_dict = {} loop(merged_dict.update, stream_of_dictionaries) rather than: merged_dict = {} for d in stream_of_dictionaries: merged_dict.update(d) Add a bunch of sequences into one list: all_the_seqs = [] loop(all_the_seqs.extend, bunch_of_seqs) rather than: all_the_seqs = [] for s in bunch_of_seqs: all_the_seqs.extend(s) Add two copies of each of a bunch of sequences ditto: all_the_seqs = [] loop(all_the_seqs.extend, s+s for s in bunch_of_seqs) ditto but only for sequences which have 23 somewhere in them: seqs_with_23 = [] loop(seqs_with_23.extend, s for s in bunch_of_seqs if 23 in s) and so on. There are no doubt possibly more elegant ways, e.g.
def init_and_loop(initvalue, unboundmethod, it, *its): for items in itertools.izip(it, *its): unboundmethod(initvalue, *items) return initvalue which would allow, e.g., merged_dict = init_and_loop({}, dict.update, stream_of_dictionaries) or other variants yet, but the use cases are roughly the same. The gain of such tiny "accumulator functions" (consuming one or more iterators by just passing their items to some mutating-method and ignoring the latter's results) is essentially conceptual -- it's not a matter of saving a couple of lines at the point of use, nor of saving some "bananoseconds" if the accumulator functions are implemented in C, when compared to the coded-out loops. Rather, such functions would allow "more declarative style" presentation (of underlying functionality that remains imperative): expressions feel more "declarative", stylistically, to some, while spelling a few steps out feels more "imperative". We've had this preference explained recently on this group, and others over in c.l.py are breaking out the champagne at the news of list.sorted for essentially the same motivation. > library functions, I'd suggest that your suggested "accumulator > functions" start their life as an external module - maybe even in > Python, although I take your point about the speed advantages of C. Absolutely. It's not _my_ suggestion to have more accumulator functions -- it came up repeatedly on the threads started by Peter Norvig's original proposal about accumulation, and Guido mentioned them in the 'product' thread I believe (where we also discussed 'any', 'all' etc, if I recall correctly). I don't think anybody's ever thought of making these built-ins.
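The `loop` helper Alex sketches above is easy to make concrete; here is a self-contained version in modern Python, with sample data invented for illustration:

```python
def loop(bound_method, it):
    # consume an iterable by feeding each item to a mutating bound method,
    # ignoring the method's return values
    for item in it:
        bound_method(item)

# merge a stream of dictionaries into one dict
merged_dict = {}
loop(merged_dict.update, [{'a': 1}, {'b': 2}])
print(merged_dict)  # {'a': 1, 'b': 2}

# flatten a bunch of sequences into one list
all_the_seqs = []
loop(all_the_seqs.extend, [[1, 2], (3, 4)])
print(all_the_seqs)  # [1, 2, 3, 4]
```

Note how the second call sidesteps the +/+= debate entirely: `extend` happily accepts a tuple, where `sum(..., [])` built on `+` would not.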
But if that external module[s] (one or more) is/are not part of the Python 2.4 library, if 2.4 does not come with a selection of accumulation functions [not necessarily including that 'loop' &c above mentioned, though I think something like that might help], I don't think we can have the "accumulation functionality" -- we only have great ways to make and express iterators but not many great ways to _consume_ them (and most particularly if sum, one of the few "good iterator consumers" we have, is practically unusable for iterators whose items are lists..). > With a bit of "real life" use, migration into the standard library > might be more of an obvious step. You could have said the same of itertools before 2.3, but I think it was a great decision to accept them into the standard library instead; 2.3 would be substantially poorer without them. With an even richer complement of iterator tools in itertools, and the new "generator expressions" to give us even more great ways to make iterators, I think a module of "iterator consumers", also known as accumulation functions, would be a good idea. Look at Peter Norvig's original ideas for some suggestions, for example. Which reminds me of an issue with Top(10), but, this is a long enough post, so I think I should write a separate one about that. Alex From aleaxit at yahoo.com Sun Oct 26 12:51:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 12:51:10 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <000001c39be7$90fc2460$d4b8958d@oemcomputer> References: <000001c39be7$90fc2460$d4b8958d@oemcomputer> Message-ID: <200310261851.04820.aleaxit@yahoo.com> On Sunday 26 October 2003 06:34 pm, Raymond Hettinger wrote: ... > b to be Sets; however, a.union(b) allows b to be any iterable. The > matches the distinction between list.__iadd__() and list.extend() where > the former requires a list argument and the latter does not. What distinction...? 
>>> x=range(3)
>>> x.__iadd__('foo')
[0, 1, 2, 'f', 'o', 'o']
>>> x
[0, 1, 2, 'f', 'o', 'o']
>>>

did you mean list.__add__()...?  list.__iadd__ IS just as permissive
as list.extend, it seems to me.

Alex

From python at rcn.com  Sun Oct 26 12:56:40 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 26 12:57:32 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310261851.04820.aleaxit@yahoo.com>
Message-ID: <000301c39bea$8230e1c0$d4b8958d@oemcomputer>

> did you mean list.__add__()...?  list.__iadd__ IS just as permissive
> as list.extend, it seems to me.

Hmm, I did mean __iadd__() but misremembered what it did.

Raymond

From zack at codesourcery.com  Sun Oct 26 13:14:37 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Sun Oct 26 13:14:42 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
Message-ID: <87k76rhnn6.fsf@egil.codesourcery.com>

I like Just's := concept except for the similarity to =, and I worry
that the presence of := in the language will flip people into "Pascal
mode" -- thinking that = is the equality operator.  I also think that
the notation is somewhat unnatural -- "globalness" is a property of the
_variable_, not the operator.  So I'd like to suggest instead

    :var = value        # var in module scope
    :scope:var = value  # var in named enclosing scope

An advantage of this notation is that it can be used anywhere, not just
in an assignment.  This has primary value for people reading the code --
if you have a fairly large class method that uses a module variable (not
by assigning it) somewhere in the middle, writing it :var means the
reader knows to go look for the assignment way up top.

This should obviously be optional, to preserve backward compatibility.
zw

From aleaxit at yahoo.com  Sun Oct 26 13:20:52 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 26 13:20:59 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <20031026162325.GA4113@vicky.ecs.soton.ac.uk>
References: <200310261109.56801.aleaxit@yahoo.com> <20031026162325.GA4113@vicky.ecs.soton.ac.uk>
Message-ID: <200310261920.52477.aleaxit@yahoo.com>

On Sunday 26 October 2003 05:23 pm, Armin Rigo wrote:
   ...
> I must admit I was a bit surprised when I first tested sum(), without first
> reading its doc because I thought I knew what it should do.  I expected it
> to be a fast equivalent to:
>
>     def sum(seq, start=0):
>         for item in seq:
>             start = start + item
>         return start

It IS equivalent to that -- plus an explicit typetest to raise if start
is an instance of str or unicode.  I had originally tried forwarding to
''.join for strings, but Guido preferred to forbid them, and it still
doesn't look like a problem to me.  Alas, "fast" is debatable:-).

>     reduce(operator.add, seq, start)

sum doesn't reproduce reduce's quirk of using the first item of seq if
start is not given.  So, the def version is closer.

> I immediately tried it with strings and lists.  I immediately thought about
> lists because of their use of "+" for concatenation.
>
> So it seems that neither strings nor lists are properly supported, neither

Strings are explicitly disallowed, so that should take care of the
surprise factor for that specific case.  As for lists, the semantics are
right, the speed is not (could be made way faster with modest effort).
Same for other mutable sequences.  As for tuples and other immutable
sequences, they ARE treated exactly like your 'def' above (roughly like
your reduce) would treat them -- not very fast, but if all you know
about something is that it's an immutable sequence, there's nothing more
you can do.
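For the record, the semantics described here can be checked directly
(and modern Python still behaves the same way: lists and tuples go
through the generic + path, while a str start value trips the explicit
typetest):

```python
# sum() concatenates lists and tuples via the generic "+" path
# (correct semantics, even if not fast for long sequences):
assert sum([[0, 1], [2]], []) == [0, 1, 2]
assert sum([(0, 1), (2,)], ()) == (0, 1, 2)

# ...but an explicit typetest rejects a str start value,
# pointing the user at ''.join instead:
try:
    sum(['ab', 'cd'], '')
    assert False, "expected sum() to refuse a str start value"
except TypeError:
    pass
```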
The use case of making a huge tuple from many smaller ones seems weird
enough that I don't see specialcasing tuples specifically as
particularly worthwhile (other immutable sequences that aren't exactly
tuples would still suffer, after all).

> tuples by the way, and my opinion on this is that it strongly contradicts
> the principle of least surprise.

For mutable sequences, I agree.  For immutable ones, I don't see the
performance trap as being a practical problem for tuples (and weirder
things) -- it WOULD be for strings, but as we specifically disallow them
with a message reminding the user of ''.join, in practice the problem
seems minor.  Maybe I'm coming to this with a too-pragmatical
stance...?

> I would not object to an implementation of sum() that special-cases lists,
> tuples and strings for efficiency.  (by which I mean I can contribute a
> patch)

I think all mutable sequences (that aren't especially weird in their +
vs += behavior) might be handled correctly, without specialcasing,
roughly as follows (Phillip Eby's idea):

    def sum(seq, start=0):
        it = iter(seq)
        try:
            result = start + it.next()
        except StopIteration:
            return start
        for item in it:
            result += item
        return result

my original idea was perhaps a bit goofier, something like:

    def sum(seq, start=0):
        try:
            start = copy.copy(start)
        except TypeError:
            for item in seq:
                start = start + item
        else:
            for item in seq:
                start += item
        return start

Admittedly the latter version may accept a few more cases, e.g. both
versions would accept:

    sum([ range(3), 'foo' ], [])

because [] is copyable, []+range(3) is fine, and list.__iadd__ is more
permissive than list.__add__; however, the first version would fail on:

    sum([ 'foo', range(3) ], [])

because []+'foo' fails, while the second version would be fine because
[] is _still_ copyable and __iadd__ is still permissive:-).  So,
perhaps, implementing by Phillip's idea would still not reduce the
surprise factor enough.  Hmmm...

> > language or library feature.
> > The problem of the += loop on strings is
> > essentially solved by psyco, which has tricks to catch that and make
> > it almost as fast as ''.join; but psyco can't get into a built-in
> > function such as sum, and thus can't help us with the performance trap
> > there.
>
> Supporting sum() in Psyco is no big issue, and it could help the same way as
> it does for str.__add__.  (It is not explicitly supported yet, but it
> could be added.)  Still I believe that getting the complexity right in
> CPython is important, when it can be done.

Absolutely -- we can't rely on psyco for everything, particularly not
for getting the big-O right as opposed to speeding things up by constant
multipliers (in part, for example, because psyco doesn't work on Macs,
which are going to be a HUGE part of Python's installed base...).
However, I would be happy to "leave it to psyco" for a sum of a large
sequence of tuples or other immutable sequences...:-).  I just don't
think that people in practice _will_ fall into that performance-trap...

Alex

From aleaxit at yahoo.com  Sun Oct 26 13:35:35 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 26 13:35:57 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To:
References:
Message-ID: <200310261935.35235.aleaxit@yahoo.com>

On Sunday 26 October 2003 05:25 pm, Just van Rossum wrote:
> Phillip J. Eby wrote:
> > If we have a rebinding operator, I'd rather it be something
> > considerably more visible than the presence or absence of a ':' on an
> > assignment statement.
>
> I don't know, but somehow I don't have a problem spotting augmented
> assignments, so I don't think := will be as hard to miss as you suggest.

I agree -- := isn't any less "visible" than, say, -= .

> > So far, all the examples have been downright scary in the
> > invisibility of what's happening.
> > Mostly, I can imagine some poor
> > sap trying to debug a program that uses := and is missing one
> > somewhere or has one where it's not intended -- and hoping that poor
> > sap won't be me.  :)
>
> How is that different from a '-=' that should have been a plain '='?
> Also, if := is disallowed to rebind in the _same_ scope, this problem
> would be spotted by the compiler.

Not always (the = that should have been a := won't be, for example), but
pretty often (more often than the errant -= will be;-).  Worst case it's
not any worse than the dreaded "typo in variable name" whereby somebody
assigns to, e.g., "accounts__receivable" where they meant to assign to
"accounts_receivable" -- people who are new to Python are terrified of
that possibility, and demand declarations to take care of it, but
long-time practitioners know it's not all that huge a danger.

> > I've mostly stayed out of this discussion, but so far something like
> > the scope(function).variable proposals, with perhaps a special case
> > for scope(global) or scope(globals) seems to me like the way to go.
> > It seems very Pythonic, in that it is explicit and calls attention to
> > the fact that something special is going on, in a way that ':=' does
> > not.
>
> The reverse argument can be made, too: := calls attention to the fact
> that something is happening right there, whereas a declaration may be
> many lines away.

Right (that's part of why I do not like declarations!-), but the
proposal Phillip is referring to would have "scope(foo).x = 23" ``right
here'' just as "x := 23" would.  Actually, speaking as the original
author of the 'scope' proposal, I think I now prefer your := when taken
in the simplest, most effective form -- took me a while to convince
myself of that, but it grew on me.

> > And 'scope' can be looked up in a manual more easily than ':='
> > can.
> > Last, but not least, ':=' looks enough like normal assignment
> > in other languages, that somebody just plain might not notice that
> > they *need* to look it up.
>
> That's a good point.

Well, if they're looking at a function that ONLY has := in isolation and
no occurrence of = -- and their grasp of Python is so scarce that they
don't realize = is Python's normal assignment.  Doesn't seem like a
particularly scary combination of circumstances to me, to be honest.

Alex

From guido at python.org  Sun Oct 26 13:36:46 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 13:36:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Sun, 26 Oct 2003 12:56:40 EST." <000301c39bea$8230e1c0$d4b8958d@oemcomputer>
References: <000301c39bea$8230e1c0$d4b8958d@oemcomputer>
Message-ID: <200310261836.h9QIakK25425@12-236-54-216.client.attbi.com>

> > did you mean list.__add__()...?  list.__iadd__ IS just as permissive
> > as list.extend, it seems to me.
>
> Hmm, I did mean __iadd__() but misremembered what it did.

You're forgiven, at some point in the past they *were* different.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org  Sun Oct 26 14:37:55 2003
From: arigo at tunes.org (Armin Rigo)
Date: Sun Oct 26 14:42:43 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <200310261920.52477.aleaxit@yahoo.com>
References: <200310261109.56801.aleaxit@yahoo.com> <20031026162325.GA4113@vicky.ecs.soton.ac.uk> <200310261920.52477.aleaxit@yahoo.com>
Message-ID: <20031026193755.GA27194@vicky.ecs.soton.ac.uk>

Hello Alex,

On Sun, Oct 26, 2003 at 07:20:52PM +0100, Alex Martelli wrote:
> >     def sum(seq, start=0):
> >         for item in seq:
> >             start = start + item
> >         return start
>
> It IS equivalent to that -- plus an explicit typetest to raise if start is an
> instance of str or unicode.
Yes, it is what I'm saying: it is what we expect it to be, but there is
an exception for no real reason apart from "don't do it like this,
buddy, there is a faster version out there".  I tend to regard this kind
of exceptions as very bad, because if you write a generic algorithm
using sum(), even if you don't really see why someone would think about
using your algorithm with strings one day, chances are that someone
will.  Raising a Warning instead of an exception would have had the same
result without the generality problem.

> >     reduce(operator.add, seq, start)
>
> sum doesn't reproduce reduce's quirk of using the first item of seq if start
> is not given.  So, the def version is closer.

I was thinking about:

    def sum(seq, start=0):
        return reduce(operator.add, seq, start)

which is the same as the previous one.

> Admittedly the latter version may accept a few more cases, e.g.
> both versions would accept:
>     sum([ range(3), 'foo' ], [])
> because [] is copyable, []+range(3) is fine, and list.__iadd__ is
> more permissive than list.__add__; however, the first version
> would fail on:
>     sum([ 'foo', range(3) ], [])
> because []+'foo' fails, while the second version would be fine
> because [] is _still_ copyable and __iadd__ is still permissive:-).

These cases all show that we have a surprise problem (although probably
not a big one).  The user will expect sum() to have a clean definition,
and because the += one doesn't work, it must be +.  In my opinion, sum()
should be strictly equivalent to the naive + version and try to optimize
common cases under the hood.  Admittedly, this is not obvious, because
of precisely all these strange mixed type cases which could be
user-defined classes with __add__ or __radd__ operators...
I'm sure someone will design a class

    class x:
        def __add__(self, other):
            return other

so that x() can be used as a trivial starting point for sum() -- and
then sum(["abc", "def"], x()) works :-)

Armin

From gward at python.net  Sun Oct 26 14:55:15 2003
From: gward at python.net (Greg Ward)
Date: Sun Oct 26 14:55:18 2003
Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem()
Message-ID: <20031026195515.GA30335@cthulhu.gerg.ca>

I just noticed a subtle inconsistency in the error messages when trying
to assign to a tuple:

>>> (1,)[0] = "foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> (1,)['foo'] = "foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment

Note the "doesn't" vs "does not".  It's easily tracked down to
PyObject_SetItem() and PySequence_SetItem() (in Objects/abstract.c).

Is this deliberate, or a simple oversight?  I'm inclined to assume the
latter, and change "doesn't" to "does not" on the grounds that error
messages are formal writing, and I was taught not to use contractions in
formal writing.  Any objections?

Greg
--
Greg Ward                                       http://www.gerg.ca/
Eschew obfuscation!

From pf_moore at yahoo.co.uk  Sun Oct 26 15:27:21 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sun Oct 26 15:27:15 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
References: <200310261109.56801.aleaxit@yahoo.com> <200310261848.30366.aleaxit@yahoo.com>
Message-ID:

Alex Martelli writes:

> On Sunday 26 October 2003 05:21 pm, Paul Moore wrote:
> ...
>> I *think* I see what you're getting at here, but I'm struggling to
>> follow in the absence of concrete use cases.  As we're talking about
>
> Assuming the simplest definition, equivalent to:
>
>     def loop(bound_method, it):
>         for item in it:
>             bound_method(item)
>
> typical simple use cases might be, e.g.:
[...]
> and so on.
None of which are, to me, particularly convincing. Then again, while I like a "declarative style" in some cases, I've got nothing against the sort of "idiom-based" style in which short code patterns just "mean something" as a whole, and aren't viewed as being comprised of their individual parts (much like the standard C idiom for walking a linked list). > The gain of such tiny "accumulator functions" (consuming one or > more iterators by just passing their items to some mutating-method > and ignoring the latter's results) are essentially conceptual -- it's > not a matter of saving a couple of lines at the point of use, nor of > saving some "bananoseconds" if the accumulator functions are > implemented in C, when compared to the coded-out loops. > > Rather, such functions would allow "more declarative style" > presentation (of underlying functionality that remains imperative): > expressions feel more "declarative", stylistically, to some, while > spelling a few steps out feels more "imperative". We've had this > preference explained recently on this group, and others over in > c.l.py are breaking over the champagne at the news of list.sorted > for essentially the same motivation. OK. I'd bow out here, as I don't feel the need to push the declarative style that extra step. Let others champion the style. > Absolutely. It's not _my_ suggestion to have more accumulator > functions -- it came up repeatedly on the threads started by Peter > Norvig original proposal about accumulation, and Guido mentioned > them in the 'product' thread I believe (where we also discussed > 'any', 'all' etc, if I recall correctly). I'm sorry - I'd got the impression that you were arguing the case. In which case, I'd have to say that I'm not at all clear who it is who's proposing anything here, or what specifically the proposals are. I suspect the original intention is getting lost in generalities, and it's time for those original posters to speak up and clarify exactly what they want. 
Maybe a PEP is in order, to get back to the core of the proposal. >> With a bit of "real life" use, migration into the standard library >> might be more of an obvious step. > > You could have said the same of itertools before 2.3, but I think > it was a great decision to accept them into the standard library > instead; 2.3 would be substantially poorer without them. Agreed. I was very conscious of itertools when I made that statement. But my gut feel is that in this case, there has been so much discussion that the key concept has been obscured. A PEP, or some prior art, would recapture that. > With an even richer complement of iterator tools in itertools, and > the new "generator expressions" to give us even more great ways to > make iterators, I think a module of "iterator consumers", also known > as accumulation functions, would be a good idea. Look at Peter > Norvig's original ideas for some suggestions, for example. In principle, I don't have a problem with that. Let's get concrete, though, and see either a PEP or some code. Otherwise, the discussion isn't really going anywhere. And on that note, I really ought to bow out :-) Paul. -- This signature intentionally left blank From guido at python.org Sun Oct 26 15:50:18 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 26 15:50:41 2003 Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: Your message of "Sun, 26 Oct 2003 14:55:15 EST." <20031026195515.GA30335@cthulhu.gerg.ca> References: <20031026195515.GA30335@cthulhu.gerg.ca> Message-ID: <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> > Note the "doesn't" vs "does not". It's easily tracked down to > PyObject_SetItem() and PySequence_SetItem() (in Objects/abstract.c). > > Is this deliberate, or a simple oversight? 
> I'm inclined to assume the latter, and change "doesn't" to "does not"
> on the grounds that error messages are formal writing, and I was taught
> not to use contractions in formal writing.

Luckily I wasn't taught formal writing :-), and I don't see why it can't
be doesn't.  I'd say that if you want Python's error messages to be
formal writing, you'd have to change a lot more than just the one... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Oct 26 16:16:31 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 16:16:43 2003
Subject: [Python-Dev] cloning iterators again
Message-ID: <200310262116.h9QLGVf25583@12-236-54-216.client.attbi.com>

The following is just so beautiful, I have to share it.

I've been thinking about various implementations of Andrew Koenig's idea
of "copyable iterator wrappers", which support a generalization of
Raymond Hettinger's tee().  This needs some kind of queue-ish data
structure, but all queues that I could think of required too much
administration for the additional requirements.  I finally realized
(this may come as no surprise to Andrew :-) that the best implementation
is a singly-linked list.  Yes, a use case for a linked list in Python!
This nicely takes care of the GC issue when one of the iterators is
discarded before being exhausted.  I hope Raymond can implement this in
C.

class Link(object):

    """Singly-linked list of (state, value) pairs.

    The slots are manipulated directly by Wrapper.next() below.  The
    state slot can have three values, which determine the meaning of
    the value slot:

    state = 0  => value is not yet determined (set to None)
    state = 1  => value is value to be returned by next()
    state = -1 => value is exception to be raised by next()

    The next slot points to the next Link in the chain; it is None at
    the end of the chain (state <= 0).
""" __slots__ = ["state", "value", "next"] def __init__(self): self.state = 0 self.value = None self.next = None class Wrapper(object): """Copyable wrapper around an iterator. Any number of Wrappers around the same iterator share the same chain of Links. The Wrapper that is most behind references the oldest Link, and as it moves forward the oldest Link instances are automatically discarded. The newest Link has its value set to None and its state set to 0. When a Wrapper needs to get the value out of this Link, it calls next() on the underlying iterator and stores it in the Link, setting its state to 1, for the benefit of other Wrappers that are behind it. If the underlying iterator's next() raises an exception, the Link's state is set to -1 and its value to the exception instance instead. When the oldest Wrapper is garbage-collected before it finishes the chain, the Links that it owns are also garbage-collected, up to the next Link still owned by a live Wrapper. """ __slots__ = ["it", "link"] def __init__(self, it, link=None): """Constructor. The link argument is used by __copy__ below.""" self.it = it if link is None: link = Link() self.link = link def __copy__(self): """Copy the iterator. This returns a new iterator that will return the same series of results as the original. 
""" return Wrapper(self.it, self.link) def __iter__(self): """All iterators should support __iter__() returning self.""" return self def next(self): """Get the next value of the iterator, or raise StopIteration.""" link = self.link if link is None: raise StopIteration state, value, next = link.state, link.value, link.next if state == 0: # At the head of the chain: move underlying iterator try: value = self.it.next() except StopIteration, exc: value = exc state = -1 else: state = 1 link.state = state link.value = value if state < 0: self.link = None raise value assert state > 0 if next is None: next = Link() link.next = next self.link = next return value def tee(it): """Replacement for Raymond's tee(); see examples in itertools docs.""" if not hasattr(it, "__copy__"): it = Wrapper(it) return (it, it.__copy__()) def test(): """A simple demonstration of the Wrapper class.""" import random def gen(): for i in range(10): yield i it = gen() a, b = tee(it) b, c = tee(b) c, d = tee(c) iterators = [a, b, c, d] while iterators != [None, None, None, None]: i = random.randrange(4) it = iterators[i] if it is None: next = "----" else: try: next = it.next() except StopIteration: next = "****" iterators[i] = None print "%4d%s%4s%s" % (i, " ."*i, next, " ."*(3-i)) if __name__ == "__main__": test() --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sun Oct 26 16:42:08 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Oct 26 16:42:30 2003 Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> References: <20031026195515.GA30335@cthulhu.gerg.ca> <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> Message-ID: <3F9C3FB0.8050206@v.loewis.de> Guido van Rossum wrote: > Luckily I wasn't taught formal writing :-), and I don't see why it > can't be doesn't. 
> I'd say that if you want Python's error messages to
> be formal writing, you'd have to change a lot more than just the
> one... :-)

OTOH, I would always yield to native speakers in such issues.  To me
myself, it does not matter much, but if native speakers feel happier one
way or the other, I'd like to help them feel happy :-)

Regards,
Martin

From pje at telecommunity.com  Sun Oct 26 17:11:58 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Oct 26 17:11:15 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To:
References: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com>
Message-ID: <5.1.0.14.0.20031026170948.03613950@mail.telecommunity.com>

At 05:25 PM 10/26/03 +0100, Just van Rossum wrote:
>Phillip J. Eby wrote:
>
> > If we have a rebinding operator, I'd rather it be something
> > considerably more visible than the presence or absence of a ':' on an
> > assignment statement.
>
>I don't know, but somehow I don't have a problem spotting augmented
>assignments, so I don't think := will be as hard to miss as you suggest.
>
> > So far, all the examples have been downright scary in the
> > invisibility of what's happening.  Mostly, I can imagine some poor
> > sap trying to debug a program that uses := and is missing one
> > somewhere or has one where it's not intended -- and hoping that poor
> > sap won't be me.  :)
>
>How is that different from a '-=' that should have been a plain '='?
>Also, if := is disallowed to rebind in the _same_ scope, this problem
>would be spotted by the compiler.

But some languages use := to mean simple assignment.  So, '=' and ':='
don't appear *semantically* distinct.  Whereas, I'm not aware of a
language that uses '-=' differently.

> > I've mostly stayed out of this discussion, but so far something like
> > the scope(function).variable proposals, with perhaps a special case
> > for scope(global) or scope(globals) seems to me like the way to go.
> > It seems very Pythonic, in that it is explicit and calls attention to
> > the fact that something special is going on, in a way that ':=' does
> > not.
>
>The reverse argument can be made, too: := calls attention to the fact
>that something is happening right there, whereas a declaration may be
>many lines away.

I guess I wasn't clear.  I meant, using 'scope(function).variable =
whatever' *every* time you assign to the outer scope variable, and not
having any "declarations", ever.

From python at rcn.com  Sun Oct 26 18:17:35 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 26 18:18:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310262116.h9QLGVf25583@12-236-54-216.client.attbi.com>
Message-ID: <000f01c39c17$575e8920$d4b8958d@oemcomputer>

> The following is just so beautiful, I have to share it.

I have to say, it is a thing of beauty.

> I've been thinking about various implementations of Andrew Koenig's
> idea of "copyable iterator wrappers", which support a generalization
> of Raymond Hettinger's tee().

I've re-read some of the old email on the subject but didn't see what
this buys us that we don't already get with the current tee().

When I wrote tee(), I had considered implementing it as a multi-way
tee(it, n=2) so you could write a,b,c,d=tee(myiterable, 4).  Then, I
wracked my brain for use cases and found nothing that warranted:

* the additional memory consumption (the current implementation consumes
only one pointer per element and it stores them in contiguous memory);

* the additional memory management utilization (the underlying list.pop
and list.append have already been optimized to avoid excessive
malloc/free calls);

* or the impact on cache performance (using contiguous memory means that
consecutive pops are in the L1 cache at least 90% of the time and using
only one word per entry means that a long series of pops is less likely
to blow everything else out of the cache).
With only two iterators, I can imagine use cases where the two iterators
track each other fairly closely.  But with multiple iterators, one
iterator typically lags far behind (meaning that list(it) is the best
solution) or they track within a fixed number of elements of each other
(meaning that windowing is the best solution).

The itertools example section shows the pure python code for windowing.
AFAICT, that windowing code is unbeatable in terms of speed and memory
consumption (nearly all the time is spent forming the result tuple).

> class Link(object):
>     """Singly-linked list of (state, value) pairs.
. . .
>     __slots__ = ["state", "value", "next"]

One way to implement this is with a type which adds PyHEAD to the space
consumption for the three fields.  An alternate approach is to use PyMem
directly and request space for four fields (including a refcount field).

>         if state < 0:
>             self.link = None
>             raise value

Is it kosher to re-raise the exception long after something else may
have handled it and the execution context has long since disappeared?

> def test():
>     """A simple demonstration of the Wrapper class."""
>     import random
>     def gen():
>         for i in range(10):
>             yield i
>     it = gen()
>     a, b = tee(it)
>     b, c = tee(b)
>     c, d = tee(c)

This is very nice.  The current tee() increases memory consumption and
workload when nested like this.

Raymond Hettinger

From skip at pobox.com  Sun Oct 26 12:11:51 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sun Oct 26 18:58:48 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310261723.20026.aleaxit@yahoo.com>
References: <200310261623.54136.aleaxit@yahoo.com> <20031026154626.GA18564@panix.com> <200310261723.20026.aleaxit@yahoo.com>
Message-ID: <16284.87.71562.652543@montanaro.dyndns.org>

> Sounds good to me.  Question: what does this do?
>
>     def f():
>         def g(x):
>             z := x
...
> That is, in the absence of a pre-existing binding, where does the
> binding for := go?
> I think it should be equivalent to global, going to
> the module scope.

This is one place I think an extension of the global statement has a
definite advantage:

    def f():
        def g():
            global z in f
            z = x

Skip

From guido at python.org  Sun Oct 26 19:31:35 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 19:31:51 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Sun, 26 Oct 2003 18:17:35 EST." <000f01c39c17$575e8920$d4b8958d@oemcomputer>
References: <000f01c39c17$575e8920$d4b8958d@oemcomputer>
Message-ID: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>

> > I've been thinking about various implementations of Andrew Koenig's
> > idea of "copyable iterator wrappers", which support a generalization
> > of Raymond Hettinger's tee().
>
> I've re-read some of the old email on the subject but didn't see what
> this buys us that we don't already get with the current tee().

Performance-wise I don't know; we'd have to profile it I guess. :-(

With the current tee(), I was thinking that if the two iterators stay
close, you end up moving the in basket to the out basket rather
frequently, and the overhead of that might beat the simplicity of the
linked lists.  Also, *if* you need a lot of clones, using multiple tee()
calls ends up creating several queues, again causing more overhead.
(These queues end up together containing all the items from the oldest
to the newest iterator.)

I also note that the current tee() doesn't let you use __copy__ easily
(it would be quite messy I think).  The linked-list version supports
__copy__ trivially.  This may be important if we execute (as I think we
should) on the idea of making selected iterators __copy__-able
(especially all the standard container iterators and xrange).

> When I wrote tee(), I had considered implementing it as a multi-way
> tee(it, n=2) so you could write a,b,c,d=tee(myiterable, 4).
Then, I > wracked my brain for use cases and found nothing that warranted: > > * the additional memory consumption (the current implementation consumes > only one pointer per element and it stores them in contiguous memory); > > * the additional memory management utilization (the underlying list.pop > and list.append have already been optimized to avoid excessive > malloc/free calls); > > * or the impact on cache performance (using contiguous memory means that > consecutive pops are in the L1 cache at least 90% of the time and using > only one word per entry means that a long series of pops is less likely > to blow everything else out of the cache). > > With only two iterators, I can imagine use cases where the two iterators > track each other fairly closely. But with multiple iterators, one > iterator typically lags far behind (meaning that list(it) is the best > solution) or they track within a fixed number of elements of each other > (meaning that windowing is the best solution). Maybe Andrew has some use cases? After all he implemented this once for C++. BTW in private mail he reminded me that (a) he'd already suggested using a linked list to me before, and (b) his version had several values per link node, which might address some of your concerns above. > The itertools example section shows the pure python code for windowing. > AFAICT, that windowing code is unbeatable in terms of speed and memory > consumption (nearly all the time is spent forming the result tuple). > > > > > class Link(object): > > """Singly-linked list of (state, value) pairs. > . . . > > __slots__ = ["state", "value", "next"] > > One way to implement this is with a type which adds PyHEAD to the space > consumption for the three fields. An alternate approach is to use PyMem > directly and request space for four fields (including a refcount field). Or you could use Andrew's suggestion. 
>
> > if state < 0:
> >     self.link = None
> >     raise value
>
> Is it kosher to re-raise the exception long after something else may
> have handled it and the execution context has long since disappeared?

This isn't a re-raise; it's a raise of the exception object, which doesn't depend on the context and can be raised as often as you want to. I agree that it might be worth it to do a bare raise (== re-raise) *if* the exception was in fact caught in the current next() invocation, to preserve the stack trace. Or we could change the meaning of value and store the sys.exc_info() triple in it -- but this would probably keep too many stack frames and local variables alive for too long.

> > def test():
> >     """A simple demonstration of the Wrapper class."""
> >     import random
> >     def gen():
> >         for i in range(10):
> >             yield i
> >     it = gen()
> >     a, b = tee(it)
> >     b, c = tee(b)
> >     c, d = tee(c)
>
> This is very nice. The current tee() increases memory consumption and
> workload when nested like this.

The question is, how often does one need this? Have you seen real use cases for tee() that aren't better served with list()?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Sun Oct 26 22:22:26 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 22:22:45 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <200310261109.56801.aleaxit@yahoo.com>
Message-ID: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> Exactly the same underlying reason as a bug I just opened on
> SF: if x is an instance of a class X having __mul__ but not
> __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError

Seems to me the bug there is not giving X an __rmul__ method and yet expecting y*x to work at all. The fact that it happens to work in some cases is an accident that should not be relied upon.
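The asymmetry Alex reports is easy to show in miniature. The class below is hypothetical, and the snippet shows the intended behaviour -- the one modern Pythons implement, where the int and the float left operand fail identically; under the 2.3-era accident being discussed, 3 * x would have succeeded.

```python
# X defines __mul__ but not __rmul__, so x * 3 works while 3 * x and
# 3.0 * x must both fail with TypeError (no reflected method to call).

class X(object):
    def __mul__(self, other):
        return ("mul", other)

x = X()
assert x * 3 == ("mul", 3)       # calls x.__mul__(3)

for lhs in (3, 3.0):
    try:
        lhs * x
    except TypeError:
        pass                     # no __rmul__: reflected multiply fails
    else:
        raise AssertionError("%r * x unexpectedly succeeded" % lhs)
```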
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com Sun Oct 26 22:25:19 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:26:50 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF30@au3010avexu1.global.avaya.com>

> From: Skip Montanaro [mailto:skip@pobox.com]
>
> You meant
>
>     def f():
>         x = 12
>         y = 1
>         def g():
>             y = 12
>             global y in f
>         g()
>         print locals()
>
> right?

Er - yes ... :)

Tim Delaney

From tdelaney at avaya.com Sun Oct 26 22:27:52 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:28:00 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF32@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>
> > Likewise, the following should be illegal:
> >
> >     def f():
> >         x = 12
> >         y = 1
> >         def g():
> >             global y in f
> >             y = 12
> >         g()
> >         print locals()
> >
> > because the global statement occurs after a local binding
> > of the name.
>
> Huh? The placement of a global statement is irrelevant -- it can
> occur anywhere in the scope. This should certainly work.

As Skip pointed out, I got:

    y = 12
    global y in f

reversed. And I was thinking of PyChecker warning about this.
I should not have been thinking about these things while trying to set a release candidate build going so I could head home on a Friday evening :(

Tim Delaney

From greg at cosc.canterbury.ac.nz Sun Oct 26 22:28:08 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 22:28:18 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310261134.56982.aleaxit@yahoo.com>
Message-ID: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> > Ideally, augmented assignments would also become "rebinding". However,
> > this may have compatibility problems.
>
> Unfortunately yes. It might have been better to define them that way in
> the first place, but changing them now is dubious.

I'm not so sure. You need an existing binding before an augmented assignment will work, so I don't think there can be any correct existing usages that would be broken by this.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com Sun Oct 26 22:31:12 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:31:23 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF34@au3010avexu1.global.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
>
> > Is that compatible with current use? I think the current
> > semantics are that global always binds name to an object
> > with that name at module scope.
>
> No, it's not quite compatible, but I don't think
> it's likely to break anything much in practice.

I'm almost 100% sure that it will. People tend to use the same short variable names for things, and nested functions had *better* be related ...

We could not use an unadorned 'global' for such a change in semantics. It would require a new keyword.
Tim Delaney

From guido at python.org Sun Oct 26 22:58:19 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 22:58:35 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 16:28:08 +1300."
	<200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
Message-ID: <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>

[attribution lost]
> > > Ideally, augmented assignments would also become "rebinding". However,
> > > this may have compatibility problems.

[Alex]
> > Unfortunately yes. It might have been better to define them that way in
> > the first place, but changing them now is dubious.

[Greg]
> I'm not so sure. You need an existing binding before an
> augmented assignment will work, so I don't think there can
> be any correct existing usages that would be broken by this.

Indeed. If x is neither local nor declared global, x+=... is always an error, even if an x at an intermediate level exists, so THAT shouldn't be used as an argument against this.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Sun Oct 26 23:28:09 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 23:28:18 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <87k76rhnn6.fsf@egil.codesourcery.com>
Message-ID: <200310270428.h9R4S9u14063@oma.cosc.canterbury.ac.nz>

> I like Just's := concept except for the similarity to =, and I worry
> that the presence of := in the language will flip people into "Pascal
> mode" -- thinking that = is the equality operator. I also think that
> the notation is somewhat unnatural -- "globalness" is a property of
> the _variable_, not the operator. So I'd like to suggest instead
>
>     :var = value        # var in module scope
>     :scope:var = value  # var in named enclosing scope

Yeek, that makes it look like Logo!
What about simply

    outer x = value

In this, 'outer' would be an annotation applicable to any bare name in an lvalue position, so you could say

    (x, outer y, self.z) = stuff

if you wanted, or even

    def outer f():
        ...

    class outer C:
        ...

although probably I wouldn't mind much if those were disallowed.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From pete at shinners.org Mon Oct 27 01:02:47 2003
From: pete at shinners.org (Pete Shinners)
Date: Mon Oct 27 00:07:25 2003
Subject: [Python-Dev] VisualC6 Available
Message-ID: 

The place I used to work at had several retail copies of MS Visual Studio 6.0. Since the company is no longer in business, I have one available to offer to anyone developing python. If they don't go to someone useful here they will likely just end up in the dumpster. I figure if anyone is stuck using a potentially 'shady' licensed version this could be a good chance to get all legit.

The full product is "Microsoft Visual Studio 6.0 Enterprise Edition, English". This will come with the original CD's, Case, CD Key, and Certificate of Authenticity.

If this sounds like it will help, drop me an email and I'll figure out how to get it to you. I'm especially interested in helping someone actively developing python.

From python at rcn.com Mon Oct 27 00:12:33 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 00:13:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>
Message-ID: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>

> > I've re-read some of the old email on the subject but didn't see what
> > this buys us that we don't already get with the current tee().
>
> Performance-wise I don't know; we'd have to profile it I guess.
:-(

My question was more directed toward non-performance issues. Do we really have *any* need for more than two iterators running concurrently? After all, it's already difficult to come up with good use cases for two that are not dominated by list() or window().

> With the current tee(), I was thinking that if the two iterators stay
> close, you end up moving the in basket to the out basket rather
> frequently, and the overhead of that might beat the simplicity of the
> linked lists.

With current tee(), runtime is dominated by calls to Append and Pop (reverse is super-fast and moves each element only once). Those calls are more expensive than a link jump; however append() and pop() are optimized to avoid calls to the memory manager while every link would need steps for alloc/initialization/reference/dealloc. Cache effects are also important because the current tee() uses much less memory and the two memory blocks are contiguous.

> Also, *if* you need a lot of clones, using multiple
> tee() calls ends up creating several queues, again causing more
> overhead. (These queues end up together containing all the items from
> the oldest to the newest iterator.)

*If* we want to support multiple clones, there is an alternate implementation of the current tee that only costs one extra word per iteration. That would be in there already. I really *wanted* a multi-way tee but couldn't find a single use case that warranted it.

> I also note that the current tee() doesn't let you use __copy__ easily
> (it would be quite messy I think).

To __copy__ is to tee. Both make two iterators from one. They are different names for the same thing. Right now, they don't seem comparable because the current tee is only a two way split and you think of copy as being a multi-way split for no extra cost.

> Maybe Andrew has some use cases?

I hope so. I can't think of anything that isn't dominated by list(), window(), or the current tee().
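The append/pop/reverse pattern Raymond describes is the classic two-stack queue; a minimal pure-Python sketch (the TwoStackQueue name is invented for illustration, this is not the actual implementation):

```python
# push() lands on an "in" stack; when the "out" stack runs dry, the
# "in" stack is reversed and swapped over, so each element is moved
# exactly once and both blocks of memory stay contiguous.

class TwoStackQueue(object):
    def __init__(self):
        self.inbasket = []
        self.outbasket = []

    def push(self, item):
        self.inbasket.append(item)

    def pop(self):
        if not self.outbasket:
            self.inbasket.reverse()     # the single O(n) move
            self.inbasket, self.outbasket = self.outbasket, self.inbasket
        return self.outbasket.pop()     # amortized O(1) per element

q = TwoStackQueue()
for i in range(5):
    q.push(i)
assert [q.pop() for _ in range(5)] == [0, 1, 2, 3, 4]
```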
And, if needed, the current tee() can easily be made multi-way. It doubles the unit memory cost from one word to two but that's nothing compared to the link method (two words for PyHead, another 3 (Linux) or 4 (Windows) words for GC, and another 3 for the data fields). > The question is, how often does one need this? Have you seen real use > cases for tee() that aren't better served with list()? I'm sure they exist, but they are very few. I was hoping that a simple, fast, memory efficient two-way tee() would have satisfied all the requests, but this thing appears to be taking on a life of its own with folks thinking they need multiple concurrent iterators created by a magic method (what for?). Raymond From aleaxit at yahoo.com Mon Oct 27 02:51:02 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 02:51:12 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> Message-ID: <200310270851.02495.aleaxit@yahoo.com> On Monday 27 October 2003 06:12, Raymond Hettinger wrote: ... > My question was more directed toward non-performance issues. Do we > really have *any* need for more than two iterators running concurrently? I admit I have no use cases for that. It was probably a case of over-eager generalization on my part. I understand and appreciate all of your other explanations on performance, except one: > > I also note that the current tee() doesn't let you use __copy__ easily > > (it would be quite messy I think). > > To __copy__ is to tee. Both make two iterators from one. > They are different names for the same thing. > Right now, they don't seem comparable because the current tee is only a > two way split and you think of copy as being a multi-way split for no > extra cost. I don't understand this. __copy__ is a special method that a type may or may not expose. 
If it does, copy.copy(x) on an instance x of that type makes and returns one (shallow) copy of x. I just got a PEP number (323) for Copyable Iterators as recently discussed, and hope to commit the PEP within today. But, basically, the idea is trivially simple: iterators which really have a tiny amount of state, such as those on sequences and dicts, will expose __copy__ and implement it by just duplicating said tiny amount (one pointer to a container and an index).

But I don't understand how it would be quite messy to take advantage of this in tee(), either: simply, tee() would start with the equivalent of

    it = iter(it)
    try:
        return it, copy.copy(it)
    except TypeError:
        pass

and proceed just like now if this shortcut hasn't worked -- that's all.

Alex

From aleaxit at yahoo.com Mon Oct 27 03:06:44 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 03:06:52 2003
Subject: [Python-Dev] the "3*x works w/o __rmul__" bug
In-Reply-To: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>
References: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>
Message-ID: <200310270906.44209.aleaxit@yahoo.com>

On Monday 27 October 2003 04:22, Greg Ewing wrote:
> Alex Martelli :
> > Exactly the same underlying reason as a bug I just opened on
> > SF: if x is an instance of a class X having __mul__ but not
> > __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError
>
> Seems to me the bug there is not giving X an __rmul__
> method and yet expecting y*x to work at all. The fact
> that it happens to work in some cases is an accident
> that should not be relied upon.

No, the bug is that it works in some cases where it should fail (and, secondarily, that -- where it does fail -- it gives a weird error message). In other words, the bug (in Python) is that "accident".
Nobody's asking for 3.0*x to work where x is a user-coded type without an __rmul__; rather, the point is that 3*x should fail too, and ideally they'd have the same clear error message as 3+x gives when the type has no __radd__.

Doesn't seem trivial to fix (though I hope I'm missing something obvious) and doesn't affect perfect user-programs, but I do think it should be fixed because it's sure extremely mysterious and could send a developer on wild goose chases when met in the course of development.

Alex

From ncoghlan at iinet.net.au Mon Oct 27 03:31:32 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 27 03:31:41 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <16284.87.71562.652543@montanaro.dyndns.org>
References: <200310261623.54136.aleaxit@yahoo.com>
	<20031026154626.GA18564@panix.com>
	<200310261723.20026.aleaxit@yahoo.com>
	<16284.87.71562.652543@montanaro.dyndns.org>
Message-ID: <3F9CD7E4.5070609@iinet.net.au>

Skip Montanaro strung bits together to say:

> This is one place I think an extension of the global statement has a
> definite advantage:
>
>     def f():
>         def g():
>             global z in f
>             z = x

Alternately (using Just's 'rebinding non-local' syntax):

    def f():
        z = None
        def g():
            z := x

Cheers,
Nick.

-- 
Nick Coghlan               | Brisbane, Australia
ICQ#: 68854767             | ncoghlan@email.com
Mobile: 0409 573 268       | http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."
From ncoghlan at iinet.net.au Mon Oct 27 04:06:35 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 27 04:06:48 2003
Subject: [Python-Dev] product()
In-Reply-To: <200310261310.27950.fincher.8@osu.edu>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310260820.59266.fincher.8@osu.edu>
	<3F9BDCA1.5040101@iinet.net.au>
	<200310261310.27950.fincher.8@osu.edu>
Message-ID: <3F9CE01B.7070905@iinet.net.au>

Jeremy Fincher strung bits together to say:

> On Sunday 26 October 2003 09:39 am, Nick Coghlan wrote:
> >> >>> if any(not pred(x) for x in values): pass # anyfalse
> if not all(pred(x) for x in values): pass
> >> >>> if all(not pred(x) for x in values): pass # allfalse
> if not any(pred(x) for x in values): pass
> It's slightly more efficient (only one negation), and it seems to maintain
> better the pseudocode-like aspect that we so much adore in Python :)

I originally wrote them out the way you suggest, but then changed them after I added the comment that indicated what each example represented (as the less efficient versions more literally match the comments).

Anyway, I suspect those used to the idiom would use the forms you suggest. There might be some variation due to the multiple ways of writing the expressions (using any/all), but I doubt that would be worse than the confusion created by the double negative needed to express either any or all in terms of the other.

Cheers,
Nick.

-- 
Nick Coghlan               | Brisbane, Australia
ICQ#: 68854767             | ncoghlan@email.com
Mobile: 0409 573 268       | http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."
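The two spellings being compared are just De Morgan's laws; with the any()/all() functions being proposed here (they later became builtins in Python 2.5) the equivalence is easy to check:

```python
# "some element fails pred" == "not all elements satisfy pred", and
# "all elements fail pred"  == "no element satisfies pred".

values = [0, 1, 2, 3, 4]
pred = lambda x: x > 0

assert any(not pred(x) for x in values) == (not all(pred(x) for x in values))
assert all(not pred(x) for x in values) == (not any(pred(x) for x in values))
```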
From aleaxit at yahoo.com Mon Oct 27 04:33:56 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 04:34:08 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
Message-ID: <200310271033.56569.aleaxit@yahoo.com>

On Monday 27 October 2003 04:58 am, Guido van Rossum wrote:
> [attribution lost]
> > > > Ideally, augmented assignments would also become "rebinding".
> > > > However, this may have compatibility problems.
>
> [Alex]
> > > Unfortunately yes. It might have been better to define them that way
> > > in the first place, but changing them now is dubious.
>
> [Greg]
> > I'm not so sure. You need an existing binding before an
> > augmented assignment will work, so I don't think there can
> > be any correct existing usages that would be broken by this.
>
> Indeed. If x is neither local nor declared global, x+=... is always
> an error, even if an x at an intermediate level exists, so THAT
> shouldn't be used as an argument against this.

Actually, if the compiler were able to diagnose that, it would be wonderful -- but I don't think it can, because it can make no assumptions regarding what might be defined in global scope (or at least it definitely can't make any such assumptions now). So, yes, any sensible program that works today would keep working. I dunno about NON-sensible programs such as:

    def outer():
        x = 23
        def inner():
            exec 'x = 45'
            x+=1
        # etc etc

but then I guess the presence of 'exec' might be defined to change semantics of += and/or disallow := or whatever else, just as today it turns off local-variable optimizations.
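Guido's point is easy to demonstrate: an augmented assignment to a name that is neither local nor global is simply an error today, which is why the accumulator idiom needs a mutable cell (Python much later grew 'nonlocal' for exactly this). A small sketch, with invented function names:

```python
# "n += i" makes n local to increment(), and it is then read before
# being bound -- hence the error, regardless of the enclosing n.

def accumulator(n=0):
    def increment(i):
        n += i              # UnboundLocalError when called
        return n
    return increment

try:
    accumulator()(1)
except UnboundLocalError:
    pass
else:
    raise AssertionError("expected UnboundLocalError")

# The traditional workaround: keep the state in a mutable cell that the
# closure only ever *reads* (so no rebinding of a free name occurs).

def accumulator2(n=0):
    cell = [n]
    def increment(i):
        cell[0] += i        # mutates the list in place
        return cell[0]
    return increment

acc = accumulator2()
assert acc(1) == 1
assert acc(41) == 42
```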
My slight preference for leaving += and friends alone is that a function using them to rebind nonlocals would be hard to read, that since the change only applies when the LHS is a bare name the important use cases for augmented assignment don't apply any way, that it's a bit subtle to explain that

    foo.bar += baz

( += on a dotted name) implies a plain assignment (setattr) on foo.bar while

    foo_bar += baz

( += on bare name) might imply a := assignment (rebinding a nonlocal) IF there are no "foo_bar = baz" elsewhere in the same function BUT it would imply a plain assignment if there ARE other plain assignments to the same name in the same function, ...

IOW it seems to me that we're getting into substantial amounts of subtlety in explaining (and thus maybe in implementing) a functionality change that's not terribly useful anyway and may damage rather than improve readability when it's used.

Taking the typical P. Graham accumulator example, say: with += rebinding, we can code this:

    def accumulator(n=0):
        def increment(i):
            n += i
            return n
        return increment

but without it, we would code:

    def accumulator(n=0):
        def increment(i):
            n := n + i
            return n
        return increment

and it doesn't seem to me that the two extra keystrokes are to be considered a substantial price to pay. Admittedly in such a tiny example readability is just as good either way, as it's obvious which n we're talking about (there being just one, and extremely nearby wrt the point of use of either += or := ).

Suppose we wanted to have the accumulator "saturate" -- if the last value it returned was > m it must restart accumulating from zero. Now, without augmented assignment:

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n := i
            else:
                n := n + i
            return n
        return increment

we have a pleasing symmetry and no risk of errors -- if we mistakenly use an = instead of := in either branch the compiler will be able to let us know immediately.
(Actually I'd be quite tempted to code the if branch as "n := 0 + i" to underscore the symmetry, but maybe I'm just weird:-).

If we do rely on augmented assignment being "rebinding":

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n = i
            else:
                n += i
            return n
        return increment

the error becomes a runtime rather than compile-time one, and does take a (small but non-zero) time to discover it.

The += 's subtle new semantics (rebinds either a local or nonlocal, depending on how other assignments elsewhere in the function are coded) do make it slightly harder to understand and explain, compared to my favourite approach, which is: := is the ONLY way to rebind a nonlocal name (and only ever does that, only with a bare name on LHS, etc, etc) which can't be beaten in terms of how simple it is to understand and explain. The compiler could then diagnose an error when it sees := and += used on the same barename in the same function (and perhaps give a clear error message suggesting non-augmented := usage in lieu of the augmented assignment).

Can somebody please show a compelling use case for some "nonlocal += expr" over "nonlocal := nonlocal + expr", sufficient to override all the "simplicity" arguments above? I guess there must be some, since popular feeling appears to be in favour of having augmented-assignment as "rebinding", but I can't see them.

Alex

From greg at electricrain.com Mon Oct 27 04:40:45 2003
From: greg at electricrain.com (Gregory P. Smith)
Date: Mon Oct 27 04:41:01 2003
Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason
In-Reply-To: <200310270930.28811.aleaxit@yahoo.com>
References: <200310251232.55044.aleaxit@yahoo.com>
	<20031027075422.GK3929@zot.electricrain.com>
	<200310270930.28811.aleaxit@yahoo.com>
Message-ID: <20031027094045.GL3929@zot.electricrain.com>

> > It is unfortunately entirely possible that various berkeleydb libraries
Since the BerkeleyDB db->del() call isn't returning it is > > presumably stuck in a lock waiting for who knows what. > > Right. But the SAME berkeley db library is being used for my build of > both Python 2.4 alpha 0, and 2.3 maintenance branch, both from cvs, > and I can't see any difference in what they're doing with bsddb -- so > clearly I must be missing something because it's hanging on EVERY > attempt to run the unittest w/2.4, but never w/2.3. The big difference i see between 2.3cvs and 2.4cvs that could "explain" it is that Lib/bsddb/__init__.py has been updated to use a private (in memory, single process only) DBEnv with locking and thread support enabled. That explains why db->del() would be doing locking. But not why it would deadlock. This is also easily reproducable here. No special platform or berkeleydb version should be required. Looking closer I suspect what is happening is that Lib/bsddb/__init__.py implementation is not threadsafe. It wants to maintain the current iterator location using a DBCursor object. However, having an active DBCursor holds a lock in the database. DictMixin's popitem() is effectively: k, v = self.iteritems().next() del self[k] return (k, v) The iteritems() call creates an internal DBCursor object for the iterator. The next() call on the iterator (DBCursor) looks up the value for k. The following delete attempts to delete the record without using the DBCursor; thus the deadlock. If we implement our own popitem() for the bsddb dictionary object (_DBWithCursor) to perform the delete using the cursor this deadlock in the unit tests would go away. That won't stop users from intermixing iteration over a database with modifications to the database; causing their own deadlocks (very unexpected in single threaded code). 
Proposed fix: It should be possible for the bsddb object to maintain internal state of its own about what key it is on and close any internal DB cursor on all non-cursor database accesses, leaving the iteration functions to detect this and reopen and reposition the cursor. Since the basic bsddb interface doesn't allow databases with duplicate keys it shouldn't be too difficult. It's not efficient but a user who cares about efficient use of berkeleydb should use the real DB/DBEnv interface directly.

How do python dictionaries deal with modifications to the dictionary intermixed with iteration?

Greg

From aleaxit at yahoo.com Mon Oct 27 05:25:16 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 05:25:24 2003
Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason
In-Reply-To: <20031027094045.GL3929@zot.electricrain.com>
References: <200310251232.55044.aleaxit@yahoo.com>
	<200310270930.28811.aleaxit@yahoo.com>
	<20031027094045.GL3929@zot.electricrain.com>
Message-ID: <200310271125.16879.aleaxit@yahoo.com>

On Monday 27 October 2003 10:40 am, Gregory P. Smith wrote:
...
> The big difference i see between 2.3cvs and 2.4cvs that could "explain"
> it is that Lib/bsddb/__init__.py has been updated to use a private
> (in memory, single process only) DBEnv with locking and thread support
> enabled. That explains why db->del() would be doing locking. But not
> why it would deadlock.

*AH*! I wasn't looking in the right place, silly me. Good job!!!

Yes, now that you've pointed it out, the change from 2.3's

    d = db.DB()

to 2.4's

    e = _openDBEnv()
    d = db.DB(e)

must be the culprit. I still don't quite see how the lock ends up being "held", but, don't mind me -- the intricacy of mixins and wrappings and generators and delegations in those modules is making my head spin anyway, so it's definitely not surprising that I can't quite see what's going on.

> How do python dictionaries deal with modifications to the dictionary
> intermixed with iteration?
In general, Python doesn't deal well with modifications to any iterable in the course of a loop using an iterator on that iterable. The one kind of "modification during the loop" that does work is:

    for k in somedict:
        somedict[k] = ...whatever...

i.e. one can change the values corresponding to keys, but not change the set of keys in any way -- any changes to the set of keys can cause unending loops or other such misbehavior (not deadlocks nor crashes, though...).

However, on a real Python dict,

    k, v = thedict.iteritems().next()

doesn't constitute "a loop" -- the iterator object returned by the iteritems call is dropped since there are no outstanding references to it right after this statement. So, following up with

    del thedict[k]

is quite all right -- the dictionary isn't being "looped on" at that time.

Given that in bsddb's case that iteritems() first [and only] next() boils down to a self.first() which in turn does a self.dbc.first() I _still_ don't see exactly what's holding the lock.
But the simplest fix would appear to be in __delitem__, i.e., if we have a cursor we should delete through it:

    def __delitem__(self, key):
        self._checkOpen()
        if self.dbc is not None:
            self.dbc.set(key)
            self.dbc.delete()
        else:
            del self.db[key]

...but this doesn't in fact remove the deadlock on the unit-test for popitem, which just confirms I don't really grasp what's going on, yet!-)

Alex

From just at letterror.com Mon Oct 27 05:53:47 2003
From: just at letterror.com (Just van Rossum)
Date: Mon Oct 27 05:53:47 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271033.56569.aleaxit@yahoo.com>
Message-ID: 

Alex Martelli wrote:

> My slight preference for leaving += and friends alone is that
> a function using them to rebind nonlocals would be hard to
> read, that since the change only applies when the LHS is a
> bare name the important use cases for augmented assignment
> don't apply any way, that it's a bit subtle to explain that
>
>     foo.bar += baz
>
> ( += on a dotted name) implies a plain assignment (setattr)
> on foo.bar while
>
>     foo_bar += baz
>
> ( += on bare name) might imply a := assignment (rebinding
> a nonlocal) IF there are no "foo_bar = baz" elsewhere in the
> same function BUT it would imply a plain assignment if there
> ARE other plain assignments to the same name in the same
> function, ...

To an extent you're only making it _more_ difficult by saying "x := ... rebinds to a non-local name" instead of "x := rebinds to x in whichever scope x is defined (which may be the local scope)". With the latter definition, there's less to explain regarding "x += ..." as a rebinding operation.

I find that _if_ we were to add a rebinding operator, it would be extremely silly not to allow augmented assignments to be rebinding, perhaps even patronizing: "yes you can assign to outer scopes, but no you can't use augmented assignments for that since we think it makes it too difficult for you."
We should either _not_ allow assignments to outer scopes at all, _or_
allow it and make it as powerful as practically possible.  I don't
think allowing it with non-obvious (arbitrary) limitations is a good
idea.  For example, the more I think about it, the more I am _against_
disallowing "a, b := b, a".

That said, someone made a point here that rebinding is a behavior of a
variable, not the assignment operation: that's a very good one indeed,
and does make me less certain of whether adding := would be such a good
idea after all.

Just

From python at rcn.com Mon Oct 27 06:24:57 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 06:25:53 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>
Message-ID: <001201c39c7c$f4582140$81b0958d@oemcomputer>

> I also note that the current tee() doesn't let you use __copy__ easily
> (it would be quite messy I think).  The linked-list version supports
> __copy__ trivially.  This may be important if we execute (as I think
> we should) on the idea of making selected iterators __copy__-able
> (especially all the standard container iterators and xrange).

The current tee() was written to support only a two way split, but it
can easily be cast as a multi-way splitter with no problem.

The only real difference in the ideas presented so far is whether the
underlying queue should be implemented as a singly linked list or as a
double stack.  As a proof-of-concept, here is GvR's code re-cast with
the queue changed to a double stack implementation.  The interface is
completely unchanged.  The memory consumed is double that of the
current tee() but much less than the linked list version.  The speed is
half that of the current tee() and roughly comparable to or slightly
better than the linked list version.
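[The underlying idea, a FIFO queue built from two Python lists used as stacks, can be sketched on its own. The TwoStackQueue name is hypothetical and independent of the Wrapper code in Raymond's attached program; it only illustrates the queue discipline he describes.]

```python
class TwoStackQueue:
    """FIFO queue built from two lists used as stacks.

    append() and pop() at the end of a list are cheap, so enqueue
    pushes onto an "in" stack and dequeue pops from an "out" stack;
    when the out stack empties, the in stack is reversed into it in
    one bulk move, so each element moves at most once per direction.
    """

    def __init__(self):
        self._in = []
        self._out = []

    def enqueue(self, value):
        self._in.append(value)

    def dequeue(self):
        if not self._out:
            # Bulk transfer from the in stack to the out stack.
            self._in.reverse()
            self._out, self._in = self._in, self._out
        return self._out.pop()  # raises IndexError when empty

q = TwoStackQueue()
for i in range(5):
    q.enqueue(i)
assert [q.dequeue() for _ in range(5)] == [0, 1, 2, 3, 4]
```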
Raymond Hettinger

------------------------------------------------------------------------

""" Guido's demo program re-cast with a different underlying data structure

Replaces the linked list based queue with a two stack based queue.

Advantages:  The double stack method consumes only two pointers per
data element while the linked list method consumes space for a link
object (8 to 10 words).  The double stack method uses contiguous memory
while the link objects are more fragmented.  The stack method uses
append() and pop() which are optimized to minimize memory management
calls.  For the link method, every link costs a malloc and free.

Todo:  Handle Wrappers that are GC'd before termination.
       Add support for holding an exception.
"""

class TeeMaster(object):
    """Holder for information common to wrapped iterators"""

    def __init__(self, it):
        self.inbasket = []
        self.inrefcnt = []
        self.outbasket = []
        self.outrefcnt = []
        self.numseen = 0
        self.it = it
        self.numsplits = 0

class Wrapper(object):
    """Copyable wrapper around an iterator.

    Any number of Wrappers around the same iterator share the TeeMaster
    object.  The Wrapper that is most behind will drop the refcnt to
    zero, which causes the reference to be popped off of the queue.

    The newest Wrapper gets a brand new TeeMaster object.  Later
    wrappers share an existing TeeMaster object.  Since they may have
    arrived late in the game, they need to know how many objects have
    already been seen by the wrapper.  When they call next(), they ask
    for the next numseen.

    If a Wrapper is garbage-collected before it finishes, the refcount
    floor needs to be raised.  That has not yet been implemented.
    """

    __slots__ = ["master", "numseen"]

    def __init__(self, it, master=None):
        """Constructor.  The master argument is used by __copy__ below."""
        if master is None:
            master = TeeMaster(it)
        self.master = master
        self.numseen = master.numseen
        self.master.numsplits += 1

    def __copy__(self):
        """Copy the iterator.

        This returns a new iterator that will return the same series of
        results as the original.
        """
        return Wrapper(None, self.master)

    def __iter__(self):
        """All iterators should support __iter__() returning self."""
        return self

    def next(self):
        """Get the next value of the iterator, or raise StopIteration."""
        master = self.master
        inbasket, inrefcnt = master.inbasket, master.inrefcnt
        if master.numseen == self.numseen:
            # This is the lead dog so get a value through the iterator
            value = master.it.next()
            master.numseen += 1
            # Save it for the other dogs
            inbasket.append(value)
            inrefcnt.append(master.numsplits-1)
            self.numseen += 1
            return value
        # Not a lead dog -- the view never changes :-(
        location = len(inbasket) - (master.numseen - self.numseen)
        if location >= 0:
            # Our food is in the inbasket
            value = inbasket[location]
            inrefcnt[location] -= 1
            rc = inrefcnt[location]
        else:
            # Our food is in the outbasket
            location = -(location + 1)
            value = master.outbasket[location]
            master.outrefcnt[location] -= 1
            rc = master.outrefcnt[location]
        # Purge doggie bowl when no food is left
        if rc == 0:
            if len(master.outbasket) == 0:
                master.outbasket, master.inbasket = master.inbasket, master.outbasket
                master.outrefcnt, master.inrefcnt = master.inrefcnt, master.outrefcnt
                master.outbasket.reverse()
                master.outrefcnt.reverse()
            master.outbasket.pop()
            master.outrefcnt.pop()
        self.numseen += 1
        return value

def tee(it):
    """Replacement for Raymond's tee(); see examples in itertools docs."""
    if not hasattr(it, "__copy__"):
        it = Wrapper(it)
    return (it, it.__copy__())

def test():
    """A simple demonstration of the Wrapper class."""
    import random
    def gen():
        for i in range(10):
            yield i
    it = gen()
    a, b = tee(it)
    b, c = tee(b)
    c, d = tee(c)
    iterators = [a, b, c, d]
    while iterators != [None, None, None, None]:
        i = random.randrange(4)
        it = iterators[i]
        if it is None:
            next = "----"
        else:
            try:
                next = it.next()
            except StopIteration:
                next = "****"
                iterators[i] = None
        print "%4d%s%4s%s" % (i, " ."*i, next, " ."*(3-i))

if __name__ == "__main__":
    test()

From mwh at python.net Mon Oct 27 07:45:31 2003
From: mwh at python.net (Michael Hudson)
Date: Mon Oct 27 07:45:40 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer> (Raymond Hettinger's message of "Fri, 24 Oct 2003 15:01:09 -0400")
References: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
Message-ID: <2m3cden91w.fsf@starship.python.net>

"Raymond Hettinger" writes:

>> PyList_SetSlice(lst, n-1, n, NULL);
>
> There's the new piece of information.  I didn't know that the final
> argument could be NULL and creating/destroying an empty list for the
> arg was unpleasant.  I'll add that info to the API docs.

"del thing" is punned into "set thing NULL" at a pretty low level, and
fairly consistently (hope that made sense...).

Cheers,
mwh

-- 
MAN:  How can I tell that the past isn't a fiction designed to account
for the discrepancy between my immediate physical sensations and my
state of mind?
   -- The Hitch-Hikers Guide to the Galaxy, Episode 12

From guido at python.org Mon Oct 27 09:49:39 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 09:51:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 00:12:33 EST." <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
Message-ID: <200310271449.h9REnd026601@12-236-54-216.client.attbi.com>

> My question was more directed toward non-performance issues.  Do we
> really have *any* need for more than two iterators running concurrently?
> After all, it's already difficult to come-up with good use cases for two
> that are not dominated by list() or window().
>
> > With the current tee(), I was thinking that if the two iterators stay
> > close, you end up moving the in basket to the out basket rather
> > frequently, and the overhead of that might beat the simplicity of the
> > linked lists.
> With current tee(), runtime is dominated by calls to Append and Pop
> (reverse is super-fast and moves each element only once).  Those
> calls are more expensive than a link jump; however append() and pop()
> are optimized to avoid calls to the memory manager while every link
> would need steps for alloc/initialization/reference/dealloc.  Cache
> effects are also important because the current tee() uses much less
> memory and the two memory blocks are contiguous.
>
> > Also, *if* you need a lot of clones, using multiple
> > tee() calls ends up creating several queues, again causing more
> > overhead.  (These queues end up together containing all the items from
> > the oldest to the newest iterator.)
>
> *If* we want to support multiple clones, there is an alternate
> implementation of the current tee that only costs one extra word per
> iteration.  That would be in there already.  I really *wanted* a
> multi-way tee but couldn't find a single use case that warranted it.

All points well taken.

> > I also note that the current tee() doesn't let you use __copy__ easily
> > (it would be quite messy I think).
>
> To __copy__ is to tee.  Both make two iterators from one.
> They are different names for the same thing.
> Right now, they don't seem comparable because the current tee is only a
> two way split and you think of copy as being a multi-way split for no
> extra cost.

Here I respectfully differ.  When you tee, you have to stop using the
underlying iterator, and replace it with one of the tee'ed copies.
When you __copy__, you can continue to use the original.  The
difference matters if you're tee'ing an iterator owned by another piece
of code.

> > Maybe Andrew has some use cases?
>
> I hope so.  I can't think of anything that isn't dominated by list(),
> window(), or the current tee().
>
> And, if needed, the current tee() can easily be made multi-way.
It > doubles the unit memory cost from one word to two but that's nothing > compared to the link method (two words for PyHead, another 3 (Linux) or > 4 (Windows) words for GC, and another 3 for the data fields). As you said in your first msg, you could do it with much less overhead if the link cell wasn't made a PyObject. Also, Andrew's suggestion of using a link cell containing an array of values could be explored. But I'll happily back off until we find a use case that needs more than a 2-way tee *and* we find it's a performance bottleneck for your approach. We may never find that. > > The question is, how often does one need this? Have you seen real use > > cases for tee() that aren't better served with list()? > > I'm sure they exist, but they are very few. I was hoping that a simple, > fast, memory efficient two-way tee() would have satisfied all the > requests, but this thing appears to be taking on a life of its own with > folks thinking they need multiple concurrent iterators created by a > magic method (what for?). Well, there was a separate thread about __copy__'ing iterators. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 27 09:53:24 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 09:53:29 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Mon, 27 Oct 2003 08:51:02 +0100." <200310270851.02495.aleaxit@yahoo.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> Message-ID: <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> > But I don't understand how it would be quite messy to take advantage > of this in tee(), either: simply, tee() would start with the equivalent of > it = iter(it) > try: return it, copy.copy(it) > except TypeError:pass > and proceed just like now if this shortcut hasn't worked -- that's all. 
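[Alex's shortcut at the end of the quoted passage sketches easily in today's Python. The tee2 name is hypothetical; note that in current CPython, list and string iterators happen to support copy.copy through their pickle support, while generators raise TypeError, which exercises both paths:]

```python
import copy
from itertools import tee

def tee2(iterable):
    """Return two independent iterators, preferring a cheap copy."""
    it = iter(iterable)
    try:
        return it, copy.copy(it)  # works when the iterator is copyable
    except TypeError:
        return tee(it, 2)         # otherwise fall back to buffering

a, b = tee2([1, 2, 3])            # list iterators are copyable today
assert list(a) == list(b) == [1, 2, 3]

g = (x * x for x in range(3))     # generators are not: tee() path
a, b = tee2(g)
assert list(a) == list(b) == [0, 1, 4]
```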
Right, that's what the tee() at the end of my code did, except it
checked for __copy__ explicitly, since I assume that only iterators
whose author has thought about copyability should be assumed copyable;
this means that the default copy strategy for class instances (classic
and new-style) is suspect.

tee is more and less powerful than copy; it is more powerful because it
works for any iterator, but less so because you can't continue using
the underlying iterator (any calls to its next() method will be lost
for both tee'ed copies).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Mon Oct 27 10:09:03 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 10:09:13 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <20031027103540.GA27782@vicky.ecs.soton.ac.uk>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <20031027103540.GA27782@vicky.ecs.soton.ac.uk>
Message-ID: <200310271609.03819.aleaxit@yahoo.com>

On Monday 27 October 2003 11:35 am, Armin Rigo wrote:
> Hello Alex,
>
> On Mon, Oct 27, 2003 at 08:51:02AM +0100, Alex Martelli wrote:
> > I just got a PEP number (323) for Copyable Iterators as recently
> > discussed,
>
> .. where?

I was assigned the PEP number in email today and just now committed the
PEP (and the update of PEP 0 to list it) to CVS.

> > and hope to commit the PEP within today.  But, basically, the idea is
> > trivially simple: iterators which really have a tiny amount of state,
> > such as those on sequences and dicts, will expose __copy__ and implement
> > it by just duplicating said tiny amount (one pointer to a container and
> > an index).
>
> I needed this for sequence iterators and generators in a recent project.
> Duplicating a user-defined running generator seems funny, but it works
> quite well.
I use this to make a snapshot of the program state and restore > it later, and the program makes heavy use of parallel-running generators > stored in lists. > > http://codespeak.net/svn/user/arigo/misc/statesaver.c Cool! Why don't you try copy.copy on types you don't automatically recognize and know how to deal with, BTW? That might make this cool piece of code general enough that Guido might perhaps allow generator-produced iterators to grow it as their __copy__ method... Alex From guido at python.org Mon Oct 27 10:11:16 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 10:11:23 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Mon, 27 Oct 2003 10:33:56 +0100." <200310271033.56569.aleaxit@yahoo.com> References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com> <200310271033.56569.aleaxit@yahoo.com> Message-ID: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> > My slight preference for leaving += and friends alone is that > a function using them to rebind nonlocals would be hard to > read, that since the change only applies when the LHS is a > bare name the important use cases for augmented assignment > don't apply any way, that it's a bit subtle to explain that > foo.bar += baz > ( += on a dotted name) implies a plain assignment (setattr) > on foo.bar while > foo_bar += baz > ( += on bare name) might imply a := assignment (rebinding > a nonlocal) IF there are no "foo_bar = baz" elsewhere in the > same function BUT it would imply a plain assignment if there > ARE other plain assignments to the same name in the same > function, ... I think you're making this sound more complicated than it is. 
I don't think you'll ever *have* to explain this anyway, as long as :=
and += use the same rules to find their target (I'd even accept
rejecting the case where the target is a global for which the compiler
can't find a single assignment, breaking an utterly minuscule amount of
bad code, if any).

I'm *not* saying that I like := (so far I still like 'global x in f'
better) but I think that either way of allowing rebinding nonlocals
will also have to allow rebinding them through += and friends.

I think the main weakness (for me) of := and other approaches that try
to force you to say you're rebinding a nonlocal each time you do it is
beginning to show: there are already well-established rules for
deciding whether a bare name is local or not, and those rules have
always worked "at a distance".  The main reason for disallowing
rebinding nonlocals in the past has been that one of those rules was
"if there's a bare-name assignment to it it must be local (unless
there's also a global statement for it)" (and I couldn't find a
satisfactory way to add a nonlocal declarative statement and I didn't
think it was a huge miss -- actually I still think it's not a *huge*
miss).

> IOW it seems to me that we're getting into substantial amounts
> of subtlety in explaining (and thus maybe in implementing) a
> functionality change that's not terribly useful anyway and may
> damage rather than improve readability when it's used.
>
> Taking the typical P. Graham accumulator example, say:
> with += rebinding, we can code this:
>
>     def accumulator(n=0):
>         def increment(i):
>             n += i
>             return n
>         return increment
>
> but without it, we would code:
>
>     def accumulator(n=0):
>         def increment(i):
>             n := n + i
>             return n
>         return increment
>
> and it doesn't seem to me that the two extra keystrokes are to
> be considered a substantial price to pay.

That's the argument that has always been used against += by people who
don't like it.
The counterargument is that (a) the savings in typing isn't always that
small, and (b) += *expresses the programmer's thought better*.
Personally I expect that as soon as nonlocal rebinding is supported in
any way, people would be hugely surprised if += and friends were not.

> Admittedly in such a tiny example readability is just as good either
> way, as it's obvious which n we're talking about (there being just
> one, and extremely nearby wrt the point of use of either += or := ).
>
> Suppose we wanted to have the accumulator "saturate" -- if the last
> value it returned was > m it must restart accumulating from zero.
> Now, without augmented assignment:
>
>     def accumulator_saturating(n=0, m=100):
>         def increment(i):
>             if n > m:
>                 n := i
>             else:
>                 n := n + i
>             return n
>         return increment
>
> we have a pleasing symmetry and no risk of errors -- if we mistakenly
> use an = instead of := in either branch the compiler will be able to
> let us know immediately.  (Actually I'd be quite tempted to code the
> if branch as "n := 0 + i" to underscore the symmetry, but maybe I'm
> just weird:-).
>
> If we do rely on augmented assignment being "rebinding":
>
>     def accumulator_saturating(n=0, m=100):
>         def increment(i):
>             if n > m:
>                 n = i
>             else:
>                 n += i
>             return n
>         return increment
>
> the error becomes a runtime rather than compile-time one,
> and does take a (small but non-zero) time to discover it.

Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
declaration.  With 'nonlocal n, m' in increment() (or however it's
spelled :-) the intent is clear.
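[For the record, the declaration Guido sketches here is what later shipped as Python 3's nonlocal statement (PEP 3104); Graham's accumulator then reads:]

```python
def accumulator(n=0):
    def increment(i):
        nonlocal n  # declare once; plain and augmented assignment both rebind
        n += i
        return n
    return increment

acc = accumulator()
assert acc(3) == 3
assert acc(4) == 7
```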
> The > += 's subtle new semantics (rebinds either a local or nonlocal, > depending on how other assignments elsewhere in the > function are coded) do make it slightly harder to understand > and explain, compared to my favourite approach, which is: > := is the ONLY way to rebind a nonlocal name > (and only ever does that, only with a bare name on LHS, > etc, etc) > which can't be beaten in terms of how simple it is to understand > and explain. The compiler could then diagnose an error when it > sees := and += used on the same barename in the same > function (and perhaps give a clear error message suggesting > non-augmented := usage in lieu of the augmented assignment). > > > Can somebody please show a compelling use case for some > "nonlocal += expr" over "nonlocal := nonlocal + expr" , sufficient > to override all the "simplicity" arguments above? I guess there > must be some, since popular feeling appears to be in favour of > having augmented-assignment as "rebinding", but I can't see them. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Mon Oct 27 10:24:01 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 10:24:08 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> Message-ID: <200310271624.01265.aleaxit@yahoo.com> On Monday 27 October 2003 03:53 pm, Guido van Rossum wrote: > > But I don't understand how it would be quite messy to take advantage > > of this in tee(), either: simply, tee() would start with the equivalent > > of it = iter(it) > > try: return it, copy.copy(it) > > except TypeError:pass > > and proceed just like now if this shortcut hasn't worked -- that's all. 
> Right, that's what the tee() at the end of my code did, except it
> checked for __copy__ explicitly, since I assume that only iterators
> whose author has thought about copyability should be assumed copyable;
> this means that the default copy strategy for class instances (classic
> and new-style) is suspect.

I see!  So you want to be more prudent here than an ordinary copy would
be, and also disallow alternatives to __copy__ such as __getinitargs__
or __getstate__/__setstate__ ...?  Could you give an example of an
iterator class, which is "accidentally" copyable, but "shouldn't" be
for purposes of tee only?  I have a hard time thinking of any (hmmm,
perhaps a file object that's not "held" directly as an attribute, but
indirectly in some devious way...?).  Maybe I need to revise the PEP
323, which I just committed (to nondist/peps as usual) accordingly?

> tee is more and less powerful than copy; it is more powerful because
> it works for any iterator, but less so because you can't continue
> using the underlying iterator (any calls to its next() method will be
> lost for both tee'ed copies).

Yes, it IS worth pointing out that the idiom for using tee must always
be

    a, b = tee(c)

and c is not to be used afterwards -- or equivalently

    a, b = tee(a)

when, as common, there are no other references to a (even indirectly
e.g. via somebody holding on to a ref to a.next).  Hmmm, I wonder if
that should go in my PEP, though, since it's more about tee than about
copy...?

Alex

From guido at python.org Mon Oct 27 10:34:59 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 10:37:34 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 16:24:01 +0100."
<200310271624.01265.aleaxit@yahoo.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> <200310271624.01265.aleaxit@yahoo.com> Message-ID: <200310271535.h9RFYxm26796@12-236-54-216.client.attbi.com> > I see! So you want to be more prudent here than an ordinary copy > would be, and also disallow alternatives to __copy__ such as > __getinitargs__ or __getstate__/__setstate__ ...? Could you give > an example of an iterator class, which is "accidentally" copyable, but > "shouldn't" be for purposes of tee only? We discussed this before: if the state representing the iterator's position is a mutable object, copy.copy() will not copy this mutable object, so the two would share their state (or, more likely, part of their state). The example would be a tree iterator using a stack, represented as a list. > Yes, it IS worth pointing out that the idiom for using tee must > always be > a, b = tee(c) > and c is not to be used afterwards -- or equivalently > a, b = tee(a) > when, as common, there are no other references to a (even > indirectly e.g. via somebody holding on to a ref to a.next). Hmmm, > I wonder if that should go in my PEP, though, since it's more about tee > than about copy...? I think Raymond should add this to the tee() docs in big bold print. --Guido van Rossum (home page: http://www.python.org/~guido/) From amk at amk.ca Mon Oct 27 11:02:21 2003 From: amk at amk.ca (amk@amk.ca) Date: Mon Oct 27 11:05:46 2003 Subject: [Python-Dev] htmllib vs. HTMLParser Message-ID: <20031027160221.GA29155@rogue.amk.ca> Over in the Web SIG, it was noted that the HTML parser in htmllib has handlers for HTML 2.0 elements, and it should really support HTML 4.01, the current version. I'm looking into doing this. We actually have two HTML parsers: htmllib.py and the more recent HTMLParser.py. 
The initial check-in comment for 2001/05/18 for HTMLParser.py reads:

    A much improved HTML parser -- a replacement for sgmllib.  The API
    is derived from but not quite compatible with that of sgmllib, so
    it's a new file.  I suppose it needs documentation, and htmllib
    needs to be changed to use this instead of sgmllib, and sgmllib
    needs to be declared obsolete.  But that can all be done later.

sgmllib only handles those bits of SGML needed for HTML, and anyone
doing serious SGML work is going to have to use a real SGML parser, so
deprecating sgmllib is reasonable.  HTMLParser needs no changes for
HTML 4.01; only htmllib needs to get a bunch more handler methods.
Should I try to do this for 2.4?  (I can't find an explanation of how
the API differs between the two modules, but I can figure it out by
inspecting the code, and will try to keep the htmllib module
backward-compatible.)

--amk

From FBatista at uniFON.com.ar Mon Oct 27 11:10:47 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 27 11:11:59 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID:

To my (nice) surprise, all the testCases of Decimal.py ran OK.  These
tests were all about the specification (the ugly side, :) and not about
using the class.

For instance, you can do::

    x = Decimal(3) / 5

and it gets done all right (according to the test cases of Mike
Cowlishaw).  But you can't do::

    x = 5 / Decimal(3)

So, here is a tentative list of ToDo for myself:

1. Clean up unused code, reorder methods (all publics together, etc).
2. Put some repeated code inside functions.
3. Write a pre-PEP.
4. Write testCases for the functionality specified by the pre-PEP.
5. Write the code to comply with the testCases.
6. Write the PEP.
7. Submit everything.

Some questions:

- Is there some of this work (especially the third item) already done
  or started?
- Should I submit partial work or everything as a whole?
- Are modifications to the sandbox modules considered patches?  Should
  I send them through the SourceForge interface?
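[The behavior Facundo is after did land: in the decimal module that grew out of this work (PEP 327), mixed int/Decimal arithmetic works in both operand orders. A quick check in today's Python:]

```python
from decimal import Decimal

# Both operand orders now work; ints are converted to Decimal exactly.
assert Decimal(3) / 5 == Decimal("0.6")
assert 5 / Decimal(3) == Decimal(5) / Decimal(3)
```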
As always, suggestions and similars are welcomed (and very
appreciated).  Thank you.

Facundo Batista
Gestión de Red
fbatista@unifon.com.ar
(54 11) 5130-4643
Cel: 15 5132 0132
From neal at metaslash.com Mon Oct 27 11:12:02 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Mon Oct 27 11:12:11 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com> <200310271033.56569.aleaxit@yahoo.com> <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
Message-ID: <20031027161202.GF5842@epoch.metaslash.com>

On Mon, Oct 27, 2003 at 07:11:16AM -0800, Guido van Rossum wrote:
>
> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
> declaration.  With 'nonlocal n, m' in increment() (or however it's
> spelled :-) the intent is clear.

I dislike := very much.  I think it will confuse newbies and thus be
abused.  While I dislike the global declaration, I don't feel strongly
about changing or removing it.

The best alternative I've seen that addresses nested scope and the
global declaration is to borrow :: from C++:

    foo = DEFAULT_VALUES
    counter = 0

    def reset_foo():
        ::foo = DEFAULT_VALUES

    def inc_counter():
        ::counter += 1

    def outer():
        counter = 5
        def inner():
            ::counter += outer::counter  # increment global from outer
            outer::counter += 2          # increment outer counter

The reasons why I like this approach:

* each variable reference can be explicit when necessary
* no separate declaration
* concise, no wording issues like global
* similarity between global and nested scopes (ie, ::foo is global,
  scope::foo is some outer scope); both the global and nested issues
  are handled at once
* doesn't prevent augmented assignment
* it reads well to me and the semantics are pretty clear (although
  that's highly subjective)

Neal
Is to borrow :: from C++: foo = DEFAULT_VALUES counter = 0 def reset_foo(): ::foo = DEFAULT_VALUES def inc_counter(): ::counter += 1 def outer(): counter = 5 def inner(): ::counter += outer::counter # increment global from outer outer::counter += 2 # increment outer counter The reasons why I like this approach: * each variable reference can be explicit when necessary * no separate declaration * concise, no wording issues like global * similarity between global and nested scopes (ie, ::foo is global, scope::foo is some outer scope) both the global and nested issues are handled at once * doesn't prevent augmented assignment * it reads well to me and the semantics are pretty clear (although that's highly subjective) Neal From aleaxit at yahoo.com Mon Oct 27 11:20:10 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 11:20:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310271033.56569.aleaxit@yahoo.com> <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> Message-ID: <200310271720.10313.aleaxit@yahoo.com> On Monday 27 October 2003 04:11 pm, Guido van Rossum wrote: ... > I don't think you'll ever *have* to explain this anyway, as long as := > and += use the same rules to find their target (I'd even accept Actually, I'd like to make a := ... an error when there's an a = ... in the same function, so it can't be exactly the same rules for a += ... in my opinion. > I'm *not* saying that I like := (so far I still like 'global x in f' Ah well. 
> I think the main weakness (for me) of := and other approaches that try > to force you to say you're rebinding a nonlocal each time you do it is > beginning to show: there are already well-established rules for > deciding whether a bare name is local or not, and those rules have There are, but they represent a wart (according to AMK's python-warts page, http://www.amk.ca/python/writing/warts.html , and I agree with him on this, although NOT with his suggested fix of having the compiler "automatically adding a global when needed" -- I don't like too-clever compilers that make subtle inferences behind my back, and I think that the fact that Python's compiler doesn't is a strength, not a weakness). The "well-established rules" also cause one of the "10 Python pitfalls" listed at http://zephyrfalcon.org/labs/python_pitfalls.html . My personal experience teaching/consulting/mentoring confirms this, although I, personally, don't remember having been bitten by this (but then, I recall only 2 of those 10 pitfalls as giving trouble to me personally, as opposed to people I taught/advised/etc: mutable default arguments, and "loops of x=x+y" performance traps for sequences). It seemed to me that introducing := (or other approaches that require explicit denotation of "I'm binding a nonlocal here") was a chance to FIX the warts/pitfalls of those "already well-established rules". Albeit with a heavy heart, I would consider even a Rubyesque stropping of nonlocals (Ruby uses $foo to mean foo is nonlocal, others here have suggested :foo, whatever, it's not the sugar that matters most to me here) preferable to using "declarative statements" for the purpose. Oh well. > always worked "at a distance". 
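[One of the two pitfalls Alex names, mutable default arguments, is quickly demonstrated; a generic sketch, not code from the thread:]

```python
def append_to(item, acc=[]):    # the [] is evaluated once, at def time
    acc.append(item)
    return acc

assert append_to(1) == [1]
assert append_to(2) == [1, 2]   # the same default list again: the classic surprise

# The usual fix: default to None and create a fresh list per call.
def append_to_fixed(item, acc=None):
    if acc is None:
        acc = []
    acc.append(item)
    return acc

assert append_to_fixed(1) == [1]
assert append_to_fixed(2) == [2]
```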
The main reason for disallowing > rebinding nonlocals in the past has been that one of those rules was > "if there's a bare-name assignment to it it must be local (unless > there's also a global statement for it)" (and I couldn't find a > satisfactory way to add a nonlocal declarative statement and I didn't > think it was a huge miss -- actually I still think it's not a *huge* > miss). Agreed, not huge, just probably marginally worth doing. Should it make "declarative statements" more popular and widely used than today's bare "global", I don't even know if it would be worth it. I don't like declarative statements. I don't understand why you like them here, when, in your message of Thursday 23 October 2003 06:25:49 on "accumulator display syntax", you condemned a proposal "because it feels very strongly like a directive to the compiler". "A directive to the compiler" is exactly how "global" and other proposed declarative-statements feel to me: statements that don't DO things (like all other statements do), but strictly and only are "like a directive to the compiler". > > and it doesn't seem to me that the two extra keystrokes are to > > be considered a substantial price to pay. > > That's the argument that has always been used against += by people who > don't like it. The counterargument is that (a) the savings in typing > isn't always that small, and (b) += *expresses the programmer's The saving in typing is not always small _when on the left of the augmented assignment operator you have something much more complicated than just a bare name_. 
For example,

    counter[current_row + current_column * delta] += current_value

Without += this statement would be too long, and it would be hard to
check that the LHS and RHS match exactly -- in practice one would end
up breaking it in two,

    current_index = current_row + current_column * delta
    counter[current_index] = counter[current_index] + current_value

which IS still substantially more cumbersome than the previous version
using += .  But this counterargument does not apply to uses of += on
bare names: the saving is strictly limited to the length of the bare
name, which should be reasonably small.

> thought better*.  Personally I expect that as soon as nonlocal
> rebinding is supported in any way, people would be hugely surprised if
> += and friends were not.

We could try an opinion poll, but it's probably worth it only if this
measure of "expected surprise" was the key point for your decision; if
you're going to prefer declarative statements anyway, there's no point
going through the aggravation.

> > If we do rely on augmented assignment being "rebinding":
> >
> >     def accumulator_saturating(n=0, m=100):
> >         def increment(i):
> >             if n > m:
> >                 n = i
> >             else:
> >                 n += i
> >             return n
> >         return increment
> >
> > the error becomes a runtime rather than compile-time one,
> > and does take a (small but non-zero) time to discover it.
>
> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
> declaration.  With 'nonlocal n, m' in increment() (or however it's
> spelled :-) the intent is clear.

I disagree that the example is "an argument for declarations": on the
contrary, it's an argument for := without "rebinding +=".  The
erroneous example just reposted gives a runtime error anyway (I don't
know why I wrote it would give a compile-time error -- just like a bare
"def f(): x+=1" doesn't give a compile-time error today, so,
presumably, wouldn't this reposted example).
If "n := n + i" WAS used in lieu of the augmented assignment, THEN --
and only then -- could we give the preferable compile-time error, for
forbidden mixing of "n = ..." and "n := ..." in different spots in the
same function.


Alex

From guido at python.org  Mon Oct 27 11:51:16 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 11:51:34 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:12:02 EST."
	<20031027161202.GF5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
Message-ID: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>

The only problem with using :: is a syntactic ambiguity:

    a[x::y]

already means something (an extended slice with start=x, no stop, and
step=y).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 27 11:52:53 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 11:53:18 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:02:21 EST."
	<20031027160221.GA29155@rogue.amk.ca>
References: <20031027160221.GA29155@rogue.amk.ca>
Message-ID: <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>

> Over in the Web SIG, it was noted that the HTML parser in htmllib has
> handlers for HTML 2.0 elements, and it should really support HTML 4.01, the
> current version.  I'm looking into doing this.
>
> We actually have two HTML parsers: htmllib.py and the more recent
> HTMLParser.py.  The initial check-in comment for 2001/05/18 for
> HTMLParser.py reads:
>
>     A much improved HTML parser -- a replacement for sgmllib.  The API is
>     derived from but not quite compatible with that of sgmllib, so it's a
>     new file.
>     I suppose it needs documentation, and htmllib needs to be
>     changed to use this instead of sgmllib, and sgmllib needs to be
>     declared obsolete.  But that can all be done later.
>
> sgmllib only handles those bits of SGML needed for HTML, and anyone doing
> serious SGML work is going to have to use a real SGML parser, so deprecating
> sgmllib is reasonable.  HTMLParser needs no changes for HTML 4.01; only
> htmllib needs to get a bunch more handler methods.
>
> Should I try to do this for 2.4?

I'm unclear on what you plan to do -- repeal sgmllib and rewrite
htmllib to use HTMLParser internally for a backwards compatible
interface?

> (I can't find an explanation of how the API differs between the two modules
> but can figure it out by inspecting the code, and will try to keep the
> htmllib module backward-compatible.)

That would be required for a few releases, yes.  I'm okay with
deprecating sgmllib faster than htmllib.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From neal at metaslash.com  Mon Oct 27 12:08:50 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Mon Oct 27 12:09:00 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
	<200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
Message-ID: <20031027170850.GG5842@epoch.metaslash.com>

On Mon, Oct 27, 2003 at 08:51:16AM -0800, Guido van Rossum wrote:
> The only problem with using :: is a syntactic ambiguity:
>
>     a[x::y]
>
> already means something (an extended slice with start=x, no stop, and
> step=y).

I'm not wedded to the :: digraph, I prefer the concept.  :: was nice
because it re-used a similar concept from C++.  No other digraph jumps
out at me.
Some other possibilities (I don't care for any of these):

    Global          Nested
    ------          ------
    :>variable      scope:>variable
    *>variable      scope*>variable
    ->variable      scope->variable
    ?>variable      scope?>variable
    &>variable      scope&>variable

Or perhaps variations using <.

Neal

From aleaxit at yahoo.com  Mon Oct 27 12:20:14 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 12:20:50 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031027170850.GG5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
	<20031027170850.GG5842@epoch.metaslash.com>
Message-ID: <200310271820.15001.aleaxit@yahoo.com>

On Monday 27 October 2003 06:08 pm, Neal Norwitz wrote:
> On Mon, Oct 27, 2003 at 08:51:16AM -0800, Guido van Rossum wrote:
> > The only problem with using :: is a syntactic ambiguity:
> >
> >     a[x::y]
> >
> > already means something (an extended slice with start=x, no stop, and
> > step=y).
>
> I'm not wedded to the :: digraph, I prefer the concept.  :: was nice
> because it re-used a similar concept from C++.  No other digraph jumps

Does it have to be a digraph?  We could use one of the ASCII chars
Python doesn't use.  For example, $ would give us exactly the same way
as Ruby to strop global variables (though, differently from Ruby, we'd
only _have_ to strop them on rebinding -- more-common "read" accesses
would stay clean) -- $variable meaning 'global'.  And scope$variable
meaning 'outer'.

OTOH, if we used @ instead, it would read better the other way 'round
-- variable@scope DOES look like a pretty natural way to indicate
"said variable at said scope" -- though it doesn't read quite as well
_without_ a scope.
Alex

From pedronis at bluewin.ch  Mon Oct 27 12:23:28 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Mon Oct 27 12:21:28 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
Message-ID: <5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>

At 07:11 27.10.2003 -0800, Guido van Rossum wrote:
>I'm *not* saying that I like := (so far I still like 'global x in f'
>better)

if I understand 'global x in f' will introduce a local x in f even if
there is none, for symmetry with global.  Maybe this has already been
answered (this thread is getting too long, and is this change scheduled
for 2.4 or 3.0?) but

x = 'global'

def f():
    def init():
        global x in f
        x = 'in f'
    init()
    print x

f()

will this print 'global' or 'in f' ?  I can argument both ways which is
not a good thing.

Thanks.

From skip at pobox.com  Mon Oct 27 12:23:40 2003
From: skip at pobox.com (Skip Montanaro)
Date: Mon Oct 27 12:23:50 2003
Subject: [Python-Dev] Let's table the discussion of replacing 'global'
Message-ID: <16285.21660.432100.124214@montanaro.dyndns.org>

[ on changing Python's global variable access mechanisms ]

I'm going to make a suggestion.  Let's shelve this topic for the time
being and simply summarize the issues in an informational PEP aimed at
Py3k.  We don't even know (at least I don't) if we want an implicit
search for outer scope variables or an explicit specification of which
scope such variables should be defined in.  If, for some reason, nested
scopes make a quick exit in Py3k, this would all be moot anyway.  It's
not clear nested scopes really offer anything to Python other than
muddled semantics and a more complex virtual machine implementation.
Skip

From guido at python.org  Mon Oct 27 12:28:33 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 12:28:41 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 18:23:28 +0100."
	<5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>
Message-ID: <200310271728.h9RHSXl27167@12-236-54-216.client.attbi.com>

> if I understand 'global x in f' will introduce a local x in f even if there
> is none, for symmetry with global.  Maybe this has already been answered
> (this thread is getting too long, and is this change scheduled for 2.4 or
> 3.0?) but
>
> x = 'global'
>
> def f():
>     def init():
>         global x in f
>         x = 'in f'
>     init()
>     print x
>
> f()
>
> will this print 'global' or 'in f' ?  I can argument both ways which is not
> a good thing.

The compiler does a full analysis so it will know that init() refers
to a cell for x in f's locals, and hence it will print 'in f'.  For
the purposes of deciding which variables live where, the presence of
'global x in f' inside an inner function (whether or not there's a
matching assignment) is equivalent to the presence of an assignment to
x in f's body.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 27 12:29:49 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 12:30:33 2003
Subject: [Python-Dev] Let's table the discussion of replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:23:40 CST."
	<16285.21660.432100.124214@montanaro.dyndns.org>
References: <16285.21660.432100.124214@montanaro.dyndns.org>
Message-ID: <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com>

> I'm going to make a suggestion.
Let's shelve this topic for the time being > and simply summarize the issues in an informational PEP aimed at > Py3k. Great idea. I'm getting tired of it too; Alex and I don't seem to be getting an inch closer to each other. > We don't even know (at least I don't) if we want an implicit search > for outer scope variables or an explicit specification of which > scope such variables should be defined in. If, for some reason, > nested scopes make a quick exit in Py3k, this would all be moot > anyway. Sorry to disappoint you, but nested scopes aren't going away. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 27 12:34:48 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 27 12:34:51 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: References: Message-ID: <20031027173448.GA17544@panix.com> On Mon, Oct 27, 2003, Batista, Facundo wrote: > > Some questions: > > - Is there some of this work (specially the third item) already done or > started? > - Should I submit partial work or everything as a whole? > - Modifications to the sandbox modules, are considered patches? Should I > send them through SourceForge interface? The first thing you should do is talk with Eric Price (eprice@tjhsst.edu), author of the code. You don't need to use SF for now; CVS should be fine, but you should find out whether Eric would like to approve changes first. There's no reason you can't start with a pre-PEP now; I'd focus on interface (i.e. the question of what ``Decimal(5)/3`` and ``5/Decimal(3)`` should do -- my personal take at this point is that both ought to fail). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan

From walter at livinglogic.de  Mon Oct 27 12:42:40 2003
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon Oct 27 12:43:03 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031027161202.GF5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
Message-ID: <3F9D5910.9050001@livinglogic.de>

Neal Norwitz wrote:
> On Mon, Oct 27, 2003 at 07:11:16AM -0800, Guido van Rossum wrote:
>
>> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
>> declaration.  With 'nonlocal n, m' in increment() (or however it's
>> spelled :-) the intent is clear.
>
> I dislike := very much.  I think it will confuse newbies and thus be
> abused.  While I dislike the global declaration, I don't feel strongly
> about changing or removing it.

I think ':=' is too close to '='.  The default assignment should be
much easier to type than the special case.  Otherwise I'd have to think
about which one I'd like to use every time I type an assignment.

Bye,
   Walter Dörwald

From guido at python.org  Mon Oct 27 13:00:09 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 13:03:17 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 06:24:57 EST."
	<001201c39c7c$f4582140$81b0958d@oemcomputer>
References: <001201c39c7c$f4582140$81b0958d@oemcomputer>
Message-ID: <200310271800.h9RI09E27291@12-236-54-216.client.attbi.com>

> As a proof-of-concept, here is GvR's code re-cast with the queue changed
> to a double stack implementation.  The interface is completely
> unchanged.  The memory consumed is double that of the current tee() but
> much less than the linked list version.  The speed is half that of the
The speed is half that of the > current tee() and roughly comparable to or slightly better than the > linked list version. Actually, if I up the range() in the gen() function to range(10000) and drop the print statement, the Python version of your code runs about 20% slower than mine. But this says nothing about the relative speed of C implementations. --Guido van Rossum (home page: http://www.python.org/~guido/) From just at letterror.com Mon Oct 27 13:00:55 2003 From: just at letterror.com (Just van Rossum) Date: Mon Oct 27 13:03:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum wrote: > The only problem with using :: is a syntactic ambiguity: > > a[x::y] > > already means something (an extended slice with start=x, no stop, and > step=y). On the other hand: a[x y] doesn't mean anything, so I don't see an immediate problem. I like Neal's proposal, including the "::" digraph. Just From aleaxit at yahoo.com Mon Oct 27 13:40:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 13:41:43 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310271940.31530.aleaxit@yahoo.com> On Monday 27 October 2003 07:00 pm, Just van Rossum wrote: > Guido van Rossum wrote: > > The only problem with using :: is a syntactic ambiguity: > > > > a[x::y] > > > > already means something (an extended slice with start=x, no stop, and > > step=y). > > On the other hand: > > a[x y] > > doesn't mean anything, so I don't see an immediate problem. Sorry, just, but I really don't understand the "don't see immediate problem". As I understand the proposal: y = 23 biglist = range(999) def f(): y = 45 # sets a local ::y = 67 # sets the global print biglist[::y] should this print the 67-th item of biglist, or the first 45 ones? 
a[x::y] is similarly made ambiguous (slice from x step y, or index at y in scope x?), at least for human readers if not for the compiler -- to have the same expression mean either thing depending on whether x names an outer function, a local variable, or neither, or both, for example, would seem very confusing to me. > I like Neal's proposal, including the "::" digraph. I just don't see how :: can be used nonconfusingly due to the 'clash' with "slicing with explicit step and without explicit stop" (ambiguity with slices with implicit 0 start for prefix use, a la ::y -- ambiguity with slices with explicit start for infix use, a la x::y). A digraph, single character, or other operator that could be used (and look nice) in lieu of :: either prefix or infix -- aka "stropping by any other name", even though the syntax sugar may look different from Ruby's use of prefix $ to strop globals -- would be fine. But I don't think :: can be it. Alex From FBatista at uniFON.com.ar Mon Oct 27 13:46:54 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Oct 27 13:48:03 2003 Subject: [Python-Dev] Decimal.py in sandbox Message-ID: Aahz wrote: #- The first thing you should do is talk with Eric Price #- (eprice@tjhsst.edu), author of the code. You don't need to #- use SF for #- now; CVS should be fine, but you should find out whether #- Eric would like #- to approve changes first. OK, I'll mail him. #- There's no reason you can't start with a pre-PEP now; I'd focus on #- interface (i.e. the question of what ``Decimal(5)/3`` and #- ``5/Decimal(3)`` should do -- my personal take at this point is that #- both ought to fail). Well, there's wide discussion about this when I posted the pre-PEP of Money. 
The reasoning of the majority is that when two operands are of
different type, the less general must be converted to the more general
one:

>>> myint = 5
>>> myfloat = 3.0
>>> mywhat = myint + myfloat
>>> type(mywhat)
<type 'float'>

With this in mind, the behaviour would be:

>>> myDecimal = Decimal(5)
>>> myfloat = 3.0
>>> mywhat = myDecimal + myfloat
>>> isinstance(mywhat, float)
True

and

>>> myDecimal = Decimal(5)
>>> myint = 3
>>> mywhat = myint + myDecimal
>>> isinstance(mywhat, Decimal)
True

but I really don't know if the first behaviour should be extended to
the latter two.  Anyway, I'll post the pre-PEP and we all should see, :)

Thanks.

.    Facundo

From amk at amk.ca  Mon Oct 27 13:54:52 2003
From: amk at amk.ca (amk@amk.ca)
Date: Mon Oct 27 13:55:05 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>
References: <20031027160221.GA29155@rogue.amk.ca>
	<200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>
Message-ID: <20031027185452.GA29897@rogue.amk.ca>

On Mon, Oct 27, 2003 at 08:52:53AM -0800, Guido van Rossum wrote:
> I'm unclear on what you plan to do -- repeal sgmllib and rewrite
> htmllib to use HTMLParser internally for a backwards compatible
> interface?

Correct; that's what your initial checkin message for HTMLParser.py
suggests doing, and if I'm touching htmllib.py to add the HTML 4.01
stuff, I may as well make the other change, too.

> I'm okay with deprecating sgmllib faster than htmllib.

sgmllib gets deprecated; htmllib never gets deprecated.  HTMLParser is
a barebones HTML parser that provides no default handlers (handle_head,
handle_title, etc.), and htmllib extends it, adding default handlers
for the various things in HTML 4.01.

--amk

From guido at python.org  Mon Oct 27 14:08:48 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 14:09:00 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Mon, 27 Oct 2003 13:54:52 EST."
<20031027185452.GA29897@rogue.amk.ca> References: <20031027160221.GA29155@rogue.amk.ca> <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com> <20031027185452.GA29897@rogue.amk.ca> Message-ID: <200310271908.h9RJ8mX27413@12-236-54-216.client.attbi.com> > On Mon, Oct 27, 2003 at 08:52:53AM -0800, Guido van Rossum wrote: > > I'm unclear on what you plan to do -- repeal sgmllib an rewrite > > htmllib to use HTMLParser internally for a backwards compatible > > interface? > > Correct; that's what your initial checkin message for HTMLParser.py suggests > doing, and if I'm touching htmllib.py to add the HTML 4.01 stuff, I may as > well make the other change, too. > > > I'm okay with deprecating sgmllib faster than htmllib. > > sgmllib gets deprecated; htmllib never gets deprecated. HTMLParser is a > barebones HTML parser that provides no default handlers (handle_head, > handle_title, etc.), and htmllib extends it, adding default handlers for the > various things in HTML 4.01. OK, got it. Sounds good to me! --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Mon Oct 27 14:26:20 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 27 14:26:25 2003 Subject: [Python-Dev] tests expecting but not finding errors due to bug fixes In-Reply-To: <200310251352.13266.aleaxit@yahoo.com> (Alex Martelli's message of "Sat, 25 Oct 2003 13:52:13 +0200") References: <200310251352.13266.aleaxit@yahoo.com> Message-ID: <2mr80ylbxf.fsf@starship.python.net> Alex Martelli writes: > Switching to the 2.3 maintenance branch (where test_bsdddb runs just fine), > I got "make test" failures on test_re.py. Turns out that the 2.3-branch > test_re.py was apparently not updated when the RE recursion bug was > fixed -- it still expects a couple of exceptions to be raised and they don't > get raised any more because the bugfix itself WAS backported. 
> > On general principles, in cases of this ilk, IS it all right to just backport > the corrected unit-test (from the 2.4 to the 2.3 branch) and commit the > fix, or should one be more circumspect about it...? I'd say go for it. It sounds like just a partially missed backport (and someone checking things in without running make test, tsk). Cheers, mwh -- Roll on a game of competetive offence-taking. -- Dan Sheppard, ucam.chat From just at letterror.com Mon Oct 27 14:28:24 2003 From: just at letterror.com (Just van Rossum) Date: Mon Oct 27 14:28:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271940.31530.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > Sorry, just, but I really don't understand the "don't see immediate > problem". [ ... ] > print biglist[::y] Well, that's the part I didn't see yet, so there :) Just From tjreedy at udel.edu Mon Oct 27 14:47:41 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 27 14:47:07 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() References: <20031026195515.GA30335@cthulhu.gerg.ca><200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> <3F9C3FB0.8050206@v.loewis.de> Message-ID: "Martin v. Löwis" wrote in message news:3F9C3FB0.8050206@v.loewis.de... > Guido van Rossum wrote: > > Luckily I wasn't taught formal writing :-), and I don't see why it > > can't be doesn't. I'd say that if you want Python's error messages to > > be formal writing, you'd have to change a lot more than just the > > one... :-) > > OTOH, I would always yield to native speakers in such issues. To me > myself, it does not matter much, but if native speakers feel happier > one way or the other, I'd like to help them feel happy :-) To add a native-speaker datapoint: I am old enough to remember being taught the same as Greg. (However, American stylistic conventions have tended to get looser since then.) 
I also remember going through manuscripts to get rid of contractions prior to submission for publication. Given the overloading of apostrophe both in English and Python, I think 'does not' looks slightly better than "doesn't" (which saves only one character and forces a change in quote marks!). So does consistency versus accidental variation ;-) Terry J. Reedy From guido at python.org Mon Oct 27 14:54:02 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 14:54:14 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: Your message of "Mon, 27 Oct 2003 14:47:41 EST." References: <20031026195515.GA30335@cthulhu.gerg.ca><200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> <3F9C3FB0.8050206@v.loewis.de> Message-ID: <200310271954.h9RJs2i27571@12-236-54-216.client.attbi.com> > Given the overloading of apostrophe both in English and Python, I > think 'does not' looks slightly better than "doesn't" (which saves > only one character and forces a change in quote marks!). So does > consistency versus accidental variation ;-) So who's going to change all the other occurrences of "doesn't" and other contractions in error messages? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Oct 27 14:59:00 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 27 14:59:11 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <1067283859.8566.633.camel@localhost.localdomain> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> <1067283859.8566.633.camel@localhost.localdomain> Message-ID: <16285.30980.869021.894625@montanaro.dyndns.org> Jeremy> I haven't had time to participate in this thread -- too much Jeremy> real work for the last several days -- but I'd be happy to write Jeremy> a PEP that summarizes the issues. Thank you. 
I was trying to figure out where I was going to find the time. Feel free to ask me for inputs or an outline (or if you continue in your too busy ways I'll try to whip something up). Skip From tim.one at comcast.net Mon Oct 27 15:00:26 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 27 15:00:31 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: <200310271954.h9RJs2i27571@12-236-54-216.client.attbi.com> Message-ID: [Guido] > So who's going to change all the other occurrences of "doesn't" and > other contractions in error messages? I hope nobody -- it's about as silly a crusade as trying to find a way to make "$" mean "non-local" <0.5 wink>. and-that-wouldn't-read-better-as-"it-is-about-as-silly"-ly y'rs - tim From jeremy at alum.mit.edu Mon Oct 27 14:44:21 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Oct 27 15:33:09 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> Message-ID: <1067283859.8566.633.camel@localhost.localdomain> On Mon, 2003-10-27 at 12:29, Guido van Rossum wrote: > > I'm going to make a suggestion. Let's shelve this topic for the time being > > and simply summarize the issues in an informational PEP aimed at > > Py3k. > > Great idea. I'm getting tired of it too; Alex and I don't seem to be > getting an inch closer to each other. > > > We don't even know (at least I don't) if we want an implicit search > > for outer scope variables or an explicit specification of which > > scope such variables should be defined in. If, for some reason, > > nested scopes make a quick exit in Py3k, this would all be moot > > anyway. > > Sorry to disappoint you, but nested scopes aren't going away. 
I haven't had time to participate in this thread -- too much real work
for the last several days -- but I'd be happy to write a PEP that
summarizes the issues.

Jeremy

From python at rcn.com  Mon Oct 27 16:13:26 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 16:14:25 2003
Subject: [Python-Dev] PEP 289: Generator Expressions
In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com>
Message-ID: <004701c39ccf$29ef0740$81b0958d@oemcomputer>

[GvR]
> Raymond, please take this to c.l.py for feedback!  Wear asbestos. :-)
>
> I'm sure there will be plenty of misunderstandings in the discussion
> there.  If these are due to lack of detail or clarity in the PEP, feel
> free to update the PEP.  If there are questions that need us to go
> back to the drawing board or requiring BDFL pronouncement, take it
> back to python-dev.

The asbestos wasn't needed :-)  Overall the PEP is being well received.
The discussion has been uncontentious and light (around 50-55 posts).

Several people initially thought that lambda should be part of the
syntax, but other respondents quickly laid that to rest.  Many posters
were succinctly positive: "+1" or "great idea".

One skeptical response came from someone who didn't like list
comprehensions either.  Alex quickly pointed out that they have been
"wildly successful" for advanced users and newbies alike.  One poster
counter-suggested a weird regex style syntax for embedding Perl
expressions.  The newsgroup was very kind and no one called him wacko :-)

There was occasional discussion about the parentheses requirement but
that was quickly settled also.  One idea that had some merit was to not
require the outer parentheses for a single expression on the rhs of an
assignment:

    g = (x**2 for x in range(10))   # maybe the outer parens are not needed

The discussion is winding down and there are no unresolved questions.

Raymond Hettinger

From pje at telecommunity.com  Mon Oct 27 16:20:26 2003
From: pje at telecommunity.com (Phillip J.
Eby)
Date: Mon Oct 27 16:21:48 2003
Subject: [Python-Dev] PEP 289: Generator Expressions
In-Reply-To: <004701c39ccf$29ef0740$81b0958d@oemcomputer>
References: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com>
Message-ID: <5.1.1.6.0.20031027161728.01f6c680@telecommunity.com>

At 04:13 PM 10/27/03 -0500, Raymond Hettinger wrote:
>There was occasional discussion about the parentheses requirement but
>that was quickly settled also.  One idea that had some merit was to not
>require the outer parentheses for a single expression on the rhs of an
>assignment:
>
>    g = (x**2 for x in range(10))   # maybe the outer parens are not
>needed

FWIW, I think the parentheses add clarity over e.g.

    g = x**2 for x in range(10)

As this latter formulation looks to me like g will equal 81 after the
statement is executed.

From sholden at holdenweb.com  Mon Oct 27 16:25:56 2003
From: sholden at holdenweb.com (Steve Holden)
Date: Mon Oct 27 16:31:01 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <87k76rhnn6.fsf@egil.codesourcery.com>
Message-ID: 

> -----Original Message-----
> From: python-dev-bounces+sholden=holdenweb.com@python.org
> [mailto:python-dev-bounces+sholden=holdenweb.com@python.org]On Behalf Of
> Zack Weinberg
> Sent: Sunday, October 26, 2003 1:15 PM
> To: python-dev
> Subject: [Python-Dev] Alternate notation for global variable
> assignments
>
>
> I like Just's := concept except for the similarity to =, and I worry
> that the presence of := in the language will flip people into "Pascal
> mode" -- thinking that = is the equality operator.  I also think that
> the notation is somewhat unnatural -- "globalness" is a property of
> the _variable_, not the operator.  So I'd like to suggest instead
>
>     :var = value          # var in module scope
>     :scope:var = value    # var in named enclosing scope
>
> An advantage of this notation is that it can be used anywhere, not
> just in an assignment.
This has primary value for people reading the > code -- if you have a fairly large class method that uses a module > variable (not by assigning it) somewhere in the middle, writing it > :var means the reader knows to go look for the assignment way up top. > This should obviously be optional, to preserve backward compatibility. > However, its use in such expressions as sublist = lst[:var] would lead to substantial ambiguities, right? regards -- Steve Holden +1 703 278 8281 http://www.holdenweb.com/ Improve the Internet http://vancouver-webpages.com/CacheNow/ Python Web Programming http://pydish.holdenweb.com/pwp/ Interview with GvR August 14, 2003 http://www.onlamp.com/python/ From greg at electricrain.com Mon Oct 27 16:56:48 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Mon Oct 27 16:56:55 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <200310271125.16879.aleaxit@yahoo.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310270930.28811.aleaxit@yahoo.com> <20031027094045.GL3929@zot.electricrain.com> <200310271125.16879.aleaxit@yahoo.com> Message-ID: <20031027215648.GM3929@zot.electricrain.com> On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote: > I still don't quite see how the lock ends up being "held", but, don't mind > me -- the intricacy of mixins and wrappings and generators and delegations > in those modules is making my head spin anyway, so it's definitely not > surprising that I can't quite see what's going on. BerkeleyDB internally always grabs a read lock (i believe at the page level; i don't think BerkeleyDB does record locking) for any database read when opened with DB_THREAD | DB_INIT_LOCK flags. I believe the problem is that a DBCursor object holds this lock as long as it is open/exists. Other reads can go on happily, but writes must wait for the read lock to be released before they can proceed.
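[A toy modern-Python model of the behaviour Greg describes -- not bsddb or BerkeleyDB itself, just a sketch in which a "cursor" holds a page lock for as long as it is open and a writer must wait for it; all names here are invented for illustration:]

```python
import threading

# Toy stand-in for the page-level read lock a DBCursor keeps while open.
# (Illustrative only -- the real locking lives inside BerkeleyDB.)
page_lock = threading.Lock()

def open_cursor():
    page_lock.acquire()        # the cursor grabs the lock on open...

def close_cursor():
    page_lock.release()        # ...and releases it only when closed

def try_write(timeout=0.1):
    # A writer must wait for the lock; time out instead of deadlocking.
    got = page_lock.acquire(timeout=timeout)
    if got:
        page_lock.release()
    return got

open_cursor()
blocked = not try_write()      # True: the open cursor blocks the write
close_cursor()
ok = try_write()               # True: the write proceeds once the cursor is closed
```

The same single-threaded self-deadlock discussed below for popitem falls out of this model: if the "writer" simply blocked instead of timing out, the thread holding its own cursor would wait forever.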
> > How do python dictionaries deal with modifications to the dictionary > > intermixed with iteration? > > In general, Python doesn't deal well with modifications to any > iterable in the course of a loop using an iterator on that iterable. > > The one kind of "modification during the loop" that does work is: > > for k in somedict: > somedict[k] = ...whatever... > > i.e. one can change the values corresponding to keys, but not > change the set of keys in any way -- any changes to the set of > keys can cause unending loops or other such misbehavior (not > deadlocks nor crashes, though...). > > However, on a real Python dict, > k, v = thedict.iteritems().next() > doesn't constitute "a loop" -- the iterator object returned by > the iteritems call is dropped since there are no outstanding > references to it right after this statement. So, following up > with > del thedict[k] > is quite all right -- the dictionary isn't being "looped on" at > that time. What about the behaviour of multiple iterators for the same dict being used at once (either interleaved or by multiple threads; it shouldn't matter)? I expect that works fine in python. This is something the _DBWithCursor iteration interface does not currently support due to its use of a single DBCursor internally. _DBWithCursor is currently written such that the cursor is never closed once created. This leaves tons of potential for deadlock even in single threaded apps. Reworking _DBWithCursor into a _DBThatUsesCursorsSafely such that each iterator creates its own cursor in an internal pool and other non cursor methods that would write to the db destroy all cursors after saving their current() position so that the iterators can reopen+reposition them is a solution. > Given that in bsddb's case that iteritems() first [and only] > next() boils down to a self.first() which in turn does a > self.dbc.first() I _still_ don't see exactly what's holding the > lock. 
lock.
But the simplest fix would appear to be in __delitem__, > i.e., if we have a cursor we should delete through it: > > def __delitem__(self, key): > self._checkOpen() > if self.dbc is not None: > self.dbc.set(key) > self.dbc.delete() > else: > del self.db[key] > > ...but this doesn't in fact remove the deadlock on the > unit-test for popitem, which just confirms I don't really > grasp what's going on, yet!-) hmm. i would've expected your __delitem__ to work. Regardless, using the debugger I can stop the deadlock from occurring if i do "self.dbc.close(); self.dbc = None" just before popitem's "del self[k]" Greg From barry at python.org Mon Oct 27 17:07:16 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 17:07:22 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <20031027215648.GM3929@zot.electricrain.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310270930.28811.aleaxit@yahoo.com> <20031027094045.GL3929@zot.electricrain.com> <200310271125.16879.aleaxit@yahoo.com> <20031027215648.GM3929@zot.electricrain.com> Message-ID: <1067292435.1785.91.camel@anthem> On Mon, 2003-10-27 at 16:56, Gregory P. Smith wrote: > BerkeleyDB internally always grabs a read lock (i believe at the page > level; i don't think BerkeleyDB does record locking) Correct, at least for btree tables. -Barry From python at rcn.com Mon Oct 27 17:45:09 2003 From: python at rcn.com (Raymond Hettinger) Date: Mon Oct 27 17:46:07 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Message-ID: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> Excellent PEP! Consider adding your bookmarking example. I found it to be a compelling use case. Also note that there are many variations of the bookmarking theme (undo utilities, macro recording, parser lookahead functions, backtracking, etc). 
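[The bookmarking/lookahead variations Raymond lists can be sketched today with itertools.tee, which buffers and replays a stream -- this is only an approximation of PEP 323's proposed __copy__ protocol, not the protocol itself:]

```python
from itertools import tee

def words():
    # Any one-shot iterator: a generator of words.
    for w in ("copy", "an", "iterator"):
        yield w

it, bookmark = tee(words())   # "bookmark" remembers this position
first = next(it)              # advance the working iterator...
rest = list(it)               # ...and exhaust it
replay = list(bookmark)       # ...then replay everything from the bookmark
```

Note that tee exhibits exactly the "heavy copy" performance trap mentioned above: the replay buffer grows with the distance between the two iterators, so past some point a plain list() is cheaper.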
Under drawbacks and issues there are a couple of thoughts: * Not all iterators will be copyable. Knowing which is which creates a bit of a usability issue (i.e. the question of whether a particular iterator is copyable will come up every time) and a substitution issue (i.e. code which depends on copyability precludes substitution of other iterators that don't have copyability). * In addition to knowing whether a given iterator is copyable, a user should also know whether the copy is lightweight (just an index or some such) or heavy (storing all of the data for future use). They should know whether it is active (intercepting every call to iter()) or inert. * For heavy copies, there is a performance trap when the stored data stream gets too long. At some point, just using list() would be better. Consider adding a section with pure Python sample implementations for listiter.__copy__, dictiter.__copy__, etc. Also, I have a question about the semantic specification of what a copy is supposed to do. Does it guarantee that the same data stream will be reproduced? For instance, would a generator of random words expect its copy to generate the same word sequence? Or, would a copy of a dictionary iterator change its output if the underlying dictionary got updated (i.e. should the dict be frozen to changes when a copy exists or should it mutate). Raymond Hettinger From zack at codesourcery.com Mon Oct 27 17:55:04 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Mon Oct 27 17:59:39 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: (Steve Holden's message of "Mon, 27 Oct 2003 16:25:56 -0500") References: Message-ID: <87llr69tpz.fsf@codesourcery.com> "Steve Holden" writes: >> >> :var = value # var in module scope >> :scope:var = value # var in named enclosing scope >> >> An advantage of this notation is that it can be used anywhere, not >> just in an assignment.
This has primary value for people reading the >> code -- if you have a fairly large class method that uses a module >> variable (not by assigning it) somewhere in the middle, writing it >> :var means the reader knows to go look for the assignment way up top. >> This should obviously be optional, to preserve backward compatibility. >> > However, its use in such expressions as > > sublist = lst[:var] > > would lead to substantial ambiguities, right? I suppose it would. Unfortunately, there's no other punctuation mark that can really be used for the purpose -- I think both $ and @ (suggested elsewhere in response to a similar proposal) have too many countervailing connotations. Witness e.g. the suggestion last week that $ become magic in string % dict notation. Py-in-the-sky suggestion: make the slice separator character be ; instead of :. (Half serious.) Somewhat warty suggestion: take lst[:var] to be a slice, but lst[(:var)] to be a global variable reference. And lst[:(:var)] to be a slice on a global, etc. etc. Better ideas solicited. zw From aleaxit at yahoo.com Mon Oct 27 18:07:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:07:40 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <87llr69tpz.fsf@codesourcery.com> References: <87llr69tpz.fsf@codesourcery.com> Message-ID: <200310280007.33899.aleaxit@yahoo.com> On Monday 27 October 2003 11:55 pm, Zack Weinberg wrote: ... > Somewhat warty suggestion: take lst[:var] to be a slice, but > lst[(:var)] to be a global variable reference. And lst[:(:var)] to be > a slice on a global, etc. etc. That would work -- and with the :: (rather than single :) stropping which Guido seems to prefer, too. 
As long as ::name or scope::name are always (parenthesized) when not doing so would be ambiguous (same general rules as, say, for tuples), which in their case would seem to be "within brackets only", I think :: stropping would work fine -- and perhaps avoid some possible single-: ambiguity in dictionary display such as d = { a:b:c } which would require further parenthesization -- with :: stropping, d = { a::b:c } and d = { a:b::c } are unambiguous, although parentheses would no doubt be advisable anyway to help human readers. Alex From guido at python.org Mon Oct 27 18:10:05 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 18:11:11 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Mon, 27 Oct 2003 16:13:26 EST." <004701c39ccf$29ef0740$81b0958d@oemcomputer> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> Message-ID: <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> > Overall the pep is being well received. The discussion has been > uncontentious and light (around 50-55 posts). Great! > There was occasional discussion about the parentheses requirement but > that was quickly settled also. One idea that had some merit was to not > require the outer parentheses for a single expression on the rhs of an > assignment: > > g = (x**2 for x in range(10)) # maybe the outer parens are not needed I really think they should be required. The 'for' keyword feels like it has a lower "priority" than the assignment operator. 
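[For the record, the rule that ended up in PEP 289 matches Guido's preference here: a generator expression must be parenthesized on the right-hand side of an assignment, though the parentheses may be dropped when it is the sole argument of a call -- a quick illustration:]

```python
g = (x**2 for x in range(10))       # parentheses required in an assignment
squares = list(g)

# As the sole argument to a call, the extra parentheses may be omitted:
total = sum(x**2 for x in range(10))
```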
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Mon Oct 27 18:13:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:13:43 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> Message-ID: <200310280013.36524.aleaxit@yahoo.com> On Tuesday 28 October 2003 12:10 am, Guido van Rossum wrote: > > Overall the pep is being well received. The discussion has been > > uncontentious and light (around 50-55 posts). > > Great! > > > There was occasional discussion about the parentheses requirement but > > that was quickly settled also. One idea that had some merit was to not > > require the outer parentheses for a single expression on the rhs of an > > assignment: > > > > g = (x**2 for x in range(10)) # maybe the outer parens are not > > needed > > I really think they should be required. The 'for' keyword feels like > it has a lower "priority" than the assignment operator. I entirely agree with Guido: the assignment looks _much_ better to me WITH the parentheses around the RHS. Alex From anthony at interlink.com.au Mon Oct 27 18:16:09 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon Oct 27 18:19:28 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310280007.33899.aleaxit@yahoo.com> Message-ID: <200310272316.h9RNG9FB011873@localhost.localdomain> >>> Alex Martelli wrote > On Monday 27 October 2003 11:55 pm, Zack Weinberg wrote: > ... > > Somewhat warty suggestion: take lst[:var] to be a slice, but > > lst[(:var)] to be a global variable reference. And lst[:(:var)] to be > > a slice on a global, etc. etc. > > That would work -- and with the :: (rather than single :) stropping > which Guido seems to prefer, too. 
As long as ::name or > scope::name are always (parenthesized) when not doing so > would be ambiguous (same general rules as, say, for tuples), > which in their case would seem to be "within brackets only", > I think :: stropping would work fine -- and perhaps avoid some > possible single-: ambiguity in dictionary display such as Can I just say, as someone who's only been lightly following this thread, that the above :(: type stuff a) looks incredibly ugly b) gives absolutely no clue as to what it might mean c) looks incredibly ugly. There's already prior usage of the : in python for dictionaries, for slices, but nothing at all like this. I'd really hope we don't end up with something this awful looking in the stdlib. Speaking purely for myself, of course (On the other hand, making the operator :( might be a subtle way of pre-deprecating it) Anthony -- Anthony Baxter It's never too late to have a happy childhood. From aleaxit at yahoo.com Mon Oct 27 18:19:29 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:19:39 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <004701c39ccf$29ef0740$81b0958d@oemcomputer> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> Message-ID: <200310280019.29859.aleaxit@yahoo.com> On Monday 27 October 2003 10:13 pm, Raymond Hettinger wrote: ... > Several people initially thought that lambda should be part of the yield was repeatedly mentioned, and I don't recall lambda being, so maybe this is a typo. > syntax, but other respondants quickly laid that to rest. Yes, consensus clearly converged on the proposed syntax (the mention of "generators" in the construct's name was the part that I think prompted the desire for 'yield' -- had they been called "iterator expressions" I suspect nobody would have missed 'yield' even transiently:-). > One poster counter-suggested a weird regex style syntax for embedding > Perl expressions. 
The newsgroup was very kind and no one called him > wacko :-) ...though I did say "if you want Perl, you know where to find it"...:-) > The discussion is winding down and there are no unresolved questions. Yes, fair summary. The one persistent (but low-as-a-whisper) grumbling is by one A.M., who keeps mumbling "they're _iterator_ expressions, the fact that they use generators is an implementation detail, grmbl grmbl":-). But then, he IS one of those pesky must-always-have-SOME-whine types. Alex From barry at python.org Mon Oct 27 18:39:03 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 18:39:08 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310272316.h9RNG9FB011873@localhost.localdomain> References: <200310272316.h9RNG9FB011873@localhost.localdomain> Message-ID: <1067297942.1066.24.camel@anthem> On Mon, 2003-10-27 at 18:16, Anthony Baxter wrote: > Can I just say, as someone who's only been lightly following this > thread, Me too. > that the above :(: type stuff > > a) looks incredibly ugly > b) gives absolutely no clue as to what it might mean > c) looks incredibly ugly. > > There's already prior usage of the : in python for dictionaries, for > slices, but nothing at all like this. I'd really hope we don't end up > with something this awful looking in the stdlib. It's not just you. 
-Barry From nas-python at python.ca Mon Oct 27 18:45:10 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Mon Oct 27 18:43:48 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310280019.29859.aleaxit@yahoo.com> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310280019.29859.aleaxit@yahoo.com> Message-ID: <20031027234510.GA22587@mems-exchange.org> On Tue, Oct 28, 2003 at 12:19:29AM +0100, Alex Martelli wrote: > The one persistent (but low-as-a-whisper) grumbling is by one > A.M., who keeps mumbling "they're _iterator_ expressions, the fact > that they use generators is an implementation detail, grmbl > grmbl":-). I'm inclined to agree with him. Was there some reason why the term iterator expressions was rejected? Neil From tdelaney at avaya.com Mon Oct 27 19:00:43 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 27 19:00:50 2003 Subject: [Python-Dev] Alternate notation for global variable assignments Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> > From: Zack Weinberg [mailto:zack@codesourcery.com] > >> > > However, its use in such expressions as > > > > sublist = lst[:var] > > > > would lead to substantial ambiguities, right? > > I suppose it would. Unfortunately, there's no other punctuation mark > that can really be used for the purpose -- I think both $ and @ > (suggested elsewhere in response to a similar proposal) have > too many countervailing connotations. Witness e.g. the suggestion > last week that $ become magic in string % dict notation. First of all, I'm strongly *against* the idea of :var. However, I think a syntax that would work with no ambiguities, and not look too bad, would be: .var e.g. sublist = lst[.var] I would also be strongly against this suggestion - it simply deals with the problems I see with the current suggestion. It has its own problems, including (but not limited to) not being very obvious. 
Tim Delaney From greg at cosc.canterbury.ac.nz Mon Oct 27 19:04:24 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 19:04:42 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310270906.44209.aleaxit@yahoo.com> Message-ID: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> Alex Martelli : > Nobody's asking for 3.0*x to work where x is a user-coded type > without an __rmul__; rather, the point is that 3*x should fail too, > and ideally they'd have the same clear error message as 3+x > gives when the type has no __radd__. Okay, that makes sense. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry at python.org Mon Oct 27 19:11:52 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 19:12:01 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> Message-ID: <1067299912.1066.35.camel@anthem> On Mon, 2003-10-27 at 19:00, Delaney, Timothy C (Timothy) wrote: > First of all, I'm strongly *against* the idea of :var. > > However, I think a syntax that would work with no ambiguities, and not look too bad, would be: > > .var > > e.g. > > sublist = lst[.var] > > I would also be strongly against this suggestion - it simply deals with the problems I see with the current suggestion. It has its own problems, including (but not limited to) not being very obvious. What I really want is access to a namespace, and then all the normal Python attribute access notations just work. They're one honking great idea after all. This was behind the "import __me__" suggestion for access to module globals. 
Why can't we have something similar for nested functions? -Barry From guido at python.org Mon Oct 27 19:23:52 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:24:08 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: Your message of "Tue, 28 Oct 2003 11:00:43 +1100." <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> Message-ID: <200310280023.h9S0Nqk27854@12-236-54-216.client.attbi.com> > However, I think a syntax that would work with no ambiguities, and > not look too bad, would be: > > .var > > e.g. > > sublist = lst[.var] No; I want to reserve .var for the "with" statement (a la VB). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 27 19:50:30 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:51:51 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Mon, 27 Oct 2003 15:45:10 PST." <20031027234510.GA22587@mems-exchange.org> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310280019.29859.aleaxit@yahoo.com> <20031027234510.GA22587@mems-exchange.org> Message-ID: <200310280050.h9S0oUR27881@12-236-54-216.client.attbi.com> > > The one persistent (but low-as-a-whisper) grumbling is by one > > A.M., who keeps mumbling "they're _iterator_ expressions, the fact > > that they use generators is an implementation detail, grmbl > > grmbl":-). > > I'm inclined to agree with him. Was there some reason why the term > iterator expressions was rejected? After seeing "iterator expressions" I came up with "generator expressions" and decided I liked that better.
Around the same time Tim Peters wrote a post where he proposed "generator expressions" independently: http://mail.python.org/pipermail/python-dev/2003-October/039186.html Trying to rationalize my own gut preference, I think I like "generator expressions" better than "iterator expressions" because there are so many other expressions that yield iterators (e.g. iter(x) comes to mind :-). Just like generator functions are one specific cool way of creating an iterator, generator expressions are another specific cool way, and as a bonus, they're related in terms of implementation (and that certainly reflects on corners of the semantics, so I don't think we should try to hide this as an implementation detail). --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Mon Oct 27 19:53:32 2003 From: janssen at parc.com (Bill Janssen) Date: Mon Oct 27 19:53:57 2003 Subject: [Python-Dev] htmllib vs. HTMLParser In-Reply-To: Your message of "Mon, 27 Oct 2003 08:02:21 PST." <20031027160221.GA29155@rogue.amk.ca> Message-ID: <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> Glad to see you volunteering! But IMO simply adding some handler methods won't really do it. You also need to introduce some knowledge about the semantics of the syntax. For example, a new "block"-level element should close all "in-line" elements that are currently open. Etc. It would also be handy to have a version of the parser that takes an HTML page and returns a parse tree, rather than the halfway solution we currently have, forcing the user to design and write a lot of code to get anything done. Bill From guido at python.org Mon Oct 27 19:54:21 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:55:04 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Your message of "Mon, 27 Oct 2003 17:45:09 EST." 
<005a01c39cdb$fa18b540$81b0958d@oemcomputer> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> Message-ID: <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> > Also, I have a question about the semantic specification of what a copy > is supposed to do. Does it guarantee that the same data stream will be > reproduced? For instance, would a generator of random words expect its > copy to generate the same word sequence. Or, would a copy of a > dictionary iterator change its output if the underlying dictionary got > updated (i.e. should the dict be frozen to changes when a copy exists or > should it mutate). Every attempt should be made for the two copies to return exactly the same stream of values. This is the pure tee() semantics. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Mon Oct 27 20:48:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 20:48:33 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271033.56569.aleaxit@yahoo.com> Message-ID: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> Alex Martelli : > My slight preference for leaving += and friends alone is that > a function using them to rebind nonlocals would be hard to > read Using my "outer" suggestion, augmented assignments to nonlocals would be written outer x += 1 which would make the intention pretty clear, I think. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:02:10 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:02:20 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> Message-ID: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> Guido: > there are already well-established rules for deciding whether a bare > name is local or not, and those rules have always worked "at a > distance". If we adopt a method of nonlocal assignment that allows the deprecation of "global", then we have a chance to change this, if we think that such "at-a-distance" rules are undesirable in general. Do we think that? Einstein-certainly-seemed-to-ly, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:10:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:10:22 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031027161202.GF5842@epoch.metaslash.com> Message-ID: <200310280210.h9S2AB921796@oma.cosc.canterbury.ac.nz> > The best alternative I've seen that addresses nested scope and the > global declaration. Is to borrow :: from C++: -1000! I hate it whenever an otherwise sensible language borrows this ugly piece of syntax. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:20:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:20:27 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <3F9D5910.9050001@livinglogic.de> Message-ID: <200310280220.h9S2KJC21818@oma.cosc.canterbury.ac.nz> Walter Dörwald: > I think ':=' is too close to '='. The default assignment should be much > easier to type than the special case. Well, typing "outer x = value" would require 6 more keystrokes than "x = value". Would that be difficult enough for you? :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Mon Oct 27 21:41:48 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Oct 27 21:42:13 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <1067283859.8566.633.camel@localhost.localdomain> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> <1067283859.8566.633.camel@localhost.localdomain> Message-ID: <3F9DD76C.8010801@ocf.berkeley.edu> Jeremy Hylton wrote: > On Mon, 2003-10-27 at 12:29, Guido van Rossum wrote: > >>>I'm going to make a suggestion. Let's shelve this topic for the time being >>>and simply summarize the issues in an informational PEP aimed at >>>Py3k. >> >>Great idea. I'm getting tired of it too; Alex and I don't seem to be >>getting an inch closer to each other. >> >> >>>We don't even know (at least I don't) if we want an implicit search >>>for outer scope variables or an explicit specification of which >>>scope such variables should be defined in.
If, for some reason, >>>nested scopes make a quick exit in Py3k, this would all be moot >>>anyway. >> >>Sorry to disappoint you, but nested scopes aren't going away. > > > I haven't had time to participate in this thread -- too much real work > for the last several days -- but I'd be happy to write a PEP that > summarizes the issues. > Woohoo! PEPs for generator expressions, copying iterators, and now 'global' "stuff"! This will make summarizing the 700-odd emails I have for the next summary (at this point; the thing grows an average of 50 emails a day as of late) a *heck* of a lot easier. Thanks Jeremy, Raymond, and Alex. -Brett From guido at python.org Mon Oct 27 21:55:57 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 21:56:06 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Tue, 28 Oct 2003 15:02:10 +1300." <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> Message-ID: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> > If we adopt a method of nonlocal assignment that allows the > deprecation of "global", then we have a chance to change this, > if we think that such "at-a-distance" rules are undesirable > in general. > > Do we think that? Alex certainly seems to be arguing this, but I think it's a lost cause. Even Alex will have to accept the long-distance effect of def f(): x = 42 . . (hundreds of lines of unrelated code) . print x And at some point in the future Python *will* grow (optional) type declarations for all sorts of things (arguments, local variables, instance variables) and those will certainly have effect at a distance. 
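[As it turned out, this debate eventually produced PEP 3104 and Python 3's nonlocal statement, which keeps the same declare-it-once, effect-at-a-distance style as global -- a brief illustration:]

```python
def make_counter():
    count = 0
    def bump():
        nonlocal count   # rebind the enclosing function's variable,
        count += 1       # in the same declaration style as 'global'
        return count
    return bump

counter = make_counter()
counter()
second = counter()       # count is now 2
```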
--Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Mon Oct 27 22:14:06 2003 From: bob at redivi.com (Bob Ippolito) Date: Mon Oct 27 22:14:35 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: > [Gustavo Niemeyer wrote] > > > You can do reverse with [::-1] now. > > [Holger Krekel] > > sure, but it's a bit unintuitive and i mentioned not only reverse :-) > > > > Actually i think that 'reverse', 'sort' and 'extend' algorithms > > could nicely be put into the new itertools module. > > > > There it's obvious that they wouldn't mutate objects. And these > > algorithms > > (especially extend and reverse) would be very efficient as iterators > > because > > they wouldn't create temporary lists/tuples. > > To be considered as a possible itertool, an ideal candidate should: > > * work well in combination with other itertools > * be a fundamental building block > * accept all iterables as inputs > * return only an iterator as an output > * run lazily so as not to force the inputs to run to completion > unless externally requested by list() or some such. > * consume constant memory (this rule was bent for itertools.cycle(), > but should be followed as much as possible). > * run finitely if some of the inputs are finite (itertools.repeat(), > count() and cycle() are the only intentionally infinite tools) > > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. > > The problem with ireverse() is that it only works with objects that > support __getitem__() and len(). That pretty much precludes > generators, user defined class based iterators, and the outputs > from other itertools. So, while it may make a great builtin (which > is what PEP-322 is going to propose), it doesn't fit in with other > itertools. How about making islice be more lenient about inputs? 
For example x[::-1] should be expressible by islice(x, None, None, -1) when the input implements __len__ and __getitem__ -- but it's not. [::-1] *does* create a temporary list, because Python doesn't have "views" of lists/tuples. islice should also let you go backwards in general, islice(x, len(x)-1, None, -2) should work. -bob From python at rcn.com Tue Oct 28 00:11:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 00:12:08 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <008f01c39d11$e5f13160$81b0958d@oemcomputer> [Bob Ippolito] > How about making islice be more lenient about inputs? For example > x[::-1] should be expressible by islice(x, None, None, -1) when the > input implements __len__ and __getitem__ -- but it's not. [::-1] > *does* create a temporary list, because Python doesn't have "views" of > lists/tuples. islice should also let you go backwards in general, > islice(x, len(x)-1, None, -2) should work. Sorry, this idea was examined and rejected long ago. The itertools principles involved are:

- avoiding calls that cause the whole stream to be realized,
- avoiding situations that require much of the data to be stored in memory,
- an itertool should work well with other tools and handle all kinds of iterators as inputs.

islice(it, None, None, -1) is a disaster when presented with an infinite iterator and a near disaster with a long-running iterator. Handling negative steps entails saving data in memory. The issue of reverse iteration is being dealt with outside the scope of itertools. See the soon to be revised PEP 322 on reverse iteration.
It will give you the "views" that you seek :-) Raymond Hettinger From python at rcn.com Tue Oct 28 01:40:57 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 01:41:55 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310280050.h9S0oUR27881@12-236-54-216.client.attbi.com> Message-ID: <00a701c39d1e$71af4f00$81b0958d@oemcomputer> > After seeing "iterator expressions" I came up with "generator > expressions" and decided I liked that better. Around the same time > Tim Peters wrote a post where he proposed "generator expressions" > independently: > > http://mail.python.org/pipermail/python-dev/2003-October/039186.html > > Trying to rationalize my own gut preference, I think I like "generator > expressions" better than "iterator expressions" because there are so > many other expressions that yield iterators (e.g. iter(x) comes to > mind :-). Just like generator functions are one specific cool way of > creating an iterator, generator expressions are another specific cool > way, and as a bonus, they're related in terms of implementation (and > that certainly reflects on corners of the semantics, so I don't think > we should try to hide this as an implementation detail). I'm convinced. Raymond From aleaxit at yahoo.com Tue Oct 28 03:56:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 03:56:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> Message-ID: <200310280956.34183.aleaxit@yahoo.com> On Tuesday 28 October 2003 03:55 am, Guido van Rossum wrote: > > If we adopt a method of nonlocal assignment that allows the > > deprecation of "global", then we have a chance to change this, > > if we think that such "at-a-distance" rules are undesirable > > in general. > > > > Do we think that?
> > Alex certainly seems to be arguing this, but I think it's a lost cause. I must have some Don Quixote in my blood. Ah, can anybody point me to the nearest windmill, please...?-) Seriously, I realize by now that I stand no chance of affecting your decision in this matter. Nevertheless, and that attitude may indeed be quixotical, I still have (just barely) enough energy not to let your explanation of your likely coming decision stand as if it was ok with me, or as if I had no good response to your arguments. If it's a lost cause, I think it's because I'm not being very good at marshaling the arguments for it, not because those arguments are weak. So, basically for the record, here goes, once more... > Even Alex will have to accept the long-distance effect of > > def f(): > x = 42 > . > . (hundreds of lines of unrelated code) > . > print x I have absolutely no problem with that -- except that it's bad style, but the language cannot, in general, force good style. The language can and should ALLOW good style, but enforcing it is not always possible. In (old) C, there was often no alternative to putting a declaration far away from the code that used the variable, because declarations had to come at block start. Sometimes you could enclose declaration and use in a nested sub-block, but not always. C++ and modern C have removed this wart by letting declarations come at any point before the variable's used, and _encouraging_ (stylistically -- no enforcement) the declaration to come always together with the initialization. That's about all a language can be expected to do in this regard: not forbid "action at a distance" (that would be too confining), but _allow_ and _encourage_ most programs to avoid it. 
Python is and always has been just as good or even better: there being no separate declaration, you _always_ have the equivalent of it "at the first initialization" (as C++ and modern C encourage but can't enforce), and it's perfectly natural in most cases to keep that close to the region in a function where the name is of interest, if that region comprises only a subset of the function's body. But this, to some extent, is a red herring. "Reading" (accessing) the value referred to by a name looks the name up by rules I mostly _like_, even though it is quite possible that the name was set "far away". As AMK suggests in his "Python warts" essay, people don't often get in trouble with that because _most_ global (module-level, and even more built-in) names are NOT re-bound dynamically. So, when I see, e.g., print len(phonebook) it's most often fine that phonebook is global, just as it's fine that len is built-in (it may be argued that we have "too many" built-in names, and similarly that having "too many" global names is not a good thing, but having SOME such names is just fine, and indeed inevitable -- perhaps Python may remedy the "too many built-ins" in 3.0, and any programmer can refactor his own code to alleviate the "too many globals" -- no deep problem here, in either case). Re-binding names is different. It's far rarer than accessing them, of course. And while all uses of "print x" mean (semantics equivalent to) "look x up in the locals, then if not found there in outer scopes, then if not found there in the globals, then if not found there in the builtins" -- a single, reasonably simple and uniform rule, independent from any "purely declarative statement", which just determines where the value will come from -- the situation for "x=42" is currently different. 
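The read-versus-rebind asymmetry described here can be shown concretely. A sketch in runnable form (phonebook and the function names are invented for illustration):

```python
phonebook = {"alice": "555-0100"}

def lookup():
    # Reading a name searches local, enclosing, global, builtin scopes:
    return len(phonebook)       # finds global 'phonebook', builtin 'len'

def rebind():
    phonebook = {}              # binding defaults to *local*; the global
                                # dict is untouched

def rebind_global():
    global phonebook            # the at-a-distance declaration under debate
    phonebook = {}              # now rebinds the module-level name

assert lookup() == 1
rebind()
assert phonebook == {"alice": "555-0100"}   # unchanged
rebind_global()
assert phonebook == {}                      # rebound
```

The single uniform lookup rule covers all three reads, while the meaning of the two textually identical assignments `phonebook = {}` depends entirely on the presence of the `global` statement.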
It's a rarer situation than just accessing x; it's _more_ important to know where x will be bound, because that will affect its future lifetime -- which we don't particularly care about when we're just accessing it, but is more important when we're setting it; _and_ (alas!) it's affected by a _possible_, purely-declarative, instruction-to-the-compiler "global" statement SOMEwhere. "Normally", "x=42" binds or rebinds x locally. That's the common case, as rebinding nonlocals is rare. It's therefore a little trap that some (a small % of) the time we are instead rebinding a nonlocal _with no nearby reminder of the fact_. No "nearby reminder" is really needed for the _more common_ case of _accessing_ a name -- partly because "where is this being accessed from" is often less crucial (while it IS crucial when _binding_ the name), partly because it's totally common and expected that the "just access" may be doing lookup in other namespaces (indeed, when I write len(x), it's the rare case where len HAS been rebound that may be a trap!-). > And at some point in the future Python *will* grow (optional) type > declarations for all sorts of things (arguments, local variables, > instance variables) and those will certainly have effect at a > distance. Can we focus on the locals? Argument passing, and setting attributes of objects with e.g. "x.y = z" notation, are already subject to rather different rules than setting bare names, e.g. "x.y = z" might perfectly well be calling a property setter x.setY(z) or x.__setattr__('y', z), so I don't think refining those potentially-subtle rules will be a problem, nor that the situation is parallel to "global". However, optional type declarations for local variables might surely be (both problems and parallel:-), depending on roughly what you have in mind for that. 
E.g., are you thinking, syntax sugar apart, of some new statement "constrain_type" which might go something like...:

    def f():
        constrain_type(int) x, y, z, t
        x = 23     # ok
        y = 2.3    # ??? a
        z = "23"   # ??? b
        t = "foo"  # raise subclass of (TypeError ?)

If so, what semantics do you have in mind for cases a and b? I can imagine either an implicit int() call around the RHS (which is why I guess the assignment to t would fail, though I don't know whether it would fail with a type or value error), or an implicit isinstance check, in which case a and b would also fail (and then no doubt with a type error). I may be weird, but -- offhand, and not having had time to reflect on this in depth -- it seems to me that having assignment to bare names 'fail' in some circumstances, while revolutionary in Python, would not be particularly troublesome in the "action at a distance" sense. After all the constrain_type would have the specific purpose of forbidding some assignments that would otherwise succeed, would be used specifically for that, and making "wrong" assignment fail immediately and noisily would be exactly what it's for. I may not think it a GOOD idea to introduce it (for local variables), but if I argued against it it would not be on the lines of "one can't tell by just looking at y=2.3 whether it succeeds or fails". If the concept is to make y=2.3 implicitly do y=int(2.3) I would be much more worried. THEN, with no clear indication to the contrary, we'd have "y=2.3" leave y with a value of 2.3, or 2, or maybe something else for sufficiently weird values of X in a "constrain_type(X) y" -- the semantics of a CORRECT program would suddenly grow subtle dependencies on "nonlocal" ``declarations''.
So, if THAT is your intention -- and indeed that would be far closer to the way "global" works: it doesn't FORBID assignments, rather it changes their semantics -- then I admit the parallel is indeed strict, and I would be worried on basically the same grounds as I'm grumbling about 'global' and its planned extensions. Yes, I realize this seems to be arguing _against_ adaptation -- surely if we had "constrain_type(X) y", and set "y = z", we might like an implicit "y = adapt(z, X)" to be the assignment's semantics? My answer (again, this is a first-blush reaction, haven't thought deeply about the issues) is that adaptation is good, but implicit rather than explicit is ungood, and I'm not sure the good is stronger than the ungood here; AND, adaptation is not typecasting: e.g y=adapt("23", int) should NOT succeed. So, while I might be more intrigued than horrified by such novel suggestions, I would surely see the risks in them -- and some of the risks I'd see WOULD be about "lack of local indication of nonobvious semantics shift". Just like with 'global', yes. Alex From aleaxit at yahoo.com Tue Oct 28 04:22:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:22:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> References: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> Message-ID: <200310281022.31722.aleaxit@yahoo.com> On Tuesday 28 October 2003 02:48 am, Greg Ewing wrote: > Alex Martelli : > > My slight preference for leaving += and friends alone is that > > a function using them to rebind nonlocals would be hard to > > read > > Using my "outer" suggestion, augmented assignments to > nonlocals would be written > > outer x += 1 > > which would make the intention pretty clear, I think. Absolutely clear, and wonderful. Pity that any alternative to 'global' has been declared "a lost cause" by Guido. 
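The restricted rebinding-only statement Greg and Alex sketch as "outer" is essentially what later arrived in Python 3 as the nonlocal statement (PEP 3104); a sketch of "outer count += 1" in that eventual spelling:

```python
def make_counter():
    count = 0
    def tick():
        nonlocal count   # the role 'outer' would play: rebind an existing
        count += 1       # enclosing binding, never create a new one
        return count
    return tick

tick = make_counter()
assert tick() == 1
assert tick() == 2
```

As in the restriction argued for here, `nonlocal` refuses to create names: using it where no enclosing binding of the name exists is a compile-time SyntaxError.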
I'd still like to forbid "side effect rebinding" via statements such as class, def, import, for, i.e., no outer def f(): ... and the like; i.e., the 'outer' statement should be 'outer' expr_stmt (in Grammar/Grammar terms) with the further constraint that the expr_stmt must be an assignment (augmented or not); and the outer statement should not be a 'small_stmt', so as to avoid the ambiguity of outer x=1; y=2 (is this binding a local or nonlocal name 'y'?). Alex From aleaxit at yahoo.com Tue Oct 28 04:27:06 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:27:12 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> Message-ID: <200310281027.06138.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:54 am, Guido van Rossum wrote: > > Also, I have a question about the semantic specification of what a copy > > is supposed to do. Does it guarantee that the same data stream will be > > reproduced? For instance, would a generator of random words expect its > > copy to generate the same word sequence. Or, would a copy of a > > dictionary iterator change its output if the underlying dictionary got > > updated (i.e. should the dict be frozen to changes when a copy exists or > > should it mutate). > > Every attempt should be made for the two copies to return exactly the > same stream of values. This is the pure tee() semantics. Yes, but iterators that run on underlying containers don't guarantee, in general, what happens if the container is mutated while the iteration is going on -- arbitrary items may end up being skipped, repeated, etc. So, "every attempt" is, I feel, too strong here. deepcopy exists for those cases where one is ready to pay a hefty price for guarantees of "decoupling", after all. 
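The pure tee() semantics under discussion later shipped as itertools.tee (new in Python 2.4). A minimal sketch of both the guarantee and the container-mutation caveat Alex raises:

```python
import itertools

# Two copies of one iterator yield exactly the same stream of values.
a, b = itertools.tee(iter([1, 2, 3]))
assert list(a) == [1, 2, 3]
assert list(b) == [1, 2, 3]

# But no copy decouples an iterator from its underlying container:
# changing a dict's key set mid-iteration remains undefined behaviour
# (modern dicts detect it and raise rather than silently misbehave).
d = {"x": 1}
it = iter(d)
d["y"] = 2            # mutate the key set while 'it' is live
try:
    next(it)
    mutated_ok = True
except RuntimeError:
    mutated_ok = False
assert not mutated_ok
```

Only deepcopy-style semantics could decouple the two streams from the container itself, at the hefty price mentioned above.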
Alex From aleaxit at yahoo.com Tue Oct 28 04:31:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:31:43 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <1067299912.1066.35.camel@anthem> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> Message-ID: <200310281031.38234.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:11 am, Barry Warsaw wrote: ... > What I really want is access to a namespace, and then all the normal > Python attribute access notations just work. They're one honking great > idea after all. Yes, all in all this does remain my preference, too. I'd take stropping (or "keyword stropping" a la Greg's "outer x") rather than declarative stuff, but just getting a namespace (in ways the compiler could recognize, i.e. by magicnames such as __me__) and then using __me__.x=23 would require no new syntax and be maximally obvious. Sigh. > This was behind the "import __me__" suggestion for access to module > globals. Why can't we have something similar for nested functions? And why can't we have "import __me__" too? Ah well! Alex From aleaxit at yahoo.com Tue Oct 28 04:37:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:37:50 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> Message-ID: <200310281037.44424.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:04 am, Greg Ewing wrote: > Alex Martelli : > > Nobody's asking for 3.0*x to work where x is a user-coded type > > without an __rmul__; rather, the point is that 3*x should fail too, > > and ideally they'd have the same clear error message as 3+x > > gives when the type has no __radd__. > > Okay, that makes sense. So how do you think we should go about it? I can't see a way right now (at least not for 2.3, i.e. 
without breaking some programs). A user COULD have coded a class that's meant to represent a sequence AND relied on the (undocumented, I think) feature that giving the class a __mul__ automatically makes instances of that class multipliable by integers on EITHER side, after all. We can't (sensibly), I think, distinguish that from the case where the user has coded a class that's meant to represent a number and gets astonished that __mul__, undocumentedly, makes instances of that class multipliable by integers on either side. So perhaps for 2.3 we should just apologetically note the anomaly in the docs, and for 2.4 forbid the former case, i.e., require both __mul__ AND __rmul__ to exist if one wants to code sequence classes that can be multiplied by integers on either side...? Any opinions, anybody...? Alex From aleaxit at yahoo.com Tue Oct 28 05:12:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 05:12:33 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <20031027215648.GM3929@zot.electricrain.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310271125.16879.aleaxit@yahoo.com> <20031027215648.GM3929@zot.electricrain.com> Message-ID: <200310281112.21162.aleaxit@yahoo.com> On Monday 27 October 2003 10:56 pm, Gregory P. Smith wrote: > On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote: > > I still don't quite see how the lock ends up being "held", but, don't > > mind me -- the intricacy of mixins and wrappings and generators and > > delegations in those modules is making my head spin anyway, so it's > > definitely not surprising that I can't quite see what's going on. > > BerkeleyDB internally always grabs a read lock (i believe at the page > level; i don't think BerkeleyDB does record locking) for any database read > when opened with DB_THREAD | DB_INIT_LOCK flags. I believe the problem > is that a DBCursor object holds this lock as long as it is open/exists.
> Other reads can go on happily, but writes must to wait for the read lock > to be released before they can proceed. Aha, much clearer, thanks. > What about the behaviour of multiple iterators for the same dict being > used at once (either interleaved or by multiple threads; it shouldn't > matter)? I expect that works fine in python. If the dict is not being modified, or if the only modifications on it are assigning different values for already-existing keys, multiple iterators on the same unchanging dict do work fine in one or more threads. But note that iterators only "read" the dict, don't change it. If any change to the set of keys in the dict happens, all bets are off. > This is something the _DBWithCursor iteration interface does not currently > support due to its use of a single DBCursor internally. > > _DBWithCursor is currently written such that the cursor is never closed > once created. This leaves tons of potential for deadlock even in single > threaded apps. Reworking _DBWithCursor into a _DBThatUsesCursorsSafely > such that each iterator creates its own cursor in an internal pool > and other non cursor methods that would write to the db destroy all > cursors after saving their current() position so that the iterators can > reopen+reposition them is a solution. Woof. I think I understand what you're saying. However, writing to a dict (in the sense of changing the sets of keys) while one is iterating on the dict is NOT supported in Python -- basically "undefined behavior" (which does NOT include possibilities of crashes and deadlocks, though). So, maybe, we could get away with something a bit less rich here? > > Given that in bsddb's case that iteritems() first [and only] > > next() boils down to a self.first() which in turn does a > > self.dbc.first() I _still_ don't see exactly what's holding the > > lock. 
But the simplest fix would appear to be in __delitem__, > > i.e., if we have a cursor we should delete through it:

> > def __delitem__(self, key):
> >     self._checkOpen()
> >     if self.dbc is not None:
> >         self.dbc.set(key)
> >         self.dbc.delete()
> >     else:
> >         del self.db[key]

> > ...but this doesn't in fact remove the deadlock on the > > unit-test for popitem, which just confirms I don't really > > grasp what's going on, yet!-) > > hmm. i would've expected your __delitem__ to work. Regardless, using the Ah! I'll check again -- maybe I did something silly -- but what happens now is that the __delitem__ DOES work, the key does get deleted according to print and printf's I've sprinkled here and there, BUT then right after the key is deleted everything deadlocks anyway (in test_popitem). > debugger I can stop the deadlock from occurring if i do "self.dbc.close(); > self.dbc = None" just before popitem's "del self[k]" So, maybe I _should_ just fix popitem that way and see if all tests pass? I dunno -- it feels a bit like fixing the symptoms and leaving some deep underlying problems intact... Any other opinions? I don't have any strong feelings one way or the other, except that I really think unit-tests SHOULD pass... and indeed that changes should not be committed UNLESS unit-tests pass... Alex From python at rcn.com Tue Oct 28 05:29:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 05:30:16 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> Message-ID: <000b01c39d3e$59af4de0$c807a044@oemcomputer> Okay, this is the last chance to come up with a name other than sorted(). Here are some alternatives:

    inlinesort()  # immediately clear how it is different from sort()
    sortedcopy()  # clear that it makes a copy and does a sort
    newsorted()   # appropriate for a class method constructor

I especially like the last one and all of them provide a distinction from list.sort().
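For the record, the spelling that eventually shipped (as a builtin in Python 2.4) was plain sorted(). Whichever name wins, the semantics under discussion are copy-then-sort; a sketch, using sortedcopy as the hypothetical name:

```python
def sortedcopy(iterable, *args, **kwds):
    # Hypothetical spelling of the proposal: sort a fresh list,
    # leave the input untouched.
    result = list(iterable)
    result.sort(*args, **kwds)
    return result

data = [3, 1, 2]
assert sortedcopy(data) == [1, 2, 3]
assert data == [3, 1, 2]          # original order preserved
```

Note that accepting any iterable (not just lists) falls out of the `list(iterable)` copy for free, which is part of why a function rather than a list method won out.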
Raymond Hettinger From ncoghlan at iinet.net.au Tue Oct 28 06:19:31 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 28 06:19:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280956.34183.aleaxit@yahoo.com> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> Message-ID: <3F9E50C3.4040908@iinet.net.au> The current situation: Rebinding a variable at module scope:

>>> def f():
        global x
        x = 1

>>> f()
>>> print x
1

If I try to write "global x.y" inside the function, Idle spits the dummy (and rightly so). I can rebind x.y quite happily, since I am only referencing x, and the lookup rules find the scope I need. I don't see any reason for 'global in ' or syntactic sugar for nonlocal binding (i.e. ":=" ) to accept anything that the current global does not. Similarly, consider the following from Idle:

>>> def f():
        x += 1

>>> x = 1
>>> f()
Traceback (most recent call last):
  File "", line 1, in -toplevel-
    f()
  File "", line 2, in f
    x += 1
UnboundLocalError: local variable 'x' referenced before assignment

Augmented assignment does not currently automatically invoke a "global" definition now, so why should that change no matter the outcome of this discussion? Guido's suggestion of "nonlocal" for a variant of global that searches intervening namespaces first seems nice - the term "non-local variable" certainly strikes me as the most frequently used way of referring to variables from containing scopes in this thread.

>>> def f():
        def g():
            nonlocal x
            x = 1
        g()
        print x

>>> f()
1

If 'nonlocal' was allowed only to _rebind_ variables, rather than create them at some other scope (probably advisable since 'nonlocal' merely says, 'somewhere other than here', which means there is no obvious suggestion for where to create the new variable - I could argue for either "module scope" or "nearest enclosing scope").
Defining it this way also allows catching problems at compile time instead of runtime (YMMV on whether that's a good thing or not) At this point, Just's "rebinding variable from outer scope only" assignment operator "x := 1" might seem like nice syntactic sugar for "nonlocal x; x =1" (it wouldn't require a new keyword, either) Is there really any need to allow anything more then replicating the search order for variable _reference_? Code which nests sufficient scopes that a simple 'inside-out' search is not sufficient would just seem sorely in need of a redesign to me. . . Regards, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From ncoghlan at iinet.net.au Tue Oct 28 06:33:40 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 28 06:33:45 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <3F9E50C3.4040908@iinet.net.au> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au> Message-ID: <3F9E5414.8020500@iinet.net.au> Nick Coghlan strung bits together to say: ::snip:: Saw the rather sensible suggestion to shelve this discussion only _after_ making my previous post. Ah well. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From mickey at tm.informatik.uni-frankfurt.de Tue Oct 28 06:30:08 2003 From: mickey at tm.informatik.uni-frankfurt.de (Michael Lauer) Date: Tue Oct 28 06:34:05 2003 Subject: [Python-Dev] Re: 2.3.3 plans Message-ID: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> Anthony wrote: >I'm currently thinking of doing 2.3.3 in about 3 months time. 
My focus >on 2.3.3 will be on fixing the various build glitches that we have on >various platforms - I'd like to see 2.3.3 build on as many boxes as >possible, "out of the box". Does this also include cross compiling? I'm the maintainer of a python-for-arm-linux distribution (http://opie.net.wox.org/python) which is created using the OpenZaurus build infrastructure (http://openzaurus.org). To get Python cross compiled for arm-linux, I did a few (pretty rough) patches which I attached to this message. It would be useful for cross compiling, if (conceptually) the first two could be integrated into Python 2.3.3. Best Regards, Mickey. -- :M: -------------------------------------------------------------------------- Dipl.-Inf. Michael 'Mickey' Lauer mickey@tm.informatik.uni-frankfurt.de Raum 10b - ++49 69 798 28358 Fachbereich Informatik und Biologie -------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: python-2.3.2-crosscompile.patch Type: text/x-diff Size: 3409 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-2.3.2-crosscompile.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: python-modules-oz1.patch Type: text/x-diff Size: 2026 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-modules-oz1.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: python-crosscompile-hotshot.patch Type: text/x-diff Size: 354 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-crosscompile-hotshot.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: python-crosscompile-socket.patch Type: text/x-diff Size: 286 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-crosscompile-socket.bin From ws-news at gmx.at Tue Oct 28 06:59:49 2003 From: ws-news at gmx.at (Werner Schiendl) Date: Tue Oct 28 07:00:14 2003 Subject: [Python-Dev] Re: copysort patch, was RE: inline sort option References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <000b01c39d3e$59af4de0$c807a044@oemcomputer> Message-ID: Hi, thought you might be interested in the opinions of someone not (yet) working full-day with Python and whose mother tongue is *not* English. "Raymond Hettinger" schrieb > Okay, this is the last chance to come up with a name other than > sorted(). > The method makes a copy, sorts that, and returns it, right? I think the copy is not fully clear from this name. I'd give it +0 > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() I'm rather -1 on it. Inline might be confused with inplace, and even when not it's not clear from the name that a copy is made. > sortedcopy() # clear that it makes a copy and does a sort My favourite (if the behaviour is how I believe it, that is *only* the copy is sorted) It's really obvious what is done. +1 > newsorted() # appropriate for a class method constructor I first read this news-orted, and had to step back. Also "new" is not actually the same as "copy" to me (maybe because of my C++ background). Say -0 hth Werner From amk at amk.ca Tue Oct 28 07:53:50 2003 From: amk at amk.ca (amk@amk.ca) Date: Tue Oct 28 07:53:55 2003 Subject: [Python-Dev] htmllib vs.
HTMLParser In-Reply-To: <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> References: <20031027160221.GA29155@rogue.amk.ca> <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> Message-ID: <20031028125350.GC1095@rogue.amk.ca> On Mon, Oct 27, 2003 at 04:53:32PM -0800, Bill Janssen wrote: > But IMO simply adding some handler methods won't really do it. You > also need to introduce some knowledge about the semantics of the > syntax. For example, a new "block"-level element should close all > "in-line" elements that are currently open. Etc. Perhaps, but it might be a mug's game. I was on the Lynx developer list for a while, and bad HTML requires many, many hacks to be processed sensibly. Given that XHTML use is slowly rising, that work may not be necessary, but I'll keep it in mind. --amk From pje at telecommunity.com Tue Oct 28 08:46:08 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 08:46:16 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310281031.38234.aleaxit@yahoo.com> References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> Message-ID: <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> At 10:31 AM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 01:11 am, Barry Warsaw wrote: > ... > > What I really want is access to a namespace, and then all the normal > > Python attribute access notations just work. They're one honking great > > idea after all. > >Yes, all in all this does remain my preference, too. I'd take stropping (or >"keyword stropping" a la Greg's "outer x") rather than declarative stuff, >but just getting a namespace (in ways the compiler could recognize, >i.e. by magicnames such as __me__) and then using __me__.x=23 >would require no new syntax and be maximally obvious. Sigh. 
Why not just:

    import whatevermynameis

    whatevermynameis.foo = bar

This would be even *more* maximally obvious, as you wouldn't need to know what '__me__' means. :) And how often do you write a module without knowing what its name is, or change the name after you've written it? Plus, thanks to the time machine, it already works. :) Heck, now that I've thought of it, I'm almost tempted to go change all my existing uses of global to this instead... From pje at telecommunity.com Tue Oct 28 08:57:54 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 08:57:47 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310280956.34183.aleaxit@yahoo.com> References: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: >AND, adaptation is not typecasting: >e.g y=adapt("23", int) should NOT succeed. Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and str doesn't have __conform__. But why would you intend that they not have them in future? And, why do you consider adaptation *not* to be typecasting? I always think of it as "give me X, rendered as a Y", which certainly sounds like a description of typecasting to me. From pje at telecommunity.com Tue Oct 28 09:01:29 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Oct 28 09:00:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000b01c39d3e$59af4de0$c807a044@oemcomputer> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> Message-ID: <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> At 05:29 AM 10/28/03 -0500, Raymond Hettinger wrote: > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor +1 on sortedcopy(), especially if it's usable as a method, e.g. myList.sortedcopy(). (Note that that doesn't exclude it also being spelled as 'list.sortedcopy(myList)'.) From niemeyer at conectiva.com Tue Oct 28 08:59:38 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Tue Oct 28 09:01:10 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <008f01c39d11$e5f13160$81b0958d@oemcomputer> References: <008f01c39d11$e5f13160$81b0958d@oemcomputer> Message-ID: <20031028135938.GA22878@ibook.distro.conectiva> Hi Raymond! If that has been discussed long ago as you mention, please, just tell me so and I won't try to recreate the same discussion. :-) [...] > islice(it, None, None, -1) is a disaster when presented with an infinite > iterator and a near disaster with a long running iterator. I don't agree with that approach. I think islice() would be more useful if it was based on best effort to try to reduce memory usage. With a negative index, it will be necessary to iterate trough all steps, but it can do better than list(iter)[-1], for example. Knowing that you have a negative index of -1 means that you may cache just a single entry, instead of the whole list. > Handling negative steps entails saving data in memory. Indeed. 
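Gustavo's "cache just a single entry" idea is only a few lines of code. A minimal sketch in today's spelling (ilast is a hypothetical helper name, not anything actually in itertools):

```python
def ilast(iterable):
    """Return the last item of an iterable, holding only one item in memory.

    Unlike list(iterable)[-1] this never builds the whole list, though it
    still loops forever if handed an infinite iterator.
    """
    sentinel = object()
    last = sentinel
    for last in iterable:
        pass
    if last is sentinel:
        raise IndexError("ilast() of empty iterable")
    return last

# Equivalent to list(range(10**6))[-1], but with O(1) extra memory.
assert ilast(iter(range(1000000))) == 999999
```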
But if I *want* to use a negative index over an iterator, it would be nice if some smart guy did the work for "me" instead of having to do that by hand, or even worse, having to use a list() which will store *everything* in memory. As a real world example, have a look at rrule's __getitem__() method (more info on https://moin.conectiva.com.br/DateUtil):

    def __getitem__(self, item):
        if self._cache_complete:
            return self._cache[item]
        elif isinstance(item, slice):
            if item.step and item.step < 0:
                return list(iter(self))[item]
            else:
                return list(itertools.islice(self,
                                             item.start or 0,
                                             item.stop or sys.maxint,
                                             item.step or 1))
        elif item >= 0:
            gen = iter(self)
            try:
                for i in range(item+1):
                    res = gen.next()
            except StopIteration:
                raise IndexError
            return res
        else:
            return list(iter(self))[item]

Having negative indexes is *very* useful in that context, and I'd like so much to turn it into simply

    return list(itertools.islice(self, item.start, item.stop, item.step))

Now, have a look at the count() method, which is useful as well (it is the same as a __len__() method, but introducing __len__() kills the iterator performance).

    def count(self):
        if self._len is None:
            for x in self:
                pass
        return self._len

It is very useful as well, and having something like ilen() would be nice, even though it must iterate over the whole sequence. This would never end up in an infinite loop in that context, and even if it did, I wouldn't care about it. Not introducing it for being afraid of an infinite loop would be the same as removing the 'while' construct to avoid "while 1: pass". -- Gustavo Niemeyer http://niemeyer.net From guido at python.org Tue Oct 28 10:15:02 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:18:09 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Your message of "Tue, 28 Oct 2003 10:27:06 +0100."
<200310281027.06138.aleaxit@yahoo.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> <200310281027.06138.aleaxit@yahoo.com> Message-ID: <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> > > > Also, I have a question about the semantic specification of what a copy > > > is supposed to do. Does it guarantee that the same data stream will be > > > reproduced? For instance, would a generator of random words expect its > > > copy to generate the same word sequence. Or, would a copy of a > > > dictionary iterator change its output if the underlying dictionary got > > > updated (i.e. should the dict be frozen to changes when a copy exists or > > > should it mutate). > > > > Every attempt should be made for the two copies to return exactly the > > same stream of values. This is the pure tee() semantics. > > Yes, but iterators that run on underlying containers don't guarantee, > in general, what happens if the container is mutated while the iteration > is going on -- arbitrary items may end up being skipped, repeated, etc. > So, "every attempt" is, I feel, too strong here. Maybe. I agree that for list and dict iterators, if the list is mutated, this warrantee shall be void. But I strongly believe that cloning a random iterator should cause two identical streams of numbers, not two different random streams. If you want two random streams you should create two independent iterators. Most random number generators have a sufficiently small amount of state that making a copy isn't a big deal. If it is hooked up to an external source (e.g. /dev/random) then I'd say you'd have to treat it as a file, and introduce explicit buffering. > deepcopy exists for those cases where one is ready to pay a hefty > price for guarantees of "decoupling", after all. But I don't propose that iterators support __deepcopy__. The use case is very different. 
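The "pure tee() semantics" Guido describes -- both copies returning exactly the same stream of values -- is what the itertools.tee that later appeared in Python 2.4 provides. A quick sketch in modern spelling:

```python
import itertools
import random

def noisy_stream():
    # Seeded only so this demo is repeatable; any generator would do.
    rng = random.Random(42)
    while True:
        yield rng.random()

a, b = itertools.tee(noisy_stream())
# Both branches see the identical stream of values, in the same order,
# even though the underlying generator is advanced only once; tee buffers
# whatever one branch has consumed ahead of the other.
assert [next(a) for _ in range(5)] == [next(b) for _ in range(5)]
```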
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:18:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:19:00 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Tue, 28 Oct 2003 05:29:21 EST." <000b01c39d3e$59af4de0$c807a044@oemcomputer> References: <000b01c39d3e$59af4de0$c807a044@oemcomputer> Message-ID: <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> > Okay, this is the last chance to come-up with a name other than > sorted(). > > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor > > > I especially like the last one and all of them provide a distinction > from list.sort(). While we're voting, I still like list.sorted() best, so please keep that one in the list of possibilities. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:16:37 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:26:03 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 10:37:44 +0100." <200310281037.44424.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> Message-ID: <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> > So perhaps for 2.3 we should just apologetically note the anomaly > in the docs, and for 2.4 forbid the former case, i.e., require both > __mul__ AND __rmul__ to exist if one wants to code sequence > classes that can be multiplied by integers on either side...? > > Any opinions, anybody...? What's wrong with the status quo? So 3*x is undefined, and it happens to return x*3. Is that so bad? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:27:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:28:03 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Tue, 28 Oct 2003 21:19:31 +1000." <3F9E50C3.4040908@iinet.net.au> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au> Message-ID: <200310281527.h9SFRjw29046@12-236-54-216.client.attbi.com> > Augmented assignment does not currently automatically invoke a > "global" definition now, so why should that change no matter the > outcome of this discussion? Because of the fair user expectation that if you can write "x = x + 1" you should also be able to write "x += 1". > Is there really any need to allow anything more then replicating the > search order for variable _reference_? Code which nests sufficient > scopes that a simple 'inside-out' search is not sufficient would > just seem sorely in need of a redesign to me. . . I just realized one thing that explains why I prefer explicitly designating the scope (as in 'global x in f') over something like 'nonlocal'. It matches what the current global statement does, and it makes it crystal clear that you *can* declare a variable in a specific scope and assign to it without requiring there to be a binding for that variable in the scope itself. EIBTI when comparing these two. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:29:53 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:30:05 2003 Subject: [Python-Dev] Re: 2.3.3 plans In-Reply-To: Your message of "Tue, 28 Oct 2003 12:30:08 +0100." 
<1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> References: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> Message-ID: <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> > Does this also include cross compiling? I'm the maintainer of a > python-for-arm-linux distribution (http://opie.net.wox.org/python) > which is created using the OpenZaurus build infrastructure > (http://openzaurus.org). I think this is a worthy cause to try and support. (I love my Zaurus.) > To get Python cross compiled for arm-linux, I did a few (pretty > rough) patches which I attached to this message. It would be useful > for cross compiling, if (conceptually) the first two could be > integrated into Python 2.3.3. I hope someone here can work with you on getting the patches in acceptable shape. You should start by uploading them to the patch manager in SourceForge. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:33:34 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:34:01 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Tue, 28 Oct 2003 12:40:42 GMT." <20031028124042.GA22513@vicky.ecs.soton.ac.uk> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <20031027103540.GA27782@vicky.ecs.soton.ac.uk> <200310271609.03819.aleaxit@yahoo.com> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> Message-ID: <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> I've lost context for the following thread. What is this about? I can answer one technical question regardless, but I have no idea what I'm promoting here. :-) > Hello Alex, > > On Mon, Oct 27, 2003 at 04:09:03PM +0100, Alex Martelli wrote: > > Cool! Why don't you try copy.copy on types you don't automatically > > recognize and know how to deal with, BTW? 
That might make this > > cool piece of code general enough that Guido might perhaps allow > > generator-produced iterators to grow it as their __copy__ method... > > I will try. Note that only __deepcopy__ makes sense, as far as I can tell, > because there is too much state that really needs to be copied and not shared > in a generator (in particular, the sequence iterators from 'for' loops). > > I'm not sure about how deep-copying should be defined for built-in > types. Should a built-in __deepcopy__ method try to import and call > copy.deepcopy() on the sub-items? This doesn't seem to be right. Almost -- you have to pass the memo argument that your __deepcopy__ received as the second argument to the recursive deepcopy() calls, to avoid looping on cycles. > A bientot, > > Armin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:36:23 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:36:29 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: Your message of "Tue, 28 Oct 2003 08:46:08 EST." <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> Message-ID: <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com> > Why not just: > > import whatevermynameis > > whatevermynameis.foo = bar > > This would be even *more* maximally obvious, as you wouldn't need to > know what '__me__' means. :) And how often do you write a module > without knowing what its name is, or change the name after you've > written it? Plus, thanks to the time machine, it already works. :) Doesn't work when your module may either be called __main__ or rumpelstiltkin. 
It would then become

    if __name__ == "__main__":
        import __main__ as me
    else:
        import rumpelstiltkin as me

which loses the "aha!" effect of a cool solution. It also IMO requires too much explanation to the unsuspecting reader who doesn't understand right away *why* rumpelstiltkin imports itself. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 28 10:39:08 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 10:39:14 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> Message-ID: <200310281639.08240.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:16 pm, Guido van Rossum wrote: > > So perhaps for 2.3 we should just apologetically note the anomaly > > in the docs, and for 2.4 forbid the former case, i.e., require both > > __mul__ AND __rmul__ to exist if one wants to code sequence > > classes that can be multiplied by integers on either side...? > > > > Any opinions, anybody...? > > What's wrong with the status quo? So 3*x is undefined, and it happens > to return x*3. Is that so bad? Where is it specified that 3*x "is undefined" when x's type exposes __mul__ but not __rmul__ ? Sorry, I don't understand the viewpoint you seem to imply here. If x's type exposed no __add__ but "it just so happened" that x+23 always returned 12345 -- while every other addition, as expected, failed -- wouldn't you agree that the lack of a normal and reasonably expected exception is bad? I think that if Python returns "arbitrary" results, rather than raising an exception, for operations that "should" raise an exception, that is surely very bad -- it makes it that much harder for programmers to debug the programs they're developing.
If there's some doubt about the words I've put in quotes -- that treating x*y just like y*x only for certain values of type(y) isn't arbitrary or shouldn't raise -- then we can of course discuss this, but isn't the general idea correct? Now, the docs currently say, about sequences under http://www.python.org/doc/current/ref/sequence-types.html :

"""
sequence types should implement ... multiplication (meaning repetition) by
defining the methods __mul__(), __rmul__() and __imul__() described below;
they should not define __coerce__() or other numerical operators.
"""

So, a sequence-emulating type that implements __mul__ but not __rmul__ appears to violate that "should". The description of __mul__ and __rmul__ referred to seems to be that at http://www.python.org/doc/current/ref/numeric-types.html . It says that methods corresponding to operations not supported by a particular kind of number should be left undefined (as opposed to the behavior of _attempts at those operations_ being undefined), so if I had a hypothetical number type X such that, for x instance of X and an integer k, x*k should be supported but k*x shouldn't, isn't this a recommendation to not write __rmul__ in X ...? Besides, this weird anomaly is typical of newstyle classes only. Consider:

>>> class X:
...     def __mul__(self, other): return 23
...
>>> x=X()
>>> x*7
23
>>> 7*x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for *: 'int' and 'instance'
>>>

ALL wonderful, just as expected, hunky-dory. But now, having read that newstyle classes are better, I want to make X newstyle -- can't see any indication in the docs that I shouldn't -- and...:

>>> class X(object):
...     def __mul__(self, other): return 23
...
>>> x=X()
>>> x*7
23
>>> 7*x
23
>>>

*eep*! Yes, it DOES seem to be that this is QUITE bad indeed.
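For contrast, a class that really means to support multiplication on both sides spells it out, as the reference manual's "should" asks. A minimal sketch with a new-style class (Seq is a made-up name for illustration):

```python
class Seq(object):
    """A tiny sequence-like class supporting repetition on either side."""
    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, n):
        return Seq(self.items * n)

    # Defining __rmul__ explicitly makes 3 * s documented behavior,
    # rather than an accident of the type machinery.
    __rmul__ = __mul__

s = Seq(["a", "b"])
assert (s * 3).items == ["a", "b", "a", "b", "a", "b"]
assert (3 * s).items == (s * 3).items
```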
Alex From mcherm at mcherm.com Tue Oct 28 10:44:38 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 28 10:44:39 2003 Subject: [Python-Dev] product() Message-ID: <1067355878.3f9e8ee6bf177@mcherm.com> Nick Coghlan writes: > >>> if all(pred(x) for x in values): pass # alltrue > >>> if any(pred(x) for x in values): pass # anytrue > >>> if any(not pred(x) for x in values): pass # anyfalse > >>> if all(not pred(x) for x in values): pass # allfalse > > The names from the earlier thread do read nicely. . . +1 Very nicely indeed. -- Michael Chermside From aleaxit at yahoo.com Tue Oct 28 11:03:58 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:04:04 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> Message-ID: <200310281703.58169.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:33 pm, Guido van Rossum wrote: > I've lost context for the following thread. What is this about? I > can answer one technical question regardless, but I have no idea what > I'm promoting here. :-) Armin Rigo posted an URL to a C extension module he has recently developed, and used. That module is able to copy running instances of a generator. Currently, it does so by just "knowing", and type by type dealing with, several types whose instances could be values referred to by the generator's frame. I was suggesting extending this so that, when values of other types are met, instead of automatically failing (or, as I believe I recall Armin's extension does today, copying the reference rather than the value), the copy process would...: > > > Cool! Why don't you try copy.copy on types you don't automatically > > > recognize and know how to deal with, BTW? 
That might make this and Armin said that __copy__ seems weak to him but __deepcopy__ might not be: > > I will try. Note that only __deepcopy__ makes sense, as far as I can > > tell, because there is too much state that really needs to be copied and > > not shared in a generator (in particular, the sequence iterators from > > 'for' loops). So, now he went on to ask about __deepcopy__ and you answered: > > I'm not sure about how deep-copying should be defined for built-in > > types. Should a built-in __deepcopy__ method try to import and call > > copy.deepcopy() on the sub-items? This doesn't seem to be right. > > Almost -- you have to pass the memo argument that your __deepcopy__ > received as the second argument to the recursive deepcopy() calls, to > avoid looping on cycles. Now, if Armin's code can only provide __deepcopy__ and not __copy__, then it's probably of little use wrt the __copy__ functionality I talk about in PEP 323 (which I still must revise to take into account your feedback and Raymond's -- plan to get to that as soon as I've cleared my mbox) -- the memory and time expenditure is likely to be too high for that. It's still going to be a cool hack, well worth "publishing" as such, and probably able to be "user-installed" as the way deepcopy deals with generators even though generators themselves may not sprout a __deepcopy__ method themselves (fortunately, copy.copy does a lot of "ad-hoc protocol adaptation" -- it's occasionally a bit rambling or hard to follow, but often allows plugging in "third party copiers" for types which their authors hadn't imagined would be copied or deep copied, so that other innocent client code which just calls copy.copy(x) will work... essentially how "real" adaptation would work, except that registering a third-party protocol adapter would be easier:-). 
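The "third-party copiers" hook Alex mentions lives in the copy_reg module (renamed copyreg in Python 3): registering a reduce function there teaches copy.copy, copy.deepcopy and pickle about a type without touching the class itself. A minimal sketch, with Token as a stand-in for a third-party class its author never imagined being copied:

```python
import copy
import copyreg  # spelled copy_reg in the Python 2.x line of this era

class Token:
    """Stand-in for a third-party type with no copy support of its own."""
    def __init__(self, name):
        self.name = name

def reduce_token(t):
    # A reduce function: (callable, args) that rebuilds an equivalent object.
    return (Token, (t.name,))

# Registration goes into copyreg.dispatch_table, which both the copy
# module and pickle consult before falling back on __reduce_ex__.
copyreg.pickle(Token, reduce_token)

t = Token("x")
t2 = copy.copy(t)
assert t2 is not t and t2.name == "x"
```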
Alex From aleaxit at yahoo.com Tue Oct 28 11:23:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:23:39 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> References: <000b01c39d3e$59af4de0$c807a044@oemcomputer> <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> Message-ID: <200310281723.31940.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:18 pm, Guido van Rossum wrote: > > Okay, this is the last chance to come-up with a name other than > > sorted(). > > > > Here are some alternatives: > > > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > > > > > I especially like the last one and all of them provide a distinction > > from list.sort(). > > While we're voting, I still like list.sorted() best, so please keep > that one in the list of possibilities. I also like list.sorted() -- but list.newsorted() is IMHO even a LITTLE bit better, making it even clearer that it's giving a NEW list. Just a _little_ preference, mind you. "sortedcopy" appears to me a BIT less clear (what "copy", if the arg isn't a list...?), "inlinesort" worst. BTW, I think I should point out one POSSIBLE problem with classmethods -- since unfortunately they CAN be called on an instance, and will ignore that instance, this may confuse an unsuspecting user. I was arguing on c.l.py that this _wasn't_ confusing because I never saw anybody made that mistake with dict.fromkeys ... and the response was "what's that?"... i.e., people aren't making mistakes with it because they have no knowledge of it. list.newsorted or however it's going to be called is probably going to be more popular than existing dict.fromkeys, so the issue may be more likely to arise there. 
Although I think the issue can safely be ignored, I also think I should point it out anyway, even just to get concurrence on this -- it IS possible that the classmethod idea is winning "by accident" just because nobody had thought of the issue, after all, and that would be bad (and I say so even if I was the original proposer of the list.sorted classmethod and still think it should be adopted -- don't want it to get in "on the sly" because a possible problem was overlooked, as opposed to, considered and decided to be not an issue). OK, so here's the problem, exemplified with dict.fromkeys:

    d = {1:2, 3:4, 5:6}
    dd = d.fromkeys([3, 5])

it's not immediately obvious that the value of d matters not a whit -- that this is NOT going to return a subset of d "taken from the keys" 3 and 5, i.e. {3:4, 5:6}, but, rather, {3:None, 5:None} -- and the functionality a naive user might attribute to that call d.fromkeys([3, 5]) should in fact be coded quite differently, e.g.:

    dd = dict([ (k,v) for k, v in d.iteritems() if k in [3,5] ])

or perhaps, if most keys are going to be copied:

    dd = d.copy()
    for k in d:
        if k not in [3, 5]:
            del dd[k]

The situation with list.sorted might be somewhat similar, although in fact I think that it's harder to construct a case of fully sympathizable-with confusion like the above. Still:

    L = range(7)
    LL = L.sorted()

this raises an exception (presumably about L.sorted needing "at least 1 arguments, got 0" -- that's what d.fromkeys() would do today), so the issue ain't as bad -- it will just take the user a while to understand WHY, but at least there shouldn't be a running program with strange results, which makes for harder debugging. Or:

    L = range(7)
    LL = L.sorted(('fee', 'fie', 'foo'))

I'm not sure what the coder might expect here, but again it seems possible that he expected the value of L to matter in some way to the resulting value of LL.
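The dict.fromkeys surprise described above is easy to verify at the interpreter (spelled here with today's dict comprehension rather than the 2.3-era list-comprehension-in-dict() idiom):

```python
d = {1: 2, 3: 4, 5: 6}

# The instance's value is irrelevant: fromkeys is a constructor in disguise,
# so calling it on d gives exactly what calling it on the type gives.
assert d.fromkeys([3, 5]) == {3: None, 5: None}
assert d.fromkeys([3, 5]) == dict.fromkeys([3, 5])

# The subset a naive reader might have expected must be spelled explicitly:
subset = {k: v for k, v in d.items() if k in (3, 5)}
assert subset == {3: 4, 5: 6}
```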
Perhaps this points to an issue with classmethods in general, due in part to the fact that they're still rather little used in Python -- callers of instance.method() may mostly expect that the result has something to do with the instance's value, rather than being the same as type(instance).method() -- little we can do about it at this point except instruction, I think. Alex From aleaxit at yahoo.com Tue Oct 28 11:35:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:37:30 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310281027.06138.aleaxit@yahoo.com> <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> Message-ID: <200310281735.41103.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:15 pm, Guido van Rossum wrote: ... > > > Every attempt should be made for the two copies to return exactly the > > > same stream of values. This is the pure tee() semantics. > > > > Yes, but iterators that run on underlying containers don't guarantee, > > in general, what happens if the container is mutated while the iteration > > is going on -- arbitrary items may end up being skipped, repeated, etc. > > So, "every attempt" is, I feel, too strong here. > > Maybe. > > I agree that for list and dict iterators, if the list is mutated, this > warrantee shall be void. OK, noticed -- and I'll clarify the PEP about this, thanks. > But I strongly believe that cloning a random iterator should cause two > identical streams of numbers, not two different random streams. If > you want two random streams you should create two independent > iterators. Most random number generators have a sufficiently small > amount of state that making a copy isn't a big deal. If it is hooked > up to an external source (e.g. 
/dev/random) then I'd say you'd have to > treat it as a file, and introduce explicit buffering. I really truly REALLY like this. I _was_ after all the one who lobbied Tim to add getstate and setstate to random.py, back in the not-so- long-ago time when I was a total Python newbie -- exactly because, being a NOT-newbie consumer of pseudo-random streams, I loathed and detested the pseudo-random generators that didn't allow me to reproduce experiments in this way. So, I entirely agree that if pseudo-random numbers are being consumed through a "pseudo-random iterator" the copy should indeed step through just the same numbers. Again, this will get in the PEP -- *thanks*! Btw, random.py doesn't seem to supply pseudo-random iterators -- easy to make e.g. with iter(random.random, None) [assuming you want a nonterminating one], but that wouldn't be copyable. Should something be done about that...? And as for NON-pseudo random numbers, such as those supplied by /dev/random and other external sources, yes, of course, they should NOT be copy'able -- best to let tee() work on them by making its buffer, or else wrap them in a buffer-to-file way if one needs to "snapshot" the sequence then re-"generate" a lot of it later for reproducibility purposes. I.e., absolute agreement. > > deepcopy exists for those cases where one is ready to pay a hefty > > price for guarantees of "decoupling", after all. > > But I don't propose that iterators support __deepcopy__. The use case > is very different. Yes, the use case of __deepcopy__ is indeed quite different (and to be honest it doesn't appear in my actual experience -- I can "imagine" some as well as the next man, but they'd be made out of whole cloth:-). But I was under the impression that you wanted them in PEP 323 too? Maybe I misunderstood your words. Should I take them out of PEP 323? In that case somebody else can later PEP that if they want, and I can basically wash my hands of them -- what do you think? 
Alex From aleaxit at yahoo.com Tue Oct 28 11:42:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:42:46 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> Message-ID: <200310281742.39349.aleaxit@yahoo.com> On Tuesday 28 October 2003 03:01 pm, Phillip J. Eby wrote: > At 05:29 AM 10/28/03 -0500, Raymond Hettinger wrote: > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > +1 on sortedcopy(), especially if it's usable as a method, e.g. > myList.sortedcopy(). (Note that that doesn't exclude it also being spelled > as 'list.sortedcopy(myList)'.) Please explain how it might work when the argument to list.sortedcopy is *NOT* an instance of type list, but rather a completely general sequence, as a classmethod will of course allow us to have. Maybe I'm missing some recent Python enhancements, but I thought that, if a method is fully usable as an instancemethod, then when called on the type it's an unbound method and will ONLY support being called with an instance of the type as the 1st arg. Hmmm... maybe one COULD make a custom descriptor that does support both usages... and maybe it IS worth making the .sorted (or whatever name) entry a case of exactly such a subtle custom descriptor... Alex From FBatista at uniFON.com.ar Tue Oct 28 11:52:48 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 28 11:53:43 2003 Subject: [Python-Dev] Decimal.py in sandbox Message-ID: Aahz wrote: #- The first thing you should do is talk with Eric Price #- (eprice@tjhsst.edu), author of the code. 
You don't need to #- use SF for #- now; CVS should be fine, but you should find out whether #- Eric would like #- to approve changes first. Eric Price wrote: #- Not really-- since school started, I haven't had much time #- to spare. #- I'll probably look over the changes at some time, but I #- wouldn't want to #- keep them waiting. So, to who may I send the changes? Should I send the whole staff at the end of the work, or keep feeding small changes? Should I send by email the diff results? Thanks for the answers. . Facundo -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031028/6f6af4d2/attachment.html From aleaxit at yahoo.com Tue Oct 28 11:55:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:56:11 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> References: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> Message-ID: <200310281755.44307.aleaxit@yahoo.com> On Tuesday 28 October 2003 02:57 pm, Phillip J. Eby wrote: > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > >AND, adaptation is not typecasting: > >e.g y=adapt("23", int) should NOT succeed. > > Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and > str doesn't have __conform__. But why would you intend that they not have > them in future? I'd be delighted to have the int type sprout __adapt__ and the str type sprout __conform__ -- but neither should accept this case, see below. > And, why do you consider adaptation *not* to be typecasting? I always > think of it as "give me X, rendered as a Y", which certainly sounds like a > description of typecasting to me. typecasting (in Python) makes a NEW object whose value is somehow "built" (possibly in a very loose sense) from the supplied argument[s], but need not have any more than a somewhat tangential relation with them. adaptation returns "the same object" passed as the argument, or a wrapper to it that makes it comply with the protocol. To give a specific example: x = file("foo.txt") now (assuming this succeeds) x is a readonly object which is an instance of file. The argument string "foo.txt" has "indicated", quite indirectly, how to construct the file object, but there's really no true connection between the value of the argument string and what will happen as that object x is read. 
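[Editor's sketch: Alex's distinction -- typecasting *builds a new object* from the argument, adaptation *wraps the same value* so it conforms to a protocol -- can be illustrated with a toy `adapt` in modern Python. This is not the PEP 246 machinery; `io.StringIO`/`io.IOBase` stand in for the era's `cStringIO` and "file protocol".]

```python
import io

def adapt(obj, protocol):
    # Toy adaptation: return the object itself if it already conforms,
    # else wrap the SAME value so it satisfies the protocol -- contrast
    # with typecasting, where file("foo.txt") would open a disk file
    # merely *indicated* by the string.
    if isinstance(obj, protocol):
        return obj
    if protocol is io.IOBase and isinstance(obj, str):
        # the string's own characters become the "file" contents
        return io.StringIO(obj)
    raise TypeError("cannot adapt %r" % (obj,))

x = adapt("foo.txt", io.IOBase)
print(x.read(3))   # foo -- the value itself, not data from disk
x.seek(0)
print(x.read(2))   # fo
```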
Thinking of what should happen upon:

    x = adapt("foo.txt", file)

what I envision is DEFINITELY the equivalent of:

    x = cStringIO.StringIO("foo.txt")

i.e., the value (aka object) "foo.txt", wrapped appropriately so as to
conform to the (readonly) "file protocol" (I can call x.read(3) and get
"foo", then x.seek(0) then x.read(2) and get "fo", etc).

Hmmm, that PEP definitely needs updating (including mentions of
PyProtocol as well as of this issue...)...!  I've been rather remiss
about it so far -- sorry.


Alex

From python at rcn.com  Tue Oct 28 12:09:55 2003
From: python at rcn.com (Raymond Hettinger)
Date: Tue Oct 28 12:14:04 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <200310281735.41103.aleaxit@yahoo.com>
Message-ID: <003401c39d76$4f36d080$1a3ac797@oemcomputer>

[Alex]
> Btw, random.py doesn't seem to supply pseudo-random iterators --
> easy to make e.g. with iter(random.random, None) [assuming you
> want a nonterminating one],

Probably a bit faster with:

    starmap(random.random, repeat(()))

> but that wouldn't be copyable.  Should
> something be done about that...?

No.

1) The use case is not typical.  Other than random.random() and
time.ctime(), it is rare to see functions of zero arguments that
usefully return an infinite sequence of distinct values.

2) If you need a copy, run it through tee().


Raymond

From tjreedy at udel.edu  Tue Oct 28 12:16:47 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue Oct 28 12:15:46 2003
Subject: [Python-Dev] Re: copysort patch, was RE: inline sort option
References: <000b01c39d3e$59af4de0$c807a044@oemcomputer>
	<200310281518.h9SFIj129025@12-236-54-216.client.attbi.com>
Message-ID: 

"Guido van Rossum" wrote in message
news:200310281518.h9SFIj129025@12-236-54-216.client.attbi.com...
> > Here are some alternatives: > > > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > > > > > I especially like the last one and all of them provide a distinction > > from list.sort(). > > While we're voting, I still like list.sorted() best, so please keep > that one in the list of possibilities. After thinking about it some more, I also prefer .sorted to suggested alternatives. I read it as follows: list(iterable) means 'make a list from iterable (preserving item order)' list.sorted(iterable) means 'make a sorted list from iterable' While I generally like verbs for method names, the adjective form works here as modifing the noun/verb 'list' and the invoked construction process. 'Inline' strikes me as confusing. 'Copy' and 'new' strike me as redundant noise since, in the new 2.2+ regime, 'list' as a typal verb *means* 'make a new list'. Terry J. Reedy Terry J. Reedy From tanzer at swing.co.at Tue Oct 28 12:23:29 2003 From: tanzer at swing.co.at (Christian Tanzer) Date: Tue Oct 28 12:24:01 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Tue, 28 Oct 2003 17:23:31 +0100." <200310281723.31940.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > On Tuesday 28 October 2003 04:18 pm, Guido van Rossum wrote: > > > Okay, this is the last chance to come-up with a name other than > > > sorted(). > > > > > > Here are some alternatives: > > > > > > inlinesort() # immediately clear how it is different from sort() > > > sortedcopy() # clear that it makes a copy and does a sort > > > newsorted() # appropriate for a class method constructor > > > > > > > > > I especially like the last one and all of them provide a distinction > > > from list.sort(). > > > > While we're voting, I still like list.sorted() best, so please keep > > that one in the list of possibilities. 
> > I also like list.sorted() -- but list.newsorted() is IMHO even a LITTLE > bit better, making it even clearer that it's giving a NEW list. Just > a _little_ preference, mind you. "sortedcopy" appears to me a BIT > less clear (what "copy", if the arg isn't a list...?), "inlinesort" worst. IMO, sorted is the clearest, all other proposals carry excess baggage making them less clear. > Perhaps this points to an issue with classmethods in > general, due in part to the fact that they're still rather > little used in Python -- callers of instance.method() > may mostly expect that the result has something to > do with the instance's value, rather than being the > same as type(instance).method() -- little we can do > about it at this point except instruction, I think. Or put the method into the metaclass. I'm using both classmethods and methods defined by metaclasses and didn't get any complaints about classmethods yet. -- Christian Tanzer http://www.c-tanzer.at/ From jjl at pobox.com Tue Oct 28 12:25:54 2003 From: jjl at pobox.com (John J Lee) Date: Tue Oct 28 12:27:22 2003 Subject: [Python-Dev] Re: [Web-SIG] Threading and client-side support In-Reply-To: <20031028124646.GB1095@rogue.amk.ca> References: <20031027150709.GA29045@rogue.amk.ca> <20031028124646.GB1095@rogue.amk.ca> Message-ID: [background for python-dev-ers: In the process of making my client-side cookie module a suitable candidate for inclusion in the standard library, I'm trying to make it thread-safe] On Tue, 28 Oct 2003 amk@amk.ca wrote: > On Tue, Oct 28, 2003 at 10:35:33AM +0000, John J Lee wrote: > > Thanks. So, in particular, httplib, urllib and urllib2 are thread-safe? > > No idea; reading the code would be needed to figure that out. That might not be helpful if the person reading it (me) has zero threading experience ;-) I certainly plan to gain that experience, but surely *somebody* already knows whether they're thread-safe? 
I presume they are, broadly, since a couple of violations of thread
safety are commented in urllib2 and urllib.  Right?


John

From guido at python.org  Tue Oct 28 12:42:16 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 12:42:24 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:42:39 +0100."
	<200310281742.39349.aleaxit@yahoo.com>
References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz>
	<5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com>
	<200310281742.39349.aleaxit@yahoo.com>
Message-ID: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com>

> Hmmm... maybe one COULD make a custom descriptor that does support
> both usages... and maybe it IS worth making the .sorted (or whatever name)
> entry a case of exactly such a subtle custom descriptor...

Thanks for the idea, I can use this as a perverted example in my talk
at Stanford tomorrow.  Here it is:

import new

def curry(f, x, cls=None):
    return new.instancemethod(f, x)

class MagicDescriptor(object):
    def __init__(self, classmeth, instmeth):
        self.classmeth = classmeth
        self.instmeth = instmeth
    def __get__(self, obj, cls):
        if obj is None:
            return curry(self.classmeth, cls)
        else:
            return curry(self.instmeth, obj)

class MagicList(list):
    def _classcopy(cls, lst):
        obj = cls(lst)
        obj.sort()
        return obj
    def _instcopy(self):
        obj = self.__class__(self)
        obj.sort()
        return obj
    sorted = MagicDescriptor(_classcopy, _instcopy)

class SubClass(MagicList):
    def __str__(self):
        return "SubClass(%s)" % str(list(self))

unsorted = (1, 10, 2)
print MagicList.sorted(unsorted)
print MagicList(unsorted).sorted()
print SubClass.sorted(unsorted)
print SubClass(unsorted).sorted()

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 12:51:59 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 12:52:07 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1
pep-0000.txt, 1.254, 1.255
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:35:41 +0100."
	<200310281735.41103.aleaxit@yahoo.com>
References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
	<200310281027.06138.aleaxit@yahoo.com>
	<200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com>
	<200310281735.41103.aleaxit@yahoo.com>
Message-ID: <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com>

> Yes, the use case of __deepcopy__ is indeed quite different (and
> to be honest it doesn't appear in my actual experience -- I can "imagine"
> some as well as the next man, but they'd be made out of whole cloth:-).
> But I was under the impression that you wanted them in PEP 323 too?
> Maybe I misunderstood your words.  Should I take them out of PEP 323?
> In that case somebody else can later PEP that if they want, and I can
> basically wash my hands of them -- what do you think?

I think it would be better if PEP 323 only did __copy__, so you can
remove all traces of __deepcopy__.  I don't recall what I said, maybe
I wasn't clear.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 13:00:14 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:00:46 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:03:58 +0100."
	<200310281703.58169.aleaxit@yahoo.com>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
	<20031028124042.GA22513@vicky.ecs.soton.ac.uk>
	<200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com>
	<200310281703.58169.aleaxit@yahoo.com>
Message-ID: <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>

> Armin Rigo posted an URL to a C extension module he has recently
> developed, and used.  That module is able to copy running instances
> of a generator.  Currently, it does so by just "knowing", and type by
> type dealing with, several types whose instances could be values
> referred to by the generator's frame.
I was suggesting extending this > so that, when values of other types are met, instead of automatically > failing (or, as I believe I recall Armin's extension does today, copying > the reference rather than the value), the copy process would...: I haven't seen Armin's code, but I don't believe that the type alone gives enough information about whether they should be copied. Consider a generator that uses a dict as a cache or memo for values it computes. Multiple instances of the generator share the dict, but for efficiency the generator references it in a local variable. This dict should not be copied when copying the generator's stack frame. But consider another generator that uses a dict to hold some of its state. This dict should be copied. > > > > Cool! Why don't you try copy.copy on types you don't automatically > > > > recognize and know how to deal with, BTW? That might make this > > and Armin said that __copy__ seems weak to him but __deepcopy__ > might not be: > > > > I will try. Note that only __deepcopy__ makes sense, as far as I can > > > tell, because there is too much state that really needs to be copied and > > > not shared in a generator (in particular, the sequence iterators from > > > 'for' loops). > > > So, now he went on to ask about __deepcopy__ and you answered: > > > > I'm not sure about how deep-copying should be defined for built-in > > > types. Should a built-in __deepcopy__ method try to import and call > > > copy.deepcopy() on the sub-items? This doesn't seem to be right. > > > > Almost -- you have to pass the memo argument that your __deepcopy__ > > received as the second argument to the recursive deepcopy() calls, to > > avoid looping on cycles. 
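[Editor's sketch: Guido's advice -- pass the received memo dict to every recursive deepcopy() call -- is exactly what prevents looping on cycles. An illustrative class, not taken from the thread:]

```python
import copy

class Node:
    def __init__(self, value, other=None):
        self.value = value
        self.other = other
    def __deepcopy__(self, memo):
        # Register the half-built copy in memo BEFORE recursing, and pass
        # memo down: when a cycle leads back to self, deepcopy() finds the
        # already-made copy in memo instead of recursing forever.
        dup = Node(self.value)
        memo[id(self)] = dup
        dup.other = copy.deepcopy(self.other, memo)
        return dup

a = Node(1)
b = Node(2, other=a)
a.other = b                      # a <-> b cycle
a2 = copy.deepcopy(a)
print(a2.other.other is a2)      # True: the cycle is reproduced, not looped on
```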
> > > Now, if Armin's code can only provide __deepcopy__ and not __copy__, > then it's probably of little use wrt the __copy__ functionality I > talk about in PEP 323 (which I still must revise to take into > account your feedback and Raymond's -- plan to get to that as soon > as I've cleared my mbox) -- the memory and time expenditure is > likely to be too high for that. Right. > It's still going to be a cool hack, well worth "publishing" as such, As a third-party module? Sure. > and probably able to be "user-installed" as the way deepcopy deals > with generators even though generators themselves may not sprout a > __deepcopy__ method themselves (fortunately, copy.copy does a lot of > "ad-hoc protocol adaptation" -- it's occasionally a bit rambling or > hard to follow, but often allows plugging in "third party copiers" > for types which their authors hadn't imagined would be copied or > deep copied, so that other innocent client code which just calls > copy.copy(x) will work... essentially how "real" adaptation would > work, except that registering a third-party protocol adapter would > be easier:-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Oct 28 13:10:10 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 13:10:13 2003 Subject: [Python-Dev] Re: [Web-SIG] Threading and client-side support In-Reply-To: References: <20031027150709.GA29045@rogue.amk.ca> <20031028124646.GB1095@rogue.amk.ca> Message-ID: <20031028181009.GA20129@panix.com> On Tue, Oct 28, 2003, John J Lee wrote: > On Tue, 28 Oct 2003 amk@amk.ca wrote: >> On Tue, Oct 28, 2003 at 10:35:33AM +0000, John J Lee wrote: >>> >>> Thanks. So, in particular, httplib, urllib and urllib2 are thread-safe? >> >> No idea; reading the code would be needed to figure that out. 
> That might not be helpful if the person reading it (me) has zero
> threading experience ;-)
>
> I certainly plan to gain that experience, but surely *somebody*
> already knows whether they're thread-safe?  I presume they are,
> broadly, since a couple of violations of thread safety are commented
> in urllib2 and urllib.  Right?

Generally speaking, any code that does not rely on global objects is
thread-safe in Python.  For more information, let's take this to
python-list.
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From mcherm at mcherm.com  Tue Oct 28 13:10:26 2003
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Oct 28 13:10:43 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
Message-ID: <1067364626.3f9eb11204e4f@mcherm.com>

Alex Martelli writes:
> BTW, I think I should point out one POSSIBLE problem with
> classmethods -- since unfortunately they CAN be called on an
> instance, and will ignore that instance, this may confuse an
> unsuspecting user.

Alex, that's a good point, and one we should be careful of.  However,
(as you said) I suspect that the unsuspecting users will always call
it with zero arguments.  So long as that call always fails (preferably
with a useful error message) I think we should be OK.

So what if we make the error message maximally useful?  Something
like this:

_privateObj= Object()
def sorted(iteratorToSort=_privateObj):
    if iteratorToSort == _privateObj:
        raise TypeError('sorted is a classmethod of list ' +
                        'taking an iterator argument')
    else:
        <... normal body here ...>

The only thing I've done here was to make the text of the message
more helpful (I've even left the type of the exception as TypeError
even though that might not be the most useful thing).  Okay...
there's one other change...
if you pass 2 or more arguments, then
it will complain that it expected "at least 0 arguments", but try
it once with 0 arguments and you'll immediately understand.

-- Michael Chermside

From aleaxit at yahoo.com  Tue Oct 28 13:24:54 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 13:25:05 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <003401c39d76$4f36d080$1a3ac797@oemcomputer>
References: <003401c39d76$4f36d080$1a3ac797@oemcomputer>
Message-ID: <200310281924.54600.aleaxit@yahoo.com>

On Tuesday 28 October 2003 06:09 pm, Raymond Hettinger wrote:
> [Alex]
>
> > Btw, random.py doesn't seem to supply pseudo-random iterators --
> > easy to make e.g. with iter(random.random, None) [assuming you
> > want a nonterminating one],
>
> Probably a bit faster with:
>
> starmap(random.random, repeat(()))

Yep, saving the useless compare does help:

[alex@lancelot bo]$ timeit.py -c -s'import random' -s'import itertools as it' \
> -s'xx=it.starmap(random.random, it.repeat(()))' 'xx.next()'
1000000 loops, best of 3: 1.37 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import random' -s'import itertools as it' -s'xx=iter(random.random, None)' 'xx.next()'
1000000 loops, best of 3: 1.62 usec per loop

Renewed compliments for itertools' speed!-)

> > but that wouldn't be copyable.  Should
> > something be done about that...?
>
> No.
>
> 1) The use case is not typical.  Other than random.random() and
> time.ctime(), it is rare to see functions of zero arguments that
> usefully return an infinite sequence of distinct values.

Sorry, I must have been unclear -- I had no "zero arguments"
limitation in mind.  Rather, I thought of something like:

    it = random.iterator(random.Random().sample, range(8), 3)

now each call to it.next() would return a random sample of 3 numbers
from range(8) w/o repetition.  I.e., the usual (callable, *args) idiom
(as in Tkinter widgets' .after method, etc, etc).
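[Editor's sketch: the iterator-on-a-callable Alex describes might look like the class below, in modern Python. `random.iterator` is his hypothetical name, not a real API; note the comment on why this sketch's `__copy__` is only part of what he asks for.]

```python
import copy
import random

class call_iterator(object):
    """Endless iterator yielding func(*args) at each step; __copy__ is O(1)."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args
    def __iter__(self):
        return self
    def __next__(self):
        return self.func(*self.args)
    def __copy__(self):
        # NOTE: a truly independent copy would also need to snapshot the
        # state behind a bound method (Alex's im_self/__getstate__ point);
        # this sketch shares that underlying state between the copies.
        return call_iterator(self.func, *self.args)

it = call_iterator(random.Random(42).sample, range(8), 3)
sample = next(it)        # a 3-number sample from range(8), w/o repetition
clone = copy.copy(it)    # O(1), no tee()-style O(N) buffering
```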
What I'm saying is that there is no reason type(it) shouldn't support a __copy__ method -- as long as the underlying callable sports an im_self which also exposes a __getstate__ method, at least. Come to think of this, there may be other use cases for this general approach than "random iterators". Do you think that an iterator on a callable *and args for it* would live well in itertools? That module IS, after all, your baby... > 2) If you need a copy, run it through tee(). That's exactly what I plan to do -- but I would NOT want tee() to consume O(N) memory [where N is how far out of step the two iterators may get] in those cases where the iterator argument DOES have a __copy__ method that can presumably produce a usable copy with O(1) memory expenditure. Thus, I'd like itertools.tee to start by checking if its argument iterator "is properly copyable". Guido has pointed out that it would not be safe to just try copy.copy(it), because that MIGHT produce a copy that does not satisfy "iterator copying" semantics requirements. As an example, he has repeatedly mentioned "an iterator on a tree which keeps ``a stack of indices''". 
Here, I think, is an indication of the kind of thing he fears (code
untested beyond running it on that one example):

import copy

class TreeIter(object):
    def __init__(self, tree):
        self.tree = [tree]
        self.indx = [-1]
    def __iter__(self):
        return self
    def next(self):
        if not self.indx:
            raise StopIteration
        self.indx[-1] += 1
        try:
            result = self.tree[-1][self.indx[-1]]
        except IndexError:
            self.tree.pop()
            self.indx.pop()
            if not self.indx:
                raise StopIteration
            return self.next()
        if type(result) is not list:
            return result
        self.tree.append(result)
        self.indx.append(-1)
        return self.next()

x = [ [1,2,3], [4, 5, [6, 7, 8], 9], 10, 11, [12] ]

print 'all items, one:',
for i in TreeIter(x):
    print i,
print

print 'all items, two:',
it = TreeIter(x)
for i in it:
    print i,
    if i==6: cop = copy.copy(it)
print

print '>=6 items, one:',
for i in cop:
    print i,
print

print '>=6 items, two:',
it = TreeIter(x)
for i in it:
    if i==6:
        cop = copy.deepcopy(it)
        for i in cop: print i,
print

Output is:

[alex@lancelot bo]$ python treit.py
all items, one: 1 2 3 4 5 6 7 8 9 10 11 12
all items, two: 1 2 3 4 5 6 7 8 9 10 11 12
>=6 items, one:
>=6 items, two: 7 8 9 10 11 12

i.e., the "iterator copy" returned by copy.copy does NOT satisfy
requirements!  (I've added the last tidbit to show that the one
returned by copy.deepcopy WOULD satisfy them, but, it's clearly WAY
too memory-costly to consider, far worse than tee()!!!!).

So, "safely copying an iterator" means ensuring the iterator's author
HAS thought specifically about allowing a copy -- in which case, we
can (well, we _must_:-) trust that they have implemented things
correctly.  Just using copy.copy(it) MIGHT fall afoul of a default
shallow copy not being sufficient.  Perhaps we can get by with
checking for, and using if found, a __copy__ method only.  Is there a
specific need to support __setstate__ etc, here?  I hope not -- still,
these, too, are things I must make sure are mentioned in the PEP.
Alex From guido at python.org Tue Oct 28 13:28:32 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 13:28:42 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 16:39:08 +0100." <200310281639.08240.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> <200310281639.08240.aleaxit@yahoo.com> Message-ID: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> > On Tuesday 28 October 2003 04:16 pm, Guido van Rossum wrote: > > > So perhaps for 2.3 we should just apologetically note the anomaly > > > in the docs, and for 2.4 forbid the former case, i.e., require both > > > __mul__ AND __rmul__ to exist if one wants to code sequence > > > classes that can be multiplied by integers on either side...? > > > > > > Any opinions, anybody...? > > > > What's wrong with the status quo? So 3*x is undefined, and it happens > > to return x*3. Is that so bad? > > Where is it specified that 3*x "is undefined" when x's type exposes > __mul__ but not __rmul__ ? Sorry, I don't understand the viewpoint > you seem to imply here. If x's type exposed no __add__ but "it just > so happened" that x+23 always returned 12345 -- while every other > addition, as expected, failed -- would you doubt the lack of a normal > and reasonably expected exception is bad? > > I think that if Python returns "arbitrary" results, rather than raising an > exception, for operations that "should" raise an exception, that is > surely very bad -- it makes it that much harder for programmers to > debug the programs they're developing. If there's some doubt about > the words I've put in hyphens -- that treating x*y just like y*x only for > certain values of type(y) isn't arbitrary or shouldn't raise -- then we > can of course discuss this, but isn't the general idea correct? 
>
> Now, the docs currently say, about sequences under
> http://www.python.org/doc/current/ref/sequence-types.html :
> """
> sequence types should implement ... multiplication (meaning repetition) by
> defining the methods __mul__(), __rmul__() and __imul__() described below;
> they should not define __coerce__() or other numerical operators.
> """
> So, a sequence-emulating type that implements __mul__ but not __rmul__
> appears to violate that "should".
>
> The description of __mul__ and __rmul__ referred to seems to be
> that at http://www.python.org/doc/current/ref/numeric-types.html .
>
> It says that methods corresponding to operations not supported by
> a particular kind of number should be left undefined (as opposed
> to the behavior of _attempts at those operations_ being undefined),
> so if I had a hypothetical number type X such that, for x instance
> of X and an integer k, x*k should be supported but k*x shouldn't,
> isn't this a recommendation to not write __rmul__ in X ...?
>
>
> Besides, this weird anomaly is typical of newstyle classes only.
> Consider:
>
> >>> class X:
> ...     def __mul__(self, other): return 23
> ...
> >>> x=X()
> >>> x*7
> 23
> >>> 7*x
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: unsupported operand type(s) for *: 'int' and 'instance'
> >>>
>
> ALL wonderful, just as expected, hunky-dory.  But now, having
> read that newstyle classes are better, I want to make X newstyle --
> can't see any indication in the docs that I shouldn't -- and...:
>
> >>> class X(object):
> ...     def __mul__(self, other): return 23
> ...
> >>> x=X()
> >>> x*7
> 23
> >>> 7*x
> 23
> >>>
>
> *eep*!  Yes, it DOES seem to be that this is QUITE bad indeed.
>
>
> Alex

You're making a mountain of a molehill here, Alex.  I know that in
group theory there are non-Abelian groups (for which AB != BA), but
I've never encountered one myself in programming; more typical such
non-commutative operations are modeled as __add__ rather than as
__mul__.
Anyway, the real issue AFAICT is not that people depend on __rmul__'s
absence to raise a TypeError, but that people learn by example and
find __rmul__ isn't necessary by experimenting with integers.

The reason why it works at all for integers without __rmul__ is
complicated; it has to do with very tricky issues in trying to
implement multiplication of a sequence with an integer.  That code has
gone through a number of iterations, and every time someone eventually
found a bug in it, so I'd rather leave the __rmul__ blemish than
uproot it again.  If you can come up with a fix that doesn't break
sequence repetition I'd be open to accepting it (for 2.4 only, in 2.3
there may be too much code depending on the bug) but only after
serious review -- and not by me, because I'm not all that familiar
with all the subtleties of that code any more. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com  Tue Oct 28 13:30:46 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 13:31:50 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <1067364626.3f9eb11204e4f@mcherm.com>
References: <1067364626.3f9eb11204e4f@mcherm.com>
Message-ID: <200310281930.46518.aleaxit@yahoo.com>

On Tuesday 28 October 2003 07:10 pm, Michael Chermside wrote:
> Alex Martelli writes:
> > BTW, I think I should point out one POSSIBLE problem with
> > classmethods -- since unfortunately they CAN be called on an
> > instance, and will ignore that instance, this may confuse an
> > unsuspecting user.
>
> Alex, that's a good point, and one we should be careful of.

Thanks, that's why I brought the issue up.

> However, (as you said) I suspect that the unsuspecting users
> will always call it with zero arguments.  So long as that call
> always fails (preferably with a useful error message) I think
> we should be OK.
>
> So what if we make the error message maximally useful?
Something

*VERY good idea*

> like this:
>
> _privateObj= Object()
> def sorted(iteratorToSort=_privateObj):
>     if iteratorToSort == _privateObj:
>         raise TypeError('sorted is a classmethod of list ' +
>                         'taking an iterator argument')
>     else:
>         <... normal body here ...>
>
> The only thing I've done here was to make the text of the message
> more helpful (I've even left the type of the exception as TypeError
> even though that might not be the most useful thing).  Okay...
> there's one other change... if you pass 2 or more arguments, then
> it will complain that it expected "at least 0 arguments", but try
> it once with 0 arguments and you'll immediately understand.

Could we perhaps deal with the latter issue by adding a *args to
sorted's signature, and changing the condition on the 'if' to:

    if iteratorToSort is _privateObj or args:
        raise TypeError # etc etc

?  Maybe w/"a single iterator argument" in the error message's text?
{alternatively, if we don't care about keyword args, just having the
*args in the signature and checking "if len(args) != 1: ..." might be
OK}


Alex

From guido at python.org  Tue Oct 28 13:34:09 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:34:16 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:41:58 GMT."
	<20031028174158.GA19133@vicky.ecs.soton.ac.uk>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
	<200310270851.02495.aleaxit@yahoo.com>
	<20031027103540.GA27782@vicky.ecs.soton.ac.uk>
	<200310271609.03819.aleaxit@yahoo.com>
	<20031028124042.GA22513@vicky.ecs.soton.ac.uk>
	<200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com>
	<20031028174158.GA19133@vicky.ecs.soton.ac.uk>
Message-ID: <200310281834.h9SIY9j29592@12-236-54-216.client.attbi.com>

> Ok on this point, the question was whether (the error-checking obfuscated
> equivalent of)
>
>     PyObject *m = PyImport_ImportModule("copy");
>     PyObject_CallMethod(m, "deepcopy", x, memo);
>
> should be done inside a built-in __deepcopy__ implementation.  It looks like
> it will make a hell of a lot of quite slow calls to PyImport_ImportModule()
> for structures like lists of generators, which is the kind of structure you
> are interested in when you deepcopy generators.

Yeah, you should ideally be able to cache the results of the import,
except then your code wouldn't work when there are multiple
interpreters.  Maybe using

    PyObject *modules = PySys_GetObject("modules");
    PyObject *m = PyDict_Lookup(modules, "copy");

would be faster?  PySys_GetObject() doesn't waste much time. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 13:37:56 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:38:03 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: Your message of "Tue, 28 Oct 2003 10:28:32 PST."
	<200310281828.h9SISW529541@12-236-54-216.client.attbi.com>
References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz>
	<200310281037.44424.aleaxit@yahoo.com>
	<200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com>
	<200310281639.08240.aleaxit@yahoo.com>
	<200310281828.h9SISW529541@12-236-54-216.client.attbi.com>
Message-ID: <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com>
I know that in > group theory there are non-Abelian groups (for which AB != BA), but > I've never encountered one myself in programming; more typical such > non-commutative operations are modeled as __add__ rather than as > __mul__. I need to give myself a small slap on the forehead head, because of course non-square matrix multiplication is an excellent example where AB != BA. However even there, Ax == xA when x is a singleton, and the issue only arises for integers, so I still don't think there are use cases. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 28 13:43:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 13:44:06 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281639.08240.aleaxit@yahoo.com> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> Message-ID: <200310281943.57278.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:28 pm, Guido van Rossum wrote: ... > You're making a mountain of a molehill here, Alex. I know that in You have a point: when one is in love (and I still _am_ madly in love with Python!-), it's hard to admit of imperfections in the loved one:-). Even a tiny defect...:-). > group theory there are non-Abelian groups (for which AB != BA), but > I've never encountered one myself in programming; more typical such > non-commutative operations are modeled as __add__ rather than as > __mul__. I don't remember ever coding a __mul__ that I WANTED to be non-commutative, right. > Anyway, the real issue AFAICT is not that people depend on __rmul__'s > absence to raise a TypeError, but that people learn by example and > find __rmul__ isn't necessary by experimenting with integers. 
Or more seriously: they write what LOOK like perfectly adequate unit tests, but the numbers they try in "number * x" happen to be ints; so the unit tests pass -- but their code is broken because they forgot the __rmul__ and the unittests-with-ints didn't catch that. > The reason why it works at all for integers without __rmul__ is > complicated; it has to do with very tricky issues in trying to > implement multiplication of a sequence with an integer. That code has Yes, I think I understand some of that -- I included the analysis of the bug in my bugreport on SF. > gone through a number of iterations, and every time someone eventually > found a bug in it, so I'd rather leave the __rmul__ blemish than > uproot it again. If you can come up with a fix that doesn't break > sequence repetition I'd be open to accepting it (for 2.4 only, in 2.3 > there may be too much code depending on the bug) but only after > serious review -- and not by me, because I'm not all that familiar > with all the subtleties of that code any more. :-( I do have one weird idea that might help here (for 2.4), but I'd better post that separately because I suspect it's going to fuel its own long discussion thread. As things are, without an ability to distinguish a sequence from a number securely, and IF sequences must work w/o __rmul__ (but they didn't in classic classes...? and the docs don't indicate that...?) then I'm stumped. Who'd be the right people to review such proposed changes, btw? Alex From mcherm at mcherm.com Tue Oct 28 13:47:27 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 28 13:47:28 2003 Subject: [Python-Dev] replacing 'global' Message-ID: <1067366846.3f9eb9bf1ee8e@mcherm.com> Alex lists this flaw: > -- it's the wrong keyword, doesn't really _mean_ "global" Guido says: > I haven't heard anyone else in this thread agree with you on that > one. I certainly don't think it's of earth-shattering ugliness. Well, I agree.
But I also agree with your point that it's certainly not earth-shattering... just a little confusing to newbies, who expect "global" to mean "global", not "module-wide". Not worth changing the language, but if you were to re-invent Python from the ground up, I'd consider it. Greg Ewing writes: > We'd be having two kinds of assignment, and there's no > prior art to suggest which should be = and > which :=. That's the "arbitrary" part. No one will ever confuse these, because no one will learn about := until long after = is well understood. The one spelled "=" will be "the normal one" and ":=" will be "the funny one". Just mentions: > (Alex noted in private mail that one disadvantage of this idea is that > it makes using globals perhaps TOO easy...) Indeed, that would be my concern. At least the word "global" has strong negative associations (mostly undeserved in this case since it really means "module-level" not "global" ;-). Skip writes: > It seems that use > of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I agree. When writing a function, we ALLOW name shadowing because we want the author of the function to be able to use local variables without having to know anything about the outer scope(s). But if the author of the function ALREADY KNOWS that there's an outer variable named "x" (MUST know it since she is modifying that outer variable), then there's no excuse for the poor choice of names... the local variable should be renamed to avoid the conflict. The "global" statement as it currently exists enforces this... if one assignment in a scope is "global", then ALL will be. I maintain that the use of := vs = should be the same... all or none! Despite Just's original preference for thinking of it as "find someplace and rebind", I would always wind up thinking of this as the "bind in some outer scope" operator. ----- Anyhow, that's as far as I got in reading the discussion so far. Whew!
What a lot of traffic! -- Michael Chermside From pje at telecommunity.com Tue Oct 28 13:47:06 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 13:48:44 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310281755.44307.aleaxit@yahoo.com> References: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> At 05:55 PM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 02:57 pm, Phillip J. Eby wrote: > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > > >AND, adaptation is not typecasting: > > >e.g y=adapt("23", int) should NOT succeed. > > > > Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and > > str doesn't have __conform__. But why would you intend that they not have > > them in future? > >I'd be delighted to have the int type sprout __adapt__ and the str type >sprout __conform__ -- but neither should accept this case, see below. You didn't actually give any example of why 'adapt("23",int)' shouldn't return 23, just why adapt("foo",file) shouldn't return a file. Currently, Python objects may possess an __int__ method for conversion to integers, and a __str__ method for conversion to string. So, it seems to me that for such objects, 'adapt(x,int)' should be equivalent to x.__int__() and 'adapt(x,str)' should be equivalent to x.__str__(). So, there is already a defined protocol within Python for conversion to specific types, with well-defined meaning. One might argue that since it's already possible to call the special method or invoke the type constructor, that it's not necessary for there to be an adapt() synonym for them. 
However, it's also possible to get an object's attribute or call an arbitrary function by exec'ing a dynamically constructed string instead of using getattr() or having functions as first class objects. So, I don't see any problem with "convert to integer" being 'int(x)' and yet still being able to spell it 'adapt(x,int)' in the circumstance where 'int' is actually a variable or parameter, just as one may use 'getattr(x,y)' when the attribute to be gotten is a variable. > > And, why do you consider adaptation *not* to be typecasting? I always > > think of it as "give me X, rendered as a Y", which certainly sounds like a > > description of typecasting to me. > >typecasting (in Python) makes a NEW object whose value is somehow >"built" (possibly in a very loose sense) from the supplied argument[s], >but need not have any more than a somewhat tangential relation with >them. adaptation returns "the same object" passed as the argument, >or a wrapper to it that makes it comply with the protocol. I don't understand the dividing line here. Perhaps that's because Python doesn't really *have* an existing notion of typecasting as such, there are just constructors (e.g. int) and conversion methods (e.g. __int__). However, conversion methods and even constructors of immutable types are allowed to be idempotent. 'int(x) is x' can be true, for example. So, how is that different? >To give a specific example: > >x = file("foo.txt") > >now (assuming this succeeds) x is a readonly object which is an >instance of file. The argument string "foo.txt" has "indicated", quite >indirectly, how to construct the file object, but there's really no true >connection between the value of the argument string and what >will happen as that object x is read. 
> >Thinking of what should happen upon: > >x = adapt("foo.txt", file) > >what I envision is DEFINITELY the equivalent of: > >x = cStringIO.StringIO("foo.txt") > >i.e., the value (aka object) "foo.txt", wrapped appropriately so as >to conform to the (readonly) "file protocol" (I can call x.read(3) >and get "foo", then x.seek(0) then x.read(2) and get "fo", etc). I don't see how any of this impacts the question of whether adapt(x,int) == int(x). Certainly, I agree with you that adapt("foo",file) should not equal file("foo"), but I don't understand what one of these things has to do with the other. From aleaxit at yahoo.com Tue Oct 28 13:49:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 13:50:00 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> Message-ID: <200310281949.50592.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:37 pm, Guido van Rossum wrote: > > You're making a mountain of a molehill here, Alex. I know that in > > group theory there are non-Abelian groups (for which AB != BA), but > > I've never encountered one myself in programming; more typical such > > non-commutative operations are modeled as __add__ rather than as > > __mul__. > > I need to give myself a small slap on the forehead, because of > course non-square matrix multiplication is an excellent example where > AB != BA. However even there, Ax == xA when x is a singleton, and the > issue only arises for integers, so I still don't think there are use > cases. There may be no "perfectly correct code" that will ever notice 3*x weirdly works. But would that make it acceptable to return 42, rather than raise IndexError, when a list of length exactly 33 is indexed by index 666?
That, too, might "have no practical use cases" for perfectly correct code. But programmers make mistakes, and one of Python's strengths is that it does NOT (crash, hang, or) return weird wrong results when they do -- most often it raises appropriate exceptions, which make it easy to diagnose and fix one's mistakes. Thus, it troubles me that we can't do it here. I know it's hard to fix (I've stared at that code for QUITE a while...). But "deducing" from that difficulty that the error's not worth fixing seems like a classic case of "sour grapes":-). Alex From guido at python.org Tue Oct 28 14:08:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 14:08:12 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 19:49:50 +0100." <200310281949.50592.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> <200310281949.50592.aleaxit@yahoo.com> Message-ID: <200310281908.h9SJ84S29755@12-236-54-216.client.attbi.com> > I know it's hard to fix (I've stared at that code for QUITE a > while...). But "deducing" from that difficulty that the error's not > worth fixing seems like a classic case of "sour grapes":-). I dunno. As language warts go I find this one minuscule, and the effort you spend on rhetoric to convince me a waste of breath. My position is: I understand that it's a wart, I just don't think I know of a good solution, and I can live with the status quo just fine.
--Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Tue Oct 28 14:14:09 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Oct 28 14:15:49 2003 Subject: [Python-Dev] PEP 322: Generator Expressions (implementation team) Message-ID: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> Guido has accepted the generator expressions pep, so it's time for me to form an implementation team. Any volunteers are welcome to email me directly. Alex, Brett, Neal, Jeremy? Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031028/f67c5990/attachment.html From jacobs at penguin.theopalgroup.com Tue Oct 28 14:18:16 2003 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Oct 28 14:18:20 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: Message-ID: On Tue, 28 Oct 2003, Batista, Facundo wrote: > Aahz wrote: > #- The first thing you should do is talk with Eric Price > #- (eprice@tjhsst.edu), author of the code. You don't need to > #- use SF for > #- now; CVS should be fine, but you should find out whether > #- Eric would like > #- to approve changes first. > > Eric Price wrote: > #- Not really-- since school started, I haven't had much time > #- to spare. > #- I'll probably look over the changes at some time, but I > #- wouldn't want to > #- keep them waiting. > > So, to whom may I send the changes? > > Should I send the whole stuff at the end of the work, or keep feeding small > changes? > > Should I send by email the diff results? I'll be happy to review your changes, so long as the changesets are kept fairly focused. We can then feed them through one of the regular committers. Just e-mail them to me directly in unified format (-u) with a simple explanation of what is being accomplished.
Thanks, -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From fperez at colorado.edu Tue Oct 28 14:24:25 2003 From: fperez at colorado.edu (Fernando Perez) Date: Tue Oct 28 14:24:30 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug Message-ID: <3F9EC269.6080108@colorado.edu> Hi all, I just wanted to add a small comment on this discussion, which I'd been following via the newsgroup mirror. Python is picking up a lot of steam in the scientific computing community, and in science it is quite common to encounter non-commutative multiplication. Just to remind Guido from his old math days :), even for square matrices, AB!=BA in most cases. The Matrix class supplied with Numpy is one example of a widely used library which implements '*' as a non-commutative multiplication operator. From what I've read, I realize that this is quite a subtle and difficult bug to treat. I just wanted to add a data point for you folks to consider. Please don't dismiss non-commutative multiplication as too much of an obscure corner case, it is a daily occurrence for a growing number of python users (scientists). Thanks, Fernando. From martin at v.loewis.de Tue Oct 28 15:37:54 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Tue Oct 28 15:38:04 2003 Subject: [Python-Dev] Re: 2.3.3 plans In-Reply-To: <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> References: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > I hope someone here can work with you on getting the patches in > acceptable shape. You should start by uploading them to the patch > manager in SourceForge. Correct. In addition, the patches should *first* be integrated into the CVS head, and then backported to 2.3. 
There is the possibility that cross-compilation support breaks native compilation procedures, which would not be acceptable for a point release. Regards, Martin From aleaxit at yahoo.com Tue Oct 28 15:55:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 15:58:12 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> References: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> Message-ID: <200310282155.41407.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:47 pm, Phillip J. Eby wrote: ... > You didn't actually give any example of why 'adapt("23",int)' shouldn't > return 23, just why adapt("foo",file) shouldn't return a file. Which is sufficient to show that, IN GENERAL, adapt(x, sometype) should not just be the equivalent of sometype(x), as you seemed (and, below, still seem) to argue. Now, if you want to give positive reasons, specific, compelling use-cases, to show why for SOME combinations of type(x) and sometype that general rule should be violated, go ahead, but the burden of proof is on you. If you do want to try and justify such specific-cases exceptions, remember: "adapt(x, foo)" is specified as returning "x or a wrapper around x", and clearly a new object of type foo with no actual connection to x is neither of those. That's a "formal" reasoning from the PEP's actual current text. But perhaps informal reasoning may prove more convincing -- let's try. adaptation is *NOT* conversion -- it's not the creation of a new object that will thereafter live a life separate from the original one. This part is not relevant when the objects are immutable, but it's quite relevant to your GENERAL idea of, e.g.: > 'adapt(x,str)' should be equivalent to x.__str__(). Say that x is mutable. 
Then, "adapting x to the string protocol", if supported, should give a wrapper object, supporting all string-object methods, in a way that any call to such methods relies on the current-at-call-time value of x. But, still on that general idea of yours that I quote above, there is worse, MUCH worse. Consider: an object's type often supports a __str__ that, as per its specs in the docs, is "the ``informal'' string representation of an object ... convenient or concise representation may be used instead". The docs make it AMPLY clear that the purpose of __str__ is STRICTLY for the object's type to give a (convenient, concise, possibly quite incomplete and inaccurate) HUMAN-READABLE representation of the object. To assert that this is in any way equivalent to a claim, on the object type's part, that its instances can "adapt themselves to the string protocol", beggars belief. It borders, I think, on the absurd, to maintain that, for example, "<open file 'foo.txt', mode 'r' at 0x...>" *IS* my open file object "adapted to" the string protocol. It's clearly a mere human readable representation, a vague phantom of the object itself. It should be obvious that, just as "adapting a string to the (R/O) file protocol" means wrapping it in cStringIO.StringIO, so the reverse adaptation, "adapting a file to the string protocol", should utilize a wrapper object that presents the file's data with all string object methods, for example via mmap. > So, there is already a defined protocol within Python for conversion to > specific types, with well-defined meaning. One might argue that since it's Conversion is one thing, adaptation is a different thing. Creating a new object "somehow related" to an existing one -- i.e., conversion -- is a very different thing from "wrapping" an existing object to support a different protocol -- adaptation.
Consider another typical case:

>>> import array
>>> x = array.array('c', 'ciao')
>>> L = list(x)
>>> x.extend(array.array('c', 'foop'))
>>> x
array('c', 'ciaofoop')
>>> L
['c', 'i', 'a', 'o']
>>>

See the point? CONVERSION, aka construction, aka typecasting, i.e. list(x), has created a new object, based on what WERE the contents of x at the time of conversion, but INDEPENDENT from it henceforwards. Adaptation should NOT work that way: adapt(x, list) would, rather, return a wrapper, providing listlike methods (some, like pop or remove, would delegate to x's own methods -- others, like sort, would require more work) and _eventually performing actual operations on x_, NOT on a separate thing that once, a long time ago, was constructed by copying it. Thus, I see foo(x) and adapt(x, foo) -- even in cases where foo is a type -- as GENERALLY very different. If you have SPECIFIC use cases in mind where it would be clever to make the two operations coincide, you still haven't made them; I only heard vague generalities about how adapt(x, y) "should" work without ANY real support for them. If the code that requests adaptation is happy, as a fall-back, to have (e.g.) "<open file 'foo.txt', mode 'r' at 0x...>" as the "ersatz adaptation" of a file instance to str, for example, it can always do the fall-back itself, e.g.

    try:
        z = adapt(x, y)
    except TypeError:
        try:
            z = y(x)
        except (TypeError, ValueError):
            # whatever other desperation measures it wants to try

To have adapt itself imply such measures would be a disaster, and make adaptation basically unusable in all cases where one might have (e.g.) "y is str". > I don't understand the dividing line here. Perhaps that's because Python > doesn't really *have* an existing notion of typecasting as such, there are > just constructors (e.g. int) and conversion methods (e.g. > __int__).
Yeah, that's much like C++, except C++ is more general in terms of conversion methods -- not only can a constructor for type X accept a Y argument (or const Y&, equivalently), but type Y can also always choose to provide an "operator X()" to typecast its instances to the other type [I think I recall that if BOTH types try to cooperate in such ways you end up with an ambiguity error, though:-)]. That's in contrast to the specific few 'conversion methods' that Python supports only for a small set of numeric types as the destination of the conversion. Either the single-argument constructor or the operator may get used when you typecast (static_cast<X>(y) where y is an instance of Y). There isn't all that much difference between C++'s approach and Python's here, except for C++'s greater generality and the fact that in Python you always use notation X(y) to indicate the typecasting request. ("typecast" is not a C++ term any more than it's Python's -- I think it's only used in some obscure languages such as CIAO, tools like Flex/Harpoon, Mathworks, etc -- so, sorry if my use was obscure). One important difference: in C++, you get to define whether a one-argument constructor gets to be evaluated "implicitly", when an object of type X is required and one of type Y is supplied instead, or not. If the constructor is declared explicit, then it ONLY gets called for EXPLICIT typecasts such as X(y). In Python, we think EIBNI, and therefore typecasts are explicit. We do NOT "adapt" a float f to int when an int is required, as in somelist[f]: we raise a TypeError -- if you want COERCION, aka CONVERSION, to an int, with possible loss of information etc, you EXPLICITLY code somelist[int(f)]. Your proposal that adaptation be, when possible, implemented by conversion, goes against the grain of that good habit and principle.
Adaptation in general is not conversion -- when you know you want, or at least can possibly tolerate as a fallback, total conversion, ASK for it, explicitly -- perhaps as a fallback if adaptation fails, as above. Having "adapt(x, y)" just basically duplicate some possible cases of y(x) would be a serious diminution of adaptation's potential and usefulness. > However, conversion methods and even constructors of immutable > types are allowed to be idempotent. 'int(x) is x' can be true, for > example. So, how is that different? it's part of the PEP that, if isinstance(x, y), then normally x is adapt(x, y) [[ with a specific exception for "non substitutable subclasses" whose usecases I do not know -- anyway, such subclasses would need to be _specifically_ "exempted" from the general rule, e.g. by providing an __adapt__ that raises as needed ]]. So, calling y(x) will be wrong EXCEPT when type y is immutable AND it's EXACTLY the case that "type(x) is y", NOT a subclass, otherwise:

>>> class xx(int): pass
...
>>> w = xx(23)
>>> type(w)
<class '__main__.xx'>
>>> type(int(w))
<type 'int'>
>>>

... the 'is' constraint is lost, despite the fact that xx IS quite obviously "substitutable" and has requested NO exception to the rule, AT ALL. Again: adaptation is not conversion -- and this is NOT about the admitted defects in the PEP, because this case is VERY specifically spelled out there. Implementing adapt(x, y) as y(x) may perhaps be of some practical use in some cases, but I am still waiting for you to show any such use case of practical compelling interest. I hope I have _amply_ shown that the implementation strategy is absolutely out of the question as a general one, so it matters up to a point if some very specific subcases are well served by that strategy, anyway. The key issue is, such cases, if any, will need to be very specifically identified and justified one by one.
Alex From aleaxit at yahoo.com Tue Oct 28 16:26:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:26:52 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <3F9EC269.6080108@colorado.edu> References: <3F9EC269.6080108@colorado.edu> Message-ID: <200310282226.33748.aleaxit@yahoo.com> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote: > Hi all, > > I just wanted to add a small comment on this discussion, which I'd been > following via the newsgroup mirror. Thanks for your comments! I didn't even know we HAD an ng mirror... > Python is picking up a lot of steam in the scientific computing community, > and in science it is quite common to encounter non-commutative > multiplication. Just to remind Guido from his old math days :), even for > square matrices, AB!=BA in most cases. The Matrix class supplied with Yes, of course, you're right. However, the most specific problem is: do you know of ANY use cases where A*x and x*A should give different results, or the former should succeed and the latter should fail, *when x is an integer*? If you can find any use case for that, even in an obscure branch of maths, then clearly the urgency of fixing this bug goes WAY up. Otherwise -- if having the problem specifically for an integer x ONLY should not affect anything -- the bug is basically only going to show up in software that's under development and not yet completed, or else not fully correct. I _still_ want to fix it, but... the urgency of doing so is going to be different, as I'm sure you'll understand! Thanks again for your help -- we DO need to hear from users!!! Alex From pje at telecommunity.com Tue Oct 28 16:36:42 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 28 16:38:45 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310282155.41407.aleaxit@yahoo.com> References: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> Message-ID: <5.1.1.6.0.20031028155739.022ab800@telecommunity.com> At 09:55 PM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 07:47 pm, Phillip J. Eby wrote: > ... > > You didn't actually give any example of why 'adapt("23",int)' shouldn't > > return 23, just why adapt("foo",file) shouldn't return a file. > >Which is sufficient to show that, IN GENERAL, adapt(x, sometype) >should not just be the equivalent of sometype(x), as you seemed (and, >below, still seem) to argue. I'm not arguing that, nor have I ever intended to. I merely questioned your appearing to argue that adapt(x,sometype) should NEVER equal sometype(x). >It borders, I think, on the absurd, to maintain that, for example, >"" *IS* my open file >object "adapted to" the string protocol. It's clearly a mere human >readable representation, a vague phantom of the object itself. It >should be obvious that, just as "adapting a string to the (R/O) file >protocol" means wrapping it in cStringIO.StringIO, so the reverse >adaptation, "adapting a file to the string protocol", should utilize >a wrapper object that presents the file's data with all string object >methods, for example via mmap. Great, so now you know what you'd like file.__conform__(str) to do. This has nothing to do with what I was asking about. You said, in the post I originally replied to: "y=adapt("23", int) should NOT succeed." And I said, "why not?" This is not the same as me saying that adapt(x,y) for all y should equal y(x). Such an idea is patently absurd. I might, however, argue that adapt(x,int) should equal int(x) for any x whose __conform__ returns None. 
Or more precisely, that int.__adapt__(x) should return int(x). And that is why I'm asking why you appear to disagree. However, you keep talking about *other* values of y and x than 'int' and "23", so I'm no closer to understanding your original statement than before. >Adaptation should NOT work that way: adapt(x, list) would, rather, >return a wrapper, providing listlike methods (some, like pop or remove, >would delegate to x's own methods -- others, like sort, would require >more work) and _eventually performing actual operations on x_, NOT >on a separate thing that once, a long time ago, was constructed by >copying it. For protocols whose contract includes immutability (such as 'int') this distinction is irrelevant, since a snapshot is required. Or are you saying that adaptation cannot be used to adapt a mutable object to a protocol that includes immutability? >Thus, I see foo(x) and adapt(x, foo) -- even in cases where foo is a >type -- as GENERALLY very different. If you have SPECIFIC use cases >in mind where it would be clever to make the two operations coincide, >you still haven't made them; I only heard vague generalities about how >adapt(x, y) "should" work without ANY real support for them. It's you who has proposed how they work, and I who asked a question about your statement. >In Python, we think EIBNI, and therefore typecasts are explicit. >We do NOT "adapt" a float f to int when an int is required, as >in somelist[f]: we raise a TypeError -- if you want COERCION, >aka CONVERSION, to an int, with possible loss of information >etc, you EXPLICITLY code somelist[int(f)]. Your proposal that >adaptation be, when possible, implemented by conversion, goes I'm not aware that I made such a proposal. I asked why you thought that adapt('23',int) should *not* return 23. >[lots more snipped] We seem to be having two different conversations. I haven't proposed *anything*, only asked questions. 
Meanwhile, you keep debating my supposed proposal, and not answering my questions! Specifically, you still have not answered my question: Why do you think that 'adapt("23",int)' should not return 23? That is all I am asking, and trying to understand. It is a question, not a proposal for anything, of any kind. Now, it is possible I misunderstood your original statement, and you were not in fact proposing that it should not. If so, then that clarification would be helpful. All the rest of this about why adapt(x,y) may have nothing to do with y(x) isn't meaningful to me. The fact that 2+2==4 and 2*2 == 4 doesn't mean that multiplication is the same as addition! So why would adapt(x,y) and y(x) being equal for some values of x and y mean that adaptation is conversion? You seem to be arguing, however, that that's what I'm saying. Further, you seem to me to be saying that "Because addition is not multiplication, adding 2 and 2 should not equal 4. That's what multiplication is for, so you should always multiply 2 and 2 to get 4, never add them." And that seems so wrong to me, that I have to ask, "Why would you say a thing like that?" Then, you answer me by saying, "But addition is not multiplication, so why are you proposing that adding two numbers should always produce the same result as multiplying them?" When in fact I have not proposed any such thing, nor would I!
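The __conform__/__adapt__ machinery this exchange keeps referring to (PEP 246) can be sketched minimally as follows. This is an illustrative sketch only, not the PEP's reference implementation, and the IntLike and Celsius classes are invented examples of the two hooks:

```python
def adapt(obj, protocol):
    """Minimal sketch of PEP 246 adaptation (illustrative only)."""
    # 1. Trivial case: the object already satisfies the protocol.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # 2. Ask the object (via its type) to conform to the protocol.
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # 3. Ask the protocol itself to adapt the object.
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise TypeError("can't adapt %r to %r" % (obj, protocol))


class IntLike:
    """Invented protocol whose __adapt__ hook chooses conversion."""
    @staticmethod
    def __adapt__(obj):
        if isinstance(obj, str):
            return int(obj)
        return None


class Celsius:
    """Invented class whose instances conform to float on request."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __conform__(self, protocol):
        if protocol is float:
            return float(self.degrees)
        return None
```

Under this sketch, adapt("23", IntLike) returns 23 because the invented protocol's __adapt__ hook explicitly chose conversion, while adapt("23", int) raises TypeError because the built-in int has no such hook -- which is precisely the design question being argued in this thread.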
From aleaxit at yahoo.com Tue Oct 28 16:42:12 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:42:17 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310281735.41103.aleaxit@yahoo.com> <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com> Message-ID: <200310282242.12398.aleaxit@yahoo.com> On Tuesday 28 October 2003 06:51 pm, Guido van Rossum wrote: > > Yes, the use case of __deepcopy__ is indeed quite different (and > > to be honest it doesn't appear in my actual experience -- I can "imagine" > > some as well as the next man, but they'd be made out of whole cloth:-). > > But I was under the impression that you wanted them in PEP 323 too? > > Maybe I misunderstood your words. Should I take them out of PEP 323? > > In that case somebody else can later PEP that if they want, and I can > > basically wash my hands of them -- what do you think? > > I think it would be better if PEP 323 only did __copy__, so you can > remove all traces of __deepcopy__. I don't recall what I said, maybe > I wasn't clear. Aye aye cap'n -- that suits me just fine, actually:-). Alex From aleaxit at yahoo.com Tue Oct 28 16:46:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:46:58 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <200310281742.39349.aleaxit@yahoo.com> <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> Message-ID: <200310282246.50113.aleaxit@yahoo.com> On Tuesday 28 October 2003 06:42 pm, Guido van Rossum wrote: > > Hmmm... maybe one COULD make a custom descriptor that does support > > both usages...
and maybe it IS worth making the .sorted (or whatever
> > name) entry a case of exactly such a subtle custom descriptor...
>
> Thanks for the idea, I can use this as a perverted example in my talk
> at Stanford tomorrow. Here it is:

Heh, cool!

> import new
>
> def curry(f, x, cls=None):
>     return new.instancemethod(f, x)

Hmmm, what's the role of the ", cls=None" argument here...? I.e, couldn't just

    curry = new.instancemethod

be equivalent?

Alex

From pedronis at bluewin.ch Tue Oct 28 16:55:34 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Tue Oct 28 16:53:00 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310281527.h9SFRjw29046@12-236-54-216.client.attbi.com>
References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au>
Message-ID: <5.2.1.1.0.20031028224849.02876cc0@pop.bluewin.ch>

At 07:27 28.10.2003 -0800, Guido van Rossum wrote:
> It matches what the current global statement does, and it
> makes it crystal clear that you *can* declare a variable in a specific
> scope and assign to it without requiring there to be a binding for
> that variable in the scope itself.

EIBTI when comparing these two. Looking at:

x = 'global'

def f():
    def init():
        global x in f
        x = 'in f'
    def g():
        print x
    init()
    g()

I don't really know whether to call explicit or implicit the fact that x in g is not the global one.
And contrast with

x = 'global'

def f():
    x = 0
    def init():
        global x
        x = 'in f'
    def g():
        print x
    init()
    g()

or consider

x = 'global'

def f():
    global x
    def init():
        global x in f
        x = 'in f'
    def g():
        print x
    init()
    g()

From fperez at colorado.edu Tue Oct 28 16:57:38 2003
From: fperez at colorado.edu (Fernando Perez)
Date: Tue Oct 28 16:57:42 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: <200310282226.33748.aleaxit@yahoo.com>
References: <3F9EC269.6080108@colorado.edu> <200310282226.33748.aleaxit@yahoo.com>
Message-ID: <3F9EE652.3030703@colorado.edu>

Alex Martelli wrote:
> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote:
>> I just wanted to add a small comment on this discussion, which I'd been
>> following via the newsgroup mirror.
>
> Thanks for your comments! I didn't even know we HAD an ng mirror...

Via gmane news, it works quite well in fact. I typically follow python-dev there, and only subscribe occasionally if I need to say something.

> Yes, of course, you're right. However, the most specific problem is: do
> you know of ANY use cases where A*x and x*A should give different results,
> or the former should succeed and the latter should fail, *when x is an
> integer*?
>
> If you can find any use case for that, even in an obscure branch of maths,
> then clearly the urgency of fixing this bug goes WAY up.

Well, I'm not a mathematician myself, but I did ask two friends and neither of them could think of such a case quickly (they're applied people, though, we need to ask someone doing abstract algebra :) But I think I can see a 'semi-reasonable' usage case. Bear with me for a moment, please. Suppose A is a member of a class representing a non-linear operator which acts on functions f, such that in particular:

    A(x*f) != x*A(f)

for x an integer.
Now, if for some reason I decide to implement the 'application of A', which in the above I represented with (), with '*', the bug you mention does surface, because then: A*x*f != x*A*f Or does it? The left-right order of associativity plays a role here also, and I don't know exactly how python treats these. Granted, this example is somewhat contrived. Here, using __call__ for application would be more sensible, and the associativity rules may still hide the bug. Using '*' for application is not totally absurd, because if you are using finite matrix representations of your operators and functions, then in fact operator-function application _does_ become multiplication in the finite-dimensional vector space. But it _suggests_ a possibility for the bug to surface. At the same time, it also shows that 'really reasonable' uses will probably not easily expose this one. The one idea which I think matters, though, is the following: since in python we can't define new operators, in specific problem domains the existing ones (such as '*') may end up being reused in very unconventional ways. So while I can't think now of a non-commutative integer*THING multiplication, I don't see why someone might not build a THING where '*' isn't really what we think of as 'multiplication', and then the bug matters. In the end, I'd argue that it would be _nice_ to have it fixed, but I understand that with finite developer resources available, this one may have to take a back seat until someone can show a truly compelling case. Perhaps I'm just not imaginative enough to see one quickly :) > Thanks again for your help -- we DO need to hear from users!!! No problem, thanks for being receptive :) Best, f From nas-python at python.ca Tue Oct 28 17:09:54 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 28 17:08:27 2003 Subject: [Python-Dev] Deprecate the buffer object? 
Message-ID: <20031028220953.GA25984@mems-exchange.org> I happened to be looking at the buffer API today and I came across this posting from Guido: http://mail.python.org/pipermail/python-dev/2000-October/009974.html Over the years there has been a lot of discussion about the buffer API and the buffer object. The general consensus seems to be that the buffer API is not ideal but nonetheless useful. The buffer object, OTOH, is considered fundamentally broken and should be removed. Does anyone object to deprecating the 'buffer' builtin? Eventually we could remove the buffer object completely. Neil From aleaxit at yahoo.com Tue Oct 28 17:23:18 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 17:23:25 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031028220953.GA25984@mems-exchange.org> References: <20031028220953.GA25984@mems-exchange.org> Message-ID: <200310282323.18041.aleaxit@yahoo.com> On Tuesday 28 October 2003 11:09 pm, Neil Schemenauer wrote: > I happened to be looking at the buffer API today and I came across > this posting from Guido: > > http://mail.python.org/pipermail/python-dev/2000-October/009974.html > > Over the years there has been a lot of discussion about the buffer > API and the buffer object. The general consensus seems to be that > the buffer API is not ideal but nonetheless useful. The buffer > object, OTOH, is considered fundamentally broken and should be > removed. > > Does anyone object to deprecating the 'buffer' builtin? Eventually > we could remove the buffer object completely. Is that about RW buffers specifically? Because I _have_ used R/O buffers in production code -- when I had a huge string already in memory, and needed various largish substrings of it at different but overlapping times, without paying the overhead to copy them as slicing would have done. 
Having 'buffer' as a built-in was quite minor though -- considering the number of times I have used it, importing some module to get at it would have been perfectly acceptable, perhaps preferable. If the buffer interface stays but the function completely disappears, I guess it won't be too hard for me to recreate it in a tiny extension module, but it's not quite clear to me why I should need to. R/W buffers I've never used in production, though. I do recall once (at the very beginning of my Python usage) using an array's buffer_info method as a Q&D way to do some interfacing to C, but that was before ctypes, which I think is what I'd use now.

Alex

From nas-python at python.ca Tue Oct 28 17:30:14 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 28 17:28:49 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031028220953.GA25984@mems-exchange.org>
References: <20031028220953.GA25984@mems-exchange.org>
Message-ID: <20031028223014.GA26245@mems-exchange.org>

Looks like I was a little quick sending out that message. I found more recent postings from Tim and Guido:

http://mail.python.org/pipermail/python-dev/2002-July/026408.html
http://mail.python.org/pipermail/python-dev/2002-July/026413.html

Slippery little beast, that buffer object. :-) I'm going to go ahead and add deprecation warnings.

Neil

From guido at python.org Tue Oct 28 17:28:35 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 17:29:01 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Tue, 28 Oct 2003 22:46:50 +0100."
<200310282246.50113.aleaxit@yahoo.com>
References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <200310281742.39349.aleaxit@yahoo.com> <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> <200310282246.50113.aleaxit@yahoo.com>
Message-ID: <200310282228.h9SMSZf30381@12-236-54-216.client.attbi.com>

> > import new
> >
> > def curry(f, x, cls=None):
> >     return new.instancemethod(f, x)
>
> Hmmm, what's the role of the ", cls=None" argument here...?

Oops, remnant of a dead code branch.

> I.e, couldn't just
>
>     curry = new.instancemethod
>
> be equivalent?

Right. I had bigger plans but decided to can them. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Tue Oct 28 17:29:19 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 17:29:40 2003
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
Message-ID: <200310282329.19689.aleaxit@yahoo.com>

On Monday 27 October 2003 11:45 pm, Raymond Hettinger wrote:
> Excellent PEP!
>
> Consider adding your bookmarking example. I found it to be a compelling
> use case. Also note that there are many variations of the bookmarking
> theme (undo utilities, macro recording, parser lookahead functions,
> backtracking, etc).

I will -- thanks!

> Under drawbacks and issues there are a couple of thoughts:
>
> * Not all iterators will be copyable. Knowing which is which creates a
> bit of a usability issue (i.e. the question of whether a particular
> iterator is copyable will come up every time) and a substitution issue
> (i.e. code which depends on copyability precludes substitution of other
> iterators that don't have copyability).

Yes, I'll have to mention that (that the royal road for user code to access "iterator copying" functionality is via tee() when feasible).
> * In addition to knowing whether a given iterator is copyable, a user
> should also know whether the copy is lightweight (just an index or some
> such) or heavy (storing all of the data for future use). They should
> know whether it is active (intercepting every call to iter()) or inert.

Heavy copies should be left to 'tee' more often than not.

> * For heavy copies, there is a performance trap when the stored data
> stream gets too long. At some point, just using list() would be better.

Or saving to disk, beyond a further threshold.

> Consider adding a section with pure python sample implementations for
> listiter.__copy__, dictiter.__copy__, etc.

OK, but some of it's gonna be very-pseudo code (how do you mimic dictiter's real behaviour in pure Python...?).

> Also, I have a question about the semantic specification of what a copy
> is supposed to do. Does it guarantee that the same data stream will be
> reproduced? For instance, would a generator of random words expect its
> copy to generate the same word sequence. Or, would a copy of a
> dictionary iterator change its output if the underlying dictionary got
> updated (i.e. should the dict be frozen to changes when a copy exists or
> should it mutate).

I'll have to clarify this as per the followup discussion on this thread -- pseudorandom iterators (I'll give an example) should be copyable and ensure the same stream from original and copy; real-random iterators (e.g. from /dev/random) not; iterators on e.g. lists and dicts should not freeze the underlying container when copied any more than they do when first generated (in general, if you mutate a dict or list you're iterating on, Python doesn't guarantee "sensible" behavior...).
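[A pure-Python sketch of such a copyable list iterator -- hypothetical, for illustration only; the real listiterator is implemented in C, and this is the kind of sample the PEP section would carry. A lightweight copy here is just the sequence reference plus an index.]

```python
import copy

class CopyableListIterator(object):
    """Hypothetical copyable iterator over a sequence."""

    def __init__(self, seq, index=0):
        self._seq = seq
        self._index = index

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._seq):
            raise StopIteration
        value = self._seq[self._index]
        self._index += 1
        return value

    next = __next__  # 2.x spelling of the same method

    def __copy__(self):
        # Lightweight copy: share the sequence, duplicate the index.
        return CopyableListIterator(self._seq, self._index)

it = CopyableListIterator(['a', 'b', 'c'])
next(it)                   # advance past 'a'
it2 = copy.copy(it)        # bookmark: uses __copy__
assert list(it) == ['b', 'c']
assert list(it2) == ['b', 'c']   # the copy resumes from the same spot
```

Note that the copy does not freeze `self._seq`: mutating the underlying list affects both iterators, matching the semantics argued for above.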
Thanks, Alex From mike at nospam.com Tue Oct 28 17:40:58 2003 From: mike at nospam.com (Mike Rovner) Date: Tue Oct 28 17:41:20 2003 Subject: [Python-Dev] Re: Re: the "3*x works w/o __rmul__" bug References: <3F9EC269.6080108@colorado.edu><200310282226.33748.aleaxit@yahoo.com> <3F9EE652.3030703@colorado.edu> Message-ID: Fernando Perez wrote: > Alex Martelli wrote: >> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote: >>> I just wanted to add a small comment on this discussion, which I'd >>> been following via the newsgroup mirror. >> Thanks for your comments! I didn't even know we HAD an ng mirror... > Via gmane news, it works quite well in fact. I typically follow > python-dev > there, and only subscribe occasionally if I need to say something. Just FYI gmane provides two-way access via nntp. This message is a confirmation. :) Mike From aahz at pythoncraft.com Tue Oct 28 17:43:04 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 17:43:08 2003 Subject: [Python-Dev] PEP 322: Generator Expressions (implementation team) In-Reply-To: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> References: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> Message-ID: <20031028224303.GA1740@panix.com> On Tue, Oct 28, 2003, Raymond Hettinger wrote: > > Guido has accepted the generator expressions pep, so it's time for me to > form an implementation team. Um. PEP 322 is generator expressions? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From nas-python at python.ca Tue Oct 28 17:55:20 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 28 17:53:53 2003 Subject: [Python-Dev] Deprecate the buffer object? 
In-Reply-To: <200310282323.18041.aleaxit@yahoo.com> References: <20031028220953.GA25984@mems-exchange.org> <200310282323.18041.aleaxit@yahoo.com> Message-ID: <20031028225520.GB26245@mems-exchange.org> On Tue, Oct 28, 2003 at 11:23:18PM +0100, Alex Martelli wrote: > Is that about RW buffers specifically? No. > Because I _have_ used R/O buffers in production code -- when I had > a huge string already in memory, and needed various largish > substrings of it at different but overlapping times, without > paying the overhead to copy them as slicing would have done. That's a useful thing to be able to do and the buffer object does it in a safe way. I guess that's part of the reason why the buffer object has managed to survive as long as it has. Neil From tdelaney at avaya.com Tue Oct 28 18:03:34 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 28 18:03:42 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B228@au3010avexu1.global.avaya.com> > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > >AND, adaptation is not typecasting: > >e.g y=adapt("23", int) should NOT succeed. > > And, why do you consider adaptation *not* to be typecasting? > I always > think of it as "give me X, rendered as a Y", which certainly > sounds like a > description of typecasting to me. Because (IMO anyway) adaption is *not* "give me X, rendered as Y". Adaption is "here is an X, can it be used as a Y?". They are two distinct concepts, although obviously there are crossover points. A string cannot be used as an int, although an int can be created from the string representation of an int. Adaption should not involve any change to the underlying data - mutating operations on the adapted object should (attempt to) mutate the original object (assuming the adapted object and original object are not one and the same). 
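[The view-not-copy behaviour described here can be sketched with a hypothetical adapter that presents a list as a stack -- invented names, purely illustrative: mutations on the adapted object land on the original.]

```python
class ListAsStack(object):
    """Hypothetical adapter: a list used as a stack, without copying it."""

    def __init__(self, lst):
        self._lst = lst           # keep a reference, not a copy

    def push(self, item):
        self._lst.append(item)    # mutation goes to the original list

    def pop(self):
        return self._lst.pop()

data = [1, 2]
stack = ListAsStack(data)
stack.push(3)
assert data == [1, 2, 3]     # the underlying object changed...
assert stack.pop() == 3
assert data == [1, 2]        # ...and changed back
```

Contrast int("23"), which leaves "23" untouched and manufactures a new object: that is conversion, not adaptation, on this view.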
Tim Delaney

From pf_moore at yahoo.co.uk Tue Oct 28 18:05:30 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Tue Oct 28 18:08:18 2003
Subject: [Python-Dev] Re: Deprecate the buffer object?
References: <20031028220953.GA25984@mems-exchange.org> <20031028223014.GA26245@mems-exchange.org>
Message-ID: <8yn5dkud.fsf@yahoo.co.uk>

Neil Schemenauer writes:

> Looks like I was a little quick sending out that message. I found
> more recent postings from Tim and Guido:
>
> http://mail.python.org/pipermail/python-dev/2002-July/026408.html
> http://mail.python.org/pipermail/python-dev/2002-July/026413.html
>
> Slippery little beast, that buffer object. :-) I'm going to go
> ahead and add deprecation warnings.

I used it once in combination with ctypes as buffer(a-ctypes-object) to get at the raw memory which ctypes objects expose via the buffer API. But it was pretty obscure, and I would happily have used an external module. Like this:

>>> import ctypes
>>> n = ctypes.c_int(12)
>>> buffer(n)
>>> str(buffer(n))
'\x0c\x00\x00\x00'

Basically, the only serious use case is getting the bytes out of objects which support the buffer API but which *don't* offer a "get the bytes out" interface. I've just realised that I could, however, also do this via the array module:

>>> from array import array
>>> a = array('c')
>>> a.fromstring(n)  # Hey - fromstring means "from buffer API"!
>>> a.tostring()
'\x0c\x00\x00\x00'

There's an extra copy in there. Disaster :-) Nope, I don't think there's a good use case after all...
Paul
--
This signature intentionally left blank

From tdelaney at avaya.com Tue Oct 28 18:17:58 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Oct 28 18:18:06 2003
Subject: [Python-Dev] RE: [Python-checkins]python/nondist/pepspep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B231@au3010avexu1.global.avaya.com>

> From: Alex Martelli [mailto:aleaxit@yahoo.com]
>
> Come to think of this, there may be other use cases for this
> general approach than "random iterators". Do you think that
> an iterator on a callable *and args for it* would live well in
> itertools? That module IS, after all, your baby...

Hmm - I like the idea of this.

import itertools

d10 = itertools.icall(random.randint, (1, 10,))

for i in range(10):
    print d10.next()

Tim Delaney

From greg at cosc.canterbury.ac.nz Tue Oct 28 18:28:39 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 18:29:03 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310281022.31722.aleaxit@yahoo.com>
Message-ID: <200310282328.h9SNSd400064@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> i.e., the 'outer' statement should be
> 'outer' expr_stmt

The way I was thinking, "outer" wouldn't be a statement at all, but a modifier applied to an identifier in a binding position. So, e.g.

    x, outer y, z = 1, 2, 3

would be legal, meaning that x and z are local and y isn't, and

    outer x = 1; y = 2

would mean y is local and x isn't. To make both x and y non-local you would have to write

    outer x = 1; outer y = 2

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From pje at telecommunity.com Tue Oct 28 18:33:19 2003
From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Oct 28 18:35:24 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B228@au3010avexu1.global .avaya.com> Message-ID: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com> At 10:03 AM 10/29/03 +1100, Delaney, Timothy C (Timothy) wrote: > > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > > > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > > >AND, adaptation is not typecasting: > > >e.g y=adapt("23", int) should NOT succeed. > > > > And, why do you consider adaptation *not* to be typecasting? > > I always > > think of it as "give me X, rendered as a Y", which certainly > > sounds like a > > description of typecasting to me. > >Because (IMO anyway) adaption is *not* "give me X, rendered as Y". >Adaption is "here is an X, can it be used as a Y?". > >They are two distinct concepts, although obviously there are crossover >points. Yes, just like 2+2==4 and 2*2==4. >A string cannot be used as an int, although an int can be created from the >string representation of an int. I'd often like to "use a string as an integer", or use some arbitrary object as an integer. Of course, there's a perfectly valid way to express this now (i.e. 'int()'), and I think that's fine and in my code I will personally prefer to use int() to mean I want an int, because that's clearer. But, if for some reason I have code that is referencing some protocol as a *parameter*, say 'p', and I have no way to know in advance that p==int, then the most sensible thing to do is 'adapt(x,p)', rather than 'p(x)'. (Assuming 'p' is expected to be a protocol, rather than a conversion function.) Now, given that 'p' *might* be 'int' in some cases, it seems reasonable to me that adapt("23",p) should return 23 in such a case. Since 23 satisfies the desired contract (int) on behalf of "23", this seems to be a correct adaptation. 
For a protocol p that has immutability as part of its contract, adapt(x,p) is well within its rights to return an object that is a "copy" of x in some sense. The immutability requirement means that the "adapted" value can never change, so really it's a *requirement* that the "adaptation" be a snapshot. >Adaption should not involve any change to the underlying data - mutating >operations on the adapted object should (attempt to) mutate the original >object (assuming the adapted object and original object are not one and >the same). I agree 100% -- for a protocol whose contract doesn't require immutability, the way 'int' does. I think now that I understand, however, why you and Alex think I'm saying something different than I've been saying. To both of you, "typecasting" means "convert to a different type" at an *implementation* level (as it is in other languages), and I mean at a *logical* level. Thus, to me, "I would like to use X as a Y" includes whatever contracts Y supplies *as applied to X*. Not, "give me an instance of Y that's a copy of X". It just so happens, however, that for a protocol whose contract includes immutability, these two concepts overlap, just as multiplication and addition overlap for the case of 2+2==2*2. So, IMO, for immutable types such as tuple, str, int, and float, I believe that it's reasonable for adapt(x,p)==p(x) iff x is not an instance of p already, and does not have a __conform__ method that overrides this interpretation. That such a default interpretation is redundant with p(x), I also agree. However, for code that uses protocols dynamically, that redundancy would eliminate the need to make a dummy protocol (e.g. 'IInteger') to use in place of 'int'. OTOH, if Guido decides that Python's eventual interface objects shouldn't be types, then there will be an IInteger anyway, and the point becomes moot. 
Anyway, I can only understand Alex's objection to such adaptation if he is saying that there is no such thing as adapting to an immutable protocol! In that case, there could never exist such a thing as IInteger, because you could never adapt anything to it that wasn't already an IInteger. Somehow, this seems wrong to me.

From aahz at pythoncraft.com Tue Oct 28 18:37:45 2003
From: aahz at pythoncraft.com (Aahz)
Date: Tue Oct 28 18:37:49 2003
Subject: [Python-Dev] Decimal.py in sandbox
In-Reply-To:
References:
Message-ID: <20031028233745.GA19657@panix.com>

On Mon, Oct 27, 2003, Batista, Facundo wrote:
>
> The reasoning of the majority is that when two operands are of different type,
> the less general must be converted to the more general one:
>
> >>> myDecimal = Decimal(5)
> >>> myfloat = 3.0
> >>> mywhat = myDecimal + myfloat
> >>> isinstance(mywhat, float)
> True

Absolutely not. No way, no how, no time. -1000

The problem is that Decimal is capable of greater precision, accuracy, and range than float. You could reasonably argue that the result should be a Decimal, but that has problems with numbers like 1.1 that already are inexactly represented in Python. My opinion is that conversion between float and Decimal should always be explicit (and my recollection is that Tim Peters agrees).

> >>> myDecimal = Decimal(5)
> >>> myint = 3
> >>> mywhat = myint + myDecimal
> >>> isinstance(mywhat, Decimal)
> True

This is acceptable (because you can't lose anything), but I'm overall leaning toward always requiring explicit conversion. The one thing I dislike in Cowlishaw's algorithms is that integers are always zero-extended. IOW, 1e3 is always 1000. But a standard is a standard; if we want Python's Decimal results to be interoperable with other languages, we have to do that.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan From aahz at pythoncraft.com Tue Oct 28 18:39:23 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 18:39:26 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: References: Message-ID: <20031028233923.GB19657@panix.com> On Tue, Oct 28, 2003, Batista, Facundo wrote: > > So, to who may I send the changes? > > Should I send the whole staff at the end of the work, or keep feeding small > changes? > > Should I send by email the diff results? Are you comfortable with CVS? Would you like to check your changes in directly? (Since this is sandbox, it doesn't require the usual rigorous approval process for patches.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From greg at cosc.canterbury.ac.nz Tue Oct 28 19:34:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 28 19:34:25 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> Message-ID: <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz> Guido van Rossum : > The reason why it works at all for integers without __rmul__ is > complicated; it has to do with very tricky issues in trying to > implement multiplication of a sequence with an integer. I thought the plan was to get rid of all the special case code in the interpreter for multiplying sequences and push it all down into methods of the objects concerned, i.e. all sequences, including the built-in ones, would implement the C equivalent of both __mul__ and __rmul__ if they wanted to support multiplication on both sides. Is there some reason why that wouldn't work? Or is it just that nobody has had time to fix all the built-in sequences to work this way? 
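[In Python terms, the scheme would amount to every sequence carrying both slots itself; a sketch with a minimal stand-in sequence type (not one of the built-ins):]

```python
class Seq(object):
    """Minimal sequence type handling repetition on both sides itself."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, n):
        # seq * n: repetition, only defined for integer counts.
        if not isinstance(n, int):
            return NotImplemented
        return Seq(self.items * n)

    __rmul__ = __mul__   # n * seq delegates to the same repetition logic

s = Seq(['a', 'b'])
assert (s * 2).items == ['a', 'b', 'a', 'b']
assert (3 * s).items == ['a', 'b'] * 3   # int.__mul__ returns
                                         # NotImplemented, __rmul__ runs
```

With no interpreter special-casing, 3 * s works purely because int.__mul__ declines and Seq.__rmul__ is consulted -- which is why a user type's __rmul__ would then reliably be called.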
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From guido at python.org Tue Oct 28 19:37:38 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 19:37:53 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: Your message of "Wed, 29 Oct 2003 13:34:11 +1300." <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz>
References: <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz>
Message-ID: <200310290037.h9T0bcq30713@12-236-54-216.client.attbi.com>

> I thought the plan was to get rid of all the special case code in the
> interpreter for multiplying sequences and push it all down into
> methods of the objects concerned, i.e. all sequences, including the
> built-in ones, would implement the C equivalent of both __mul__ and
> __rmul__ if they wanted to support multiplication on both sides.
>
> Is there some reason why that wouldn't work? Or is it just that
> nobody has had time to fix all the built-in sequences to work
> this way?

It would be a lot of work, and I expect that for 3rd party extension types (and possibly for 3rd party Python classes) it wouldn't be quite compatible. I want it to work this way in Python 3.0, but I don't know if it's worth reworking all that tedious detail in the 2.x series. (Understanding that 3.0 is a few years away still.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Tue Oct 28 20:37:45 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 20:37:58 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com>
Message-ID: <200310290137.h9T1bjO00488@oma.cosc.canterbury.ac.nz>

"Phillip J.
Eby" : > For a protocol p that has immutability as part of its contract, > adapt(x,p) is well within its rights to return an object that is a > "copy" of x in some sense. I don't think that's right -- this should only apply if the original object x is immutable. Otherwise, changes to x should be reflected in the view of it provided by p -- even if p itself provides no operations for mutation. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jacobs at penguin.theopalgroup.com Tue Oct 28 20:41:04 2003 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Oct 28 20:41:08 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: <20031028233923.GB19657@panix.com> Message-ID: On Tue, 28 Oct 2003, Aahz wrote: > On Tue, Oct 28, 2003, Batista, Facundo wrote: > > > > So, to who may I send the changes? > > > > Should I send the whole staff at the end of the work, or keep feeding small > > changes? > > > > Should I send by email the diff results? > > Are you comfortable with CVS? Would you like to check your changes in > directly? (Since this is sandbox, it doesn't require the usual rigorous > approval process for patches.) I'd be happier with at least one round of review before committing to CVS. The code is fairly complex and an extra set of eyes will help keep things focused. I've also volunteered to be that extra set of eyes, and plan a quick turn-around on any patches sent to me. However, I don't have CVS write permission either, even to the sandbox. 
-Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From greg at cosc.canterbury.ac.nz Tue Oct 28 20:41:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 28 20:42:07 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031028220953.GA25984@mems-exchange.org> Message-ID: <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> Neil Schemenauer : > The buffer object, OTOH, is considered fundamentally broken and should > be removed. There's no doubt that the current implementation of it is unacceptably dangerous, but I haven't yet seen an argument that convinces me that it couldn't be fixed if desired. I don't think the *idea* of a buffer object is fundamentally flawed, and it seems potentially useful (although I must admit that I haven't found a need for it myself yet). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jo at jan.csie.ntu.edu.tw Tue Oct 28 20:42:47 2003 From: jo at jan.csie.ntu.edu.tw (Chih-Chung Chang) Date: Tue Oct 28 20:43:30 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <20031029014247.GA28906@jan.csie.ntu.edu.tw> Hi, Raymond Hettinger wrote: > Okay, this is the last chance to come-up with a name other than > sorted(). > > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor > > > I especially like the last one and all of them provide a distinction > from list.sort(). > How about adding a builtin function sort() which returns the sorted version of the input list? 
    L.sort()    # sort in-place
    sort(L)     # return sorted copy

Regards,
Chih-Chung Chang

From bac at OCF.Berkeley.EDU Tue Oct 28 21:01:38 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Oct 28 21:01:44 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
Message-ID: <3F9F1F82.2090209@ocf.berkeley.edu>

Today I got the wheels turning on my master's thesis by getting an adviser. Now I just need a topic. =) The big goal is to do something involving Python for a thesis to be finished by fall of next year (about October) so as to have it done, hopefully published (getting into LL4 would be cool), and ready to be used for doctoral applications come January 2005.

So, anyone have any ideas? The best one that I can think of is optional type-checking. I am fairly open to ideas, though, in almost any area involving language design.

There is no deadline to this, so if an idea strikes you a while from now, still let me know. I suspect I won't settle on an idea any sooner than December, and that is only if the idea just smacks me in the face and says, "DO THIS!" Otherwise it might be a while, since I don't want to take up a topic that won't interest me or is not helpful in some way.

-Brett

From pje at telecommunity.com Tue Oct 28 21:31:53 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Oct 28 21:31:06 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <200310290137.h9T1bjO00488@oma.cosc.canterbury.ac.nz>
References: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com>
Message-ID: <5.1.0.14.0.20031028210911.03e4cd80@mail.telecommunity.com>

At 02:37 PM 10/29/03 +1300, Greg Ewing wrote:
>"Phillip J. Eby" :
>
> > For a protocol p that has immutability as part of its contract,
> > adapt(x,p) is well within its rights to return an object that is a
> > "copy" of x in some sense.
>
>I don't think that's right -- this should only apply if
>the original object x is immutable.
>Otherwise, changes to
>x should be reflected in the view of it provided by p --
>even if p itself provides no operations for mutation.

There's a difference between an interface which provides no methods for mutation, and an interface that *requires* immutability. Part of the concept of an 'int' or 'tuple' is that it is a *value* and therefore unchanging. Thus, one might say that IInteger or ITuple conceptually derive from IValueObject.

However, that doesn't mean we can't say that adapt([1,2,3],tuple) should fail, and I'm certainly open to the possibility of such an interpretation, if it's decreed that supporting 'tuple' means guaranteeing that the adaptee doesn't change state, not merely the adapted form.

It seems there are three levels of "immutable" one may have in an interface/protocol:

1. No mutator methods, but no requirements regarding stability of state
2. Immutability is required of the adapted form (snapshot)
3. Immutability is required of the adaptee

I have made plenty of use of cases 1 and 2, but never 3. I'm having a hard time thinking of a use case for it, so that's probably why it hasn't occurred to me before now. Looking at this list, I now understand at least one of Alex's points better: he (and I think you) are assuming that an immutable target protocol means case 3. That has been baffling the heck out of me, because I have not yet encountered a use case for 3. On the other hand, it's possible that I *have* seen use case 3 and mistaken it for use case 2, simply because all the types I wrote adapters for were immutable.

Given all this, I think I'm okay with saying that adapting from a mutable object to an immutable interface (e.g. list->tuple) is an improper use of adaptation. Presumably this also means StringIO->str adaptation would be invalid as well. But int<->str and other such immutable-to-immutable conversions seem well within the purview of adaptation.
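[Editor's note: case 2 in Phillip's list -- an immutable snapshot of a mutable adaptee -- can be sketched with a toy PEP 246-style registry. adapt() is not a builtin and none of this machinery exists in the stdlib; every name below is illustrative only.]

```python
# Toy adaptation registry in the spirit of PEP 246; all names here are
# illustrative assumptions, not an existing API.
_adapters = {}

def register_adapter(from_type, protocol, factory):
    """Declare how to adapt instances of from_type to protocol."""
    _adapters[(from_type, protocol)] = factory

def adapt(obj, protocol):
    if isinstance(obj, protocol):
        return obj  # already conforms, no adapter needed
    factory = _adapters.get((type(obj), protocol))
    if factory is None:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return factory(obj)

# "Snapshot" adaptation (case 2): the adapted form is immutable and
# hashable, while the original list stays mutable.
register_adapter(list, tuple, tuple)

original = [1, 2, 3]
snapshot = adapt(original, tuple)
original.append(4)            # the adaptee can still change...
assert snapshot == (1, 2, 3)  # ...but the snapshot does not follow it
```

Whether that last property is a feature (a safe frozen copy) or a bug (the adapted form silently diverging from the adaptee) is exactly the disagreement in this thread.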
From greg at cosc.canterbury.ac.nz Tue Oct 28 21:56:53 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 21:57:12 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <5.1.0.14.0.20031028210911.03e4cd80@mail.telecommunity.com>
Message-ID: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>

"Phillip J. Eby" :

> Given all this, I think I'm okay with saying that adapting from a mutable
> object to an immutable interface (e.g. list->tuple) is an improper use of
> adaptation.

Expecting such an adaptation to somehow make the underlying list unchangeable by any means would be unreasonable, I think. I can't see any way of enforcing that other than by making a copy, which goes against the spirit of adaptation.

There still might be uses for it, though, without any unchangeability guarantee, such as passing it to something that requires a tuple and not just a sequence, but not wanting the overhead of making a copy.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From ncoghlan at iinet.net.au Tue Oct 28 23:09:22 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue Oct 28 23:09:30 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com>
References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com>
Message-ID: <3F9F3D72.5080308@iinet.net.au>

Guido van Rossum strung bits together to say:
> which loses the "aha!" effect of a cool solution.
> It also IMO
> requires too much explanation to the unsuspecting reader who doesn't
> understand right away *why* rumpelstiltskin imports itself.

I believe someone else also suggested that if rumpelstiltskin should be imported as:

    import fairytales.rumpelstiltskin

then a bare import is going to have trouble, even inside the module.

Cheers,
Nick.

--
Nick Coghlan          | Brisbane, Australia
ICQ#: 68854767        | ncoghlan@email.com
Mobile: 0409 573 268  | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From python at rcn.com Wed Oct 29 01:36:35 2003
From: python at rcn.com (Raymond Hettinger)
Date: Wed Oct 29 01:37:33 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031028225520.GB26245@mems-exchange.org>
Message-ID: <000501c39de7$0019c180$3403a044@oemcomputer>

> That's a useful thing to be able to do and the buffer object does it
> in a safe way. I guess that's part of the reason why the buffer
> object has managed to survive as long as it has.

At least the builtin buffer function should go away. Even if someone had a use for it, it would not make up for all the time lost by all the other people trying to figure out what it was good for.

Raymond Hettinger

From martin at v.loewis.de Wed Oct 29 02:17:58 2003
From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: Wed Oct 29 02:18:25 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: 

"Brett C." writes:

> So, anyone have any ideas? The best one that I can think of is
> optional type-checking. I am fairly open to ideas, though, in almost
> any area involving language design.

Did you explicitly mean language *design*? Because there might be areas of research relevant to language implementation, in terms of efficiency, portability, etc.
Here are some suggestions:

- memory management: attempt to replace reference counting by "true" garbage collection
- threading: attempt to provide free threading efficiently
- typing: attempt to provide run-time or static type inference, and see whether this could be used to implement some byte codes more efficiently (although there is probably overlap with the specializing compilers)
- floating point: provide IEEE 754 semantics in a portable yet efficient way
- persistency: provide a mechanism to save the interpreter state to disk, with the possibility to restart it later (similar to Smalltalk images)

On language design, I don't have that many suggestions, as I think the language itself should evolve slowly if at all:

- deterministic finalization: provide a way to get objects destroyed implicitly at certain points in control flow; a use case would be thread-safety/critical regions
- attributes: provide syntax to put arbitrary annotations on functions, classes, and class members, similar to .NET attributes. Use that facility to implement static and class methods, synchronized methods, final methods, web methods, transactional methods, etc. (yes, there is a proposal, but nobody knows whether it meets all requirements - nobody knows what the requirements are)
- interfaces (this may go along with optional static typing)

Regards,
Martin

From janssen at parc.com Wed Oct 29 02:26:15 2003
From: janssen at parc.com (Bill Janssen)
Date: Wed Oct 29 02:26:43 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Tue, 28 Oct 2003 04:53:50 PST." <20031028125350.GC1095@rogue.amk.ca>
Message-ID: <03Oct28.232619pst."58611"@synergy1.parc.xerox.com>

> Perhaps, but it might be a mug's game. I was on the Lynx developer list for
> a while, and bad HTML requires many, many hacks to be processed sensibly.

Yes, I know what you mean.
I would personally be happy to simply reject bad HTML (return None from the parser), and force the user to do what he currently has to do to handle it.

Bill

From aleaxit at yahoo.com Wed Oct 29 02:45:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Wed Oct 29 02:46:46 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>
References: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>
Message-ID: <200310290845.36472.aleaxit@yahoo.com>

On Wednesday 29 October 2003 03:56, Greg Ewing wrote:
> "Phillip J. Eby" :
> > Given all this, I think I'm okay with saying that adapting from a
> > mutable object to an immutable interface (e.g. list->tuple) is an
> > improper use of adaptation.
>
> Expecting such an adaptation to somehow make the underlying
> list unchangeable by any means would be unreasonable, I
> think. I can't see any way of enforcing that other than by
> making a copy, which goes against the spirit of adaptation.
>
> There still might be uses for it, though, without any
> unchangeability guarantee, such as passing it to something
> that requires a tuple and not just a sequence, but not
> wanting the overhead of making a copy.

There are uses for both permanent (via copy) and temporary freezing. For example: checking if a list is an element of a set will need only temporary freezing -- just enough to let the list supply a hash value. Adding the list to the set will need a frozen copy. Right now, the sets.py code tries for both kinds of adaptation via special methods -- __as_immutable__ and __as_temporarily_immutable__ -- but that's just the usual ad hoc approach.
If we had adaptation I'd want both of these to go via protocol adaptation, just because that will allow adaptation strategies to be supplied by protocol, object type AND third parties -- practicality beats purity, i.e., even though you are puristically right that adaptation normally shouldn't copy, I find this one a compelling, very practical use case.

Adaptation altering the object itself, as in "setting a flag in the list to make it permanently reject any further changes", WOULD on the other hand be a very bad thing -- one could never safely try adaptation any longer if one had to fear such permanent effects on the object being adapted.

Alex

From troels at thule.no Wed Oct 29 02:57:19 2003
From: troels at thule.no (Troels Walsted Hansen)
Date: Wed Oct 29 02:58:08 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <000501c39de7$0019c180$3403a044@oemcomputer>
References: <000501c39de7$0019c180$3403a044@oemcomputer>
Message-ID: <3F9F72DF.9080101@thule.no>

Raymond Hettinger wrote:
> At least the builtin buffer function should go away.
> Even if someone had a use for it, it would not make up for all the time
> lost by all the other people trying to figure out what it was good for.

I trust you will preserve the functionality though? I have used the buffer() function to achieve great leaps in performance in applications which send data from a string buffer to a socket. Slicing kills performance in this scenario once buffer sizes get beyond a few 100 kB.

Below is an example from an asyncore.dispatcher subclass. This code sends chunks with maximum size, without ever slicing the buffer.
    def handle_write(self):
        if self.buffer_offset:
            sent = self.send(buffer(self.buffer, self.buffer_offset))
        else:
            sent = self.send(self.buffer)
        self.buffer_offset += sent
        if self.buffer_offset == len(self.buffer):
            del self.buffer

Troels

From pyth at devel.trillke.net Wed Oct 29 02:59:18 2003
From: pyth at devel.trillke.net (Holger Krekel)
Date: Wed Oct 29 02:59:54 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>; from bac@OCF.Berkeley.EDU on Tue, Oct 28, 2003 at 06:01:38PM -0800
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <20031029085918.Y14453@prim.han.de>

Hi Brett,

Brett C. wrote:
> Today I got the wheels turning on my masters thesis by getting an
> adviser. Now I just need a topic. =) The big goal is to do something
> involving Python for a thesis to be finished by fall of next year (about
> October) so as to have it done, hopefully published (getting into LL4
> would be cool), and ready to be used for doctoral applications come
> January 2005.
>
> So, anyone have any ideas? The best one that I can think of is optional
> type-checking. I am fairly open to ideas, though, in almost any area
> involving language design.

Maybe you have heard of PyPy, a reimplementation of Python in Python. We are employing quite some innovative approaches to language design and implementation, and there are certainly a lot of open research areas. See our OSCON 2003 paper

    http://codespeak.net/pypy/index.cgi?doc/oscon2003-paper.html

or two interesting chapters out of our European Union proposal:

    http://codespeak.net/pypy/index.cgi?doc/funding/B1.0
    http://codespeak.net/pypy/index.cgi?doc/funding/B6.0

You are welcome to discuss stuff on e.g. the IRC channel #pypy on freenode or on the mailing list

    http://codespeak.net/mailman/listinfo/pypy-dev

in order to find out if you'd like to join us and/or do an interesting thesis.
have fun,

    holger

From Boris.Boutillier at arteris.net Wed Oct 29 03:30:10 2003
From: Boris.Boutillier at arteris.net (Boris Boutillier)
Date: Wed Oct 29 03:30:16 2003
Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ?
Message-ID: <3F9F7A92.1050800@arteris.net>

Hi all,

I've posted this question to the main python list, but got no answers, and I didn't see the issue raised on Python-dev (but I subscribed only two weeks ago). It concerns problems with the Py_TPFLAGS_HEAPTYPE flag and the new 'hackcheck' in Python 2.3.

I'm writing a C extension module for Python 2.3. I need to declare a new class, MyClass. For this class I want two things:

1) redefine the setattr function on objects of this class (ie setting a new tp_setattro)
2) the Python user should be able to change attributes on MyClass (the class itself).

Now I have a conflict on the Py_TPFLAGS_HEAPTYPE flag with the new Python 2.3. If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have a problem with the new hackcheck (Objects/typeobject.c:3631), as I am a HEAPTYPE but I also redefine tp_setattro. If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on my class because of a check in type_setattro (Objects/typeobject.c:2047).

The only solution I've got without modifying the Python source is to create a specific metaclass for MyClass and write its tp_setattro. But I don't like the idea of making a copy-paste of the type_setattro source code just to remove a check; this is not great for future compatibility with Python (at each revision of Python I have to check whether type_setattro has changed and copy-paste the changes).

In fact I'm really wondering what the real meaning of this flag is, but I think there is some history behind it. If you think this is not the right place for this question, just ignore it, and sorry for the disturbance.
Boris

From FBatista at uniFON.com.ar Wed Oct 29 04:03:21 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Oct 29 04:04:39 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID: 

Aahz wrote:

#- > >>> myDecimal = Decimal(5)
#- > >>> myfloat = 3.0
#- > >>> mywhat = myDecimal + myfloat
#- > >>> isinstance(mywhat, float)
#- > True
#-
#- Absolutely not. No way, no how, no time. -1000 :)
#- are inexactly represented in Python. My opinion is that conversion
#- between float and Decimal should always be explicit (and my
#- recollection
#- is that Tim Peters agrees).

I'm not decided on any option. I just want (it would be nice) the group to settle one way or the other. There's some controversy about this. Anyway, I'll spell out the options in the pre-PEP, and we all will take a side, :)

. Facundo

From FBatista at uniFON.com.ar Wed Oct 29 04:17:50 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Oct 29 04:18:47 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID: 

Aahz wrote:

#- Are you comfortable with CVS? Would you like to check
#- your changes in
#- directly? (Since this is sandbox, it doesn't require the
#- usual rigorous
#- approval process for patches.)

Kevin Jacobs wrote:

#- I'd be happier with at least one round of review before
#- committing to CVS.
#- The code is fairly complex and an extra set of eyes will help keep
#- things focused. I've also volunteered to be that extra set
#- of eyes, and
#- plan a quick turn-around on any patches sent to me.

I'm not comfortable with CVS. I think I'll use the extra pair of eyes of Kevin (thanks), and start learning CVS while keeping the universe secure, :)

Thank you all.

. Facundo
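[Editor's note: the behaviour the "explicit conversion" camp argues for can be sketched with the decimal module as it eventually shipped; at the time of this thread the class still lived in the sandbox.]

```python
from decimal import Decimal

d = Decimal(5)
f = 3.0

# Implicit mixing of Decimal and float is rejected outright...
mixed_ok = True
try:
    d + f
except TypeError:
    mixed_ok = False

# ...so the conversion must be spelled out, e.g. via an exact string form.
result = d + Decimal(str(f))

print(mixed_ok, result)  # False 8.0
```

This keeps the inexactness of binary floats from silently leaking into decimal arithmetic: the programmer decides, in writing, how the float becomes a Decimal.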
From arigo at tunes.org Wed Oct 29 05:47:36 2003
From: arigo at tunes.org (Armin Rigo)
Date: Wed Oct 29 05:51:31 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> <200310281703.58169.aleaxit@yahoo.com> <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>
Message-ID: <20031029104736.GA20194@vicky.ecs.soton.ac.uk>

Hello Guido,

On Tue, Oct 28, 2003 at 10:00:14AM -0800, Guido van Rossum wrote:
> I haven't seen Armin's code, but I don't believe that the type alone
> gives enough information about whether they should be copied.

This is a quite deep problem, actually. I admit I have never used copy.py because in all cases I needed more control about what should be copied or not. This generator-copier module that we are talking about is no exception: its existence is not only due to the fact that it can copy generators, but also that I needed precise control over what I copied and what I shared. Putting this information in __getstate__ or __copy__ methods of instances or in copy_reg only goes so far, because sometimes you want to do different things with the same instances in the same program -- e.g. you may want at some point only a copy of a small number of objects (e.g. to be able to rollback a small transaction), and at some other point a more complete copy of the state of the same program.

Nevertheless, I can surely make a C module that registers in copy_reg a deep copier for generators.

A bientot,

Armin.
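[Editor's note: Armin's last suggestion -- registering a copier in copy_reg so that the copy machinery picks it up -- looks like this for an ordinary class, shown here with the Python 3 spelling copyreg; generators themselves would still need his C support.]

```python
import copy
import copyreg  # spelled copy_reg in the Python 2.3 era discussed above

class Box:
    """A stand-in for some type the copy machinery doesn't know how to copy."""
    def __init__(self, value):
        self.value = value

def _reduce_box(box):
    # Return (callable, args): how to reconstruct an equivalent object.
    return Box, (box.value,)

copyreg.pickle(Box, _reduce_box)

a = Box([1, 2, 3])
b = copy.deepcopy(a)  # finds _reduce_box via copyreg's dispatch table

assert b is not a
assert b.value == a.value and b.value is not a.value  # args deep-copied
```

As Guido notes below, though, a type-keyed registry like this cannot express "copy these objects, share those" within one program, which is exactly the control Armin says he needs.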
From guido at python.org Wed Oct 29 10:58:53 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 10:59:00 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Wed, 29 Oct 2003 10:47:36 GMT." <20031029104736.GA20194@vicky.ecs.soton.ac.uk> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> <200310281703.58169.aleaxit@yahoo.com> <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com> <20031029104736.GA20194@vicky.ecs.soton.ac.uk> Message-ID: <200310291558.h9TFwr031960@12-236-54-216.client.attbi.com> > This is a quite deep problem, actually. I admit I have never used > copy.py because in all cases I needed more control about what should > be copied or not. This generator-copier module that we are talking > about is no exception: its existence is not only due to the fact > that it can copy generators, but also that I needed precise control > over what I copied and what I shared. Putting this information in > __getstate__ or __copy__ methods of instances or in copy_reg only > goes so far, because sometimes you want to do different things with > the same instances in the same program -- e.g. you may want at some > point only a copy of a small number of objects (e.g. to be able to > rollback a small transaction), and at some other point a more > complete copy of the state of the same program. > > Nevertheless, I can surely make a C module that registers in > copy_reg a deep copier for generators. I'm not sure that there would be a general use for this... 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas-python at python.ca Wed Oct 29 11:35:40 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Wed Oct 29 11:34:18 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <20031029163540.GA28700@mems-exchange.org>

Hi Brett,

Some ideas:

* Finish off the AST compiler. Make it possible to manipulate ASTs from Python and allow them to be fed to the compiler to generate code. This is one half of macros for Python. The other half is harder.

* Build a refactoring code editor that works using the AST.

* Implement an object system that supports multiple dispatch. You can look at Dylan and Goo for ideas.

* Optimize access to global variables and builtins. See PEP 267 for some ideas. If we can disallow inter-module shadowing of names the job becomes easier. Measure the performance difference.

* Look at making the GC mark-and-sweep. You will need to provide it explicit roots. Is it worth doing? Mark-and-sweep would require changes to extension modules since they don't expose roots to the interpreter.

* More radically, look at Chicken¹ and its GC. Henry Baker's "Cheney on the M.T.A."² is very clever, IMHO, and could be used instead of Python's reference counting. Build a limited Python interpreter based on this idea and evaluate it.

1. http://www.call-with-current-continuation.org/chicken.html
2. http://citeseer.nj.nec.com/baker94cons.html

From allison at sumeru.stanford.EDU Wed Oct 29 12:21:56 2003
From: allison at sumeru.stanford.EDU (Dennis Allison)
Date: Wed Oct 29 12:22:10 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: 

How about re-engineering the interpreter to make it more MP friendly? (This is probably a bigger task than a master's thesis.)
The current interpreter serializes on the global interpreter lock (GIL) and blocks everything. Is there another approach which would allow processing to continue? Guido said once that there was an attempt to change the granularity of the locking, but that it quickly became overly complex and unstable. Perhaps some of Maurice Herlihy's ideas may be adapted to the problem.

Moreover, it may not be necessary that the interpreter state be consistent and deterministic all the time as long as it eventually produces the same answer as a deterministic equivalent. There may be interpreter organizations which move forward optimistically, ignoring potential locking problems and then (if necessary) recovering, and these may have better performance than the more conservative ones. Or they may not. Some kind of performance tests and evaluations would need to be part of any such study.

On Tue, 28 Oct 2003, Brett C. wrote:

> Today I got the wheels turning on my master's thesis by getting an
> adviser. Now I just need a topic. =) The big goal is to do something
> involving Python for a thesis to be finished by fall of next year (about
> October) so as to have it done, hopefully published (getting into LL4
> would be cool), and ready to be used for doctoral applications come
> January 2005.
>
> So, anyone have any ideas? The best one that I can think of is optional
> type-checking. I am fairly open to ideas, though, in almost any area
> involving language design.

From fperez at colorado.edu Wed Oct 29 12:23:25 2003
From: fperez at colorado.edu (Fernando Perez)
Date: Wed Oct 29 12:23:28 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <20031029163540.GA28700@mems-exchange.org>
References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org>
Message-ID: <3F9FF78D.3060605@colorado.edu>

Hi Brett,

I don't know how interested you are in scientific computing.
But Pat Miller from Lawrence Livermore Lab (http://www.llnl.gov/CASC/people/pmiller/) presented at SciPy'03 some very interesting stuff for on-the-fly compilation of Python code into C for numerical work. None of this has been publicly released yet, but if that kind of thing sounds interesting to you, you might want to contact him.

Just an idea.

Best,

f

From pje at telecommunity.com Wed Oct 29 13:25:52 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 29 13:27:53 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>

At 06:01 PM 10/28/03 -0800, Brett C. wrote:
>Today I got the wheels turning on my master's thesis by getting an
>adviser. Now I just need a topic. =) The big goal is to do something
>involving Python for a thesis to be finished by fall of next year (about
>October) so as to have it done, hopefully published (getting into LL4
>would be cool), and ready to be used for doctoral applications come
>January 2005.
>
>So, anyone have any ideas? The best one that I can think of is optional
>type-checking. I am fairly open to ideas, though, in almost any area
>involving language design.

Throwing another Python-specific implementation issue into the ring... how about performance of Python function calls? Specifically, the current Python interpreter has a high overhead for argument passing and frame setup that dominates the performance of simple functions. One strategy I've been thinking about for a little while is replacing the per-frame variable-size stacks (e.g. argument and block stacks) with per-thread stacks.
In principle, this would allow a few things to happen:

* Fixed-size "miniframe" workspace objects allocated on the C stack (with lazy creation of heap-allocated "real" frame objects when needed for an exception or a sys._getframe() call)

* Direct use of positional arguments on the stack as the "locals" of the next function called, without creating (and then unpacking) an argument tuple, in the case where there are no */** arguments provided by the caller.

This would be a pretty sizeable change to Python's internals (especially the core interpreter's handling of "call" operations), but could possibly produce double-digit percentage speedups for function calls in tight loops. (I base this hypothesis on the speed difference between a function call and resuming a generator, and the general observation that the runtime of certain classes of Python programs is almost directly proportional to the number of function calls occurring.)

From mwh at python.net Wed Oct 29 13:33:41 2003
From: mwh at python.net (Michael Hudson)
Date: Wed Oct 29 13:33:44 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> (Phillip J. Eby's message of "Wed, 29 Oct 2003 13:25:52 -0500")
References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>
Message-ID: <2mu15rki62.fsf@starship.python.net>

"Phillip J. Eby" writes:

> * Direct use of positional arguments on the stack as the "locals" of
> the next function called, without creating (and then unpacking) an
> argument tuple, in the case where there are no */** arguments
> provided by the caller.

Already done, unless I misunderstand your idea. Well, the arguments might still get copied into the new frame's locals area but I'm pretty sure no tuple is involved.

Cheers,
mwh

--
That being done, all you have to do next is call free() slightly less often than malloc().
You may want to examine the Solaris system libraries for a particularly ambitious implementation of this technique.
    -- Eric O'Dell, comp.lang.dylan (& x-posts)

From pje at telecommunity.com Wed Oct 29 13:48:00 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 29 13:50:00 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <2mu15rki62.fsf@starship.python.net>
References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>
Message-ID: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com>

At 06:33 PM 10/29/03 +0000, Michael Hudson wrote:
>"Phillip J. Eby" writes:
>
> > * Direct use of positional arguments on the stack as the "locals" of
> > the next function called, without creating (and then unpacking) an
> > argument tuple, in the case where there are no */** arguments
> > provided by the caller.
>
>Already done, unless I misunderstand your idea. Well, the arguments
>might still get copied into the new frame's locals area but I'm pretty
>sure no tuple is involved.

Hm. I thought that particular optimization could only take place when the function lacks default arguments. But maybe I've misread that part. If it's true in all cases, then argument tuple creation isn't where the overhead is coming from.

Anyway... it wouldn't be a good thesis idea if the answer were as obvious as my speculations, would it? ;)

From wtrenker at shaw.ca Wed Oct 29 07:13:35 2003
From: wtrenker at shaw.ca (William Trenker)
Date: Wed Oct 29 14:16:46 2003
Subject: [Python-Dev] Weeding thru the PEPs
Message-ID: <20031029121335.00087ef6.wtrenker@shaw.ca>

Hello Python gurus!

I've been learning a lot about Python by following you folks here. Lots of headscratching on my part, but slowly the elegance and utility of Python is sinking in. I've been going thru the PEPs on the Python site.
Since I don't live and breathe with the PEPs like you do, I'm having a bit of a problem seeing the forest for the trees. Specifically, those PEPs which are most active or current are not 'popping off the page' in the PEP index. Is there a view of the PEP index available that is sorted by the date each PEP was last edited? I've looked at the listing of the PEP's in CVS, sorted by Age. That's pretty close but the CVS listing doesn't show the title or status of each PEP. Just wondering, Bill From jeremy at alum.mit.edu Wed Oct 29 16:23:57 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 29 16:28:19 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> Message-ID: <1067462637.24165.7.camel@localhost.localdomain> On Mon, 2003-10-20 at 14:08, Guido van Rossum wrote: > > What I remember you saying was that it was an unfortunate but necessary > > consequence so that it would work the same as > > > > L = [] > > for x in R: > > L.append(x) > > print x > > > > You didn't want to have different semantics for two such similar > > constructs ("there's only one way"). You also didn't want to push a > > stack frame for listcomps. > > Then I guess I *have* changed my mind. I guess I didn't think of the > renaming solution way back when. Not to make a big deal out of it, but I just checked on the first report of this problem that I remember. David Beazley reported this problem on python-dev a couple of years ago and suggested the renaming solution. http://mail.python.org/pipermail/python-dev/2001-May/015089.html I'm sure we talked about the problem, but since I was talking I probably said something about a nested scopes solution <0.3 wink>. 
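[A quick, runnable illustration of the behaviour under discussion: on any Python where the renaming/own-scope fix for comprehensions is in place (it eventually landed for list comprehensions in Python 3.0), the second assertion below holds, while a plain for loop still binds its variable in the enclosing scope.]

```python
# A for loop binds its loop variable in the enclosing scope...
x = "outer"
result = []
for x in range(3):
    result.append(x)
assert x == 2  # ...so the variable leaks out of the loop.

# A list comprehension with its own (effectively renamed) variable
# leaves the enclosing binding untouched.
y = "outer"
squares = [y * y for y in range(3)]
assert squares == [0, 1, 4]
assert y == "outer"  # no leak once the fix is in place
```

In 2.x, before the fix, the comprehension leaked its variable exactly like the for loop does.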
In that thread, Tim did some effective channeling and said the day you approved a solution based on lambda was the day you'd kill us all. Jeremy From guido at python.org Wed Oct 29 16:38:57 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 16:39:28 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 29 Oct 2003 16:23:57 EST." <1067462637.24165.7.camel@localhost.localdomain> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> <1067462637.24165.7.camel@localhost.localdomain> Message-ID: <200310292139.h9TLcwX32493@12-236-54-216.client.attbi.com> > Tim did some effective channeling and said the day you approved > a solution based on lambda was the day you'd kill us all. Aargh! You're on to my evil plan! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Wed Oct 29 17:41:43 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Wed Oct 29 17:41:51 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> > From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] > > How about re-engineering the interpreter to make it more MP friendly? > (This is probably a bigger task than a Masters thesis.) The current > interpreter serializes on the global interpreter lock (GIL) and blocks > everything. To me this would probably be the most interesting thing to tackle - especially since it has been tried before with partial success but overall failure. At the very least that gives a body of work which you can refer to both as a starting point for your work, and to show how your approach differs from and improves on existing work.
It would also be of tremendous value to Python IMO if it could be done without negatively impacting performance on single-processor machines. Whether it is too large for a Masters thesis I don't know. Does a Masters thesis require *success* in the stated goal? I've been thinking about doing my own Masters in the not-too-distant future if I can find the time ... Tim Delaney From nas-python at python.ca Wed Oct 29 17:44:55 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 17:43:27 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> References: <20031028220953.GA25984@mems-exchange.org> <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> Message-ID: <20031029224455.GA30572@mems-exchange.org> On Wed, Oct 29, 2003 at 02:41:54PM +1300, Greg Ewing wrote: > There's no doubt that the current implementation of it is > unacceptably dangerous, but I haven't yet seen an argument > that convinces me that it couldn't be fixed if desired. Okay. Perhaps I am missing something but would fixing it be as simple as adding another field to the tp_as_buffer struct?

/* references returned by the buffer functions are valid while
 * the object remains alive */
#define PyBuffer_FLAG_SAFE 1

Then in stringobject.c (and elsewhere as appropriate):

static PyBufferProcs buffer_as_buffer = {
    (getreadbufferproc)buffer_getreadbuf,
    (getwritebufferproc)buffer_getwritebuf,
    (getsegcountproc)buffer_getsegcount,
    (getcharbufferproc)buffer_getcharbuf,
    PyBuffer_FLAG_SAFE,
};

Then change bufferobject so that it can only be created from objects that set PyBuffer_FLAG_SAFE.
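[For readers following this thread in a later Python: the memoryview object that eventually shipped (2.7/3.x) behaves the way a "safe" buffer provider has to behave: it holds a reference to its base object, and CPython refuses operations on the base that would invalidate the exported pointer. A sketch of that behaviour, offered only as an analogy; this is not the 2003 API being proposed above.]

```python
data = bytearray(b"hello")
view = memoryview(data)            # the view keeps `data` alive
assert bytes(view[:5]) == b"hello"

# Resizing the base while a buffer is exported would leave the
# exported pointer dangling, so CPython refuses it outright.
try:
    data.extend(b" world")
except BufferError:
    safely_refused = True
else:
    safely_refused = False
assert safely_refused

view.release()                     # after release, resizing is allowed again
data.extend(b" world")
assert data == bytearray(b"hello world")
```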
Neil From allison at sumeru.stanford.EDU Wed Oct 29 17:55:08 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Wed Oct 29 17:56:19 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> Message-ID: Measuring the size of a project is difficult. This one would require (I think) some significant out-of-the-box thinking. There are a number of resources which could be brought to bear in addition to Herlihy's work on synchronization, for example, Kourosh Gharachorloo's work on programming for the Stanford Dash MP where he toyed with the issues involved in building synchronization-independent (that is, lock-independent) programs. On Thu, 30 Oct 2003, Delaney, Timothy C (Timothy) wrote: > > From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] > > > > How about re-engineering the interpreter to make it more MP friendly? > > (This is probably a bigger task than a Masters thesis.) The current > > interpreter serializes on the global interpreter lock (GIL) and blocks > > everything. > > To me this would probably be the most interesting thing to tackle - especially since it has been tried before with partial success but overall failure. At the very least that gives a body of work which you can refer to both as a starting point for your work, and to show how your approach differs from and improves on existing work. > > It would also be of tremendous value to Python IMO if it could be done without negatively impacting performance on single-processor machines. > > Whether it is too large for a Masters thesis I don't know. Does a Masters thesis require *success* in the stated goal? I've been thinking about doing my own Masters in the not-too-distant future if I can find the time ...
> > Tim Delaney > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From guido at python.org Wed Oct 29 18:11:49 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 18:11:57 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: Your message of "Wed, 29 Oct 2003 14:44:55 PST." <20031029224455.GA30572@mems-exchange.org> References: <20031028220953.GA25984@mems-exchange.org> <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> <20031029224455.GA30572@mems-exchange.org> Message-ID: <200310292311.h9TNBn932586@12-236-54-216.client.attbi.com>

> Okay. Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1
>
> Then in stringobject.c (and elsewhere as appropriate):
>
> static PyBufferProcs buffer_as_buffer = {
>     (getreadbufferproc)buffer_getreadbuf,
>     (getwritebufferproc)buffer_getwritebuf,
>     (getsegcountproc)buffer_getsegcount,
>     (getcharbufferproc)buffer_getcharbuf,
>     PyBuffer_FLAG_SAFE,
> };
>
> Then change bufferobject so that it can only be created from objects
> that set PyBuffer_FLAG_SAFE.

I don't know if this is enough, but if it is, I'd recommend adding the flag bit to tp_flags rather than extending the buffer structure (since you'd need to allocate an extra bit for tp_flags anyway to indicate the longer buffer struct). --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Wed Oct 29 18:21:50 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed Oct 29 18:21:33 2003 Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031029224455.GA30572@mems-exchange.org> Message-ID: <087001c39e73$70333e60$0500a8c0@eden> Neil Schemenauer

> Okay. Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1
>
> Then in stringobject.c (and elsewhere as appropriate):
>
> static PyBufferProcs buffer_as_buffer = {
>     (getreadbufferproc)buffer_getreadbuf,
>     (getwritebufferproc)buffer_getwritebuf,
>     (getsegcountproc)buffer_getsegcount,
>     (getcharbufferproc)buffer_getcharbuf,
>     PyBuffer_FLAG_SAFE,
> };
>
> Then change bufferobject so that it can only be created from objects
> that set PyBuffer_FLAG_SAFE.

As the essence of the solution, I think that sounds good! I think that the following should also be done:

* Update the docs for the buffer functions to indicate that these are *short term* pointers, that are not guaranteed once *any* Python code is called.

* Add new public buffer functions with "LongTerm" in the name (and docs that the buffer is valid as long as the object). These check the flag as you propose.

* Buffer object uses new LongTerm buffer functions.

It points out that the buffer object itself is less at fault than the interface. I'm trying to short-circuit bugs in external extension modules that use the buffer functions without realizing the subtle assumptions made. Mark. From nas-python at python.ca Wed Oct 29 18:56:01 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 18:54:31 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <087001c39e73$70333e60$0500a8c0@eden> References: <20031029224455.GA30572@mems-exchange.org> <087001c39e73$70333e60$0500a8c0@eden> Message-ID: <20031029235600.GA30853@mems-exchange.org> On Thu, Oct 30, 2003 at 10:21:50AM +1100, Mark Hammond wrote: > As the essence of the solution, I think that sounds good! Thanks for the feedback.
It seems you are one of the few who are familiar with this interface.

> I think that the following should also be done:
>
> * Update the docs for the buffer functions to indicate that these are *short term* pointers, that are not guaranteed once *any* Python code is called.
>
> * Add new public buffer functions with "LongTerm" in the name (and docs that the buffer is valid as long as the object). These check the flag as you propose.
>
> * Buffer object uses new LongTerm buffer functions.

Seems easy enough. I'll make a patch. Neil From bac at OCF.Berkeley.EDU Wed Oct 29 20:05:03 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:05:13 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: <3F9F1F82.2090209@ocf.berkeley.edu> Message-ID: <3FA063BF.6050207@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>So, anyone have any ideas? The best one that I can think of is >>optional type-checking. I am fairly open to ideas, though, in almost >>any area involving language design. > > > Did you explicitly mean language *design*? Design/implementation. Basically something involving how a language either works or is created. > Because there might be > areas of research relevant to language implementation, in terms of > efficiency, portability, etc. > > Here are some suggestions: > - memory management: attempt to replace reference counting by > "true" garbage collection Maybe. Kind of happy with the way things work now, though. =) > - threading: attempt to provide free threading efficiently Wow, that would be a challenge, to say the least. Might be too much for just a masters thesis.
> - typing: attempt to provide run-time or static type inference, > and see whether this could be used to implement some byte codes > more efficiently (although there is probably overlap with the > specializing compilers) I was actually thinking of type-inference since I am planning on learning (or at least starting to learn) Standard ML next month. > - floating point: provide IEEE-794 (or some such) in a portable > yet efficient way You mean like how we have longs? So code up in C our own way of storing 794 independent of the CPU? > - persistency: provide a mechanism to save the interpreter state > to disk, with the possibility to restart it later (similar to > Smalltalk images) > Hmm. Interesting. Could be the start of continuations. > On language design, I don't have that many suggestions, as I think the > language itself should evolve slowly if at all: > - deterministic finalization: provide a way to get objects destroyed > implicitly at certain points in control flow; a use case would be > thread-safety/critical regions I think I get what you mean by this, but I am not totally sure since I can't come up with a use beyond threads killing themselves properly when the whole program is shutting down. > - attributes: provide syntax to put arbitrary annotations to > functions, classes, and class members, similar to .NET > attributes. Use that facility to implement static and class methods, > synchronized methods, final methods, web methods, transactional > methods, etc (yes, there is a proposal, but nobody knows whether it > meets all requirements - nobody knows what the requirements are) Have no clue what this is since I don't know C#. Almost sounds like Michael's def func() [] proposal at the method level. Or just a lot of descriptors. =) Time to do some Googling. > - interfaces (this may go along with optional static typing) > Yeah, but that is Alex's baby. Thanks for the suggestions, Martin.
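[The .NET-style "attributes" Martin describes map fairly directly onto what descriptors, and from Python 2.4 on decorator syntax, provide. A toy sketch of the "synchronized method" case; the names here (synchronized, Counter) are illustrative only, not any proposed API.]

```python
import functools
import threading

def synchronized(func):
    # Serialize calls on a per-function lock, and record the
    # "attribute" (in the .NET sense) on the function object itself.
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    wrapper.synchronized = True  # the annotation itself
    return wrapper

class Counter:
    def __init__(self):
        self.value = 0

    @synchronized
    def bump(self):
        self.value += 1  # not atomic; the lock makes it safe

counter = Counter()
threads = [threading.Thread(target=lambda: [counter.bump() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter.value == 4000
assert Counter.bump.synchronized
```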
-Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:15:23 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:16:17 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031029085918.Y14453@prim.han.de> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029085918.Y14453@prim.han.de> Message-ID: <3FA0662B.8070805@ocf.berkeley.edu> Holger Krekel wrote: > Hi Brett, > > Brett C. wrote: > >>Today I got the wheels turning on my masters thesis by getting an >>adviser. Now I just need a topic. =) The big goal is to do something >>involving Python for a thesis to be finished by fall of next year (about >>October) so as to have it done, hopefully published (getting into LL4 >>would be cool), and ready to be used for doctoral applications come >>January 2005. >> >>So, anyone have any ideas? The best one that I can think of is optional >>type-checking. I am fairly open to ideas, though, in almost any area >>involving language design. > > > Maybe you have heard of PyPy, a reimplementation of Python in Python. > We are employing quite some innovative approaches to language design > and implementation and there are certainly a lot of open research > areas. See our OSCON 2003 paper > > http://codespeak.net/pypy/index.cgi?doc/oscon2003-paper.html > Read a while back. I keep an eye on PyPy from a distance by reading the stuff you guys put out. > or two interesting chapters out of our European Union proposal > > http://codespeak.net/pypy/index.cgi?doc/funding/B1.0 > http://codespeak.net/pypy/index.cgi?doc/funding/B6.0 > I will have a read. > You are welcome to discuss stuff on e.g. the IRC channel #pypy > on freenode Nuts. Guess I can't keep my use of IRC down to PyCon discussions. =) > or on the mailing list > > http://codespeak.net/mailman/listinfo/pypy-dev > > in order to find out, if you'd like to join us and/or do some > interesting thesis. > Will do. Thanks, Holger. 
-Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:28:13 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:28:22 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031029163540.GA28700@mems-exchange.org> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <3FA0692D.30209@ocf.berkeley.edu> Neil Schemenauer wrote: > Hi Brett, > > Some ideas: > > * Finish off the AST compiler. Make it possible to manipulate > ASTs from Python and allow them to be fed to the compiler to > generate code. This is one half of macros for Python. The > other half is harder. > I actually wanted to originally do that, but there is no real research involved; it's just coding at this point, right? > * Build a refactoring code editor that works using the AST. > Would probably require the AST to be done. > * Implement an object system that supports multiple dispatch. > You can look at Dylan and Goo for ideas. > Huh, cool. Just looked at Dylan quickly. > * Optimize access to global variables and builtins. See PEP 267 for > some ideas. If we can disallow inter-module shadowing of names > the job becomes easier. Measure the performance difference. > ... and watch my head explode from reading the latest threads. =) Maybe, though. > * Look at making the GC mark-and-sweep. You will need to provide > it explicit roots. Is it worth doing? Mark-and-sweep would > require changes to extension modules since they don't expose > roots to the interpreter. > I don't know if it is worth it, although having so far two people suggest changing the GC to something else is interesting. > * More radically, look at Chicken [1] and its GC. Henry Baker's > "Cheney on the M.T.A" [2] is very clever, IMHO, and could be used > instead of Python's reference counting. Build a limited Python > interpreter based on this idea and evaluate it. > > > 1. http://www.call-with-current-continuation.org/chicken.html > 2.
http://citeseer.nj.nec.com/baker94cons.html I will have a read. Thanks, Neil. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:38:05 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:38:11 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: Message-ID: <3FA06B7D.2050607@ocf.berkeley.edu> Dennis Allison wrote: > How about re-engineering the interpreter to make it more MP friendly? > (This is probably a bigger task than a Masters thesis.) The current > interpreter serializes on the global interpreter lock (GIL) and blocks > everything. Is there another approach which would allow processing to > continue? Guido said once that there was an attempt to change the > granularity of the locking, but that it quickly became overly complex and > unstable. Perhaps some of Maurice Herlihy's ideas may be adapted to the > problem. Moreover, it may not be necessary that the interpreter state be > consistent and deterministic all the time as long as it eventually > produces the same answer as a deterministic equivalent. There may be > interpreter organizations which move forward optimistically, ignoring > potential locking problems and then (if necessary) recovering, and these > may have better performance than the more conservative ones. Or they may > not. Some kind of performance tests and evaluations would need to be > part of any such study. > As you said, Dennis, this might be too big for a masters thesis. But it definitely would be nice to have solved. I will definitely think about it. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:47:49 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Oct 29 20:55:49 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> Message-ID: <3FA06DC5.70407@ocf.berkeley.edu> Delaney, Timothy C (Timothy) wrote: >> From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] >> >> How about re-engineering the interpreter to make it more MP >> friendly? (This is probably a bigger task than a Masters thesis.) >> The current interpreter serializes on the global interpreter lock >> (GIL) and blocks everything. > > > To me this would probably be the most interesting thing to tackle - > especially since it has been tried before with partial success but > overall failure. At the very least that gives a body of work which > you can refer to both as a starting point for your work, and to show > how your approach differs from and improves on existing work. > > It would also be of tremendous value to Python IMO if it could be > done without negatively impacting performance on single-processor > machines. > > Whether it is too large for a Masters thesis I don't know. Does a > Masters thesis require *success* in the stated goal? I've been > thinking about doing my own Masters in the not-too-distant future if > I can find the time ... > Success as in what you set out to do was actually beneficial? No, just as long as something is learned. Successful as actually finishing the darn thing? Yes. Basically a masters thesis needs to require some research, such as looking at other implementations, and some original thought if possible. 
The problem with a masters thesis, though, is that I have a fixed timeframe (want this done in about a year's time for doctoral school applications) and I don't get to spend a large portion of my time on it (I still have to take normal classes during this time, although I can finagle my schedule to minimize my work load). I will still consider this, though. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 21:01:56 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 21:02:00 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> Message-ID: <3FA07114.4050009@ocf.berkeley.edu> Just a quick "thank you!" to everyone who has emailed me, personally or publicly, with ideas. There have been a ton of great suggestions and I am going to seriously consider all of them. And this thanks stands indefinitely for any and all future emails on this subject. And please keep sending ideas! Even if I don't pick up on a certain idea maybe someone else will be inspired and decide to run with it or at least start a discussion on possible future improvements (there is always my doctoral thesis in a few years =). I can't believe I just said more discussion on this list was good that I know will most likely take on a life of their own. I guess I really do want to lose my 20/20 vision. =) I also think this thread is a testament to this community in general and this list specifically on how we help others when we can and in the nicest way possible. I have to admit I say with great pride that I am a part of this wonderful community. -Brett From greg at cosc.canterbury.ac.nz Wed Oct 29 21:30:18 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 29 21:31:28 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <087001c39e73$70333e60$0500a8c0@eden> Message-ID: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Neil Schemenauer: > Okay.
> Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1

That's completely different from what I had in mind, which was: (1) Keep a reference to the base object in the buffer object, and (2) Use the buffer API to fetch a fresh pointer from the base object each time it's needed. Is there some reason that still wouldn't be safe enough?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From allison at sumeru.stanford.EDU Wed Oct 29 22:21:39 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Wed Oct 29 22:24:01 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA07114.4050009@ocf.berkeley.edu> Message-ID: Brett -- You might put together a list of all the ideas (maybe even a ranked list) and post it as a unit to the list for archival purposes. Thanks. On Wed, 29 Oct 2003, Brett C. wrote: > > > Just a quick "thank you!" to everyone who has emailed me, personally or > publicly, with ideas. There have been a ton of great suggestions and I > am going to seriously consider all of them. And this thanks stands > indefinitely for any and all future emails on this subject. > > And please keep sending ideas! Even if I don't pick up on a certain > idea maybe someone else will be inspired and decide to run with it or at > least start a discussion on possible future improvements (there is > always my doctoral thesis in a few years =). I can't believe I just > said more discussion on this list was good that I know will most likely > take on a life of their own. I guess I really do want to lose my 20/20
=) > > I also think this thread is a testament to this community in general and > this list specifically on how we help others when we can and in the > nicest way possible. I have to admit I say with great pride that I am a > part of this wonderful community. > > -Brett > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From guido at python.org Wed Oct 29 22:39:33 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 22:39:54 2003 Subject: [Python-Dev] Needed: contractor to answer crypto questions Message-ID: <200310300339.h9U3dYP00412@12-236-54-216.client.attbi.com> I was approached by a legal firm with the questions below about Python's crypto capabilities, from the POV of a legal review of exporting software that embeds Python. I don't have time to research the answers myself (I'm no crypto expert). If you think you can answer the questions, please send me a price quote and I'll forward it to them. They'd like the answers ASAP. --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message > > Hello Guido, [...] > > I understand Python is open source, but when open source code is > integrated in a commercial product, the owner of the commercial product > must include the open source code in their product analysis for U.S. > export classification purposes. Although as open source, Python falls > under an export control exception, this exception is lost once the code is > offered in a commercial product. > > I would appreciate your help in obtaining some additional technical > information in order to complete my export classification analysis. [...] > > 1. We have been advised the following encryption content is in Python. > We are looking for additional information regarding the encryption > content: > a. 
> The Rotor module, which implements a very ancient encryption algorithm based on the German Enigma. Please tell us the symmetric key length of the encryption contained within this module. Please also advise the asymmetric key exchange algorithm length.
> b. The wrapper module for Open SSL. Again, please tell us the symmetric key length of the encryption content contained within this module. Please also advise the asymmetric key exchange algorithm length.
> c. The following questions apply to both the Rotor module and the wrapper module:
> i. can the encryption function be directly accessed, or modified, by the end user?
> ii. Do either of these encryption components contain an "Open Cryptographic Interface" (an interface that is not fixed and permits a third party to insert encryption functionality)?
>
> The following chart is an example of the type of information I need to submit to the U.S. government. Would you be able to provide similar information regarding the encryption component(s) included within Python?
>
> EXAMPLE:
>
> Algorithm Source Key-min Key-max Modes
> RC2 OpenSSL 40 128 CBC, ECB, CFB, OFB
> ARC4 OpenSSL 40 128 N/A (stream encryption)
> DES OpenSSL 40 56 CBC, ECB, CFB, OFB
> DESX OpenSSL 168 168 CBC
> 3DES-2Key OpenSSL 112 112 CBC, ECB, CFB, OFB
> 3DES OpenSSL 168 168 CBC, ECB, CFB, OFB
> Blowfish OpenSSL 128 CBC, ECB, CFB, OFB
> Diffie-Hellman OpenSSL 192* 16384* Key-exchange, authentication
>
> DSA OpenSSL Digital Signature
> MD5 OpenSSL Integrity
> SHA-1 OpenSSL Integrity
> * No explicit limit, these appear to be the practical range of values.
[...]
------- End of Forwarded Message From nas-python at python.ca Wed Oct 29 23:15:08 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 23:13:40 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0692D.30209@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3FA0692D.30209@ocf.berkeley.edu> Message-ID: <20031030041508.GA31371@mems-exchange.org> On Wed, Oct 29, 2003 at 05:28:13PM -0800, Brett C. wrote: > Neil Schemenauer wrote: > > * Finish off the AST compiler. > > I actually wanted to originally do that, but there is no real research > involved; it's just coding at this point, right? Right. It's a prerequisite to doing real research. See Jeremy's web log. If you don't want to finish the AST compiler you could just use the Python implementation. It would be slow but good enough for testing ideas. > Huh, cool. Just looked at Dylan quickly. The reference manual is good reading: http://www.gwydiondylan.org/drm/drm_1.htm Some of the parts I like are the builtin classes (numbers and sealing especially) and the collection protocols. The module and library system is also interesting (although overkill for many programs). > > * Look at making the GC mark-and-sweep. > > I don't know if it is worth it, although having so far two people > suggest changing the GC to something else is interesting. Implementing yet another M&S GC is not research, IMHO. What _would_ be interesting is comparing the performance of reference counting and a mark and sweep collector. CPU, cache and memory speeds have changed quite dramatically. Also, comparing how easily the runtime can be integrated with the rest of the world (e.g. C libraries) would also be valuable. That said, I'm not sure it's worth it either. I find the Chicken GC more interesting and would look into that further if I had the time.
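[The qualitative difference Neil is pointing at, prompt reclamation under CPython's reference counting versus deferred reclamation of cycles by the tracing collector, can be observed directly from Python; a small sketch using the gc and weakref modules:]

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # isolate refcounting from the cycle collector

# Acyclic garbage dies the instant its refcount hits zero.
n = Node()
r = weakref.ref(n)
del n
assert r() is None

# A reference cycle keeps itself alive under pure refcounting...
a, b = Node(), Node()
a.other, b.other = b, a
ra = weakref.ref(a)
del a, b
assert ra() is not None

# ...until a tracing (mark-style) collection runs.
gc.collect()
assert ra() is None
gc.enable()
```

This behaviour is CPython-specific; a runtime with a pure tracing collector would reclaim both cases only at collection time.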
Neil From jeremy at alum.mit.edu Wed Oct 29 23:24:10 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 29 23:26:47 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> Message-ID: <1067487850.24165.53.camel@localhost.localdomain> On Wed, 2003-10-29 at 13:48, Phillip J. Eby wrote: > At 06:33 PM 10/29/03 +0000, Michael Hudson wrote: > >"Phillip J. Eby" writes: > > > > > * Direct use of positional arguments on the stack as the "locals" of > > > the next function called, without creating (and then unpacking) an > > > argument tuple, in the case where there are no */** arguments > > > provided by the caller. > > > >Already done, unless I misunderstand your idea. Well, the arguments > >might still get copied into the new frame's locals area but I'm pretty > >sure no tuple is involved. > > Hm. I thought that particular optimization only could take place when the > function lacks default arguments. But maybe I've misread that part. If > it's true in all cases, then argument tuple creation isn't where the > overhead is coming from. There is an optimization that depends on having no default arguments (or keyword arguments or free variables). It copies the arguments directly from the caller's frame into the callee's frame without creating an argument tuple. It's interesting to avoid the copy from caller to callee, but I don't think it's a big cost relative to everything else we're doing to set up a frame for calling. (I expect the number of arguments is usually small.) You would need some way to encode what variables are loaded from the caller stack and what variables are loaded from the current frame. Either a different opcode or some kind of flag in the current LOAD/STORE argument. 
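[The relative costs being discussed here can be reproduced from pure Python with timeit: a plain call builds a fresh frame, while resuming a generator reuses its existing one. Absolute numbers vary by machine and version, so treat this only as a way to observe the gap, not as a benchmark.]

```python
import timeit

setup = """
def f(a, b, c):
    return a

def gen():
    while True:
        yield None

g = gen()
next(g)
"""

call_time = timeit.timeit("f(1, 2, 3)", setup=setup, number=100000)
resume_time = timeit.timeit("next(g)", setup=setup, number=100000)

# Both measurements are wall-clock seconds; resuming the generator
# skips most of the per-call frame construction work.
print("call:  ", call_time)
print("resume:", resume_time)
assert call_time > 0 and resume_time > 0
```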
One other possibility for optimization is this XXX comment in fast_function():

    /* XXX Perhaps we should create a specialized PyFrame_New() that doesn't
       take locals, but does take builtins without sanity checking them. */
    f = PyFrame_New(tstate, co, globals, NULL);

PyFrame_New() does a fair amount of work that is unnecessary in the common case. Jeremy From python at rcn.com Thu Oct 30 00:19:38 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 30 00:21:28 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> Message-ID: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> [Guido's code]

> unsorted = (1, 10, 2)
> print MagicList.sorted(unsorted)
> print MagicList(unsorted).sorted()
> print SubClass.sorted(unsorted)
> print SubClass(unsorted).sorted()

Notwithstanding the "perverted" implementation, Alex's idea is absolutely wonderful and addresses a core usability issue with classmethods. If only in the C API, I would like to see just such a universalmethod alternative to classmethod. That would allow different behaviors to be assigned depending on how the method is called.
Both list.sort() and dict.fromkeys() would benefit from it:

class MagicDict(dict):

    def _class_fromkeys(cls, lst, value=True):
        "Make a new dict using keys from list and the given value"
        obj = cls()
        for elem in lst:
            obj[elem] = value
        return obj

    def _inst_fromkeys(self, lst, value=True):
        "Update an existing dict using keys from list and the given value"
        for elem in lst:
            self[elem] = value
        return self

    newfromkeys = MagicDescriptor(_class_fromkeys, _inst_fromkeys)

print MagicDict.newfromkeys('abc')
print MagicDict(a=1, d=2).newfromkeys('abc')

An alternative implementation is to require only one underlying function and to have it differentiate the cases based on obj and cls:

class MagicDict(dict):

    def newfromkeys(obj, cls, lst, value=True):
        if obj is None:
            obj = cls()
        for elem in lst:
            obj[elem] = value
        return obj

    newfromkeys = universalmethod(newfromkeys)

Raymond Hettinger From bac at OCF.Berkeley.EDU Thu Oct 30 00:30:56 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 00:31:01 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: Message-ID: <3FA0A210.10605@ocf.berkeley.edu> Dennis Allison wrote: > Brett -- > > You might put together a list of all the ideas (maybe even a ranked list) > and post it as a unit to the list for archival purposes. Thanks. > Way ahead of you, Dennis. I have already started to come up with a reST doc for writing up all of these suggestions. It just might be a little while before I get it up since I will need to do some preliminary research on each idea to gauge how much work each will be. -Brett From guido at python.org Thu Oct 30 00:31:01 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 00:32:00 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 00:19:38 EST."
<001501c39ea5$6a47f900$45ba2c81@oemcomputer> References: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> Message-ID: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> > Notwithstanding the "perverted" implementation, Alex's idea is > absolutely wonderful and addresses a core usability issue with > classmethods. I'm not so sure. I think the main issue is that Python users aren't used to static methods; C++ and Java users should be familiar with them and I don't think they cause much trouble there. > If only in the C API, I would like to see just such a universalmethod > alternative to classmethod. That would allow different behaviors to be > assigned depending on how the method is called. > > Both list.sort() and dict.fromkeys() would benefit from it:
>
> class MagicDict(dict):
>
>     def _class_fromkeys(cls, lst, value=True):
>         "Make a new dict using keys from list and the given value"
>         obj = cls()
>         for elem in lst:
>             obj[elem] = value
>         return obj
>
>     def _inst_fromkeys(self, lst, value=True):
>         "Update an existing dict using keys from list and the given value"
>         for elem in lst:
>             self[elem] = value
>         return self
>
>     newfromkeys = MagicDescriptor(_class_fromkeys, _inst_fromkeys)
>
> print MagicDict.newfromkeys('abc')
> print MagicDict(a=1, d=2).newfromkeys('abc')

But your _inst_fromkeys mutates self! That completely defeats the purpose (especially since it also returns self) and I am as much against this (approx. -1000 :-) as I am against sort() returning self. To me this pretty much proves that this is a bad idea; such a schizo method will confuse users more than a class method that ignores the instance. And if you made an honest mistake, and meant to ignore the instance, it still proves that this is too confusing to do! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Oct 30 00:39:41 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Oct 30 00:39:48 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031030041508.GA31371@mems-exchange.org> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3FA0692D.30209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> Message-ID: <3FA0A41D.2070601@ocf.berkeley.edu> Neil Schemenauer wrote: > On Wed, Oct 29, 2003 at 05:28:13PM -0800, Brett C. wrote: > >>Neil Schemenauer wrote: >> >>> * Finish off the AST compiler. >> >>I actually wanted to originally do that, but there is no real research >>involved; it's just coding at this point, right? > > > Right. It's a prerequisite to doing real research. See Jeremy's web > log. If you don't want to finish the AST compiler you could just > use the Python implementation. It would be slow but good enough for > testing ideas. > Yeah, I read that. Too bad I can't finish the AST branch *and* do something with it. > >>Huh, cool. Just looked at Dylan quickly. > > > The reference manual is good reading: > > http://www.gwydiondylan.org/drm/drm_1.htm > > Some of the parts I like are the builtin classes (numbers and > sealing especially) and the collection protocols. The module and > library system is also interesting (although overkill for many > programs). > So many languages to learn! Happen to have a book recommendation? > >>> * Look at making the GC mark-and-sweep. >> >>I don't know if it is worth it, although having so far two people >>suggest changing the GC to something else is interesting. > > > Implementing yet another M&S GC is not research, IMHO. What > _would_ be interesting is comparing the performance of reference > counting and a mark and sweep collector. CPU, cache and memory > speeds have changed quite dramatically. Also, comparing how easily > the runtime can be integrated with the rest of the world (e.g. C > libraries) would also be valuable. > That is a possibility.
Depends if anyone else has done a comparison lately. Seems like this may have been done to death, though. > That said, I'm not sure it's worth it either. I find the Chicken GC > more interesting and would look into that further if I had the time. > I just like the name. =) That and the title of that paper, "Cheney on the M.T.A", cause the humorist in me to want to look at this further, so I will definitely be reading that paper. -Brett From python at rcn.com Thu Oct 30 00:49:53 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 30 00:50:50 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> Message-ID: <000501c39ea9$a414e400$8cb6958d@oemcomputer> [GvR] > But your _inst_fromkeys mutates self! That issue wasn't intrinsic to the proposal. The implementation should have read:

class MagicDict(dict):

    def newfromkeys(obj, cls, lst, value=True):
        "Returns a new MagicDict with the keys in lst set to value"
        if obj is not None:
            cls = obj.__class__
        newobj = cls()
        for elem in lst:
            newobj[elem] = value
        return newobj

    newfromkeys = universalmethod(newfromkeys)

Universal methods give the method a way to handle the two cases separately. This provides both the capability to make an instance from scratch and to copy one off an existing instance. Your example was especially compelling:

a = [3,2,1]
print a.sorted()
print list.sorted(a)

Raymond Hettinger From martin at v.loewis.de Thu Oct 30 02:44:42 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Oct 30 02:44:52 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA063BF.6050207@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <3FA063BF.6050207@ocf.berkeley.edu> Message-ID: <3FA0C16A.8030203@v.loewis.de> Brett C.
wrote: >> - floating point: provide IEEE-794 (or some such) in a portable >> yet efficient way > > > You mean like how we have longs? So code up in C our own way of storing > 794 independent of the CPU? Not longs, but floats. And you would not attempt to store it independent of the CPU, but instead, you would make as much use of the CPU as possible, and only implement things in C that the CPU gets wrong. The portion of emulation would vary from CPU to CPU. As a starting point, you might look at the Java strictfp mode (which was only added after the initial Java release). Java 1.0 was where Python is today: expose whatever the platform provides. In Java, they have the much stronger desire to provide bit-for-bit reproducibility on all systems, so they added strictfp as a trade-off of performance vs. write-once-run-anywhere. >> - deterministic finalization: provide a way to get objects destroyed >> implicitly at certain points in control flow; a use case would be >> thread-safety/critical regions > > > I think you get what you mean by this, but I am not totally sure since I > can't come up with a use beyond threads killing themselves properly when > the whole program is shutting down. Not at all. In Python, you currently do

def bump_counter(self):
    self.mutex.acquire()
    try:
        self.counter = self.counter+1
        more_actions()
    finally:
        self.mutex.release()

In C++, you do

void bump_counter(){
    MutexAcquisition acquire(this);
    this->counter+=1;
    more_actions();
}

I.e. you can acquire the mutex at the beginning (as a local object), and it gets destroyed automatically at the end of the function. So they have the "resource acquisition is construction, resource release is destruction" design pattern. This is very powerful and convenient, and works almost in CPython, but not in Python - as there is no guarantee when objects get destroyed. > Have no clue what this is since I don't know C#. Almost sounds like > Michael's def func() [] proposal at the method level.
Or just a lot of descriptors. =) Yes, the func()[] proposal might do most of it. However, I'm uncertain whether it puts in place all pieces of the puzzle - one would actually have to try to use that stuff to see whether it really works sufficiently. You would have to set goals first (what is it supposed to do), and then investigate whether these things can actually be done with it. As I said: static, class, synchronized, final methods might all be candidates; perhaps along with some of the .NET features, like security evidence check (caller must have permission to write files in order to call this method), webmethod (method is automagically exposed as a SOAP/XML-RPC method), etc. Regards, Martin From martin at v.loewis.de Thu Oct 30 02:51:19 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Oct 30 02:51:42 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA06DC5.70407@ocf.berkeley.edu> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> <3FA06DC5.70407@ocf.berkeley.edu> Message-ID: <3FA0C2F7.40409@v.loewis.de> Brett C. wrote: >> Whether it is too large for a Masters thesis I don't know. Does a >> Masters thesis require *success* in the stated goal? I've been >> thinking about doing my own Masters in the not-too-distant future if >> I can find the time ... >> > > > Success as in what you set out to do was actually beneficial? No, just > as long as something is learned. Successful as actually finishing the > darn thing? Yes. He actually meant "success in the stated goal". I.e. if you go out to implement free threading, would it be considered a failure of the Master's project if you come back and say: "I did not actually do that"? My answer is "it depends": If you did not do that, and, for example, explain why it *can't* be done, then this is a good thesis, provided you give qualified scientific rationale for why it can't be done.
If you say you did not do it, but it could be done in this and that way if you had 50 person years available, then this could be a good thesis as well, provided the strategy you outline, and the rationale for computing the 50 person years, is convincing. If you just say, "Oops, I did not finish it because it is too much work", then this would be a bad thesis. Regards, Martin From s.keim at laposte.net Thu Oct 30 03:18:37 2003 From: s.keim at laposte.net (s.keim) Date: Thu Oct 30 03:19:38 2003 Subject: [Python-Dev] Buffer object API Message-ID:

> Greg Ewing:
> That's completely different from what I had in mind, which was:
>
> (1) Keep a reference to the base object in the buffer object, and
>
> (2) Use the buffer API to fetch a fresh pointer from the
> base object each time it's needed.
>
> Is there some reason that still wouldn't be safe enough?

I don't know if this can help, but I once created an object with this behaviour; you can get it at: http://s.keim.free.fr/mem/ (see the memslice module) From my experience, this solves all the problems caused by the buffer object. Sébastien Keim From aleaxit at yahoo.com Thu Oct 30 03:59:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 03:59:39 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <000501c39ea9$a414e400$8cb6958d@oemcomputer> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> Message-ID: <200310300959.33587.aleaxit@yahoo.com> On Thursday 30 October 2003 06:49 am, Raymond Hettinger wrote: ... > Universal methods give the method a way to handle the two > cases separately. This provides both the capability to make > an instance from scratch or to copy it off an existing instance. >
> Your example was especially compelling:
>
> a = [3,2,1]
> print a.sorted()
> print list.sorted(a)

Actually, yes, it IS compelling indeed. Funny -- I was originally just brainstorming / musing out loud, never thought about this as a "real thing".
But now that it's matured a bit, I do feel sure -- from past experience with what puzzles Python newbies depending on their previous programming knowledge or lack thereof -- that if we had this it would *seriously* reduce the number of puzzlements we have to solve on c.l.py or help@python.org. Which IS strange, in a way, because I do not know of any existing language that has exactly this kind of construct -- a factory callable that you can call on either a type or an instance with good effect. Yet despite it not being previously familiar it DOES "feel natural". Of course, the 3 lines above would also work if sorted was an ordinary instancemethod, but that's just because a is a list instance; if we had some other sequence, say a generator expression, print list.sorted(x*x for x in a) would be just as sweet, and _that_ is the compelling detail IMHO. Trying to think of precedents: Numeric and gmpy have quite a few things like that, except they're (by necessity of the age of gmpy and Numeric) ordinary module functions AND instance methods. E.g.:

>>> gmpy.sqrt(33)
mpz(5)
>>> gmpy.mpz(33).sqrt()
mpz(5)
>>> gmpy.fsqrt(33)
mpf('5.74456264653802865985e0')
>>> gmpy.mpf(33).sqrt()
mpf('5.74456264653802865985e0')

as a module function, sqrt is the truncating integer square root, which is also a method of mpz instances (mpz being the integer type in gmpy). mpf (the float type in gmpy) has a sqrt method too, which is nontruncating -- that is also a module function, but, as such, it needs to be called fsqrt (sigh). I sure _would_ like to expose the functionality as mpf.sqrt(x) and mpz.sqrt(x) [would of course be more readable with other typenames than those 'mpf' and 'mpz', but that's another issue, strictly a design mistake of mine].
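The truncating vs. non-truncating split Alex describes for gmpy.sqrt/gmpy.fsqrt has a direct standard-library analogue today (an editorial aside in modern Python; math.isqrt was added in Python 3.8):

```python
import math

# gmpy.sqrt / mpz.sqrt truncate to the integer square root;
# gmpy.fsqrt / mpf.sqrt return the floating-point value.
print(math.isqrt(33))  # 5
print(math.sqrt(33))   # roughly 5.7446
```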
Alex From aleaxit at yahoo.com Thu Oct 30 04:05:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 04:06:03 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0A41D.2070601@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> <3FA0A41D.2070601@ocf.berkeley.edu> Message-ID: <200310301005.31669.aleaxit@yahoo.com> On Thursday 30 October 2003 06:39 am, Brett C. wrote: ... > >>Huh, cool. Just looked at Dylan quickly. ... > So many languages to learn! Happen to have a book recommendation? Besides the reference manual, which is also available as a book, there's a good book called "Dylan Programming", Addison-Wesley, Feinberg et al. There's a firm somewhat misleadingly called something like "functional programming" (misleadingly because Dylan's not a FP language...) which focuses on Dylan and used to have both books (reference and Feinberg) in stock and available for decently discounted prices, too. Alex From aleaxit at yahoo.com Thu Oct 30 04:24:09 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 04:24:16 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> References: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> Message-ID: <200310301024.09591.aleaxit@yahoo.com> On Thursday 30 October 2003 06:31 am, Guido van Rossum wrote: > > Notwithstanding the "perverted" implementation, Alex's idea is > > absolutely wonderful and addresses a core usability issue with > > classmethods. > > I'm not so sure. I think the main issue is that Python users aren't > used to static methods; C++ and Java users should be familiar with > them and I don't think they cause much trouble there. "Yes, but". The ability to call something on either the class OR the instance IS a Python hallmark... 
with the constraint that when you call it on the class you need to provide an instance as the first arg (assuming the "something" is a normal instance method, which is surely the most frequent case). You could see universalmethods as being special only in that they WEAKEN this constraint -- they let the 1st arg be EITHER an instance OR something from which a new instance can be naturally constructed. Use cases: in gmpy: if I had universal methods, current instancemethods mpf.sqrt and mpz.sqrt (multiprecision floatingpoint and integer/truncating square roots respectively) could also be called quite naturally as mpf.sqrt(33) and mpz.sqrt(33) respectively. Right now you have to use, instead, mpf(33).sqrt() or mpz(33).sqrt(), which is QUITE a bit more costly because the instance whose sqrt you're taking gets built then immediately discarded (and building mpf -- and to a lesser extent mpz -- instances is a bit costly); OR you can call module functions gmpy.sqrt(33), truncating sqrt, or gmpy.fsqrt(33), nontruncating (returning a multiprecision float). Just one example -- gmpy's chock full of things like this, which universalmethod would let me organize a bit better. in Numeric: lots of module-functions take an arbitrary iterable, build an array instance from it if needed, and operate on an array instance to return a result. This sort-of-works basically because Numeric has "one main type" and thus the issue of "which type are we talking about" doesn't arise (gmpy has 3 types, although mpz takes the lion's share, making things much iffier). But still, Numeric newbies (if they come from OO languages rather than Fortran) DO try calling e.g. x.transpose() for some array x rather than the correct Numeric.transpose(x) -- and in fact array.transpose, called on the class, would ALSO be a perfectly natural approach.
universalmethod would allow array instances to expose such useful functionality as instance methods AND also allow applying direct operations -- without costly construction of intermediate instances to be thrown away at once -- via "array.transpose" and the like. It's not really about new revolutionary functionality: it's just a neater way to "site" existing functionality. This isn't surprising if you look at universalmethod as just a relaxation of the normal constraint "when you call someclass.somemethod(x, ... -- x must be an instance of someclass" into "x must be an instance of someclass OR -- if the somemethod supports it -- something from which such an instance could be constructed in one obvious way". Then, basically, the call is semantically equivalent to someclass(x).somemethod(... BUT the implementation has a chance to AVOID costly construction of an instance for the sole purpose of calling somemethod on it and then throwing away the intermediate instance at once. No revolution, but, I think, a nice little addition to our armoury. Alex From mwh at python.net Thu Oct 30 06:10:20 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:10:44 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> (Phillip J. Eby's message of "Wed, 29 Oct 2003 13:48:00 -0500") References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> Message-ID: <2mptgfj80z.fsf@starship.python.net> "Phillip J. Eby" writes: > At 06:33 PM 10/29/03 +0000, Michael Hudson wrote: >>"Phillip J. Eby" writes: >> >> > * Direct use of positional arguments on the stack as the "locals" of >> > the next function called, without creating (and then unpacking) an >> > argument tuple, in the case where there are no */** arguments >> > provided by the caller. 
>> >>Already done, unless I misunderstand your idea. Well, the arguments >>might still get copied into the new frame's locals area but I'm pretty >>sure no tuple is involved. > > Hm. I thought that particular optimization only could take place when > the function lacks default arguments. But maybe I've misread that > part. If it's true in all cases, then argument tuple creation isn't > where the overhead is coming from. I hadn't realized/had forgotten that this optimization depended on the lack of default arguments. Instinct would say that it shouldn't be *too* hard to extend to that case (hardly a thesis topic, at any rate :-). Cheers, mwh -- The only problem with Microsoft is they just have no taste. -- Steve Jobs, (From _Triumph of the Nerds_ PBS special) and quoted by Aahz Maruch on comp.lang.python From mwh at python.net Thu Oct 30 06:16:51 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:16:54 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <200310301005.31669.aleaxit@yahoo.com> (Alex Martelli's message of "Thu, 30 Oct 2003 10:05:31 +0100") References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> <3FA0A41D.2070601@ocf.berkeley.edu> <200310301005.31669.aleaxit@yahoo.com> Message-ID: <2mllr3j7q4.fsf@starship.python.net> Alex Martelli writes: > On Thursday 30 October 2003 06:39 am, Brett C. wrote: > ... >> >>Huh, cool. Just looked at Dylan quickly. > ... >> So many languages to learn! Happen to have a book recommendation? > > Besides the reference manual, which is also available as a book, there's > a good book called "Dylan Programming", Addison-Wesley, Feinberg et al. > > There's a firm somewhat misleadingly called something like "functional > programming" (misleadingly because Dylan's not a FP language...) which > focuses on Dylan and used to have both books (reference and Feinberg) > in stock and available for decently discounted prices, too. 
It was called "Functional Objects" -- and still is (I thought it was defunct). http://www.functionalobject.com Cheers, mwh -- This is an off-the-top-of-the-head-and-not-quite-sober suggestion, so is probably technically laughable. I'll see how embarassed I feel tomorrow morning. -- Patrick Gosling, ucam.comp.misc From mwh at python.net Thu Oct 30 06:18:35 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:18:39 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0A210.10605@ocf.berkeley.edu> (Brett C.'s message of "Wed, 29 Oct 2003 21:30:56 -0800") References: <3FA0A210.10605@ocf.berkeley.edu> Message-ID: <2mhe1rj7n8.fsf@starship.python.net> "Brett C." writes: > Dennis Allison wrote: > >> Brett -- >> You might put together a list of all the ideas (maybe even a ranked >> list) >> and post it as a unit to the list for archival purposes. Thanks. >> > > Way ahead of you, Dennis. I have already started to come up with a > reST doc for writing up all of these suggestions. It just might be a > little while before I get it up since I will need to do some > preliminary research on each idea to measure the amount of work they > will be. Could go on the Python Wiki? I take it from your posting of last week that you've thought about other ways of implementing exception handling? I guess a non-reference count based GC is a prerequisite for that... Cheers, mwh -- >> REVIEW OF THE YEAR, 2000 << It was shit. Give us another one. 
-- NTK Now, 2000-12-29, http://www.ntk.net/ From pedronis at bluewin.ch Thu Oct 30 07:43:04 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 30 07:41:23 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0692D.30209@ocf.berkeley.edu> References: <20031029163540.GA28700@mems-exchange.org> <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <5.2.1.1.0.20031030132333.028a56b0@pop.bluewin.ch> At 17:28 29.10.2003 -0800, Brett C. wrote: >> * Implement an object system that supports multiple dispatch. >> You can look at Dylan and Goo for ideas. > >Huh, cool. Just looked at Dylan quickly. some bits on this: implementing one is probably not too hard, apart from optimization, but possible/relevant directions are also then:

- integration with the preexisting Python semantics

- reflection. All of CLOS, Dylan, and Goo come with a rather low-level flavor of reflection; in contrast Python has a rather natural one. Once you have mmd, what kind of idioms using reflection can you think of, and how to best offer/package reflection for the language user?

- multi methods cover some ground also covered by interfaces and adaptation:
  *) a generic function/multi method is also an interface
  *) some of the things you can achieve with adaptation can be done with multi methods

Once you have multimethods, do you still need adaptation in some cases, or could one obtain the functionality otherwise, or do you need dispatch on interfaces (not just classes)? How would interfaces then look, and the dispatch on them?
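The multiple-dispatch idea Samuele sketches can be made concrete in a few lines. This is a hypothetical exact-match registry in modern Python syntax, not the CLOS/Dylan/Goo machinery; a serious implementation would at least walk each argument's MRO instead of requiring exact type matches:

```python
class multimethod:
    """Dispatch a call on the concrete types of all its arguments."""
    def __init__(self):
        self.registry = {}

    def register(self, *types):
        def decorator(func):
            self.registry[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # Exact-match lookup keyed by the tuple of argument types.
        func = self.registry.get(tuple(type(a) for a in args))
        if func is None:
            raise TypeError("no applicable method")
        return func(*args)

collide = multimethod()

@collide.register(int, int)
def _(a, b):
    return "int/int"

@collide.register(int, str)
def _(a, b):
    return "int/str"

print(collide(1, 2))    # int/int
print(collide(1, "x"))  # int/str
```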
(Cecil type system and predicate dispatch would be things to look at, for example) Samuele From mcherm at mcherm.com Thu Oct 30 08:01:18 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 30 08:01:22 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <1067518878.3fa10b9e91afb@mcherm.com> Raymond writes: > If only in the C API, I would like to see just such a universalmethod > alternative to classmethod. That would allow different behaviors to be > assigned depending on how the method is called. And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES of Python is that all calls are "the same". Calling a function works a particular way. Calling a callable object does the same, although the arguments are passed to __call__. Calling a classmethod does the same thing. Calling a bound method does the same thing except that the first argument is curried. Here in the Python community, we think it is a good thing that one must explicitly name "self" as an argument to methods, and that any function CAN be made a method of objects. Now you're proposing a special situation, where what appears to be a single attribute of a class object is actually TWO functions... two functions that have the same name but subtly different behavior. Right now, I presume that if I have:

a = A()                  # a is an instance of A
x = a.aMethod('abc')     # this is line 1
y = A.aMethod(a, 'abc')  # this is line 2

that line 1 and line 2 do the same thing. This is a direct consequence of the fact that methods in Python are just functions with an instance as the first argument. But your "universalmethod" would break this. It might be worth breaking it, if the result is some *very* readable code in a whole variety of situations. And it's certainly okay for Guido to manually create a class which behaves this way via black magic (and for most users, descriptor tricks are black magic).
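Michael's line-1/line-2 equivalence is easy to check directly (class A and aMethod are his hypothetical names; in modern Python the A.aMethod form is just a plain function, so the equivalence still holds):

```python
class A:
    def aMethod(self, s):
        return s.upper()

a = A()
x = a.aMethod('abc')     # "line 1": bound-method call
y = A.aMethod(a, 'abc')  # "line 2": plain function call, self passed explicitly
print(x, y)  # ABC ABC
assert x == y
```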
But to make it a regular and supported idiom, I'd want to see much better evidence that it's worthwhile, because there's an important principle at risk here, and I wouldn't want to trade away the ability to explain "methods" in two sentences:

    A 'method' is just a function whose first argument is 'self'. The
    method is an attribute of the class object, and when it is called
    using "a.method(args)", the instance 'a' is passed as 'self'.

for a cute way of making double use of a few factory functions. -- Michael Chermside From Boris.Boutillier at arteris.net Thu Oct 30 08:06:23 2003 From: Boris.Boutillier at arteris.net (Boris Boutillier) Date: Thu Oct 30 08:06:31 2003 Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ? In-Reply-To: <3F9F7A92.1050800@arteris.net> References: <3F9F7A92.1050800@arteris.net> Message-ID: <3FA10CCF.5020004@arteris.net> No answers on this? I posted the question two times on c.l.py and got no answers; help would be appreciated. Boris Boris Boutillier wrote: > Hi all, > > I've posted this question to the main python list, but got no answers, > and I didn't see the issue arise on Python-dev (but I subscribed only > two weeks ago). > It concerns problems with the Py_TPFLAGS_HEAPTYPE and the new > 'hackcheck' in python 2.3. > > I'm writing a C-extension module for python 2.3. > I need to declare a new class, MyClass. > For this class I want two things : > 1) redefine the setattr function on objects of this class > (ie setting a new tp_setattro) > 2) I want that the python user can change attributes on MyClass (the > class itself). > > Now I have a conflict on the Py_TPFLAGS_HEAPTYPE with new Python 2.3. > If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have a problem with the > new hackcheck (Object/typeobject.c:3631), as I am a HEAPTYPE but I also > redefine tp_setattro. > If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on > my class because of a check in type_setattro (Object/typeobject.c:2047).
> The only solution I've got without modifying python source is to > create a specific Metaclass for Myclass, and write the tp_setattr. > But I don't like the idea of making a copy-paste of the type_setattr > source code, just to remove a check, this is not great for future > compatibility with python (at each revision of Python I have to check > if type_setattr has not changed to copy-paste the changes). > In fact I'm really wondering what's the real meaning of this flag, > but I think there is some history behind it. > > If you think this is not the right place for this question, just > ignore it, and sorry for the disturbance. > > Boris > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/boris.boutillier%40arteris.net > From mwh at python.net Thu Oct 30 08:11:45 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 08:11:48 2003 Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ? In-Reply-To: <3FA10CCF.5020004@arteris.net> (Boris Boutillier's message of "Thu, 30 Oct 2003 14:06:23 +0100") References: <3F9F7A92.1050800@arteris.net> <3FA10CCF.5020004@arteris.net> Message-ID: <2mn0bi98fi.fsf@starship.python.net> Boris Boutillier writes: > No answers on this? I posted the question two times on c.l.py and got > no answers; help would be appreciated. I answered, on comp.lang.python. I didn't say anything especially useful, though. > Boris Boutillier wrote: > >> Hi all, >> >> I've posted this question to the main python list, but got no >> answers, and I didn't see the issue arise on Python-dev (but I >> subscribed only two weeks ago). >> It concerns problems with the Py_TPFLAGS_HEAPTYPE and the new >> 'hackcheck' in python 2.3. >> >> I'm writing a C-extension module for python 2.3. >> I need to declare a new class, MyClass.
>> For this class I want two things : >> 1) redefine the setattr function on objects of this class >> (ie setting a new tp_setattro) >> 2) I want that the python user can change attributes on MyClass (the >> class itself). >> >> Now I have a conflict on the Py_TPFLAGS_HEAPTYPE with new Python 2.3. >> If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have problem with the >> new hackcheck (Object/typeobject.c:3631), as I am a HEAPTYPE but I also >> redefine tp_setattro. >> If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on >> my class because of a check in type_setattro (Object/typeobject.c:2047). >> >> The only solution I've got without modifying python source is to >> create a specific Metaclass for Myclass, and write the tp_setattr. I think this is the appropriate solution: your type object is *not* a heap type (i.e. is not allocated on the heap) and you want to influence what happens when you set an attribute on it. Cheers, mwh -- I'd certainly be shocked to discover a consensus. ;-) -- Aahz, comp.lang.python From aleaxit at yahoo.com Thu Oct 30 08:54:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 08:54:55 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <1067518878.3fa10b9e91afb@mcherm.com> References: <1067518878.3fa10b9e91afb@mcherm.com> Message-ID: <200310301454.48290.aleaxit@yahoo.com> On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: > Raymond writes: > > If only in the C API, I would like to see just such a universalmethod > > alternative to classmethod. That would allow different behaviors to be > > assigned depending on how the method is called. > > And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES > of Python is that all calls are "the same". Calling a function works A new descriptortype wouldn't change the ``all the same'' idea at the level at which descriptortypes such as staticmethod and classmethod haven't changed it. 
> a particular way. Calling a callable object does the same, although > the arguments are passed to __call__. Calling a classmethod does the > same thing. Calling a bound method does the same thing except that the > first argument is curried. Here in the Python community, we think it > is a good thing that one must explicitly name "self" as an argument to > methods, and that any function CAN be made a method of objects. Nothing of this would change. Just consider calling a method on the class or on an instance for various descriptortypes:

for staticmethod:
    aclass.foo()        # ignores the exact classobject
    aninst.foo()        # ignores the instanceobject & its exact class

for classmethod:
    aclass.bar()        # passes the exact classobject
    aninst.bar()        # ignores the instanceobject, passes its __class__

for functions (which are also descriptors):
    aclass.baz(aninst)  # must explicitly pass an instance of aclass
    aninst.baz()        # passes the instance

so, we do have plenty of variety today, right? Consider the last couple in particular (it's after all the most common one): you have the specific constraint that aninst MUST be an instance of aclass. So what we're proposing is JUST a descriptortype that will relax the latter constraint: aninst.wee() passes the instance (just like the latter couple), aclass.wee(beep) does NOT constrain beep to be an instance of a class but is more relaxed, allowing the code of 'wee' to determine what it needs, what it has received, etc -- just like in about ALL cases of Python calls *except* "aclass.baz(aninst)" which is an exceptional case in which Python itself does (enforced) typechecking for you. So what's so bad about optionally being able to do w/o that typechecking? 
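[Editorial note: the three bindings Alex lists can be checked with a small sketch. The names Widget/foo/bar/baz are illustrative, and decorator spelling from later Pythons is used; note that the unbound-call typecheck he refers to for aclass.baz(aninst) applies to Python 2.x and was later dropped in Python 3.]

```python
class Widget(object):
    @staticmethod
    def foo():              # ignores both the class and any instance
        return "foo"

    @classmethod
    def bar(cls):           # always receives the class object
        return ("bar", cls)

    def baz(self):          # plain function: receives the instance
        return ("baz", self)

w = Widget()
assert Widget.foo() == "foo" and w.foo() == "foo"
assert Widget.bar() == ("bar", Widget)
assert w.bar() == ("bar", Widget)     # instance ignored, its __class__ passed
assert w.baz() == ("baz", w)
assert Widget.baz(w) == ("baz", w)    # caller supplies the instance explicitly
```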
I've mentioned use cases already -- besides list.sorted -- such as gmpy's sqrt and fsqrt which would more naturally be modeled as just such methods, rather than instancemethods (named sqrt) of types mpz and mpf resp., also available with different names as module-functions (to bypass the typechecking and do typecasting instead). More generally, the idea is that aclas.wee(beep) is just about equivalent to aclas(beep).wee() but may sometimes be implemented more optimally (avoiding the avoidable construction of a temporary instance of aclas from beep, when that is a costly part), and it's better situated in class aclas than as some module function "aclas_wee" (stropping by the typename or some other trick to avoid naming conflict if 'wee' methods of several types are forced into a single namespace because Python doesn't let them be attributes of their respective types in these cases). I don't see any revolution in Python's call system -- just a little extra possibility that will sometimes allow a better and more natural (or better optimizable) placement of functionality that's now not quite comfortably located as either instancemethod, classmethod or module-level function. > Now you're proposing a special situation, where what appears to be > a single attribute of a class object is actually TWO functions... > two functions that have the same name but subtly different behavior. Nah -- not any more than e.g. a property "is actually THREE functions". A property may HOLD three functions and call the appropriate one in appropriate cases, x.y=23 vs print x.y vs del x.y. In general, a descriptor "has" one or more callables it holds (a function "has" AND "is"). > Right now, I presume that if I have:
>     a = A()                  # a is an instance of A
>     x = a.aMethod('abc')     # this is line 1
>     y = A.aMethod(a, 'abc')  # this is line 2
> that line 1 and line 2 do the same thing. 
> This is a direct consequence > of the fact that methods in Python are just functions with an instance > as the first argument. But your "universalmethod" would break this. Actually, unless you show me the source of A, I cannot be absolutely sure that your presumption is right, even today. A simple example:

class A(object):
    def aMethod(*allofthem):
        return allofthem
    aMethod = staticmethod(aMethod)

Now, the behavior of lines 1 and 2 is actually quite different -- x is a singleton tuple with a string, y a pair whose first item is an instance of A and the second item a string. Sure, your presumption is reasonable and a reasonable programmer will try to make sure it's valid, but Python already gives the programmer plenty of tools with which to make your presumption invalid. The _design intention_ of universalmethod would be to still satisfy your presumption, PLUS allow calls to A.aMethod(bbb, 'abc') for any "acceptable" object bbb, not necessarily an instance of A, to do something like A(bbb).aMethod('abc') although possibly in a more optimized way (not necessarily constructing a temporary instance of A, if that is costly and can be easily avoided). Of course it can ALSO be used unreasonably, but so can lots of existing descriptors, too. > It might be worth breaking it, if the result is some *very* readable Can't break what's already broken :-). > to make it a regular and supported idiom, I'd want to see much better > evidence that it's worthwhile, because there's an important principle > at risk here, and I wouldn't want to trade away the ability to explain > "methods" in two sentences: > A 'method' is just a function whose first argument is 'self'. > The method is an attribute of the class object, and when it is > called using "a.method(args)", the instance 'a' is passed as > 'self'. > for a cute way of making double use of a few factory functions. 
I don't think of it as "cute", but rather more appropriate than currently available solutions in some such cases (already exemplified). And those sentences are already false if by 'method' you also want to include staticmethod and classmethod. If you intend 'method' in a stricter sense that excludes staticmethod and classmethod, why, just have your stricter sense ALSO exclude universalmethod and, voila, you can STILL "explain methods in two sentences". Thus, there is no "important principle at risk" whatsoever. Alex From Paul.Moore at atosorigin.com Thu Oct 30 09:27:09 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 30 09:27:55 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: >> And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES >> of Python is that all calls are "the same". Calling a function works > > A new descriptortype wouldn't change the ``all the same'' idea at > the level at which descriptortypes such as staticmethod and classmethod > haven't changed it. Excuse me, did I miss something? Guido's code was entirely user-level Python, so is available for anyone who wants to use it right now, surely? And if you want it in a C extension, I guess you code a C version for your own use. Why bother arguing over whether it's "right" or "wrong"? Paul. 
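[Editorial note: the descriptor being debated can be sketched in a few lines of pure Python. This is an illustrative reconstruction, not the actual code under discussion; the name universalmethod and the MyList example follow the thread's terminology.]

```python
import functools

class universalmethod(object):
    """Sketch of the proposed descriptor: accessed on an instance it binds
    like a normal method; accessed on the class it returns the bare
    function, with no type check on the first argument."""
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, cls=None):
        if obj is None:
            return self.func                    # class access: any object OK
        return functools.partial(self.func, obj)  # instance access: bind it

class MyList(list):
    @universalmethod
    def sorted(seq):
        return sorted(list(seq))    # accepts any iterable, not just MyList

assert MyList([3, 1, 2]).sorted() == [1, 2, 3]
assert MyList.sorted((3, 1, 2)) == [1, 2, 3]   # tuple accepted: no typecheck
```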
From aleaxit at yahoo.com Thu Oct 30 09:51:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 09:51:41 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310301551.36290.aleaxit@yahoo.com> On Thursday 30 October 2003 03:27 pm, Moore, Paul wrote: > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > > On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: > >> And that's exactly why I would be wary of it. One of the GREAT > >> SIMPLICITIES of Python is that all calls are "the same". Calling a > >> function works > > > > A new descriptortype wouldn't change the ``all the same'' idea at > > the level at which descriptortypes such as staticmethod and classmethod > > haven't changed it. > > Excuse me, did I miss something? Guido's code was entirely user-level > Python, so is available for anyone who wants to use it right now, surely? Yes, exactly like staticmethod was available before it became a builtin (e.g., see p.176, "Python Cookbook"). > And if you want it in a C extension, I guess you code a C version for your > own use. > > Why bother arguing over whether it's "right" or "wrong"? Raymond and I would like to use it as the descriptor for the new list.sorted. If the code gets in Python anyway, then it should ideally be somehow exposed for general use if it's right -- but not if it's wrong. Moreover, if it's wrong "by enough", it might be better to NOT have it get in at all, and keep the possibility well under wraps -- if this behavior is used by what will likely become a reasonably popular method of a reasonably popular built-in type, list, people may well be encouraged to design some aspects of their classes similarly. If such a design is considered a disaster, then encouraging and popularizing it in this way might not be wise. 
If, on the other hand, the design IS of enough general use, then there are no such qualms -- indeed, documenting the use and design-assumptions of the new descriptor in the Python docs would then be a good idea. So, it appears to me that the discussion on the pro's and con's of such a descriptor type is well warranted on this list. Alex From Paul.Moore at atosorigin.com Thu Oct 30 10:12:40 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 30 10:13:31 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0996F@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > Raymond and I would like to use it as the descriptor for > the new list.sorted. > If the code gets in Python anyway, then it should ideally > be somehow exposed for general use if it's right -- but not > if it's wrong. OK. I follow now. The only contribution I will make is to say that if list.sorted uses it, I think it should be available to the user. I don't like the flavour of "good enough for us, but not for you" that keeping this descriptor purely internal seems to have. On list.sorted, I have no opinion. Paul. From nas-python at python.ca Thu Oct 30 10:21:01 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Thu Oct 30 10:19:32 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> References: <087001c39e73$70333e60$0500a8c0@eden> <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Message-ID: <20031030152101.GA32750@mems-exchange.org> On Thu, Oct 30, 2003 at 03:30:18PM +1300, Greg Ewing wrote: > That's completely different from what I had in mind, which was: > > (1) Keep a reference to the base object in the buffer object, and It already does this. > (2) Use the buffer API to fetch a fresh pointer from the > base object each time it's needed. > > Is there some reason that still wouldn't be safe enough? 
I don't see any problem with that. It's probably a better solution since it doesn't require a new flag and it lets you create buffers that reference objects like arrays. Neil From guido at python.org Thu Oct 30 10:34:45 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 10:34:52 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 00:49:53 EST." <000501c39ea9$a414e400$8cb6958d@oemcomputer> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> Message-ID: <200310301534.h9UFYjh01347@12-236-54-216.client.attbi.com> > Your example was especially compelling: > > a = [3,2,1] > print a.sorted() > print list.sorted(a) Well, I'd like to withdraw it. Having all three of a.sort(), a.sorted() and list.sorted(a) available brings back all the confusion of a.sort() vs. a.sorted(). What's in CVS is just fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 30 10:43:04 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 10:43:12 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 09:59:33 +0100." <200310300959.33587.aleaxit@yahoo.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> Message-ID: <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> > > a = [3,2,1] > > print a.sorted() > > print list.sorted(a) > > Actually, yes, it IS compelling indeed. Funny -- I was originally just > brainstorming / musing out loud, never thought about this as a "real > thing". But now that it's matured a bit, I do feel sure [...] If you feel about it that way, I recommend that you let it mature a bit more. If you really like this so much, please realize that you can do this for *any* instance method. The identity C.foo(C()) == C().foo() holds for all "regular" methods. (Since 2.2 it also holds for extension types.) 
If we were to do this, we'd be back at square two, which we rejected: list instances having both a sort() and a sorted() method (square one being no sorted() method at all :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Thu Oct 30 11:20:37 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 11:20:54 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> Message-ID: <200310301720.37743.aleaxit@yahoo.com> On Thursday 30 October 2003 04:43 pm, Guido van Rossum wrote: > > > a = [3,2,1] > > > print a.sorted() > > > print list.sorted(a) > > > > Actually, yes, it IS compelling indeed. Funny -- I was originally just > > brainstorming / musing out loud, never thought about this as a "real > > thing". But now that it's matured a bit, I do feel sure [...] > > If you feel about it that way, I recommend that you let it mature a > bit more. > > If you really like this so much, please realize that you can do this > for *any* instance method. The identity > > C.foo(C()) == C().foo() > > holds for all "regular" methods. (Since 2.2 it also holds for Yes, having a be an instance of list in the above doesn't show 'sorted' as being any different than a perfectly regular instance method -- it WAS in this sense a bad example (I thought I'd mentioned that later on in the very same post?). The identity I want is rather like:

    C.foo(x) == C(x).foo()

for an x that's not necessarily an instance of C, just something that has a natural way to become one. When C is list, any iterable x, for example. In other words, being able to call C.foo(x) _without_ the typechecking constraint that x is an instance of C, as one would have for a normal C.foo unbound-method. > extension types.) 
> If we were to do this, we'd be back at square two, which we rejected: list > instances having both a sort() and a sorted() method (square one being no > sorted() method at all :-). Yes, the names are an issue again -- but having e.g. x=L1.sorted(L2) completely ignore the value of L1 feels a bit strange to me too (as does x=D1.fromkeys(L3) now that I've started thinking about it -- I've never seen any Python user, newbie or otherwise, have actual problems with this, but somebody on c.l.py claimed that's just because "nobody" knows about fromkeys -- so, I dunno...). Darn -- it WOULD be better in some cases if one could ONLY call a method on the class, NOT on an instance when the call would in any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, the problem is that you can also call it on a dict instance, and THAT gets confusing. Similarly, calling list.sorted(iterable) is wonderful, but calling it on a list instance that gets ignored, L1.sorted(iterable), could perhaps be confusing. Yeah, the C++(staticmethod)/Smalltalk(classmethod) idea of "call it on the instance" sounds cool in the abstract, but when it comes down to cases I'm not so sure any more -- what might be use cases where it's preferable, rather than confusing, to be able to call aninst.somestaticmethod(x,y) "just as if" it was a normal method? Would it be such an imposition to "have to know" that a method is static and call type(aninst).somestaticmethod(x,y) instead, say...? Oh well, I guess it's too late to change the semantics of the existing descriptors, even if one agrees with my newfound doubts. But the funniest thing is that I suspect the _new_ descriptor type would be the _least_ confusing of them, because calling such a method on class or instance would have semantics much closer to ordinary methods, just slightly less typeconstrained. Oh well! 
Alex From guido at python.org Thu Oct 30 12:16:33 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 12:19:34 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 17:20:37 +0100." <200310301720.37743.aleaxit@yahoo.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> Message-ID: <200310301716.h9UHGXB01596@12-236-54-216.client.attbi.com> > Darn -- it WOULD be better in some cases if one could ONLY call > a method on the class, NOT on an instance when the call would in > any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, > the problem is that you can also call it on a dict instance, and THAT > gets confusing. Similarly, calling list.sorted(iterable) is wonderful, > but calling it on a list instance that gets ignored, L1.sorted(iterable), > could perhaps be confusing. Let's focus on making this an issue that one learns without much pain. Given that the most common mistake would be to write a.sorted(), and that's a TypeError because of the missing argument, perhaps we could make the error message clearer? Perhaps we could use a variant of classmethod whose __get__ would raise the error, rather than waiting until the call -- it could do the equivalent of the following:

class PickyClassmethod(classmethod):
    def __get__(self, obj, cls):
        if obj is not None:
            raise TypeError, "class method should be called on class only!"
        else:
            return classmethod.__get__(self, None, cls)

I don't want to make this behavior the default behavior, because I can see use cases for calling a class method on an instance too, knowing that it is a class method; otherwise one would have to write the ugly x.__class__.foobar(). 
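[Editorial note: Guido's sketch can be exercised as follows, translated to later raise/decorator syntax purely for illustration; Table and fromkeys_demo are made-up names, not part of the proposal.]

```python
class PickyClassmethod(classmethod):
    """Like classmethod, but refuses access through an instance."""
    def __get__(self, obj, cls=None):
        if obj is not None:
            raise TypeError("class method should be called on class only!")
        return super().__get__(None, cls)

class Table:
    @PickyClassmethod
    def fromkeys_demo(cls, keys):      # hypothetical constructor-style method
        return (cls, list(keys))

assert Table.fromkeys_demo("ab") == (Table, ["a", "b"])  # via the class: fine
try:
    Table().fromkeys_demo("ab")        # via an instance: refused eagerly
except TypeError as e:
    message = str(e)
assert "class only" in message
```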
--Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Thu Oct 30 12:19:39 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Thu Oct 30 12:20:40 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031030152101.GA32750@mems-exchange.org> References: <087001c39e73$70333e60$0500a8c0@eden> <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> <20031030152101.GA32750@mems-exchange.org> Message-ID: <20031030171939.GA374@mems-exchange.org> On Thu, Oct 30, 2003 at 07:21:01AM -0800, Neil Schemenauer wrote: > I don't see any problem with that. Okay, small problem. The hash function for the buffer object is brain damaged, in more ways than one actually:

>>> import array
>>> a = array.array('c')
>>> b = buffer(a)
>>> hash(b)
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 5311)]
buffer_hash (self=0x40262d00) at Objects/bufferobject.c:241
241             x = *p << 7;
(gdb) l
236                     return -1;
237             }
238
239             len = self->b_size;
240             p = (unsigned char *) self->b_ptr;
241             x = *p << 7;
242             while (--len >= 0)
243                     x = (1000003*x) ^ *p++;
244             x ^= self->b_size;
245             if (x == -1)
(gdb) p len
$1 = 0
(gdb) p *p
Cannot access memory at address 0x0

The buffer object has 'b_readonly' and 'b_hash' fields. If readonly is true then the object is considered hashable and once computed the hash is stored in the 'b_hash' field. The problem is that the buffer API doesn't provide a way to determine 'readonly'. The absence of getwritebuf() is not the same thing as being read only. The buffer() builtin always sets the 'readonly' flag! I don't think the buffer hash method can depend on the data being pointed to. There is nothing in the buffer interface that tells you if the data is immutable. The hash method could return the id of the buffer object but I'm not sure how useful that would be. Neil From pje at telecommunity.com Thu Oct 30 12:37:24 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Oct 30 12:37:50 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.2.1.1.0.20031030132333.028a56b0@pop.bluewin.ch> References: <3FA0692D.30209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <5.1.1.6.0.20031030123139.02443660@telecommunity.com> At 01:43 PM 10/30/03 +0100, Samuele Pedroni wrote: >- multi methods cover some ground also covered by interfaces and adaptation: > *) a generic function/multi method is also an interface > *) some of the things you can achieve with adaptation can be done with > multi methods >Once you have multimethods do you still need adaptation in some cases, or >could one obtain the functionality otherwise, or do you need dispatch on >interfaces (not just classes)? What would interfaces then look like, and the >dispatch on them? With a sufficiently powerful predicate dispatch system, you could do away with adaptation entirely, since you can simulate interfaces by implementing a generic function that indicates whether a type supports the interface, and then defining a predicate type that calls the generic function. That is, I define a predicate type IFoo such that ob is of type IFoo if 'implementsIFoo(ob)'. Then, for any type that implements the interface, I define a multimethod saying that implementsIFoo() is true for objects of that type. Then, I can declare multimethod implementations for the IFoo predicate type. What I'm curious about is: is there any way to do it *without* predicate types? Could you have an "open ended union" type, that you can declare other types to be of, without having to inherit from a base type? From tanzer at swing.co.at Thu Oct 30 12:39:48 2003 From: tanzer at swing.co.at (Christian Tanzer) Date: Thu Oct 30 12:42:44 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 17:20:37 +0100." 
<200310301720.37743.aleaxit@yahoo.com> Message-ID: > Darn -- it WOULD be better in some cases if one could ONLY call > a method on the class, NOT on an instance when the call would in > any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, > the problem is that you can also call it on a dict instance, and THAT > gets confusing. Similarly, calling list.sorted(iterable) is wonderful, > but calling it on a list instance that gets ignored, L1.sorted(iterable), > could perhaps be confusing. Then why don't you use a custom descriptor which raises an exception when an instance is passed in? Like:

    def __get__(self, obj, cls):
        if obj is None:
            return new.instancemethod(self.classmeth, cls)
        else:
            raise TypeError, \
                "Calling %s on instance %s ignores instance" % \
                (self.classmeth, obj)

-- Christian Tanzer http://www.c-tanzer.at/ From pje at telecommunity.com Thu Oct 30 12:51:12 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 30 12:51:51 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301716.h9UHGXB01596@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031030124652.03062630@telecommunity.com> At 09:16 AM 10/30/03 -0800, Guido van Rossum wrote:

>class PickyClassmethod(classmethod):
>    def __get__(self, obj, cls):
>        if obj is not None:
>            raise TypeError, "class method should be called on class only!"
>        else:
>            return classmethod.__get__(self, None, cls)
>
>I don't want to make this behavior the default behavior, because I
>can see use cases for calling a class method on an instance too,
>knowing that it is a class method; otherwise one would have to write
>the ugly x.__class__.foobar(). 
If there were a 'classonlymethod()' built-in, I'd probably use it, as I use classmethods a fair bit (mostly for specialized constructors), but I don't recall ever desiring to call one via an instance. Do you have an example of the use cases you see? Hm. What if your PickyClassmethod were a built-in called 'constructor' or 'factorymethod'? Probably too confining a name, if there are other uses for class-only methods, I suppose. From guido at python.org Thu Oct 30 13:00:59 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 13:01:10 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 12:51:12 EST." <5.1.1.6.0.20031030124652.03062630@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> Message-ID: <200310301801.h9UI10v01751@12-236-54-216.client.attbi.com> > If there were a 'classonlymethod()' built-in, I'd probably use it, as I use > classmethods a fair bit (mostly for specialized constructors), but I don't > recall ever desiring to call one via an instance. Do you have an example > of the use cases you see? Not exactly, but I notice that e.g. UserList uses self.__class__ a lot; I think that's the kind of area where it might show up. > Hm. What if your PickyClassmethod were a built-in called 'constructor' or > 'factorymethod'? Probably too confining a name, if there are other uses > for class-only methods, I suppose. I'm not convinced that we have a problem (beyond Alex lying awake at night, that is :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Oct 30 13:09:58 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Oct 30 13:10:03 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301801.h9UI10v01751@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> Message-ID: <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> At 10:00 AM 10/30/03 -0800, Guido van Rossum wrote: > > Hm. What if your PickyClassmethod were a built-in called 'constructor' or > > 'factorymethod'? Probably too confining a name, if there are other uses > > for class-only methods, I suppose. > >I'm not convinced that we have a problem (beyond Alex lying awake at >night, that it :-). I thought you were proposing to use it for list.sorted, in order to provide a better error message when used with an instance. If such a descriptor were implemented, I was suggesting that it would be useful as a form of documentation (i.e. that a method isn't intended to be called on instances of the class), and thus it would be nice for it to be exposed for folks like me who'd take advantage of it. (Especially if PEP 318 is being implemented.) From guido at python.org Thu Oct 30 13:19:24 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 13:19:31 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 13:09:58 EST." 
<5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> Message-ID: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> > >I'm not convinced that we have a problem (beyond Alex lying awake at > >night, that it :-). > > I thought you were proposing to use it for list.sorted, in order to provide > a better error message when used with an instance. If such a descriptor > were implemented, I was suggesting that it would be useful as a form of > documentation (i.e. that a method isn't intended to be called on instances > of the class), and thus it would be nice for it to be exposed for folks > like me who'd take advantage of it. (Especially if PEP 318 is being > implemented.) I mostly just proposed it to placate Alex; I think he's overly worried in this case. PEP 318 seems a ways off. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Oct 30 13:59:28 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 30 13:59:35 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> Message-ID: <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> At 10:19 AM 10/30/03 -0800, Guido van Rossum wrote: > PEP 318 seems a ways off. Because of lack of consensus on syntax, or is it controversial in some other way? 
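[Editorial note: PEP 318 was later accepted for Python 2.4 as the @-decorator syntax, which turned the rebinding idiom used throughout this thread into a declaration above the def:]

```python
# 2.3-era idiom: define the function, then rebind it through the factory
class A:
    def make(cls):
        return cls()
    make = classmethod(make)

# PEP 318 syntax, as accepted for Python 2.4:
class B:
    @classmethod
    def make(cls):
        return cls()

assert isinstance(A.make(), A)
assert isinstance(B.make(), B)
```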
From Bram at moolenaar.net Thu Oct 30 14:08:19 2003 From: Bram at moolenaar.net (Bram Moolenaar) Date: Thu Oct 30 14:09:52 2003 Subject: [Python-Dev] Speeding up regular expression compilation Message-ID: <200310301908.h9UJ8JhL007882@moolenaar.net> In the python-dev archives I find remarks about the old pre module being much faster at compiling regular expressions than the new sre module. My own experiences are that pre is about twenty times as fast. Since my application uses a lot of simple patterns which are matched on short strings (file names actually), the pattern compilation time is taking half the CPU cycles of my program. The faster execution of sre apparently doesn't compensate for the slower compile time. Is the plan to implement the sre module in C getting closer to being done? Is there a trick to make compiling patterns go faster? I'm already falling back to the pre module with Python 2.2 and older. With Python 2.3 this generates a warning message, thus I don't do it there. I considered copying the 2.2 version of pre.py into my application, but this will stop working as soon as the support for pre is dropped (the compiled C code won't be there). Thus it would be only a temporary fix. I don't care about the Unicode support. -- LAUNCELOT: Isn't there a St. Aaaaarrrrrrggghhh's in Cornwall? ARTHUR: No, that's Saint Ives. "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD /// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\ /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\ \\\ Project leader for A-A-P -- http://www.A-A-P.org /// \\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html /// From amk at amk.ca Thu Oct 30 14:27:18 2003 From: amk at amk.ca (amk@amk.ca) Date: Thu Oct 30 14:27:33 2003 Subject: [Python-Dev] HTML parsing: anyone use formatter? Message-ID: <20031030192718.GA13220@rogue.amk.ca> [Crossposted to python-dev, web-sig, and xml-sig. Followups to web-sig@python.org, please.] 
I'm working on bringing htmllib.py up to HTML 4.01 by adding handlers for all the missing elements. I've currently been adding just empty methods to the HTMLParser class, but the existing methods actually help render the HTML by calling methods on a Formatter object. For example, the definitions for the H1 element look like this:

    def start_h1(self, attrs):
        self.formatter.end_paragraph(1)
        self.formatter.push_font(('h1', 0, 1, 0))

    def end_h1(self):
        self.formatter.end_paragraph(1)
        self.formatter.pop_font()

Question: should I continue supporting this in new methods? This can only go so far; a tag such as or is easy for me to handle, but handling
or or would require greatly expanding the Formatter class's repertoire. I suppose the more general question is, does anyone use Python's formatter module? Do we want to keep it around, or should htmllib be pushed toward doing just HTML parsing? formatter.py is a long way from being able to handle modern web pages and it would be a lot of work to build a decent renderer. --amk From skip at pobox.com Thu Oct 30 14:40:54 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 30 14:41:06 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310301908.h9UJ8JhL007882@moolenaar.net> References: <200310301908.h9UJ8JhL007882@moolenaar.net> Message-ID: <16289.26950.782796.422409@montanaro.dyndns.org> (better on python-list@python.org than here, btw) Bram> Is there a trick to make compiling patterns go faster? Not really. Note though that the sre module caches compiled regular expressions. How many it caches depends on the size of sre._MAXCACHE (default is 100). If you have many more regular expressions than that, you'll spend a lot of time compiling them. You might find it helpful to boost that number. If you're adventurous, you might investigate recasting the sre_compile._compile function as C code. If you use an Intel CPU, another alternative might be to use psyco. Skip From fincher.8 at osu.edu Thu Oct 30 16:03:15 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Thu Oct 30 15:05:00 2003 Subject: [Python-Dev] HTML parsing: anyone use formatter? In-Reply-To: <20031030192718.GA13220@rogue.amk.ca> References: <20031030192718.GA13220@rogue.amk.ca> Message-ID: <200310301603.15437.fincher.8@osu.edu> On Thursday 30 October 2003 02:27 pm, amk@amk.ca wrote: > I suppose the more general question is, does anyone use Python's formatter > module? Do we want to keep it around, or should htmllib be pushed toward > doing just HTML parsing? 
formatter.py is a long way from being able to > handle modern web pages and it would be a lot of work to build a decent > renderer. I've never used it myself, though I'll admit that some software I've used (for searching the IMDB) does use it. Jeremy From guido at python.org Thu Oct 30 15:18:32 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 15:18:41 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: Your message of "Thu, 30 Oct 2003 13:59:28 EST." <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> Message-ID: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> > > PEP 318 seems a ways off. > > Because of lack of consensus on syntax, or is it controversial in some > other way? Both. This is the kind of syntactic change that require much deep thought before committing. Unfortunately I don't have time for that right now, so please don't ask. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Thu Oct 30 15:55:58 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 15:56:32 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> Message-ID: <1067547357.5295.163.camel@anthem> On Thu, 2003-10-30 at 15:18, Guido van Rossum wrote: > > > PEP 318 seems a ways off. > > > > Because of lack of consensus on syntax, or is it controversial in some > > other way? > > Both. This is the kind of syntactic change that require much deep > thought before committing. Unfortunately I don't have time for that > right now, so please don't ask. I won't, but I do hope this is something that we can settle for Python 2.4. I've been using the functionality in Python 2.3 for a while now and it is wonderful, but the tedium and clumsiness of the current syntax really puts a damper on its use. -Barry From barry at python.org Thu Oct 30 16:03:00 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 16:03:15 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <16289.26950.782796.422409@montanaro.dyndns.org> References: <200310301908.h9UJ8JhL007882@moolenaar.net> <16289.26950.782796.422409@montanaro.dyndns.org> Message-ID: <1067547779.5295.168.camel@anthem> On Thu, 2003-10-30 at 14:40, Skip Montanaro wrote: > Not really. Note though that the sre module caches compiled regular > expressions. 
How many it caches depends on the size of sre._MAXCACHE > (default is 100). If you have many more regular expressions than that, > you'll spend a lot of time compiling them. You might find it helpful to > boost that number. Of course you can just assign your compiled regular expression objects to a global or local and use that. Instant caching! Which is what I tend to do. -Barry From tdelaney at avaya.com Thu Oct 30 17:09:54 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Thu Oct 30 17:10:02 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B5CC@au3010avexu1.global.avaya.com> > From: "Martin v. L?wis" [mailto:martin@v.loewis.de] > > My answer is "it depends": If you did not do that, and, for example, > explain why it *can't* be done, than this is a good thesis, > provided you > give qualified scientific rationale for why it can't be done. If you > say you did not do it, but it could be done in this and that way if > you had 50 person years available, then this could be a good thesis > as well, provided the strategy you outline, and the rationale for > computing the 50 person years is convincing. If you just say, "Oops, > I did not finish it because it is too much work", then this would be > a bad thesis. Yep - that was what I was getting at, and your explanation corresponds exactly with my gut feeling. Cheers. Tim Delaney From martin at v.loewis.de Thu Oct 30 17:16:48 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 30 17:16:53 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310301908.h9UJ8JhL007882@moolenaar.net> References: <200310301908.h9UJ8JhL007882@moolenaar.net> Message-ID: Bram Moolenaar writes: > Is there a trick to make compiling patterns go faster? 
If you compile the same regular expression at every program startup, and want to reduce the time for that, you can cPickle the compiled expression, and restore it from the string. If that fails (because the format of compiled expressions has changed), you should fall back to compiling expressions, and optionally save the new version. Regards, Martin From mhammond at skippinet.com.au Thu Oct 30 17:21:06 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 30 17:20:48 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Message-ID: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Greg:
> Neil Schemenauer:
>
> > Okay. Perhaps I am missing something but would fixing it be as
> > simple as adding another field to the tp_as_buffer struct?
> >
> >     /* references returned by the buffer functions are valid while
> >      * the object remains alive */
> >     #define PyBuffer_FLAG_SAFE 1
>
> That's completely different from what I had in mind, which was:
>
> (1) Keep a reference to the base object in the buffer object, and
>
> (2) Use the buffer API to fetch a fresh pointer from the
>     base object each time it's needed.
>
> Is there some reason that still wouldn't be safe enough?

That would work, be less intrusive, and allow all existing code to work unchanged. My only concern is that it does not go anywhere towards fixing the buffer interface itself. To my mind, the buffer object is fairly useless and I never use it - so I really don't care. However, I do have real world uses for the buffer interface. The most compelling is for async IO in the Windows world - I need to pass a buffer Windows will fill in the background, and the buffer interface provides the solution - except for the flaws that also drip down to the buffer object, and leaves us with this problem. Thus, my preference is to fix the buffer object by fixing the interface as much as possible. 
Here is a sketch of a solution, incorporating both Neil and Greg's ideas:

* Type object gets a new flag - TP_HAS_BUFFER_INFO, corresponding to a new 'getbufferinfoproc' slot in the PyBufferProcs structure (note - a function pointer, not static flags as Neil suggested)
* New function 'getbufferinfoproc' returns a bitmask - Py_BUFFER_FIXED is one (and currently the only) flag that can be returned.
* New buffer functions PyObject_AsFixedCharBuffer, etc. These check the new flag (and a type lacking TP_HAS_BUFFER_INFO is assumed to *not* be fixed)
* Buffer object keeps a reference to the existing object (as it does now). Its getbufferinfoproc delegates to the underlying object.
* Buffer object *never* keeps a pointer to the buffer - only to the object. Functions like tp_hash always re-fetch the buffer on demand.

The buffer returned by the buffer object is then guaranteed to be as reliable as the underlying object. (This may be a semantic issue with hash(), but conceptually seems fine. Potential solution here - add Py_BUFFER_READONLY as a buffer flag, then hash() semantics could do the right thing)

After all that, I can't help noticing Greg's solution would be far less work, Mark. From exarkun at intarweb.us Thu Oct 30 19:21:51 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Thu Oct 30 19:22:37 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <3F9F72DF.9080101@thule.no> References: <000501c39de7$0019c180$3403a044@oemcomputer> <3F9F72DF.9080101@thule.no> Message-ID: <20031031002151.GA26628@intarweb.us> On Wed, Oct 29, 2003 at 08:57:19AM +0100, Troels Walsted Hansen wrote:
> Raymond Hettinger wrote:
>
> >At least the builtin buffer function should go away.
> >Even if someone had a use for it, it would not make-up for all the time
> >lost by all the other people trying to figure what it was good for.
>
> I trust you will preserve the functionality though?
> I have used the buffer() function to achieve great leaps in performance
> in applications which send data from a string buffer to a socket.
> Slicing kills performance in this scenario once buffer sizes get beyond
> a few 100 kB.
>
> Below is an example from an asyncore.dispatcher subclass. This code sends
> chunks with maximum size, without ever slicing the buffer.
>
>     def handle_write(self):
>         if self.buffer_offset:
>             sent = self.send(buffer(self.buffer, self.buffer_offset))
>         else:
>             sent = self.send(self.buffer)
>         self.buffer_offset += sent
>         if self.buffer_offset == len(self.buffer):
>             del self.buffer

Twisted uses buffer() similarly. It originally sliced, but a company using the library complained of performance problems. Switching to buffer() alleviated those problems. Jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mail.python.org/pipermail/python-dev/attachments/20031030/15556250/attachment.bin
From bac at OCF.Berkeley.EDU Thu Oct 30 21:01:42 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:01:55 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0C2F7.40409@v.loewis.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> <3FA06DC5.70407@ocf.berkeley.edu> <3FA0C2F7.40409@v.loewis.de> Message-ID: <3FA1C286.2030409@ocf.berkeley.edu> Martin v. Löwis wrote:
> Brett C. wrote:
>
>>> Whether it is too large for a Masters thesis I don't know. Does a
>>> Masters thesis require *success* in the stated goal? I've been
>>> thinking about doing my own Masters in the not-too-distant future if
>>> I can find the time ...
>>
>> Success as in what you set out to do was actually beneficial? No,
>> just as long as something is learned. Successful as actually
>> finishing the darn thing? Yes. 
> > > He actually meant "success in the stated goal". I.e. if you go out to > implement free threading, would it be considered as a failure of the > Master's project if you come back and say: "I did not actually do that"? > Ah, OK. My mistake. > My answer is "it depends": If you did not do that, and, for example, > explain why it *can't* be done, than this is a good thesis, provided you > give qualified scientific rationale for why it can't be done. If you > say you did not do it, but it could be done in this and that way if > you had 50 person years available, then this could be a good thesis > as well, provided the strategy you outline, and the rationale for > computing the 50 person years is convincing. If you just say, "Oops, > I did not finish it because it is too much work", then this would be > a bad thesis. > I would have to agree with that assessment. Just have to convince my thesis adviser. =) -Brett From bac at OCF.Berkeley.EDU Thu Oct 30 21:11:35 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:11:39 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0C16A.8030203@v.loewis.de> References: <3F9F1F82.2090209@ocf.berkeley.edu> <3FA063BF.6050207@ocf.berkeley.edu> <3FA0C16A.8030203@v.loewis.de> Message-ID: <3FA1C4D7.4010403@ocf.berkeley.edu> Martin v. L?wis wrote: > Brett C. wrote: > >>> - floating point: provide IEEE-794 (or some such) in a portable >>> yet efficient way >> >> >> >> You mean like how we have longs? So code up in C our own way of >> storing 794 independent of the CPU? > > > Not longs, but floats. And you would not attempt to store it independent > of the CPU, but instead, you would make as much use of the CPU as > possible, and only implement things in C that the CPU gets wrong. The > portion of emulation would vary from CPU to CPU. > OK, so in other words play cleanup for how the CPU handles floating point by having custom code that deals with its mix-ups. 
> As a starting point, you might look at the Java strictfp mode (which
> only got added after the initial Java release). Java 1.0 was where
> Python is today: expose whatever the platform provides. In Java, they
> have the much stronger desire to provide bit-for-bit reproducibility
> on all systems, so they added strictfp as a trade-off of performance
> vs. write-once-run-anywhere.

Remembrances of Tim mentioning FPU exceptions start to flood back into my mind. =)

>>> - deterministic finalization: provide a way to get objects destroyed
>>> implicitly at certain points in control flow; a use case would be
>>> thread-safety/critical regions
>>
>> I think you get what you mean by this, but I am not totally sure since
>> I can't come up with a use beyond threads killing themselves properly
>> when the whole program is shutting down.
>
> Not at all. In Python, you currently do
>
>     def bump_counter(self):
>         self.mutex.acquire()
>         try:
>             self.counter = self.counter+1
>             more_actions()
>         finally:
>             self.mutex.release()
>
> In C++, you do
>
>     void bump_counter(){
>         MutexAcquisition acquire(this);
>         this->counter+=1;
>         more_actions();
>     }
>
> I.e. you can acquire the mutex at the beginning (as a local object),
> and it gets destroyed automatically at the end of the function. So
> they have the "resource acquisition is construction, resource release
> is destruction" design pattern. This is very powerful and convenient,
> and works almost in CPython, but not in Python - as there is no
> guarantee when objects get destroyed.

Ah, OK.

>> Have no clue what this is since I don't know C#. Almost sounds like
>> Michael's def func() [] proposal at the method level. Or just a lot
>> of descriptors. =)
>
> Yes, the func()[] proposal might do most of it. However, I'm uncertain
> whether it puts in place all pieces of the puzzle - one would actually
> have to try to use that stuff to see whether it really works
> sufficiently. 
You would have to set goals first (what is it supposed to > do), and then investigate, whether these things can actually be done > with it. As I said: static, class, synchronized, final methods might > all be candidates; perhaps along with some of the .NET features, like > security evidence check (caller must have permission to write files > in order to call this method), webmethod (method is automagically > exposed as a SOAP/XML-RPC method), etc. > I remember that static and classmethod were reasons cited why the func()[] proposal was desired. It's an idea. -Brett From bac at OCF.Berkeley.EDU Thu Oct 30 21:19:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:21:40 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2mhe1rj7n8.fsf@starship.python.net> References: <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> Message-ID: <3FA1C6CD.6050201@ocf.berkeley.edu> Michael Hudson wrote: > "Brett C." writes: > > >>Dennis Allison wrote: >> >> >>>Brett -- >>>You might put together a list of all the ideas (maybe even a ranked >>>list) >>>and post it as a unit to the list for archival purposes. Thanks. >>> >> >>Way ahead of you, Dennis. I have already started to come up with a >>reST doc for writing up all of these suggestions. It just might be a >>little while before I get it up since I will need to do some >>preliminary research on each idea to measure the amount of work they >>will be. > > > Could go on the Python Wiki? > Could. Let me get it done in reST locally, then I can look at adding it to the wiki. > I take it from your posting of last week that you've thought about > other ways of implementing exception handling? I guess a > non-reference count based GC is a prerequisite for that... > Yeah, I have tossed the exception handling idea around in my head a little, but the culmination was what I posted. And a non-refcount GC would definitely help, even if the exception handling wasn't changed. 
More places where you could just return NULL instead of having to deal with DECREFing objects. -Brett From greg at cosc.canterbury.ac.nz Thu Oct 30 22:37:51 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:38:06 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031030171939.GA374@mems-exchange.org> Message-ID: <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> Neil Schemenauer : > I don't think the buffer hash method can depend on the data being > pointed to. There is nothing in the buffer interface that tells > you if the data is immutable. The hash method could return the id > of the buffer object but I'm not sure how useful that would be. How about just having it call the hash method of the base object? If the base object is hashable, this will do something reasonable, and if not, it will fail in the expected way. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 30 22:42:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:42:45 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <16289.26950.782796.422409@montanaro.dyndns.org> Message-ID: <200310310342.h9V3gXf17558@oma.cosc.canterbury.ac.nz> > If you're adventurous, you might investigate recasting the > sre_compile._compile function as C code. Or Pyrex code. :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 30 22:47:59 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:48:46 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Message-ID: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> > Thus, my preference is to fix the buffer object by fixing the interface as > much as possible. > > Here is a sketch of a solution, incorporating both Neil and Greg's ideas: Hang on, didn't we already go through the process of designing a new buffer interface not long ago? What was decided about the results of that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Fri Oct 31 02:59:40 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 02:59:46 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <1067547357.5295.163.camel@anthem> References: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> <1067547357.5295.163.camel@anthem> Message-ID: <200310310859.40837.aleaxit@yahoo.com> On Thursday 30 October 2003 09:55 pm, Barry Warsaw wrote: > On Thu, 2003-10-30 at 15:18, Guido van Rossum wrote: > > > > PEP 318 seems a ways off. > > > > > > Because of lack of consensus on syntax, or is it controversial in some > > > other way? > > > > Both. This is the kind of syntactic change that require much deep > > thought before committing. Unfortunately I don't have time for that > > right now, so please don't ask. > > I won't, but I do hope this is something that we can settle for Python > 2.4. 
I've been using the functionality in Python 2.3 for a while now > and it is wonderful, but I the tedium and clumsiness of the current > syntax really puts a damper on its use. Not on mine (my use), but, yes, I _have_ seen some Pythonistas be rather perplexed by it. Giving it a neat, cool look will be good. BTW, when we do come around to PEP 318, I would suggest the 'as' clause on a class statement as the best way to specify a metaclass. 'class Newstyle as type:' for example is IMHO neater -- and thus more encouraging to the generalized use of newstyle classes -- than the "inheriting from object" idea or the setting of __metaclass__; it reads well AND makes what one's doing more obvious when a custom MC is involved, because it's so "up-front". Besides, it's STILL syntax for a call to the thingy on the RHS of 'as', just like, say, def foop() as staticmethod: is, even though the details of how that call is performed are different for metaclasses (called with classname/bases/classdict) and function decorators (called with the function object). BTW, the PEP isn't very clear about this, but, I would hope the 'as' clause applies uniformly to ANY def or class statement, right? No reason to specialcase, that I can see -- "def ... as" may well be used mostly inside classbodies, because we do have decorators ready for that, but the 'synchronized(lock)' decorator used in the PEP's examples would seem just as applicable to freestanding functions as to methods. 
Alex From aleaxit at yahoo.com Fri Oct 31 03:03:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 03:03:41 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> References: <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> Message-ID: <200310310903.35941.aleaxit@yahoo.com> On Thursday 30 October 2003 07:19 pm, Guido van Rossum wrote: > > >I'm not convinced that we have a problem (beyond Alex lying awake at > > >night, that it :-). As it happens I just had a very unusual ten-hours-of-sleep night, so I don't think you need to worry:-). > > on instances of the class), and thus it would be nice for it to be > > exposed for folks like me who'd take advantage of it. (Especially if PEP > > 318 is being implemented.) > > I mostly just proposed it to placate Alex; I think he's overly worried > in this case. PEP 318 seems a ways off. OK, then it does appear to me that new descriptors may wait for PEP 318 to mature, and list.sorted be left as is for now. Hopefully both can be taken into consideration before 2.4 is finalized, since that time is also "a ways off", no doubt. Alex From theller at python.net Fri Oct 31 03:03:52 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 31 03:04:39 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> (Greg Ewing's message of "Fri, 31 Oct 2003 16:47:59 +1300 (NZDT)") References: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> Message-ID: Greg Ewing writes: >> Thus, my preference is to fix the buffer object by fixing the interface as >> much as possible. >> >> Here is a sketch of a solution, incorporating both Neil and Greg's ideas: > > Hang on, didn't we already go through the process of > designing a new buffer interface not long ago? > > What was decided about the results of that? 
That was pep 298. I withdraw it (well, it's still labeled as draft) because I didn't have enough time to finish the specification. But if anyone wants to take it over, please do so. Thomas From Bram at moolenaar.net Fri Oct 31 06:20:44 2003 From: Bram at moolenaar.net (Bram Moolenaar) Date: Fri Oct 31 06:22:22 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <1067547779.5295.168.camel@anthem> Message-ID: <200310311120.h9VBKikP001404@moolenaar.net> Barry Warsaw wrote: > On Thu, 2003-10-30 at 14:40, Skip Montanaro wrote: > > Not really. Note though that the sre module caches compiled regular > > expressions. How many it caches depends on the size of sre._MAXCACHE > > (default is 100). If you have many more regular expressions than that, > > you'll spend a lot of time compiling them. You might find it helpful to > > boost that number. > > Of course you can just assign your compiled regular expression objects > to a global or local and use that. Instant caching! Which is what I > tend to do. I'm already caching all the compiled patterns. It's the first-time compile that is consuming time, there are a lot of patterns. But half a second to compile them is too much, the whole program may not run longer than a second. BTW. I've changed the code to use pre.py on Python 2.3 (with the warning removed) as a temporary solution. The problem will be back with 2.4... The reason I sent this to the development list is that I thought this could be solved on the library side. Changing the Python code sounds like working around the real problem. -- BRIDGEKEEPER: What is your favorite colour? GAWAIN: Blue ... No yelloooooww! 
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD /// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\ /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\ \\\ Project leader for A-A-P -- http://www.A-A-P.org /// \\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html /// From mwh at python.net Fri Oct 31 06:36:51 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 06:36:55 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> (Mark Hammond's message of "Fri, 31 Oct 2003 09:21:06 +1100") References: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Message-ID: <2mad7h8wq4.fsf@starship.python.net> "Mark Hammond" writes: > That would work, be less intrusive, and allow all existing code to work > unchanged. My only concern is that it does not go anywhere towards fixing > the buffer interface itself. I think that is a different issue entirely. While it may be interesting and important, can we at least try to keep them separate? Cheers, mwh -- This is the fixed point problem again; since all some implementors do is implement the compiler and libraries for compiler writing, the language becomes good at writing compilers and not much else! -- Brian Rogoff, comp.lang.functional From mwh at python.net Fri Oct 31 06:42:52 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 06:42:56 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA1C6CD.6050201@ocf.berkeley.edu> (Brett C.'s message of "Thu, 30 Oct 2003 18:19:57 -0800") References: <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> Message-ID: <2m65i58wg3.fsf@starship.python.net> "Brett C." writes: >> I take it from your posting of last week that you've thought about >> other ways of implementing exception handling? I guess a >> non-reference count based GC is a prerequisite for that... 
>> > > Yeah, I have tossed the exception handling idea around in my head a > little, but the culmination was what I posted. > > And a non-refcount GC would definitely help, even if the exception > handling wasn't changed. More places where you could just return NULL > instead of having to deal with DECREFing objects. And reducing the memory overhead of objects. Here's my crazy idea that's been knocking around my head for a while. I wonder if anyone can shoot in down in flames. Remove the ob_type field from all PyObjects. Make pymalloc mandatory, make it use type specific pools and store a pointer to the type object at the start of each pool. So instead of p->ob_type it's *(p&MASK) I think having each type in its own pools would also let you lose the gc_next & gc_prev fields. Combined with a non-refcount GC, you could hammer sizeof(PyIntObject) down to sizeof(long)! (Actually, a potential killer is assigning to __class__ -- maybe you could only do this for heaptypes) Cheers, mwh -- To summarise the summary of the summary:- people are a problem. -- The Hitch-Hikers Guide to the Galaxy, Episode 12 From mhammond at skippinet.com.au Fri Oct 31 08:03:56 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri Oct 31 08:03:42 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <2mad7h8wq4.fsf@starship.python.net> Message-ID: <009601c39faf$72617a70$0500a8c0@eden> Michael Hudson > "Mark Hammond" writes: > > > That would work, be less intrusive, and allow all existing > code to work > > unchanged. My only concern is that it does not go anywhere > towards fixing > > the buffer interface itself. > > I think that is a different issue entirely. While it may be > interesting and important, can we at least try to keep them separate? I don't see how. The only problem I see is in the buffer interface. We could worm around the buffer interface problem in the buffer object, but I don't see how that is keeping them separate. Am I missing something? Mark. 
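[Michael's `*(p&MASK)` trick above relies on every pool being aligned to its own size, so the low bits of any object address can be masked off to reach the pool header, where the shared type pointer would live. A sketch of just that arithmetic in Python — the 4 KiB pool size is an assumed figure, not pymalloc's actual layout:]

```python
POOL_SIZE = 4096                 # assumed power-of-two pool size
POOL_MASK = ~(POOL_SIZE - 1)     # clears the low 12 bits of an address

def pool_base(addr):
    # Objects never straddle pools, so masking any object's address
    # yields the start of the pool it lives in.
    return addr & POOL_MASK

base = 7 * POOL_SIZE             # a pool starts at an aligned address
assert all(pool_base(base + off) == base for off in range(POOL_SIZE))
```

This is why the idea needs pymalloc to be mandatory: the masking only works if every object is guaranteed to sit inside such an aligned pool.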
From martin at v.loewis.de Fri Oct 31 08:45:35 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 31 08:45:52 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310311120.h9VBKikP001404@moolenaar.net> References: <200310311120.h9VBKikP001404@moolenaar.net> Message-ID: Bram Moolenaar writes: > The reason I sent this to the development list is that I thought this > could be solved on the library side. Changing the Python code sounds > like working around the real problem. It probably can be changed. However, it appears that few people would ever worry about the compilation speed, so it is unlikely that any effort will be made to improve it. Contributions would be greatly appreciated. Regards, Martin From pje at telecommunity.com Fri Oct 31 08:52:03 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 31 08:51:13 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2m65i58wg3.fsf@starship.python.net> References: <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> Message-ID: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> At 11:42 AM 10/31/03 +0000, Michael Hudson wrote: >"Brett C." writes: > > >> I take it from your posting of last week that you've thought about > >> other ways of implementing exception handling? I guess a > >> non-reference count based GC is a prerequisite for that... > >> > > > > Yeah, I have tossed the exception handling idea around in my head a > > little, but the culmination was what I posted. > > > > And a non-refcount GC would definitely help, even if the exception > > handling wasn't changed. More places where you could just return NULL > > instead of having to deal with DECREFing objects. > >And reducing the memory overhead of objects.
OTOH, maybe you could see whether INCREF/DECREF can be used to control synchronization of objects between threads, and thus get a multiprocessor Python. Note that if an object's refcount is 1, it's not being shared between threads. INCREF could be looked at as, "I'm about to use this object", so if the object isn't "owned" by the current thread, then lock it and increment an ownership count. Or was that how the experimental free-threading Python worked? >Here's my crazy idea that's been knocking around my head for a while. >I wonder if anyone can shoot it down in flames. > >Remove the ob_type field from all PyObjects. Make pymalloc mandatory, >make it use type specific pools and store a pointer to the type object >at the start of each pool. How would you get from the pointer to the pool head? From mwh at python.net Fri Oct 31 09:07:28 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 09:07:31 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <009601c39faf$72617a70$0500a8c0@eden> (Mark Hammond's message of "Sat, 1 Nov 2003 00:03:56 +1100") References: <009601c39faf$72617a70$0500a8c0@eden> Message-ID: <2mznfh7b6n.fsf@starship.python.net> "Mark Hammond" writes: > Michael Hudson > >> "Mark Hammond" writes: >> >> > That would work, be less intrusive, and allow all existing >> code to work >> > unchanged. My only concern is that it does not go anywhere >> towards fixing >> > the buffer interface itself. >> >> I think that is a different issue entirely. While it may be >> interesting and important, can we at least try to keep them separate? > > I don't see how. The only problem I see is in the buffer interface. We > could worm around the buffer interface problem in the buffer object, but I > don't see how that is keeping them separate. Am I missing something? Well, there are two things people complain about: a) the buffer INTERFACE, b) the buffer OBJECT. Are the issues plaguing both the same? I wasn't under the impression they were.
It's entirely possible I'm wrong, though. Cheers, mwh -- [1] If you're lost in the woods, just bury some fibre in the ground carrying data. Fairly soon a JCB will be along to cut it for you - follow the JCB back to civilisation/hitch a lift. -- Simon Burr, cam.misc From mwh at python.net Fri Oct 31 09:10:16 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 09:10:54 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> (Phillip J. Eby's message of "Fri, 31 Oct 2003 08:52:03 -0500") References: <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> Message-ID: <2mvfq57b1z.fsf@starship.python.net> "Phillip J. Eby" writes: >>Here's my crazy idea that's been knocking around my head for a while. >>I wonder if anyone can shoot it down in flames. >> >>Remove the ob_type field from all PyObjects. Make pymalloc mandatory, >>make it use type specific pools and store a pointer to the type object >>at the start of each pool. > > How would you get from the pointer to the pool head? Did you read the rest of my mail? Maybe I was too terse, but my thinking was that the pools are aligned on a known size boundary (e.g. 4K) so to get to the head you just mask off the 12 (or whatever) least significant bits. Wouldn't work for zeta-c[1], I'd have to admit, but do we care? Cheers, mwh [1] http://www.cliki.net/Zeta-C -- SPIDER: 'Scuse me. [scuttles off] ZAPHOD: One huge spider. FORD: Polite though. -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From nas-python at python.ca Fri Oct 31 09:12:08 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 31 09:11:13 2003 Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> References: <20031030171939.GA374@mems-exchange.org> <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> Message-ID: <20031031141208.GA3566@mems-exchange.org> On Fri, Oct 31, 2003 at 04:37:51PM +1300, Greg Ewing wrote: > How about just having it call the hash method of the base > object? If the base object is hashable, this will do something > reasonable, and if not, it will fail in the expected way. The buffer can reference a subset of the original data ('size' and 'offset' parameters). Neil From pje at telecommunity.com Fri Oct 31 11:20:35 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 31 11:21:08 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2mvfq57b1z.fsf@starship.python.net> References: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20031031111429.03110880@telecommunity.com> At 02:10 PM 10/31/03 +0000, Michael Hudson wrote: >"Phillip J. Eby" writes: > > >>Here's my crazy idea that's been knocking around my head for a while. > >>I wonder if anyone can shoot it down in flames. > >> > >>Remove the ob_type field from all PyObjects. Make pymalloc mandatory, > >>make it use type specific pools and store a pointer to the type object > >>at the start of each pool. > > > > How would you get from the pointer to the pool head? > >Did you read the rest of my mail? Maybe I was too terse, but my Yes, and yes. :) >thinking was that the pools are aligned on a known size boundary >(e.g. 4K) so to get to the head you just mask off the 12 (or whatever) >least significant bits. Ah. But since even the most trivial of Python operations require access to the type, wouldn't this take longer?
I mean, for every ob->ob_type->tp_whatever you'll now have something like *(ob & mask)->tp_whatever. So there are still two memory accesses, but now there's a bitmasking operation added in. I suppose that for some object types you could be getting a 12-25% decrease in memory use for the base object, though. From mwh at python.net Fri Oct 31 12:08:36 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 12:08:45 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031031111429.03110880@telecommunity.com> (Phillip J. Eby's message of "Fri, 31 Oct 2003 11:20:35 -0500") References: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <5.1.1.6.0.20031031111429.03110880@telecommunity.com> Message-ID: <2mad7h72sr.fsf@starship.python.net> "Phillip J. Eby" writes: >>thinking was that the pools are aligned on a known size boundary >>(e.g. 4K) so to get to the head you just mask off the 12 (or whatever) >>least significant bits. > > Ah. But since even the most trivial of Python operations require > access to the type, wouldn't this take longer? I mean, for every > ob->ob_type->tp_whatever you'll now have something like *(ob & > mask)->tp_whatever. Well, I dunno. I doubt the masking would add significant overhead -- it'd only be one instruction, after all -- but the fact that you'd have to haul the start of the pool into the cache to get the pointer to the type object might hurt. You'd have to try it and measure, I guess. > So there are still two memory accesses, but now there's a bitmasking > operation added in. I suppose that for some object types you could > be getting a 12-25% decrease in memory use for the base object, > though. More than that in the good cases.
Something I forgot was that you'd probably have to knock variable-length types on the head. Cheers, mwh -- I would hereby duly point you at the website for the current pedal powered submarine world underwater speed record, except I've lost the URL. -- Callas, cam.misc From FBatista at uniFON.com.ar Fri Oct 31 13:36:02 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 31 13:36:50 2003 Subject: [Python-Dev] prePEP: Decimal data type Message-ID: Here I send it. Suggestions and all kinds of recommendations are more than welcome. If it all goes ok, it'll be a PEP when I finish writing/modifying the code. Thank you. . Facundo ------------------------------------------------------------------------ PEP: XXXX Title: Decimal data type Version: $Revision: 0.1 $ Last-Modified: $Date: 2003/10/31 15:25:00 $ Author: Facundo Batista Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 17-Oct-2003 Python-Version: 2.3.3 Abstract ======== The idea is to have a Decimal data type, for every use where decimals are needed but floating point is too inexact. The Decimal data type should support the Python standard functions and operations and must comply with the decimal arithmetic ANSI standard X3.274-1996. Rationale ========= I must separate the requirements into two sections. The first is to comply with the ANSI standard. Everything needed for this is specified in Mike Cowlishaw's work at http://www2.hursley.ibm.com/decimal/. Cowlishaw also provided a **lot** of test cases. The second section of requirements (support for standard Python functions, usability, etc.) is detailed in the `Requirements`_ section. Here I'll include all the decisions made and why, and all the subjects still being discussed. The requirements will be numbered, to simplify discussion on each point. This work is based on code and test functions written by Eric Price, Aahz and Tim Peters.
Currently I'm working on the Decimal.py code in the sandbox (at python/nondist/sandbox/decimal in SourceForge). Some of the explanations in this PEP are taken from Cowlishaw's work. Items In Discussion ------------------- When in a case like ``Decimal op otherType`` (see point 12 in Requirements_ for details), what should happen? if otherType is an int or long: a. an exception is raised b. otherType is converted to Decimal c. Decimal is converted to int or long (with ``int()`` or ``long()``) if otherType is a float: d. an exception is raised e. otherType is converted to Decimal (rounding? see next item in discussion) f. Decimal is converted to float (with ``float()``) if otherType is a string: g. an exception is raised h. otherType is converted to Decimal i. Decimal is converted to string (bizarre, huh?) When passing floating point to the constructor, what should happen? j. ``Decimal(1.1) == Decimal('1.1')`` k. ``Decimal(1.1) == Decimal('110000000000000008881784197001252...e-51')`` Requirements ============ 1. The syntax should be ``Decimal(value)``. 2. The value could be of one of these types: - another Decimal - int or long - float - string 3. There must exist a Context. The context represents the user-selectable parameters and rules which govern the results of arithmetic operations. In the context the user defines: - what will happen with the exceptional conditions - what precision will be used - what rounding method will be used 4. The Context must be omnipresent, meaning that changes to it affect all the current and future Decimal instances. 5. The exceptional conditions should be grouped into signals, which could be controlled individually. The context should contain a flag and a trap-enabler for each signal. The signals should be: clamped, division-by-zero, inexact, invalid-operation, overflow, rounded, subnormal and underflow. 6. For each of the signals, the corresponding flag should be set to 1 when the signal occurs. It is only reset to 0 by explicit user action. 7.
For each of the signals, the corresponding trap-enabler will indicate which action is to be taken when the signal occurs. If 0, a defined result should be supplied, and execution should continue. If 1, the execution of the operation should end and an exception should be raised. 8. The precision (maximum number of significant digits that can result from an arithmetic operation) must be positive (greater than 0). 9. To have different kinds of rounding; you can choose the algorithm through the context: - ``round-down``: (Round toward 0, truncate) The discarded digits are ignored; the result is unchanged:: 1.123 --> 1.12 1.128 --> 1.12 1.125 --> 1.12 1.135 --> 1.13 - ``round-half-up``: If the discarded digits represent greater than or equal to half (0.5) then the result should be incremented by 1 (rounded up); otherwise the discarded digits are ignored:: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.13 1.135 --> 1.14 - ``round-half-even``: If the discarded digits represent greater than half (0.5) then the result coefficient should be incremented by 1 (rounded up); if they represent less than half, then the result is not adjusted (that is, the discarded digits are ignored); otherwise the result is unaltered if its rightmost digit is even, or incremented by 1 (rounded up) if its rightmost digit is odd (to make an even digit):: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.12 1.135 --> 1.14 - ``round-ceiling``: If all of the discarded digits are zero or if the sign is negative the result is unchanged; otherwise, the result should be incremented by 1 (rounded up):: 1.123 --> 1.13 1.128 --> 1.13 -1.123 --> -1.12 -1.128 --> -1.12 - ``round-floor``: If all of the discarded digits are zero or if the sign is positive the result is unchanged; otherwise, the absolute value of the result should be incremented by 1:: 1.123 --> 1.12 1.128 --> 1.12 -1.123 --> -1.13 -1.128 --> -1.13 - ``round-half-down``: If the discarded digits represent greater than half (0.5) then the result should be
incremented by 1 (rounded up); otherwise the discarded digits are ignored:: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.12 1.135 --> 1.13 - ``round-up``: (Round away from 0) If all of the discarded digits are zero the result is unchanged. Otherwise, the result should be incremented by 1 (rounded up):: 1.123 --> 1.13 1.128 --> 1.13 1.125 --> 1.13 1.135 --> 1.14 10. Strings with floats in engineering notation will be supported. 11. Calling repr() should round-trip, meaning that:: m = Decimal(...) m == eval(repr(m)) 12. To support the basic arithmetic (``+, -, *, /, //, **, %, divmod``) and comparison (``==, !=, <, >, <=, >=, cmp``) operators in the following cases: - Decimal op Decimal - Decimal op otherType - otherType op Decimal - Decimal op= Decimal - Decimal op= otherType Check `Items In Discussion`_ to see what types otherType could be, and what happens in each case. 13. To support unary operators (``-, +, abs``). 14. To support the built-in methods: - min, max - float, int, long - str, repr - hash - copy, deepcopy - bool (0 is false, otherwise true) 15. To be immutable. Reference Implementation ======================== To be included later: - code - test code - documentation Copyright ========= This document has been placed in the public domain. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031031/04dbf328/attachment-0001.html From aleaxit at yahoo.com Fri Oct 31 14:42:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 14:42:49 2003 Subject: [Python-Dev] prePEP: Decimal data type In-Reply-To: References: Message-ID: <200310312042.41751.aleaxit@yahoo.com> On Friday 31 October 2003 07:36 pm, Batista, Facundo wrote: ... > If it all goes ok, it'll be a PEP when I finish writing/modifying the code. I'll gladly help fix the English if needed then, let me know. > When passing floating point to the constructor, what should happen? > > j. ``Decimal(1.1) == Decimal('1.1')`` > k. ``Decimal(1.1) == > Decimal('110000000000000008881784197001252...e-51')`` You forgot an alternative that's likely to be popular on python-dev: "an exception is raised". (This would change requirement 2. later, of course). Alex From jeremy at alum.mit.edu Fri Oct 31 14:44:08 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri Oct 31 14:46:56 2003 Subject: [Python-Dev] proposed change to compiler package Message-ID: <1067629448.24165.150.camel@localhost.localdomain> The top-level walk() function in the compiler package returns the visitor object that is passed to walk. I'd like to change it to return the result of the top-level dispatch() or visit() call.
Right now, visitor methods can return a value, which is useful for a visit() call that is internal to a visitor, but can't return it to the caller of walk(). The current return value is pretty useless, since the caller of walk() must pass the visitor as one of the arguments. That is, walk() returns one of its arguments. The change might break some code, but only in a trivial way, and it will make it possible to write visitors that don't have any state -- simple combinators. Example:

    class NameVisitor:
        """Compute a dotted name from an expression."""

        def visitGetattr(self, node):
            return "%s.%s" % (self.visit(node.expr), node.attrname)

        def visitName(self, node):
            return node.name

Jeremy From FBatista at uniFON.com.ar Fri Oct 31 14:51:53 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 31 14:52:27 2003 Subject: [Python-Dev] prePEP: Decimal data type Message-ID: #- On Friday 31 October 2003 07:36 pm, Batista, Facundo wrote: #- ... #- > If it all goes ok, it'll be a PEP when I finish #- writing/modifying the code. #- #- I'll gladly help fix the English if needed then, let me know. Always welcome too, :) #- > When passing floating point to the constructor, what should happen? #- > #- > j. ``Decimal(1.1) == Decimal('1.1')`` #- > k. ``Decimal(1.1) == #- > Decimal('110000000000000008881784197001252...e-51')`` #- #- You forgot an alternative that's likely to be popular on #- python-dev: "an #- exception is raised". (This would change requirement 2. later, of #- course). You're right. So the 'm' choice (votable as the others) is "an exception is raised". .
Facundo From fincher.8 at osu.edu Fri Oct 31 19:40:39 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Fri Oct 31 18:42:19 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301454.48290.aleaxit@yahoo.com> References: <1067518878.3fa10b9e91afb@mcherm.com> <200310301454.48290.aleaxit@yahoo.com> Message-ID: <200310311940.39491.fincher.8@osu.edu> On Thursday 30 October 2003 08:54 am, Alex Martelli wrote: > just like in about ALL cases > of Python calls *except* "aclass.baz(aninst)" which is an exceptional > case in which Python itself does (enforced) typechecking for you. Out of curiosity, why does Python do this typechecking? I just ran into a situation where such calls in my subclass of sets.Set fail if the sets module gets reloaded. Is there some really important reason why in this case (and only this case) Python does typechecking on pure-Python classes? Jeremy
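The reload failure Jeremy describes comes from Python 2's unbound methods doing an isinstance() check on their first argument: after reload(sets), the name sets.Set is bound to a brand-new class object, so instances built from the old class no longer pass the check. The mechanism can be sketched without touching any real module file (the module name "fakemod" and class "Base" below are made up for illustration):

```python
import types

# Build a throwaway module and define a class in it, as an import would.
mod = types.ModuleType("fakemod")
exec("class Base(object):\n    def name(self):\n        return 'base'",
     mod.__dict__)
obj = mod.Base()

# Re-execute the module body, which is essentially what reload() does:
# the name Base is rebound to a brand-new class object.
exec("class Base(object):\n    def name(self):\n        return 'base'",
     mod.__dict__)

# The old instance is not an instance of the *new* class...
print(isinstance(obj, mod.Base))        # False

# ...which is exactly the check Python 2's unbound method call
# mod.Base.name(obj) enforced, hence the TypeError after a reload.
```

So the typecheck itself is not specific to sets.Set; any call through an unbound method of a reloaded class hits the same isinstance failure.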