From greg at electricrain.com Wed Oct 1 01:27:23 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 01:27:29 2003 Subject: [Python-Dev] Procedures for submitting patches to pybsddb In-Reply-To: <20030928005252.6fd1b4b9.itamar@itamarst.org> References: <20030928005252.6fd1b4b9.itamar@itamarst.org> Message-ID: <20031001052723.GK17491@zot.electricrain.com> On Sun, Sep 28, 2003 at 12:52:52AM -0400, Itamar Shtull-Trauring wrote: > I have a patch (DBCursor.get_current_size(), returns size of data for > current entry) which I'd like to submit. This involves changes to > pybsddb cvs as well as python cvs, from what I can tell (for tests and > docs in the pybsddb repo). > > While I have developer access to pybsddb, I don't have it for Python. > Submitting patches for two different repositories seems cumbersome, so > where should I add it? Python SF tracker? If you submit the patch to the python project there's less dependence on me to check it and commit it. I suggest including the pybsddb repository diffs as a second file in the patch and emailing the pybsddb-users mailing list afterwards to point to it. If a patch is submitted to pybsddb, I'll usually notice and do the right thing, but far fewer eyes watch that patch manager. The pybsddb-users email is a nice heads-up because I read that low-volume list more often than python-dev. -g From martin at v.loewis.de Wed Oct 1 01:33:24 2003 From: martin at v.loewis.de (Martin v. Löwis) Date: Wed Oct 1 01:33:32 2003 Subject: [Python-Dev] Good way of finding out what C functions we have? In-Reply-To: <3F79F878.4070805@ocf.berkeley.edu> References: <3F779BA6.6050407@ocf.berkeley.edu> <3F78E57E.8090403@ocf.berkeley.edu> <16249.33065.797263.217111@montanaro.dyndns.org> <3F79F878.4070805@ocf.berkeley.edu> Message-ID: "Brett C." writes: > Right. I was wondering if there are any checks in configure.in such that, > if something was not available, Python itself would not compile > *at all*.
I would suspect not since that is what the ANSI C/POSIX > coding requirement is supposed to handle, right? Mostly. There are several conditions under which configure would abort; search for exit. One case is that you try to run it on a not-longer-supported system. Regards, Martin From greg at electricrain.com Wed Oct 1 01:55:53 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 01:55:56 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <200309291445.h8TEj5hc002980@localhost.localdomain> References: <200309291445.h8TEj5hc002980@localhost.localdomain> Message-ID: <20031001055553.GM17491@zot.electricrain.com> On Tue, Sep 30, 2003 at 12:45:05AM +1000, Anthony Baxter wrote: > > For those of you not following every bug in the SF tracker closely, > in http://www.python.org/sf/775414 it's been suggested that the docs > for 2.3.2 include a warning about using the old-style interface to > bsddb (without a DBEnv) with multi-threaded applications. This seems > like a prudent suggestion - does someone want to supply some words? > > Anthony Attached is a patch. Commit it if you like it. -------------- next part -------------- Index: libbsddb.tex =================================================================== RCS file: /cvsroot/python/python/dist/src/Doc/lib/libbsddb.tex,v retrieving revision 1.11 diff --unified=5 -r1.11 libbsddb.tex --- libbsddb.tex 28 May 2003 16:20:03 -0000 1.11 +++ libbsddb.tex 1 Oct 2003 05:51:11 -0000 @@ -28,10 +28,16 @@ The following is a description of the legacy \module{bsddb} interface compatible with the old python bsddb module. For details about the more modern Db and DbEnv object oriented interface see the above mentioned pybsddb URL. + +\warning{This legacy interface is not thread safe in python 2.3.x +or earlier. Data corruption, core dumps or deadlocks may occur if you +attempt multi-threaded access. 
You must use the modern pybsddb +interface linked to above if you need multi-threaded or multi-process +database access.} The \module{bsddb} module defines the following functions that create objects that access the appropriate type of Berkeley DB file. The first two arguments of each function are the same. For ease of portability, only the first two arguments should be used in most From theller at python.net Wed Oct 1 03:52:00 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 03:52:30 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16250.151.309386.606310@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Tue, 30 Sep 2003 18:15:51 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <16250.151.309386.606310@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > > And it would help if I could build the HTML docs myself from CVS. I did > > manage to create the pdf files with TeTex under windows, but I didn't > > succeed with the html pages so far. > > Are you using Cygwin? What problems did you encounter? I'll help if > I can; I have a Windows machine (sometimes), but don't know anything > about non-Cygwin Windows TeX systems (and I don't have Cygwin > installed most of the time). No, I'm not using cygwin. I have seen too many broken cvs files with line-end problems; I suspect people check in files with MSDOS line endings under cygwin. Now that I can build the docs on starship (thanks, Greg!) it's not needed anymore to do it under Windows, but for the archives here are my experiences: Doing 'nmake pdf' (this is the MSVC6 make utility) in the src/Doc directory worked, it created the pdf docs with MikTeX I had installed. Maybe I had to trivially edit the Makefile (replace 'cp' with 'copy' and such) before. With the recent checkins to the Makefile it doesn't work anymore, although installing the Mingw32 gnumake helped.
Then I tried to bring 'make html' to work, installed latex2html (I have Perl already), but this always complained about pnmtopng missing (or something like that). And the make failed with an error such as 'image format unsupported'. Well, I tried to find and install native windows pnm2png and png2pnm tools, had to replace incompatible zlib.dll and so on. It didn't work, instead it broke my ssh and maybe other stuff. At this point I gave up, removed the software, and be happy that I managed to get my ssh working again. Thomas From theller at python.net Wed Oct 1 03:54:20 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 03:54:49 2003 Subject: [Starship] Re: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <20030930012556.GA6451@cthulhu.gerg.ca> (Greg Ward's message of "Mon, 29 Sep 2003 21:25:56 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <8yoauf87.fsf@python.net> <20030928191828.GA2852@cthulhu.gerg.ca> <2mvfrb271d.fsf@starship.python.net> <20030930012556.GA6451@cthulhu.gerg.ca> Message-ID: Greg Ward writes: > Argh. Installing tetex-bin (and -doc, -extra, -lib just for fun) now. > Thanks, Greg. Works great now. Thomas From mwh at python.net Wed Oct 1 06:45:06 2003 From: mwh at python.net (Michael Hudson) Date: Wed Oct 1 06:44:23 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16249.65446.699521.103706@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Tue, 30 Sep 2003 18:11:50 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> Message-ID: <2mbrt11bkd.fsf@starship.python.net> "Fred L. Drake, Jr." 
writes: > Michael Hudson writes: > > It occurs to me that I don't know *why* Fred is so much the > > documentation man; I've not had any trouble processing the docs into > > HTML lately (haven't tried on Windows, admittedly, and I haven't > > tried to make info ever). > > It's certainly gotten easier to deal with the documentation on modern > Linux distributions. At CNRI, we used mostly Solaris boxes, and I > have to build my own teTeX installations from source, and hand-select > a version of LaTeX2HTML that worked for me. Oh, there was a reason I put a "lately" in what I said... > At this point, all the software that I can't just install from a > RedHat CD is part of what gets pulled down from CVS. I've been able > to build the docs on Cygwin as well, though I've not tried lately. > A lot of what it takes to build the docs is written into Doc/Makefile, > but it does require a solid make (it even uses $(shell ...) now, so > maybe only GNU make will do; not sure). One thing that puzzled me: Doc/Makefile seems to require that Doc/tools is on $PATH, unless I'm misunderstanding something. > > What else needs to be done? There must be quite a bit of mucking > > about on creosote to do, I guess. > > There's a bit, but that's getting easier and easier as I've gone > through it a few times now. I updated PEP 101 the other evening so > anyone can do what's needed to build the packages and get them in the > download locations. There's more to be written to explain what else > needs to be updated on the site. Well, progress! I think it's a worthy goal that no single person is required to make a release, and that actually this isn't too far off. Cheers, mwh -- Never meddle in the affairs of NT. It is slow to boot and quick to crash. 
-- Stephen Harris -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html From gerrit at nl.linux.org Wed Oct 1 07:23:05 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 1 07:23:11 2003 Subject: [Python-Dev] Documentation packages In-Reply-To: <16249.61774.732640.328478@grendel.zope.com> References: <16249.61774.732640.328478@grendel.zope.com> Message-ID: <20031001112305.GA3667@nl.linux.org> Hi, Fred L. Drake, Jr. wrote: > After a brief discussion on the Doc-SIG, it looks like I can > reasonably drop the .tar.gz packaging for the documentation, leaving > only .zip and .tar.bz2 formats. > > Are there any strong objections to this change? What is the reason to do so? Can it do any harm do leave it in? just curious... Gerrit Holl. -- 6. If any one steal the property of a temple or of the court, he shall be put to death, and also the one who receives the stolen thing from him shall be put to death. -- 1780 BC, Hammurabi, Code of Law -- Asperger Syndroom - een persoonlijke benadering: http://people.nl.linux.org/~gerrit/ Het zijn tijden om je zelf met politiek te bemoeien: http://www.sp.nl/ From anthony at interlink.com.au Wed Oct 1 01:54:26 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:05 2003 Subject: [Python-Dev] HP Test Drive systems Message-ID: <200310010554.h915sQBT002065@localhost.localdomain> If you've found the Sourceforge Compile Farm a bit lacking in the numbers of systems available, I highly recommend the HP testdrive program (ok, to be fair, it was originally the DEC testdrive program to allow you to play with OSF/1 on Alphas, then the Compaq testdrive program, but HP's added a bunch of their own systems to it, as well as a wide variety of the free linux and bsd variants). I've appended the list of systems that are currently available to the bottom of this list. mwh and I have been using these to track down all manner of wacky O/S- dependent errors. 
Sign up for it at http://www.testdrive.compaq.com/ Anthony Test Drive System Type HP Tru64 Unix 4.0g(JAVA) AS1200 2@533MHz (ev56) HP Tru64 Unix 5.1b(JAVA) DS20L 2@833MHz(ev68) HP Tru64 Unix 5.1b(JAVA) DS20E 2@667MHz (ev67) HP Tru64 Unix 5.1b(JAVA) ES40 4@833MHz (ev67) HP Tru64 Unix 5.1b(JAVA) ES45 4@1GHz (ev68) HP Tru64 Unix 5.1b(JAVA) ES47 2x1GHz (ev7) HP OpenVMS 7.3-2 EFT DS10-L 1@466MHz (ev6) HP OpenVMS 7.3-1 DS20 2@500MHz (ev6) HP-UX 11i 11.22 rx2600 2@900MHz (Itanium II) HP-UX 11i 11.22 rx2600 2@900MHz (Itanium II) HP-UX 11i 11.11 rp2470 2@750MHz (PA-RISC) HP-UX 11i 11.11 rp2470 2@750MHz (PA-RISC) Linux Test Drives: Test Drive System Type Debian GNU/Linux 3.0 on Intel ProLiant DL360 G2 1.4GHz (P3) Debian GNU/Linux 3.0 on Intel rx2600 2@900MHz (Itanium II) Debian GNU/Linux 3.0 on Alpha XP1000a 1@667MHz (ev6) Debian GNU/Linux 3.0 on Alpha DS20 2@500MHz (ev6) Debian GNU/Linux 3.0 on PA-RISC rp5470 1@550MHz (PA-RISC) Mandrake Linux 9.1 on Intel ProLiant ML530 2@800MHz (P3) Red Hat Ent Linux ES 2.1 on Intel ProLiant ML530 2@1.0GHz (P3) Red Hat Ent Linux AS 2.1 on Intel ProLiant DL360 2@800MHz (P3) Red Hat Ent Linux AS 2.1 on Intel rx2600 2@900MHz (ItaniumII) Red Hat Ent Linux AS 2.1 on Intel rx2600 2@900MHz (ItaniumII) Red Hat Ent Linux AS 2.1 on Intel Intel 4@1.4GHz (Itanium II) Red Hat Linux 7.2 on Alpha DS20 2@500MHz (ev6) Red Hat Linux 7.2 on Alpha(JAVA) ES40 4@667MHz (ev67) Slackware Linux 9.0 on Intel ProLiant ML530 2@800MHz (P3) SuSE Linux Ent Svr 8.0 on Intel ProLiant DL360 2@1.4GHz (P3) SuSE Linux 7.2a on Intel DL590 4@800MHz (Itanium I) SuSE Linux 7.1 on Alpha DS10-L 1@466MHz (ev6) SuSE Linux 7.1 on Alpha ES40 2@667MHz (ev67) SuSE Linux 7.1 on Alpha(JAVA) DS20e 2@667MHz (ev67) BSD Test Drives: Test Drive System Type FreeBSD 4.8 on Intel ProLiant DL360 2@1.4GHz (P3) FreeBSD 4.8 on Alpha XP1000a 1@667MHz (ev6) OpenBSD 3.2 on Intel ProLiant DL360 2@1.2GHz (P3) NetBSD 1.6 on Intel ProLiant DL360 2@1.2GHz (P3) Cluster Test Drives: Beowulf BrickWall Cluster 
DS10 & DS10-L(8) 466MHz (ev6) HP TruCluster Server 5.1b(JAVA) ES40 883Mhz & DS20E 667Mhz OpenVMS 7.3 Galaxy Cluster AS4100 EV56 OpenVMS 7.3 Galaxy Cluster AS4100 EV56 Red Hat Advanced Server Cluster ProLiant DL360x2 2@800MHz (P3) iPAQ TestDrive Developer Program: Test Drive System Type iPAQ iPAQ H3650 Application Test Drives: Test Drive System Type Oracle 9iAS Portal Tru64 Unix5.1B ES40 4@667MHz (ev6) Oracle 9iRAC 9.2.0 on Tru64 Unix ES45 @1GHz & ES40 @833MHz From anthony at interlink.com.au Wed Oct 1 02:00:09 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:25 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <20031001055553.GM17491@zot.electricrain.com> Message-ID: <200310010600.h91609IZ002197@localhost.localdomain> Looks good! I've upgraded it to a \begin{notice}[warning] so that it really stands out (see for instance http://www.python.org/doc/current/lib/node61.html ) Thanks! Anthony From anthony at interlink.com.au Wed Oct 1 02:10:20 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 08:57:33 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <20031001055553.GM17491@zot.electricrain.com> Message-ID: <200310010610.h916AK2B002343@localhost.localdomain> Just another thought - should the newer pybsddb API be folded into the library docs? Anthony From barry at python.org Wed Oct 1 09:01:03 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 1 09:01:09 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <200310010610.h916AK2B002343@localhost.localdomain> References: <200310010610.h916AK2B002343@localhost.localdomain> Message-ID: <1065013263.19531.24.camel@anthem> On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > Just another thought - should the newer pybsddb API be folded into the > library docs? They're big, but I think worth it. In general the pybsddb docs are excellent and invaluable, but I would make one change. 
I think the links to the C API point to pybsddb copies of the Sleepycat documentation. I'd change those to point to Sleepycat's own online documentation. It's more fragile, but 1) it means pulling less into Python's library, and 2) should be more up-to-date as Sleepycat makes changes and new releases. -Barry From Paul.Moore at atosorigin.com Wed Oct 1 09:06:10 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Wed Oct 1 09:07:01 2003 Subject: [Python-Dev] 2.3.2 and bsddb Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C097F1@UKDCX001.uk.int.atosorigin.com> From: Barry Warsaw [mailto:barry@python.org] > On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > > Just another thought - should the newer pybsddb API be folded into the > > library docs? > They're big, but I think worth it. In general the pybsddb docs are > excellent and invaluable [...] I think it should. I read the bsddb stuff in the Python manual, and barely noticed the reference to pybsddb. Subconsciously, I assumed that it was simply background reading, and not important for day to day use (much like pointers to RFCs in many of the Internet modules). I certainly never assumed that there might be functionality which wasn't documented in the Python library reference. Paul. From anthony at interlink.com.au Wed Oct 1 09:52:50 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 09:54:38 2003 Subject: [Python-Dev] first pass at a release checker Message-ID: <200310011352.h91DqoUe007522@localhost.localdomain> Here's the first hack at a quick script for checking a release tarball for sanity. Please suggest additional checks to make. 
At the moment it checks:
  tarball name
  tarball unpacks to a correctly named directory
  no CVS directories in the tarball
  no Release date: XXX in Misc/NEWS
  "configure ; make ; make test" works
Additional checks I plan to add at some point:
  check the version number in Include/patchlevel.h
  check the version number and build number in the windows-specific area
Where should something like this (cleaned up a bit) be checked in? Tools/something? Anthony

def Error(message):
    import sys
    print "ERROR:", message
    sys.exit(1)

def searchFile(filename, searchPattern, badPattern):
    import re
    searchRe = re.compile(searchPattern)
    badPatternRe = re.compile(badPattern)
    for line in open(filename):
        if searchRe.match(line):
            if badPatternRe.search(line):
                Error("found %s in %s"%(badPattern, filename))

def main(tarball):
    import os
    # make tarball path absolute
    if tarball[0] != "/":
        tarball = os.path.join(os.getcwd(), tarball)
    if tarball[-4:] != ".tgz":
        Error("tarball should end in .tgz")
    # Check tarball is gzipped, maybe check compression level?
    reldir = "checkrel-%d"%(os.getpid())
    os.mkdir(reldir)
    os.chdir(reldir)
    print "extracting in %s"%reldir
    print "tarball is %s"%(tarball)
    os.system("tar xzf %s"%(tarball))
    relname = os.path.basename(tarball)[:-4]
    entries = os.listdir(".")
    if len(entries) != 1 or entries[0] != relname:
        Error("tarball should have only created %s"%relname)
    os.chdir(relname)
    for dirpath, dirnames, filenames in os.walk('.'):
        if "CVS" in dirnames:
            Error("%s contains a CVS directory!"%dirpath)
    # additional checks go here.
    searchFile("Misc/NEWS", "^\*Release date:", "XXX")
    os.system("./configure")
    os.system("make")
    os.system("make testall")

import sys
main(sys.argv[1])

From aahz at pythoncraft.com Wed Oct 1 10:02:56 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 10:03:01 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031001112305.GA3667@nl.linux.org> References: <16249.61774.732640.328478@grendel.zope.com> <20031001112305.GA3667@nl.linux.org> Message-ID: <20031001140255.GA17311@panix.com> On Wed, Oct 01, 2003, Gerrit Holl wrote: > Fred L. Drake, Jr. wrote: >> >> After a brief discussion on the Doc-SIG, it looks like I can >> reasonably drop the .tar.gz packaging for the documentation, leaving >> only .zip and .tar.bz2 formats. >> >> Are there any strong objections to this change? > > What is the reason to do so? Can it do any harm do leave it in? Two points: * It's another step in the release process * It takes up extra space on the servers Following Fred's suggestion saves time and space. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From mwh at python.net Wed Oct 1 11:15:23 2003 From: mwh at python.net (Michael Hudson) Date: Wed Oct 1 11:14:40 2003 Subject: [Python-Dev] first pass at a release checker In-Reply-To: <200310011352.h91DqoUe007522@localhost.localdomain> (Anthony Baxter's message of "Wed, 01 Oct 2003 23:52:50 +1000") References: <200310011352.h91DqoUe007522@localhost.localdomain> Message-ID: <2m7k3p0z1w.fsf@starship.python.net> Anthony Baxter writes: > Where should something like this (cleaned up a bit) be checked in? > Tools/something? Tools/scripts/something, I'd have thought. Cheers, mwh -- Presumably pronging in the wrong place zogs it.
-- Aldabra Stoddart, ucam.chat From michael.l.schneider at eds.com Wed Oct 1 11:28:09 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 11:28:13 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 2 Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> SGI python 1.3.2 rc2 fails to build on irix. There is a compile error in termios.c. This is caused by the fact that SGI #defines some control chars, but does not implement them. If the following code is added to Modules/termios.c, then the problem is fixed, and all is well on IRIX. Can someone get this in? Thanks, Mike
--------------------------------------------------------------------------------
/* SGI #defines these, but does not support them */
#if defined(__sgi)
#ifdef CLNEXT
#undef CLNEXT
#endif
#ifdef CRPRNT
#undef CRPRNT
#endif
#ifdef CWERASE
#undef CWERASE
#endif
#ifdef CFLUSH
#undef CFLUSH
#endif
#ifdef CDSUSP
#undef CDSUSP
#endif
#endif /* __sgi */
----------------------------------------------------------------
Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" From anthony at interlink.com.au Wed Oct 1 11:54:17 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 1 11:56:14 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 2 In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> Message-ID: <200310011554.h91FsHBP009060@localhost.localdomain> >>> "Schneider, Michael" wrote > SGI python 1.3.2 rc2 fails to build on irix. > There is a compile error in termios.c. I'm not sure what 1.3.2 rc2 might mean. Is this 2.3.1c1? If so, which exact version of Irix is this on, and which compiler (and version of compiler)? > This is caused by the fact that SGI #defines some control chars, but does not > implement them. Argh.
This is a way ugly problem - why does the OS define them and not implement them? Anthony -- Anthony Baxter It's never too late to have a happy childhood. From aahz at pythoncraft.com Wed Oct 1 12:00:11 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 12:00:17 2003 Subject: [Python-Dev] Irix problems In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> References: <49199579A2BB32438A7572AF3DBB2FB501FEED04@uscimplm001.net.plm.eds.com> Message-ID: <20031001160010.GA28676@panix.com> On Wed, Oct 01, 2003, Schneider, Michael wrote: > > SGI python 1.3.2 rc2 fails to build on irix. There is a compile error > in termios.c. > > This is caused by the fact that SGI #defines some control chars, but > does not implement them. > > If the following code is added to Modules/termios.c, then the problem > is fixed, and all is well on IRIX. > > Can someone get this in? Did 2.3 or 2.3.1 compile correctly? If not, it's too late to get this into 2.3.2. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From michael.l.schneider at eds.com Wed Oct 1 13:01:29 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 13:01:34 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... 
---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 12:00 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > SGI python 1.3.2 rc2 fails to build on irix. There is a compile error > in termios.c. > > This is caused by the fact that SGI #defines some control chars, but > does not implement them. > > If the following code is added to Modules/termios.c, then the problem > is fixed, and all is well on IRIX. > > Can someone get this in? Did 2.3 or 2.3.1 compile correctly? If not, it's too late to get this into 2.3.2. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From just at letterror.com Wed Oct 1 13:07:33 2003 From: just at letterror.com (Just van Rossum) Date: Wed Oct 1 13:07:28 2003 Subject: [Python-Dev] imp.findmodule and zip files In-Reply-To: <16250.1230.305462.233362@magrathea.basistech.com> Message-ID: Tom Emerson wrote: > Should imp.find_module() work for modules that are packaged in a zip > file in 2.3.x? I'm seeing that this doesn't, and before I dive in to > figure out why, I want to see if this is the intent. The imp module is not yet updated to have full access to the new import hooks :-(. See near the end of http://www.python.org/peps/pep-0302.html for a discussion of the issues. 
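[Editor's note: the PEP 302 zip-import machinery discussed here can be exercised directly, even though imp.find_module() doesn't see through it. A minimal sketch, using today's stdlib; the archive name, module name, and contents are invented for illustration:]

```python
import os
import sys
import tempfile
import zipfile
import zipimport
import importlib

# Build a zip archive containing a tiny module (hypothetical name).
tmpdir = tempfile.mkdtemp()
zip_path = os.path.join(tmpdir, "example.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("zipped_mod.py", "ANSWER = 42\n")

# The PEP 302 path hook: putting the archive on sys.path makes the
# zipimporter serve modules out of it transparently.
sys.path.insert(0, zip_path)
mod = importlib.import_module("zipped_mod")
print(mod.ANSWER)

# The hook itself can also be driven by hand.
zi = zipimport.zipimporter(zip_path)
print(zi.archive)
```

imp.find_module(), by contrast, only walks real directories on sys.path, which is the behaviour Tom was seeing.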
Just From theller at python.net Wed Oct 1 13:31:27 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 13:31:58 2003 Subject: [Python-Dev] imp.findmodule and zip files In-Reply-To: (Just van Rossum's message of "Wed, 1 Oct 2003 19:07:33 +0200") References: Message-ID: Just van Rossum writes: > The imp module is not yet updated to have full access to the new import > hooks :-(. See near the end of http://www.python.org/peps/pep-0302.html > for a discussion of the issues. There's another minor issue with the new import hooks which it would be nice to fix: to my knowledge, the Py_VerboseFlag is not exposed to the Python layer. Sometimes it would come in handy when debugging a custom importer. Thomas From fdrake at acm.org Wed Oct 1 14:36:15 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 14:36:46 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031001140255.GA17311@panix.com> References: <16249.61774.732640.328478@grendel.zope.com> <20031001112305.GA3667@nl.linux.org> <20031001140255.GA17311@panix.com> Message-ID: <16251.7839.871263.562935@grendel.zope.com> Aahz writes: > * It's another step in the release process The way I wrote up the documentation release in PEP 101, generating the files isn't even a step. There are a couple of make commands that cause these to be generated; these would not change; just the definitions for those targets would change. > * It takes up extra space on the servers There is this; not a huge deal, but considering we're running on hardware owned by XS4ALL, and we're dependent on their goodwill, we shouldn't waste the space if we don't need to. > Following Fred's suggestion saves time and space. I think more important is that it reduces the number of options that get presented to someone who's looking to download something.
The plethora of documentation packages is almost embarrassing when compared to the number of packages for the interpreter itself: the sources as a .tar.gz package (no ZIP!), and the Windows installer. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From fdrake at acm.org Wed Oct 1 14:56:03 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 14:56:34 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <2mbrt11bkd.fsf@starship.python.net> References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> Message-ID: <16251.9027.715275.62943@grendel.zope.com> Michael Hudson writes: > One thing that puzzled me: Doc/Makefile seems to require that > Doc/tools is on $PATH, unless I'm misunderstanding something. It definitely doesn't require that; I've never used Doc/tools/ on $PATH. One thing it was requiring (only recently) was that there was a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto script. I've removed that constraint for the trunk. The intention is that we should be able to use a mkhowto script from a different checkout; you can still modify the MKHOWTO make variable to do that, but it's not so valuable on the trunk as on the maintenance branches (where we want to use mkhowto from the trunk). -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From theller at python.net Wed Oct 1 15:08:03 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 15:08:35 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9027.715275.62943@grendel.zope.com> (Fred L.
Drake, Jr.'s message of "Wed, 1 Oct 2003 14:56:03 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> Message-ID: <4qys7p4c.fsf@python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > One thing that puzzled me: Doc/Makefile seems to require that > > Doc/tools is on $PATH, unless I'm misunderstanding something. > > It definately doesn't require that; I've never used Doc/tools/ on > $PATH. One thing it was requiring (only recently) was that there was > a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto > script. > > I've removed that constraint for the trunk. > > The intention is that we should be able to use a mkhowto script from a > different checkout; you can still modify the MKHOWTO make variable to > do that, but it's not so valuable on the trunk as on the maintenance > branches (where we want to use mkhowto from the trunk). Do we? Why? Thomas From fdrake at acm.org Wed Oct 1 15:11:19 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 15:11:42 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <4qys7p4c.fsf@python.net> References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> Message-ID: <16251.9943.343427.455546@grendel.zope.com> I wrote: > The intention is that we should be able to use a mkhowto script from a > different checkout; you can still modify the MKHOWTO make variable to > do that, but it's not so valuable on the trunk as on the maintenance > branches (where we want to use mkhowto from the trunk). Thomas Heller writes: > Do we? Why? 
Definitely. I don't want to maintain several versions of the tools; they're almost an external application at this point. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From aahz at pythoncraft.com Wed Oct 1 15:28:45 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 1 15:28:59 2003 Subject: [Python-Dev] Irix problems In-Reply-To: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> References: <49199579A2BB32438A7572AF3DBB2FB501FEED06@uscimplm001.net.plm.eds.com> Message-ID: <20031001192845.GA10491@panix.com> On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From theller at python.net Wed Oct 1 15:41:24 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 1 15:41:56 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9943.343427.455546@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 1 Oct 2003 15:11:19 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> <16251.9943.343427.455546@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > I wrote: > > The intention is that we should be able to use a mkhowto script from a > > different checkout; you can still modify the MKHOWTO make variable to > > do that, but it's not so valuable on the trunk as on the maintenance > > branches (where we want to use mkhowto from the trunk). > > Thomas Heller writes: > > Do we? Why? > > Definitely.
I don't want to maintain several versions of the tools; > they're almost an external application at this point. > Ok, but it would be nice if PEP 101 and 102 would contain instructions on how to build the docs (on Unix); I will try to do the same for Windows. Thanks, Thomas From fdrake at acm.org Wed Oct 1 15:48:24 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 1 15:48:42 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> <4qys7p4c.fsf@python.net> <16251.9943.343427.455546@grendel.zope.com> Message-ID: <16251.12168.559381.428128@grendel.zope.com> Thomas Heller writes: > Ok, but it would be nice if PEP 101 and 102 would contain instructions > on how to build the docs (on Unix); I will try to do the same for Windows. That's in the version of PEP 101 in CVS; the online version isn't up-to-date due to the anonymous CVS access on SF using their backup repositories that aren't updated frequently enough. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From delza at blastradius.com Wed Oct 1 16:11:36 2003 From: delza at blastradius.com (Dethe Elza) Date: Wed Oct 1 16:11:49 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.7839.871263.562935@grendel.zope.com> Message-ID: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> My $0.02 (Canadian), for what it's worth: While Windows users may have trouble with *.bz2, and be unfamiliar enough with the extension *.tgz to not even try (even if it does work), I've never known a *nix box to have trouble with *.zip or known a unix user who had trouble with *.zip. So I'd suggest keeping the various flavors of documentation, but standardize on zip compression. That will at least remove one variable. 
I agree that the main point of all of this is to reduce confusion for the newbie coming to the site to download it. But 90% of those are going to be Windows users, and the rest of us have gotten used to living in a Windows-dominated world. Using bz2 may get you better compression and save bandwidth, but it wasn't standard the last time I installed RedHat or Debian. Zip has its faults, but everybody is familiar with it. --Dethe From tim at zope.com Wed Oct 1 16:18:38 2003 From: tim at zope.com (Tim Peters) Date: Wed Oct 1 16:19:44 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: [Dethe Elza] > While Windows users may have trouble with *.bz2, and be unfamiliar > enough with the extension *.tgz to not even try (even if it does > work), I've never known a *nix box to have trouble with *.zip or > known a unix user who had trouble with *.zip. So I'd suggest keeping > the various flavors of documentation, but standardize on zip > compression. That will at least remove one variable. A difficulty is that the HTML doc set compresses *much* better under bz2 than under zip format, and many people download over slow and expensive dialup lines. bz2 is preferred for that reason (smaller file == faster and cheaper download). > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, I don't believe that, because the Windows installer for Python includes the full doc set in a Windows-friendly format. So there's simply no reason for the vast majority of Windows Python users to download the doc distribution at all. Fred, do we have stats on how often each of the files got downloaded for previous releases? > and the rest of us have gotten used to living in a Windows-dominated > world. 
Using bz2 may get you better compression and save bandwidth, > but it wasn't standard the last time I installed RedHat or Debian. > Zip has its faults, but everybody is familiar with it. No argument there. From tree at basistech.com Wed Oct 1 16:18:23 2003 From: tree at basistech.com (Tom Emerson) Date: Wed Oct 1 16:22:41 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16251.13967.470512.219448@magrathea.basistech.com> Dethe Elza writes: [...] > I've never known a *nix box to have trouble with *.zip or known a unix > user who had trouble with *.zip. So I'd suggest keeping the various > flavors of documentation, but standardize on zip compression. That > will at least remove one variable. What Unix boxen do you use? I often run into Solaris, IRIX, and HP-UX boxen that lack unzip. > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, and the rest of us have gotten used to > living in a Windows-dominated world. Using bz2 may get you better > compression and save bandwidth, but it wasn't standard the last time I > installed RedHat or Debian. Zip has its faults, but everybody is > familiar with it. I wouldn't switch to bz2. Even tgz can be confusing. Having .zip files for Windows users and .tar.gz files for Unix users is a happy medium that should work most everywhere. Of course for maximum Unix portability I suppose you could use .tar.Z ;-) -tree -- Tom Emerson Basis Technology Corp. Software Architect http://www.basistech.com "Beware the lollipop of mediocrity: lick it once and you suck forever" From fdrake at acm.org Wed Oct 1 16:24:15 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed Oct 1 16:24:36 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16251.14319.693162.119201@grendel.zope.com> Dethe Elza writes: > While Windows users may have trouble with *.bz2, and be unfamiliar > enough with the extension *.tgz to not even try (even if it does work), > I've never known a *nix box to have trouble with *.zip or known a unix > user who had trouble with *.zip. So I'd suggest keeping the various > flavors of documentation, but standardize on zip compression. That > will at least remove one variable. At this point, the bzip2 compression has been the most-requested (in terms of emails begging us to add it); the most important aspect that makes it desirable is that the file sizes are so much better. From this perspective, ZIP files are the worst for the formats which cause a lot of individual files to be packaged (most importantly, the HTML and LaTeX source formats). There are still enough people who want to pull the files over slow links that this seems valuable, at least for those two formats. (It may be that it's *only* valuable for those formats, and can be dropped for the PDF and PostScript formats.) > I agree that the main point of all of this is to reduce confusion for > the newbie coming to the site to download it. But 90% of those are > going to be Windows users, and the rest of us have gotten used to > living in a Windows-dominated world. Using bz2 may get you better > compression and save bandwidth, but it wasn't standard the last time I > installed RedHat or Debian. Zip has its faults, but everybody is > familiar with it. Interesting; I don't recall the last time I had to build my own bzip2. I'm pretty sure I didn't do anything special to get it on RedHat recently. 
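For a rough feel of the size difference under discussion, Python's own compression modules can be compared directly. This is only a sketch on synthetic data; the half-megabyte saving mentioned in this thread comes from measuring the real HTML doc set:

```python
import bz2
import zlib

# Stand-in for a markup-heavy doc set; real numbers depend on the corpus.
data = b"<p>Python documentation sample paragraph.</p>\n" * 5000

gz = len(zlib.compress(data, 9))  # deflate, the algorithm behind gzip/zip
bz = len(bz2.compress(data, 9))

# Both shrink the input dramatically; on the real HTML docs, bz2's
# block-sorting compressor is what buys the extra savings over gzip.
```

The exact ratio varies with the input, which is why measuring on the actual release tarballs is the only number that settles the argument.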
The bandwidth savings aren't nearly so valuable to python.org as they are to end users on metered internet connections; those are the users who were so incredibly vocal that we actually started posting those. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From barry at python.org Wed Oct 1 16:26:46 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 1 16:26:51 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14319.693162.119201@grendel.zope.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> Message-ID: <1065040005.15765.18.camel@geddy> On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > Interesting; I don't recall the last time I had to build my own > bzip2. I'm pretty sure I didn't do anything special to get it on > RedHat recently. No, I'm sure you didn't. bzip2 decompression should be standard on RH9, and there's even a tar option to read and write it. What I don't know is whether bz2 decompression is generally available on MacOSX... minority-platform-ly y'rs, -Barry From fred at zope.com Wed Oct 1 16:31:17 2003 From: fred at zope.com (Fred L. Drake, Jr.) Date: Wed Oct 1 16:31:31 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> Message-ID: <16251.14741.876754.348214@grendel.zope.com> Tim Peters writes: > Fred, do we have stats on how often each of the files got downloaded for > previous releases? No, but we should be able to pull those from the server logs. Maybe this weekend I'll get time to write a script to pull that data out. Tom Emerson writes: > I wouldn't switch to bz2. Even tgz can be confusing. 
Having .zip files > for Windows users and .tar.gz files for Unix users is a happy medium > that should work most everywhere. Interesting. bzip2 saves half a MB over gzip for the HTML and PostScript formats. What reason do you have for not using bzip2? It was very heavily requested for the file-size advantage. > Of course for maximum Unix > portability I suppose you could use .tar.Z ;-) Except nobody remembers what to do with those anymore. ;-) I haven't used compress/uncompress in *many* years. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From michael.l.schneider at eds.com Wed Oct 1 16:32:16 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 16:32:19 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED07@uscimplm001.net.plm.eds.com> That's fine, I can apply the fix to our local src. Can this fix go into the next release? Thanks for your efforts, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 3:29 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From fdrake at acm.org Wed Oct 1 16:53:52 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Wed Oct 1 16:54:17 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <16250.151.309386.606310@grendel.zope.com> Message-ID: <16251.16096.224422.956995@grendel.zope.com> Thomas Heller writes: > Now that I can build the docs on starship (thanks, Greg!) it's not > needed anymore to do it under Windows, but for the archives here are my > experiences: I appreciate your taking the time! > Doing 'nmake pdf' (this is the MSVC6 make utility) in the src/Doc > directory worked, it created the pdf docs with MikTeX I had > installed. Maybe I had to trivially edit the Makefile (replace 'cp' with > 'copy' and such) before. Most of the "cp" commands were removed as mkhowto became more capable, and were replaced by calls to shutil.copyfile(). There are still a few "cp" commands in the Makefile, though, and the "clean" target and friends still use "rm". > It doesn't work anymore with the recent checkins to the Makefile, > although installing the Mingw32 gnumake helped. That's expected, since it now uses the GNU-ish $(shell ...) syntax to call an external script from Doc/tools/. Removing this would require even more painful gyrations to maintain the same functionality, or would require that Python version numbers once more appear in the documentation source tree. > Then I tried to bring 'make html' to work, installed latex2html (I have > Perl already), but this always complained about pnmtopng missing (or > something like that). And the make failed with an error such as 'image > format unsupported'. Well, I tried to find and install native windows > pnm2png and png2pnm tools, had to replace incompatible zlib.dll and so > on. It didn't work, instead it broke my ssh and maybe other stuff. That's painful. netpbm is a documented requirement for LaTeX2HTML, but is a pain. I had to install that from source under Cygwin. 
> At this point I gave up, removed the software, and be happy that I > managed to get my ssh working again. That certainly sounds like a pain. I'll think about what I can do to make it easier, but I don't think it can take a high priority. I'm glad you got it working on Starship. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From python at discworld.dyndns.org Wed Oct 1 17:05:16 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 1 17:07:14 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14319.693162.119201@grendel.zope.com>; from fdrake@acm.org on Wed, Oct 01, 2003 at 04:24:15PM -0400 References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> Message-ID: <20031001150516.B14797@discworld.dyndns.org> Fred L. Drake, Jr. wrote: > > Interesting; I don't recall the last time I had to build my own > bzip2. I'm pretty sure I didn't do anything special to get it on > RedHat recently. It was included in the RedHat 6.2 distribution, possibly in 6.1 and 6.0 as well, though I can't check that. It hasn't been an "exotic" package in many years, although it's not necessarily installed by default in a "base" install. I see no reason not to use .bz2 as the default format. 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From michael.l.schneider at eds.com Wed Oct 1 17:35:37 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Wed Oct 1 17:35:43 2003 Subject: [Python-Dev] Irix problems Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED09@uscimplm001.net.plm.eds.com> Aahz, Correction to SGI code--------------------------------------------

// SGI #defines, but does not support these
#ifdef __sgi
#ifdef CLNEXT
#undef CLNEXT
#endif
#ifdef CRPRNT
# undef CRPRNT
#endif
#ifdef CWERASE
# undef CWERASE
#endif
#ifdef CFLUSH
#undef CFLUSH
#endif
#ifdef CDSUSP
#undef CDSUSP
#endif
#endif

---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Schneider, Michael Sent: Wednesday, October 01, 2003 4:32 PM To: 'Aahz' Cc: python-dev@python.org Subject: RE: [Python-Dev] Irix problems That's fine, I can apply the fix to our local src. Can this fix go into the next release? Thanks for your efforts, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: Aahz [mailto:aahz@pythoncraft.com] Sent: Wednesday, October 01, 2003 3:29 PM To: Schneider, Michael Cc: python-dev@python.org Subject: Re: [Python-Dev] Irix problems On Wed, Oct 01, 2003, Schneider, Michael wrote: > > 2.3.2rc2 is the first try, we are updating from 1.5 on SGI... In that case, it's too late. 
We need this fix out quickly to resolve boo-boos in 2.3.1. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From pycon at python.org Wed Oct 1 17:41:59 2003 From: pycon at python.org (PyCon Chair) Date: Wed Oct 1 17:42:09 2003 Subject: [Python-Dev] PyCon DC 2004: Call for Proposals Message-ID: <20031001214159.GA8075@panix.com> [Please repost to local Python mailing lists.] Want to share your expertise? PyCon DC 2004 is looking for proposals to fill the formal presentation tracks. PyCon DC 2003 had a broad range of presentations, from reports on academic and commercial projects to tutorials and case studies, and we hope to extend that range this year. As long as the presentation is interesting and potentially useful to the Python community, it will be considered for inclusion in the program. The proposal deadline is December 1; the proposal submission system should be up by mid-October. We'll send out another notice with more info when the submission system goes live. Proposals should be 250-1000 words in text (plain or reST) or HTML. You may request either thirty or sixty minutes for your timeslot. Proposals will be accepted or rejected by January 1, 2004. If your proposal is accepted, you may include a companion paper for publication on the PyCon web site. If you don't want to make a formal presentation, there will be a significant amount of Open Space to allow for informal and spur-of-the-moment presentations for which no formal submission is required. There will also be several Lightning Talk sessions (five minutes or less). For more information, see http://www.python.org/pycon/dc2004/cfp.html PyCon is a community-oriented conference targeting developers (both those using Python and those working on the Python project). 
It gives you opportunities to learn about significant advances in the Python development community, to participate in a programming sprint with some of the leading minds in the Open Source community, and to meet fellow developers from around the world. The organizers work to make the conference affordable and accessible to all. DC 2004 will be held March 24-26, 2004 in Washington, D.C. The keynote speaker is Mitch Kapor of the Open Source Applications Foundation (http://www.osafoundation.org/). There will be a four-day development sprint before the conference. We're looking for volunteers to help run PyCon. If you're interested, subscribe to http://mail.python.org/mailman/listinfo/pycon-organizers Don't miss any PyCon announcements! Subscribe to http://mail.python.org/mailman/listinfo/pycon-announce You can discuss PyCon with other interested people by subscribing to http://mail.python.org/mailman/listinfo/pycon-interest The central resource for PyCon DC 2004 is http://www.python.org/pycon/dc2004/ -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python-kbutler at sabaydi.com Wed Oct 1 19:00:01 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 1 19:00:20 2003 Subject: [Python-Dev] Bug? re.finditer fails to terminate with empty match Message-ID: <3F7B5C71.7020801@sabaydi.com> The iterator returned by re.finditer appears to not terminate if the final match is empty, but rather keeps returning the final (empty) match. Is this a bug in _sre? If so, I'll be happy to file it, though fixing it is a bit beyond my _sre experience level at this point. The solution would appear to be either to add a check for a duplicate match in iterator.next(), or to increment the position by one after returning an empty match (which should be OK, because if a non-empty match started at that location, we would have returned it instead of the empty match). 
Code to illustrate the failure:

from re import finditer
last = None
for m in finditer( ".*", "asdf" ):
    if last == m.span():
        print "duplicate match:", last
        break
    print m.group(), m.span()
    last = m.span()

---
asdf (0, 4)
 (4, 4)
duplicate match: (4, 4)
---

findall works:

print re.findall( ".*", "asdf" )
['asdf', '']

Workaround is to explicitly check for a duplicate span, as I did above, or to check for a duplicate end(), which avoids the final empty match. kb From greg at electricrain.com Wed Oct 1 19:19:18 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 1 19:19:22 2003 Subject: [Python-Dev] 2.3.2 and bsddb In-Reply-To: <1065013263.19531.24.camel@anthem> References: <200310010610.h916AK2B002343@localhost.localdomain> <1065013263.19531.24.camel@anthem> Message-ID: <20031001231918.GP17491@zot.electricrain.com> On Wed, Oct 01, 2003 at 09:01:03AM -0400, Barry Warsaw wrote: > On Wed, 2003-10-01 at 02:10, Anthony Baxter wrote: > > Just another thought - should the newer pybsddb API be folded into the > > library docs? +1 all for it (for python 2.4). the pybsddb docs should be TeX-ified and included. They were originally written by Robin using a zope-ish formatted ascii -> html generator of some sort so automating the bulk of the task should be possible. > ... but I would make one change. I think the > links to the C API point to pybsddb copies of the Sleepycat > documentation. I'd change those to point to Sleepycat's own online > documentation. It's more fragile, but 1) it means pulling less into > Python's library, and 2) should be more up-to-date as Sleepycat makes > changes and new releases. +1 agreed. One caveat: Sleepycat keeps the documentation for their current release of BerkeleyDB online at http://www.sleepycat.com/docs/. It doesn't mention any of the different behaviours or even API differences between it and older versions of BerkeleyDB. 
We have no way of knowing exactly what version the user's python is compiled against other than in Windows binary releases. Mentioning that caveat in the documentation should be enough. -g From bac at OCF.Berkeley.EDU Wed Oct 1 20:37:46 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 1 20:38:08 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <3F7B735A.8070401@ocf.berkeley.edu> Barry Warsaw wrote: > On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > > >>Interesting; I don't recall the last time I had to build my own >>bzip2. I'm pretty sure I didn't do anything special to get it on >>RedHat recently. > > > No, I'm sure you didn't. bzip2 decompression should be standard on RH9, > and there's even a tar option to read and write it. Considering RH hosts the bzip2 site I would hope you could build on their OS. =) > What I don't know > is whether bz2 decompression is generally available on MacOSX... > It is; StuffIt can decompress it. I just downloaded the GNU Info docs and had no problem with double-clicking the file and decompressing. -Brett From tinuviel at sparcs.kaist.ac.kr Wed Oct 1 23:59:54 2003 From: tinuviel at sparcs.kaist.ac.kr (Seo Sanghyeon) Date: Wed Oct 1 23:59:59 2003 Subject: [Python-Dev] re.finditer Message-ID: <20031002035954.GA27701@sparcs.kaist.ac.kr> Hello, python-dev! This is my first mail to python-dev. The attached one-line patch fixes the re.finditer bug reported by Kevin J. Butler. I read the cvs log to find out why this code was introduced, and it seems to be related to SF bug #581080. But that bug didn't appear after my patch, so I wonder why it was introduced in the first place. It seems beyond my understanding. Please enlighten me. 
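The rule the patch enforces (always advance the scan position by at least one character after an empty match, so the scan is guaranteed to make progress) can be sketched in pure Python. This is an illustrative re-implementation for clarity, not the actual _sre.c logic:

```python
import re

def finditer_sketch(pattern, text):
    # Illustrative scan loop: after an empty match, bump the start
    # position by one so the iterator always terminates.
    pat = re.compile(pattern)
    pos = 0
    while pos <= len(text):
        m = pat.match(text, pos)
        if m is None:
            pos += 1
            continue
        yield m
        pos = m.end() + 1 if m.end() == m.start() else m.end()

spans = [m.span() for m in finditer_sketch(r".*", "asdf")]
# Terminates with two matches: [(0, 4), (4, 4)]
ws = [m.span() for m in finditer_sketch(r"\s", "a b")]
# The #581080 case: a single match, [(1, 2)]
```

With this rule, both of the reported cases terminate.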
To test:

# 581080
import re
list(re.finditer('\s', 'a b'))
# expected: one item list
# bug: hang

# Kevin J. Butler
import re
list(re.finditer('.*', 'asdf'))
# expected: two item list (?)
# bug: hang

Seo Sanghyeon

-------------- next part --------------
? patch
Index: Modules/_sre.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Modules/_sre.c,v
retrieving revision 2.99
diff -c -r2.99 _sre.c
*** Modules/_sre.c	26 Jun 2003 14:41:08 -0000	2.99
--- Modules/_sre.c	2 Oct 2003 03:48:55 -0000
***************
*** 3062,3069 ****
          match = pattern_new_match((PatternObject*) self->pattern,
                                    state, status);
!         if ((status == 0 || state->ptr == state->start) &&
!             state->ptr < state->end)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;
--- 3062,3068 ----
          match = pattern_new_match((PatternObject*) self->pattern,
                                    state, status);
!         if (status == 0 || state->ptr == state->start)
              state->start = (void*) ((char*) state->ptr + state->charsize);
          else
              state->start = state->ptr;

From oussoren at cistron.nl Thu Oct 2 01:49:37 2003 From: oussoren at cistron.nl (Ronald Oussoren) Date: Thu Oct 2 01:49:46 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <34FEA612-F49C-11D7-862D-0003931CFE24@cistron.nl> On 1 okt 2003, at 22:26, Barry Warsaw wrote: > On Wed, 2003-10-01 at 16:24, Fred L. Drake, Jr. wrote: > >> Interesting; I don't recall the last time I had to build my own >> bzip2. I'm pretty sure I didn't do anything special to get it on >> RedHat recently. > > No, I'm sure you didn't. bzip2 decompression should be standard on RH9, > and there's even a tar option to read and write it. 
What I don't know > is whether bz2 decompression is generally available on MacOSX... The bzip command-line utilities are shipped as part of MacOS X. I'm not sure if Stuffit supports bzip-ed archives. Ronald From mwh at python.net Thu Oct 2 06:08:49 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 2 06:08:07 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.9027.715275.62943@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 1 Oct 2003 14:56:03 -0400") References: <200309261752.h8QHqJgD029823@localhost.localdomain> <20030926184853.GB22837@mems-exchange.org> <2mzngr1bf2.fsf@starship.python.net> <16249.65446.699521.103706@grendel.zope.com> <2mbrt11bkd.fsf@starship.python.net> <16251.9027.715275.62943@grendel.zope.com> Message-ID: <2msmmcymry.fsf@starship.python.net> "Fred L. Drake, Jr." writes: > Michael Hudson writes: > > One thing that puzzled me: Doc/Makefile seems to require that > > Doc/tools is on $PATH, unless I'm misunderstanding something. > > It definately doesn't require that; I've never used Doc/tools/ on ^^^^^^^^^^ > $PATH. One thing it was requiring (only recently) was that there was > a mkhowto symlink somewhere on the $PATH that pointed to the mkhowto > script. Well, OK. I was getting "mkhowto: command not found" messages. > I've removed that constraint for the trunk. Thanks! Cheers, mwh -- Very clever implementation techniques are required to implement this insanity correctly and usefully, not to mention that code written with this feature used and abused east and west is exceptionally exciting to debug. 
-- Erik Naggum on Algol-style "call-by-name" From andymac at bullseye.apana.org.au Thu Oct 2 05:55:10 2003 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Thu Oct 2 09:15:18 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16251.14741.876754.348214@grendel.zope.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> Message-ID: <20031002195207.S85276@bullseye.apana.org.au> On Wed, 1 Oct 2003, Fred L. Drake, Jr. wrote: > Interesting. bzip2 saves half a MB over gzip for the HTML and > PostScript formats. If you're producing PDF, why produce Postscript? AFAIK, Ghostscript digests PDF and can generate Postscript for those that have/want to use a Postscript printer. Around here, print shops seem to actually _prefer_ PDF. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From skip at pobox.com Thu Oct 2 09:43:14 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:43:27 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> Message-ID: <16252.11122.781855.197919@montanaro.dyndns.org> Dethe> I've never known a *nix box to have trouble with *.zip or known a Dethe> unix user who had trouble with *.zip. So I'd suggest keeping the Dethe> various flavors of documentation, but standardize on zip Dethe> compression. That will at least remove one variable. Agreed. 
We did encounter a problem with a zip file in the SpamBayes group recently which we believe (though haven't confirmed - the OP has apparently gone underground) was related to WinZip problems. As I understand it, if you set an option in WinZip to "flatten" a zip file, all future zip files are also flattened. I guess it's a case of setting that option then poking the "Save Options" or "OK" button, then forgetting that other zip files will have structure which shouldn't be eliminated. Skip From skip at pobox.com Thu Oct 2 09:45:23 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:45:33 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <1065040005.15765.18.camel@geddy> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> Message-ID: <16252.11251.742634.458564@montanaro.dyndns.org> Barry> What I don't know is whether bz2 decompression is generally Barry> available on MacOSX... Fink is your friend: % type bzip2 bzip2 is /sw/bin/bzip2 so, no, it's not standard on Mac OS X. S From mwh at python.net Thu Oct 2 09:51:58 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 2 09:51:16 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org> (Skip Montanaro's message of "Thu, 2 Oct 2003 08:45:23 -0500") References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <2moewzzr0h.fsf@starship.python.net> Skip Montanaro writes: > Barry> What I don't know is whether bz2 decompression is generally > Barry> available on MacOSX... > > Fink is your friend: > > % type bzip2 > bzip2 is /sw/bin/bzip2 > > so, no, it's not standard on Mac OS X. 
Just because fink supplies something doesn't mean it didn't come with the base install. Jaguar has bzip2 installed; I don't think 10.1 did. Cheers, mwh -- SCSI is not magic. There are fundamental technical reasons why it is necessary to sacrifice a young goat to your SCSI chain now and then. -- John Woods From skip at pobox.com Thu Oct 2 09:58:10 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 2 09:58:23 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <16252.12018.287900.741829@montanaro.dyndns.org> Barry> What I don't know is whether bz2 decompression is generally Barry> available on MacOSX... Skip> Fink is your friend: Skip> % type bzip2 Skip> bzip2 is /sw/bin/bzip2 Skip> so, no, it's not standard on Mac OS X. Sorry, should have used "type -a" so I saw the version in /usr/bin. Skip From fred at zope.com Thu Oct 2 10:04:41 2003 From: fred at zope.com (Fred L. Drake, Jr.) Date: Thu Oct 2 10:04:55 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031002195207.S85276@bullseye.apana.org.au> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> <20031002195207.S85276@bullseye.apana.org.au> Message-ID: <16252.12409.982231.274869@grendel.zope.com> Andrew MacIntyre writes: > If you're producing PDF, why produce Postscript? AFAIK, Ghostscript > digests PDF and can generate Postscript for those that have/want to use a > Postscript printer. Around here, print shops seem to actually _prefer_ > PDF. 
I recall a number of people wanting to use the PostScript to drive real PostScript printers directly. That was some time ago; perhaps Ghostscript can handle PDF sufficiently now. If there's no longer any interest in having the PostScript available, I'll be glad to drop that. I guess I really should come up with a script that pulls the relevant stats from the site logs. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From pinard at iro.umontreal.ca Thu Oct 2 11:14:24 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 2 11:14:35 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: <20031002195207.S85276@bullseye.apana.org.au> References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.13967.470512.219448@magrathea.basistech.com> <16251.14741.876754.348214@grendel.zope.com> <20031002195207.S85276@bullseye.apana.org.au> Message-ID: <20031002151424.GA14552@alcyon.progiciels-bpi.ca> [Andrew MacIntyre] > If you're producing PDF, why produce Postscript? AFAIK, Ghostscript > digests PDF and can generate Postscript for those that have/want to use a > Postscript printer. Around here, print shops seem to actually _prefer_ > PDF. But some of us are not print shops, and have Postscript printers, which are better fed with Postscript, and do not directly accept PDF. PDF to Postscript converters are not 100% dependable, even if they do the job most of the time. Given `.pdf' and `.ps', for one, I would almost always pick the `.ps' file, to avoid possible fights and trouble. -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From martin at v.loewis.de Thu Oct 2 14:39:00 2003 From: martin at v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 2 14:39:27 2003
Subject: [Python-Dev] re.finditer
In-Reply-To: <20031002035954.GA27701@sparcs.kaist.ac.kr>
References: <20031002035954.GA27701@sparcs.kaist.ac.kr>
Message-ID:

Seo Sanghyeon writes:

> But that bug didn't appear after my patch, so I wonder
> why it was introduced in the first place. It seems beyond
> my understanding. Please enlighten me.

Dear Seo Sanghyeon,

Welcome to the list! Please don't post patches here, though - they
*will* get lost. Instead, post them to SF (using a new tracker item),
and discuss them here if you want.

I don't have the time to read your patch right now, so I cannot
comment on the issue itself.

Regards,
Martin

From Jack.Jansen at cwi.nl Thu Oct 2 18:00:06 2003
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Thu Oct 2 18:00:18 2003
Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages
In-Reply-To: <16252.11251.742634.458564@montanaro.dyndns.org>
References: <16251.7839.871263.562935@grendel.zope.com>
	<75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com>
	<16251.14319.693162.119201@grendel.zope.com>
	<1065040005.15765.18.camel@geddy>
	<16252.11251.742634.458564@montanaro.dyndns.org>
Message-ID:

There's always machines out there that won't support newer formats out
of the box, so may I suggest the following course of action:

1. For now we add bz2 compression, and put that at the top of the
   list, with gz far below it. If we want to get real fancy we could
   even put it behind another link "old formats".

2. At some point in the future we look at the http logs to see how
   many people still use the older format.

.Z files were still very useful to some people long after .gz had
become the norm, just because they were stuck on old boxes. And if
Python goes out of its way to remain buildable on various old boxes
as-is it would be silly if we would require people to download
third-party stuff just to decode the documentation...
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From pinard at iro.umontreal.ca Thu Oct 2 20:39:41 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 2 20:39:50 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <20031003003941.GA17401@alcyon.progiciels-bpi.ca> [Jack Jansen] > .Z files were still very useful to some people long after .gz had > become the norm, just because they were stuck on old boxes. Do anybody remember `.z' files? (`pack' and `unpack' were the tools, unless I'm mistaken). I'm _not_ suggesting that they get supported :-). Despite `.Z' is not as old as `.z', they are not very far, once added some perspective. -- Fran?ois Pinard http://www.iro.umontreal.ca/~pinard From azaidi at vsnl.com Thu Oct 2 20:05:24 2003 From: azaidi at vsnl.com (Arsalan Zaidi) Date: Thu Oct 2 21:07:12 2003 Subject: [Python-Dev] Any movement on a SIG for web lib enchancements? Message-ID: <008501c38942$1336a120$b9479cca@LocalHost> There was some discussion about this a few weeks ago. But there's still no SIG. Is anyone working on this yet? --Arsalan From janssen at parc.com Thu Oct 2 21:18:11 2003 From: janssen at parc.com (Bill Janssen) Date: Thu Oct 2 21:18:32 2003 Subject: [Python-Dev] Any movement on a SIG for web lib enchancements? In-Reply-To: Your message of "Thu, 02 Oct 2003 17:05:24 PDT." <008501c38942$1336a120$b9479cca@LocalHost> Message-ID: <03Oct2.181816pdt."58611"@synergy1.parc.xerox.com> Yes, I've been working on a charter. I'll put out a version for folks to look at tomorrow (probably announced on the Meta-SIG; Python-Dev really isn't the right place?). 
Bill > There was some discussion about this a few weeks ago. But there's still no > SIG. > > Is anyone working on this yet? > > --Arsalan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev From fdrake at acm.org Thu Oct 2 23:54:10 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 2 23:54:27 2003 Subject: [Doc-SIG] Re: [Python-Dev] Documentation packages In-Reply-To: References: <16251.7839.871263.562935@grendel.zope.com> <75F17D10-F44B-11D7-B481-0003939B59E8@blastradius.com> <16251.14319.693162.119201@grendel.zope.com> <1065040005.15765.18.camel@geddy> <16252.11251.742634.458564@montanaro.dyndns.org> Message-ID: <16252.62178.436842.527925@grendel.zope.com> Jack Jansen writes: > 1. For now we add bz2 compression, and put that at the top of the > list, with gz far below it. If we want to get real fancy we > could even put it behind another link "old formats". At this point, we've been providing bzip2-compressed tarballs for three years; they became available with Python 1.6 (does anyone even remember that release?). > 2. At some point in the future we look at the http logs to see > how many people still use the older format. I'm hoping to write the script to do that this weekend. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From anthony at interlink.com.au Fri Oct 3 04:17:42 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Oct 3 04:19:51 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <16251.16096.224422.956995@grendel.zope.com> Message-ID: <200310030817.h938Hgtk028368@localhost.localdomain> The current documentation release tools don't build the latex packages. I tried using the Makefile targets, but they seemed to want to check out a HEAD revision of the dist/src/Doc directory. Oops. I've commented out the latex row of the download table for now - Fred, can you look into this and fix? 
Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at python.org Fri Oct 3 04:35:58 2003 From: anthony at python.org (Anthony Baxter) Date: Fri Oct 3 04:37:58 2003 Subject: [Python-Dev] RELEASED Python 2.3.2 (final) Message-ID: <200310030835.h938ZwaN028812@localhost.localdomain> On behalf of the Python development team and the Python community, I'm happy to announce the release of Python 2.3.2 (final). Python 2.3.2 is a bug-fix release, to repair a couple of build problems and packaging errors in Python 2.3.1. For more information on Python 2.3.2, including download links for various platforms, release notes, and known issues, please see: http://www.python.org/2.3.2 Highlights of this new release include: - A bug in autoconf that broke building on HP/UX systems is fixed. - A bug in the Python configure script that meant os.fsync() was never available is fixed. Highlights of the previous major Python release (2.3) are available from the Python 2.3 page, at http://www.python.org/2.3/highlights.html Many apologies for the flaws in 2.3.1 release. Hopefully the new release procedures should stop this happening again. Enjoy the new release, Anthony Anthony Baxter anthony@python.org Python 2.3.2 Release Manager (on behalf of the entire python-dev team) From fdrake at acm.org Fri Oct 3 10:50:04 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Fri Oct 3 10:50:35 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.1 In-Reply-To: <200310030817.h938Hgtk028368@localhost.localdomain> References: <16251.16096.224422.956995@grendel.zope.com> <200310030817.h938Hgtk028368@localhost.localdomain> Message-ID: <16253.35996.751345.757001@grendel.zope.com> Anthony Baxter writes: > The current documentation release tools don't build the latex packages. > I tried using the Makefile targets, but they seemed to want to check out > a HEAD revision of the dist/src/Doc directory. Oops. Argh. Yes; this is a problem with mksourcepkg on branches. 
I think I can improve that a bit, but the best way seems to be running
it with a second argument giving the specific tag we're interested in.
The script also runs into the "we can't keep our anonymous CVS servers
up to date" problem, so I'll mod my local copy to not try to use the
anonymous servers for now.

> I've commented out the latex row of the download table for now - Fred, can
> you look into this and fix?

I'll have them up shortly.

-Fred

-- 
Fred L. Drake, Jr.
PythonLabs at Zope Corporation

From python-kbutler at sabaydi.com Fri Oct 3 11:26:01 2003
From: python-kbutler at sabaydi.com (Kevin J. Butler)
Date: Fri Oct 3 11:26:34 2003
Subject: [Python-Dev] re.finditer
In-Reply-To:
References:
Message-ID: <3F7D9509.7010802@sabaydi.com>

From: Seo Sanghyeon
>This is my first mail to python-dev.

Welcome, nice way to begin. :-)

>Attached one line patch fixes re.finditer bug reported by
>Kevin J. Butler. I read cvs log to find out why this code is
>introduced, and it seems to be related to SF bug #581080.

Excellent, I'll give it a shot.

Meanwhile, I filed a bug: 817234
http://sourceforge.net/tracker/index.php?func=detail&aid=817234&group_id=5470&atid=105470

I included your post & suggested patch.

Thanks!

kb

From python-kbutler at sabaydi.com Fri Oct 3 14:13:16 2003
From: python-kbutler at sabaydi.com (Kevin J. Butler)
Date: Fri Oct 3 14:13:52 2003
Subject: Resolution: was Re: [Python-Dev] re.finditer
In-Reply-To:
References:
Message-ID: <3F7DBC3C.6040003@sabaydi.com>

Summary: bug 817234
http://sourceforge.net/tracker/index.php?func=detail&aid=817234&group_id=5470&atid=105470

In Python 2.3 and 2.3.1, finditer does not raise StopIteration if the
end of the string matches with an empty match. That is, the following
code will loop forever:

>>> import re
>>> i = re.finditer( ".*", "asdf" )
>>> for m in i: print m.span()
...
(0, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)
(4, 4)

Seo Sanghyeon posted what appears to be a correct fix.
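For reference, the runaway behavior can be reproduced and avoided without touching the regex engine at all. The sketch below is a hand-rolled workaround that always makes forward progress, not Seo's actual one-line patch; `finditer_workaround` is a name invented here for illustration.

```python
import re

def finditer_workaround(pattern, string):
    """Like re.finditer, but steps past empty matches by hand so an
    empty match at the end of the string cannot repeat forever.
    (A workaround sketch, not the committed fix.)"""
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(string):
        m = regex.search(string, pos)
        if m is None:
            break
        yield m
        # after a zero-width match, advance one character so the
        # scan always makes progress
        pos = m.end() + 1 if m.end() == m.start() else m.end()

spans = [m.span() for m in finditer_workaround(".*", "asdf")]
print(spans)  # [(0, 4), (4, 4)] -- terminates instead of repeating (4, 4)
```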
The code was introduced in the fix for bug 581080 http://sourceforge.net/tracker/index.php?func=detail&aid=581080&group_id=5470&atid=105470 but removing this line does not re-introduce that bug. Thanks, and kudos to Seo... kb From anthony at interlink.com.au Fri Oct 3 20:08:16 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Oct 3 20:10:23 2003 Subject: [Python-Dev] 2.3.3 plans Message-ID: <200310040008.h9408HtM008544@localhost.localdomain> I'm currently thinking of doing 2.3.3 in about 3 months time. My focus on 2.3.3 will be on fixing the various build glitches that we have on various platforms - I'd like to see 2.3.3 build on as many boxes as possible, "out of the box". Anthony From jason at mastaler.com Sat Oct 4 14:01:39 2003 From: jason at mastaler.com (Jason R. Mastaler) Date: Sat Oct 4 14:01:47 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.2 (final) References: <200310030835.h938ZwaN028812@localhost.localdomain> Message-ID: I wanted to say thanks to Anthony and everyone else for responding so quickly to our concerns with the 2.3.1 release. It's greatly appreciated! From python at discworld.dyndns.org Sat Oct 4 16:58:37 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Sat Oct 4 16:53:55 2003 Subject: [Python-Dev] Re: RELEASED Python 2.3.2 (final) In-Reply-To: ; from jason@mastaler.com on Sat, Oct 04, 2003 at 12:01:39PM -0600 References: <200310030835.h938ZwaN028812@localhost.localdomain> Message-ID: <20031004145837.A13335@discworld.dyndns.org> Jason R. Mastaler wrote: > I wanted to say thanks to Anthony and everyone else for responding so > quickly to our concerns with the 2.3.1 release. It's greatly > appreciated! Hear, hear. Thanks to all involved for their hard work. It's much easier when all you have to do is complain about it :). 
Charles
-- 
-----------------------------------------------------------------------
Charles Cazabon  GPL'ed software available at:
http://www.qcc.ca/~charlesc/software/
-----------------------------------------------------------------------

From cstork at ics.uci.edu Sat Oct 4 19:40:00 2003
From: cstork at ics.uci.edu (Christian Stork)
Date: Sat Oct 4 19:40:55 2003
Subject: [Python-Dev] Efficient predicates for the standard library
Message-ID: <20031004234000.GG25813@ics.uci.edu>

Hi everybody,

This is my first post to python-dev and mailman told me to introduce
myself... I'm a computer science grad student at UC Irvine and I've
been programming in python for quite some time now. I'm originally
from Germany where I studied math together with Marc-Andre Lemburg,
who should be somewhat known on this list. ;-)

I'd like to advocate the inclusion of efficient (ie iterator-based)
predicates to the standard library. If that's asking too much :-) then
consider this just a suggestion for updating the documentation of
itertools.

My reasoning is that these predicates should be used in many places,
especially as part of assert statements. IMHO lowering the burden to
use assert statements is always a good idea.

The examples given in itertools' documentation are a good starting
point. More specifically I'm talking about the following:

def all(pred, seq):
    "Returns True if pred(x) is True for every element in the iterable"
    return False not in imap(pred, seq)

def some(pred, seq):
    "Returns True if pred(x) is True for at least one element in the iterable"
    return True in imap(pred, seq)

def no(pred, seq):
    "Returns True if pred(x) is False for every element in the iterable"
    return True not in imap(pred, seq)

But before including these functions, I would like to propose two
changes.

1. Meaning of None as predicate

The meaning of None as pred.
The above definitions use itertools.imap's interpretation of None as
pred arguments, ie None is interpreted as the function that returns a
tuple of its arguments. Therefore all(None, ) will always return True.
Similar reasoning renders None as pred useless for some() and no().

I would like to propose pred=None's meaning to be the same as for
itertools.ifilter, ie None is interpreted as the identity function,
which--in this context--is the same as the bool() function. Now
all(None, seq) is true iff all of seq's elements are interpreted as
True by bool(). This is potentially valuable information. ;-)

2. Argument order

Now that there's a useful default meaning for pred, we should give it
a default and make it an optional argument. For this the order of
arguments must be reversed. This is different from itertools'
consistent use of iterables as last arguments. I don't know if this is
relevant here. Anyway, since predicates are in general more useful
like this I think it's the better choice.

So, I propose an implementation like this:

def all(seq, pred=bool):
    return False not in imap(pred, seq)

def some(seq, pred=bool):
    return True in imap(pred, seq)

def no(seq, pred=bool):
    return True not in imap(pred, seq)

[ You can see now that the meaning of pred == None was just a
strawman. ;-) ]

For enhanced assert support I'd advocate additional predicates for
easy and fast type checking, eg allListType, allIntType, etc.

Maybe all this should go into its own `preds' module?
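Christian's final proposal runs as-is in modern Python once `imap` is replaced by the lazy built-in `map` (its Python 3 equivalent). A minimal sketch, with the functions renamed with a trailing underscore so they don't shadow today's `all`/`any` builtins:

```python
# Christian's proposed signatures, with lazy map() standing in for
# 2.x itertools.imap; trailing underscores avoid shadowing builtins.
def all_(seq, pred=bool):
    return False not in map(pred, seq)

def some_(seq, pred=bool):
    return True in map(pred, seq)

def no_(seq, pred=bool):
    return True not in map(pred, seq)

assert all_([1, 2, 3])                        # every element is truthy
assert some_([0, "", 5])                      # at least one truthy element
assert no_([0, "", None])                     # no truthy element at all
assert all_([2, 4, 6], lambda x: x % 2 == 0)  # explicit predicate
```

With `pred=bool` as the default, the membership tests compare against canonical `True`/`False` values, so the `in` formulation behaves correctly here.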
-- 
Chris Stork   <><><><><><><><><><><><><>   http://www.ics.uci.edu/~cstork/
OpenPGP fingerprint:  B08B 602C C806 C492 D069  021E 41F3 8C8D 50F9 CA2F

From python at rcn.com Sat Oct 4 21:24:48 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sat Oct 4 21:25:19 2003
Subject: [Python-Dev] Efficient predicates for the standard library
In-Reply-To: <20031004234000.GG25813@ics.uci.edu>
Message-ID: <000001c38adf$77f39ca0$e841fea9@oemcomputer>

[Christian Stork]
> So, I propose an implementation like this:
>
> def all(seq, pred=bool):
>     return False not in imap(pred, seq)
>
> def some(seq, pred=bool):
>     return True in imap(pred, seq)
>
> def no(seq, pred=bool):
>     return True not in imap(pred, seq)

The examples are all useful by themselves, but their primary purpose
is to teach how to use the basic tools. Accordingly, the examples
should not become complex and they should tie in as well as possible
to previously existing knowledge (i.e. common patterns for argument
order).

Your proposal is a net gain and I will change the docs as requested.
Having bool() as a default makes the functions more useful and less
error prone. Also, it increases instructional value by giving an
example of a predicate (for who skipped class that day). Also, your
proposed argument order matches common mathematical usage (i.e. All
elements in a such that is true).

For your own purposes, consider using a faster implementation:

def Some(seq, pred=None):
    for x in ifilter(None, seq):
        return True
    return False

All() and No() have similar fast implementations using ifilterfalse()
and reversing the return values.

> For enhanced assert support I'd advocate additional predicates for
> easy and fast type checking, eg allListType, allIntType, etc.

> Maybe all this should go into it's own `preds' module?
Or maybe not ;-)

Somewhere, Tim has an eloquent and pithy saying which roughly
translates to:

"""Adding trivial functions is a net loss because the burden of
learning or becoming aware of them (and their implementation nuances)
will far exceed the microscopic benefit of saving a line or two that
could be coded on the spot as needed."""

In this case, a single example in the docs may suffice:

    if False in imap(isinstance, seqn, repeat(int)):
        raise TypeError("All arguments must be of type int")

Raymond Hettinger

From fincher.8 at osu.edu Sun Oct 5 00:26:48 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Sat Oct 4 23:28:20 2003
Subject: [Python-Dev] Efficient predicates for the standard library
In-Reply-To: <20031004234000.GG25813@ics.uci.edu>
References: <20031004234000.GG25813@ics.uci.edu>
Message-ID: <200310050026.48218.fincher.8@osu.edu>

On Saturday 04 October 2003 07:40 pm, Christian Stork wrote:
> I'd like to advocate the inclusion of efficient (ie iterator-based)
> predicates to the standard library.

I agree. At the very least, I think such predicates should be in the
itertools module.

> My reasoning is that these predicate should be used in many places,
> especially as part of assert statements.

One of the places where I use them most, to be sure :)

> def all(pred, seq):
>     "Returns True if pred(x) is True for every element in the iterable"
>     return False not in imap(pred, seq)
>
> def some(pred, seq):
>     "Returns True if pred(x) is True at least one element in the iterable"
>     return True in imap(pred, seq)
>
> def no(pred, seq):
>     "Returns True if pred(x) is False for every element in the iterable"
>     return True not in imap(pred, seq)

I would instead call some "any" (it's more standard among the
functional languages I've worked with), and I wouldn't bother with
"no," since it's exactly the same as "not any" (or "not some," as the
case may be).
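Raymond's faster, short-circuiting formulation can be sketched with modern itertools (`filter`/`filterfalse` are the Python 3 spellings of 2.x `ifilter`/`ifilterfalse`); note that for a non-default predicate to take effect it must actually be passed into the filter:

```python
from itertools import filterfalse

def any_(seq, pred=None):
    # filter() with pred=None keeps truthy items; the loop stops at
    # the first one, giving short-circuit behavior.
    for _ in filter(pred, seq):
        return True
    return False

def all_(seq, pred=None):
    # filterfalse() yields items where pred is falsy; seeing even one
    # disproves "all".
    for _ in filterfalse(pred, seq):
        return False
    return True

print(any_([0, 0, 3]), all_([2, 4], lambda x: x % 2 == 0))  # True True
```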
As Raymond Hettinger already mentioned, obviously such predicates over sequences should exhibit short-circuit behavior -- any should return with the first True response and all should return with the first False response. Jeremy From cstork at ics.uci.edu Sun Oct 5 05:57:27 2003 From: cstork at ics.uci.edu ('Christian Stork') Date: Sun Oct 5 05:58:22 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <000001c38adf$77f39ca0$e841fea9@oemcomputer> References: <20031004234000.GG25813@ics.uci.edu> <000001c38adf$77f39ca0$e841fea9@oemcomputer> Message-ID: <20031005095727.GB32122@ics.uci.edu> On Sat, Oct 04, 2003 at 09:24:48PM -0400, Raymond Hettinger wrote: ... > Your proposal is a net gain and I will change the docs as requested. > Having bool() as a default makes the functions more useful and less > error prone. Also, it increases instructional value by giving an > example of a predicate (for who skipped class that day). Also, your > proposed argument order matches common mathematical usage (i.e. All > elements in a such that is true). Thanks, I agree. > For your own purposes, consider using a faster implementation: > > def Some(seq, pred=None): > for x in ifilter(None, seq): > return True > return False > > All() and No() have similar fast implementations using ifilterfalse() > and reversing the return values. Interesting, this is almost exactly what my first attempt at this looked like. Then I saw the examples in the doc and changed to the proposed ones. Honestly, I assumed that x in iterable has a short-circuit implementation. Why doesn't it? > > For enhanced assert support I'd advocate additional predicates for > > easy and fast type checking, eg allListType, allIntType, etc. > > > Maybe all this should go into it's own `preds' module? 
> > Or maybe not ;-) > > Somewhere, Tim has a eloquent and pithy saying which roughly translates > to: > > """Adding trivial functions is a net loss because the burden of learning > or becoming aware of them (and their implementation nuances) will far > exceed the microscopic benefit of saving a line or two that could be > coded on the spot as needed.""" I hear you/him :-) and I'd be fine if you just change the docs. I also agree that introducing predicates a la allInts seems like a bad idea since it's overspecialisation. (I could think of better ways to do type checking anyway.) Let me just give you the reasons (in no particular order) for my suggestion to include the `all' and `some/any' predicates: 1. Efficiency Maybe I'm a bit naive here, but it seems to me that since these predicates involve tight inner loops they offer good potential for speedup, especially when used often and over many iterations. 2. Readabilty If we offer universally-used predicates with succinct names which are available as part of the "batteries included" then that increases readabilty of code a lot. 3. Asserts 1. & 2. encourage the use of asserts, which increases code quality. 4. It's *not* trivial! Contrary to what you imply it's not trivial for everybody to just write efficient and well designed predicates with well-chosen names. This discussion is the proof. :-) > In this case, a single example in the docs may suffice: > > if False in imap(isinstance, seqn, repeat(int)): > raise TypeError("All arguments must be of type int") Just that this would be too much to type for me if I only wanted to quickly (and without too much runtime overhead) check on myself. I'd prefer assert isinstance(seq, [int]) ...but that doesn't exist yet. ...in Python, that is. ;-) Anyway, thanks for the itertools package. I especially enjoy the parts that remind me of Haskell! 
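Raymond's type-check one-liner quoted above runs unchanged in modern Python once `imap` becomes the lazy built-in `map`; `require_ints` is a wrapper name invented here for illustration:

```python
from itertools import repeat

def require_ints(seqn):
    # lazy map() plays the role of 2.x itertools.imap here; isinstance
    # returns a real bool, so membership against False is reliable
    if False in map(isinstance, seqn, repeat(int)):
        raise TypeError("All arguments must be of type int")

require_ints([1, 2, 3])        # passes silently
try:
    require_ints([1, "two", 3])
except TypeError as e:
    print(e)                   # All arguments must be of type int
```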
-- Chris Stork <><><><><><><><><><><><><> http://www.ics.uci.edu/~cstork/ OpenPGP fingerprint: B08B 602C C806 C492 D069 021E 41F3 8C8D 50F9 CA2F From cstork at ics.uci.edu Sun Oct 5 05:59:54 2003 From: cstork at ics.uci.edu (Christian Stork) Date: Sun Oct 5 06:00:49 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <200310050026.48218.fincher.8@osu.edu> References: <20031004234000.GG25813@ics.uci.edu> <200310050026.48218.fincher.8@osu.edu> Message-ID: <20031005095954.GC32122@ics.uci.edu> On Sun, Oct 05, 2003 at 12:26:48AM -0400, Jeremy Fincher wrote: ... > > def some(pred, seq): > > "Returns True if pred(x) is True at least one element in the iterable" > > return True in imap(pred, seq) > > > > def no(pred, seq): > > "Returns True if pred(x) is False for every element in the iterable" > > return True not in imap(pred, seq) > > I would instead call some "any" (it's more standard among the functional > languages I've worked with), and I wouldn't bother with "no," since it's > exactly the same as "not any" (or "not some," as the case may be). Yep, seems better. -- Chris Stork <><><><><><><><><><><><><> http://www.ics.uci.edu/~cstork/ OpenPGP fingerprint: B08B 602C C806 C492 D069 021E 41F3 8C8D 50F9 CA2F From skip at manatee.mojam.com Sun Oct 5 08:01:04 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 5 08:01:15 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310051201.h95C14wE019221@manatee.mojam.com> Bug/Patch Summary ----------------- 531 open / 4217 total bugs (+26) 206 open / 2405 total patches (+5) New Bugs -------- robotparser interactively prompts for username and password (2003-09-28) http://python.org/sf/813986 Grouprefs in lookbehind assertions (2003-09-28) http://python.org/sf/814253 LDFLAGS ignored in Makefile (2003-09-28) http://python.org/sf/814259 new.function raises TypeError for some strange reason... 
(2003-09-28) http://python.org/sf/814266 5454 - documentation wrong for ossaudiodev mixer device (2003-09-29) http://python.org/sf/814606 'import Tkinter' causes windows missing-DLL popup (2003-09-29) http://python.org/sf/814654 RedHat 9 blows up at dlclose of pyexpat.so (2003-09-29) http://python.org/sf/814726 OSF/1 test_dbm segfaults (2003-09-30) http://python.org/sf/814996 bug with ill-formed rfc822 attachments (2003-09-30) http://python.org/sf/815563 thread unsafe file objects cause crash (2003-09-30) http://python.org/sf/815646 test_locale and en_US (2003-10-01) http://python.org/sf/815668 SCO_SV: many modules cannot be imported (2003-10-01) http://python.org/sf/815753 tkMessageBox functions reject "type" and "icon" keywords (2003-10-01) http://python.org/sf/815924 ImportError: No module named _socket (2003-10-01) http://python.org/sf/815999 Missing import in email example (2003-10-01) http://python.org/sf/816344 Fatal Python error: GC object already tracked (2003-10-02) http://python.org/sf/816476 mark deprecated modules in indexes (2003-10-02) http://python.org/sf/816725 webbrowser.open hangs under certain conditions (2003-10-02) http://python.org/sf/816810 term.h present but cannot be compiled (2003-10-02) http://python.org/sf/816929 Float Multiplication (2003-10-02) http://python.org/sf/816946 invalid \U escape gives 0=length unistr (2003-10-03) http://python.org/sf/817156 Email.message example missing arg (2003-10-03) http://python.org/sf/817178 re.finditer hangs on final empty match (2003-10-03) http://python.org/sf/817234 Google kills socket lookup (2003-10-04) http://python.org/sf/817611 Need "new style note" (2003-10-04) http://python.org/sf/817742 select behavior undefined for empty lists (2003-10-04) http://python.org/sf/817920 ossaudiodev FileObject does not support closed const (2003-10-04) http://python.org/sf/818006 installer wakes up Windows File Protection (2003-10-05) http://python.org/sf/818029 use Windows' default programs location. 
(2003-10-05) http://python.org/sf/818030 os.listdir on empty strings. Inconsistent behaviour. (2003-10-05) http://python.org/sf/818059 mailbox._Subfile readline() bug (2003-10-05) http://python.org/sf/818065 New Patches ----------- invalid use of setlocale (2003-09-11) http://python.org/sf/804543 deprecated modules (2003-09-29) http://python.org/sf/814560 Extension logging.handlers.SocketHandler (2003-10-01) http://python.org/sf/815911 popen2 work, fixes bugs 768649 and 761888 (2003-10-01) http://python.org/sf/816059 urllib2.URLError don't calll IOError.__init__ (2003-10-02) http://python.org/sf/816787 dynamic popen2 MAXFD (2003-10-03) http://python.org/sf/817329 urllib2 does not allow for absolute ftp paths (2003-10-03) http://python.org/sf/817379 sprout more file operations in SSLFile, fixes 792101 (2003-10-04) http://python.org/sf/817854 Closed Bugs ----------- urllib/urllib2(?) timeouts (2003-09-10) http://python.org/sf/803634 invalid use of setlocale (2003-09-11) http://python.org/sf/804543 Crash if getvar of a non-existent Tcl variable (2003-09-16) http://python.org/sf/807314 exit() raises exception (2003-09-21) http://python.org/sf/810214 2.3.1 configure bug (2003-09-23) http://python.org/sf/811028 HP/UX vs configure (2003-09-23) http://python.org/sf/811160 webbrowser.open_new() opens in an existing browser window (2003-09-24) http://python.org/sf/812089 Closed Patches -------------- popen fix for multiple quoted arguments (2001-09-29) http://python.org/sf/466451 Add IPPROTO_IPV6 option to the socketmodule (2003-09-27) http://python.org/sf/813445 entry size for cursors (2003-09-27) http://python.org/sf/813877 From python at rcn.com Sun Oct 5 11:46:29 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 5 11:46:58 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031005095727.GB32122@ics.uci.edu> Message-ID: <000001c38b57$d7e91c20$e841fea9@oemcomputer> > Honestly, I assumed that > > x in iterable > > has a 
short-circuit implementation. Why doesn't it? It does. The ifilter() version is faster only because it doesn't have to continually return values to the 'in' iterator. The speedup is a small constant factor. > Let me just give you the reasons (in no particular order) for my > suggestion to include the `all' and `some/any' predicates: > > 1. Efficiency > Maybe I'm a bit naive here, but it seems to me that since these > predicates involve tight inner loops they offer good potential for > speedup, especially when used often and over many iterations. You're guessing incorrectly. The pure python versions use underlying itertools which run at full C speed. You cannot beat the ifilter() version.. > 2. Readabilty > If we offer universally-used predicates with succinct names which are > available as part of the "batteries included" then that increases > readabilty of code a lot. I put the code in the docs in a form so that people can cut and paste the function definitions it as needed. Then, they can use all(), any(), or no() to their heart's content. > 4. It's *not* trivial! > Contrary to what you imply it's not trivial for everybody to just write > efficient and well designed predicates with well-chosen names. This > discussion is the proof. :-) Cut and paste is easy. Raymond From python at rcn.com Sun Oct 5 11:49:28 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 5 11:49:58 2003 Subject: [Python-Dev] Efficient predicates for the standard library Message-ID: <000101c38b58$42d5da00$e841fea9@oemcomputer> > Honestly, I assumed that > > x in iterable > > has a short-circuit implementation. Why doesn't it? It does. The ifilter() version is faster only because it doesn't have to continually return values to the 'in' iterator. The speedup is a small constant factor. > Let me just give you the reasons (in no particular order) for my > suggestion to include the `all' and `some/any' predicates: > > 1. 
Efficiency > Maybe I'm a bit naive here, but it seems to me that since these > predicates involve tight inner loops they offer good potential for > speedup, especially when used often and over many iterations. You're guessing incorrectly. The pure python versions use underlying itertools which loop at full C speed. You cannot beat the ifilter() version. > 2. Readabilty > If we offer universally-used predicates with succinct names which are > available as part of the "batteries included" then that increases > readabilty of code a lot. I put the code in the docs in a form so that people can cut and paste the function definitions it as needed. Then, they can use all(), any(), or no() to their heart's content. > 4. It's *not* trivial! > Contrary to what you imply it's not trivial for everybody to just write > efficient and well designed predicates with well-chosen names. This > discussion is the proof. :-) Cut and paste is your friend. Raymond From jeremy at alum.mit.edu Mon Oct 6 01:18:27 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Oct 6 01:19:57 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python Message-ID: <1065417507.2095.5.camel@localhost.localdomain> test_bsddb hangs for me everytime. This is a current CVS python with BerkeleyDB 4.1.25. I've tried commenting out test_pop and test_mapping_iteration_methods, but it still hangs somewhere. localhost:~/src/python/build-pydebug> ./python ../Lib/test/test_bsddb.py -v test_change (__main__.TestBTree) ... ok test_clear (__main__.TestBTree) ... ok test_close_and_reopen (__main__.TestBTree) ... ok test_contains (__main__.TestBTree) ... ok test_first_next_looping (__main__.TestBTree) ... ok test_get (__main__.TestBTree) ... ok test_getitem (__main__.TestBTree) ... ok test_has_key (__main__.TestBTree) ... ok test_keyordering (__main__.TestBTree) ... ok test_len (__main__.TestBTree) ... ok test_mapping_iteration_methods (__main__.TestBTree) ... ok test_pop (__main__.TestBTree) ... 
ok strace says: stat64("./@test", 0xbfffc980) = -1 ENOENT (No such file or directory) stat64("./__db.@test.", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 stat64("./@test", 0xbfffc830) = -1 ENOENT (No such file or directory) rename("./__db.@test.", "./@test") = 0 stat64("./@test", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 open("./@test", O_RDWR|O_LARGEFILE) = 3 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0 fstat64(3, {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 pread(3, "\0\0\0\0\1\0\0\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0\0"..., 4096, 0) = 4096 pread(3, "\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\1\5\0"..., 4096, 4096) = 4096 futex(0x4055ad40, FUTEX_WAIT, 0, NULL Jeremy From aleaxit at yahoo.com Mon Oct 6 03:17:16 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 6 03:17:24 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031005095727.GB32122@ics.uci.edu> References: <20031004234000.GG25813@ics.uci.edu> <000001c38adf$77f39ca0$e841fea9@oemcomputer> <20031005095727.GB32122@ics.uci.edu> Message-ID: <200310060917.16869.aleaxit@yahoo.com> On Sunday 05 October 2003 11:57 am, 'Christian Stork' wrote: > On Sat, Oct 04, 2003 at 09:24:48PM -0400, Raymond Hettinger wrote: > ... > > > Your proposal is a net gain and I will change the docs as requested. > > Having bool() as a default makes the functions more useful and less > > error prone. Also, it increases instructional value by giving an > > example of a predicate (for who skipped class that day). Also, your > > proposed argument order matches common mathematical usage (i.e. All > > elements in a such that is true). > > Thanks, I agree. Adding to this chorus of agreement, I'd also point out that the form with seq first and pred second ALSO agrees with the usage in list comprehensions of analogous semantics -- [x for x in seq if pred(x)] . 
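[Editorial aside: the predicates under discussion can be sketched concretely. The following is a hypothetical rendering, not Raymond's exact doc text, using the seq-first, pred-second order Alex endorses; in modern Python 3 the built-in filter() is lazy, so it plays the role of 2.3's itertools.ifilter(). The trailing underscores in the names only avoid shadowing the any()/all() built-ins that later Python grew.]

```python
def any_(seq, pred=bool):
    # Short-circuits: filter() is lazy, so iteration stops at the
    # first item for which pred is true.
    for _ in filter(pred, seq):
        return True
    return False

def all_(seq, pred=bool):
    # False as soon as any item fails the predicate.
    for item in seq:
        if not pred(item):
            return False
    return True

def no_(seq, pred=bool):
    # True if no item satisfies the predicate.
    return not any_(seq, pred)
```

For example, `all_([2, 4, 6], lambda x: x % 2 == 0)` is True, while `no_([0, '', None])` is True because every item is falsy.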
> > For your own purposes, consider using a faster implementation: > > > > def Some(seq, pred=None): > > for x in ifilter(None, seq): I suspect that what Raymond means here is ifilter(pred, seq) -- the way he's written it, the pred argument would be ignored. > > return True > > return False > > > > All() and No() have similar fast implementations using ifilterfalse() > > and reversing the return values. > > Interesting, this is almost exactly what my first attempt at this looked > like. Then I saw the examples in the doc and changed to the proposed > ones. > > Honestly, I assumed that > > x in iterable > > has a short-circuit implementation. Why doesn't it? It does. But (assuming the occurrence of x is the Mth out of N items in seq), "return True in imap(pred, seq)" must yield M times and perform M comparisons, while "for x in ifilter(pred, seq): return True" -- while still performing M comparisons, inside ifilter -- yields only once, so it saves performing M-1 yields. The second form is also more tolerant in what it accepts (which is something of a golden rule...) -- it does not malfunction quietly if pred returns true/false values that differ from the "canonical" True and False instances of bool. In some applications, the resulting ability to use an existing pred function directly rather than wrapping it into a bool(...) may further accelerate things. Alex From aleaxit at yahoo.com Mon Oct 6 03:28:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 6 03:28:43 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <000001c38adf$77f39ca0$e841fea9@oemcomputer> References: <000001c38adf$77f39ca0$e841fea9@oemcomputer> Message-ID: <200310060928.38292.aleaxit@yahoo.com> On Sunday 05 October 2003 03:24 am, Raymond Hettinger wrote: ... 
> In this case, a single example in the docs may suffice: > > if False in imap(isinstance, seqn, repeat(int)): > raise TypeError("All arguments must be of type int") If assert is seen as the typical way to posit such checks, then maybe: assert False not in imap(isinstance, seq, repeat(int)), "All args must be int" might be considered to be a didactically preferable example. Personally, I would not really consider this optimal, with either way of expression. Python's error messages are slowly but surely getting better in that, instead of just saying that something (e.g.) "must be int" (and leaving the coder in the dark about WHAT it was instead), more and more such messages are saying "must be int, not str" or the like. Giving examples that lead to less-informative error messages is, IMHO, not a good idea; to give more information in case of errors, of course, does require a bit more code in the check. I guess for sanity checks that are meant to never really trigger an error message, one might be inclined to ignore this issue -- at least until the first time one such message does trigger and one has to go back and re-instrument the checks to be more informative;-). Sorry for the aside, but I care more about helpful error messages than about "efficient predicates", where the efficiency gain bids fair to be a micro-optimization... Alex From mwh at python.net Mon Oct 6 05:44:39 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 6 05:43:51 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <200310040008.h9408HtM008544@localhost.localdomain> (Anthony Baxter's message of "Sat, 04 Oct 2003 10:08:16 +1000") References: <200310040008.h9408HtM008544@localhost.localdomain> Message-ID: <2m3ce6zomw.fsf@starship.python.net> Anthony Baxter writes: > I'm currently thinking of doing 2.3.3 in about 3 months time. 
My focus > on 2.3.3 will be on fixing the various build glitches that we have on > various platforms - I'd like to see 2.3.3 build on as many boxes as > possible, "out of the box". This sounds good. It would be nice to have a more sustained effort this time, and also to get on board people who know the problem platforms (as opposed to "logging on to a testdrive machine and flailing"). What platforms have issues that we know about? There's old SCO (but the fix for that is known), HPUX/ia64, various oddities on Irix. Cheers, mwh -- You owe the Oracle a star-spangled dunce cap. -- Internet Oracularity Internet Oracularity #1299-08 From gmccaughan at synaptics-uk.com Mon Oct 6 07:19:54 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Mon Oct 6 07:20:29 2003 Subject: [Python-Dev] Efficient predicates for the standard library Message-ID: <200310061219.54092.gmccaughan@synaptics-uk.com> Chris Stork wrote: > Now that there's a useful default meaning for pred, we should give > it a default and make it an optional argument. For this the order of > arguments must be reversed. This is different from itertools' consistent > use of iterables as last arguments. I don't know if this is relevant > here. Anyway, since predicates are in general more useful like this > I think it's the better choice. Perhaps that's true for a piece of code given as an example. I don't think it would be sensible if, as you propose, these functions were to be put in the standard library, because there's something better to do with the default args for real applications. def any(pred, *iterables): I think the ability to work with multiple sequences (and not to have to use the argument order iter1, pred, iter2, ...) is more important than the ability to avoid typing "bool,". Another option would be def any(*iterables, pred=bool): for items in imap(None, *iterables): if pred(*items): return True return False which looks to me like it offers the best of both worlds.
-- g From cstork at ics.uci.edu Mon Oct 6 07:44:28 2003 From: cstork at ics.uci.edu (Christian Stork) Date: Mon Oct 6 07:45:30 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <200310061219.54092.gmccaughan@synaptics-uk.com> References: <200310061219.54092.gmccaughan@synaptics-uk.com> Message-ID: <20031006114428.GA7899@ics.uci.edu> On Mon, Oct 06, 2003 at 12:19:54PM +0100, Gareth McCaughan wrote: ... > def any(pred, *iterables): > > I think the ability to work with multiple sequences (and > not to have to use the argument order iter1, pred, iter2, ...) > is more important than the ability to avoid typing "bool,". Raymond would tell you to use either chain() or izip() on your *iterables. ;-) This would also make clear what is actually meant. > Another option would be > > def any(*iterables, pred=bool): >>> def any(*iterables, pred=bool): ------------------------------------------------------------ File "<stdin>", line 1 def any(*iterables, pred=bool): ^ SyntaxError: invalid syntax -Chris From gmccaughan at synaptics-uk.com Mon Oct 6 08:06:18 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Mon Oct 6 08:06:57 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031006114428.GA7899@ics.uci.edu> References: <200310061219.54092.gmccaughan@synaptics-uk.com> <20031006114428.GA7899@ics.uci.edu> Message-ID: <200310061306.18041.gmccaughan@synaptics-uk.com> I said: >> def any(pred, *iterables): >> >> I think the ability to work with multiple sequences (and >> not to have to use the argument order iter1, pred, iter2, ...) >> is more important than the ability to avoid typing "bool,". Chris Stork replied: > Raymond would tell you to use either chain() or izip() on your > *iterables. ;-) This would also make clear what is actually meant. Ugh.
:-) >> Another option would be >> >> def any(*iterables, pred=bool): >>>> def any(*iterables, pred=bool): > > ------------------------------------------------------------ > File "<stdin>", line 1 > def any(*iterables, pred=bool): > ^ > SyntaxError: invalid syntax Aieee! I was so sure you could do that, I didn't bother checking. In fact my thoughts went like this: "Hang on; can you do that? ... Yes, of course you can. I'm just thinking of Lisp, where you can't because of the way keyword args work there. That's a nice benefit of Python's less minimal syntax, isn't it?". How annoying. -- g From joerg at britannica.bec.de Mon Oct 6 08:48:10 2003 From: joerg at britannica.bec.de (Joerg Sonnenberger) Date: Mon Oct 6 08:49:18 2003 Subject: [Python-Dev] nested packages and import order Message-ID: <20031006124810.GA1890@britannica.bec.de> Hi all, I have a package a.b with the following content: a/b/__init__.py: import a.b dir(a.b) Running this generates an AttributeError for b, obviously the import didn't add b to the module "a". Even though it can be argued that importing a package from within is bad style, this is clearly a bug since it's at least surprising. Shouldn't the import create the namespace entry in a after it created the module entry in sys.modules? Joerg P.S.: Please CC me, I'm not subscribed From michael.l.schneider at eds.com Mon Oct 6 10:07:53 2003 From: michael.l.schneider at eds.com (Schneider, Michael) Date: Mon Oct 6 10:07:57 2003 Subject: [Python-Dev] RE: Python-Dev Digest, Vol 3, Issue 10 Message-ID: <49199579A2BB32438A7572AF3DBB2FB501FEED1D@uscimplm001.net.plm.eds.com> Tana, I have a conflict for the 10:30 meeting. Can we get together at 2:00 this afternoon.
I'm sorry for the clash, Mike ---------------------------------------------------------------- Michael Schneider Senior Software Engineering Consultant EDS PLM Solutions "The Greatest Performance Improvement Is the transitioning from a non-working state to the working state" -----Original Message----- From: python-dev-request@python.org [mailto:python-dev-request@python.org] Sent: Sunday, October 05, 2003 12:02 PM To: python-dev@python.org Subject: Python-Dev Digest, Vol 3, Issue 10 From tim.one at comcast.net Mon Oct 6 10:20:56 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 6 10:21:02 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python In-Reply-To: <1065417507.2095.5.camel@localhost.localdomain> Message-ID: [Jeremy Hylton] > test_bsddb hangs for me everytime. This is a current CVS python with > BerkeleyDB 4.1.25. I've tried commenting out test_pop and > test_mapping_iteration_methods, but it still hangs somewhere. On Win98SE, it hangs every time in test_popitem, which I changed like so: def test_popitem(self): print [1] # ADDED THIS k, v = self.f.popitem() print [2] # AND THIS self.assert_(k in self.d) self.assert_(v in self.d.values()) self.assert_(k not in self.f) self.assertEqual(len(self.d)-1, len(self.f)) It prints [1], but not [2]: C:\Code\python\PCbuild>python_d ../lib/test/test_bsddb.py -v test_change (__main__.TestBTree) ... ok test_clear (__main__.TestBTree) ... ok test_close_and_reopen (__main__.TestBTree) ... ok test_contains (__main__.TestBTree) ... ok test_first_next_looping (__main__.TestBTree) ... ok test_get (__main__.TestBTree) ...
ok test_getitem (__main__.TestBTree) ... ok test_has_key (__main__.TestBTree) ... ok test_keyordering (__main__.TestBTree) ... ok test_len (__main__.TestBTree) ... ok test_mapping_iteration_methods (__main__.TestBTree) ... ok test_pop (__main__.TestBTree) ... ok test_popitem (__main__.TestBTree) ... [1] A stacktrace at the point it's hung; looks like deadlock: _BSDDB_D! __db_win32_mutex_lock + 134 bytes _BSDDB_D! __lock_get + 2264 bytes _BSDDB_D! __lock_get + 197 bytes _BSDDB_D! __db_lget + 365 bytes _BSDDB_D! __bam_search + 322 bytes _BSDDB_D! __bam_c_rget + 3535 bytes _BSDDB_D! __bam_c_dup + 1251 bytes _BSDDB_D! __db_c_get + 875 bytes _BSDDB_D! __db_delete + 378 bytes _DB_delete(DBObject * 0x00ba1ee8, __db_txn * 0x00000000, __db_dbt * 0x0062d9b0, int 0) line 545 + 29 bytes DB_ass_sub(DBObject * 0x00ba1ee8, _object * 0x00881d10, _object * 0x00000000) line 2343 + 17 bytes PyObject_DelItem(_object * 0x00ba1ee8, _object * 0x00881d10) line 155 + 16 bytes eval_frame(_frame * 0x0098a368) line 1460 + 13 bytes PyEval_EvalCodeEx(PyCodeObject * 0x00bb6550, _object * 0x008782d8, _object * 0x00000000, _object * * 0x008f034c, int 2, _object * * 0x00000000, int 0, _object * * 0x00000000, int 0, _object * 0x00000000) line 2663 + 9 bytes ... eval_frame() is executing DELETE_SUBSCR. Good(?) news: test_popitem continues to hang even if all other tests are commented out: C:\Code\python\PCbuild>python_d ../lib/test/test_bsddb.py -v test_popitem (__main__.TestBTree) ... [1] Same stacktrace at that point. I was using a debug-build CVS Python above. It also hangs the same place using a release-build Python. From tjreedy at udel.edu Mon Oct 6 12:41:03 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 6 12:41:08 2003 Subject: [Python-Dev] Re: nested packages and import order References: <20031006124810.GA1890@britannica.bec.de> Message-ID: "Joerg Sonnenberger" wrote in message news:20031006124810.GA1890@britannica.bec.de... 
> Hi all, > I have a package a.b with the following content: [snip] Please direct current Python version usage questions to comp.lang.python or the equivalent mailing list (see www.python.org). The py-dev mailing list (and its mirror, g.c.p.devel) is for discussion of the next and future version of Python. > P.S.: Please CC me, I'm not subscribed Done TJR From g_will at cyberus.ca Mon Oct 6 16:04:25 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Mon Oct 6 16:03:56 2003 Subject: [Python-Dev] ConfigParser items method Message-ID: <000e01c38c45$0aebe650$7654e640@amd950> Hi All, There is a consistency problem between the RawConfigParser and the ConfigParser and SafeConfigParser with the items method. In the first case a list of tuples is returned and in the second two a generator is returned. This is quite confusing and I thought that this was a bug, but the docs indicate that this is what is supposed to happen. An items method that returned a list of tuples as it does in the RawConfigParser would be a useful method to have for both ConfigParser and SafeConfigParser. The RawConfigParser docs say that items should return a list: items( section) Return a list of (name, value) pairs for each option in the given section. The ConfigParser docs say that items should return a generator: items( section[, raw[, vars]]) Create a generator which will return a tuple (name, value) for each option in the given section. Optional arguments have the same meaning as for the get() method. New in version 2.3. RawConfigParser returns list: >>> Config.config >>> Config.config.items("personal") [('age', '21'), ('company', 'Aztec'), ('name', 'karthik')] >>> ConfigParser and SafeConfigParser return generator: >>> Config.config >>> Config.config.items("personal") >>> for item in Config.config.items("personal"): ... print item ...
('age', '21') ('company', 'Aztec') ('name', 'karthik') >>> Config.config >>> Config.config.items("personal") >>> for item in Config.config.items("personal"): ... print item ... ('age', '21') ('company', 'Aztec') ('name', 'karthik') It doesn't make sense to me that the same method should return different objects. Maybe another name for ConfigParser and SafeConfigParser would be appropriate to indicate that a generator was being returned. Regards, Gordon Williams From skip at pobox.com Mon Oct 6 16:07:11 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 6 16:07:26 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <2m3ce6zomw.fsf@starship.python.net> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> Message-ID: <16257.52079.226636.407139@montanaro.dyndns.org> Michael> This sounds good. It would be nice to have a more sustained Michael> effort this time, and also to get on board people who know the Michael> problem platforms (as opposed to "logging on to a testdrive Michael> machine and flailing"). It's not quite exhaustive yet, but I will remind people about the PythonTesters wiki page: http://www.python.org/cgi-bin/moinmoin/PythonTesters Maybe that page should also mention some of the vendor-specific test sites (HP Test Drive, SourceForge compile farm, PBF server farm, ...). Michael> What platforms have issues that we know about? There's old SCO Michael> (but the fix for that is known), HPUX/ia64, various oddities on Michael> Irix. I think it would be real nice if we hammered hard on the bsddb3 problems. Whatever it is, it seems to affect a broad cross-section of the community. Skip From fdrake at acm.org Mon Oct 6 16:12:45 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Mon Oct 6 16:12:58 2003 Subject: [Python-Dev] ConfigParser items method In-Reply-To: <000e01c38c45$0aebe650$7654e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> Message-ID: <16257.52413.154509.392409@grendel.zope.com> Gordon Williams writes: > An items method that returned a list of tuples as it does in the > RawConfigParser would be a useful method to have for both ConfigParser and > SafeConfigParser. I'm happy for these to always return a list. I probably changed this around when I refactored the classes into raw/classic/safe flavors without really thinking about it. If there are no objections, feel free to file a bug report on SourceForge and assign it to me. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From g_will at cyberus.ca Mon Oct 6 16:35:48 2003 From: g_will at cyberus.ca (Gordon Williams) Date: Mon Oct 6 16:35:19 2003 Subject: [Python-Dev] ConfigParser items method References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> Message-ID: <000701c38c49$6d75e510$7654e640@amd950> Hi Fred, I can't log into sourceforge bugs. I will leave it in your capable hands. Regards, Gordon Williams ----- Original Message ----- From: "Fred L. Drake, Jr." To: "Gordon Williams" Cc: Sent: Monday, October 06, 2003 4:12 PM Subject: Re: [Python-Dev] ConfigParser items method > > Gordon Williams writes: > > An items method that returned a list of tuples as it does in the > > RawConfigParser would be a useful method to have for both ConfigParser and > > SafeConfigParser. > > I'm happy for these to always return a list. I probably changed this > around when I refactored the classes into raw/classic/safe flavors > without really thinking about it. > > If there are no objections, feel free to file a bug report on > SourceForge and assign it to me. > > > -Fred > > -- > Fred L. Drake, Jr. > PythonLabs at Zope Corporation > From fdrake at acm.org Mon Oct 6 16:41:42 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon Oct 6 16:41:53 2003 Subject: [Python-Dev] ConfigParser items method In-Reply-To: <000701c38c49$6d75e510$7654e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> <000701c38c49$6d75e510$7654e640@amd950> Message-ID: <16257.54150.334033.186260@grendel.zope.com> Gordon Williams writes: > I cant log into sourceforge bugs. I will leave it in your capable hands. Report filed: http://sourceforge.net/tracker/index.php?func=detail&aid=818861&group_id=5470&atid=105470 -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Mon Oct 6 16:52:21 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 6 16:52:32 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: <16257.52079.226636.407139@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: Skip Montanaro writes: > I think it would be real nice if we hammered hard on the bsddb3 problems. > Whatever it is, it seems to affect a broad cross-section of the community. But is there a single report that cannot be attributed to multi-threading, or multi-processes? Regards, Martin From skip at pobox.com Mon Oct 6 16:59:51 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 6 17:00:02 2003 Subject: [Python-Dev] 2.3.3 plans In-Reply-To: References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: <16257.55239.526267.58978@montanaro.dyndns.org> Martin> Skip Montanaro writes: >> I think it would be real nice if we hammered hard on the bsddb3 >> problems. Whatever it is, it seems to affect a broad cross-section >> of the community. Martin> But is there a single report that cannot be attributed to Martin> multi-threading, or multi-processes? I don't know. 
But the fact that we have so far been unable to answer even this question reliably means we have some work to do. I have been assuming that the problems have all been related to access from multiple threads or processes, but others haven't seemed so sure. What about the popitem() hangs? Skip From greg at electricrain.com Mon Oct 6 18:01:16 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Mon Oct 6 18:01:27 2003 Subject: [Python-Dev] bsddb & popitems In-Reply-To: <16257.55239.526267.58978@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> <16257.55239.526267.58978@montanaro.dyndns.org> Message-ID: <20031006220116.GB8308@zot.electricrain.com> On Mon, Oct 06, 2003 at 03:59:51PM -0500, Skip Montanaro wrote: > > Martin> Skip Montanaro writes: > >> I think it would be real nice if we hammered hard on the bsddb3 > >> problems. Whatever it is, it seems to affect a broad cross-section > >> of the community. > > Martin> But is there a single report that cannot be attributed to > Martin> multi-threading, or multi-processes? > > I don't know. But the fact that we have so far been unable to answer even > this question reliably means we have some work to do. I have been assuming > that the problems have all been related to access from multiple threads or > processes, but others haven't seemed so sure. What about the popitem() > hangs? The popitem() stack trace on win98 that was just posted still looks like a BerkeleyDB issue. Its stuck in a lock. (bsddb always opens its database and environment with DB_INIT_LOCK and DB_THREAD flags because it can't tell if it will be used by multiple threads) I agree with your assumption and still believe that it is BerkeleyDB / OS locking issues causing the hangs on various platforms. 
also, not related to the bsddb problem but since i noticed it...: Looking at the code for popitem it looks like bsddb uses UserDict.DictMixin's implementation which does not look thread safe if two threads were removing things from the "dict" with only one of them using popitem. Am I missing something? A race condition exists in iteritems between finding that k exists in the dictionary and looking up self[k]. def iteritems(self): for k in self: yield (k, self[k]) Greg From greg at cosc.canterbury.ac.nz Mon Oct 6 20:33:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 6 20:33:41 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310060928.38292.aleaxit@yahoo.com> Message-ID: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> Alex Martelli : > Python's error messages are slowly but surely getting better in that, > instead of just saying that something (e.g.) "must be int" (and > leaving the coder in the dark about WHAT it was instead) While we're on the subject of error messages, I'd like to point out another one that could be improved. Often one sees things like TypeError: foo() takes exactly 1 argument (2 given) In the case where foo() is a method of some class, and there are various versions of foo() defined in various superclasses, it's sometimes hard to tell exactly *which* foo it was trying to call. It would be much more useful if the module and class names were included in the error message, e.g. TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 given) The same goes for function names quoted in the traceback. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 6 20:38:28 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 6 20:39:14 2003 Subject: Keyword-only arguments (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310061306.18041.gmccaughan@synaptics-uk.com> Message-ID: <200310070038.h970cSm02652@oma.cosc.canterbury.ac.nz> Gareth McCaughan : > >>>> def any(*iterables, pred=bool): > > > > ------------------------------------------------------------ > > File "", line 1 > > def any(*iterables, pred=bool): > > ^ > > SyntaxError: invalid syntax > > Aieee! I was so sure you could do that, I didn't bother > checking I was just thinking the other day that you *should* be able to say that. Any keyword arguments after a * arg would have to be specified by keyword in the call. So many PEP ideas, so little time... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 7 04:56:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 7 04:56:42 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> Message-ID: <200310071056.34954.aleaxit@yahoo.com> On Tuesday 07 October 2003 02:33 am, Greg Ewing wrote: > Alex Martelli : > > Python's error messages are slowly but surely getting better in that, > > instead of just saying that something (e.g.) "must be int" (and > > leaving the coder in the dark about WHAT it was instead) > > While we're on the subject of error messages, I'd like to > point out another one that could be improved. 
Often one > sees things like > > TypeError: foo() takes exactly 1 argument (2 given) > > In the case where foo() is a method of some class, and there > are various versions of foo() defined in various superclasses, > it's sometimes hard to tell exactly *which* foo it was trying > to call. It would be much more useful if the module and > class names were included in the error message, e.g. > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 > given) A perennial beginners' confusion (recently highlighted in a c.l.py thread whose subject claimed that Python can't count;-) is about that "number of arguments given" number: one calls zoop.bleep() and is told bleep "takes exactly 2 arguments (1 given)" when one is sure that one has given no argument at all (and should give exactly 1) -- the implied 'self' causing the beginners' confusion. It seems to me that, if we work on these messages, we may be able to distinguish the bound-method case into TypeError: bound method bleep() of Zoop instance takes exactly 1 argument (0 given) or some such... Alex From just at letterror.com Tue Oct 7 06:17:03 2003 From: just at letterror.com (Just van Rossum) Date: Tue Oct 7 06:17:11 2003 Subject: Keyword-only arguments (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310070038.h970cSm02652@oma.cosc.canterbury.ac.nz> Message-ID: Greg Ewing wrote: > > >>>> def any(*iterables, pred=bool): > > > > > > ------------------------------------------------------------ > > > File "", line 1 > > > def any(*iterables, pred=bool): > > > ^ > > > SyntaxError: invalid syntax > > > > Aieee! I was so sure you could do that, I didn't bother > > checking > > I was just thinking the other day that you *should* be > able to say that. Any keyword arguments after a * arg > would have to be specified by keyword in the call. Same here. 
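(The rule Greg and Just want -- keyword arguments after a `*` parameter
must be passed by keyword -- is exactly what Python 3.0 later adopted as
PEP 3102.  A sketch of such a definition under that syntax; `any_true`
is an illustrative stand-in for the `any` being discussed, not a real
builtin:)

```python
def any_true(*iterables, pred=bool):
    """Return True if pred(item) holds for any item of any iterable."""
    # 'pred' follows *iterables, so callers can only supply it by
    # keyword: any_true(a, b, pred=f) works, while a positional f
    # would simply become one more entry in 'iterables'.
    for iterable in iterables:
        for item in iterable:
            if pred(item):
                return True
    return False

print(any_true([0, 0, 3]))                       # True
print(any_true([0, ""], pred=lambda x: x == 0))  # True
```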
Here's another limitation I think is unnecessary:

>>> args = (1, 2, 3)
>>> foo(*args, 4, 5, 6)
  File "<stdin>", line 1
    foo(*args, 4, 5, 6)
               ^
SyntaxError: invalid syntax
>>>

> So many PEP ideas, so little time...

You got that right...

Just

From skip at pobox.com Tue Oct 7 07:41:01 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 7 07:41:13 2003
Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev
In-Reply-To: <20031006235549.GA14656@cthulhu.gerg.ca>
References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca>
Message-ID: <16258.42573.626779.73842@montanaro.dyndns.org>

    Greg> On 06 October 2003, Skip Montanaro said:
    >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev
    >> (and similarly for python-checkins/python-dev).  Can that be
    >> engineered through Mailman?

    Greg> Yes -- it's on the "General Options" page.  Look for
    Greg> reply_goes_to_list.

After seeing your answer I know I asked the wrong question.  I shouldn't
have said "Reply-To:".  In X?Emacs/VM, I just hit the 'f' key to reply to
you and to cc spambayes-dev.  Had this been a spambayes-checkins message,
it would have been nice if the cc went to spambayes-dev instead of
spambayes-checkins.

I can probably solve that for myself by tweaking the vm-followup command
(what the 'f' key is bound to), but there's probably not a general
solution.  Setting Reply-To: *might* be okay in a situation like this
where you don't want chit-chat on a checkins list to get lost or not seen
by the larger audience, but I'd only use it as a last resort.

Skip

[OT PS] cthulhu.gerg.ca?  Is that some sort of
pronounceable-only-by-Native-Canadians name?
From barry at python.org Tue Oct 7 08:30:24 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 08:30:29 2003 Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <1065529823.993.38.camel@anthem> On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > Greg> On 06 October 2003, Skip Montanaro said: > >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev > >> (and similarly for python-checkins/python-dev). Can that be > >> engineered through Mailman? > > Greg> Yes -- it's on the "General Options" page. Look for > Greg> reply_goes_to_list. > > After seeing your answer I know I asked the wrong question. I > shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to > reply to you and to cc spambayes-dev. Had this been a spambayes-checkins > message, it would have been nice if the cc went to spambayes-dev instead of > spambayes-checkin. > > I can probably solve that for myself by tweaking the vm-followup command > (what the 'f' key is bound to), but there's probably not a general solution. > Setting Reply-To: *might* be okay in a situation like this where you don't > want chit-chat on a checkins list to get lost or not seen by the larger > audience, but I'd only use it as a last resort. IMO as an anti-Reply-to munger, I think this is one situation where Reply-To hacking is perfectly legit. You don't want discussions on -checkins, you want them on the discuss mailing list (in this case spambayes-dev). 
MM2.1 can be configured to retain any existing Reply-To fields so people who have to set this to worm around their broken mail systems can still be coddled. python-devers and spambayes-devers, you vant I should do dis? -Barry From sjoerd at acm.org Tue Oct 7 08:44:53 2003 From: sjoerd at acm.org (Sjoerd Mullender) Date: Tue Oct 7 08:45:05 2003 Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065529823.993.38.camel@anthem> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <1065529823.993.38.camel@anthem> Message-ID: <3F82B545.2060702@acm.org> Barry Warsaw wrote: > On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > >> Greg> On 06 October 2003, Skip Montanaro said: >> >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev >> >> (and similarly for python-checkins/python-dev). Can that be >> >> engineered through Mailman? >> >> Greg> Yes -- it's on the "General Options" page. Look for >> Greg> reply_goes_to_list. >> >>After seeing your answer I know I asked the wrong question. I >>shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to >>reply to you and to cc spambayes-dev. Had this been a spambayes-checkins >>message, it would have been nice if the cc went to spambayes-dev instead of >>spambayes-checkin. >> >>I can probably solve that for myself by tweaking the vm-followup command >>(what the 'f' key is bound to), but there's probably not a general solution. >>Setting Reply-To: *might* be okay in a situation like this where you don't >>want chit-chat on a checkins list to get lost or not seen by the larger >>audience, but I'd only use it as a last resort. 
> > > IMO as an anti-Reply-to munger, I think this is one situation where > Reply-To hacking is perfectly legit. You don't want discussions on > -checkins, you want them on the discuss mailing list (in this case > spambayes-dev). MM2.1 can be configured to retain any existing Reply-To > fields so people who have to set this to worm around their broken mail > systems can still be coddled. > > python-devers and spambayes-devers, you vant I should do dis? +1 -- Sjoerd Mullender From aahz at pythoncraft.com Tue Oct 7 10:29:22 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 7 10:29:27 2003 Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <20031007142921.GA27594@panix.com> On Tue, Oct 07, 2003, Skip Montanaro wrote: > > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? http://cthulhu.fnord.at/ -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From kennypitt at hotmail.com Tue Oct 7 11:01:23 2003 From: kennypitt at hotmail.com (Kenny Pitt) Date: Tue Oct 7 11:02:02 2003 Subject: [Python-Dev] RE: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065529823.993.38.camel@anthem> Message-ID: Barry Warsaw wrote: > On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: >> Greg> On 06 October 2003, Skip Montanaro said: >> >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev >> >> (and similarly for python-checkins/python-dev). Can that be >> >> engineered through Mailman? 
>> [snip] > > IMO as an anti-Reply-to munger, I think this is one situation where > Reply-To hacking is perfectly legit. You don't want discussions on > -checkins, you want them on the discuss mailing list (in this case > spambayes-dev). MM2.1 can be configured to retain any existing > Reply-To fields so people who have to set this to worm around their > broken mail systems can still be coddled. > > python-devers and spambayes-devers, you vant I should do dis? > -Barry +1 -- Kenny Pitt From jeremy at zope.com Tue Oct 7 11:17:21 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Tue Oct 7 11:20:51 2003 Subject: [Python-Dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <1065539841.2322.11.camel@localhost.localdomain> On Tue, 2003-10-07 at 07:41, Skip Montanaro wrote: > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? No. It's H.P. Lovecraft. Just be glad he didn't choose yog-sothoth or tsathoggua. In college, we had a cluster of machines in my living group that were named (briefly) after Lovecraft's creatures. No one could remember how to spell the names. Jeremy From barry at python.org Tue Oct 7 11:42:48 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 11:42:56 2003 Subject: [Python-Dev] test_bsddb hangs with CVS Python In-Reply-To: <1065417507.2095.5.camel@localhost.localdomain> References: <1065417507.2095.5.camel@localhost.localdomain> Message-ID: <1065541367.17466.0.camel@anthem> On Mon, 2003-10-06 at 01:18, Jeremy Hylton wrote: > test_bsddb hangs for me everytime. This is a current CVS python with > BerkeleyDB 4.1.25. 
I've tried commenting out test_pop and > test_mapping_iteration_methods, but it still hangs somewhere. > > localhost:~/src/python/build-pydebug> ./python ../Lib/test/test_bsddb.py > -v > test_change (__main__.TestBTree) ... ok > test_clear (__main__.TestBTree) ... ok > test_close_and_reopen (__main__.TestBTree) ... ok > test_contains (__main__.TestBTree) ... ok > test_first_next_looping (__main__.TestBTree) ... ok > test_get (__main__.TestBTree) ... ok > test_getitem (__main__.TestBTree) ... ok > test_has_key (__main__.TestBTree) ... ok > test_keyordering (__main__.TestBTree) ... ok > test_len (__main__.TestBTree) ... ok > test_mapping_iteration_methods (__main__.TestBTree) ... ok > test_pop (__main__.TestBTree) ... ok > > strace says: > stat64("./@test", 0xbfffc980) = -1 ENOENT (No such file or > directory) > stat64("./__db.@test.", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > stat64("./@test", 0xbfffc830) = -1 ENOENT (No such file or > directory) > rename("./__db.@test.", "./@test") = 0 > stat64("./@test", {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > open("./@test", O_RDWR|O_LARGEFILE) = 3 > fcntl64(3, F_SETFD, FD_CLOEXEC) = 0 > fstat64(3, {st_mode=S_IFREG|0664, st_size=8192, ...}) = 0 > pread(3, "\0\0\0\0\1\0\0\0\0\0\0\0b1\5\0\t\0\0\0\0\20\0\0\0\t\0\0"..., > 4096, 0) = 4096 > pread(3, "\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\20\1\5\0"..., > 4096, 4096) = 4096 > futex(0x4055ad40, FUTEX_WAIT, 0, NULL Same thing here, CVS Python 2.3+, RH9, BDB 4.1.25. -Barry From pje at telecommunity.com Tue Oct 7 11:44:50 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 7 11:45:57 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> References: <200310060928.38292.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> At 01:33 PM 10/7/03 +1300, Greg Ewing wrote: >While we're on the subject of error messages, I'd like to >point out another one that could be improved. Often one >sees things like > > TypeError: foo() takes exactly 1 argument (2 given) > >In the case where foo() is a method of some class, and there >are various versions of foo() defined in various superclasses, >it's sometimes hard to tell exactly *which* foo it was trying >to call. It would be much more useful if the module and >class names were included in the error message, e.g. > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 >given) AFAICT, this would at least require a compiler change, and a change to the layout of code objects, so that a code object would know its "dotted name". >The same goes for function names quoted in the traceback. Don't tracebacks give line number and file? From barry at python.org Tue Oct 7 11:46:51 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 11:47:08 2003 Subject: [Python-Dev] RE: [spambayes-dev] spambayes-checkins -> spambayes-dev,python-checkins -> python-dev In-Reply-To: References: Message-ID: <1065541611.17466.2.camel@anthem> On Tue, 2003-10-07 at 11:01, Kenny Pitt wrote: > Barry Warsaw wrote: > > IMO as an anti-Reply-to munger, I think this is one situation where > > Reply-To hacking is perfectly legit. You don't want discussions on > > -checkins, you want them on the discuss mailing list (in this case > > spambayes-dev). MM2.1 can be configured to retain any existing > > Reply-To fields so people who have to set this to worm around their > > broken mail systems can still be coddled. 
> >
> > python-devers and spambayes-devers, you vant I should do dis?
> > -Barry
>
> +1

Done for both lists.
-Barry

From popiel at wolfskeep.com Tue Oct 7 14:09:25 2003
From: popiel at wolfskeep.com (T. Alexander Popiel)
Date: Tue Oct 7 14:09:36 2003
Subject: [Python-Dev] Re: [spambayes-dev] spambayes-checkins -> spambayes-dev, python-checkins -> python-dev
In-Reply-To: Message from Barry Warsaw of "Tue, 07 Oct 2003 08:30:24 EDT." <1065529823.993.38.camel@anthem>
References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <1065529823.993.38.camel@anthem>
Message-ID: <20031007180925.8F8362DE90@cashew.wolfskeep.com>

In message: <1065529823.993.38.camel@anthem>
            Barry Warsaw writes:

[ talk of mangling the reply-to on -checkins to point to -dev,
  iff there is no pre-existing reply-to ]

>python-devers and spambayes-devers, you vant I should do dis?

+1

- Alex

From g_will at cyberus.ca Tue Oct 7 14:11:36 2003
From: g_will at cyberus.ca (Gordon Williams)
Date: Tue Oct 7 14:11:10 2003
Subject: [Python-Dev] ConfigParser case sensitive and strings vs objects returned
References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com>
Message-ID: <005b01c38cfe$724ee220$6c57e640@amd950>

Hi Fred,

A couple of other things about the ConfigParser module that I find a bit
strange and I'm not sure are intended behavior.

1. Option gets converted to lower case and therefore is not case
sensitive, but section is case sensitive.  I would have thought that both
would be or neither would be case sensitive.  (My preference would be that
neither would be case sensitive.)
example if I have a config.txt file with:

[File 1]
databaseADF adsfa:octago DASFDAS
user:Me
password:blank

then when this gets written out it is (where databaseADF is now
databaseadf):

[File 1]
databaseadf adsfa = octago DASFDAS
password = blank
user = Me

Using 'file 1' instead of 'File 1':

>>> Config.config
>>> Config.config.options('file 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 240, in options
    raise NoSectionError(section)
NoSectionError: No section: 'file 1'

But using 'dataBASEadf adsfa' instead of 'databaseADF adsfa' or
'databaseadf adsfa ' is OK and returns the correct value:

>>> Config.config.get('File 1', 'dataBASEadf adsfa')
'octago DASFDAS'

The differences in handling the option and section are annoying and should
at least be described in the docs if they can't be changed.

2. SafeConfigParser is the recommended ConfigParser in the docs.  I'm not
sure what is meant by safe.  When values are read in from a file they are
first converted to strings.  This is not true for values set within the
code.  If I set an option with anything other than a string then this
occurs:

>>> Config.config.set('File 1', 'test', 2)
>>> Config.config.get('File 1', 'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'int' object has no attribute 'find'

Likely the value assigned to the object should be first converted to a
string before it is stored.
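(A sketch of the coercion Gordon is suggesting.  `CoercingParser` is a
hypothetical name, and the code is written against the modern
`configparser` spelling rather than the 2.3-era `ConfigParser`; the idea
is simply that `set()` turns non-strings into strings before storing
them, so `get()` never hands a non-string to the interpolation code:)

```python
from configparser import ConfigParser

class CoercingParser(ConfigParser):
    """Hypothetical parser that stores every value as a string."""

    def set(self, section, option, value=None):
        # Coerce non-string values (ints, floats, dicts, ...) to str
        # on the way in, mirroring what happens to values read from a
        # file, so interpolation in get() always sees a string.
        if value is not None and not isinstance(value, str):
            value = str(value)
        super().set(section, option, value)

cp = CoercingParser()
cp.add_section('File 1')
cp.set('File 1', 'test', 2)               # stored as the string '2'
print(repr(cp.get('File 1', 'test')))     # '2'
```

Modern `configparser` takes the other option Fred mentions below and
raises TypeError for non-string values, putting the burden on the caller.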
The same thing happens if a dict or float is passed in as a default:

>>> c = Config.SafeConfigParser({'test':{'1':"One",'2':"Two"}, 'foo':2.3})

This looks OK:

>>> c.write(sys.stdout)
[DEFAULT]
test = {'1': 'One', '2': 'Two'}
foo = 2.3

Problem with get:

>>> c.get('DEFAULT', 'test')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'dict' object has no attribute 'find'

>>> c.get('DEFAULT', 'foo')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'float' object has no attribute 'find'
>>>

If we set raw=True, then we get back an object and not a string:

>>> c.get('DEFAULT', 'foo', raw=True)
2.2999999999999998

If we use vars={} an exception is also thrown:

>>> c.get('DEFAULT', 'junk', vars={'junk': 99})
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 518, in get
    return self._interpolate(section, option, value, d)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 576, in _interpolate
    self._interpolate_some(option, L, rawval, section, vars, 1)
  File "E:\PROGRA~1\PYTHON23\lib\ConfigParser.py", line 585, in _interpolate_some
    p = rest.find("%")
AttributeError: 'int' object has no attribute 'find'

One last comment is that 'interpolation' is a bit confusing in the docs.
Maybe 'substitution' would be a better word. Thanks, Gordon Williams From fdrake at acm.org Tue Oct 7 14:48:13 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 7 14:48:30 2003 Subject: [Python-Dev] Re: ConfigParser case sensitive and strings vs objects returned In-Reply-To: <005b01c38cfe$724ee220$6c57e640@amd950> References: <000e01c38c45$0aebe650$7654e640@amd950> <16257.52413.154509.392409@grendel.zope.com> <005b01c38cfe$724ee220$6c57e640@amd950> Message-ID: <16259.2669.257010.619400@grendel.zope.com> Gordon Williams writes: > Hi Fred, > > A couple of other things about the ConfigParser module that I find a bit > strange and I'm not sure that is intended behaivior. > > > 1. Option gets converted to lower case and therefore is not case sensitive, > but section is case sensitive. I would have thought that both would be or > neither would be case sensitive. (My preference would be that neither would > be case sensitive.) And mine would be that both are case sensitive! ;-) I guess that's why we have optionxform to override the transform for option names at least. Ideally, both option and section names should be transformed, but the specific transforms should be independently pluggable. I'm not adverse to a patch which adds a sectionxform, but don't have the time or motivation to change it myself. Feel free to post a patch to SourceForge and assign it to me for review. Documentation and tests are required. [...examples elided...] > The differences in handling the option and section are annoying and should > at least be described in the docs if they cant be changed. Please suggest specific changes; I don't expect to have much time for ConfigParser anytime soon, so specific changes (esp. a patch if you can deal with the LaTeX) would be greatly appreciated. > 2. SafeConfigParser is the recommended ConfigParser in the docs. I'm not > sure what is meant be safe. When values are read in from a file they are > first converted to strings. 
This is not true for values set within the > code. True. I'd suggest that at most, a typecheck for a value being a string could be added to the code; the documentation may need further elaboration. The "Safe" was intended to refer specifically to the string substitution algorithm; it uses a more careful implementation that isn't as subject to weird border conditions. Again, the documentation may require improvements. > If I set an option with anything other than a string then this occurs: ... > Likely the value assigned to the object should be first converted to a > string before it is stored. Or an exception should be raised, placing the burden squarely on the caller to do the right thing, instead of guessing what the right thing is. > One last comment is that 'interpolation' is a bit confusing in the docs. > Maybe 'substitution' would be a better word. Agreed. I'd like to suggest two things: - Get a SourceForge account and file a bug report (you don't need to be on the Python project, just having an account is sufficient). - Take a look at some of the alternate configuration libraries; they may be more suited to your requirements. My current favorite is ZConfig, for which a new version is expected in the next week or so: http://www.python.org/pypi?%3Aaction=search&name=ZConfig But I might be biased about this one. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From ianb at colorstudy.com Tue Oct 7 20:41:48 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Oct 7 20:41:45 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> Message-ID: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> On Tuesday, October 7, 2003, at 10:44 AM, Phillip J. Eby wrote: > At 01:33 PM 10/7/03 +1300, Greg Ewing wrote: >> While we're on the subject of error messages, I'd like to >> point out another one that could be improved. 
Often one >> sees things like >> >> TypeError: foo() takes exactly 1 argument (2 given) >> >> In the case where foo() is a method of some class, and there >> are various versions of foo() defined in various superclasses, >> it's sometimes hard to tell exactly *which* foo it was trying >> to call. It would be much more useful if the module and >> class names were included in the error message, e.g. >> >> TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 >> given) > > AFAICT, this would at least require a compiler change, and a change to > the layout of code objects, so that a code object would know its > "dotted name". Methods know their class, and classes know their name, so it should be okay. In the case of functions, they know their module. >> The same goes for function names quoted in the traceback. > > Don't tracebacks give line number and file? Yeah, that seems unnecessary. In the other case (incorrect arguments) it can be hard because you only get the line number of the caller, not the function being called. There's other situations, like list.index, which says "list.index(x): x not in list", when it is almost always useful to know what "x" is. I can't think of other ones off the top of my head, but I know there's many more. Is it helpful (or annoying) to open bugs on these? Personally, I usually add the repr of any interesting arguments to my exceptions. But many of Python's exceptions don't do this. Is there a reasoning there? Sometimes the repr of an object can be verbose, or in getting it you can cause a second error. Is this the reason for the lack of information, or is it just an oversight? Or a differing opinion on how one should debug things? 
-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From gward at python.net Tue Oct 7 21:47:16 2003 From: gward at python.net (Greg Ward) Date: Tue Oct 7 21:47:22 2003 Subject: [Python-Dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16258.42573.626779.73842@montanaro.dyndns.org> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> Message-ID: <20031008014716.GA22217@cthulhu.gerg.ca> On 07 October 2003, Skip Montanaro said: > > Greg> On 06 October 2003, Skip Montanaro said: > >> Maybe the Reply-To: for spambayes-checkins should be spambayes-dev > >> (and similarly for python-checkins/python-dev). Can that be > >> engineered through Mailman? > > Greg> Yes -- it's on the "General Options" page. Look for > Greg> reply_goes_to_list. > > After seeing your answer I know I asked the wrong question. I > shouldn't have said "Reply-To:". In X?Emacs/VM, I just hit the 'f' key to > reply to you and to cc spambayes-dev. Had this been a spambayes-checkins > message, it would have been nice if the cc went to spambayes-dev instead of > spambayes-checkin. I *think* what you want is a Mailman feature to set the Mail-Followup-To header. Not sure if such a feature exists. > [OT PS] cthulhu.gerg.ca? Is that some sort of pronounceable-only-by-Native- > Canadians name? Shhh!!! Don't want spammers to guess that my secret email address is "${my_first_name}@${my_personal_domain}". (Tee-hee-hee, using shell/Perl syntax on python-dev should cause some consternation.) Actually, Cthulhu is an ancient eldritch entity lurking in the depths beneath the Pacific Ocean, waiting to be awakened for the day when he shall DEVOUR ALL HUMANITY!! 
Good summaries here: http://www.kuro5hin.org/story/2003/9/1/172415/6523 http://www.necfiles.org/part2.htm Greg -- Greg Ward http://www.gerg.ca/ Secrecy is the beginning of tyranny. From barry at python.org Tue Oct 7 22:15:55 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 7 22:16:04 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <20031008014716.GA22217@cthulhu.gerg.ca> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> Message-ID: <1065579355.18519.43.camel@anthem> On Tue, 2003-10-07 at 21:47, Greg Ward wrote: > I *think* what you want is a Mailman feature to set the Mail-Followup-To > header. Not sure if such a feature exists. Unfortunately, Mail-Followup-To is neither standard nor widely implemented in mail readers. Too bad, it's a good idea. -Barry From fdrake at acm.org Tue Oct 7 22:20:04 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 7 22:20:20 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> References: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> Message-ID: <16259.29780.396447.476541@grendel.zope.com> Ian Bicking writes: > Personally, I usually add the repr of any interesting arguments to my > exceptions. But many of Python's exceptions don't do this. Is there a > reasoning there? Sometimes the repr of an object can be verbose, or in > getting it you can cause a second error. Is this the reason for the > lack of information, or is it just an oversight? 
Or a differing
> opinion on how one should debug things?

Another reason is efficiency.  Some exceptions are raised and caught
within the C code of the interpreter.  For these cases, it is important
that the raise be as efficient as possible, so the interpreter attempts
to avoid instantiation of the exception instance; this cost was once
blamed for a fairly bad performance degradation when we tried a nicer
message for AttributeError that caused the exception instance to always
be created (fixed before release, of course, IIRC!).

That's not to say that there aren't several places where better exception
messages can be used effectively.  This is only an issue for exceptions
that are going to be frequently raised and caught in C code.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation From greg at cosc.canterbury.ac.nz Tue Oct 7 22:34:15 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 22:34:37 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <5.1.1.6.0.20031007114232.0309dc20@telecommunity.com> Message-ID: <200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2 > >given) > > AFAICT, this would at least require a compiler change, and a change to the > layout of code objects, so that a code object would know its "dotted name". Perhaps. I had the idea that methods already had some notion of the name of the class they were defined in, but maybe that's only bound methods. In any case, I think it would be worth making this change. > Don't tracebacks give line number and file? Yes, but the exception occurs just *before* the function is entered, so the traceback stops one level short of showing you where the function being called is defined! That's what makes this problem so annoying. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 7 22:49:07 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 7 22:49:32 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: Your message of "Wed, 08 Oct 2003 15:34:15 +1300." 
<200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> References: <200310080234.h982YFE12149@oma.cosc.canterbury.ac.nz> Message-ID: <200310080249.h982n7R22500@12-236-54-216.client.attbi.com>

> > > TypeError: MyStuff.SomeClass.foo() takes exactly 1 argument (2
> > > given)
> >
> > AFAICT, this would at least require a compiler change, and a
> > change to the layout of code objects, so that a code object would
> > know its "dotted name".
>
> Perhaps. I had the idea that methods already had some notion of
> the name of the class they were defined in, but maybe that's
> only bound methods. In any case, I think it would be worth
> making this change.

Only bound methods. What should the error message be in this case?

class C:
    pass

def f(self, a): pass

C.f = f
C().f()

> > Don't tracebacks give line number and file?
>
> Yes, but the exception occurs just *before* the function is entered,
> so the traceback stops one level short of showing you where the
> function being called is defined! That's what makes this problem
> so annoying.

Yes, that's one of the issues of this.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Tue Oct 7 22:50:13 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 22:50:31 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <3389EA8A-F928-11D7-B491-000393C2D67E@colorstudy.com> Message-ID: <200310080250.h982oD012177@oma.cosc.canterbury.ac.nz>

Ian Bicking :
> > Don't tracebacks give line number and file?
>
> Yeah, that seems unnecessary.

Even when it does give a line number and file, I don't always want to have to go looking them all up just to get an idea of the call path that led to the error.
This is a particularly severe problem in Pyrex, where frequently I will get tracebacks telling me things like there was an error in a 27-level deep stack of calls to various functions called "generate_execution_code" scattered among the 50 or so classes in Nodes.py... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 7 23:06:15 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 23:06:33 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <16259.29780.396447.476541@grendel.zope.com> Message-ID: <200310080306.h9836F312300@oma.cosc.canterbury.ac.nz> "Fred L. Drake, Jr." : > this cost was once attributed with a fairly bad performance > degradation when we tried a nicer message for AttributeError that > caused the exception instance to always be created This suggests that perhaps using exceptions for non-exceptional flow control isn't such a good idea, if it forces things like AttributeError to be less useful for debugging than they would otherwise be. I know the Python philosophy holds that you *should* be able to use exceptions freely for both purposes, but perhaps that philosophy needs to be re-examined in the light of this consideration. I know I find myself preferring these days to use getattr et al with default arguments rather than catching exceptions when testing for the presence of something, as it seems to more directly express what I'm trying to do, and avoids all chance of catching the wrong exception. Perhaps the equivalent should be done inside the interpreter, too? 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg at cosc.canterbury.ac.nz Tue Oct 7 23:12:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 7 23:12:28 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080249.h982n7R22500@12-236-54-216.client.attbi.com> Message-ID: <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz>

> What should the error message be in this case?
>
> class C:
>     pass
>
> def f(self, a): pass
>
> C.f = f

I wouldn't mind if it reported f as a top-level function in that case. It wouldn't be any worse than what happens now if you do

def f(a): pass
g = f
g()

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tim.one at comcast.net Wed Oct 8 00:02:35 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 00:02:45 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080306.h9836F312300@oma.cosc.canterbury.ac.nz> Message-ID:

[Fred L. Drake, Jr.]
>> this cost was once attributed with a fairly bad performance
>> degradation when we tried a nicer message for AttributeError that
>> caused the exception instance to always be created

[Greg Ewing]
> This suggests that perhaps using exceptions for non-exceptional flow
> control isn't such a good idea, if it forces things like
> AttributeError to be less useful for debugging than they would
> otherwise be.
> > I know the Python philosophy holds that you *should* be able to use > exceptions freely for both purposes, but perhaps that philosophy needs > to be re-examined in the light of this consideration. > > I know I find myself preferring these days to use getattr et al with > default arguments rather than catching exceptions when testing for the > presence of something, as it seems to more directly express what I'm > trying to do, and avoids all chance of catching the wrong > exception. Perhaps the equivalent should be done inside the > interpreter, too? The equivalent is already done inside the interpreter, about as far as is possible. Under the covers, the only way getattr(obj, name, default) *can* work is to search obj's inheritance chain, trying to get the attribute at each level, and clearing whatever internal AttributeErrors may be raised along the way. The presence of user-written __getattr__ hooks dooms simpler schemes. An internal PyExc_AttributeError isn't the same as a user-visible AttributeError, though -- a class instance isn't created unless and until PyErr_NormalizeException() gets called because the exception needs to be made user-visible. If the latter never happens, setting and clearing exceptions internally is pretty cheap (a pointer to the global PyExc_AttributeError object is stuffed into the thread state). OTOH, almost every call to a C API function has to test+branch for an error-return value, and I've often wondered whether a setjmp/longjmp-based hack might allow for cleaner and more optimizable code (hand-rolled "real exception handling"). 
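The trade-off Greg and Tim describe is easy to see from pure Python. The following is an illustrative sketch (the class and attribute names are invented, not code from the thread):

```python
class Config:
    # hypothetical example class; 'port' is deliberately absent
    host = "localhost"

c = Config()

# Probing with an exception handler works, but a try/except this broad
# can also swallow an AttributeError raised *inside* a property or a
# __getattr__ hook, masking a real bug.
try:
    port = c.port
except AttributeError:
    port = 8080

# The default-argument form states the intent directly; under the covers
# the interpreter still walks the inheritance chain and clears an internal
# AttributeError, but no user-visible exception instance is created.
port = getattr(c, "port", 8080)

print(port)                      # falls back to the default
print(getattr(c, "host", "?"))   # attribute that does exist
```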
From barry at python.org Wed Oct 8 00:03:04 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 8 00:03:14 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <16259.29908.594392.212493@grendel.zope.com> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> <1065579355.18519.43.camel@anthem> <16259.29908.594392.212493@grendel.zope.com> Message-ID: <1065585784.18519.50.camel@anthem> On Tue, 2003-10-07 at 22:22, Fred L. Drake, Jr. wrote: > Barry Warsaw writes: > > Unfortunately, Mail-Followup-To is neither standard nor widely > > implemented in mail readers. Too bad, it's a good idea. > > There'd be more motivation for mailers to support it if lists > generated it. Add it to Mailman, and you'll give mailer > authors/maintainers another reason to support it. ;-) Yeah, we've tried that with the RFC 2369 (List-*) headers for years and it hasn't seemed to work. Besides, the closest thing to a standard for Mail-Followup-To says that list servers should never set the header[1]. I support your efforts to fight the good fight though, by setting your MUA to add those headers. :) -Barry [1] http://cr.yp.to/proto/replyto.html -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031008/2b8c66f5/attachment.bin

From guido at python.org Wed Oct 8 00:14:04 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 00:14:22 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: Your message of "Wed, 08 Oct 2003 16:12:20 +1300." <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz> References: <200310080312.h983CKN12322@oma.cosc.canterbury.ac.nz> Message-ID: <200310080414.h984E4a22702@12-236-54-216.client.attbi.com>

> > What should the error message be in this case?
> >
> > class C:
> >     pass
> >
> > def f(self, a): pass
> >
> > C.f = f
>
> I wouldn't mind if it reported f as a top-level function
> in that case.

My point was that the runtime can't distinguish this case from

class C:
    def f(self, a): pass

so the error message has to be the same in both cases. If we want the error message in the latter example to be different, we'll have to provide extra information in the code object. If we don't want to do that, we *may* still be able to recover the fact that we were calling a bound method, but in that case the former example will give the same error message as the latter.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Wed Oct 8 01:39:05 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 01:39:16 2003 Subject: More informative error messages (Re: [Python-Dev] Efficient predicates for the standard library) In-Reply-To: <200310080414.h984E4a22702@12-236-54-216.client.attbi.com> Message-ID: <200310080539.h985d5q12749@oma.cosc.canterbury.ac.nz>

> If we want the error message in the latter example to be different,
> we'll have to provide extra information in the code object.

Yes, I understand that.
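As a historical footnote, later Pythons grew exactly the "extra information" discussed here: PEP 3155 gave functions and classes a __qualname__ attribute recording the dotted name. A sketch, runnable on Python 3.3 and later:

```python
class C:
    pass

def f(self, a):
    pass

C.f = f   # monkey-patched onto the class after the fact

class D:
    def g(self, a):
        pass

# A function defined at top level keeps its top-level qualified name even
# when patched onto a class -- matching Greg's suggestion that it be
# reported as a top-level function...
assert f.__qualname__ == "f"
assert C.f.__qualname__ == "f"

# ...while a function defined inside a class body records the dotted name.
assert D.g.__qualname__ == "D.g"
```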
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From greg at electricrain.com Wed Oct 8 04:03:02 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 8 04:03:11 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310071056.34954.aleaxit@yahoo.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> Message-ID: <20031008080302.GA15666@zot.electricrain.com>

On Tue, Oct 07, 2003 at 10:56:34AM +0200, Alex Martelli wrote:
> > > > TypeError: foo() takes exactly 1 argument (2 given)
>
> A perennial beginners' confusion (recently highlighted in a c.l.py thread
> whose subject claimed that Python can't count;-) is about that "number
> of arguments given" number: one calls zoop.bleep() and is told bleep
> "takes exactly 2 arguments (1 given)" when one is sure that one has
> given no argument at all (and should give exactly 1) -- the implied 'self'
> causing the beginners' confusion. It seems to me that, if we work on these
> messages, we may be able to distinguish the bound-method case into
>
> TypeError: bound method bleep() of Zoop instance takes exactly 1
> argument (0 given)

I've had to answer that question about the "wrong" numbers for python newbies[1] frequently as well. Even a simple cleaning up of the user-visible off-by-one error to be:

TypeError: method bleep() takes exactly 1 argument (0 given)

At the time the TypeError is constructed it shouldn't add serious overhead to check if it's a method or a function and subtract 1 accordingly.
Greg

[1] where newbie is defined as someone who doesn't know the answer to that yet ;)

From guido at python.org Wed Oct 8 09:47:07 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 09:47:29 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: Your message of "Wed, 08 Oct 2003 01:03:02 PDT." <20031008080302.GA15666@zot.electricrain.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> Message-ID: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com>

> At the time the TypeError is constructed it shouldn't add serious
> overhead to check if its a method or a function and subtract 1
> accordingly.

You'd think so, eh? Have you looked at the code? Have you tried to come up with a patch? Why do you think that in 13 years this hasn't been fixed if it's such a common complaint?

I'm not arguing against fixing this (I think it would be great) but the number of people who've implied that this should be an easy thing to fix annoys me.

For better or for worse, the distinction between a function and a bound method is gone by the time it's called, and recovering that difference is going to be tough. Not in terms of serious overhead, but in terms of serious changes to code that is already extremely subtle. That code is so subtle *because* we want to keep function call overhead as low as possible, and anything that would add even a fraction of a microsecond to the cost of calling a function with the correct number of arguments will be scrutinized to death.
--Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Wed Oct 8 10:11:19 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Oct 8 10:11:27 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <3F841B07.2000503@iinet.net.au> Guido van Rossum strung bits together to say: > For better or for worse, the distinction between a function and a > bound method is gone by the time it's called, and recovering that > difference is going to be tough. Not in terms of serious overhead, > but in terms of serious changes to code that is already extremely > subtle. That code it's so subtle *because* we want to keep function > call overhead as low as possible, and anything that would add even a > fraction of a microsecond to the cost of calling a function with the > correct number of arguments will be scrutinized to death. Given this, perhaps a simple addition to the error string might be enough to help reduce confusion: ------------- TypeError: foo() takes exactly 1 argument (2 given). (Note: For bound methods, the argument count includes the object the method is bound to) ------------- Experienced users are unlikely to care, and newer users should then be able to figure out why the argument count is one more than they expect. About the only problem I can see is that it is hard to be clear, without also making the error string rather long (like the one above). Regards, Nick. It's simple, but if it works. . . -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." 
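The confusion under discussion is easy to reproduce; a minimal sketch follows (class and method names are invented, and the exact message wording varies across Python versions — later releases even prefix the qualified name, much as this thread wished for):

```python
class Zoop:
    def bleep(self, flag):
        return flag

z = Zoop()
try:
    # The caller passes two arguments, but the bound method's implicit
    # 'self' makes the interpreter count three.
    z.bleep(1, 2)
except TypeError as exc:
    message = str(exc)

# On Python 3 the message reads along the lines of
# "bleep() takes 2 positional arguments but 3 were given"
print(message)
```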
From gerrit at nl.linux.org Wed Oct 8 10:13:18 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 8 10:13:32 2003 Subject: [Python-Dev] Re: [spambayes-dev] Re: spambayes-checkins -> spambayes-dev, python-checkins -> python-dev In-Reply-To: <1065585784.18519.50.camel@anthem> References: <1ED4ECF91CDED24C8D012BCF2B034F13038C03B2@its-xchg4.massey.ac.nz> <1ED4ECF91CDED24C8D012BCF2B034F13026F2986@its-xchg4.massey.ac.nz> <16257.51553.112259.897217@montanaro.dyndns.org> <20031006235549.GA14656@cthulhu.gerg.ca> <16258.42573.626779.73842@montanaro.dyndns.org> <20031008014716.GA22217@cthulhu.gerg.ca> <1065579355.18519.43.camel@anthem> <16259.29908.594392.212493@grendel.zope.com> <1065585784.18519.50.camel@anthem> Message-ID: <20031008141318.GA5352@nl.linux.org>

Barry Warsaw wrote:
> List-Id: Python core developers
> List-Unsubscribe: ,
> List-Archive:
> List-Post:
> List-Help:
> List-Subscribe: ,
>
> Yeah, we've tried that with the RFC 2369 (List-*) headers for years and
> it hasn't seemed to work.

I use them... The ultimate proof that they work ;)!

Gerrit.

-- 41. If any one fence in the field, garden, and house of a chieftain, man, or one subject to quit-rent, furnishing the palings therefor; if the chieftain, man, or one subject to quit-rent return to field, garden, and house, the palings which were given to him become his property.
-- 1780 BC, Hammurabi, Code of Law

-- Asperger Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/ Rise up against this cabinet: http://www.sp.nl/

From gerrit at nl.linux.org Wed Oct 8 10:52:09 2003 From: gerrit at nl.linux.org (Gerrit Holl) Date: Wed Oct 8 10:52:16 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <20031008145209.GB5352@nl.linux.org>

[I'm not a regular poster, so I'll introduce myself briefly: I am a first-year Physics student without CS knowledge, have learned programming with Python a few years ago]

Gregory Smith wrote:
> > At the time the TypeError is constructed it shouldn't add serious
> > overhead to check if its a method or a function and subtract 1
> > accordingly.

Guido van Rossum replied:
> You'd think so, eh? Have you looked at the code? Have you tried to
> come up with a patch? Why do you think that in 13 years this hasn't
> been fixed if it's such a common complaint?

Would it be possible to have this code at IDE-level? E.g., is it possible for Idle to catch TypeError's and try to find out whether this is about the number of arguments to a callable, and if so, try to find out whether it is about a method or a function? This is of course a lot of overhead, but since it is only for an interactive session, I think this is not a big problem, or am I mistaken here?

Something like:

except TypeError, msg:
    if "takes exactly" in msg[0]: # something with tb_lasti?
        name = msg[0].split('(')[0]
        typ, val, tb = sys.exc_info()
        if name in tb.tb_frame.f_locals.keys():
            if 'instancemethod' in type(tb.tb_frame.f_locals[name]):
                # subtract 1
            else:
                # don't subtract 1
        else:
            # hmm, if it is a method, how do we find it?
            # etc.
    else:
        raise

It seems quite difficult to do so. It is certainly not always possible, but is it worth the pain?

regards, Gerrit Holl.

-- 201. If he knock out the teeth of a freed man, he shall pay one-third of a gold mina. -- 1780 BC, Hammurabi, Code of Law

-- Asperger Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/ Rise up against this cabinet: http://www.sp.nl/

From Scott.Daniels at Acm.Org Wed Oct 8 11:26:43 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Wed Oct 8 11:26:46 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: References: Message-ID: <3F842CB3.1000400@Acm.Org>

[Tim Peters]
> .... OTOH, almost
> every call to a C API function has to test+branch for an error-return value,
> and I've often wondered whether a setjmp/longjmp-based hack might allow for
> cleaner and more optimizable code (hand-rolled "real exception handling").

setjmp/longjmp are nightmares for compiler writers. The writers tend to turn off optimizations around them and/or get corner cases wrong. If you read the C standard, precious little is guaranteed around setjmp/longjmp. The C code using disciplined setjmp/longjmp will read well and probably be quite optimizable, but.... At least some of the C compilers will mis-optimize such code and others will be painfully slow due to the interaction of two compiler coding strategies: first, emit straightforward sloppy code easily cleaned up in the optimization passes, and second, turn off optimization in the presence of setjmp/longjmp. Maybe the general compiler world has changed, but I had nightmares supporting a language which generated C including setjmp/longjmp calls, and ran it on top of three C compilers. Each compiler had nasty cases to avoid, and the resulting least common denominator was painfully inept.
-Scott David Daniels

From guido at python.org Wed Oct 8 11:30:00 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 11:30:52 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: Your message of "Wed, 08 Oct 2003 16:52:09 +0200." <20031008145209.GB5352@nl.linux.org> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> <20031008145209.GB5352@nl.linux.org> Message-ID: <200310081530.h98FU0K23484@12-236-54-216.client.attbi.com>

> Would it be possible to have this code at IDE-level? E.g., is possible
> for Idle to catch TypeError's and try to find out whether this is about
> the number of arguments to a callable, and if so, try to find out whether
> it is about a method or a function? This is of course a lot of overhead,
> but since it is only for an interactive session, I think this is not a big
> problem, or am I mistaken here?

It could be done, probably with 99% reliability. But there are many IDEs out there, and many people don't run their code under an IDE at all, so it would be much preferred to do it in the VM.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python-kbutler at sabaydi.com Wed Oct 8 11:34:54 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 8 11:35:07 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: References: Message-ID: <3F842E9E.3000600@sabaydi.com>

> From: Nick Coghlan
>
> Given this, perhaps a simple addition to the error string might be enough to
> help reduce confusion:

Agreed.

> -------------
> TypeError: foo() takes exactly 1 argument (2 given). (Note: For bound methods,
> the argument count includes the object the method is bound to)
> -------------

Agreed, but then the newbies will wonder what bound methods are, and to what a method would be bound.
Shorter and easier for the uninitiated:

TypeError: foo() takes exactly 1 argument (2 given). Counts may include 'self'.

kb

From mal at lemburg.com Wed Oct 8 12:32:37 2003 From: mal at lemburg.com (M.-A. Lemburg) Date: Wed Oct 8 12:32:43 2003 Subject: [Python-Dev] Efficient predicates for the standard library In-Reply-To: <20031004234000.GG25813@ics.uci.edu> References: <20031004234000.GG25813@ics.uci.edu> Message-ID: <3F843C25.8090808@lemburg.com>

Christian Stork wrote:
> The examples given in itertools' documentation are a good starting
> point. More specifically I'm talking about the following:
>
> def all(pred, seq):
>     "Returns True if pred(x) is True for every element in the iterable"
>     return False not in imap(pred, seq)
>
> def some(pred, seq):
>     "Returns True if pred(x) is True at least one element in the iterable"
>     return True in imap(pred, seq)
>
> def no(pred, seq):
>     "Returns True if pred(x) is False for every element in the iterable"
>     return True not in imap(pred, seq)

FYI, similar APIs have been part of mxTools for years, except that they are called exists() and forall() (the terms used in math for these things), plus there are a few more:

count(condition,sequence)
    Counts the number of objects in sequence for which condition returns true and returns the result as integer. condition must be a callable object.

exists(condition,sequence)
    Return 1 if and only if condition is true for at least one of the items in sequence and 0 otherwise. condition must be a callable object.

forall(condition,sequence)
    Return 1 if and only if condition is true for all of the items in sequence and 0 otherwise. condition must be a callable object.

index(condition,sequence)
    Return the index of the first item for which condition is true. A ValueError is raised in case no item is found. condition must be a callable object.
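The semantics described above are easy to model in pure Python. These are illustrative equivalents written with the modern built-ins any() and all() — not the mxTools implementation, and they return True/False rather than 1/0:

```python
def count(condition, sequence):
    """Number of items for which condition(item) is true."""
    return sum(1 for item in sequence if condition(item))

def exists(condition, sequence):
    """True iff condition holds for at least one item."""
    return any(condition(item) for item in sequence)

def forall(condition, sequence):
    """True iff condition holds for every item."""
    return all(condition(item) for item in sequence)

def index(condition, sequence):
    """Index of the first item satisfying condition; ValueError if none."""
    for i, item in enumerate(sequence):
        if condition(item):
            return i
    raise ValueError("no item satisfies the condition")

is_even = lambda n: n % 2 == 0
assert count(is_even, [1, 2, 3, 4]) == 2
assert exists(is_even, [1, 3, 4])
assert not forall(is_even, [1, 3, 4])
assert index(is_even, [1, 3, 4]) == 2
```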
Note that the signatures are similar to those of map(), filter() and reduce() and they do truth checking rather than comparing True/False to the result value, which is useful sometimes. mxTools currently does not support iterable objects for sequence, but that should be easy to add. More's here: http://www.egenix.com/files/python/mxTools.html

-- Marc-Andre Lemburg eGenix.com Professional Python Software directly from the Source (#1, Oct 08 2003) >>> Python/Zope Products & Consulting ... http://www.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ianb at colorstudy.com Wed Oct 8 12:33:37 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 8 12:33:43 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <20031008145209.GB5352@nl.linux.org> Message-ID: <2AA75D24-F9AD-11D7-B53E-000393C2D67E@colorstudy.com>

On Wednesday, October 8, 2003, at 09:52 AM, Gerrit Holl wrote:
> [I'm not a regular poster, so I'll introduce myself shortly: I am a
> first-year Physics student without CS knowledge, have learned
> programming with Python a few years ago]
>
> Gregory Smith wrote:
> >> At the time the TypeError is constructed it shouldn't add serious
> >> overhead to check if its a method or a function and subtract 1
> >> accordingly.
>
> Guido van Rossum replied:
> > You'd think so, eh? Have you looked at the code? Have you tried to
> > come up with a patch? Why do you think that in 13 years this hasn't
> > been fixed if it's such a common complaint?
>
> Would it be possible to have this code at IDE-level? E.g., is possible
> for Idle to catch TypeError's and try to find out whether this is about
> the number of arguments to a callable, and if so, try to find out whether
> it is about a method or a function? This is of course a lot of overhead,
> but since it is only for an interactive session, I think this is not a big
> problem, or am I mistaken here?

Or more generally, what if we just add more helpful information to tracebacks? If we care about the particulars of the message, it is always in the context of a traceback. And we don't care about the efficiency of tracebacks. What if, say, exceptions had a method strfortraceback(tb), which was smarter when that would be helpful? Like the code you have here, only as a method of TypeError (or some subclass)...

> Something like:
>
> except TypeError, msg:
>     if "takes exactly" in msg[0]: # something with tb_lasti?
>         name = msg[0].split('(')[0]
>         typ, val, tb = sys.exc_info()
>         if name in tb.tb_frame.f_locals.keys():
>             if 'instancemethod' in type(tb.tb_frame.f_locals[name]):
>                 # subtract 1
>             else:
>                 # don't subtract 1
>         else:
>             # hmm, if it is a method, how do we find it?
>             # etc.
>     else:
>         raise

-- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org

From greg at electricrain.com Wed Oct 8 14:07:30 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Wed Oct 8 14:07:46 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> References: <200310070033.h970XJg02646@oma.cosc.canterbury.ac.nz> <200310071056.34954.aleaxit@yahoo.com> <20031008080302.GA15666@zot.electricrain.com> <200310081347.h98Dl7423343@12-236-54-216.client.attbi.com> Message-ID: <20031008180730.GB15666@zot.electricrain.com>

> I'm not arguing against fixing this (I think it would be great) but
> the number of people who've implied that this should be an easy thing
> to fix annoys me.
>
> For better or for worse, the distinction between a function and a
> bound method is gone by the time it's called, and recovering that
> difference is going to be tough. Not in terms of serious overhead,
> but in terms of serious changes to code that is already extremely
> subtle.
> That code is so subtle *because* we want to keep function
> call overhead as low as possible, and anything that would add even a
> fraction of a microsecond to the cost of calling a function with the
> correct number of arguments will be scrutinized to death.

Agreed. I just looked at the code to see why.
Its much more > difficult than I imagined (except in one easy looking case in ceval.c). > > For anyone who hasn't read the code, the Python/getargs.c vgetargs1() > function that parses the argument description string has no knowledge > of the PyCFunction object its checking arguments for. Major restruring > to do this could be done several ways but is a huge task for speed and > C interface compatibility reasons. Um, when is this a problem for methods implemented in C? AFAIK the problem only exists for Python methods: take e.g. append() as an example of a C method, and everything is fine: >>> [].append(1,2 ) Traceback (most recent call last): File "", line 1, in ? TypeError: append() takes exactly one argument (2 given) >>> The issue is really in ceval.c... --Guido van Rossum (home page: http://www.python.org/~guido/) From gminick at hacker.pl Wed Oct 8 15:42:18 2003 From: gminick at hacker.pl (gminick) Date: Wed Oct 8 15:47:25 2003 Subject: [Python-Dev] obj.__contains__() returns 1/0... Message-ID: <20031008194218.GA17069@hannibal> ...shouldn't it return True/False? examples: >>> a = 'python' >>> a.__contains__('perl') 0 >>> a.__contains__('python') 1 >>> a = { 'python' : 1, 'ruby' : 1 } >>> a.__contains__('perl') 0 >>> a.__contains__('python') 1 >>> instead of: >>> a = 'python' >>> a.__contains__('perl') False >>> a.__contains__('python') True >>> a = { 'python' : 1, 'ruby' : 1 } >>> a.__contains__('perl') False >>> a.__contains__('python') True >>> The reason for asking is that i.e. obj.__eq__() returns True/False, and besides True/False looks nicer than 1/0 ;> ps. I'll send a patch to sf.net in a matter of minutes. Please, decide if it should be applied, thanks. -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." 
] From greg at cosc.canterbury.ac.nz Wed Oct 8 22:29:09 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 22:29:35 2003 Subject: [Python-Dev] Re: More informative error messages In-Reply-To: <20031008080302.GA15666@zot.electricrain.com> Message-ID: <200310090229.h992T9m20070@oma.cosc.canterbury.ac.nz> "Gregory P. Smith" : > At the time the TypeError is constructed it shouldn't add serious overhead > to check if its a method or a function and subtract 1 accordingly. Except that by the time the error is detected, we've lost track of whether it's a method or not. Maybe a heuristic could be applied, e.g. if the first parameter is called 'self', say something like "foo() takes exactly 1 argument (excluding 'self'), 0 given". Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at comcast.net Wed Oct 8 22:41:35 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 22:41:41 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <3F842CB3.1000400@Acm.Org> Message-ID: [Scott David Daniels] > setjmp/longjmp are nightmares for compiler writers. I was one for 15 years, so that won't scare me off . > The writers tend to turn off optimizations around them and/or get > corner cases wrong. > ... They do. I would aim for a tiny total number of setjmp and longjmp calls, inside very simple functions. So, e.g., a routine that wanted to die with an error wouldn't call longjmp directly, it would call a common utility function containing the longjmp. The latter function simply wouldn't return. Optimizations short of interprocedural analysis aren't harmed then in the calling function, because nothing in *that* is the target, or direct source, of a non-local goto. 
Last I looked, the Perl source seemed to do such a thing in places, and that's about as widely ported as Python. It struck me with force when I was looking at Perl's version of an adaptive mergesort last year, and got jealous of how much shorter and clearer the C code could be when every stinkin' call didn't have to be followed by an error test-and-branch. The Python sort code hides most of that syntactically via macros, but the runtime cost is always there. In real life, not one sort in a million actually raises an exception, so executing O(N log N) test-branch blocks per sort has astronomically low bang for the buck. In cases like that (which are common), it doesn't matter how slow actually raising an exception would be; it's not even tempting to put the longjmp calls inline. Whatever, I'll never have time to pursue it, so screw it . From python at rcn.com Wed Oct 8 22:53:06 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 8 22:53:40 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Message-ID: <000001c38e10$77e31ea0$e841fea9@oemcomputer> if (res == -1 && PyErr_Occurred()) return NULL; ! return PyInt_FromLong((long)res); } --- 3577,3583 ---- if (res == -1 && PyErr_Occurred()) return NULL; ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? Py_True : Py_False; The line above leaks and does unnecessary work. I believe it should read: ret = res ? Py_True : Py_False; Also, there is another one of these in Objects/descrobject.c line 712. 
Raymond Hettinger From greg at cosc.canterbury.ac.nz Wed Oct 8 23:21:34 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 8 23:22:12 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: Message-ID: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> > It struck me with force when I was looking at Perl's version of an > adaptive mergesort last year, and got jealous of how much shorter and > clearer the C code could be when every stinkin' call didn't have to be > followed by an error test-and-branch. Rewrite they Python core in Pyrex. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tim.one at comcast.net Wed Oct 8 23:40:05 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 8 23:40:14 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Rewrite they Python core in Pyrex. And steal the glory from you? No way. Whip up a patch, and I'll assign it to Guido . From guido at python.org Wed Oct 8 23:45:45 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 8 23:46:06 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Wed, 08 Oct 2003 22:53:06 EDT." <000001c38e10$77e31ea0$e841fea9@oemcomputer> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> Message-ID: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> > if (res == -1 && PyErr_Occurred()) > return NULL; > ! return PyInt_FromLong((long)res); > } > > --- 3577,3583 ---- > if (res == -1 && PyErr_Occurred()) > return NULL; > ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? 
Py_True : > Py_False; > > > The line above leaks and does unnecessary work. I believe it should > read: > > ret = res ? Py_True : Py_False; Ai. I did the review while only half awake. :-) But the correct thing to do is to use PyBool_FromLong(res); there's really no need to inline what that function does. > Also, there is another one of these in Objects/descrobject.c line 712. I'll fix that one while I'm at it. BTW, I notice there are a bunch of uses of PyBool_FromLong() that are preceded by something like "if (res < 0) return NULL;" (or "!= -1"). Maybe PyBool_FromLong() itself could make this unneeded by adding something like if (ok < 0 && PyErr_Occurred()) return NULL; to its start? And, while we're reviewing usage patterns of PyBool_FromLong(), the string and unicode types are full of places where it is called by a return statement with a constant 1 or 0 as argument. This seems wasteful to me; I imagine that Py_INCREF(Py_True); return Py_True; takes less time than return PyBool_FromLong(1); Maybe a pair of macros Py_return_True and Py_return_False would make sense? --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Thu Oct 9 00:14:48 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 9 00:14:47 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <03a601c38e1b$e2182440$f502a8c0@eden> > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? Include Py_return_None, and a solid +1 from me (even if that isn't how I would spell it .) Mark. 
From greg at cosc.canterbury.ac.nz Thu Oct 9 00:17:55 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 9 00:18:58 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> Guido van Rossum : > Maybe PyBool_FromLong() itself could make this unneeded by adding > something like > > if (ok < 0 && PyErr_Occurred()) > return NULL; > > to its start? Not sure if it would be a good idea to encourage reliance on one API function doing error checking on behalf of others. I can see someone coming along later and adding some code in between whatever returned the result and the PyBool_FromLong call, not realising that doing so would upset the error checking. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Oct 9 00:40:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 9 00:40:31 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > Maybe PyBool_FromLong() itself could make this unneeded by adding > something like > > if (ok < 0 && PyErr_Occurred()) > return NULL; > > to its start? That would an incompatible change. I would expect PyBool_FromLong(i) do the same thing as bool(i). > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? You should, of course, add Py_return_None to it, as well. 
Then you will find that some contributor goes on a crusade to use these throughout very quickly :-) Regards, Martin From guido at python.org Thu Oct 9 00:43:20 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 00:43:39 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Thu, 09 Oct 2003 17:17:55 +1300." <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> References: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> Message-ID: <200310090443.h994hKM00786@12-236-54-216.client.attbi.com> > Guido van Rossum : > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > something like > > > > if (ok < 0 && PyErr_Occurred()) > > return NULL; > > > > to its start? [Greg Ewing] > Not sure if it would be a good idea to encourage reliance > on one API function doing error checking on behalf of others. Well, most functions in the abstract.c file already do this. And it would actually *catch* bugs -- in fact, the one that Raymond found in descrobject.c originally had return PyInt_FromLong(PySequence_Contains(pp->dict, key)); which was not checking for errors from PySequence_Contains(). > I can see someone coming along later and adding some code > in between whatever returned the result and the PyBool_FromLong > call, not realising that doing so would upset the error > checking. Well, they would have to miss two clues: the documented behavior of PyBool_FromLong() and the fact that whatever produced the value passed in could fail. I'm not sure if that's a big worry, especially since this is typically in dead-simple code. OTOH, explicit is better than implicit. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 9 00:44:25 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 00:44:42 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: Your message of "Thu, 09 Oct 2003 14:14:48 +1000." <03a601c38e1b$e2182440$f502a8c0@eden> References: <03a601c38e1b$e2182440$f502a8c0@eden> Message-ID: <200310090444.h994iPW00807@12-236-54-216.client.attbi.com> > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? > > Include Py_return_None, and a solid +1 from me (even if that isn't how I > would spell it .) How would you spell it? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 9 01:03:03 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 9 01:03:33 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: Your message of "09 Oct 2003 06:40:30 +0200." References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <200310090503.h99533G00867@12-236-54-216.client.attbi.com> > Guido van Rossum writes: > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > something like > > > > if (ok < 0 && PyErr_Occurred()) > > return NULL; > > > > to its start? [MvL] > That would an incompatible change. I would expect PyBool_FromLong(i) > do the same thing as bool(i). Well, it still does, *except* if you have a pending exception. IMO what happens when you make a Python API call while an exception is pending is pretty underspecified, so it's doubtful whether this incompatibility matters. > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? > > You should, of course, add Py_return_None to it, as well. 
> > Then you will find that some contributor goes on a crusade to use > these throughout very quickly :-) There's the minor issue of how to spell it (Mark Hammond may have a different suggestion) but that certain contributor has my approval once we get the spelling agreed upon. --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Thu Oct 9 01:21:12 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 9 01:21:51 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <200310090444.h994iPW00807@12-236-54-216.client.attbi.com> Message-ID: <03dc01c38e25$27ba49c0$f502a8c0@eden> > > > Maybe a pair of macros Py_return_True and Py_return_False > would make > > > sense? > > > > Include Py_return_None, and a solid +1 from me (even if > that isn't how I > > would spell it .) > > How would you spell it? For some reason, I am somewhat conditioned to macros with all caps. So I would personally go for Py_RETURN_NONE/TRUE/FALSE But have no objection to any reasonable spelling. Mark. From gminick at hacker.pl Thu Oct 9 01:41:16 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 01:40:20 2003 Subject: [Python-Dev] obj.__contains__() returns 1/0... In-Reply-To: <20031008194218.GA17069@hannibal> References: <20031008194218.GA17069@hannibal> Message-ID: <20031009054116.GA232@hannibal> On Wed, Oct 08, 2003 at 09:42:18PM +0200, gminick wrote: > ...shouldn't it return True/False? [...] > The reason for asking is that i.e. obj.__eq__() returns True/False, > and besides True/False looks nicer than 1/0 ;> > > ps. I'll send a patch to sf.net in a matter of minutes. > Please, decide if it should be applied, thanks. Ok, no need to discuss: Category: Core (C code) Group: Python 2.3 >Status: Closed >Resolution: Fixed Priority: 5 Submitted By: Wojtek Walczak (gminick) Assigned to: Nobody/Anonymous (nobody) Summary: obj.__contains__() returns 1/0... 
-- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." ] From gminick at hacker.pl Thu Oct 9 02:28:31 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 02:27:30 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <000001c38e10$77e31ea0$e841fea9@oemcomputer> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> Message-ID: <20031009062831.GA274@hannibal> On Wed, Oct 08, 2003 at 10:53:06PM -0400, Raymond Hettinger wrote: > if (res == -1 && PyErr_Occurred()) > return NULL; > ! return PyInt_FromLong((long)res); > } > > --- 3577,3583 ---- > if (res == -1 && PyErr_Occurred()) > return NULL; > ! ret = PyObject_IsTrue(PyInt_FromLong((long)res)) ? Py_True : > Py_False; > > > The line above leaks and does unnecessary work. I believe it should > read: > > ret = res ? Py_True : Py_False; PyInt_FromLong() returns PyObject, so you need to use PyObject_IsTrue() (the way I did) or hack the code not to use PyInt_FromLong(). I used PyInt_FromLong() because it was there before. Original code: res = (*func)(self, value); if (res == -1 && PyErr_Occurred()) return NULL; return PyInt_FromLong((long)res); } If you're sure it isn't needed, then of course we can use the easier way changing the snippet above into: res = (*func)(self, value); if (res == -1 && PyErr_Occurred()) return NULL; ret = res ? Py_True : Py_False; Py_INCREF(ret); return ret; } So, why was there PyInt_FromLong()? :> -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." 
] From mwh at python.net Thu Oct 9 05:49:23 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 9 05:48:39 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> (Greg Ewing's message of "Thu, 09 Oct 2003 16:21:34 +1300 (NZDT)") References: <200310090321.h993LYK20349@oma.cosc.canterbury.ac.nz> Message-ID: <2mbrsqyc4c.fsf@starship.python.net> Greg Ewing writes: >> It struck me with force when I was looking at Perl's version of an >> adaptive mergesort last year, and got jealous of how much shorter and >> clearer the C code could be when every stinkin' call didn't have to be >> followed by an error test-and-branch. > > Rewrite they Python core in Pyrex. That wouldn't alleviate the runtime cost, would it? Maybe one day this sort of fundamental rearrangement will be easier to play with, thanks to PyPy. Cheers, mwh (who has some similarly impractical ideas about memory management...) -- If i don't understand lisp, it would be wise to not bray about how lisp is stupid or otherwise criticize, because my stupidity would be archived and open for all in the know to see. -- Xah, comp.lang.lisp From python at rcn.com Thu Oct 9 11:16:42 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 9 11:21:58 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <20031009062831.GA274@hannibal> Message-ID: <000001c38e78$58ec3280$e841fea9@oemcomputer> [Wojtek Walczak] > PyInt_FromLong() returns PyObject, so you need to use PyObject_IsTrue() > (the way I did) or hack the code not to use PyInt_FromLong(). I used > PyInt_FromLong() because it was there before. 
Original code: > > res = (*func)(self, value); > if (res == -1 && PyErr_Occurred()) > return NULL; > return PyInt_FromLong((long)res); > } > > If you're sure it isn't needed, then of course we can use the easier way > changing the snippet above into: > > res = (*func)(self, value); > if (res == -1 && PyErr_Occurred()) > return NULL; > ret = res ? Py_True : Py_False; > Py_INCREF(ret); > return ret; > } > > So, why was there PyInt_FromLong()? :> obj.__contains__() returns a python object. "res" is a C numeric object. So, PyInt_FromLong() was needed to change it from a C long into a PyObject * to a Python integer (either 0 or 1). Wrapping that return value in Py_ObjectIsTrue() does successfully convert the Python integer into a Python bool. One issue with the way you wrote it is that both PyInt_FromLong() and Py_ObjectIsTrue() create new Python objects but only one of them is returned. The other needs to have its reference count lowered by one so that obj.__contains__() won't leak. The other issue is that it wasn't necessary to create an intermediate PyInt value. Instead, the PyBool can be created directly from "res" using PyBool_FromLong() or the equivalent: ret = res ? 
Py_True : Py_False; Py_INCREF(ret); return ret; Looking at the functions signatures may make it more clear: wrap_objobjproc: argstuple --> PyObject* PyInt_FromLong: long --> PyObject* PyObject_IsTrue: PyObject* --> PyObject* PyBool_FromLong: long --> PyObject* Hope this helps, Raymond Hettinger From tjreedy at udel.edu Thu Oct 9 12:58:40 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 9 12:58:45 2003 Subject: [Python-Dev] Re: RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 References: <200310090417.h994HtH00580@oma.cosc.canterbury.ac.nz> <200310090443.h994hKM00786@12-236-54-216.client.attbi.com> Message-ID: > > Guido van Rossum : > > > > > Maybe PyBool_FromLong() itself could make this unneeded by adding > > > something like > > > > > > if (ok < 0 && PyErr_Occurred()) > > > return NULL; > > > > > > to its start? > > [Greg Ewing] > > I can see someone coming along later and adding some code > > in between whatever returned the result and the PyBool_FromLong > > call, not realising that doing so would upset the error > > checking. My C is a bit rusty (from being swallowed by a Python)... but in the particular snippet being discussed, it seems that incorporating the error check in PyBool... would eliminate the need for the temporary res variable, so that all can be written as PyBool_FromLong( (*func)(self, value)); /* is (long) cast needed? */ leaving very little 'in between' space in which to insert upsetting code. I have no idea how well this generalizes to other prospective uses. Terry J. 
Reedy From gminick at hacker.pl Thu Oct 9 15:42:30 2003 From: gminick at hacker.pl (gminick) Date: Thu Oct 9 15:41:45 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245 In-Reply-To: <000001c38e78$58ec3280$e841fea9@oemcomputer> References: <20031009062831.GA274@hannibal> <000001c38e78$58ec3280$e841fea9@oemcomputer> Message-ID: <20031009194229.GA250@hannibal> On Thu, Oct 09, 2003 at 11:16:42AM -0400, Raymond Hettinger wrote: [...] > One issue with the way you wrote it is that both PyInt_FromLong() and > Py_ObjectIsTrue() create new Python objects but only one of them is > returned. The other needs to have its reference count lowered by one so > that obj.__contains__() won't leak. While everything else was clear before, the text above is a nice reminder, some kind of warning, how to code better. Thank you. -- [ Wojtek Walczak - gminick (at) underground.org.pl ] [ ] [ "...rozmaite zwroty, matowe od patyny dawnosci." ] From greg at cosc.canterbury.ac.nz Thu Oct 9 19:45:27 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 9 19:46:11 2003 Subject: [Python-Dev] RE: More informative error messages In-Reply-To: <2mbrsqyc4c.fsf@starship.python.net> Message-ID: <200310092345.h99NjRt13077@oma.cosc.canterbury.ac.nz> Michael Hudson : > > Rewrite they Python core in Pyrex. > > That wouldn't alleviate the runtime cost, would it? No, but it would save having to write all the refcounting and error checking code by hand, with attendant abundancy of opportunities for getting it wrong... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Fri Oct 10 02:17:50 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Oct 10 02:17:59 2003 Subject: [Python-Dev] python-dev Summary for 2003-09-16 through 2003-09-30 [draft] Message-ID: <3F864F0E.2070407@ocf.berkeley.edu> Here is everyone's chance to show why Cal Poly should flunk me on the writing proficiency test I have to take this Saturday to prove I can write at a college graduate level. I will probably send the final vesion of this summary on Sunday so you have at least until then to make any corrections and such. And a head's up: I managed to write that guide to Python development but I need to do a quick proof-read (yes, I am actually going to proof-read something for once) and get one other person to take a quick look at it before I post it here to be checked. But it is coming and will be here before December. =) ------------------------------- python-dev Summary for 2003-09-16 through 2003-09-30 ++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from September 16, 2003 through September 30, 2003. It is intended to inform the wider Python community of on-going developments on the list. To comment on anything mentioned here, just post to `comp.lang.python`_ (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the twenty-sixth summary written by Brett Cannon (homework, the Summaries, how does he find the time?). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . 
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`_ and gives some perks for the HTML output. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. .. _PEP Markup: http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ and should be used when looking up any documentation on something mentioned here. PEPs (Python Enhancement Proposals) are located at http://www.python.org/peps/ . To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . Reported bugs and suggested patches can be found at the SourceForge_ project page. .. _python-dev: http://www.python.org/dev/ .. _SourceForge: http://sourceforge.net/tracker/?group_id=5470 .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-09-01_2003-09-15.html ===================== Summary Announcements ===================== First, sorry about the lateness of this summary. I have started my first quarter at `Cal Poly SLO`_. Not only do I get to deal with being back in school for the first time in over a year, but I also get to be abruptly introduced to the quarter system. Joys abound for me. I am still reworking how I manage my time and the Summaries were the first thing to take a back seat. Hopefully this won't happen again. 
In case you have not been following general Python news, `Python 2.3.2`_ is now the newest release of Python. In case you missed the Python 2.3.1 release, then you missed the little hiccup in that release, which is fine. The Python 2.3.2 release does not technically fall under the jurisdiction of this summary, but I am not going to wait half a month to let people know about it. .. _Cal Poly SLO: http://www.calpoly.edu/ .. _Python 2.3.2: http://www.python.org/2.3.2/ ========= Summaries ========= ---------------------------------------------------------- Deprecations won't spontaneously appear in a micro release ---------------------------------------------------------- In case you don't know, sets.BaseSet.update() has been deprecated in favor of union_update() in order to cut out the unneeded duplication of functionality in Python 2.4 . While 2.3.1 was still under development it grew a PendingDeprecationWarning. This did not sit well with some people. The argument for the PendingDeprecationWarning was that it is silent by default and gives people a heads-up in terms of things that are known to be deprecated in the next minor version of Python. Against this idea, the argument that it adds a change between micro versions that is not a bug fix was raised. In the end this won. Contributing threads: - `pending deprecation warning for Set.update `__ ------------------------------ Web-SIG on its way, supposedly ------------------------------ Bill Janssen is working on a charter so a Web SIG_ can be started in order to redesign the cgi module as the main goal, but also just making Python friendlier to web coding in general. .. 
_SIG: http://www.python.org/community/sigs.html Contributing threads: - `Improving the CGI module `__ ------------------------------------------- Threads and the desolation that is shutdown ------------------------------------------- Tim Peters decided to try to deal with the fact that the Zope 3 testing suite was spitting out a ton of messages about unhandled exceptions during shutdown of the interpreter. It turned out that threads were still running during shutdown and thus were throwing a fit because they were accessing module globals that were being torn down and set to None. The problem went away when the second call to PyGC_Collect() in Py_Finalize() was commented out. This is not totally acceptable since the second call is there to help collect garbage at shutdown so that things clean up properly. Tim did end up suggesting just taking it out, though, for a future version of Python. He also suggested tearing down the sys module even later (and thus "even more of a special case than it is now"). This would leave sys.modules around and thus not cause globals to turn to None and cause errors from that side-effect. Neither solution has been taken yet. A temporary solution if you keep running into this is to make sure that either your cleanup code only accesses local variables (if you have to store references to globals since that will keep them around for you during shutdown). Contributing threads: - `Fun with 2.3 shutdown `__ ---------------------- Where is str.rsplit?!? ---------------------- The reason str.rsplit does not exist in Python is because the method is not difficult to code on your own. And yet people still want it. But there was not of a public outcry and the topic just fizzled. Contributing threads: - `Discussion on adding rsplit() for strings and unicode objects. 
`__ ----------------- Waxing on PEP 310 ----------------- Holger Krekel brought up PEP 310 (entitled "Reliable Acquisition/Release Pairs") in terms of how code blocks should handle exceptions and such. Michael Hudson suggested that might be taking PEP 310 beyond what it is meant to cover. To this, Holger suggested that then perhaps some other route should be taken. As with all PEPs, discussion of them is always helpful for python-dev and the community. It helps hash out ideas and gives python-dev feedback on whether a PEP should be rejected. Contributing threads: ` pep 310 (reliable acquisition/release pairs) `__ ------------------------------------------------------------ bsddb3 failures and the database system it wraps, news at 10 ------------------------------------------------------------ The bsddb3 regression tests were failing during preparation for Python 2.3.1 . Beyond the "the test just fails sometimes" issues that come up with tests that are finicky because of timing, it was suggested that the failures are the fault of the Sleepycat_ DB code. It is still being looked into. .. _Sleepycat: http://www.sleepycat.com/ Contributing threads: - `latest bsddb3 test problems `__ ---------------------------------------------------- We want *you* to help with the war on SF patch items ---------------------------------------------------- Someone wanted to help but wasn't sure how they could. Martin v. L?wis sent an email listing common things anyone can do to help with dealing with the patch items on SourceForge_. The email can be found at http://mail.python.org/pipermail/python-dev/2003-September/038253.html . Contributing threads: - `Help offered `__ --------------- Python glossary --------------- Skip Montanaro converted the glossary he has as a wiki at http://manatee.mojam.com/python-glossary to the proper format to be included in the Python documentation. 
You can peruse the glossary as it stands in the documentation at http://www.python.org/dev/doc/devel/tut/node16.html. Thanks to Skip for doing the grunt work and getting this done. If you wish to help, please visit the wiki and add/edit/whatever. Contributing threads: - `Python Glossary `__ ---------------------------------- Mitch Kapor to speak at PyCon 2004 ---------------------------------- Mitch Kapor is founder of the `Open Source Application Foundation`_ (OSAF), co-founder of the `Electronic Frontier Foundation`_, and developer of Chandler_. He is going to be the keynote speaker at `PyCon 2004`_. The general `Call for Papers`_ has gone out. If you have any desire to speak at PyCon, take a look at the CFP. .. _PyCon 2004: http://www.python.org/pycon/dc2004/ .. _Open Source Application Foundation: http://www.osafoundation.org/ .. _Electronic Frontier Foundation: http://www.eff.org/ .. _Chandler: http://www.osafoundation.org/Chandler-Product_FAQ.htm .. _Call for Papers: http://www.python.org/pycon/dc2004/cfp.html ----------------------------------------------------- Python 2.3.1 released, people were happy... initially ----------------------------------------------------- Python 2.3.1 was released to the general public. It was meant to be a bug-fix release to fix bugs that were discovered after Python 2.3 went out the door. But then a typo in the configure.in script that prevented os.fsync() from ever being included was discovered. A rather vocal group of users of this function got out their pitchforks and torches while screaming, "blood, blood!" (actually they were nice about it, but saying, "they kindly asked for a new release," isn't that dramatic, is it?) How were the rioting masses (who were actually not rioting) appeased? Contributing threads: - `2.3.1 is (almost) a go `__ - `RELEASED Python 2.3.1 `__ - `How to test for stuff like fsync? 
`__ ---------------------------------------------- Let them eat cake while releasing Python 2.3.2 ---------------------------------------------- Python 2.3.2 was released to deal with the os.fsync() snafu. HP/UX compiling issues were also addressed. The bsddb3 problems are still there, but it is becoming more and more certain that the issues are with Sleepycat and not the bsddb module. Contributing threads: - `plans for 2.3.2 `__ - `Python2.3.2 and release23-maint branch `__ - `2.3.2 and bsddb `__ - `RELEASED Python 2.3.2, release candidate 1 `__ - `OpenSSL vulnerability `__ - `RELEASED Python 2.3.2 (final) `__ From zeddicus at satokar.com Fri Oct 10 07:37:21 2003 From: zeddicus at satokar.com (Michael Bartl) Date: Fri Oct 10 07:39:31 2003 Subject: [Python-Dev] Patches & Bug help revisited Message-ID: <20031010113721.GA4148@satokar.com> Hi! I found myself mentioned in the summary so I thought I'd drop a line again. After my initial offer to help I started with reviewing and writing (very simple) patches. To be more explicit: patch 813200, patch 810914, bug 810408, 811082, (can't remember the rest from memory, can provide a better list later). I was quite wondering that no-one seemed to have a look at the patches, but thought that this might be due to the pressing 2.3.2 release. I find patch writing rather unsatisfying if they aren't applied (or rejected :) btw: Is there any possibility to search through the bugs/patches? I couldn't find it and see the sf.net capabilities as rather limited. Have fun, Michael From theller at python.net Fri Oct 10 10:39:48 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 10 10:39:56 2003 Subject: [Python-Dev] buildin vs. shared modules Message-ID: What is the rationale to decide whether a module is builtin or an extension module in core Python (I only care about Windows)? 
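Whether a particular module ended up builtin or shared in a given build can be checked from Python itself; the following is a minimal sketch (the modules probed here are only examples, and which ones show up as builtin varies by platform and build options):

```python
import sys

# Modules compiled directly into the interpreter binary are listed in
# sys.builtin_module_names; they have no __file__ attribute.
print(sorted(sys.builtin_module_names))

# An extension module, by contrast, lives in a separate .pyd/.so file
# and reports its path via __file__.
import zlib  # a shared extension on most builds, builtin on some
print(getattr(zlib, '__file__', '(builtin: no file on disk)'))
```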
To give examples, could zlib be made into a builtin module (because it's useful for zipimport), _sre (because it's used by warnings), or are there reasons preventing this? Thomas From theshadow at shambala.net Fri Oct 10 11:25:25 2003 From: theshadow at shambala.net (John Hoffman) Date: Fri Oct 10 11:25:38 2003 Subject: [Python-Dev] IPv6 in Windows binary distro Message-ID: <3F86CF65.1000401@shambala.net> Hello... Could you please tell me why IPv6 support isn't present in the 2.3.1 and 2.3.2 Windows binary releases? Is it broken for Windows? If not, I'd really appreciate if someone could make a new build for me... Thanks. From wtrenker at hotmail.com Fri Oct 10 04:38:51 2003 From: wtrenker at hotmail.com (William Trenker) Date: Fri Oct 10 11:42:16 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010113721.GA4148@satokar.com> References: <20031010113721.GA4148@satokar.com> Message-ID: <20031010083851.6c238978.wtrenker@hotmail.com> On Fri, 10 Oct 2003 13:37:21 +0200 Michael Bartl wrote regarding [Python-Dev] Patches & Bug help revisited: > btw: Is there any possibility to search through the bugs/patches. I > couldn't find it and see the sf.net capabilities as rather limited. If you go to the patches page, on the left-hand side you will see a search box right under the Sourceforge logo. The drop-down list will say "Patches". In the text box under that, type in your search text and click on the search button. You can do the same thing on the bugs page. Of course the search drop-down on the bugs page will have the word "Bugs" in it. Hope this is helpful. 
Bill From zeddicus at satokar.com Fri Oct 10 11:51:16 2003 From: zeddicus at satokar.com (Michael Bartl) Date: Fri Oct 10 11:51:20 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010083851.6c238978.wtrenker@hotmail.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> Message-ID: <20031010155116.GA5520@satokar.com> On Fri, Oct 10, 2003 at 08:38:51AM +0000, William Trenker wrote: > On Fri, 10 Oct 2003 13:37:21 +0200 > Michael Bartl wrote regarding [Python-Dev] Patches & Bug help revisited: > > > btw: Is there any possibility to search through the bugs/patches. I > > couldn't find it and see the sf.net capabilities as rather limited. > > If you go to the patches page, on the left-hand side you will see a search box right under the Sourceforge logo. The drop-down list will say "Patches". In the text box under that, type in your search text and click on the search button. > > You can do the same thing on the bugs page. Of course the search drop-down on the bugs page will have the word "Bugs" in it. > > Hope this is helpful. > Bill Indeed it is! I never had a look at this, because I thought it's only for software/people. It's not a full text search which would be nice, but it's a start :) From mwh at python.net Fri Oct 10 11:54:55 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 10 11:54:04 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <3F86CF65.1000401@shambala.net> (John Hoffman's message of "Fri, 10 Oct 2003 09:25:25 -0600") References: <3F86CF65.1000401@shambala.net> Message-ID: <2mad89ulyo.fsf@starship.python.net> John Hoffman writes: > Hello... Could you please tell me why IPv6 support isn't present in > the 2.3.1 and 2.3.2 Windows binary releases? Is it broken for > Windows? If not, I'd really appreciate if someone could make a new > build for me... Thanks. Did the 2.3 builds have IPv6 support? Then this would be a nasty regression. 
However, I *thought* that you had to build with VC++ 7 or higher to get IPv6 support on Windows, and we've never done that. Cheers, mwh (not a windows victim) -- NUTRIMAT: That drink was individually tailored to meet your personal requirements for nutrition and pleasure. ARTHUR: Ah. So I'm a masochist on a diet am I? -- The Hitch-Hikers Guide to the Galaxy, Episode 9 From nas-python at python.ca Fri Oct 10 12:11:16 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 10 12:10:35 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? Message-ID: <20031010161116.GA1238@mems-exchange.org> Python 2.3 seems to be really sluggish starting up. Even on my relatively fast development machine I am getting annoyed when running small scripts. I've only just starting digging into it but I think I've found something interesting. Here's an excerpt of strace output for 2.2: 0.0511 rt_sigaction(SIGRT_30, NULL, {SIG_DFL}, 8) = 0 0.0511 rt_sigaction(SIGRT_31, NULL, {SIG_DFL}, 8) = 0 0.0512 rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0 0.0512 rt_sigaction(SIGINT, {0x4002e610, [], SA_RESTORER, 0x4010e578}, NULL, 8) 0.0513 stat64("/home/nascheme/lib/python", {st_mode=S_IFDIR|0775, st_size=504, 0.0514 stat64("/home/nascheme/lib/python/site", 0xbfffed08) = -1 ENOENT (No suc 0.0515 open("/home/nascheme/lib/python/site.so", O_RDONLY|O_LARGEFILE) = -1 ENO 0.0516 open("/home/nascheme/lib/python/sitemodule.so", O_RDONLY|O_LARGEFILE) = 0.0516 open("/home/nascheme/lib/python/site.py", O_RDONLY|O_LARGEFILE) = -1 ENO 0.0517 open("/home/nascheme/lib/python/site.pyc", O_RDONLY|O_LARGEFILE) = -1 EN 0.0517 stat64("/www/plat/python/lib/python23.zip", 0xbfffe3c4) = -1 ENOENT (No 0.0518 stat64("/www/plat/python/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...} 0.0519 stat64("/www/plat/python/lib/python23.zip/site", 0xbfffed08) = -1 ENOENT 0.0519 open("/www/plat/python/lib/python23.zip/site.so", O_RDONLY|O_LARGEFILE) 0.0523 
open("/www/plat/python/lib/python23.zip/sitemodule.so", O_RDONLY|O_LARGE 0.0538 open("/www/plat/python/lib/python23.zip/site.py", O_RDONLY|O_LARGEFILE) 0.0541 open("/www/plat/python/lib/python23.zip/site.pyc", O_RDONLY|O_LARGEFILE) 0.0556 stat64("/www/plat/python/lib/python2.3/", {st_mode=S_IFDIR|0755, st_size 0.0557 stat64("/www/plat/python/lib/python2.3/site", 0xbfffed08) = -1 ENOENT (N 0.0557 open("/www/plat/python/lib/python2.3/site.so", O_RDONLY|O_LARGEFILE) = - 0.0561 open("/www/plat/python/lib/python2.3/sitemodule.so", O_RDONLY|O_LARGEFIL 0.0575 open("/www/plat/python/lib/python2.3/site.py", O_RDONLY|O_LARGEFILE) = 4 0.0593 fstat64(4, {st_mode=S_IFREG|0644, st_size=11784, ...}) = 0 0.0594 open("/www/plat/python/lib/python2.3/site.pyc", O_RDONLY|O_LARGEFILE) = 0.0611 fstat64(5, {st_mode=S_IFREG|0664, st_size=11417, ...}) = 0 and for 2.3: 0.0521 rt_sigaction(SIGRT_30, NULL, {SIG_DFL}, 8) = 0 0.0521 rt_sigaction(SIGRT_31, NULL, {SIG_DFL}, 8) = 0 0.0522 rt_sigaction(SIGINT, NULL, {SIG_DFL}, 8) = 0 0.0522 rt_sigaction(SIGINT, {0x4002e610, [], SA_RESTORER, 0x4010e578}, NULL 0.0524 stat64("/home/nascheme/lib/python", {st_mode=S_IFDIR|0775, st_size=5 0.0551 stat64("/home/nascheme/lib/python/site", 0xbfffed18) = -1 ENOENT (No 0.0716 open("/home/nascheme/lib/python/site.so", O_RDONLY|O_LARGEFILE) = -1 0.0811 open("/home/nascheme/lib/python/sitemodule.so", O_RDONLY|O_LARGEFILE 0.0925 open("/home/nascheme/lib/python/site.py", O_RDONLY|O_LARGEFILE) = -1 0.1079 open("/home/nascheme/lib/python/site.pyc", O_RDONLY|O_LARGEFILE) = - 0.1145 stat64("/www/python/lib/python23.zip", 0xbfffe3d4) = -1 ENOENT (No s 0.1258 stat64("/www/python/lib", {st_mode=S_IFDIR|0755, st_size=4096, ...}) 0.1260 stat64("/www/python/lib/python23.zip/site", 0xbfffed18) = -1 ENOENT 0.1261 open("/www/python/lib/python23.zip/site.so", O_RDONLY|O_LARGEFILE) = 0.1284 open("/www/python/lib/python23.zip/sitemodule.so", O_RDONLY|O_LARGEF 0.1384 open("/www/python/lib/python23.zip/site.py", 
O_RDONLY|O_LARGEFILE) = 0.1443 open("/www/python/lib/python23.zip/site.pyc", O_RDONLY|O_LARGEFILE) 0.1492 stat64("/www/python/lib/python2.3/", {st_mode=S_IFDIR|0755, st_size= 0.1512 stat64("/www/python/lib/python2.3/site", 0xbfffed18) = -1 ENOENT (No 0.1623 open("/www/python/lib/python2.3/site.so", O_RDONLY|O_LARGEFILE) = -1 0.1756 open("/www/python/lib/python2.3/sitemodule.so", O_RDONLY|O_LARGEFILE 0.1994 open("/www/python/lib/python2.3/site.py", O_RDONLY|O_LARGEFILE) = 4 0.2081 fstat64(4, {st_mode=S_IFREG|0644, st_size=11784, ...}) = 0 0.2083 open("/www/python/lib/python2.3/site.pyc", O_RDONLY|O_LARGEFILE) = 5 0.2222 fstat64(5, {st_mode=S_IFREG|0664, st_size=11417, ...}) = 0 I cut off the long lines since the first column showing time in seconds since startup is the interesting bit. Notice that 2.3 is making a few more system calls due to the zip import feature but it is taking a lot more time to find the 'site' module. I'm going to keep digging but perhaps someone has a theory as to what's going on. Neil From kbg at kadnet.dk Fri Oct 10 12:11:37 2003 From: kbg at kadnet.dk (kasper b. graversen) Date: Fri Oct 10 12:13:02 2003 Subject: [Python-Dev] attaching methods to an object at runtime and compiler enhancement ideas... Message-ID: <200310101811370393.020A912A@lisbeth.kadnet.dom> Hello all. This is my first posting here. My name is Kasper Graversen, a Ph.D. student at the IT University of Copenhagen. I'm playing with python for doing roles, that is, runtime specialization per object with the ability of multiple views on each object. So far it has been fun playing with python, but I ponder why it is only possible to introduce functions and not methods to object instances? 
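An instance-specific bound method can in fact be created by binding a function to one object by hand; a minimal sketch using the types module of modern Python (in Python 2.3 the same effect was available via new.instancemethod; the Person/greet names are only illustrative):

```python
import types

class Person(object):
    def __init__(self, name):
        self.name = name

def greet(self):
    return "hi, I am %s" % self.name

p = Person("kasper")
# Bind greet to this single instance; the Person class and every
# other instance are left untouched.
p.greet = types.MethodType(greet, p)
print(p.greet())  # greet now receives p as 'self'
```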
I am also wondering if it is possible to change the parsed code at compile time by gaining access to the AST and by the use of some mechanism (somewhat similar to meta classes) to be able to patch in before the execution of the code. Finally, one of the most difficult things of moving from Java to python is the lack of checking done by the compiler ;) Here are two things I really miss which I would like future versions of the compiler to support: * A flag, when set, checks that each __init__ method calls its super __init__ * A flag, when set, checks that an inner class in a subclass by the same name of its super's inner class subclasses this class. Oddly only methods and not also inner classes are virtual by default in Python. eg

    class A(object):
        class B(object): ...

    class C(A):
        class B(object): ...

should raise an error since C.B should extend A.B * A flag, when set, raises an error if a field is introduced in code outside the __init__() block. This ensures that spelling mistakes are caught at compile time, when misspelling the field to be accessed. sincerely Kasper B. Graversen please help save our planet! At least click daily on http://rainforest.care2.com/ and http://www.therainforestsite.com/ and tell your friends to do the same... From nas-python at python.ca Fri Oct 10 12:25:46 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 10 12:25:03 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? In-Reply-To: <20031010161116.GA1238@mems-exchange.org> References: <20031010161116.GA1238@mems-exchange.org> Message-ID: <20031010162546.GA1319@mems-exchange.org> On Fri, Oct 10, 2003 at 09:11:16AM -0700, Neil Schemenauer wrote: >Here's an excerpt of strace output for 2.2: Argh, please ignore that garbage. I was using the wrong binary for Python 2.2. Neil From fdrake at acm.org Fri Oct 10 12:25:41 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Fri Oct 10 12:26:11 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010155116.GA5520@satokar.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> <20031010155116.GA5520@satokar.com> Message-ID: <16262.56709.186862.565850@grendel.zope.com> Michael Bartl writes: > Indeed it is! I never had a look at this, because I thought it's only > for software/people. It's not a full text search which would be nice, > but it's a start :) It used to be only for software and people, but that was fixed. Now, it searches through all the issues in the current tracker and doesn't let you filter or sort the results in any way. But better than it was. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip at pobox.com Fri Oct 10 12:26:34 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 10 12:44:43 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010155116.GA5520@satokar.com> References: <20031010113721.GA4148@satokar.com> <20031010083851.6c238978.wtrenker@hotmail.com> <20031010155116.GA5520@satokar.com> Message-ID: <16262.56762.435465.943403@montanaro.dyndns.org> >> If you go to the patches page, on the left-hand side you will see a >> search box ... Michael> I never had a look at this, because I thought it's only for Michael> software/people. It's not a full text search which would be Michael> nice, but it's a start :) It's also limited by the fact that you can't specify any other search constraints (like assignee or current state). If you search for a term which turns up frequently, you'll get dozens of hits, most of which will be useless because the bug or patch has already been closed. Because of this, I often find it easier to use the browse function using various criteria to limit its scope. 
Skip From tjreedy at udel.edu Fri Oct 10 14:19:51 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 10 14:19:56 2003 Subject: [Python-Dev] Re: attaching methods to an object at runtime and compiler enhancement ideas... References: <200310101811370393.020A912A@lisbeth.kadnet.dom> Message-ID: "kasper b. graversen" wrote in message news:200310101811370393.020A912A@lisbeth.kadnet.dom... At least your first two questions are about usage rather than development and would be better directed to comp.lang.python (or g.c.p.general). In the meanwhile... >each object. So far it has been fun playing with python, but I ponder why it is >only possible to introduce functions and not methods to object instances? 1. Methods are functions attached to classes as attributes. 2. The need for instance-specific 'methods' is rare. 3. Rare needs are covered by explicitly passing the instance as an arg: person.role(person, *args) > I am also wondering if it is possible to change the parsed code at compile > time by gaining access to the AST See compiler module/package and its AST walker. >Finally, one of the most difficult things of moving from Java to python is the >lack of checking done by the compiler ;) Here are two things I really miss >which I would like future versions of the compiler to support: The current interpreter checks only for syntactic correctness and deprecated usages (to issue warnings). This is unlikely to change soon. Enforcing coding standards is more the province of PyChecker and PyLint. If neither has the checks you want, give both authors your suggestions. Terry J. Reedy From bac at OCF.Berkeley.EDU Fri Oct 10 15:29:11 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Oct 10 15:29:19 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <20031010113721.GA4148@satokar.com> References: <20031010113721.GA4148@satokar.com> Message-ID: <3F870887.4090201@ocf.berkeley.edu> Michael Bartl wrote: > Hi! 
> > I found myself mentioned in the summary so I thought I'd drop a line > again. After my inital offer to help I started with reviewing and > writing (very simple) patches. To be more explicit: patch 813200, > patch 810914, bug 810408, 811082, (can't remember the rest from > memory, can provide a better list later). > > I was quite wondering that no-one seemed to have a look at the patches, > but thought that this might be due to the pressing 2.3.2 release. > > I find patch writing rather unsatisfying if they aren't applied (or > rejected :) > Sorry about no one getting to your patches, Michael. The help is truly appreciated, even if no one has gotten around to looking at them. As for why no one has dealt with them, Python 2.3.2 was part of it. The other issue is just people on python-dev being busy. Beyond a handful of people, not everyone goes through the new patches, bugs, etc. because of time constraints and having to prioritize what time they have to work on Python. It is an unfortunate drawback of having everyone who works on Python be a volunteer (this can be solved if someone gave the PSF *tons* of money and thus could sponsor someone to work on Python full time, but I haven't won the lottery yet =). Someone will get to them at some point, I promise. If anything I will get to them because I am going to go through all the patch items in the near future; I will reach the end at some point. =) -Brett From neal at metaslash.com Fri Oct 10 16:01:08 2003 From: neal at metaslash.com (Neal Norwitz) Date: Fri Oct 10 16:01:17 2003 Subject: [Python-Dev] Patches & Bug help revisited In-Reply-To: <3F870887.4090201@ocf.berkeley.edu> References: <20031010113721.GA4148@satokar.com> <3F870887.4090201@ocf.berkeley.edu> Message-ID: <20031010200108.GR30467@epoch.metaslash.com> On Fri, Oct 10, 2003 at 12:29:11PM -0700, Brett C. 
wrote: > Michael Bartl wrote: > > >I was quite wondering that no-one seemed to have a look at the patches, > >but thought that this might be due to the pressing 2.3.2 release. > > > >I find patch writing rather unsatisfying if they aren't applied (or > >rejected :) > > As for why no one has dealt with them, Python 2.3.2 was part of it. The > other issue is just people on python-dev being busy. Beyond a handful > of people, not everyone goes through the new patches, bugs, etc. because > of time constraints and having to prioritize what time they have to work > on Python. Brett is correct. Speaking for myself, most of my free time has gone from working on pychecker to working on python to working on the PSF (I'm treasurer). > It is an unfortunate drawback of having everyone who works > on Python be a volunteer (this can be solved if someone gave the PSF > *tons* of money and thus could sponsor someone to work on Python full > time, but I haven't won the lottery yet =). It would be even better for the PSF to win grants from other Public Charities or government organizations. To win grants we need to identify grant opportunities and, more importantly, write proposals. Neal From guido at python.org Fri Oct 10 18:08:16 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 10 18:08:28 2003 Subject: [Python-Dev] Python 2.3 startup slowness, import related? In-Reply-To: Your message of "Fri, 10 Oct 2003 09:11:16 PDT." <20031010161116.GA1238@mems-exchange.org> References: <20031010161116.GA1238@mems-exchange.org> Message-ID: <200310102208.h9AM8Ha03802@12-236-54-216.client.attbi.com> > Python 2.3 seems to be really sluggish starting up. There have been I think at least two past threads on this issue; it might be useful to look them up. While a bit of work was done to alleviate the problem, 2.3 remains much slower because it imports a much larger set of modules at startup... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Oct 10 18:17:13 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 10 18:17:21 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: Message-ID: <3F872FE9.9070508@v.loewis.de> Thomas Heller wrote: > What is the rationale to decide whether a module is builtin or an > extension module in core Python (I only care about Windows)? I believe it is mostly tradition, on Windows: We continue to do things the way they have always been done. On Linux, there is an additional rationale: small executables and many files are cool, so we try to have as many shared libraries as possible. (if you smell sarcasm - that is intentional) > To give examples, could zlib be made into a builtin module (because it's > useful for zipimport), _sre (because it's used by warnings), or are > there reasons preventing this? I think that anything that would be reasonably replaced by third parties (such as pyexpat.pyd) should be shared, and anything else should be part of pythonxy.dll. Regards, Martin From martin at v.loewis.de Fri Oct 10 18:20:07 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 10 18:20:15 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <2mad89ulyo.fsf@starship.python.net> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> Message-ID: <3F873097.7050201@v.loewis.de> Michael Hudson wrote: > Did the 2.3 builds have IPv6 support? Then this would be a nasty > regression. However, I *thought* that you had to build with VC++ 7 or > higher to get IPv6 support on Windows, and we've never done that. No, 2.3 did not have IPv6. You don't strictly need VC7, though - if you have the SDK installed in addition to VC6, you could also include IPv6 support. 
PC/pyconfig.h does not detect this case automatically, so you would have to manually activate this support (i.e. include winsock2.h). Apart from that, you are right - IPv6 is not supported in the Windows builds because of lacking support in the compiler's header files. Regards, Martin From aahz at pythoncraft.com Fri Oct 10 21:18:25 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 10 21:18:28 2003 Subject: [Python-Dev] OS testing (was Re: 2.3.3 plans) In-Reply-To: <16257.52079.226636.407139@montanaro.dyndns.org> References: <200310040008.h9408HtM008544@localhost.localdomain> <2m3ce6zomw.fsf@starship.python.net> <16257.52079.226636.407139@montanaro.dyndns.org> Message-ID: <20031011011825.GA15528@panix.com> On Mon, Oct 06, 2003, Skip Montanaro wrote: > > It's not quite exhaustive yet, but I will remind people about the > PythonTesters wiki page: > > http://www.python.org/cgi-bin/moinmoin/PythonTesters > > Maybe that page should also mention some of the vendor-specific test sites > (HP Test Drive, SourceForge compile farm, PBF server farm, ...). Added HP Test Drive. I also added links to PythonTesters on the /dev/ Tools page and the Dev FAQ. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From tim.one at comcast.net Fri Oct 10 23:52:49 2003 From: tim.one at comcast.net (Tim Peters) Date: Fri Oct 10 23:52:54 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Message-ID: [Thomas Heller] > What is the rationale to decide whether a module is builtin or an > extension module in core Python (I only care about Windows)? I don't know that there is one. Maybe to avoid chewing address space for code that some programs won't use. Generally speaking, it appears some effort was made to make stuff an extension module on Windows if it was an optional part of the Unix build. 
There was certainly an effort made to build an extension for Python modules wrapping external code (like the _bsddb and _tkinter projects). > To give examples, could zlib be made into a builtin module (because > it's useful for zipimport), _sre (because it's used by warnings), or > are there reasons preventing this? zlib was there long before Python routinely made use of it; indeed, I doubt I ever used one byte of the zlib code outside of Python testing before zip import came along (and since I have no zip files to import from I guess I still never use it). Leaving _sre an extension seems odd now, but at the time it was competing with the external-to-Python PCRE code. Why do you ask? Answers must be accurate to 10 decimal digits. From bac at OCF.Berkeley.EDU Sat Oct 11 18:50:25 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 11 18:50:29 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? Message-ID: <3F888931.9000706@ocf.berkeley.edu> I just ran ``regrtest.py -unetwork`` after a fresh update and noticed that test__locale failed for me under OS X 10.2.8 because _locale.RADIXCHAR does not exist. Anyone else getting this failure? If so I will add it to the expected skip list. -Brett From martin at v.loewis.de Sat Oct 11 18:59:55 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Oct 11 19:00:02 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> References: <3F888931.9000706@ocf.berkeley.edu> Message-ID: <3F888B6B.6080002@v.loewis.de> Brett C. wrote: > I just ran ``regrtest.py -unetwork`` after a fresh update and noticed > that test__locale failed for me under OS X 10.2.8 because > _locale.RADIXCHAR does not exist. Anyone else getting this failure? If > so I will add it to the expected skip list. A test failure is different from a skipped test. 
Martin From jepler at unpythonic.net Sat Oct 11 20:30:28 2003 From: jepler at unpythonic.net (Jeff Epler) Date: Sat Oct 11 20:30:35 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> References: <3F888931.9000706@ocf.berkeley.edu> Message-ID: <20031012003021.GA20833@unpythonic.net> This test was added for python.org/sf/798145. I assume that the 'from _locale import ... RADIXCHAR ...' line fails, and the test skips? Any system which doesn't have RADIXCHAR is expected to skip this test. According to glibc's documentation (the version google coughed up: http://www.delorie.com/gnu/docs/glibc/libc_119.html ), RADIXCHAR is among the identifiers specified in "the X/Open standard". OS X isn't Unix enough for this situation? Jeff From greg at cosc.canterbury.ac.nz Sat Oct 11 21:13:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 11 21:14:19 2003 Subject: [Python-Dev] MacPython - access to FinderInfo of a directory Message-ID: <200310120113.h9C1Dsv21694@oma.cosc.canterbury.ac.nz> I discovered recently that the File Manager wrappings in MacPython don't seem to provide any way of getting at the FinderInfo of a directory, because GetFInfo/SetFInfo only work on files, and access to the finderInfo field of the FSCatalogInfo structure hasn't been implemented. I have come up with a patch to _Filemodule.c to remedy this, but patching this file directly probably isn't the right thing to do, because it seems to have been generated automatically using bgen. Unfortunately I don't know enough about bgen to fix this properly. Should I go ahead and submit a patch anyway, and hope that someone will be able to reverse-engineer it into whatever fix is appropriate? It would be good to get this incorporated into the standard distribution if possible. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From sdm7g at mac.com Sat Oct 11 22:00:28 2003 From: sdm7g at mac.com (Steven Majewski) Date: Sat Oct 11 22:00:35 2003 Subject: [Python-Dev] Failure of test__locale common on OS X? In-Reply-To: <3F888931.9000706@ocf.berkeley.edu> Message-ID: tkFileDialog also fails when run as a main module: % pythonw tkFileDialog.py Traceback (most recent call last): File "tkFileDialog.py", line 189, in ? locale.setlocale(locale.LC_ALL,'') File "/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/ locale.py", line 381, in setlocale return _setlocale(category, locale) locale.Error: locale setting not supported All the functions seem to work ok when imported. The problem is the following in the 'if __name__ == "__main__" : ' clause: # See whether CODESET is defined try: import locale locale.setlocale(locale.LC_ALL,'') enc = locale.nl_langinfo(locale.CODESET) except (ImportError, AttributeError): pass The tail end of the OSX setlocale man page reads: ------------------------------------------------------------------------ --------------------------------- STANDARDS The setlocale() and localeconv() functions conform to ISO/IEC 9899:1990 (``ISO C89''). HISTORY The setlocale() and localeconv() functions first appeared in 4.4BSD. BUGS The current implementation supports only the "C" and "POSIX" locales for all but the LC_COLLATE, LC_CTYPE, and LC_TIME categories. In spite of the gnarly currency support in localeconv(), the standards don't include any functions for generalized currency formatting. Use of LC_MONETARY could lead to misleading results until we have a real time currency conversion function. LC_NUMERIC and LC_TIME are personal choices and should not be wrapped up with the other categories. 
BSD June 9, 1993 BSD ------------------------------------------------------------------------ -------------------------------- In fact, setlocale( LC_ALL, "POSIX" ) and setlocale( LC_ALL, "C" ) both work. ( Some other things also don't work, but I'm not sure exactly what things should work. ) -- Steve Majewski From skip at manatee.mojam.com Sun Oct 12 08:00:33 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 12 08:00:42 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310121200.h9CC0XC6010757@manatee.mojam.com> Bug/Patch Summary ----------------- 542 open / 4234 total bugs (+13) 209 open / 2411 total patches (+3) New Bugs -------- distutils: clean -b ignored; set_undefined_options doesn't (2003-10-05) http://python.org/sf/818201 Shared object modules in Windows have no __file__. (2003-10-05) http://python.org/sf/818315 socketmodule.c compile error using SunPro cc (2003-10-06) http://python.org/sf/818490 optparse "append" action should always make the empty list. (2003-10-07) http://python.org/sf/819178 httplib.SSLFile lacks readlines() method (2003-10-07) http://python.org/sf/819510 PythonIDE interactive window Unicode bug (2003-10-08) http://python.org/sf/819860 Ref Man Index: Symbols -- Latex leak (2003-10-08) http://python.org/sf/820344 urllib2 silently returns None when auth_uri is mismatched (2003-10-09) http://python.org/sf/820583 tkinter's 'after' and 'threads' on multiprocessor (2003-10-09) http://python.org/sf/820605 dbm Error (2003-10-09) http://python.org/sf/820953 reduce docs neglect a very important piece of information. 
(2003-10-11) http://python.org/sf/821701 pyclbr.readmodule_ex() (2003-10-11) http://python.org/sf/821818 _set_cloexec of tempfile.py uses incorrect error handling (2003-10-11) http://python.org/sf/821896 fcntl() not working on sparc (python 2.2.3) (2003-10-11) http://python.org/sf/821948 Carbon.CarbonEvt.ReceiveNextEvent args wrong (2003-10-11) http://python.org/sf/822005 New Patches ----------- Fix for former/latter confusion in Extending documentation (2003-10-06) http://python.org/sf/819012 fix import problem(unittest.py) (2003-10-07) http://python.org/sf/819077 fix doc typos (2003-10-10) http://python.org/sf/821093 ftplib: Strict RFC 959 (telnet in command channel) (2003-10-11) http://python.org/sf/821862 Closed Bugs ----------- compiler package needs better documentation. (2000-11-27) http://python.org/sf/223616 threads and profiler don't work together (2001-02-08) http://python.org/sf/231540 docs need to discuss // and __future__.division (2001-08-08) http://python.org/sf/449093 urljoin fails RFC tests (2001-08-11) http://python.org/sf/450225 new int overflow handling needs docs (2001-08-22) http://python.org/sf/454446 docs should include man page (2001-10-09) http://python.org/sf/469773 Lib/profile.doc should be updated (2001-12-04) http://python.org/sf/489256 Using the lib index mechanically (2002-04-03) http://python.org/sf/538961 Fuzziness in inspect module documentatio (2002-06-01) http://python.org/sf/563298 Automated daily documentation builds (2002-06-26) http://python.org/sf/574241 -S hides standard dynamic modules (2002-07-25) http://python.org/sf/586680 site-packages & build-dir python (2002-07-25) http://python.org/sf/586700 cPickle documentation incomplete (2002-09-28) http://python.org/sf/616013 File write examples are inadequate (2002-10-09) http://python.org/sf/621057 Creation of struct_seq types (2002-10-17) http://python.org/sf/624827 pydoc.Helper.topics not based on docs (2002-10-24) http://python.org/sf/628258 pygettext should be installed 
(2002-11-22) http://python.org/sf/642309 extra __builtin__ stuff not documented (2002-12-12) http://python.org/sf/652749 cPickle not always same as pickle (2002-12-18) http://python.org/sf/655802 Accept None for time.ctime() and friends (2002-12-24) http://python.org/sf/658254 HTMLParser attribute parsing bug (2003-02-10) http://python.org/sf/683938 _iscommand() in webbrowser module (2003-02-16) http://python.org/sf/687747 Provide "plucker" format docs. (2003-03-06) http://python.org/sf/698900 Building lib.pdf fails on MacOSX (2003-04-14) http://python.org/sf/721157 urlopen object's read() doesn't read to EOF (2003-04-21) http://python.org/sf/725265 platform module needs docs (LaTeX) (2003-04-24) http://python.org/sf/726911 Importing anydbm generates exception if _bsddb unavailable (2003-05-02) http://python.org/sf/731501 markupbase parse_declaration cannot recognize comments (2003-05-12) http://python.org/sf/736659 Failed assert in stringobject.c (2003-05-14) http://python.org/sf/737947 Tutorial: executable scripts on Windows (2003-06-25) http://python.org/sf/760657 test_ossaudiodev timing failure (2003-08-04) http://python.org/sf/783242 Section 13.1 HTMLParser documentation error (2003-08-23) http://python.org/sf/793702 Clarify trailing comma in func arg list (2003-09-01) http://python.org/sf/798652 Mode argument of dumbdbm does not work (2003-09-07) http://python.org/sf/802128 super instances don't support item assignment (2003-09-12) http://python.org/sf/805304 refleaks in _hotshot.c (2003-09-18) http://python.org/sf/808756 tex to html convert bug (2003-09-19) http://python.org/sf/809599 Py2.2.3: Problem with Expat/XML/Zope on MacOSX 10.2.8 (2003-09-23) http://python.org/sf/811070 randint is always even (2003-09-24) http://python.org/sf/812202 mark deprecated modules in indexes (2003-10-02) http://python.org/sf/816725 Float Multiplication (2003-10-02) http://python.org/sf/816946 invalid \U escape gives 0=length unistr (2003-10-03) http://python.org/sf/817156 
use Windows' default programs location. (2003-10-05) http://python.org/sf/818030 Closed Patches -------------- Add multicall support to xmlrpclib (2002-03-18) http://python.org/sf/531629 PyArg_VaParseTupleAndKeywords (2002-04-30) http://python.org/sf/550732 attributes for urlsplit, urlparse result (2002-10-16) http://python.org/sf/624325 HTMLParser.py - more robust SCRIPT tag parsing (2003-01-19) http://python.org/sf/670664 test_htmlparser -- more robust SCRIPT tag handling (2003-01-24) http://python.org/sf/674449 Kill off docs for unsafe macros (2003-03-13) http://python.org/sf/702933 Remove __file__ after running $PYTHONSTARTUP (2003-04-11) http://python.org/sf/719777 build of html docs broken (liboptparse.tex) (2003-05-04) http://python.org/sf/732174 Nails down the semantics of dict setitem (2003-06-03) http://python.org/sf/748126 Let pprint.py use issubclass instead of is for type checking (2003-06-07) http://python.org/sf/750542 Glossary (2003-08-13) http://python.org/sf/788509 Improve "veryhigh.tex" API docs (2003-09-01) http://python.org/sf/798638 dynamic popen2 MAXFD (2003-10-03) http://python.org/sf/817329 From fdrake at acm.org Sun Oct 12 10:43:08 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Sun Oct 12 10:43:17 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib urlparse.py, 1.41, 1.42 In-Reply-To: References: Message-ID: <16265.26748.451661.390136@grendel.zope.com> bcannon@users.sourceforge.net writes: > Log Message: > (revision purely to add comment) You can use "cvs admin" to fix broken comments. It doesn't generate an email, but it avoids an extra entry in the history. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From bac at OCF.Berkeley.EDU Sun Oct 12 16:05:21 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Sun Oct 12 16:05:34 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Lib urlparse.py, 1.41, 1.42 In-Reply-To: <16265.26748.451661.390136@grendel.zope.com> References: <16265.26748.451661.390136@grendel.zope.com> Message-ID: <3F89B401.6070601@ocf.berkeley.edu> Fred L. Drake, Jr. wrote: > bcannon@users.sourceforge.net writes: > > Log Message: > > (revision purely to add comment) > > You can use "cvs admin" to fix broken comments. It doesn't generate > an email, but it avoids an extra entry in the history. > OK, good to know. I think I will add that to the dev FAQ. -Brett From tjreedy at udel.edu Sun Oct 12 18:25:03 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Sun Oct 12 18:25:18 2003 Subject: [Python-Dev] Re: Weekly Python Bug/Patch Summary References: <200310121200.h9CC0XC6010757@manatee.mojam.com> Message-ID: "Skip Montanaro" wrote in message news:200310121200.h9CC0XC6010757@manatee.mojam.com... > > Bug/Patch Summary > ----------------- > > 542 open / 4234 total bugs (+13) > 209 open / 2411 total patches (+3) > > New Bugs (15 new) > Closed Bugs Something hiccuped. About 45 are listed. If these were really closed, then net change would be about -30. Spot check of about 5 showed still open. TJR From Jack.Jansen at cwi.nl Mon Oct 13 05:52:34 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon Oct 13 05:52:05 2003 Subject: [Python-Dev] MacPython - access to FinderInfo of a directory In-Reply-To: <200310120113.h9C1Dsv21694@oma.cosc.canterbury.ac.nz> Message-ID: On Sunday, October 12, 2003, at 03:13 AM, Greg Ewing wrote: > I discovered recently that the File Manager wrappings in > MacPython don't seem to provide any way of getting at the > FinderInfo of a directory, because GetFInfo/SetFInfo only > work on files, and access to the finderInfo field of the > FSCatalogInfo structure hasn't been implemented. 
> > I have come up with a patch to _Filemodule.c to remedy > this, but patching this file directly probably isn't the > right thing to do, because it seems to have been generated > automatically using bgen. Unfortunately I don't know > enough about bgen to fix this properly. Greg, there's an SF bug for this one: #706585. If you could attach your patch to this one I'll do the magic to work it around to bgen. -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From Jack.Jansen at cwi.nl Mon Oct 13 06:28:43 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Mon Oct 13 06:28:06 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/Tools/IDE PyConsole.py, 1.17, 1.18 In-Reply-To: Message-ID: <0560A63A-FD68-11D7-A415-0030655234CE@cwi.nl> Just, could you backport this to the 2.3 maintenance branch too? Actually, that may be the only place where IDE bug fixes need to go, I hope we have something new by the time 2.4 comes out... On Sunday, October 12, 2003, at 09:27 PM, jvr@users.sourceforge.net wrote: > Update of /cvsroot/python/python/dist/src/Mac/Tools/IDE > In directory sc8-pr-cvs1:/tmp/cvs-serv4435 > > Modified Files: > PyConsole.py > Log Message: > fix for bug [819860]: make sure the buffer gets emptied, even if > WEInsert() fails > > Index: PyConsole.py > =================================================================== > RCS file: /cvsroot/python/python/dist/src/Mac/Tools/IDE/PyConsole.py,v > retrieving revision 1.17 > retrieving revision 1.18 > diff -C2 -d -r1.17 -r1.18 > *** PyConsole.py 9 May 2003 11:47:23 -0000 1.17 > --- PyConsole.py 12 Oct 2003 19:27:24 -0000 1.18 > *************** > *** 128,135 **** > stuff = string.join(stuff, '\r') > self.setselection_at_end() > ! 
self.ted.WEInsert(stuff, None, None) > selstart, selend = self.getselection() > self._inputstart = selstart > - self._buf = "" > self.ted.WEClearUndo() > self.updatescrollbars() > --- 128,137 ---- > stuff = string.join(stuff, '\r') > self.setselection_at_end() > ! try: > ! self.ted.WEInsert(stuff, None, None) > ! finally: > ! self._buf = "" > selstart, selend = self.getselection() > self._inputstart = selstart > self.ted.WEClearUndo() > self.updatescrollbars() > *************** > *** 330,335 **** > self.w.outputtext.setselection(end, end) > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 0) > ! self.w.outputtext.ted.WEInsert(stuff, None, None) > ! self._buf = "" > self.w.outputtext.updatescrollbars() > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 1) > --- 332,339 ---- > self.w.outputtext.setselection(end, end) > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 0) > ! try: > ! self.w.outputtext.ted.WEInsert(stuff, None, None) > ! finally: > ! self._buf = "" > self.w.outputtext.updatescrollbars() > self.w.outputtext.ted.WEFeatureFlag(WASTEconst.weFReadOnly, 1) > > > > _______________________________________________ > Python-checkins mailing list > Python-checkins@python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From skip at pobox.com Mon Oct 13 15:10:50 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 15:11:00 2003 Subject: [Python-Dev] Re: Weekly Python Bug/Patch Summary In-Reply-To: References: <200310121200.h9CC0XC6010757@manatee.mojam.com> Message-ID: <16266.63674.958495.69851@montanaro.dyndns.org> >> Bug/Patch Summary >> ----------------- >> >> 542 open / 4234 total bugs (+13) >> 209 open / 2411 total patches (+3) >> >> New Bugs (15 new) >> Closed Bugs Terry> Something hiccuped. About 45 are listed. If these were really Terry> closed, then net change would be about -30. 
Spot check of about Terry> 5 showed still open. Thanks, I'll take a look at it when I get a chance. Skip From raymond.hettinger at verizon.net Mon Oct 13 15:34:15 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Mon Oct 13 15:34:57 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> For Py2.4, I propose adding an optional list.sort() argument to support the decorate-sort-undecorate pattern. The current, pure Python approach to DSU is pure arcana. It is obscure enough and cumbersome enough that cmpfunc() tends to get used instead. Built-in C support for DSU requires much less skill to use, results in more readable code, and runs faster. Raymond Hettinger ------ Concept demonstration ------------------ def sort(self, cmpfunc=None, decorator=None): """Show how list.sort() could support a decorating function""" args = () if cmpfunc is not None: args = (cmpfunc,) if decorator is None: self.sort(*args) else: aux = zip(map(decorator, self), self) # Decorate aux.sort(*args) self[:] = list(zip(*aux)[1]) # Un-decorate a = 'the Quick brown Fox jumped Over the Lazy Dog'.split() sort(a) # the no argument form is unchanged print a, 'Normal sort' sort(a, lambda x,y: -cmp(x,y)) # old code still works without change print a, 'Reverse sort' sort(a, decorator=str.lower) # the new way is fast, clean, and readable print a, 'Lowercase sort' # The decorator form works especially well with mappings so that database # keys can be sorted by any field. ages = dict(john=5, amy=3, andrea=32, henry=12) names = ages.keys() location = dict(john='alaska', amy='spain', andrea='peru', henry='iowa') sort(names) print names, '<-- by name' sort(names, decorator=ages.__getitem__) print names, '<-- by age' sort(names, decorator=location.__getitem__) print names, '<-- by location' -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031013/a4413f54/attachment.html From Paul.Moore at atosorigin.com Mon Oct 13 15:49:35 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 13 15:50:20 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > For Py2.4, I propose adding an optional list.sort() argument to > support the decorate-sort-undecorate pattern. [...] > def sort(self, cmpfunc=None, decorator=None): I like it! But "decorator" isn't a good name - it describes how it's being done, rather than what is being done. How about "key"? After all, "key=str.lower" reads more or less as "the key is the lowercase equivalent of the value", and "key=ages.__getitem__" reads "get the key by getting the appropriate item from the ages dictionary". But names apart, it's nice. It lets people use the builtin, without going for the performance-reducing comparison function... Paul. From Patrick.Maupin at silabs.com Mon Oct 13 16:08:13 2003 From: Patrick.Maupin at silabs.com (Patrick Maupin) Date: Mon Oct 13 16:08:45 2003 Subject: [Python-Dev] Python and Coercion Message-ID: An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031013/a4ce9338/attachment.html From guido at python.org Mon Oct 13 16:35:59 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 16:36:27 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 20:49:35 BST." 
<16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310132035.h9DKZxv22377@12-236-54-216.client.attbi.com> > From: Raymond Hettinger [mailto:raymond.hettinger@verizon.net] > > > For Py2.4, I propose adding an optional list.sort() argument to > > support the decorate-sort-undecorate pattern. > [...] > > > def sort(self, cmpfunc=None, decorator=None): [Paul Moore] > I like it! But "decorator" isn't a good name - it describes how it's > being done, rather than what is being done. How about "key"? After > all, "key=str.lower" reads more or less as "the key is the lowercase > equivalent of the value", and "key=ages.__getitem__" reads "get the > key by getting the appropriate item from the ages dictionary". Agreed, that was my first thought too. > But names apart, it's nice. It lets people use the builtin, without > going for the performance-reducing comparison function... +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From Patrick.Maupin at silabs.com Mon Oct 13 16:55:23 2003 From: Patrick.Maupin at silabs.com (Patrick Maupin) Date: Mon Oct 13 16:55:59 2003 Subject: [Python-Dev] Python and Coercion Message-ID: Sorry about the HTML. I _hate_ the way they configure things at work -- I have to remember to force text, and can't get around the stoopid legal disclaimer at the bottom. I'll try to remember not to post from here again. Best regards, Pat -----Original Message----- From: Guido van Rossum [mailto:guido@esi.elementalsecurity.com] Sent: Monday, October 13, 2003 3:39 PM To: Patrick Maupin Cc: pyython-dev@python.org Subject: RE: [Python-Dev] Python and Coercion This is a feature. It is mentioned in passing in http://www.python.org/2.2.2/descrintro.html : """Note that while in general operator overloading works just as for classic classes, there are some differences. 
(The biggest one is the lack of support for __coerce__; new-style classes should always use the new-style numeric API, which passes the other operand uncoerced to the __add__ and __radd__ methods, etc.) """ PS Next time don't post HTML. --Guido van Rossum (home page: http://www.python.org/~guido) -----Original Message----- From: python-dev-bounces@python.org [mailto:python-dev-bounces@python.org] On Behalf Of Patrick Maupin Sent: Monday, October 13, 2003 1:08 PM To: python-dev@python.org Subject: [Python-Dev] Python and Coercion Dear developers: __coerce__ does not seem to work in new-style classes, e.g. class foo: def __int__(self): return 1 def __coerce__(self,other): return int(self), int(other) x = foo() print 1+x works fine, but if foo is derived from object, it fails with: TypeError: unsupported operand type(s) for +: 'int' and 'foo' After finding this difference, I could not figure out if this was an interpreter error or a documentation error. http://www.python.org/doc/current/ref/coercion-rules.html states that: "In Python 3.0, coercion will not be supported." so I thought maybe this was the first round of removing this support. I googled around for awhile trying to find supporting documentation for this -- it appears it might have to do with PEP 228, but I'm not really sure, so I was hoping someone could point me at a reference point for this statement. Regards, Pat Maupin This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
From pnorvig at google.com Mon Oct 13 17:17:46 2003 From: pnorvig at google.com (Peter Norvig) Date: Mon Oct 13 17:17:50 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 References: Message-ID: I like "sort" better than "decorator"; I would also like "by", as in sort(names, by=ages.__getitem__). I would also advocate an optional reverse=False argument, so that result = sort(names, reverse=True) is equivalent to result = sort(names) result.reverse() > Date: Mon, 13 Oct 2003 15:34:15 -0400 > From: "Raymond Hettinger" > Subject: [Python-Dev] decorate-sort-undecorate > To: > Message-ID: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> > Content-Type: text/plain; charset="us-ascii" > > For Py2.4, I propose adding an optional list.sort() argument to support > the decorate-sort-undecorate pattern. > > The current, pure Python approach to DSU is pure arcana. It is obscure > enough and cumbersome enough that cmpfunc() tends to get used instead. > > Built-in C support for DSU requires much less skill to use, results in > more readable code, and runs faster. > > Raymond Hettinger > From skip at pobox.com Mon Oct 13 17:22:59 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 17:23:17 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> Message-ID: <16267.6067.441841.822258@montanaro.dyndns.org> >> def sort(self, cmpfunc=None, decorator=None): Paul> I like it! But "decorator" isn't a good name - it describes how Paul> it's being done, rather than what is being done. How about "key"? How about keyfunc?
"keyfunc=str.lower" reads to me more like "generate sort keys using str.lower". "key" doesn't suggest (to me, at least) its value should be a function. Skip From ianb at colorstudy.com Mon Oct 13 17:29:01 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Mon Oct 13 17:29:06 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> On Monday, October 13, 2003, at 02:34 PM, Raymond Hettinger wrote: > For Py2.4, I propose adding an optional list.sort() argument to > support the decorate-sort-undecorate pattern. I've seen proposals for an extension to list comprehension, which would be quite nice: [s for s in lst sortby s.lower()] It reads nicely, and avoids lambdas and tiny helper functions. Also handles the sort-returns-None criticism. But it adds syntax. And since it's not an in-place sort it won't perform as well (but probably better than the decorator idiom anyway...?) -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From guido at python.org Mon Oct 13 18:04:28 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:04:36 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: Your message of "Mon, 13 Oct 2003 14:17:46 PDT." References: Message-ID: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> > I like "sort" better than "decorator"; I would also like "by", as in > sort(names, by=ages.__getitem__). > > I would also advocate an optional reverse=False argument, so that > > result = sort(names, reverse=True) > > is equivalent to > > result = sort(names) > result.reverse() While we're at it, +1. 
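[Editor's note: the key/reverse combination endorsed above is what later shipped in Python 2.4. A minimal sketch of the proposed calling convention, reusing the ages mapping from Raymond's demonstration:]

```python
ages = dict(john=5, amy=3, andrea=32, henry=12)

# Proposed one-step form: key= replaces the decorate step,
# reverse= replaces the trailing .reverse() call.
names = list(ages)
names.sort(key=ages.__getitem__, reverse=True)

# The equivalent two-step form from the message above.
names2 = list(ages)
names2.sort(key=ages.__getitem__)
names2.reverse()

assert names == names2 == ['andrea', 'henry', 'john', 'amy']
```

[With all keys distinct the two spellings agree; with ties they can differ, since a stable sort with reverse=True keeps equal keys in their original order while .reverse() inverts it.]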
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 13 18:05:32 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:05:54 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 16:22:59 CDT." <16267.6067.441841.822258@montanaro.dyndns.org> References: <16E1010E4581B049ABC51D4975CEDB8803060CCD@UKDCX001.uk.int.atosorigin.com> <16267.6067.441841.822258@montanaro.dyndns.org> Message-ID: <200310132205.h9DM5Wr22484@12-236-54-216.client.attbi.com> > Paul> I like it! But "decorator" isn't a good name - it describes how > Paul> it's being done, rather than what is being done. How about "key"? [Skip] > How about keyfunc? "keyfunc=str.lower" reads to me more like "generate sort > keys using str.lower". "key" doesn't suggest (to me, at least) its value > should be a function. But remember that a parameter name doesn't need to be documentation. It just needs to be a memory-jogger. I think key is fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 13 18:07:48 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 18:08:02 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Mon, 13 Oct 2003 16:29:01 CDT." <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> References: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com> Message-ID: <200310132207.h9DM7m022506@12-236-54-216.client.attbi.com> > I've seen proposals for an extension to list comprehension, which would > be quite nice: > > [s for s in lst sortby s.lower()] > > It reads nicely, and avoids lambdas and tiny helper functions. Also > handles the sort-returns-None criticism. But it adds syntax. And > since it's not an in-place sort it won't perform as well (but probably > better than the decorator idiom anyway...?) This has a very low probability to be accepted. 
It suffers IMO from the "SQL syndrome": having reserved words to the language that are only meaningful in a very specific syntax yet are reserved everywhere. Until we have a general way to avoid that, I'd rather not go that route. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Oct 13 18:51:52 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 13 18:52:05 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com> Message-ID: <16267.11400.169738.924956@montanaro.dyndns.org> >> I would also advocate an optional reverse=False argument, so that >> >> result = sort(names, reverse=True) >> >> is equivalent to >> >> result = sort(names) >> result.reverse() Guido> While we're at it, +1. direction=[ascending|descending] ? Just a thought. Skip From fincher.8 at osu.edu Mon Oct 13 19:59:44 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Mon Oct 13 19:01:25 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25 In-Reply-To: References: Message-ID: <200310131959.44982.fincher.8@osu.edu> On Monday 13 October 2003 05:17 pm, Peter Norvig wrote: > I like "sort" better than "decorator"; I would also like "by", as in > sort(names, by=ages.__getitem__). Coincidentally, my own decorate-sort-undecorate function is named "sortBy" :) So I'm +1 on naming the argument "by". > I would also advocate an optional reverse=False argument, so that > > result = sort(names, reverse=True) > > is equivalent to > > result = sort(names) > result.reverse() I like it. Jeremy From tim.one at comcast.net Mon Oct 13 19:14:17 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 13 19:14:31 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... 
> # The decorator form works especially well with mappings so that > # database keys can be sorted by any field. Unfortunately, the case of sorting database records by keys is one where it's most important to use a different form of DSU to avoid outrageous runtime. If you sort [(key1, record1), (key2, record2), ...] then whenever two keys compare equal, tuple comparison goes on to compare the records too, and general-object comparison can be arbitrarily expensive. The right way to do this with DSU now is to sort: [(key1, 0, record1), (key2, 1, record2), ...] instead. Then ties on the keys (which are very common when sorting a database) are always resolved quickly by comparing two distinct ints. This is the same way used to force a stable sort in pre-2.3 Python, and remains the best thing for non-experts to do by default. Indeed, if it's not done, then despite that the 2.3 sort *is* stable, sorting on [(key1, record1), (key2, record2), ...] is *not* stable wrt just sorting on the keys. DSU actually changes the natural result unless the original indices are inserted after "the keys". Alas, in decorator=function syntax, there's no clear way in general to write function so that it knows the index of the object passed to it. Internally, I suppose the sort routine could sort a temp list consisting of only the keys, mirroring the relative data movement in the real list too. That should blow the cache all to hell . From pje at telecommunity.com Mon Oct 13 19:25:57 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 13 19:27:26 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: <001f01c391c0$fcdb88a0$e841fea9@oemcomputer> Message-ID: <5.1.1.6.0.20031013192501.032177d0@telecommunity.com> At 07:14 PM 10/13/03 -0400, Tim Peters wrote: >Alas, in decorator=function syntax, there's no clear way in general to write >function so that it knows the index of the object passed to it. Why not just have the decoration be (key,index,value) then? 
Why does the key function need the index? From tim.one at comcast.net Mon Oct 13 19:45:40 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 13 19:45:57 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <5.1.1.6.0.20031013192501.032177d0@telecommunity.com> Message-ID: [Phillip J. Eby] > Why not just have the decoration be (key,index,value) then? Why does > the key function need the index? It doesn't if indices are synthesized by magic under the covers. Then it starts acting more like Zope (we know it does *something*, but it's not clear what ). If you want to pay that expense now, you do so explicitly, and nothing about it is hidden. From Scott.Daniels at Acm.Org Mon Oct 13 20:40:35 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon Oct 13 20:40:51 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F8B4603.4000701@Acm.Org> Raymond Hettinger wrote: >For Py2.4, I propose adding an optional list.sort() argument to >support the decorate-sort-undecorate pattern. > ... >def sort(self, cmpfunc=None, key=None): > > ... > if key is None: > self.sort(*args) > else: > aux = zip(map(key, self), self) # Decorate > aux.sort(*args) > self[:] = list(zip(*aux)[1]) # Un-decorate > If the argument is for simplicity, do we need to make this stable? Will warning about incomparables be sufficient? I'm thinking about: data = [(1-1j), -2, 1, 1j] data.sort(key=abs) Or would we prefer the code to end: else: # Decorate aux = [(key(el), nbr, el) for nbr, el in enumerate(self)] aux.sort(*args) self[:] = list(zip(*aux)[2]) # Un-decorate I think the answer comes down to performance vs. law of least surprise. I suppose I am slightly in favor of throwing in the stabilizing count (fewer explanations; those who need speed can do it themselves. 
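[Editor's note: Scott's complex-number data makes the stability question concrete. Written out, the index-stabilized decoration he sketches is:]

```python
data = [(1-1j), -2, 1, 1j]

# Decorate with (key, index, value): ties on abs() are settled by the
# index, so the complex values themselves are never compared.
aux = [(abs(el), nbr, el) for nbr, el in enumerate(data)]
aux.sort()
result = [el for _, _, el in aux]  # un-decorate

assert result == [1, 1j, (1-1j), -2]
```

[Without the stabilizing index, the sort would reach `1 < 1j` when the two abs() keys tie and raise TypeError, since complex numbers define no ordering.]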
-Scott David Daniels Scott.Daniels@Acm.Org From greg at cosc.canterbury.ac.nz Mon Oct 13 20:45:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 13 20:46:23 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310132205.h9DM5Wr22484@12-236-54-216.client.attbi.com> Message-ID: <200310140045.h9E0jL609446@oma.cosc.canterbury.ac.nz> Guido: > But remember that a parameter name doesn't need to be documentation. > It just needs to be a memory-jogger. I think key is fine. +1 on "key" from me, too. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 13 20:48:47 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 13 20:50:10 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310132207.h9DM7m022506@12-236-54-216.client.attbi.com> Message-ID: <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz> Guido: > > [s for s in lst sortby s.lower()] > > It suffers IMO from the "SQL syndrome": having reserved words to the > language that are only meaningful in a very specific syntax yet are > reserved everywhere. It could probably be a non-reserved keyword in this case. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Mon Oct 13 20:56:34 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 13 20:56:43 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 13:48:47 +1300." 
             <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz>
References: <200310140048.h9E0mlC09462@oma.cosc.canterbury.ac.nz>
Message-ID: <200310140056.h9E0uYf22690@12-236-54-216.client.attbi.com>

> Guido:
> > > [s for s in lst sortby s.lower()]
> >
> > It suffers IMO from the "SQL syndrome": having reserved words to the
> > language that are only meaningful in a very specific syntax yet are
> > reserved everywhere.

[Greg]
> It could probably be a non-reserved keyword in this case.

Yes, but that would be error-prone, because to the parser it would have
to look like an expression followed by an identifier followed by another
expression. Many typos in the first expression can then turn this into a
valid but different expression.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Mon Oct 13 21:00:58 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Oct 13 21:01:12 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <43148AA1-FDC4-11D7-94FE-000393C2D67E@colorstudy.com>
Message-ID: <200310140100.h9E10wq09475@oma.cosc.canterbury.ac.nz>

Ian Bicking :
> [s for s in lst sortby s.lower()]
>
> It reads nicely, and avoids lambdas and tiny helper functions. Also
> handles the sort-returns-None criticism.

But it adds syntax. And makes the definition of list comprehension
semantics in terms of an equivalent for-loop nest much less elegant.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From mike at nospam.com  Mon Oct 13 21:34:55 2003
From: mike at nospam.com (Mike Rovner)
Date: Mon Oct 13 21:40:26 2003
Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30
References:
Message-ID:

Brett C. wrote:
> We want *you* to help with the war on SF patch items
> ----------------------------------------------------
> Someone wanted to help but wasn't sure how they could. Martin v.
> Loewis sent an email listing common things anyone can do to help with
> dealing with the patch items on SourceForge_. The email can be found
> at
> http://mail.python.org/pipermail/python-dev/2003-September/038253.html

24 Sep 2003 09:26:12 +0200 martin v.loewis.de wrote:
>> Aahz pythoncraft.com> writes:
>
> Also, try to classify the patch somehow, indicating what most likely
> the problem is for the patch not being reviewed/accepted:
>
>> - the patch might be incomplete. Ping the submitter. If the submitter
>>   is incomplete, either complete it yourself, or suggest rejection
>>   of the patch.

All I can do as an SF registered user is add a comment to an existing
patch. I can't extend it, submit extra files, i.e. "complete" it.

Please clarify the preferable way to "help with the war on SF patch
items".

Regards,
Mike

From pje at telecommunity.com  Mon Oct 13 21:53:33 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Oct 13 21:53:27 2003
Subject: [Python-Dev] decorate-sort-undecorate
Message-ID: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>

At 07:45 PM 10/13/03 -0400, Tim Peters wrote:
>[Phillip J. Eby]
> > Why not just have the decoration be (key,index,value) then? Why does
> > the key function need the index?
>
>It doesn't if indices are synthesized by magic under the covers.

It's not magic if that's the defined behavior, e.g.:

"""Specifying a 'key' callable causes items' sort order to be determined
by comparing 'key(item)' in place of the item being compared. In the
event that 'key()' returns an equal value for two different items, the
items' order in the original list is preserved. The 'key' callable is
called only once for each item in the list, so in general sorting with
'key' is faster than sorting with 'cmpfunc'.
It requires more memory, however, because it creates a temporary list of
'(key(item),original_item_position,item)' tuples in order to perform the
sort."""

>If you want to pay that expense now, you do so
>explicitly, and nothing about it is hidden.

What expense? The extra memory overhead for the index? I suppose so. But
if you *don't* want that behavior, you can still DSU manually, no?

From barry at python.org  Mon Oct 13 22:04:51 2003
From: barry at python.org (Barry Warsaw)
Date: Mon Oct 13 22:04:59 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
References: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
Message-ID: <1066097091.19072.11.camel@geddy>

On Mon, 2003-10-13 at 21:53, Phillip J. Eby wrote:
> """Specifying a 'key' callable causes items' sort order to be determined by
> comparing 'key(item)' in place of the item being compared.

Using this explanation, "key" doesn't seem right to me. I can't think of
anything that I like better though, so I guess I just won't send this
email after all...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20031013/e475a006/attachment.bin

From tim.one at comcast.net  Mon Oct 13 22:25:51 2003
From: tim.one at comcast.net (Tim Peters)
Date: Mon Oct 13 22:25:53 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <5.1.0.14.0.20031013215332.02722cb0@mail.telecommunity.com>
Message-ID:

[Phillip J. Eby]
> What expense? The extra memory overhead for the index? I suppose
> so.

Yes, that is an expense. Partly because of the extra memory space in
len(list) temp tuples, but mostly because space allocated for integer
objects is immortal.
That is,

    range(1000000)

grabs space for 1000000 distinct integer objects that's never reused for
any other kind of object, and so does stuffing a million distinct int
objects into a temp DSU list. Note that this is very different from
doing

    for i in xrange(1000000):

which allocates space for only three integer objects (1000000, the
current value of i, and the preceding value of i), and keeps reusing it.

A cleverer implementation might be able to avoid permanently ratcheting
the space devoted to int objects.

> But if you *don't* want that behavior, you can still DSU manually, no?

I hope so .

From bac at OCF.Berkeley.EDU  Mon Oct 13 22:26:19 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Oct 13 22:26:28 2003
Subject: [Python-Dev] Draft of an essay on Python development (and how to help)
Message-ID: <3F8B5ECB.4030207@ocf.berkeley.edu>

I finally got around to proof-reading a guide to Python development that
I wrote based on a presentation I gave to the SF Bay Area Python user
group. I would like to get it checked by everybody to make sure it is to
everyone's liking.

The main goal of this doc is twofold: 1) to have something to point
people to when they ask how they can help or get started on python-dev
(maybe even be referenced in the welcome email) and 2) to act as a basis
for a presentation at PyCon 2004 covering how Python is developed. In
other words I want this to be good enough to put up on python.org.

Since correcting the summaries works well by pasting into an email, I am
going to do that here as well. Comment on any errors in grammar,
spelling, etc. If you think an important point is missing, please say
so. Do realize, though, this is not meant to replace the dev FAQ. I
specifically wrote it like an essay so that people can just read it from
beginning to end. If there is some specific point that people need to be
pointed to it should probably go into the dev FAQ rather than here.
I would like to view this as a gentle intro to python-dev's workings to
help lower the fear factor.

OK, enough explanations. Here is the doc:

------------------------------------

How Python is Developed
+++++++++++++++++++++++

Introduction
============

Software does not make itself. Code does not spontaneously come from the
ether of the universe. Python_ is no exception to this rule. Since
Python made its public debut back in 1991, people beyond the BDFL
(Benevolent Dictator For Life, `Guido van Rossum`_) have helped
contribute time and energy to making Python what it is today: a
powerful, simple programming language available to all. But it has not
been a random process of people doing whatever they wanted to Python.
Over the years a process for the development of Python has emerged
within the group that heads Python's growth and maintenance:
`python-dev`_. This document is an attempt to write this process down in
hopes of lowering any barriers possibly preventing people from
contributing to the development of Python.

.. _Python: http://www.python.org/
.. _Guido van Rossum: http://www.python.org/~guido/
.. _python-dev: http://mail.python.org/mailman/listinfo/python-dev

Tools Used
==========

To help facilitate the development of Python, certain tools are used.
Beyond the obvious ones such as a text editor and email client, two
tools are very pervasive in the development process.

SourceForge_ is used by python-dev to keep track of feature requests,
reported bugs, and contributed patches. A detailed explanation on how to
use SourceForge is covered later in `General SourceForge Guidelines`_.

CVS_ is a networked file versioning system that stores all of the files
that make up Python. It allows the developers to have a single
repository for the files along with being able to keep track of any and
all changes to every file. The basic commands and uses can be found in
the `dev FAQ`_ along with a multitude of tutorials spread across the
web.

.. _SourceForge: http://sourceforge.net/projects/python/
.. _CVS: http://www.cvshome.org/
.. _dev FAQ: http://www.python.org/dev/devfaq.html

Communicating
=============

Python development is not just programming. It requires a great deal of
communication between people. This communication is not just between the
members of python-dev; communication within the greater Python community
also helps with development. Several mailing lists and newsgroups are
used to help organize all of these discussions.

In terms of Python development, the primary location for communication
is the `python-dev`_ mailing list. This is where the members of
python-dev hash out ideas and iron out issues. It is an open list;
anyone can subscribe to the mailing list. While the discussion can get
quite technical, it is not at all out of reach for even a novice and
thus should not discourage anyone from joining the list. Please realize,
though, this list is **only** for the discussion of the development of
Python; all other questions should be directed somewhere else, such as
`python-list`_.

When the greater Python community is involved in a discussion, it always
ends up on `python-list`_. This mailing list is a gateway to the
newsgroup `comp.lang.python`_. This is also a good place to go when you
have a question about Python that does not pertain to the actual
development of the language.

Using CVS_ allows the development team to know who made a change to a
file and when they made their change. But unless one wants to
continuously update their local checkout of the repository, the best way
to stay on top of changes to the repository is to subscribe to
`Python-checkins`_. This list sends out an email for each and every
change to a file in Python. This list can generate a large amount of
traffic since even changing a typo in some text will trigger an email to
be sent out. But if you wish to be kept abreast of all changes to Python
then this is a good way to do so.
The Patches_ mailing list sends out an email for all changes to patch
items on SourceForge_. This list, just like Python-checkins, can
generate a large amount of email traffic. It is in general useful to
people who wish to help out with the development of Python by knowing
about all new submitted patches as well as any new developments on
preexisting ones.

`Python-bugs-list`_ functions much like the Patches mailing list except
it is for bug items on SourceForge. If you find yourself wanting to help
to close and remove bugs in Python this is the right list to subscribe
to if you can handle the volume of email.

.. _python-list: http://mail.python.org/mailman/listinfo/python-list
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _Patches: http://mail.python.org/mailman/listinfo/patches
.. _Python-bugs-list: http://mail.python.org/mailman/listinfo/python-bugs-list

The Actual Development
======================

Developing Python is not all just conversations about neat new language
features (although those neat conversations do come up and there is a
process to it). Developing Python also involves maintaining it by
eliminating discovered bugs, adding and changing features, and various
other jobs that are not necessarily glamorous but are just as important
to the language as anything else.

General SourceForge Guidelines
------------------------------

Since a good amount of Python development involves using SourceForge_,
it is important to follow some guidelines when handling a tracker item
(bug, patch, etc.). Probably one of the most important things you can do
is make sure to set the various options in a new tracker item properly.
The submitter should make sure that the Data Type, Category, and Group
are all set to reasonable values. The remaining values (Assigned To,
Status, and Resolution) should in general be left to Python developers
to set.
The exception to this rule is when you want to retract a patch; then
"close" the patch by setting Status to "closed" and Resolution to
whatever is appropriate.

Do a cursory check to make sure that whatever you are submitting was not
previously submitted by someone else. Duplication just uses up valuable
time.

And **please** do not post feature requests, bug reports, or patches to
the python-dev mailing list. If you do you will be instructed to create
an appropriate SourceForge tracker item. When in doubt as to whether you
should bring something to python-dev's attention, you can always ask on
`comp.lang.python`_; Python developers actively participate there and
will move the conversation over if it is deemed reasonable.

Feature Requests
----------------

`Feature requests`_ are for features that you wish Python had but you
have no plans on actually implementing by writing a patch. On occasion
people do go through the feature requests (also called RFCs on
SourceForge) to see if there is anything there that they think should be
implemented and actually do the implementation. But in general do not
expect something put here to be implemented without some participation
on your part.

The best way to get something implemented is to campaign for it in the
greater Python community. `comp.lang.python`_ is the best place to
accomplish this. Post to the newsgroup with your idea and see if you can
either get support or convince someone to implement it. It might even
end up being added to `PEP 42`_ so that the idea does not get lost in
the noise as time passes.

.. _feature requests: http://sourceforge.net/tracker/?group_id=5470&atid=355470
.. _PEP 42: http://www.python.org/peps/pep-0042.html

Bug Reports
-----------

Think you found a bug? Then submit a `bug report`_ on SourceForge. Make
sure you clearly specify what version of Python you are using, what OS,
and under what conditions the bug was triggered.
The more information you can give, the faster the bug can be fixed,
since time will not be wasted requesting more information from you.

.. _bug report: http://sourceforge.net/tracker/?group_id=5470&atid=105470

Patches
-------

Create a patch_ tracker item on SourceForge for any code you think
should be applied to the Python CVS tree. For practically any change to
Python's functionality the documentation and testing suite will need to
be changed as well. Doing this in the first place speeds things up
considerably.

Please make sure your patch is against the CVS repository. If you don't
know how to use it (basics are covered in the `dev FAQ`_), then make
sure you specify what version of Python you made your patch against.

In terms of coding standards, `PEP 8`_ covers Python code while `PEP 7`_
covers C code. Always try to maximize your code reuse; it makes
maintenance much easier.

For C code make sure to limit yourself to ANSI C code as much as
possible. If you must use non-ANSI C code then see if what you need is
checked for by looking in pyconfig.h. You can also look in
Include/pyport.h for more helpful C code. If what you need is still not
there but it is in general available, then add a check in configure.in
for it (don't forget to run autoreconf to make the changes take effect).
And if that *still* doesn't fit your needs then code up a solution
yourself. The reason for all of this is to limit the dependence on
external code that might not be available for all OSs that Python runs
on.

Be aware of intellectual property when handling patches. Any code with
no copyright will fall under the copyright of the `Python Software
Foundation`_. If you have no qualms with that, wonderful; this is the
best solution for Python. But if you feel the need to include a
copyright then make sure that it is compatible with the copyright used
on Python (i.e., BSD-style). The best solution, though, is to sign the
copyright over to the Python Software Foundation.

.. _patch: http://sourceforge.net/tracker/?group_id=5470&atid=305470
.. _dev FAQ: http://www.python.org/dev/devfaq.html
.. _PEP 7: http://www.python.org/peps/pep-0007.html
.. _PEP 8: http://www.python.org/peps/pep-0008.html
.. _Python Software Foundation: http://www.python.org/psf/

Changing the Language
=====================

You understand how to file a patch. You think you have a great idea on
how Python should change. You are ready to write code for your change.
Great, but you need to realize that certain things must be done for a
change to be accepted. Changes fall into two categories: changes to the
standard library (referred to as the "stdlib") and changes to the
language proper.

Changes to the stdlib
---------------------

Changes to the stdlib can consist of adding functionality or changing
existing functionality. Adding minor functionality (such as a new
function or method) requires convincing a member of python-dev that the
addition of code caused by implementing the feature is worth it. A big
addition such as a module tends to require more support than just a
single member of python-dev. As always, getting community support for
your addition is a good idea. With all additions, make sure to write up
documentation for your new functionality. Also make sure that proper
tests are added to the testing suite.

If you want to add a module, be prepared to be called upon for any bug
fixes or feature requests for that module. Getting a module added to the
stdlib makes you by default its maintainer. If you can't take that level
of responsibility and commitment and cannot get someone else to take it
on for you then your battle will be very difficult; when there is not a
specific maintainer of code python-dev takes responsibility and thus
your code must be useful to them or else they will reject the module.

Changing existing functionality can be difficult to do if it breaks
backwards-compatibility.
If your code will break existing code, you must provide a legitimate
reason why making the code act in a non-compatible way is better than
the status quo. This requires python-dev as a whole to agree to the
change.

Changing the Language Proper
----------------------------

Changing Python the language is taken **very** seriously. Python is
often heralded for its simplicity and cleanliness. Any additions to the
language must continue this tradition and view. Thus any changes must go
through a long process.

First, you must write a PEP_ (Python Enhancement Proposal). This is
basically just a document that explains what you want, why you want it,
what could be bad about the change, and how you plan on implementing the
change. It is best to get feedback on PEPs on `comp.lang.python`_ and
from python-dev. Once you feel the document is ready you can request a
PEP number and have it added to the official list of PEPs in `PEP 0`_.

Once you have a PEP, you must then convince python-dev and the BDFL that
your change is worth it. Expect to be bombarded with questions and
counter-arguments. It can drag on for over a month, easy. If you are not
up for that level of discussion then do not bother with trying to get
your change in.

If you manage to convince a majority of python-dev and the BDFL (or most
of python-dev; that can lead to the BDFL changing his mind) then your
change can be applied. As with all new code make sure you also have
appropriate documentation patches along with tests for the new
functionality.

.. _PEP: http://www.python.org/peps/pep-0001.html
.. _PEP 0: http://www.python.org/peps/pep-0000.html

Helping Out
===========

Many people say they wish they could help out with the development of
Python but feel they are not up to writing code. There are plenty of
things one can do, though, that do not require you to write code.
Regardless of your coding abilities, there is something for everyone to
help with.
For feature requests, adding a comment about what you think is helpful.
State whether or not you would like to see the feature. You can also
volunteer to write the code to implement the feature if you feel up to
it.

For bugs, stating whether or not you can reproduce the bug yourself can
be extremely helpful. If you can write a fix for the bug that is very
helpful as well.

For patches, apply the patch and run the testing suite. You can do a
code review on the patch to make sure that it is good, clean code. Help
add to the patch if it is missing documentation patches or needed
regression tests. If the patch adds a new feature, comment on whether
you think it is worth adding. If it changes functionality then comment
on whether you think it might break code; if it does, say whether you
think it is worth the cost of breaking existing code.

For language changes, make your voice heard. Comment about any PEPs on
`comp.lang.python`_ so that the general opinion of the community can be
assessed.

If there is nothing specific you want to work on but you still feel like
contributing, there are several things you can do. The documentation can
always use fleshing out. Adding more tests to the testing suite is
always useful. Contribute to discussions on python-dev or
`comp.lang.python`_. Just helping out in the community by spreading the
word about Python or helping someone with a question is helpful.

If you really want to get knee-deep in all of this, join python-dev.
Once you have been actively participating for a while and are generally
known on python-dev you can request to have checkin rights on the CVS
tree. It is a great way to learn how to work in a large, distributed
group along with how to write great code.

And if all else fails, give money: the `Python Software Foundation`_ is
a non-profit organization that accepts donations that are tax-deductible
in the United States.
The funds are used for various things, from lawyers for handling the
intellectual property of Python to funding PyCon_. But the PSF could do
a lot more if they had the funds. Every dollar does help, so please
contribute if you can.

.. _PyCon: http://www.python.org/pycon/

Conclusion
==========

If you get any message from this document, it should be that *anyone*
can help Python. All help is greatly appreciated and keeps the language
the wonderful piece of software that it is.

From bac at OCF.Berkeley.EDU  Mon Oct 13 22:31:10 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Oct 13 22:31:24 2003
Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30
In-Reply-To:
References:
Message-ID: <3F8B5FEE.6010203@ocf.berkeley.edu>

Mike Rovner wrote:
> Brett C. wrote:
>
>>We want *you* to help with the war on SF patch items
>>----------------------------------------------------
>>Someone wanted to help but wasn't sure how they could. Martin v.
>>Loewis sent an email listing common things anyone can do to help with
>>dealing with the patch items on SourceForge_. The email can be found
>>at
>>http://mail.python.org/pipermail/python-dev/2003-September/038253.html
>
> 24 Sep 2003 09:26:12 +0200 martin v.loewis.de wrote:
>
>>>Aahz pythoncraft.com> writes:
>>
>>Also, try to classify the patch somehow, indicating what most likely
>>the problem is for the patch not being reviewed/accepted:
>>
>>>- the patch might be incomplete. Ping the submitter. If the submitter
>>>  is incomplete, either complete it yourself, or suggest rejection
>>>  of the patch.
>
> All I can do as an SF registered user is add a comment to an existing
> patch. I can't extend it, submit extra files, i.e. "complete" it.
>
> Please clarify the preferable way to "help with the war on SF patch
> items".

There is a lot you can do even if you can just comment. Apply the patch
and verify that it works for you (especially if it relies on OS-specific
code) and say what happens.
Comment on the cleanliness of the code. If it adds a new feature, state
whether you think it is a good addition or not. Make sure that
backwards-compatibility is not broken. And if it is, say whether you
think it is a good idea to break backwards-compatibility.

Doing *anything* to help the patch along is a great help since it allows
people who do have checkin abilities to spend less time double-checking
the patch and thus can get to more patches with what little time they
have available to work on Python.

-Brett

From guido at python.org  Mon Oct 13 23:31:58 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 13 23:32:22 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Mon, 13 Oct 2003 22:25:51 EDT."
References:
Message-ID: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>

> [Phillip J. Eby]
> > What expense? The extra memory overhead for the index? I suppose
> > so.
>
> Yes, that is an expense. Partly because of the extra memory space in
> len(list) temp tuples, but mostly because space allocated for integer
> objects is immortal. That is,
>
>     range(1000000)
>
> grabs space for 1000000 distinct integer objects that's never reused for any
> other kind of object, and so does stuffing a million distinct int objects
> into a temp DSU list. Note that this is very different from doing
>
>     for i in xrange(1000000):
>
> which allocates space for only three integer objects (1000000, the current
> value of i, and the preceding value of i), and keeps reusing it.
>
> A cleverer implementation might be able to avoid permanently ratcheting the
> space devoted to int objects.
>
> > But if you *don't* want that behavior, you can still DSU manually, no?
>
> I hope so .

After reading this exchange, I'm not sure I agree with Tim about the
importance of avoiding to compare the full records.
Certainly the *cost* of that comparison doesn't bother me: I expect it's
usually going to be a tuple or list of simple types like ints and
strings, and comparing those is pretty efficient.

Sure, there's a pattern that requires records with equal keys to remain
in the original order, but that seems an artefact of a pattern used only
for external sorts (where the sorted data is too large to fit in memory
-- Knuth Vol. III is full of algorithms for this, but they seem mostly
of historical importance in this age of Gigabyte internal memory). The
pattern is that if you want records sorted by zipcode and within each
zipcode sorted by name, you first sort by name and then do a stable sort
by zipcode. This was common in the days of external sorts. (External
sorts are still common in some application domains, of course, but I
doubt that you'd find much Python being used there for the main sort.)

But using Raymond's proposal, you can do that by specifying a tuple
consisting of zipcode and name as the key, as follows:

    myTable.sort(key = lambda rec: (rec.zipcode, rec.name))

This costs an extra tuple, but the values in the tuple are not new, so
it costs less space than adding the index int (not even counting the
effect of the immortality of ints). And tuples aren't immortal. (To be
honest, for tuples of length < 20, the first 2000 of a given size *are*
immortal, but that's a strictly fixed amount of wasted memory.)

Given that this solution isn't exactly rocket science, I think the
default behavior of Raymond's original proposal (making the whole record
the second sort key) is fine.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 13 23:33:37 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 13 23:33:45 2003
Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25
In-Reply-To: Your message of "Mon, 13 Oct 2003 17:51:52 CDT."
             <16267.11400.169738.924956@montanaro.dyndns.org>
References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com>
        <16267.11400.169738.924956@montanaro.dyndns.org>
Message-ID: <200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>

> >> I would also advocate an optional reverse=False argument, so that
> >>
> >> result = sort(names, reverse=True)
> >>
> >> is equivalent to
> >>
> >> result = sort(names)
> >> result.reverse()
>
> Guido> While we're at it, +1.

[Skip]
> direction=[ascending|descending]
>
> ? Just a thought.

But where would these constants be defined? Using direction='ascending'
feels ugly.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Mon Oct 13 23:44:19 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon Oct 13 23:44:36 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To:
Message-ID: <200310140344.h9E3iJF10129@oma.cosc.canterbury.ac.nz>

Tim Peters :
> mostly because space allocated for integer objects is immortal.

The implementation doesn't necessarily have to store the sequence
numbers as Python integers.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From mike at nospam.com  Tue Oct 14 00:21:07 2003
From: mike at nospam.com (Mike Rovner)
Date: Tue Oct 14 00:21:11 2003
Subject: [Python-Dev] Re: Draft of an essay on Python development (and how to help)
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID:

Brett C. wrote:
> The main goal of this doc is twofold: 1) to have something to point
> people to when they ask how they can help or get started on python-dev
> (maybe even be referenced in the welcome email)

Very nice welcome reading (probably you want to hear from a novice to
python-dev).
> Help add to the patch if it is missing documentation patches or needed
> regression tests.

Please don't call for things that can't be done by anyone except the
patch author or a py-dev member.

Regards,
Mike

From greg at cosc.canterbury.ac.nz  Tue Oct 14 03:13:52 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 14 03:14:09 2003
Subject: [Python-Dev] MacPython - access to FinderInfo of a directory
In-Reply-To:
Message-ID: <200310140713.h9E7DqO10997@oma.cosc.canterbury.ac.nz>

> Greg,
> there's an SF bug for this one: #706585. If you could attach your
> patch to this one I'll do the magic to work it around to bgen.

Okay, I've done that.

Greg

From aleaxit at yahoo.com  Tue Oct 14 04:01:10 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 04:01:19 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <200310141001.10705.aleaxit@yahoo.com>

On Tuesday 14 October 2003 05:31 am, Guido van Rossum wrote:
   ...
> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I have and have seen many use cases where the things being sorted are
dictionaries (comparisons can be costlier) or instances (they can be
non-comparable).

I agree that the "stable" nature of sorting is not all that important in
our context. But avoiding whole-record comparison in the general case
seems important enough to me that I'd accept any arbitrary non-comparing
behavior (e.g. making the id of the thing being sorted the secondary
key!-) rather than default to whole-record compares.
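Alex's dictionary case is where a key argument (the feature under discussion here, which eventually shipped in Python 2.4) pays off most directly: the records themselves are never compared, only the key values are, and a tuple-valued key gives multi-level ordering in a single pass. A small sketch with made-up data:

```python
# Hypothetical records; dicts have no ordering of their own, so sorting
# them works only because the (zipcode, name) key tuples are what
# actually get compared -- the dicts never are.
people = [
    {"name": "Ng",    "zipcode": "10001"},
    {"name": "Adams", "zipcode": "10001"},
    {"name": "Baker", "zipcode": "02139"},
]
people.sort(key=lambda rec: (rec["zipcode"], rec["name"]))
print([p["name"] for p in people])   # ['Baker', 'Adams', 'Ng']
```

This is Guido's zipcode-and-name pattern from earlier in the thread; a plain `people.sort()` here would have to compare the dicts themselves, which is exactly the whole-record comparison Alex wants to avoid.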
Alex

From gerrit at nl.linux.org  Tue Oct 14 04:15:53 2003
From: gerrit at nl.linux.org (Gerrit Holl)
Date: Tue Oct 14 04:16:06 2003
Subject: [Python-Dev] Draft of an essay on Python development (and how to help)
In-Reply-To: <3F8B5ECB.4030207@ocf.berkeley.edu>
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID: <20031014081553.GA2976@nl.linux.org>

Brett C. wrote:
> Feature Requests
> ----------------
> `Feature requests`_ are for features that you wish Python had but you
> have no plans on actually implementing by writing a patch. On occasion
> people do go through the features requests (also called RFCs on
> SourceForge) to see if there is anything there that they think should be
> implemented and actually do the implementation. But in general do not
> expect something put here to be implemented without some participation
> on your part.

I think feature requests are called RFE's in SF terminology, not RFC's.

regards,
Gerrit.

--
173. If this woman bear sons to her second husband, in the place to
which she went, and then die, her earlier and later sons shall divide
the dowry between them.
        -- 1780 BC, Hammurabi, Code of Law
--
Asperger Syndrome - a personal approach:
        http://people.nl.linux.org/~gerrit/
Resist this cabinet:
        http://www.sp.nl/

From just at letterror.com  Tue Oct 14 04:37:33 2003
From: just at letterror.com (Just van Rossum)
Date: Tue Oct 14 04:37:43 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I have no opinion about the importance, but I do have a use case that
differs from Tim's.
The other week I found myself sorting a list of dictionary keys by an
arbitrary attribute of the dict values. The sort needed to be stable, in
the sense that for attributes that contained equal values, the previous
sort order was to be maintained. The dict values themselves weren't
meaningfully sortable. What I did (had to do, given the requirements) is
almost exactly what Tim proposes (I included the indices in the sort),
so having that functionality built into list.sort() would have been
helpful for me. Not having that functionality would mean I'd either not
use the decorator sort feature (ie. do what I do now) or go through
hoops and make the decorator generate the indices. The latter approach
doesn't sound very appealing to me.

Just

From anthony at interlink.com.au  Tue Oct 14 04:47:54 2003
From: anthony at interlink.com.au (Anthony Baxter)
Date: Tue Oct 14 04:50:25 2003
Subject: [Python-Dev] server side digest auth support
Message-ID: <200310140847.h9E8ltLn028921@localhost.localdomain>

We've got http digest auth [RFC 2617] support at the client level in
the standard library, but it doesn't seem like there's server side
support. I'm planning on adding this (for pypi) but it's not clear
where it should go - I want to use it from a CGI, but I can see it
being useful for people writing HTTP servers as well. Should I just
make a new module httpdigest.py?

Anthony

--
Anthony Baxter     It's never too late to have a happy childhood.

From python at rcn.com  Tue Oct 14 05:00:34 2003
From: python at rcn.com (Raymond Hettinger)
Date: Tue Oct 14 05:01:12 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To:
Message-ID: <000701c39231$a10332a0$e841fea9@oemcomputer>

I've got a first draft patch (sans docs and tests) loaded at:
www.python.org/sf/823292

The argument keywords are: cmpfunc, key, reverse

The patch passes regression tests and a minimal set of basic
functionality tests which need to be expanded considerably. I'll need
to go back over this one in more detail to check:

* Whether the code was inserted in the right place with respect to the
  existing anti-mutation code.

* Is the strategy of decorating in-place too aggressive? Decoration
  consists of *replacing* each value x with (x, key(x)).

* Verify reference counting and error handling.


Raymond Hettinger

From Paul.Moore at atosorigin.com  Tue Oct 14 05:31:42 2003
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Tue Oct 14 05:32:29 2003
Subject: [Python-Dev] decorate-sort-undecorate
Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C09890@UKDCX001.uk.int.atosorigin.com>

From: Raymond Hettinger [mailto:python@rcn.com]
> I've got a first draft patch (sans docs and tests) loaded at:
> www.python.org/sf/823292
>
> The argument keywords are: cmpfunc, key, reverse

Can I just clarify the meaning of reverse (the original posting was a
little unclear)? I think that

    l.sort(reverse=True)

should mean the same as

    l.sort()
    l.reverse()

(ie, both sort and reverse inplace, with a void return). The original
posting gave me the impression that a copy would be done (which I
don't think is necessary).

Paul.

From sholden at holdenweb.com  Tue Oct 14 08:15:57 2003
From: sholden at holdenweb.com (Steve Holden)
Date: Tue Oct 14 08:20:46 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <1066097091.19072.11.camel@geddy>
Message-ID:

[bazzer][ ... ]
>
> Using this explanation, "key" doesn't seem right to me. I can't think
> of anything that I like better though, so I guess I just
> won't send this
> email afteral...
>
That was a sensible decision. It saved me from having to send this one.
regards
--
Steve Holden          +1 703 278 8281        http://www.holdenweb.com/
Improve the Internet  http://vancouver-webpages.com/CacheNow/
Python Web Programming http://pydish.holdenweb.com/pwp/
Interview with GvR August 14, 2003  http://www.onlamp.com/python/

From pinard at iro.umontreal.ca  Tue Oct 14 08:28:01 2003
From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard)
Date: Tue Oct 14 09:57:20 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <20031014122801.GA2559@titan.progiciels-bpi.ca>

[Guido van Rossum]

> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records.

It could be useful to avoid comparing the full records. Once in a
while, I have the problem of comparing objects which are not comparable
to start with, and have to choose between making them comparable, or
using decoration for the time of the sort, in which case there is a
guarantee that the objects themselves will not be used in comparisons
(by ensuring decoration keys never compare equal).

The third option, providing a comparison function, is something I have
succeeded in avoiding so far, as it seems to me a good habit to rely on
fast idioms at hand instead of on speed-impacting formulations, and
good habits are best kept by sticking to them. :-)

The problem with making objects comparable is that you fix a preferred
or "natural" ordering for the objects, which might not be so "natural"
when you need to sort them differently. In some circumstances, maybe
many of them, it is significantly cleaner to leave the objects as not
comparable.

--
François Pinard   http://www.iro.umontreal.ca/~pinard

From guido at python.org  Tue Oct 14 10:37:44 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:38:02 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 05:00:34 EDT."
	<000701c39231$a10332a0$e841fea9@oemcomputer>
References: <000701c39231$a10332a0$e841fea9@oemcomputer>
Message-ID: <200310141437.h9EEbii23697@12-236-54-216.client.attbi.com>

> I've got a first draft patch (sans docs and tests) loaded at:
> www.python.org/sf/823292

No time to review, so feedback just on this email. :-(

> The argument keywords are: cmpfunc, key, reverse

I'd suggest using 'cmp' instead of 'cmpfunc'. (Same argument as for
'key' vs. 'keyfunc'.)

> The patch passes regression tests and a minimal set of basic
> functionality tests which need to be expanded considerably. I'll need
> to go back over this one in more detail to check:
>
> * Whether the code was inserted in the right place with respect to the
> existing anti-mutation code.
>
> * Is the strategy of decorating in-place too aggressive? Decoration
> consists of *replacing* each value x with (x, key(x)).

Should be fine. AFAIR Tim's sort code sets the length of the list to
0, so accessing the list while it's being sorted is not supported
anyway.

> * Verify reference counting and error handling.

Write unit tests and measure process size.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 14 10:46:16 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:47:00 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:37:33 +0200."
References:
Message-ID: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>

[Just]
> I have no opinion about the importance, but I do have a use case that
> differs from Tim's.
>
> The other week I found myself sorting a list of dictionary keys by an
> arbitrary attribute of the dict values. The sort needed to be stable, in
> the sense that for attributes that contained equal values, the previous
> sort order was to be maintained. The dict values themselves weren't
> meaningfully sortable. What I did (had to do, given the requirements) is
> almost exactly what Tim proposes (I included the indices in the sort),
> so having that functionality built into list.sort() would have been
> helpful for me. Not having that functionality would mean I'd either not
> use the decorator sort feature (ie. do what I do now) or go through
> hoops and make the decorator generate the indices. The latter approach
> doesn't sound very appealing to me.

Hm. I wonder if this could be solved by yet another keyword argument
(maybe "stable"?) controlling whether to add the index to the key a la
Tim's recipe.

I note that there are many different uses of sort. Many common uses
only sort small lists, where performance doesn't matter much; I often
use an inefficient cmp function without worrying about performance in
such cases. But there are also uses that really test the extremes of
Python's performance, and it's a tribute to Tim that his sort code
stands up so well in that case. I think it's inevitable that the
default options aren't necessarily best for *all* use cases.

I'm not sure whether the defaults should cater to the extreme
performance cases or to the smaller cases; I expect that the latter
are more common, and people who are sorting truly huge lists should
read the manual if they care about performance. But that's just me.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 14 10:50:36 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 10:51:04 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:01:10 +0200."
	<200310141001.10705.aleaxit@yahoo.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
Message-ID: <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>

[Alex]
> I have and have seen many use cases where the things being sorted
> are dictionaries (comparisons can be costlier) or instances (they can
> be non-comparable).
>
> I agree that the "stable" nature of sorting is not all that important in
> our context. But avoiding whole-record comparison in the general
> case seems important enough to me that I'd accept any arbitrary
> non-comparing behavior (e.g. making the id of the thing being sorted
> the secondary key!-) rather than default to whole-record compares.

Given that internally we still do a DSU, sorting tuples of (key,
something), using the id of the record for 'something' is just as
inefficient as using the original index -- in both cases we'd have to
allocate len(lst) ints.

Greg Ewing suggested that the ints shouldn't have to be Python ints.
While this is true, it would require a much larger overhaul of the
existing sort code, which assumes the "records" to be sorted are
pointers to objects.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com  Tue Oct 14 10:57:40 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 10:57:45 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>
References: <200310141446.h9EEkG523741@12-236-54-216.client.attbi.com>
Message-ID: <200310141657.40072.aleaxit@yahoo.com>

On Tuesday 14 October 2003 04:46 pm, Guido van Rossum wrote:
   ...
> I'm not sure whether the defaults should cater to the extreme
> performance cases or to the smaller cases; I expect that the latter
> are more common, and people who are sorting truly huge lists should
> read the manual if they care about performance. But that's just me.

I think your general philosophy on "defaults cover normal cases" is
part of what makes Python so good, so, if it's just you, that need not
be a bad thing;-).

However, it seems to me that, in a normal case (sorting a smallish
number of easily comparable thingies), whether the indices are or are
not added to the decoration is not going to make an enormous difference
either way. So, maybe we should focus on two slightly less normal cases
where performance or correctness may be impacted:

-- if we're sorting a huge list of easily comparable thingies then the
   overhead of adding so many indices to the decoration might hurt

-- if we're sorting a list of expensive-to-compare thingies (e.g.
   dicts) or non-comparable thingies, the indices (or something, but
   might as well be the indices, it seems to me) are needed in the
   decoration (except in the special cases where all keys can be
   guaranteed to differ, of course) -- whether the list is huge or not

This, plus your indication that only people sorting truly huge lists
should have to read the manual, suggests to me that defaulting to
decoration-with-indices (perhaps with an option to omit the indices)
might be a preferable choice.


Alex

From tim.one at comcast.net  Tue Oct 14 10:58:10 2003
From: tim.one at comcast.net (Tim Peters)
Date: Tue Oct 14 10:58:13 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID:

[Guido]
> After reading this exchange, I'm not sure I agree with Tim about the
> importance of avoiding to compare the full records. Certainly the
> *cost* of that comparison doesn't bother me: I expect it's usually
> going to be a tuple or list of simple types like ints and strings, and
> comparing those is pretty efficient.

I made the remark about cost in reference to database sorts
specifically, where falling back to whole-record comparison can very
easily double the cost of a sort -- or much worse than that.
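[Editorial note: the tie-breaking scheme under discussion can be sketched in a few lines of Python. The `Record` class below is a made-up stand-in, not Kevin's application; decorating each record with (key, index, record) means equal keys fall through to a cheap int comparison, and the records themselves are never compared.]

```python
# Sketch only: decorate with (key, index, record) so key ties are
# broken by the cheap int index instead of whole-record comparison.
class Record:
    # Hypothetical record type; deliberately has no comparison methods.
    def __init__(self, exchange, name):
        self.exchange = exchange
        self.name = name

records = [Record("NYSE", "Acme"), Record("NASDAQ", "Zeta"),
           Record("NYSE", "Baker")]

# Decorate, sort, undecorate.  Indices are unique, so tuple comparison
# never reaches the third element, and the result is stable.
decorated = [(rec.exchange, i, rec) for i, rec in enumerate(records)]
decorated.sort()
result = [rec for exchange, i, rec in decorated]

assert [r.name for r in result] == ["Zeta", "Acme", "Baker"]
```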
We saw that in a sample database-sort application Kevin Altis sent
when the 2.3 sort was being developed, where the results were grossly
distorted at first because the driver *didn't* arrange to break key
ties with cheap compares. Sort a database of companies by (as that
example did, among other things) the stock exchange each is listed on,
and you're guaranteed that a great many duplicate keys exist (there
are more companies than stock exchanges). Comparing two ints is then
vastly cheaper than firing up a written-in-Python general
database-record long-winded __cmp__ function.

> Sure, there's a pattern that requires records with equal keys to remain
> in the original order, but that seems an artefact of a pattern used only
> for external sorts (where the sorted data is too large to fit in
> memory -- Knuth Vol. III is full of algorithms for this, but they seem
> mostly of historical importance in this age of Gigabyte internal
> memory).

It's not externality, it's decomposability: stability is what allows
an N-key primary-secondary-etc sort to be done one key at a time
instead, in N passes, and get the same result either way. Almost all
sorts you're likely to use in real life are stable in order to support
this, whether it's clicking on an email-metadata column in Outlook, or
sorting an array of data by a contained column in Excel. These are
in-memory sorts, but interactive, where the user refines sort criteria
on the fly and the app has no memory of what steps were taken before
the current sort. Then stability is essential to getting the right
result -- or the user has to fill out a complex multi-key sort dialog
each time.

> The pattern is that if you want records sorted by zipcode and within
> each zipcode sorted by name, you first sort by name and then do a
> stable sort by zipcode. This was common in the days of external
> sorts.

I wager it's much more common now (not externality, but sorting first
by one key, then by another -- it's interactivity that drives this
now).

> ...
> But using Raymond's proposal, you can do that by specifying a tuple
> consisting of zipcode and name as the key, as follows:
>
> myTable.sort(key = lambda rec: (rec.zipcode, rec.name))
>
> This costs an extra tuple, but the values in the tuple are not new, so
> it costs less space than adding the index int (not even counting the
> effect of the immortality of ints).

A hidden cost is that apps supporting interactive (or any other form
of multi-stage) sort refinements have to keep track of the full set of
sort keys ever applied.

> And tuples aren't immortal. (To be honest, for tuples of length < 20,
> the first 2000 of a given size *are* immortal, but that's a strictly
> fixed amount of wasted memory.)

I don't care about that either.

> Given that this solution isn't exactly rocket science, I think the
> default behavior of Raymond's original proposal (making the whole
> record the second sort key) is fine.

It does approach rocket science for an end user to understand why
their database sort is so slow in the presence of many equal keys, and
the absence of a cheap tie-breaker. It's something I didn't appreciate
either until digging into why Kevin's sample app was so bloody slow in
some cases (but not all! it was the sorts with many equal keys that
were pig slow, and because -- as the app was coded -- they fell back
to whole-record comparison).

From aleaxit at yahoo.com  Tue Oct 14 11:00:33 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 14 11:00:39 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
References: <200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
Message-ID: <200310141700.33217.aleaxit@yahoo.com>

On Tuesday 14 October 2003 04:50 pm, Guido van Rossum wrote:
   ...
> > case seems important enough to me that I'd accept any arbitrary
> > non-comparing behavior (e.g. making the id of the thing being sorted
> > the secondary key!-) rather than default to whole-record compares.
>
> Given that internally we still do a DSU, sorting tuples of (key,
> something), using the id of the record for 'something' is just as
> inefficient as using the original index -- in both cases we'd have to
> allocate len(lst) ints.

Yes, of course, I was just being facetious -- sorry for not making
that clearer.

> Greg Ewing suggested that the ints shouldn't have to be Python ints.
> While this is true, it would require a much larger overhaul of the
> existing sort code, which assumes the "records" to be sorted are
> pointers to objects.

Again, true. But maybe the performance increase would be worth the
substantial effort (I don't understand the current sort code enough to
say more than "maybe"!-).


Alex

From skip at pobox.com  Tue Oct 14 11:19:19 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 14 11:19:30 2003
Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 25
In-Reply-To: <200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>
References: <200310132204.h9DM4SM22471@12-236-54-216.client.attbi.com>
	<16267.11400.169738.924956@montanaro.dyndns.org>
	<200310140333.h9E3Xbg22932@12-236-54-216.client.attbi.com>
Message-ID: <16268.5111.311067.830227@montanaro.dyndns.org>

    Guido> [Skip]
    >> direction=[ascending|descending]
    >>
    >> ? Just a thought.

    Guido> But where would these constants be defined? Using
    Guido> direction='ascending' feels ugly.

I agree there are problems with the concept. I was just thinking that
reverse=True implies that the user knows without being told what
"forward" is (without relying on past experience with stuff like the
Unix sort() function). Fortunately, it's easy enough to try things out
in Python. ;-)

Skip

From skip at pobox.com  Tue Oct 14 11:20:57 2003
From: skip at pobox.com (Skip Montanaro)
Date: Tue Oct 14 11:21:06 2003
Subject: [Python-Dev] Re: Draft of an essay on Python development (and how
	to help)
In-Reply-To:
References: <3F8B5ECB.4030207@ocf.berkeley.edu>
Message-ID: <16268.5209.53004.566117@montanaro.dyndns.org>

    >> Help add to the patch if it is missing documentation patches or
    >> needed regression tests.

    Mike> Please don't herald for things that can't be done by anyone except
    Mike> patch author or py-dev member.

Or identify such barriers and suggest alternate paths to submit such
info.

Skip

From nas-python at python.ca  Tue Oct 14 11:27:52 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 14 11:27:03 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
Message-ID: <20031014152752.GA11335@mems-exchange.org>

On Mon, Oct 13, 2003 at 08:31:58PM -0700, Guido van Rossum wrote:
> But using Raymond's proposal, you can do that by specifying a tuple
> consisting of zipcode and name as the key, as follows:
>
> myTable.sort(key = lambda rec: (rec.zipcode, rec.name))

This reads nicely. +1 on 'key'.

  Neil

From jrw at pobox.com  Tue Oct 14 12:09:15 2003
From: jrw at pobox.com (John Williams)
Date: Tue Oct 14 12:09:20 2003
Subject: [Python-Dev] decorate-sort-undecorate
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
Message-ID: <3F8C1FAB.8020607@pobox.com>

Guido van Rossum wrote:
> Given that internally we still do a DSU, sorting tuples of (key,
> something), using the id of the record for 'something' is just as
> inefficient as using the original index -- in both cases we'd have to
> allocate len(lst) ints.
>
> Greg Ewing suggested that the ints shouldn't have to be Python ints.
> While this is true, it would require a much larger overhaul of the
> existing sort code, which assumes the "records" to be sorted are
> pointers to objects.

Why not use a special tuple type for the DSU algorithm that ignores
its last element when doing a comparison? It eliminates the problem of
creating a zillion int objects, and it would be easy to implement.

jw

From nas-python at python.ca  Tue Oct 14 12:17:56 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 14 12:17:04 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <3F8C1FAB.8020607@pobox.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
Message-ID: <20031014161756.GC11579@mems-exchange.org>

On Tue, Oct 14, 2003 at 11:09:15AM -0500, John Williams wrote:
> Why not use a special tuple type for the DSU algorithm that ignores its
> last element when doing a comparison?

Clever idea I think. You don't need a special tuple, just a little
wrapper object that holds the key and the original value and uses the
key for tp_richcompare.

  Neil

From guido at python.org  Tue Oct 14 12:31:47 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 12:32:07 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 10:58:10 EDT."
References:
Message-ID: <200310141631.h9EGVlf24023@12-236-54-216.client.attbi.com>

> [Guido]
> > After reading this exchange, I'm not sure I agree with Tim about the
> > importance of avoiding to compare the full records. Certainly the
> > *cost* of that comparison doesn't bother me: I expect it's usually
> > going to be a tuple or list of simple types like ints and strings, and
> > comparing those is pretty efficient.
[Tim]
> I made the remark about cost in reference to database sorts specifically,
> where falling back to whole-record comparison can very easily double the
> cost of a sort -- or much worse than that.

When exactly do you consider a sort a "database sort"?

> We saw that in a sample
> database-sort application Kevin Altis sent when the 2.3 sort was being
> developed, where the results were grossly distorted at first because the
> driver *didn't* arrange to break key ties with cheap compares. Sort a
> database of companies by (as that example did, among other things) the
> stock exchange each is listed on, and you're guaranteed that a great many
> duplicate keys exist (there are more companies than stock exchanges).
> Comparing two ints is then vastly cheaper than firing up a written-in-Python
> general database-record long-winded __cmp__ function.

No argument there.

> > Sure, there's a pattern that requires records with equal keys to remain
> > in the original order, but that seems an artefact of a pattern used only
> > for external sorts (where the sorted data is too large to fit in
> > memory -- Knuth Vol. III is full of algorithms for this, but they seem
> > mostly of historical importance in this age of Gigabyte internal
> > memory).
>
> It's not externality, it's decomposability: stability is what allows an
> N-key primary-secondary-etc sort to be done one key at a time instead, in N
> passes, and get the same result either way. Almost all sorts you're likely
> to use in real life are stable in order to support this, whether it's
> clicking on an email-metadata column in Outlook, or sorting an array of data
> by a contained column in Excel.

I experimented a bit with the version of Outlook I have, and it seems
to always use the delivery date/time as the second key, and always in
descending order.

> These are in-memory sorts, but interactive,
> where the user refines sort criteria on the fly and the app has no memory of
> what steps were taken before the current sort. Then stability is essential
> to getting the right result -- or the user has to fill out a complex
> multi-key sort dialog each time.

I'm not sure that this helps us decide the default behavior of sorts
in Python, which are rarely interactive in this sense. (If someone
writes an Outlook substitute, they can pretty well code the sort to do
whatever they want.)

> > The pattern is that if you want records sorted by zipcode and within
> > each zipcode sorted by name, you first sort by name and then do a
> > stable sort by zipcode. This was common in the days of external
> > sorts.
>
> I wager it's much more common now (not externality, but sorting first by one
> key, then by another -- it's interactivity that drives this now).

But I don't see the interactivity in Python apps, and that's what
counts here.

> > ...
> > But using Raymond's proposal, you can do that by specifying a tuple
> > consisting of zipcode and name as the key, as follows:
> >
> > myTable.sort(key = lambda rec: (rec.zipcode, rec.name))
> >
> > This costs an extra tuple, but the values in the tuple are not new, so
> > it costs less space than adding the index int (not even counting the
> > effect of the immortality of ints).
>
> A hidden cost is that apps supporting interactive (or any other form of
> multi-stage) sort refinements have to keep track of the full set of sort
> keys ever applied.

A small cost compared to the cost of writing an interactive app, *if*
you really want this behavior (I doubt it matters to most people using
Outlook).

> > And tuples aren't immortal. (To be honest, for tuples of length < 20,
> > the first 2000 of a given size *are* immortal, but that's a strictly
> > fixed amount of wasted memory.)
>
> I don't care about that either.
>
> > Given that this solution isn't exactly rocket science, I think the
> > default behavior of Raymond's original proposal (making the whole
> > record the second sort key) is fine.
>
> It does approach rocket science for an end user to understand why their
> database sort is so slow in the presence of many equal keys, and the absence
> of a cheap tie-breaker. It's something I didn't appreciate either until
> digging into why Kevin's sample app was so bloody slow in some cases (but
> not all! it was the sorts with many equal keys that were pig slow, and
> because -- as the app was coded -- they fell back to whole-record comparison).

Yeah, understanding performance anomalies is hard.

To cut all this short, I propose that we offer using the index as a
second sort key as an option, on by default, whose name can be (barely
misleading) "stable". On by default nicely matches the behavior of the
2.3 sort without any options.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From theller at python.net  Tue Oct 14 12:37:00 2003
From: theller at python.net (Thomas Heller)
Date: Tue Oct 14 12:37:06 2003
Subject: [Python-Dev] buildin vs. shared modules
In-Reply-To: <3F872FE9.9070508@v.loewis.de>
	( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sat, 11 Oct 2003
	00:17:13 +0200")
References: <3F872FE9.9070508@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> Thomas Heller wrote:
>> What is the rationale to decide whether a module is builtin or an
>> extension module in core Python (I only care about Windows)?
>
> I believe it is mostly tradition, on Windows: We continue to do
> things the way they have always been done.
>
> On Linux, there is an additional rationale: small executables and
> many files are cool, so we try to have as many shared libraries as
> possible.

(if you smell sarcasm - that is intentional)

>
>> To give examples, could zlib be made into a builtin module (because it's
>> useful for zipimport), _sre (because it's used by warnings), or are
>> there reasons preventing this?
>
> I think that anything that would be reasonably replaced by third parties
> (such as pyexpat.pyd) should be shared, and anything else should be part
> of pythonxy.dll.

If I look at the file sizes in the DLLs directory, it seems that at
least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly
grow python23.dll. Is unicodedata.pyd used by the encoding/decoding
methods?

"Tim Peters" writes:

> [Thomas Heller]
>> What is the rationale to decide whether a module is builtin or an
>> extension module in core Python (I only care about Windows)?
>
> I don't know that there is one. Maybe to avoid chewing address space for
> code that some programs won't use. Generally speaking, it appears some
> effort was made to make stuff an extension module on Windows if it was an
> optional part of the Unix build. There was certainly an effort made to
> build an extension for Python modules wrapping external code (like the
> _bsddb and _tkinter projects).
>
>> To give examples, could zlib be made into a builtin module (because
>> it's useful for zipimport), _sre (because it's used by warnings), or
>> are there reasons preventing this?
>
> zlib was there long before Python routinely made use of it; indeed, I doubt
> I ever used one byte of the zlib code outside of Python testing before zip
> import came along (and since I have no zip files to import from I guess I
> still never use it). Leaving _sre an extension seems odd now, but at the
> time it was competing with the external-to-Python PCRE code.
>
> Why do you ask? Answers must be accurate to 10 decimal digits.

Well, people complain about the number of files py2exe creates. And
especially the modules used to init Python itself (in the 1.5 days,
exceptions.py, nowadays zlib.pyd) have to be special cased because
they cannot use the import hooks.

Thomas

From guido at python.org  Tue Oct 14 12:55:54 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 14 12:56:01 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: Your message of "Tue, 14 Oct 2003 11:09:15 CDT."
	<3F8C1FAB.8020607@pobox.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
Message-ID: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>

> Why not use a special tuple type for the DSU algorithm that ignores its
> last element when doing a comparison? It eliminates the problem of
> creating a zillion int objects, and it would be easy to
> implement.

If we're going to do a custom object, it should be a fixed-length
struct containing (1) the key, (2) a C int of sufficient size to hold
the record index; (3) a pointer to the record, and its comparison
should only use (1) and (2).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From just at letterror.com  Tue Oct 14 13:07:34 2003
From: just at letterror.com (Just van Rossum)
Date: Tue Oct 14 13:07:45 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> > Why not use a special tuple type for the DSU algorithm that ignores
> > its last element when doing a comparison? It eliminates the problem
> > of creating a zillion int objects, and it would be
> > easy to implement.
>
> If we're going to do a custom object, it should be a fixed-length
> struct containing (1) the key, (2) a C int of sufficient size to hold
> the record index; (3) a pointer to the record, and its comparison
> should only use (1) and (2).

But since we have a stable sort, (2) can be omitted. I agree with Neil
that this is a very clever idea!

Just

From fdrake at acm.org  Tue Oct 14 13:22:29 2003
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Oct 14 13:22:43 2003
Subject: [Python-Dev] decorate-sort-undecorate
In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com>
	<200310141001.10705.aleaxit@yahoo.com>
	<200310141450.h9EEobm23763@12-236-54-216.client.attbi.com>
	<3F8C1FAB.8020607@pobox.com>
	<200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com>
Message-ID: <16268.12501.505494.397966@grendel.zope.com>

Guido van Rossum writes:
> If we're going to do a custom object, it should be a fixed-length
> struct containing (1) the key, (2) a C int of sufficient size to hold
> the record index; (3) a pointer to the record, and its comparison
> should only use (1) and (2).

As has been pointed out, we already have a stable sort. Instead of
making stability an option, let's just keep it.

We could allocate a second array of PyObject* to mirror the list
contents; that would have only the keys. When two values are switched
in the sort, the values in both the key list and the value list can be
switched. When done, we only need to decref the computed keys and free
the array of keys. No additional structures would be needed.

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Zope Corporation From nas-python at python.ca Tue Oct 14 13:26:27 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 14 13:25:41 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com> <200310141001.10705.aleaxit@yahoo.com> <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com> <3F8C1FAB.8020607@pobox.com> <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> Message-ID: <20031014172627.GC11868@mems-exchange.org> On Tue, Oct 14, 2003 at 09:55:54AM -0700, Guido van Rossum wrote: > If we're going to do a custom object, it should be a fixed-length > struct containing (1) the key, (2) a C int of sufficient size to hold > the record index; (3) a pointer to the record, and its comparison > should only use (1) and (2). I just thought of another reason why this is a good idea. Imagine I want to sort a list of objects that cannot be compared (e.g. complex numbers). I would expect cnums.sort(key = lambda n: n.real) to work, not fail with an exception. Neil From guido at python.org Tue Oct 14 13:43:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 13:44:01 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 13:22:29 EDT." <16268.12501.505494.397966@grendel.zope.com> References: <200310140332.h9E3VwR22900@12-236-54-216.client.attbi.com> <200310141001.10705.aleaxit@yahoo.com> <200310141450.h9EEobm23763@12-236-54-216.client.attbi.com> <3F8C1FAB.8020607@pobox.com> <200310141655.h9EGtsr24080@12-236-54-216.client.attbi.com> <16268.12501.505494.397966@grendel.zope.com> Message-ID: <200310141743.h9EHho324220@12-236-54-216.client.attbi.com> > As has been pointed out, we already have a stable sort. Instead of > making stability an option, let's just keep it. 
If this can be done without any of the disadvantages brought up at some point (especially allocating millions of ints), by all means let's do it. > We could allocate a second array of PyObject* to mirror the list > contents; that would have only the keys. When two values are switched > in the sort, the values in both the key list and the value list can be > switched. When done, we only need to decref the computed keys and > free the array of keys. I can't tell if that'll work, but if it does, it would be a great solution. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 14 14:01:49 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 14:01:57 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141743.h9EHho324220@12-236-54-216.client.attbi.com> Message-ID: [Fred] >> We could allocate a second array of PyObject* to mirror the list >> contents; that would have only the keys. When two values are >> switched in the sort, the values in both the key list and the value >> list can be switched. When done, we only need to decref the >> computed keys and free the array of keys. [Guido] > I can't tell if that'll work, but if it does, it would be a great > solution. I mentioned that before -- doubling the amount of data movement would hurt, at best by blowing cache all to hell. There's a related approach, though: build a distinct vector of custom objects, each containing: 1. A pointer to the key. 2. The original index, as a C integer. This is similar to, but smaller than, something mentioned before. The comparison function for this kind of object redirects to comparing only the keys -- the integers are ignored during the sort. Sort this list with the sorting code exactly as it exists now. At the end of sorting, the integer members can be used to permute the original list into order. This can be done in-place efficiently (not entirely obvious; Knuth gives at least one algorithm for it). 
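The in-place permutation step Tim calls "not entirely obvious" can be sketched in pure Python. This is only an illustration of the idea, not code from any patch (the real proposal is C-level inside the sort implementation); `apply_permutation` and the sample data are made-up names:

```python
def apply_permutation(items, perm):
    # Rearrange items in place so that the new items[i] is the old
    # items[perm[i]].  Follows each cycle of the permutation using
    # O(1) extra scratch per cycle -- the trick Knuth describes.
    perm = list(perm)                 # working copy; entries get marked as placed
    for i in range(len(items)):
        if perm[i] == i:
            continue                  # fixed point, or already placed
        saved = items[i]
        j = i
        while perm[j] != i:           # walk the cycle, shifting values back
            items[j] = items[perm[j]]
            prev, j = j, perm[j]
            perm[prev] = prev         # mark slot as done
        items[j] = saved              # close the cycle
        perm[j] = j

# Sorting by a key, Tim's way: order the indices by the key alone
# (the index field itself is never compared), then permute the list.
words = ["sort", "a", "stable", "is"]
order = sorted(range(len(words)), key=lambda i: len(words[i]))
apply_permutation(words, order)
# words is now ['a', 'is', 'sort', 'stable']
```

Because the index ordering is produced by a stable sort comparing keys only, equal keys keep their original relative order, which is where the inherited stability comes from.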
From python at rcn.com Tue Oct 14 14:05:09 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 14:06:32 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16268.12501.505494.397966@grendel.zope.com> Message-ID: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> [Fred L. Drake] > As has been pointed out, we already have a stable sort. Instead of > making stability an option, let's just keep it. Right. If a fast tie-breaker is provided, why would anyone ever choose stable=False? > We could allocate a second array of PyObject* to mirror the list > contents; that would have only the keys. When two values are switched > in the sort, the values in both the key list and the value list can be > switched. When done, we only need to decref the computed keys and > free the array of keys. > > No additional structures would be needed. I would rather wrap Tim's existing code than muck with assignment logic. Ideally, the performance of list.sort() should stay unchanged when the key function is not specified. Tim's original (key, index, value) idea seems to be simplest. The only sticking point is the immortality of PyInts. One easy, but not so elegant way around this is to use another mortal object for a tiebreaker (for example, "00000", "00001", ...). Alternatively, is there a way of telling a PyInt to be mortal? Besides immortality and speed, another consideration is the interaction between the cmp function and the key function. If both are specified, then the underlying decoration becomes visible to the user: def viewcmp(a, b): print a, b # the decoration just became visible return cmp(a,b) mylist.sort(cmp=viewcmp, key=str.lower) Since the decoration can be visible, it should be as understandable as possible. Viewed this way, PyInts are preferable to a custom object. Raymond Hettinger From fdrake at acm.org Tue Oct 14 14:15:52 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) 
Date: Tue Oct 14 14:16:10 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> References: <16268.12501.505494.397966@grendel.zope.com> <000901c3927d$b4bda2c0$e841fea9@oemcomputer> Message-ID: <16268.15704.529902.155365@grendel.zope.com> Raymond Hettinger writes: > Besides immortality and speed, another consideration is the interaction > between the cmp function and the key function. If both are specified, > then the underlying decoration becomes visible to the user: Or the cmp function is only passed the decoration. Or we disallow specifying both. Passing (decoration, index, value) for each value strikes me as just plain wrong. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Tue Oct 14 14:17:52 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 14:18:00 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> Message-ID: <3F8C3DD0.4020400@v.loewis.de> Thomas Heller wrote: > If I look at the file sizes in the DLLs directory, it seems that at > least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow > python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). I don't see why it matters, though. Adding modules to pythonxy.dll does not increase the memory consumption if the modules are not used. It might decrease the memory consumption in case the modules are used.
Regards, Martin From theller at python.net Tue Oct 14 14:25:31 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 14 14:25:37 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <3F873097.7050201@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Sat, 11 Oct 2003 00:20:07 +0200") References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: "Martin v. L?wis" writes: > Michael Hudson wrote: > >> Did the 2.3 builds have IPv6 support? Then this would be a nasty >> regression. However, I *thought* that you had to build with VC++ 7 or >> higher to get IPv6 support on Windows, and we've never done that. > > No, 2.3 did not have IPv6. You don't strictly need VC7, though - if > you have the SDK installed in addition to VC6, you could also include > IPv6 support. PC/pyconfig.h does not detect this case automatically, > so you would have to manually activate this support (i.e. include > winsock2.h). > > Apart from that, you are right - IPv6 is not supported in the Windows > builds because of lacking support in the compiler's header files. Ok, I installed the Feb 2003 Platform SDK, and it seems I'm now able to compile with IPv6 support - after minor twiddling of the header files. Now these questions arise: - Should the next binary release (2.3.3, scheduled for the end of 2003) include this support? - Should there be any attempts to detect this support in the header files automatically (black magic to me), or should the platform sdk be required to compile Python with IPv6? Thomas From bac at OCF.Berkeley.EDU Tue Oct 14 14:29:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Tue Oct 14 14:30:11 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <20031014081553.GA2976@nl.linux.org> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <20031014081553.GA2976@nl.linux.org> Message-ID: <3F8C40A5.1060902@ocf.berkeley.edu> Gerrit Holl wrote: > Brett C. wrote: > >>Feature Requests >>---------------- >>`Feature requests`_ are for features that you wish Python had but you >>have no plans on actually implementing by writing a patch. On occasion >>people do go through the features requests (also called RFCs on >>SourceForge) to see if there is anything there that they think should be >>implemented and actually do the implementation. But in general do not >>expect something put here to be implemented without some participation >>on your part. > > > I think feature requests are called RFE's in SF terminology, not RFC's. > They are; Requested Feature Enhancements. Typo on my part. Thanks for catching it. -Brett From bac at OCF.Berkeley.EDU Tue Oct 14 14:34:11 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Oct 14 14:34:43 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <3F8C41A3.3040700@ocf.berkeley.edu> Thomas Heller wrote: > "Martin v. L?wis" writes: > > >>No, 2.3 did not have IPv6. You don't strictly need VC7, though - if >>you have the SDK installed in addition to VC6, you could also include >>IPv6 support. PC/pyconfig.h does not detect this case automatically, >>so you would have to manually activate this support (i.e. include >>winsock2.h). >> >>Apart from that, you are right - IPv6 is not supported in the Windows >>builds because of lacking support in the compiler's header files. > > > Ok, I installed the Feb 2003 Platform SDK, and it seems I'm now able to > compile with IPv6 support - after minor twiddling of the header files. 
> > Now these questions arise: > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? > Following the rule of thumb that says differences between micro releases should be minimized, I would say -1 on this. > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? > No opinion from me on this one. -Brett From bac at OCF.Berkeley.EDU Tue Oct 14 14:36:19 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Oct 14 14:36:35 2003 Subject: [Python-Dev] Re: Draft of an essay on Python development (and how tohelp) In-Reply-To: <16268.5209.53004.566117@montanaro.dyndns.org> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <16268.5209.53004.566117@montanaro.dyndns.org> Message-ID: <3F8C4223.7000708@ocf.berkeley.edu> Skip Montanaro wrote: > >> Help add to the patch if it is missing documentation patches or > >> needed regression tests. > > Mike> Please don't herald for things that can't be done by anyone except > Mike> patch author or py-dev member. > > Or identify such barriers and suggest alternate paths to submit such info. > I think I will mention that you can always post the files somewhere else online and paste the link into a comment. I don't want to suggest creating another patch item just to store extra stuff for an existing patch since that would probably lead to an explosion in patch items and that is the last thing we need. 
-Brett From martin at v.loewis.de Tue Oct 14 14:49:22 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 14:49:38 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <3F8C4532.8060305@v.loewis.de> Thomas Heller wrote: > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? I'm leaning actually somewhat towards requesting this, although the "no changes in Micro releases" is a strong point. I would not at all be concerned if this was a pure add-on feature. However, there might be minor changes to existing behaviour: - python23.dll would now require winsock2.dll. I'm unsure whether Win95 was already providing this library. - the getaddrinfo implementation would now be the Microsoft "native emulation", instead of the Python one. It is a native emulation because it detects proper getaddrinfo dynamically if available, and falls back to emulation otherwise. This might cause minor semantic changes over our emulation code. [there would also be significant semantic changes in case a host has an IPv6 address - but that would be the whole point of making the change] Given that the feature is going to be requested more and more, and given that Microsoft's getaddrinfo emulation is likely more correct, thread-safe, etc. than our own, I'm still leaning towards "include IPv6". > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? If this is implemented for 2.3, I think it should be an easily-tunable option, defaulting to "on" - anybody building Python without the SDK would have to turn it off. For 2.4, I hope we will move to VC7, in which case the SDK is not required anymore for IPv6. 
Regards, Martin From tim.one at comcast.net Tue Oct 14 15:12:37 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 15:12:44 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031014161756.GC11579@mems-exchange.org> Message-ID: [Neil Schemenauer] > Clever idea I think. You don't need a special tuple, just a little > wrapper object that holds the key and the original value and uses > the key for tp_richcompare. That could work well. If a comparison function was specified too, it would only see the key (addressing one of Raymond's concerns). From guido at python.org Tue Oct 14 15:16:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:16:26 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 14:01:49 EDT." References: Message-ID: <200310141916.h9EJGHA24421@12-236-54-216.client.attbi.com> > [Fred] > >> We could allocate a second array of PyObject* to mirror the list > >> contents; that would have only the keys. When two values are > >> switched in the sort, the values in both the key list and the value > >> list can be switched. When done, we only need to decref the > >> computed keys and free the array of keys. > > [Guido] > > I can't tell if that'll work, but if it does, it would be a great > > solution. [Tim] > I mentioned that before -- doubling the amount of data movement would hurt, > at best by blowing cache all to hell. > > There's a related approach, though: build a distinct vector of custom > objects, each containing: > > 1. A pointer to the key. > 2. The original index, as a C integer. > > This is similar to, but smaller than, something mentioned before. But wouldn't the memory allocated for all those tiny custom objects also be spread all over the place and hence blow away the cache? I guess another approach would be to in-line those objects so that we sort an array of structs like this: struct { PyObject *key; long index; } rather than an array of PyObject*. 
But this would probably require all of the sort code to be cloned. > The comparison function for this kind of object redirects to comparing only > the keys -- the integers are ignored during the sort. Sort this list with > the sorting code exactly as it exists now. > > At the end of sorting, the integer members can be used to permute the > original list into order. This can be done in-place efficiently (not > entirely obvious; Knuth gives at least one algorithm for it). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:20:18 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:20:29 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Tue, 14 Oct 2003 20:17:52 +0200." <3F8C3DD0.4020400@v.loewis.de> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> > I don't see why it matters, though. Adding modules to pythonxy.dll does > not increase the memory consumption if the modules are not used. Can you explain why not? Doesn't the whole DLL get loaded into memory? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:19:22 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:20:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 14:05:09 EDT." <000901c3927d$b4bda2c0$e841fea9@oemcomputer> References: <000901c3927d$b4bda2c0$e841fea9@oemcomputer> Message-ID: <200310141919.h9EJJMp24444@12-236-54-216.client.attbi.com> > I would rather wrap Tim's existing code than muck with assignment logic. > Ideally, the performance of list.sort() should stay unchanged when the > key function is not specified. Impossible -- the aux objects tax the memory cache more. Also the characteristics of the data will be very different. > Tim's original (key, index, value) idea seems to be simplest. 
The only > sticking point is the immortality of PyInts. One easy, but not so > elegant way around this is to use another mortal object for a tiebreaker > (for example, "00000", "00001", ...). Alternatively, is there a way of > telling a PyInt to be mortal? I still like custom objects better. > Besides immortality and speed, another consideration is the interaction > between the cmp function and the key function. If both are specified, > then the underlying decoration becomes visible to the user: > > def viewcmp(a, b): > print a, b # the decoration just became visible > return cmp(a,b) > mylist.sort(cmp=viewcmp, key=str.lower) > > Since the decoration can be visible, it should be as understandable as > possible. Viewed this way, PyInts are preferable to a custom object. I think we should disallow specifying both cmp and key arguments. Using both just doesn't make sense. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:21:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:21:31 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: Your message of "Tue, 14 Oct 2003 20:25:31 +0200." References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> Message-ID: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> > - Should the next binary release (2.3.3, scheduled for the end of 2003) > include this support? It would be a new feature, wouldn't it? > - Should there be any attempts to detect this support in the header > files automatically (black magic to me), or should the platform sdk be > required to compile Python with IPv6? The Windows build doesn't do any feature detection, does it? All it's got is a hand-edited config file. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From andrew at gaul.org Tue Oct 14 15:39:05 2003 From: andrew at gaul.org (Andrew Gaul) Date: Tue Oct 14 15:39:09 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> Message-ID: <20031014193905.GA32597@paat.pair.com> On Tue, Oct 14, 2003 at 12:20:18PM -0700, Guido van Rossum wrote: > > I don't see why it matters, though. Adding modules to pythonxy.dll does > > not increase the memory consumption if the modules are not used. > > Can you explain why not? Doesn't the whole DLL get loaded into > memory? The OS maps the entire DLL but only pages in the parts that are referenced. This is the same behavior as mmapping an ordinary file because that is how shared libraries are usually implemented (with some magic when multiple libraries want the same virtual addresses). -- Andrew Gaul http://gaul.org/ -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031014/203b20cf/attachment.bin From python at rcn.com Tue Oct 14 15:39:27 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 15:40:43 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <002601c3928a$e15b3920$e841fea9@oemcomputer> > [Neil Schemenauer] > > Clever idea I think. You don't need a special tuple, just a little > > wrapper object that holds the key and the original value and uses > > the key for tp_richcompare. > > That could work well. If a comparison function was specified too, it > would > only see the key (addressing one of Raymond's concerns). Don't you still need a tie-breaker index to preserve stability? 
Raymond Hettinger From python at rcn.com Tue Oct 14 15:41:39 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 15:42:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <002701c3928b$3039a720$e841fea9@oemcomputer> > > [Neil Schemenauer] > > > Clever idea I think. You don't need a special tuple, just a little > > > wrapper object that holds the key and the original value and uses > > > the key for tp_richcompare. [Tim] > > That could work well. If a comparison function was specified too, it > > would > > only see the key (addressing one of Raymond's concerns). [Me] > Don't you still need a tie-breaker index to preserve stability? Arghh! I see it now.
Raymond From theller at python.net Tue Oct 14 15:48:04 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 14 15:48:08 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Tue, 14 Oct 2003 12:21:04 -0700") References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> Message-ID: <65irbnyj.fsf@python.net> Guido van Rossum writes: >> - Should the next binary release (2.3.3, scheduled for the end of 2003) >> include this support? > > It would be a new feature, wouldn't it? Sure. And since it is additional work for me, I'm all for leaving it out, especially since I don't use it myself ;-). While this is reason enough for *me*, I'm willing to do the additional work if this feature is requested (if the community is willing to take the risk of the new stuff). >> - Should there be any attempts to detect this support in the header >> files automatically (black magic to me), or should the platform sdk be >> required to compile Python with IPv6? > > The Windows build doesn't do any feature detection, does it? All it's > got is a hand-edited config file. All it does detect now is the C compiler used.
Thomas From tim.one at comcast.net Tue Oct 14 15:56:32 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 15:56:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141631.h9EGVlf24023@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > When exactly do you consider a sort a "database sort"? Oh, roughly anything that sorts large objects according to a small piece of each. In the sorting app I talked about, the database was a giant XML file, and was read into memory with each database record represented as a class instance with a few dozen data attributes. The full-blown __cmp__ for this class was large. With a more memory-efficient database representation, I'd expect the in-memory object to be a kind of wrapper around database-access code, maybe with a cache of recently-referenced attributes. Then it's mondo expensive if you have to break ties by fetching data from disk. ... >> It's not externality, it's decomposability: .... > I experimented a bit with the version of Outlook I have, and it seems > to always use the delivery date/time as the second key, and always in > descending order. It depends some on the current view, but I misremembered Outlook's UI anyway: to get a multi-heading sort, you have to depress the shift key when clicking on the 2nd (and 3rd, etc) column (and click twice (not double-click!) to reverse the sort order on the current column; the shift key applies there too if you want a multi-key sort order). > ... > I'm not sure that this helps us decide the default behavior of sorts > in Python, which are rarely interactive in this sense. Python is used to implement interactive apps. > (If someone writes an Outlook substitute, they can pretty well code the > sort to do whatever they want.) The 2.3 sort is stable. That's not only the default, there's no choice about it.
What's getting proposed is to give up stability to ease an implementation trick in the cases where both the stability and speed of a sort are most often most important. If I'm just sorting a list of floats, it's very hard to detect whether the sort was stable (I'd have to stare at the memory addresses of the float objects to tell). The cases where DSU gets used are the ones where the object isn't the key (so that stability or lack thereof becomes obvious), and where the user cares about speed (else they'd just pass a custom comparison function instead of bothering with DSU). Most sorts I do won't specify a key= argument, so most sorts I do couldn't care less what auto-DSU does. When I do code a DSU by hand, I nearly always include the index component, but more for speed reasons than for stability reasons -- but I rarely write interactive apps. I'm told that other people do. > ... > But I don't see the interactivity in Python apps, and that's what > counts here. Won't your current employer write a web-based system security monitor in Python, showing tables of information? An app that doesn't make a table view sortable on a column isn't a real app. > ... > To cut all this short, I propose that we offer using the index as a > second sort key as an option, on by default, whose name can be (barely > misleading) "stable". On by default nicely matches the behavior of > the 2.3 sort without any options. At this point, I may be losing track of how many options the 2.4 sort is growing -- cmp, key, and stable? I'd rather drop stable, and that when cmp or key (or both) is used, sort promises not to fall back to whole-object comparison by magic (if cmp or key invoke whole-object comparison, fine, that's on the user's head then). There are several ways to implement the latter, most of which would inherit stability from the core 2.3 sort.
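The hand-rolled DSU with an index component that Tim describes looks like this when written out (an illustrative sketch; `Record` and `dsu_sort` are made-up names, not code from any patch). The unique index breaks all key ties, so tuple comparison never falls through to the objects themselves -- which both preserves stability and lets otherwise-uncomparable objects (Neil's complex numbers) sort cleanly:

```python
class Record:
    # Deliberately defines no ordering, like complex numbers.
    def __init__(self, name, size):
        self.name, self.size = name, size

def dsu_sort(items, key):
    # Decorate with (key, index, object): the index is unique, so
    # comparing two decorated tuples is always settled by key or
    # index, never by the objects; equal keys keep original order.
    decorated = [(key(obj), i, obj) for i, obj in enumerate(items)]
    decorated.sort()
    items[:] = [obj for _, _, obj in decorated]   # undecorate

records = [Record("b", 2), Record("a", 1), Record("c", 2)]
dsu_sort(records, key=lambda r: r.size)
# order is now a, b, c -- the two size-2 records kept their order
```

Dropping the index component from the decoration is exactly the "fall back to whole-object comparison by magic" case: on a key tie, the tuple comparison would reach the `Record` objects and break.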
From guido at python.org Tue Oct 14 15:58:15 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:58:37 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Tue, 14 Oct 2003 15:39:27 EDT." <002601c3928a$e15b3920$e841fea9@oemcomputer> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> Message-ID: <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> > Don't you still need a tie-breaker index to preserve stability? No, because the sort algorithm is already stable. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 14 15:59:47 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 15:59:54 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: Your message of "Tue, 14 Oct 2003 21:48:04 +0200." <65irbnyj.fsf@python.net> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> <65irbnyj.fsf@python.net> Message-ID: <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> > >> - Should the next binary release (2.3.3, scheduled for the end of 2003) > >> include this support? > > > > It would be a new feature, wouldn't it? > > Sure. And since it is additional work for me, I'm all for leaving it > out, especially since I don't use it myself ;-). > > While this is reason enough for *me*, I'm willing to do the additional > work if this feature is requested (if the community is willing to take > the risk of the new stuff). I think that the community is pretty strong against new features, however neat. Is there a way to offer this functionality as an extension instead? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 14 16:08:59 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:09:05 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002701c3928b$3039a720$e841fea9@oemcomputer> Message-ID: [Neil Schemenauer] >>>> Clever idea I think. You don't need a special tuple, just a little >>>> wrapper object that holds the key and the original value and uses >>>> the key for tp_richcompare. [Tim] >>> That could work well. If a comparison function was specified too, >>> it would only see the key (addressing one of Raymond's concerns). [Raymond Hettinger, in darkness] >> Don't you still need a tie-breaker index to preserve stability? [Raymond, in light] > Arghh! I see it now. In case everyone doesn't, "the trick" is that the core sorting algorithm is already stable. The only reason it needs a "cheap tie breaker" in hand-rolled DSU is to stop (key, object) tuple comparison from falling back to whole-object comparison when two keys tie. Falling back to whole-object comparison is what can break stability (and chew up an enormous # of cycles). If comparison is never handed the objects (only the keys), those potential problems vanish, and stability is inherited. From gtalvola at nameconnector.com Tue Oct 14 16:25:17 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Tue Oct 14 16:25:39 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E4@mailbox.nameconnector.com> Tim Peters wrote: > The > cases where DSU > gets used are the ones where the object isn't the key (so > that stability or > lack thereof becomes obvious), and where the user cares about > speed (else > they'd just pass a custom comparison function instead of > bothering with > DSU). I disagree... 
I almost always use DSU in any circumstances because I find it easier and more natural to write:

def keyfunc(record):
    return record.LastName.lower(), record.FirstName.lower(), record.PhoneNumber

mylist = sortUsingKeyFunc(mylist, keyfunc)

than to have to write an equivalent comparison function:

def cmpfunc(r1, r2):
    return cmp((r1.LastName.lower(), r1.FirstName.lower(), r1.PhoneNumber),
               (r2.LastName.lower(), r2.FirstName.lower(), r2.PhoneNumber))

mylist.sort(cmpfunc)

so for me, ease of use is the reason, not speed. Of course, it doesn't _hurt_ that DSU is faster... - Geoff From martin at v.loewis.de Tue Oct 14 16:37:19 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:37:21 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> Message-ID: <3F8C5E7F.9040004@v.loewis.de> Guido van Rossum wrote: >>- Should the next binary release (2.3.3, scheduled for the end of 2003) >>include this support? > > > It would be a new feature, wouldn't it? It would be a new feature only in the binary. The source code has supported that for quite some time.
Regards, Martin From martin at v.loewis.de Tue Oct 14 16:39:23 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:39:37 2003 Subject: [Python-Dev] IPv6 in Windows binary distro In-Reply-To: <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> References: <3F86CF65.1000401@shambala.net> <2mad89ulyo.fsf@starship.python.net> <3F873097.7050201@v.loewis.de> <200310141921.h9EJL4I24467@12-236-54-216.client.attbi.com> <65irbnyj.fsf@python.net> <200310141959.h9EJxlk24602@12-236-54-216.client.attbi.com> Message-ID: <3F8C5EFB.5000806@v.loewis.de> Guido van Rossum wrote: > Is there a way to offer this functionality as an extension instead? You could replace _socket.pyd. Regards, Martin From tim.one at comcast.net Tue Oct 14 16:44:30 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:44:38 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141916.h9EJGHA24421@12-236-54-216.client.attbi.com> Message-ID: [Fred] >>>> We could allocate a second array of PyObject* to mirror the list >>>> contents; that would have only the keys. When two values are >>>> switched in the sort, the values in both the key list and the value >>>> list can be switched. When done, we only need to decref the >>>> computed keys and free the array of keys. [Guido] >>> I can't tell if that'll work, but if it does, it would be a great >>> solution. [Tim] >> I mentioned that before -- doubling the amount of data movement >> would hurt, at best by blowing cache all to hell. >> >> There's a related approach, though: build a distinct vector of >> custom objects, each containing: >> >> 1. A pointer to the key. >> 2. The original index, as a C integer. >> >> This is similar to, but smaller than, something mentioned before. [Guido] > But wouldn't the memory allocated for all those tiny custom objects > also be spread all over the place and hence blow away the cache? 
Probably no more than that the data in the original list was spread all over the place. The mergesort has (per merge) two input areas and one output area, which are contiguous vectors and accessed strictly left to right, one slot at a time, a cache-friendly access pattern. The real data pointed to by the vectors is all over creation, but the vectors themselves are contiguous. We seem to get a lot of good out of that. For example, the version of weak heapsort I coded did very close to the theoretical minimum # of comparisons on randomly ordered data, and was algorithmically much simpler than the adaptive mergesort, yet ran much slower. That can't be explained by # of comparisons or by instruction count. The one obvious difference is that weak heapsort leaps all over the vector, in as cache-hostile a way imaginable. The mergesort also has some success in reusing small merged areas ASAP, while they're still likely to be in cache. If we were to mirror loads and stores across two lists in lockstep, we'd be dealing with 4 input areas and 2 output areas per merge. That's a lot more strain on a cache, even if the access pattern per area is cache-friendly on its own. Of course a large sort is going to blow the cache wrt what the program was doing before the sort regardless. > I guess another approach would be to in-line those objects so that we > sort an array of structs like this: > > struct { > PyObject *key; > long index; > } > > rather than an array of PyObject*. But this would probably require > all of the sort code to be cloned. Following Neil, we could change that to store a pointer to the object rather than an index. I agree that would be cache-friendlier, but I don't know how much better it might do. Dereferencing key pointers acts in the other direction in all schemes, because the memory holding the keys is likely all over the place (and probably gets worse as the sort goes on, and the initial order gets scrambled). No way to know except to try it and time it. 
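In pure-Python terms, the per-item wrapper holding a precomputed key plus a reference to the object, which this subthread keeps weighing, might look like the sketch below (the real proposal is C code inside the list implementation; `KeyWrapper` and `sort_by_key` are invented names, and `__lt__` stands in for the era's `__cmp__`, since the sort only needs `<`):

```python
# One wrapper per list element: comparison looks only at the key, so
# whole-object comparison is never triggered and the stability of the
# underlying sort carries through to the wrapped objects.
class KeyWrapper:
    __slots__ = ("key", "obj")

    def __init__(self, key, obj):
        self.key = key
        self.obj = obj

    def __lt__(self, other):  # the only comparison the sort performs
        return self.key < other.key

def sort_by_key(items, keyfunc):
    wrapped = [KeyWrapper(keyfunc(obj), obj) for obj in items]
    wrapped.sort()  # stable: equal keys keep their original order
    return [w.obj for w in wrapped]

words = ["bb", "aa", "c"]
print(sort_by_key(words, len))  # ['c', 'bb', 'aa'], "bb" still before "aa"
```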
From tim.one at comcast.net Tue Oct 14 16:45:44 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:45:49 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141919.h9EJJMp24444@12-236-54-216.client.attbi.com> Message-ID: [Raymond] >> I would rather wrap Tim's existing code than muck with assignment >> logic. Ideally, the performance of list.sort() should stay unchanged >> when the key function is not specified. [Guido] > Impossible -- the aux objects tax the memory cache more. Also the > characteristics of the data will be very different. I think Raymond has in mind that if a key argument isn't specified, then aux objects aren't needed, and wouldn't be constructed -- the list would get sorted the same way it does now. >> ... >> Alternatively, is there a way of telling a PyInt to be mortal? There isn't, but we shouldn't let that drive anything. I don't think any law requires that Python always have an unbounded freelist for int objects. Most objects with custom freelists put a bound on the freelist size (as Guido noted in this thread for small tuples). I'm not sure why ints don't; I guess nobody ever felt motivated enough to stick a bound on 'em. From martin at v.loewis.de Tue Oct 14 16:50:38 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Oct 14 16:50:41 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310141920.h9EJKIk24455@12-236-54-216.client.attbi.com> Message-ID: <3F8C619E.2000103@v.loewis.de> Guido van Rossum wrote: >>I don't see why it matters, though. Adding modules to pythonxy.dll does >>not increase the memory consumption if the modules are not used. > > > Can you explain why not? Doesn't the whole DLL get loaded into > memory? No. In modern operating systems (including all Win32 implementations, i.e. 
W9x and NT+), the code segment is *mapped* instead of being loaded (in Win32 terminology, by means of MapViewOfFileEx). This causes demand-paging of the code, meaning that code is only in memory if it is actually executed. There are some pitfalls, e.g. that paging operates only on 4k (on x86) granularity, and that relocations may cause the code to get loaded at startup time instead of at run-time (in the latter case, it also stops being shared across processes). The code still consumes *address space*, but of this, any process has plenty (2GB on Win32, unless you use the /3GB boot.ini option of W2k+). The same is true for executables and shared libraries on Unix, meaning that making extension modules shared libraries does not reduce memory consumption. It may increase it, as code segments are 4k-aligned, so if you have many small segments, you may experience rounding overhead. Regards, Martin From tim.one at comcast.net Tue Oct 14 16:53:17 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 14 16:53:22 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <61957B071FF421419E567A28A45C7FE59AF6E4@mailbox.nameconnector.com> Message-ID: [Geoffrey Talvola] > I disagree... I almost always use DSU in any circumstances because I > find it easier and more natural to write: > > def keyfunc(record): > return record.LastName.lower(), record.FirstName.lower(), > record.PhoneNumber > mylist = sortUsingKeyFunc(mylist, keyfunc) You've left out the body of sortUsingKeyFunc, so I expect you're unusual in having built up helper routines for doing DSU frequently. > than to have to write an equivalent comparison function: > > def cmpfunc(r1, r2): > return cmp((r1.LastName.lower(), r1.FirstName.lower(), > r1.PhoneNumber), (r2.LastName.lower(), > r2.FirstName.lower(), r2.PhoneNumber)) > mylist.sort(cmpfunc) This is wordier than need be, though, duplicating code for the purpose of making it look bad . 
I'd do mylist.sort(lambda a, b: cmp(keyfunc(a), keyfunc(b))) > so for me, ease of use is the reason, not speed. Of course, it > doesn't _hurt_ that DSU is faster... If your records often tie on the result of keyfunc (doesn't look likely given the names you picked here), and your sortUsingKeyFunc() function doesn't inject the original list index (or otherwise force a cheap tie-breaker), DSU can be much slower than passing a custom comparison function. Not likely, though. From gtalvola at nameconnector.com Tue Oct 14 17:06:19 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Tue Oct 14 17:06:38 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> Tim Peters wrote: > [Geoffrey Talvola] >> I disagree... I almost always use DSU in any circumstances because I >> find it easier and more natural to write: >> >> def keyfunc(record): >> return record.LastName.lower(), record.FirstName.lower(), >> record.PhoneNumber mylist = sortUsingKeyFunc(mylist, keyfunc) > > You've left out the body of sortUsingKeyFunc, so I expect > you're unusual in > having built up helper routines for doing DSU frequently. > Yes, I do use a helper routine, but I suspect I'm not the only one out there who does... >> than to have to write an equivalent comparison function: >> >> def cmpfunc(r1, r2): >> return cmp((r1.LastName.lower(), r1.FirstName.lower(), >> r1.PhoneNumber), (r2.LastName.lower(), >> r2.FirstName.lower(), r2.PhoneNumber)) >> mylist.sort(cmpfunc) > > This is wordier than need be, though, duplicating code for > the purpose of > making it look bad . I'd do > > mylist.sort(lambda a, b: cmp(keyfunc(a), keyfunc(b))) I'm not a huge fan of lambdas from a readability perspective, so I'd probably wrap _that_ up into a helper function if I didn't know about DSU. The point I'm trying to make is that a key function is usually more natural to use than a comparison function.
You're right, DSU isn't the only way to make use of a key function. But, I think it would be a good thing for list.sort() to take a key function because it will guide users toward using the cleaner key function idiom and therefore improve the readability of Python code everywhere. > >> so for me, ease of use is the reason, not speed. Of course, it >> doesn't _hurt_ that DSU is faster... > > If your records often tie on the result of keyfunc (doesn't > look likely > given the names you picked here), and your sortUsingKeyFunc() function > doesn't inject the original list index (or otherwise force a cheap > tie-breaker), DSU can be much slower than passing a custom comparison > function. Not likely, though. Not to worry, I do inject the original list index. For the record, here's my helper function, probably not optimally efficient but good enough for me:

def sortUsingKeyFunc(l, keyfunc):
    l = zip(map(keyfunc, l), range(len(l)), l)
    l.sort()
    return [x[2] for x in l]

- Geoff From greg at cosc.canterbury.ac.nz Tue Oct 14 21:43:42 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 14 21:44:07 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> Guido: > > Don't you still need a tie-breaker index to preserve stability? > > No, because the sort algorithm is already stable. In which case it makes no sense at all for stability to be an option, since you'd have to go out of your way to make it *un*stable! The only issue then is to avoid comparing the whole record, and this presumably should be non-optional as well. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Tue Oct 14 21:52:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 14 21:52:15 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 14:43:42 +1300." <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> References: <200310150143.h9F1hg319002@oma.cosc.canterbury.ac.nz> Message-ID: <200310150152.h9F1q4C25064@12-236-54-216.client.attbi.com> > > > Don't you still need a tie-breaker index to preserve stability? > > > > No, because the sort algorithm is already stable. > > In which case it makes no sense at all for stability > to be an option, since you'd have to go out of your > way to make it *un*stable! > > The only issue then is to avoid comparing the whole > record, and this presumably should be non-optional > as well. Right. I think we've settled on using small wrapper objects instead of tuples, whose comparison *only* compares the key value, and whose other field contains a reference to the full record. When passing both cmp and key, cmp is passed the key field from the wrapper. The wrapper objects don't need to have any general purpose functionality so their implementation should be very simple. (We *could* go further and have a custom allocator for these objects, but I'm not sure that that's necessary -- pymalloc should be fast enough, and the bulk cost is going to be the O(N log N) behavior of the sort anyway.) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Oct 14 22:40:44 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 14 22:41:31 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310150152.h9F1q4C25064@12-236-54-216.client.attbi.com> Message-ID: <004f01c392c5$bb94a740$e841fea9@oemcomputer> [Greg Ewing] > > The only issue then is to avoid comparing the whole > > record, and this presumably should be non-optional > > as well. 
[Guido] > Right. I think we've settled on using small wrapper objects instead > of tuples, whose comparison *only* compares the key value, and whose > other field contains a reference to the full record. When passing > both cmp and key, cmp is passed the key field from the wrapper. Here is an implementation to try out. This second patch includes unittests and docs. The reference counts work out fine for repeated test runs and the rest of the test suite passes just fine: www.python.org/sf/823292 * The optional keyword arguments are: cmp, key, reverse. * The key function triggers a DSU step with a wrapper object that holds the full record but returns only the key for a comparison. This is fast, memory efficient, and doesn't change the underlying stability characteristics of the sort. (I think this was Neil's idea -- and it works like a charm.) * If the key function is not specified, no wrapping occurs so that sort performance is not affected. Raymond Hettinger From tdelaney at avaya.com Tue Oct 14 22:47:11 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 14 22:47:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFE7CA@au3010avexu1.global.avaya.com> > From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz] > > Guido: > > > > Don't you still need a tie-breaker index to preserve stability? > > > > No, because the sort algorithm is already stable. > > In which case it makes no sense at all for stability > to be an option, since you'd have to go out of your > way to make it *un*stable!
> The only issue then is to avoid comparing the whole > record, and this presumably should be non-optional > as well. How would we document this? To date sort() gives no guarantees about stability. We could continue to give this lack of guarantee by stating that only the key as returned from the key function is used in the comparison. Alternatively, we could guarantee that the resulting sort will be stable (which would make it incumbent to use the index if an unstable sort is introduced in a future version). Personally, I think it would be a good idea to make the guarantee that from 2.3 sort() will be stable when the comparison function returns equal, or the keys compare equal, or the objects compare equal (in the case of no comparison func or key func). But that could just be me. Tim Delaney From aleaxit at yahoo.com Wed Oct 15 02:24:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Wed Oct 15 02:24:40 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> References: <61957B071FF421419E567A28A45C7FE59AF6E5@mailbox.nameconnector.com> Message-ID: <200310150824.35396.aleaxit@yahoo.com> On Tuesday 14 October 2003 11:06 pm, Geoffrey Talvola wrote: ... > The point I'm trying to make is that a key function is usually more natural > to use than a comparison function.
You're right, DSU isn't the only way to

I agree, with ONE important exception: when your desired sort order is e.g "primary key ascending field X, secondary key descending field Y", writing a key-extraction function can be an absolute BEAR (you have to know the type of field Y, and even then how to build a key that will sort in descending order by it is generally anything but easy), while a comparison function is trivial:

def compafun(a, b):
    return cmp(a.X,b.X) or cmp(b.Y,a.Y)

i.e., you obtain descending vs ascending order by switching the arguments to builtin cmp, and join together multiple calls to cmp with 'or' thanks to the fact that cmp returning 0 (equal on this key) is exactly what needs to trigger the "moving on to further, less-significant keys". In fact I find that the simplest general way to do such compares with a key extraction function is a wrapper:

class reverser(object):
    def __init__(self, obj):
        self.obj = obj
    def __cmp__(self, other):
        return cmp(other.obj, self.obj)

relying on the fact that an instance of reverser will only ever be compared with another instance of reverser; now, for the same task as above,

def keyextract(obj):
    return obj.X, reverser(obj.Y)

does work. However, the number of calls to reverser.__cmp__ is generally O(N log N) [unless you can guarantee that most X subkeys differ] so that the performance benefits of DSU are, alas, not in evidence any more here. A C-coded 'reverser' would presumably be able to counteract this, though (but I admit I have never had to write one in practice).
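Alex's reverser wrapper can be exercised like this (an illustrative sketch using the key= argument the thread is converging on, spelled with __lt__ rather than 2.3's __cmp__; `Reverser` and `data` are invented names):

```python
# Ascending on the first field, descending on the second, via a wrapper
# that inverts comparisons for the subkey it wraps.
class Reverser:
    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):  # swapped operands: descending order
        return other.obj < self.obj

data = [(1, "a"), (1, "c"), (0, "b")]
data.sort(key=lambda r: (r[0], Reverser(r[1])))
print(data)  # [(0, 'b'), (1, 'c'), (1, 'a')]
```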
Alex From duncan at rcp.co.uk Wed Oct 15 04:42:37 2003 From: duncan at rcp.co.uk (Duncan Booth) Date: Wed Oct 15 04:42:30 2003 Subject: [Python-Dev] decorate-sort-undecorate References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum wrote in news:200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com: >> Don't you still need a tie-breaker index to preserve stability? > > No, because the sort algorithm is already stable. What about the situation where you want the list sorted in reverse order? If you simply sort and then reverse the list you've broken the stability. You *could* preserve the stability by using a negative index when the list is to be reversed, but might it also be possible to get the special comparison object to invert the result of the comparison? -- Duncan Booth duncan@rcp.co.uk int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3" "\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure? From theller at python.net Wed Oct 15 08:55:13 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 08:55:20 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed Message-ID: Sigh. The 2.3.2 windows binary contains invalid MS dlls. I copied them from my system directory, instead of using those of the MSVC 6 SP5 redistributables. There are already 3 bug reports about this: http://www.python.org/sf/818029, http://www.python.org/sf/824016, and http://www.python.org/sf/823405. Strongly affected are probably win98 and NT4 users. I suggest to remove the Python-2.3.2.exe from the downloads page (or to hide it), until this issue is resolved. FWIW, Python-2.3.1.exe should have the same problem. All this is, of course, only my fault. Thomas From list-python-dev at ccraig.org Wed Oct 15 09:41:26 2003 From: list-python-dev at ccraig.org (Christopher A.
Craig) Date: Wed Oct 15 09:41:33 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: If this goes in can we document that adding a key parameter makes the sort stable and key=None causes a stable sort to happen? Current CPython won't have to do anything at all with that, but other Pythons (or a future CPython where a mythical faster-than-timsort nonstable sort is discovered) would have a documented way to force stability. -- Christopher A. Craig From mcherm at mcherm.com Wed Oct 15 09:51:17 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 15 09:51:19 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <1066225877.3f8d50d5eff8b@mcherm.com> Tim writes: > Almost all sorts you're likely > to use in real life are stable in order to support this, whether it's > clicking on an email-metadata column in Outlook, or sorting an array of data > by a contained column in Excel. Guido tries it out: > I experimented a bit with the version of Outlook I have, and it seems > to always use the delivery date/time as the second key, and always in > descending order. Which is simply evidence that Outlook is poorly designed, and that Microsoft should have hired Tim to help with design specs. Although Outlook lacks this feature, I have FREQUENTLY desired it, and been annoyed at its absence. Tim in a later email: > It depends some on the current view, but I misremembered Outlook's UI > anyway: to get a multi-heading sort, you have to be depress the shift key > when clicking on the 2nd (and 3rd, etc) column (and click twice (not > double-click!) to reverse the sort order on the current column; the shift > key applies there too if you want a multi-key sort order). Well well... it's nice to learn that! I find Tim's (and other's) arguments quite convincing. We can go with a stable sort... ie, compare the keys, then fall back on stability and NEVER try comparing the objects themselves, and I think it will make complete sense. 
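The difference Michael describes is directly observable (a sketch using the 2.4-style sorted()/key spelling; `pairs`, `by_key` and `whole` are invented names):

```python
pairs = [(1, "z"), (1, "a"), (0, "m")]

# "Just the key": equal first elements tie, and the tie is left alone.
by_key = sorted(pairs, key=lambda p: p[0])
print(by_key)   # [(0, 'm'), (1, 'z'), (1, 'a')]

# Comparing whole objects instead breaks that tie on the second element.
whole = sorted(pairs)
print(whole)    # [(0, 'm'), (1, 'a'), (1, 'z')]
```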
After all, sorting with a "key" parameter provided is *NOT* really a DSU algorithm... it's a new sort feature which happens to be _implemented_ using DSU. Should that new feature sort on "just the key" (leaving ties stable) or should it sort on "the key and then the objects themselves". I'd say both make sense, and in fact "just the key" is more obvious to me. -- Michael Chermside From larsga at garshol.priv.no Wed Oct 15 10:19:00 2003 From: larsga at garshol.priv.no (Lars Marius Garshol) Date: Wed Oct 15 10:19:00 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066225877.3f8d50d5eff8b@mcherm.com> References: <1066225877.3f8d50d5eff8b@mcherm.com> Message-ID: * Michael Chermside | | I find Tim's (and other's) arguments quite convincing. We can go | with a stable sort... ie, compare the keys, then fall back on | stability and NEVER try comparing the objects themselves, and I | think it will make complete sense. After all, sorting with a "key" | parameter provided is *NOT* really a DSU algorithm... it's a new | sort feature which happens to be _implemented_ using DSU. Should | that new feature sort on "just the key" (leaving ties stable) or | should it sort on "the key and then the objects themselves". I'd say | both make sense, and in fact "just the key" is more obvious to me. +1. Very glad to see this being added. This has to take the prize for utility-I-most-often-reimplement. -- Lars Marius Garshol, Ontopian GSM: +47 98 21 55 50 From anthony at interlink.com.au Wed Oct 15 10:59:47 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Wed Oct 15 11:02:12 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: <200310151459.h9FExmvu011497@localhost.localdomain> [resend - my adsl fell over, don't think the original went out] I've put a note on the 2.3.2 page. Please email me when you've got a fixed installer, and I'll do the magic to install it on creosote and gpg sign it. 
Anthony From gtalvola at nameconnector.com Wed Oct 15 11:05:17 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Wed Oct 15 11:06:59 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6E9@mailbox.nameconnector.com> Alex Martelli wrote: > On Tuesday 14 October 2003 11:06 pm, Geoffrey Talvola wrote: > ... >> The point I'm trying to make is that a key function is usually more >> natural to use than a comparison function. You're right, DSU isn't >> the only way to > > I agree, with ONE important exception: when your desired sort order > is e.g "primary key ascending field X, secondary key descending > field Y", writing > a key-extraction function can be an absolute BEAR (you have to know > the type of field Y, and even then how to build a key that > will sort in > descending order by it is generally anything but easy), while > a comparison > function is trivial: In this case, how about sorting twice, taking advantage of stability? Using the proposed new syntax:

mylist.sort(key = lambda r: r.Y)
mylist.reverse()
mylist.sort(key = lambda r: r.X)

It might actually be the fastest way for very large lists, and while it's not immediately obvious what it's doing, it's not _that_ unreadable... - Geoff From theller at python.net Wed Oct 15 11:11:37 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 11:11:44 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151459.h9FExmvu011497@localhost.localdomain> (Anthony Baxter's message of "Thu, 16 Oct 2003 00:59:47 +1000") References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: Anthony Baxter writes: > [resend - my adsl fell over, don't think the original went out] > > I've put a note on the 2.3.2 page. Please email me when you've got a fixed > installer, and I'll do the magic to install it on creosote and gpg sign it.
Before I'd like some questions to be answered, probably Martin or Tim have an opinion here (but others are also invited). First, I hope that it's ok to build the installer with the VC6 SP5 dlls. The other possibility that comes to mind is to not include *any* MS runtime dlls, and provide the MS package VCREDIST.EXE separately. Second, what about the filename / version number / build number? IMO one should be able to distinguish the new installer from the old one. The easiest thing would be to just change the filename into maybe Python-2.3.2.1.exe. Thomas From guido at python.org Wed Oct 15 11:16:51 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 11:17:02 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "15 Oct 2003 09:41:26 EDT." References: Message-ID: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> > If this goes in can we document that adding a key parameter makes the > sort stable and key=None causes a stable sort to happen? Current > CPython won't have to do anything at all with that, but other Pythons > (or a future CPython where a mythical faster-than-timsort nonstable > sort is discovered) would have a documented way to force stability. That sounds like an extremely roundabout way of doing it; *if* there had to be a way to request a stable sort, I'd say that specifying a 'stable' keyword would be the way to do it. But I think that's unnecessary. Given that the Jython folks had Tim's sort algorithm translated into Java in half a day, I don't see why we can't require all implementations to have a stable sort. It's not like you can gain significant speed over Timsort. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 15 11:52:54 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 11:53:30 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 09:42:37 BST." 
References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> Message-ID: <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> > What about the situation where you want the list sorted in reverse order? > If you simply sort and then reverse the list you've broken the stability. Yes, that's the same thing Alex Martelli brought up. You could also supply a cmp function, as Geoffrey Talvola suggested (though this will make the comparisons more costly). > You *could* preserve the stability by using a negative index when the list > is to be reversed, but might it also be possible to get the special > comparison object to invert the result of the comparison? That's a possibility. Since we've got a reverse keyword argument, that could be implemented. (There would have to be two classes, one with a forward comparison and one with a reverse, to get this info efficiently into the wrapper objects without using globals.) But then I wonder what should happen if you specify reverse without key. The obvious way to implement this is to do the stable sort without wrappers and then reverse the whole list, but this also breaks stability (as you define it). So maybe specifying reverse should force using wrappers? But that's unintuitive in a different way: if you don't care about the stability of the sort (e.g. if equal keys are impossible or unlikely), you'd expect the reverse option to simply reverse the list after sorting it, and using wrappers would make it a lot slower than that. How important do you think this is? We could punt on the issue, implement reverse by reverting the list afterwards. (I could define stability differently and be totally happy with getting everything in reverse order rather than only the specified key.
:-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 15 12:02:12 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:02:23 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 08:52:54 PDT." <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> Message-ID: <200310151602.h9FG2CV02321@12-236-54-216.client.attbi.com> > > What about the situation where you want the list sorted in reverse order? > > If you simply sort and then reverse the list you've broken the stability. > > Yes, that's the same thing Alex Martelli brought up. You could also > supply a cmp function, as Geoffrey Talvola suggested (though this will > make the comparisons more costly). Oops. I misremembered Geoffrey's suggestion; he suggested two sorts with a reverse() call in between. I think that would have the same problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From gtalvola at nameconnector.com Wed Oct 15 12:03:03 2003 From: gtalvola at nameconnector.com (Geoffrey Talvola) Date: Wed Oct 15 12:03:27 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <61957B071FF421419E567A28A45C7FE59AF6EC@mailbox.nameconnector.com> Guido van Rossum wrote: >> What about the situation where you want the list sorted in reverse >> order? If you simply sort and then reverse the list you've broken >> the stability. > > ... > How important do you think this is? We could punt on the issue, > implement reverse by reverting the list afterwards. (I could define > stability differently and be totally happy with getting everything in > reverse order rather than only the specified key. 
:-) If you make that the documented behavior, then if someone really needs the items sorted in reverse order, but stable with respect to the original list, then this will work:

    mylist.reverse()
    mylist.sort(key=keyfunc, reverse=True)

- Geoff From guido at python.org Wed Oct 15 12:07:41 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:08:09 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 17:11:37 +0200." References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> > The other possibility that comes to mind is to not include *any* MS > runtime dlls, and provide the MS package VCREDIST.EXE separately. This sounds like a bad idea; all previous installers have included the right DLLs and not gotten any problems. > Second, what about the filename / version number / build number? > > IMO one should be able to distinguish the new installer from the old > one. The easiest thing would be to just change the filename into maybe > Python-2.3.2.1.exe. I can't think of anything better, so I think it's okay. Adding a letter would be confusing because normally suffixes like b2 or c1 come *before* the final version. Sigh indeed.
:-( --Guido van Rossum (home page: http://www.python.org/~guido/) From phil at riverbankcomputing.co.uk Wed Oct 15 12:16:28 2003 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Wed Oct 15 12:16:35 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> Message-ID: <200310151716.28576.phil@riverbankcomputing.co.uk> On Wednesday 15 October 2003 5:07 pm, Guido van Rossum wrote: > > The other possibility that comes to mind is to not include *any* MS > > runtime dlls, and provide the MS package VCREDIST.EXE separately. > > This sounds like a bad idea; all previous installers have included the > right DLLs and not gotten any problems. > > > Second, what about the filename / version number / build number? > > > > IMO one should be able to distinguish the new installer from the old > > one. The easiest thing would be to just change the filename into maybe > > Python-2.3.2.1.exe. > > I can't think of anything better, so I think it's okay. Adding a > letter would be confusing because normally suffixes like b2 or c1 come > *before* the final version. I would suggest Python-2.3.2-1.exe which more strongly implies the same version of software but a different version of packaging. Phil From guido at python.org Wed Oct 15 12:41:44 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:42:09 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 17:16:28 BST." 
<200310151716.28576.phil@riverbankcomputing.co.uk> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> Message-ID: <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> > I would suggest Python-2.3.2-1.exe which more strongly implies the same > version of software but a different version of packaging. +1 --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Wed Oct 15 12:50:45 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 15 12:46:06 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com>; from guido@python.org on Wed, Oct 15, 2003 at 09:41:44AM -0700 References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> Message-ID: <20031015105045.A30228@discworld.dyndns.org> Guido van Rossum wrote: > > I would suggest Python-2.3.2-1.exe which more strongly implies the same > > version of software but a different version of packaging. > > +1 How about making it "-2", then, as the previous (broken) package would have been "-1". Some might assume "Python-2.3.2.exe" and "Python-2.3.2-1.exe" were identical, but I would think few would make that assumption with "Python-2.3.2.exe" and "Python-2.3.2-2.exe". 
Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From guido at python.org Wed Oct 15 12:51:14 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 12:51:25 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Your message of "Wed, 15 Oct 2003 10:50:45 MDT." <20031015105045.A30228@discworld.dyndns.org> References: <200310151459.h9FExmvu011497@localhost.localdomain> <200310151607.h9FG7gZ02354@12-236-54-216.client.attbi.com> <200310151716.28576.phil@riverbankcomputing.co.uk> <200310151641.h9FGfiq02459@12-236-54-216.client.attbi.com> <20031015105045.A30228@discworld.dyndns.org> Message-ID: <200310151651.h9FGpEj02501@12-236-54-216.client.attbi.com> > How about making it "-2", then, as the previous (broken) package > would have been "-1". Some might assume "Python-2.3.2.exe" and > "Python-2.3.2-1.exe" were identical, but I would think few would > make that assumption with "Python-2.3.2.exe" and > "Python-2.3.2-2.exe". Haven't you noticed that Python uses 0-based indexing? :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Wed Oct 15 13:35:53 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 15 13:36:03 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> Message-ID: <1066239353.25726.19.camel@geddy> While we're hacking on [].sort(), how horrible would it be if we modified it to return self instead of None? I don't mind the sort-in-place behavior, but it's just so inconvenient that it doesn't return anything useful. 
I know it would be better if it returned a new list, but practicality beats purity. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031015/e434ed8a/attachment.bin From guido at python.org Wed Oct 15 13:52:26 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 13:52:34 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 13:35:53 EDT." <1066239353.25726.19.camel@geddy> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> Message-ID: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? -1000. This is non-negotiable. --Guido van Rossum (home page: http://www.python.org/~guido/) From marktrussell at btopenworld.com Wed Oct 15 14:42:47 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Wed Oct 15 14:44:14 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> Message-ID: <1066243367.1463.30.camel@straylight> On Wed, 2003-10-15 at 18:52, Guido van Rossum wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? > > -1000. This is non-negotiable. 
I have a trivial wrapper function sortcopy() in my I-wish-these-were-builtins module:

    def sortcopy(vals, cmpfunc=None):
        """Non in-place wrapper for list.sort()."""
        copy = list(vals)
        copy.sort(cmpfunc)
        return copy

I use this more often than list.sort(), because most of the time performance and memory use is not an issue and code using the in-place version is irritatingly verbose. Maybe this is worth adding as a builtin, to satisfy the people that want a non in-place sort. Mark Russell From theller at python.net Wed Oct 15 14:47:57 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 15 14:48:03 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Thomas Heller's message of "Wed, 15 Oct 2003 19:05:50 +0200") References: <200310151459.h9FExmvu011497@localhost.localdomain> Message-ID: Anthony, did you get this? Thomas Heller writes: > Ok, here it is: > http://starship.python.net/crew/theller/Python-2.3.2-1.exe > > 87aed0e4a79c350065b770f9a4ddfd75 Python-2.3.2-1.exe > > *Exactly* the same as before, except for the MS dlls and the filename.
> > Thanks (and apologies) > > Thomas From jeremy at alum.mit.edu Wed Oct 15 15:11:30 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 15 15:13:49 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066243367.1463.30.camel@straylight> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> <1066243367.1463.30.camel@straylight> Message-ID: <1066245090.2611.19.camel@localhost.localdomain> On Wed, 2003-10-15 at 14:42, Mark Russell wrote: > I have a trivial wrapper function sortcopy() in my > I-wish-these-were-builtins module: > > def sortcopy(vals, cmpfunc=None): > """Non in-place wrapper for list.sort().""" > copy = list(vals) > copy.sort(cmpfunc) > return copy > > I use this more often than list.sort(), because most of the time > performance and memory use is not an issue and code using the in-place > version is irritatingly verbose. Maybe this is worth adding as a > builtin, to satisfy the people that want a non in-place sort. No. This is so easy to write, we're all destined to write it again and again <0.4 wink>. 
I also use sort():

    def sort(L):
        L.sort()
        return L

Jeremy From barry at python.org Wed Oct 15 15:25:32 2003 From: barry at python.org (Barry Warsaw) Date: Wed Oct 15 15:26:07 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> References: <002601c3928a$e15b3920$e841fea9@oemcomputer> <200310141958.h9EJwFu24582@12-236-54-216.client.attbi.com> <200310151553.h9FFqsx02252@12-236-54-216.client.attbi.com> <1066239353.25726.19.camel@geddy> <200310151752.h9FHqQ302581@12-236-54-216.client.attbi.com> Message-ID: <1066245932.25726.36.camel@geddy> On Wed, 2003-10-15 at 13:52, Guido van Rossum wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? > > -1000. This is non-negotiable. Sniff.

    >>> class mylist(list):
    ...     def sort(self, *args, **kws):
    ...         super(mylist, self).sort(*args, **kws)
    ...         return self
    ...
    >>> mylist([5, 4, 3, 2, 1])
    [5, 4, 3, 2, 1]
    >>> x = mylist([5, 4, 3, 2, 1])
    >>> x.sort()
    [1, 2, 3, 4, 5]

Bliss. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031015/d011f721/attachment.bin From aahz at pythoncraft.com Wed Oct 15 15:26:10 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 15 15:26:14 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> Message-ID: <20031015192610.GA14327@panix.com> On Wed, Oct 15, 2003, Guido van Rossum wrote: > > That sounds like an extremely roundabout way of doing it; *if* there > had to be a way to request a stable sort, I'd say that specifying a > 'stable' keyword would be the way to do it. But I think that's > unnecessary.
> > Given that the Jython folks had Tim's sort algorithm translated into > Java in half a day, I don't see why we can't require all > implementations to have a stable sort. It's not like you can gain > significant speed over Timsort. But in the discussion leading up to adopting Timsort, you (or Tim, same difference ;-) explicitly said that you didn't want to make any doc guarantees about stability in case the sort algorithm changed in the future. I don't have an opinion about whether we should keep our options open, but I do think there should be a clearly explicit decision rather than suddenly assuming that we're going to require Python's core sort to be stable. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From raymond.hettinger at verizon.net Wed Oct 15 14:06:32 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed Oct 15 15:43:14 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <005201c39347$10c03960$e841fea9@oemcomputer> If the discussion is wrapped up, I'm ready to commit the patch: www.python.org/sf/823292 Summary:

. Adds keyword arguments: cmp, key, reverse.
. Stable for any combination of arguments (including reverse).
. If key is not specified, then no wrapper is applied and nothing is changed (performance is unchanged).
. If cmp and key are specified, the wrapper is removed and the original key is passed to the cmp function (the wrapper is not visible to the user).
. Has unittests and docs.

Passes the full test suite and repeated runs show stable refcounts.
Raymond Hettinger From ianb at colorstudy.com Wed Oct 15 15:48:04 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 15 15:48:09 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066239353.25726.19.camel@geddy> Message-ID: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> On Wednesday, October 15, 2003, at 12:35 PM, Barry Warsaw wrote: > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? I don't mind the > sort-in-place behavior, but it's just so inconvenient that it doesn't > return anything useful. I know it would be better if it returned a new > list, but practicality beats purity. When doing DSU sorting, the in-place sorting isn't really a performance win, is it? You already have to allocate and populate an entire alternate list with the sort keys, though I suppose you could have those mini key structs point to the original list. Anyway, while it's obviously in bad taste to propose .sort change its return value based on the presence of a key, wouldn't it be good if we had access to the new sorted list, instead of always clobbering the original list? Otherwise people's sorted() functions will end up copying lists unnecessarily. Okay, really I'm just hoping for [x for x in l sortby key(x)], if not now then someday -- if only there was a decent way of expressing that without a keyword... [...in l : key(x)] is the only thing I can think of that would be syntactically possible (without introducing a new keyword, new punctuation, or reusing a wholly inappropriate existing keyword). Or ";" instead of ":", but neither is very good. Sigh... -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From list-python-dev at ccraig.org Wed Oct 15 16:55:16 2003 From: list-python-dev at ccraig.org (Christopher A.
Craig) Date: Wed Oct 15 16:55:47 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031015192610.GA14327@panix.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> Message-ID: Aahz writes: > But in the discussion leading up to adopting Timsort, you (or Tim, same > difference ;-) explicitly said that you didn't want to make any doc > guarantees about stability in case the sort algorithm changed in the > future. I don't have an opinion about whether we should keep our > options open, but I do think there should be a clearly explicit decision > rather than suddenly assuming that we're going to require Python's core > sort to be stable. Yeah, that's mainly what I meant by my post. Currently if I want guarantees that the sort is stable on any future Python I have to manually DSU. If DSU is going to be internalized I'd like some way to guarantee stability (if that involves no arguments at all, great). -- Christopher A. Craig "It's a fairly embarrassing situation to admit that we can't find 90 percent of the universe." Bruce H. Margon (astrophysicist) From python-kbutler at sabaydi.com Wed Oct 15 17:05:34 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 15 17:06:00 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F8DB69E.2070406@sabaydi.com> From: Barry Warsaw > While we're hacking on [].sort(), how horrible would it be if we > modified it to return self instead of None? BDFL: > -1000. This is non-negotiable. [Barry's blissful demo code snipped] +1 Just 998 votes to go - nice to have a precise value on BDFL pronouncements. No voting twice with bigger numbers! 
;-) I think just about everyone gets tripped up by the "sort returns None" behavior, and though one (e.g., BDFL) can declare that it is a less significant stumble than not realizing the list is sorted in place, it is a _continuing_ inconvenience, with virtually every call to [].sort, even for Python experts (like Barry, not me). Small-ongoing-issue-trumps-one-time-surprise-ly y'rs, kb PS. Just realized I made a similar post over 6 years ago. http://www.google.com/groups?selm=w4niv00k9sc.fsf%40jamaica.cs.byu.edu Does that mean I should just give it up already, or does it emphasize that it is an ongoing issue? Though I still like the fact that the change would not break /any/ existing code... From guido at python.org Wed Oct 15 17:17:41 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 15 17:18:59 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Your message of "Wed, 15 Oct 2003 15:26:10 EDT." <20031015192610.GA14327@panix.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> Message-ID: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> > On Wed, Oct 15, 2003, Guido van Rossum wrote: > > That sounds like an extremely roundabout way of doing it; *if* there > > had to be a way to request a stable sort, I'd say that specifying a > > 'stable' keyword would be the way to do it. But I think that's > > unnecessary. > > > > Given that the Jython folks had Tim's sort algorithm translated into > > Java in half a day, I don't see why we can't require all > > implementations to have a stable sort. It's not like you can gain > > significant speed over Timsort. [Aahz] > But in the discussion leading up to adopting Timsort, you (or Tim, same > difference ;-) explicitly said that you didn't want to make any doc > guarantees about stability in case the sort algorithm changed in the > future. That was before Timsort had proven to be such a tremendous success. 
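The guarantee being debated here is what makes multi-key sorting by successive passes work. A minimal sketch using the key/reverse keywords proposed in this thread (the (X, Y) tuple records are illustrative, not taken from the patch):

```python
# Goal: ascending by primary key X, descending by secondary key Y.
records = [("b", 2), ("a", 1), ("b", 1), ("a", 2)]

# Sort by the secondary key first, then by the primary key; because
# both passes are stable, ties on X keep their descending-Y order.
records.sort(key=lambda r: r[1], reverse=True)
records.sort(key=lambda r: r[0])

print(records)  # [('a', 2), ('a', 1), ('b', 2), ('b', 1)]
```

Note that reverse=True is itself stability-preserving: elements with equal keys keep their original order rather than having it flipped, which is what lets the second pass rely on the first.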
> I don't have an opinion about whether we should keep our > options open, but I do think there should be a clearly explicit decision > rather than suddenly assuming that we're going to require Python's core > sort to be stable. OK, I pronounce on this: Python's list.sort() shall be stable. --Guido van Rossum (home page: http://www.python.org/~guido/) From python at discworld.dyndns.org Wed Oct 15 17:28:32 2003 From: python at discworld.dyndns.org (Charles Cazabon) Date: Wed Oct 15 17:23:55 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <3F8DB69E.2070406@sabaydi.com>; from python-kbutler@sabaydi.com on Wed, Oct 15, 2003 at 03:05:34PM -0600 References: <3F8DB69E.2070406@sabaydi.com> Message-ID: <20031015152832.A32481@discworld.dyndns.org> Kevin J. Butler wrote: > > I think just about everyone gets tripped up by the "sort returns None" > behavior, and though one (e.g., BDFL) can declare that it is a less > significant stumble than not realizing the list is sorted in place, it > is a _continuing_ inconvenience, with virtually every call to [].sort, > even for Python experts (like Barry, not me). Sure. I regularly find myself wishing "foo.sort().reverse()" and similar constructions would work, even in-place. Charles -- ----------------------------------------------------------------------- Charles Cazabon GPL'ed software available at: http://www.qcc.ca/~charlesc/software/ ----------------------------------------------------------------------- From esr at thyrsus.com Wed Oct 15 17:31:55 2003 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed Oct 15 17:31:59 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: <20031015213155.GA24331@thyrsus.com> Guido van Rossum : > OK, I pronounce on this: Python's list.sort() shall be stable. Excellent. I've been keeping out of this discussion, but this is the outcome I wanted. -- Eric S. Raymond From eppstein at ics.uci.edu Wed Oct 15 19:03:43 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 15 19:03:47 2003 Subject: [Python-Dev] Re: decorate-sort-undecorate References: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> <20031015192610.GA14327@panix.com> <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: In article <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com>, Guido van Rossum wrote: > OK, I pronounce on this: Python's list.sort() shall be stable. And there was much rejoicing. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From mcherm at mcherm.com Wed Oct 15 19:20:05 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 15 19:20:04 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: <1066260005.3f8dd625626c0@mcherm.com> BDFL: > -1000. This is non-negotiable. Kevin Butler: > +1 > > Just 998 votes to go - nice to have a precise value on BDFL > pronouncements.
No voting twice with bigger numbers! ;-) Make it 999 after my -1. Seriously, the BDFL isn't just making this up. Beginners would be tripped up by this ALL the time. People like me who move from language to language and can never remember which behavior goes with which language would be tripped up. Returning None prevents being tripped up. And the work-around is *a 2-line function*! Why can't YOU live with writing a 2-line helper function to save lots of frustration for those of us who might forget whether it's in-place or not? I'm not flaming you here, just trying to point out that the BDFL is *not* alone on this issue. Don't-all-dictators-hold-faux-elections-these-days lly, yours -- Michael Chermside From greg at cosc.canterbury.ac.nz Wed Oct 15 19:33:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 15 19:34:12 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: <200310152333.h9FNX3c26830@oma.cosc.canterbury.ac.nz> Ian Bicking : > Okay, really I'm just hoping for [x for x in l sortby key(x)], if > not now then someday -- if only there was a decent way of expressing > that without a keyword... [...in l : key(x)] is the only thing I can > think of that would be syntactically possible (without introducing a > new keyword, new punctuation, or reusing a wholly inappropriate > existing keyword).

    [x >> key(x) for x in l]   # ascending sort
    [x << key(x) for x in l]   # descending sort

(Well, we got print >> f, so it was worth a try...) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From pnorvig at google.com Wed Oct 15 20:27:40 2003 From: pnorvig at google.com (Peter Norvig) Date: Wed Oct 15 20:27:46 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: Greg Ewing wrote: > [x >> key(x) for x in l] # ascending sort > [x << key(x) for x in l] # descending sort > >(Well, we got print >> f, so it was worth a try...) I hope you're not serious about that. As it turns out, I have a proposed syntax for something I call an "accumulation display", and with it I was able to implement and test a SortBy in about a minute. It uses the syntax

    >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)]
    [1, -2, 3, -4]

where SortBy is an expression (in this case an identifier bound to a class object), not a keyword. Other examples of accumulation displays include:

    [Sum: x*x for x in numbers]
    [Product: Prob_spam(word) for word in email_msg]
    [Min: temp(hour) for hour in range(24)]
    [Top(10): humor(joke) for joke in jokes]
    [Argmax: votes[c] for c in candidates]

You can read the whole proposal at http://www.norvig.com/pyacc.html -Peter Norvig From python-kbutler at sabaydi.com Wed Oct 15 20:50:09 2003 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Wed Oct 15 20:50:30 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066260005.3f8dd625626c0@mcherm.com> References: <1066260005.3f8dd625626c0@mcherm.com> Message-ID: <3F8DEB41.6000209@sabaydi.com> Michael Chermside wrote: > Make it 999 after my -1. Seriously, the BDFL isn't just > making this up. A brief google search showed that the python posters whose names I recognize automatically, and who had expressed opinions, were about evenly split on the issue. (I was startled to see my own name - that was where I came across my post of six years ago.) So yes, Guido isn't alone. (If he were, he _probably_ would have caved in to peer pressure. Maybe not, though...) > Beginners would be tripped up by this ALL the > time.
People like me who move from language to language and > can never remember which behavior goes with which language would > be tripped up. I have yet to see a convincing code example (e.g., "Here is some real code - look how confused people would be if list.sort() had returned self"). Generally, list.sort() returning self would make the code more clear & concise. In contrast, I've seen multiple people say that using list.sort() in an expression caused real bugs (one said it was his most common Python bug), and many express irritation about the final code. (Especially people with a functional programming background, but I'm not one of them.) > Returning None prevents being tripped up. And > the work-around is *a 2-line function*! Why can't YOU live with > writing a 2-line helper function to save lots of frustration > for those of us who might forget whether it's in-place or not? Oh, we have! Concrete frustration outweighs speculative frustration. ;-) (pun intended). Or maybe we could have list.sort() return "Error: .sort method does not return self." That would make the following idiom entertaining: for i in list.sort(): print i kb From ianb at colorstudy.com Wed Oct 15 21:22:14 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Wed Oct 15 21:22:20 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <2CAD2EAE-FF77-11D7-9282-000393C2D67E@colorstudy.com> On Wednesday, October 15, 2003, at 07:27 PM, Peter Norvig wrote: > As it turns out, I have a proposed syntax for something I call an > "accumulation display", and with it I was able to implement and test a > SortBy in about a minute. It uses the syntax > >>>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > [1, -2, 3, -4] > > where SortBy is an expression (in this case an identifier bound to a > class object), not a keyword. 
Other examples of accumulation displays > include: > > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] > > You can read the whole proposal at http://www.norvig.com/pyacc.html Neat. +1. I think it would be nice if accumulators were created more like iterators, maybe with an __accum__ method. Then builtins like min and max could be turned into accumulators, kind of like the int function was turned into a class. Then you also wouldn't have to check for and instantiate classes, which seems a little crude. Then if a sorted() function/class was added to builtins, and it was also an accumulator, you'd be all set. And all the sort method haters out there (they number many!) would be happy. But [sorted: abs(x) for x in lst] doesn't seem right at all, it should return a list of abs(x) sorted by x, not a list of x sorted by abs(x). [sorted.by: abs(x) for x in lst] is perhaps more clever than practical -- it could work and it reads nicely, but it doesn't look normal. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From pje at telecommunity.com Wed Oct 15 21:36:44 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Oct 15 21:36:27 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> At 05:27 PM 10/15/03 -0700, Peter Norvig wrote: >As it turns out, I have a proposed syntax for something I call an >"accumulation display", and with it I was able to implement and test a >SortBy in about a minute. It uses the syntax > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] >[1, -2, 3, -4] > >where SortBy is an expression (in this case an identifier bound to a >class object), not a keyword.
Other examples of accumulation displays >include: > > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] +0. You can do any of these with a function, if you're willing to let the entire list be created, and put any needed parameters in as a tuple, e.g.: Top(10, [(humor(joke),joke) for joke in jokes]) So, if we had generator comprehensions, the proposed mechanism would be unnecessary. Also, note that [] implies the return value is a list or sequence of some kind, when it's not. IMO, it would really be better to have some kind of generator comprehension to make inline iterator creation easy, and then put the function or class or whatever outside the generator comprehension. Then, it's clear that some function is being applied to a sequence, and that you should look to the function to find out the type of the result, e.g.: Top(10, [yield humor(joke),joke for joke in jokes]) From eppstein at ics.uci.edu Wed Oct 15 22:58:34 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 15 22:58:42 2003 Subject: [Python-Dev] Re: decorate-sort-undecorate References: Message-ID: In article , Peter Norvig wrote: > As it turns out, I have a proposed syntax for something I call an > "accumulation display", and with it I was able to implement and test a > SortBy in about a minute. It uses the syntax > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > [1, -2, 3, -4] > > where SortBy is an expression (in this case an identifier bound to a > class object), not a keyword. ... > You can read the whole proposal at http://www.norvig.com/pyacc.html Would this proposal also allow [Yield: expr(x) for x in someiterator] ? -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From tim.one at comcast.net Wed Oct 15 21:06:07 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:48 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310152117.h9FLHfh02774@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > OK, I pronounce on this: Python's list.sort() shall be stable. Wow. I thought the time machine may have broken on its way to California, but I see this already reached back to the 2.3 release! Relief. +1-ing-ly y'rs - tim From tim.one at comcast.net Wed Oct 15 21:36:07 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:52 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller] > Sigh. > > The 2.3.2 windows binary contains invalid MS dlls. Oops. Guido can tell you how anal I was about this, but I don't think it ever got documented. Sorry! It's why the Wise script has C:\Code\MSDLLs as a choice for where to get redistributables from. > I copied them from my system directory, instead of using those of the > MSVC 6 SP5 redistributables. That's a good choice. I can't find it now, but somewhere in the MS gigabytes of stuff is a list of which versions of these guys are redistributable. Sometimes a service pack will install one that isn't *generally* usable, because it relies on other stuff installed by the same service pack. These oddballs often show up in security patches, where they're seemingly ramming out a fix as fast as possible. > ... > Strongly affected are probably win98 and NT4 users. The happier news is that I've got 2.3.2 on two Win98SE boxes with no ill effects. I keep these scrupulously up-to-date, though.
From tim.one at comcast.net Wed Oct 15 21:41:14 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 00:49:56 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller] > Before I'd like some questions to be answered, probably Martin or Tim > have an opinion here (but others are also invited). > > First, I hope that it's ok to build the installer with the VC6 SP5 > dlls. I have in the past . It's OK by me. The Wise script should already be refusing to replace newer versions of these DLLs. > The other possibility that comes to mind is to not include > *any* MS runtime dlls, and provide the MS package VCREDIST.EXE > separately. Martin pointed out correctly that Win95 didn't ship with these things, so it's safest to keep shipping them until Python moves to VC7 (at which point I don't think we can pretend to support Win9x anymore). > Second, what about the filename / version number / build number? The build number should definitely change. When someone sends a snippet from an interactive prompt with an incomprehensible error report, the build number they're unwittingly tricked into including is the best clue about what they're really running. The version number shouldn't change. > IMO one should be able to distinguish the new installer from the old > one. The easiest thing would be to just change the filename into maybe > Python-2.3.2.1.exe. +1. From pnorvig at google.com Thu Oct 16 01:04:41 2003 From: pnorvig at google.com (Peter Norvig) Date: Thu Oct 16 01:04:47 2003 Subject: [Python-Dev] decorate-sort-undecorate References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: Yes, you're right -- with generator comprehensions, you can have short-circuit evaluation via functions on the result, and you can get at both original element and some function of it, at the cost of writing f(x), x. 
So my proposal would be only a small amount of syntactic sugar over what you can do with generator comprehensions. (But you could say the same for list/generator comprehensions over raw generators.) -Peter Norvig On Wed Oct 15 18:36:44 PDT 2003, Phillip J. Eby wrote: > At 05:27 PM 10/15/03 -0700, Peter Norvig wrote: > >As it turns out, I have a proposed syntax for something I call an > >"accumulation display", and with it I was able to implement and test a > >SortBy in about a minute. It uses the syntax > > > > >>> [SortBy: abs(x) for x in (-2, -4, 3, 1)] > >[1, -2, 3, -4] > > > >where SortBy is an expression (in this case an identifier bound to a > >class object), not a keyword. Other examples of accumulation displays > >include: > > > > [Sum: x*x for x in numbers] > > [Product: Prob_spam(word) for word in email_msg] > > [Min: temp(hour) for hour in range(24)] > > [Top(10): humor(joke) for joke in jokes] > > [Argmax: votes[c] for c in candidates] > > +0. You can do any of these with a function, if you're willing to let the > entire list be created, and put any needed parameters in as a tuple, e.g.: > > Top(10, [(humor(joke),joke) for joke in jokes]) > > So, if we had generator comprehensions, the proposed mechanism would be > unnecessary. Also, note that [] implies the return value is a list or > sequence of some kind, when it's not. > > IMO, it would really be better to have some kind of generator comprehension > to make inline iterator creation easy, and then put the function or class > or whatever outside the generator comprehension. 
Then, it's clear that > some function is being applied to a sequence, and that you should look to > the function to find out the type of the result, e.g.: > > Top(10, [yield humor(joke),joke for joke in jokes]) > > From greg at cosc.canterbury.ac.nz Thu Oct 16 01:11:45 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 01:11:58 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <200310160511.h9G5BjS28355@oma.cosc.canterbury.ac.nz> Peter Norvig : > [Sum: x*x for x in numbers] > [Product: Prob_spam(word) for word in email_msg] > [Min: temp(hour) for hour in range(24)] > [Top(10): humor(joke) for joke in jokes] > [Argmax: votes[c] for c in candidates] Interesting idea, but I'm a bit worried by the enclosing [], which suggests that a list is being constructed, whereas in most of your examples the result isn't a list. I still think it would be fun if Python had an "up" operator, so with suitably defined accumulator objects you could say things like total = add up my_numbers product = multiply up some_probabilities Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 16 01:16:23 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 01:16:44 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> "Phillip J. Eby" : > IMO, it would really be better to have some kind of generator > comprehension > > Top(10, [yield humor(joke),joke for joke in jokes]) I like the *idea* of a generator comprehension, but I'm not sure I like the [yield ...] syntax. 
It's a bit idiomatic looking -- the [] still imply a list, even though it's not building a list at all. Maybe there should be a different kind of bracketing, e.g. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From anthony at interlink.com.au Thu Oct 16 01:15:30 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Oct 16 01:18:37 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: <200310160515.h9G5FUqc025443@localhost.localdomain> >>> Thomas Heller wrote > Anthony, did you get this? Yep, sorry - I sleep during the night. Installed on creosote (along with signature) Anthony From martin at v.loewis.de Thu Oct 16 02:04:16 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:04:22 2003 Subject: [Python-Dev] Re: python-dev Summary for 2003-09-16 through 2003-09-30 In-Reply-To: References: Message-ID: "Mike Rovner" writes: > >> - the patch might be incomplete. Ping the submitter. If the patch > >> is incomplete, either complete it yourself, or suggest rejection > >> of the patch. > > All I can do as SF registered user is add a comment to existing patch. > I can't extend it, submit extra files, i.e. "complete" it. > > Please clarify the preferable way to "help with the war on SF patch items". If you think the patch is best revised in a new form, please submit a new patch, and leave a message in the original one indicating that you think your patch should supersede the patch of the original submitter. However, as Brett explains, there might be other (perhaps better) ways to achieve the same effect: If you think the patch needs revision in a certain direction, ask the submitter to revise the patch accordingly.
If you come up with a competing patch, the competition itself may cause bad feelings - so try to work with the submitter, not against her. Regards, Martin From martin at v.loewis.de Thu Oct 16 02:05:12 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:05:18 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <3F8B5ECB.4030207@ocf.berkeley.edu> References: <3F8B5ECB.4030207@ocf.berkeley.edu> Message-ID: "Brett C." writes: > If you get any message from this document, it should be that *anyone* > can help Python. It should be what? Regards, Martin From martin at v.loewis.de Thu Oct 16 02:07:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 02:08:00 2003 Subject: [Python-Dev] server side digest auth support In-Reply-To: <200310140847.h9E8ltLn028921@localhost.localdomain> References: <200310140847.h9E8ltLn028921@localhost.localdomain> Message-ID: Anthony Baxter writes: > We've got http digest auth [RFC 2617] support at the client level in > the standard library, but it doesn't seem like there's server side > support. I'm planning on adding this (for pypi) but it's not clear > where it should go - I want to use it from a CGI, but I can see it > being useful for people writing HTTP servers as well. Should I just > make a new module httpdigest.py? Can you actually implement it from CGI? How do you get hold of the WWW-Authenticate header? Regards, Martin From theller at python.net Thu Oct 16 02:51:14 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:51:22 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Tim Peters's message of "Wed, 15 Oct 2003 21:41:14 -0400") References: Message-ID: <65ip8yl9.fsf@python.net> "Tim Peters" writes: > [Thomas Heller] >> Before I'd like some questions to be answered, probably Martin or Tim >> have an opinion here (but others are also invited). 
>> >> First, I hope that it's ok to build the installer with the VC6 SP5 >> dlls. > > I have in the past . It's OK by me. The Wise script should already > be refusing to replace newer versions of these DLLs. > >> The other possibility that comes to mind is to not include >> *any* MS runtime dlls, and provide the MS package VCREDIST.EXE >> separately. > > Martin pointed out correctly that Win95 didn't ship with these things, so > it's safest to keep shipping them until Python moves to VC7 (at which point > I don't think we can pretend to support Win9x anymore). > >> Second, what about the filename / version number / build number? > > The build number should definitely change. When someone sends a snippet > from an interactive prompt with an incomprehensible error report, the build > number they're unwittingly tricked into including is the best clue about > what they're really running. The version number shouldn't change. Too late. Anthony already published on creosote what I sent him. With the exception of the MS dlls, the installer contains and installs the exactly identical files as Python-2.3.2.exe, and this includes the build number since I did not rebuild Python itself. >> IMO one should be able to distinguish the new installer from the old >> one. The easiest thing would be to just change the filename into maybe >> Python-2.3.2.1.exe. > > +1. Python-2.3.2-1.exe is it now. Thomas From theller at python.net Thu Oct 16 02:55:47 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:56:01 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: (Tim Peters's message of "Wed, 15 Oct 2003 21:36:07 -0400") References: Message-ID: <1xtd8ydo.fsf@python.net> "Tim Peters" writes: > [Thomas Heller] >> Sigh. >> >> The 2.3.2 windows binary contains invalid MS dlls. > > Oops. Guido can tell you how anal I was about this, but I don't think it > ever got documented. Sorry! 
It's why the Wise script has C:\Code\MSDLLs as > a choice for where to get redistributables from. I was probably confused because it had C:\Windows\System also ;-(. I will change the WISE script to remove these, and update the relevant PEPs so that this (hopefully) doesn't happen again. >> I copied them from my system directory, instead of using those of the >> MSVC 6 SP5 redistributables. > > That's a good choice. Apparently not - they were XP specific. > I can't find it now, but somewhere in the MS > gigabytes of stuff is a list of which versions of these guys are > redistributable. Sometimes a service pack will install one that isn't > *generally* usable, because it relies on other stuff installed by the same > service pack. These oddballs often show up in security patches, where > they're seemingly ramming out a fix as fast as possible. > >> Strongly affected are probably win98 and NT4 users. > > The happier news is that I've got 2.3.2 on two Win98SE boxes with no ill > effects. I keep these scrupulously up-to-date, though. See the bug reports I mentioned to find out what happened to other people <0.0 wink>. Apologies to everyone affected by my mistake. Thomas From theller at python.net Thu Oct 16 02:56:36 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 02:56:43 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <200310160515.h9G5FUqc025443@localhost.localdomain> (Anthony Baxter's message of "Thu, 16 Oct 2003 15:15:30 +1000") References: <200310160515.h9G5FUqc025443@localhost.localdomain> Message-ID: Anthony Baxter writes: >>>> Thomas Heller wrote >> Anthony, did you get this? > Yep, sorry - I sleep during the night. Hm, sometimes I totally forget the timezones.
> Installed on creosote (along with signature) > > Anthony Thanks, Thomas From tim.one at comcast.net Wed Oct 15 20:33:39 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:14 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <3F8DB69E.2070406@sabaydi.com> Message-ID: [Kevin J. Butler] > BDFL: >> -1000. This is non-negotiable. > > [Barry's blissful demo code snipped] > > +1 > > Just 998 votes to go - nice to have a precise value on BDFL > pronouncements. No voting twice with bigger numbers! ;-) -1. Back to 999. > I think just about everyone gets tripped up by the "sort returns None" > behavior, and though one (e.g., BDFL) can declare that it is a less > significant stumble than not realizing the list is sorted in place, it > is a _continuing_ inconvenience, with virtually every call to [].sort, > even for Python experts (like Barry, not me). People would get in worse (subtler) trouble if it did return self. The trouble they get from it returning None is all of shallow, immediate, easily fixed, and 100% consistent with other builtin container mutating methods (dict.update, dict.clear, list.remove, list.append, list.extend, list.insert, list.reverse). That said, since we're having a fire sale on optional sort arguments in 2.4, I wouldn't oppose an optional Boolean argument you could explicit set to have x.sort() return x. For example, >>> [1, 2, 3].sort(happy_guido=False) [1, 2, 3] >>> From tim.one at comcast.net Wed Oct 15 20:46:47 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:25 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <004f01c392c5$bb94a740$e841fea9@oemcomputer> Message-ID: [Raymond Hettinger] > ... > * The key function triggers a DSU step with a wrapper object that > holds the full record but returns only the key for a comparison. > This is fast, memory efficient, and doesn't change the underlying > stability characteristics of the sort. 
(I think this was Neil's idea > -- and it works like a charm.) I see the wrapper object participates in cyclic GC. This adds 12 (32-bit Linux) to 16 (32-bit Windows) gc overhead bytes per wrapper object, more than the # of bytes needed to hold the 2 useful pointers. Since the wrapper objects only live for the life of the sort, I don't think it's important that they participate in cyclic gc. In particular, since the key and value objects being wrapped stay alive for the life of the sort too, no cyclic trash they appear in can become collectible during the sort, and so tracing cycles involving these things can't do any good (it can fritter away time moving the wrapper objects into older generations, but that's not usually "good" ). From tim.one at comcast.net Wed Oct 15 20:59:00 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:34 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310151516.h9FFGp502162@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > Given that the Jython folks had Tim's sort algorithm translated into > Java in half a day, I don't see why we can't require all > implementations to have a stable sort. It's not like you can gain > significant speed over Timsort. I object to any sort that claims to be more stable than its author. Speaking of which, by giving up the so-called stability of 2.3's list.sort(), I can speed sorting of exponentially distributed random floats by nearly 0.017%! That's almost a fiftieth of a percent. Some of the floats in the result are smaller than their left neighbor, but that's only because I had to mutate some of the values, and they're not a lot smaller anyway. 
adopting-the-consensus-view-of-floating-point-will-deliver- many-such-benefits-ly y'rs - tim From tim.one at comcast.net Wed Oct 15 21:13:12 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 04:10:39 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: [Ian Bicking] > When doing DSU sorting, the in-place sorting isn't really a > performance win, is it? You already have to allocate and populate an > entire alternate list with the sort keys, though I suppose you could > have those mini key structs point to the original list. IIUC, Raymond's patch actually (re)uses the original list object to hold (pointers to) the wrapper objects. No additional list is allocated. Since the wrapper objects hold (pointers to) the original objects, it's easy to make the list point back to the original objects at the end. It's better this way than hand-rolled DSU coded in Python, although the same effect *could* be gotten via

class Wrapper:
    def __init__(self, key, obj):
        self.key = key
        self.obj = obj
    def __lt__(a, b):
        return a.key < b.key

for i, obj in enumerate(L):
    L[i] = Wrapper(key(obj), obj)
L.sort()
for i, w in enumerate(L):
    L[i] = w.obj

assuming no exceptions occur along the way. > Anyway, while it's obviously in bad taste to propose .sort change its > return value based on the presence of a key, wouldn't it be good if we > had access to the new sorted list, instead of always clobbering the > original list? Otherwise people's sorted() functions will end up > copying lists unnecessarily. Give it an optional clobber argument -- your own sort function doesn't *have* to copy the list.
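For concreteness, the same hand-rolled DSU can also be written with plain tuples instead of a wrapper class, decorating each item with (key, index, item) so that ties fall back to original input order and the wrapped items themselves are never compared. This is a minimal sketch; the helper name sort_by is hypothetical and not part of the patch under discussion:

```python
def sort_by(lst, key):
    # Decorate: pair each item with its key; the original index breaks
    # ties, so equal keys keep input order and the items themselves are
    # never compared against each other.
    decorated = [(key(item), i, item) for i, item in enumerate(lst)]
    decorated.sort()
    # Undecorate in place, mirroring list.sort()'s in-place semantics.
    lst[:] = [item for _key, _i, item in decorated]

words = ["banana", "Apple", "cherry", "date"]
sort_by(words, key=str.lower)
print(words)  # ['Apple', 'banana', 'cherry', 'date']
```

Like list.sort() itself this mutates the list and returns None; a copying variant would simply build and return the undecorated list instead of assigning to lst[:].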
From aleaxit at yahoo.com Thu Oct 16 04:20:24 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 04:21:16 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> References: <7DC93A93-FF48-11D7-9282-000393C2D67E@colorstudy.com> Message-ID: <200310161020.24161.aleaxit@yahoo.com> On Wednesday 15 October 2003 09:48 pm, Ian Bicking wrote: > On Wednesday, October 15, 2003, at 12:35 PM, Barry Warsaw wrote: > > While we're hacking on [].sort(), how horrible would it be if we > > modified it to return self instead of None? I don't mind the > > sort-in-place behavior, but it's just so inconvenient that it doesn't > > return anything useful. I know it would be better if it returned a new > > list, but practicality beats purity. > > When doing DSU sorting, the in-place sorting isn't really a performance > win, is it? You already have to allocate and populate an entire > alternate list with the sort keys, though I suppose you could have > those mini key structs point to the original list. I thought the idea being implemented avoided making a new list -- i.e., that the idea being implemented is the equivalent of:

# decorate
for i, item in enumerate(thelist):
    thelist[i] = CleverWrapper((key(item), item))
# sort (with the new stability guarantee)
thelist.sort()
# undecorate
for i, item in enumerate(thelist):
    thelist[i] = item[1]

where (the equivalent of):

class CleverWrapper(tuple):
    def __cmp__(self, other):
        return cmp(self[0], other[0])

so, there is no allocation of another list -- just (twice) a repopulation of the existing one. How _important_ that is to performance, I dunno, but wanted to double-check on my understanding of this anyway. > Okay, really I'm just hoping for [x for x in l sortby key(x)], if not > now then someday -- if only there was a decent way of expressing that > without a keyword...
[...in l : key(x)] is the only thing I can think > of that would be syntactically possible (without introducing a new > keyword, new punctuation, or reusing a wholely inappropriate existing > keyword). Or ";" instead of ":", but neither is very good. Peter Norvig's just-proposed "accumulator" syntax looks quite good to me from this point of view, and superior to the "generator comprehension" alternative (though I think the semantics might perhaps be tweaked, but I'm thinking of writing a separate message about that). IOW, if we can accept that [ ... ] is not necessarily a list, then [SortedBy: key(x) for x in L] would look good to me. (in this case this WOULD be a list, but I think the notation pays for itself only if we can use it more generally). Or maybe SortedBy[key(x) for x in L] -- extending indexing syntax [ ... ] to mean something different if the ... includes a 'for', just like we already extended list display syntax [ ... ] to mean list comprehension in just such a case. Alex From pyth at devel.trillke.net Thu Oct 16 04:35:52 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Thu Oct 16 04:36:11 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: ; from tim.one@comcast.net on Wed, Oct 15, 2003 at 08:33:39PM -0400 References: <3F8DB69E.2070406@sabaydi.com> Message-ID: <20031016103552.H14453@prim.han.de> Tim Peters wrote: > [Kevin J. Butler] > > I think just about everyone gets tripped up by the "sort returns None" > > behavior, and though one (e.g., BDFL) can declare that it is a less > > significant stumble than not realizing the list is sorted in place, it > > is a _continuing_ inconvenience, with virtually every call to [].sort, > > even for Python experts (like Barry, not me). > > People would get in worse (subtler) trouble if it did return self. 
The > trouble they get from it returning None is all of shallow, immediate, easily > fixed, and 100% consistent with other builtin container mutating methods > (dict.update, dict.clear, list.remove, list.append, list.extend, > list.insert, list.reverse). > > That said, since we're having a fire sale on optional sort arguments in 2.4, > I wouldn't oppose an optional Boolean argument you could explicit set to > have x.sort() return x. For example, > > >>> [1, 2, 3].sort(happy_guido=False) > [1, 2, 3] > >>> If anything at all, i'd suggest a std-module which contains e.g. 'sort', 'reverse' and 'extend' functions which always return a new list, so that you could write: for i in reverse(somelist): ... which wouldn't modify the list but return a new one. I don't have a name for such a module, but i have once written a "oneliner" to implement the above methods (working on tuples, strings, lists): http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/119596 (sorry this was in my early days :-) have fun, holger From aleaxit at yahoo.com Thu Oct 16 05:14:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 05:14:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> Message-ID: <200310161114.31192.aleaxit@yahoo.com> On Thursday 16 October 2003 07:16 am, Greg Ewing wrote: > "Phillip J. Eby" : > > IMO, it would really be better to have some kind of generator > > comprehension > > > > Top(10, [yield humor(joke),joke for joke in jokes]) > > I like the *idea* of a generator comprehension, but I'm > not sure I like the [yield ...] syntax. It's a bit > idiomatic looking -- the [] still imply a list, even > though it's not building a list at all. > > Maybe there should be a different kind of bracketing, > e.g. 
> > I think we could extend indexing to mean something different when the [ ] contain a 'for', just like we extended list display to mean something different (list comprehension) when the [ ] contain a 'for'. Syntax such as: Top(10)[ humor(joke) for joke in jokes ] does not suggest a list is _returned_, just like foo[23] doesn't. And I have an idea on semantics (which I intend to post separately) which might let accumulator display syntax work for both "iterator comprehensions" AND "return of ordinary non-iterator" results. Alex From aleaxit at yahoo.com Thu Oct 16 06:00:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 06:00:12 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> Message-ID: <200310161200.04846.aleaxit@yahoo.com> On Thursday 16 October 2003 07:04 am, Peter Norvig wrote: > Yes, you're right -- with generator comprehensions, you can have > short-circuit evaluation via functions on the result, and you can get > at both original element and some function of it, at the cost of > writing f(x), x. So my proposal would be only a small amount of > syntactic sugar over what you can do with generator comprehensions. I _like_ your proposal, particularly in my proposed variant syntax foo[ x*x for x in xs if cond(x) ] vs your original syntax [ foo: x*x for x in xs if cond(x) ] I think the "indexing-like" syntax I proposed solves Greg's objection that your "list display-like" syntax (and similar proposals for iterator comprehensions) misleadingly suggest that a list is the result; an indexing makes no such suggestion, as foo[bar] may just as well be a sequence, an iterator, or anything else whatsoever depending on foo (and perhaps on bar:-). But syntax apart, let's dig a little bit more in the semantics. 
At http://www.norvig.com/pyacc.html you basically propose that the infrastructure for an accumulator display perform the equivalent of:

for x in it:
    if a.add(f(x), x): break
return a.result()

where a, in Ian Bicking's proposal, would be acc.__accum__() (I like this, as it lets us use existing sum, max, etc, as accumulators, by adding suitable methods __accum__ to them). However, this would not let accumulator displays usefully return iterators -- since the entire for loop is done by the infrastructure, the best a could do would be to store all needed intermediates to return an iterator on them as a.result() -- possible memory waste. My idea about this is still half-baked, but I think it's ready to post and get your and others' feedback on. Why not move the for loop, if needed, out of the hard-coded infrastructure and just have accumulator display syntax such as:

acc[x*x for x in it]

be exactly equivalent to:

a = acc.__accum__(lambda x: x*x, iter(it))
return a.result()

i.e., pass the callable corresponding to the expression, and the iterator corresponding to the sequence, to the user-coded accumulator. Decisions would have to be taken regarding what exactly to do when the display contains multiple for, if, and/or control variables, as in acc[f(x,y,z) for x, y in it1 if y>x for z in g(y) if z(x) for x in ] where x can be a tuple of the multiple control variables involved and iterable 'it' already encodes all nested-for's and if's into one "stream" of values (some similar kind of decision will have to be taken for your original suggestion, for iterator comprehensions, and for any other such idea, it seems to me). The advantage of my idea would be to let accumulator display syntax just as easily return iterators.
E.g., with something like:

    class Accum(object):
        def __accum__(cls, exp, it):
            " make __accum__ a classmethod equivalent to calling the class "
            return cls(exp, it)
        __accum__ = classmethod(__accum__)
        def __init__(self, exp, it):
            " factor-out the common case of looping into this base-class "
            for item in it:
                if self.add(exp(item), item):
                    break
        def result(self):
            " let self.add implicitly accumulate into self._result by default "
            return self._result

    class Iter(Accum):
        def __init__(self, exp, it):
            " overriding Accum.__init__ as we don't wanna loop "
            self.exp = exp
            self.it = it
        def result(self):
            " overriding Accum.result with a generator "
            for item in self.it:
                yield self.exp(item)

you could code e.g.

    for y in Iter[ x*x for x in nums if good(x) ]:
        blahblah(y)

as being equivalent to:

    for x in nums:
        if good(x):
            y = x*x
            blahblah(y)

but you could also code, roughly as in your original proposal,

    class Mean(Accum):
        def __init__(self, exp=None, it=()):
            " do self attribute initializations then chain up to base class "
            self.total, self.n = 0, 0
            Accum.__init__(self, exp, it)
        def add(self, value, _ignore):
            " the elementary step is unchanged "
            self.total, self.n = self.total+value, self.n+1
        def result(self):
            " override Accum.result as this is better computed just once "
            return self.total / self.n

to keep the .add method factored out for non-display use (the default empty it argument to __init__ is there for this specific purpose, too), if you wished. Basically, my proposal amounts to a different factoring of accumulator display functionality between Python's hard-coded infrastructure, and functionality to be supplied by the standard library module accum that you already propose.
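(Aside for readers trying this today: Alex's Iter[...] spelling can even be approximated in modern Python via __class_getitem__, provided the generator expression is parenthesized -- a toy sketch that simply hands back the genexp, rather than implementing his full expression-plus-iterator protocol:)

```python
# Minimal sketch: Iter[(expr for ...)] just returns the generator,
# mirroring the lazy behaviour Alex wants from accumulator displays.
class Iter:
    def __class_getitem__(cls, gen):
        # the parenthesized genexp inside [ ] is already the desired iterator
        return gen

nums = [1, 2, 3, 4]
def good(x):
    return x % 2 == 0

squares = Iter[(x * x for x in nums if good(x))]
assert list(squares) == [4, 16]
```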
By having much less in the hard-coded parts -- basically just the identification and passing-on of the proper expression and iterator -- and correspondingly more in the standard library, we gain flexibility because a base class in the library may be more flexibly "overridden", in part or in its entirety (an accumulator doesn't HAVE to inherit from class Accum at all, if it just wants to reimplement both of the __accum__ and result methods on its own). If this slows things down a bit we may perhaps in the future hard-code some special cases, but worrying about it now would feel like premature optimization to me. Alex From anthony at interlink.com.au Thu Oct 16 06:08:01 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Oct 16 06:10:49 2003 Subject: [Python-Dev] server side digest auth support In-Reply-To: Message-ID: <200310161008.h9GA817Z030936@localhost.localdomain> >>> Martin v. Löwis wrote > Can you actually implement it from CGI? How do you get hold of the > WWW-Authenticate header? Hm. You're right - it's been far too long since I used plain old CGI for anything. Wow, it's a really awful interface. Been spoiled by app servers and fastcgi and the like, I guess. Ah well - can at least implement it for the various server-side things and client-side things in the std lib. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From just at letterror.com Thu Oct 16 07:19:06 2003 From: just at letterror.com (Just van Rossum) Date: Thu Oct 16 07:19:09 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <005201c39347$10c03960$e841fea9@oemcomputer> Message-ID: Raymond Hettinger wrote: > If the discussion is wrapped up, I'm ready to commit the patch: > > www.python.org/sf/823292 > > Summary: > > .. Adds keyword arguments: cmp, key, reverse. > .. Stable for any combination of arguments (including reverse). [ ...
] On the sf tracker item you write:

    def sort(self, cmp=None, key=None, reverse=None):
        if cmp is not None and key is not None:
            cmp = cmpwrapper(cmp)
        if key is not None:
            self[:] = [sortwrapper(key(x), x) for x in self]
        if reverse is not None:
            self.reverse()
        self.sort(cmp)
        if key is not None:
            self[:] = [x.getvalue() for x in self]
        if reverse is not None:
            self.reverse()

Is there consensus at all about the necessity of that first reverse call? To me it's not immediately obvious that the reverse option should maintain the _original_ stable order. In my particular application I would actually want reverse to do just that: reverse the result of the sort. Easy enough to work around of course: I could do the reverse myself after the sort. But it does feel odd: sort() now _has_ a reverse feature, but I can't use it... (Also: how does timsort perform when fed a (partially) sorted list compared to a reversed sorted list? If there's a significant difference there, then that first reverse call may actually hurt performance in some cases. Not that I care much about that...) Just From paul-python at svensson.org Thu Oct 16 07:38:33 2003 From: paul-python at svensson.org (Paul Svensson) Date: Thu Oct 16 07:38:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310161114.31192.aleaxit@yahoo.com> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> <200310161114.31192.aleaxit@yahoo.com> Message-ID: <20031016073514.Q41936@familjen.svensson.org> On Thu, 16 Oct 2003, Alex Martelli wrote:

>I think we could extend indexing to mean something different when
>the [ ] contain a 'for', just like we extended list display to mean
>something different (list comprehension) when the [ ] contain a
>'for'. Syntax such as:
>
> Top(10)[ humor(joke) for joke in jokes ]
>
>does not suggest a list is _returned_, just like foo[23] doesn't.

But it does immediately suggest

    iter[humor(joke) for joke in jokes]

as the format for iterator comprehensions. Is that good or bad ?
/Paul From aleaxit at yahoo.com Thu Oct 16 07:56:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 07:56:40 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031016073514.Q41936@familjen.svensson.org> References: <200310160516.h9G5GN328361@oma.cosc.canterbury.ac.nz> <200310161114.31192.aleaxit@yahoo.com> <20031016073514.Q41936@familjen.svensson.org> Message-ID: <200310161356.35452.aleaxit@yahoo.com> On Thursday 16 October 2003 01:38 pm, Paul Svensson wrote:

> On Thu, 16 Oct 2003, Alex Martelli wrote:
> >I think we could extend indexing to mean something different when
> >the [ ] contain a 'for', just like we extended list display to mean
> >something different (list comprehension) when the [ ] contain a
> >'for'. Syntax such as:
> >
> > Top(10)[ humor(joke) for joke in jokes ]
> >
> >does not suggest a list is _returned_, just like foo[23] doesn't.
>
> But it does immediately suggest
>
> iter[humor(joke) for joke in jokes]
>
> as the format for iterator comprehensions.
>
> Is that good or bad ?

Personally I consider it very good, because, in my other message about "accumulator display semantics", I show exactly how to achieve that by generalizing the semantics of these displays (well, I show it for a class Iter, but the built-in iter might perfectly well define an __accum__ special method and achieve exactly the same effect). Alex From barry at python.org Thu Oct 16 07:57:10 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 16 07:57:15 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <1066305429.18702.1.camel@anthem> On Wed, 2003-10-15 at 20:27, Peter Norvig wrote:

> As it turns out, I have a proposed syntax for something I call an
> "accumulation display", and with it I was able to implement and test a
> SortBy in about a minute.
> You can read the whole proposal at http://www.norvig.com/pyacc.html

BTW, now is a great time to start writing those Python 2.4 PEPs .
-Barry From mcherm at mcherm.com Thu Oct 16 08:17:44 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 16 08:17:47 2003 Subject: [Python-Dev] decorate-sort-undecorate Message-ID: <1066306664.3f8e8c687ef61@mcherm.com> Tim writes: > That said, since we're having a fire sale on optional sort arguments in 2.4, > I wouldn't oppose an optional Boolean argument you could explicit set to > have x.sort() return x. For example, I just wanted to call everyone's attention to the fact that Tim may (again... ) have come up with a decent idea. Seriously... Guido (and apparently Tim and I too) insist that aList.sort() must return None since it mutates the list. Meanwhile, Kevin, Barry, and perhaps others want to be able to write aList.sort().reverse().chainMoreHere(). But both sides could probably be happy with: aList.sort(chain=True).reverse() Right? -- Michael Chermside
I read [] used for subscripting as completely different from [] used for list literals and list comprehensions. They just happen to share the same pair of symbols. To me, this confuses the two somewhat. -- Michael Chermside From tim.one at comcast.net Thu Oct 16 09:53:15 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 09:53:18 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <1xtd8ydo.fsf@python.net> Message-ID: >>> I copied them from my system directory, instead of using those of >>> the MSVC 6 SP5 redistributables. >> That's a good choice. > Apparently not - they were XP specific. We may be compounding ambiguity here. By "that" I meant the SP redistributables. By "they" I expect you mean whatever was sitting in your system directory, in which case I switch from saying that's a good choice to that's a rotten choice . From aleaxit at yahoo.com Thu Oct 16 10:02:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 16 10:02:42 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <1066307494.3f8e8fa687e5a@mcherm.com> References: <1066307494.3f8e8fa687e5a@mcherm.com> Message-ID: <200310161602.35561.aleaxit@yahoo.com> On Thursday 16 October 2003 02:31 pm, Michael Chermside wrote: > Alex Martelli writes: > > I think we could extend indexing to mean something different when > > the [ ] contain a 'for', just like we extended list display to mean > > something different (list comprehension) when the [ ] contain a > > 'for'. Syntax such as: > > > > Top(10)[ humor(joke) for joke in jokes ] > > > > does not suggest a list is _returned_, just like foo[23] doesn't. > > I find the syntax a bit confusing. > > Are we subscripting here, or are we juxtaposing one expression > ("Top(10)"), with a list comprehension ("[humor(joke) for joke in jokes]")? 
"Subscripting", just like we would do with, say,

    Top(10)[ humor(joke) and joke in jokes ]

This syntax, too, is a bit confusing -- because we rarely use indexing right on the result of a function call -- but it's perfectly valid Python today. If you dislike the syntax, nothing stops you from writing, today:

    select_top_10 = Top(10)
    select_top_10[ humor(joke) and joke in jokes ]

and similarly nothing will stop you, if something like this accumulator display syntax is approved, from writing in the second statement

    select_top_10[ humor(joke) for joke in jokes ]

and indeed some would consider this other form more readable. I am not proposing any newfangled "juxtaposing" syntax, writing two expressions right one after the other, which would have no precedent in Python; just an extension of the syntax allowed within brackets in indexing syntax (by analogy with that allowed today within brackets in list comprehension / list display syntax) -- for the semantics, see my separate post "accumulator display semantics". (Both of my posts are commentary on the proposal by Peter Norvig for a new accumulator display syntax and semantics: this syntax looks good to me to avoid the objection that Peter's proposed "[foo: x for x in bar]" ``looks like it should be returning a list'' due to the square brackets and the similar objection against the separately proposed iterator-comprehension syntax).

> Not totally unreadable, but it rubs me the wrong way. I read [] used
> for subscripting as completely different from [] used for list literals
> and list comprehensions. They just happen to share the same pair of
> symbols. To me, this confuses the two somewhat.

Not long ago, what could go inside those square brackets was an expression, period -- no matter whether the brackets stood on their own (list display) or followed an expression (indexing/slicing).
Some minor differences, of course, such as empty [ ] being valid only in list display but syntactically invalid in indexing, and slice notation [a:b] being valid in indexing but syntactically invalid in list display; but typical uses such as [a,b,c] overlapping -- with different meanings, of course (indexing X[a,b,c] -> the tuple (a,b,c) used as key; display [a,b,c] -> creating a 3-items list). So, different semantics but very similar syntax. Then list comprehensions were introduced and the syntax admitted inside [ ] got far wider, in "list display" cases only. Why would it be a problem if now the syntax admitted in the "similar syntax, different semantics" case of "indexing" got similarly wider? How would it infringe on the "completely different ... just happen to share the same pair of symbols" (and a lot about the syntax relating to what can go inside those symbols, too) perception, which seems to me to be pretty accurate? Alex From tim.one at comcast.net Thu Oct 16 10:25:54 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 10:25:55 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID:

[Just]
> On the sf tracker item you [Raymond] write:
>
>     def sort(self, cmp=None, key=None, reverse=None):
>         if cmp is not None and key is not None:
>             cmp = cmpwrapper(cmp)
>         if key is not None:
>             self[:] = [sortwrapper(key(x), x) for x in self]
>         if reverse is not None:
>             self.reverse()
>         self.sort(cmp)
>         if key is not None:
>             self[:] = [x.getvalue() for x in self]
>         if reverse is not None:
>             self.reverse()
>
> Is there consensus at all about the necessity of that first reverse
> call? To me it's not immediately obvious that the reverse option
> should maintain the _original_ stable order. In my particular
> application I would actually want reverse to do just that: reverse
> the result of the sort. Easy enough to work around of course: I could
> do the reverse myself after the sort.
> But it does feel odd: sort()
> now _has_ a reverse feature, but I can't use it...

"reverse" here is being used in the sense of "flip the sense of the cmp() result", so that instead of using cmp(x, y), it (conceptually) uses the negation of cmp(x, y). This swaps "less than" with "greater than" outcomes, but leaves "equal" outcomes alone. In that sense, Raymond's is a clever and correct implementation. I don't know that it helps Alex's use case, though (multi-key sort where some keys want ascending and others descending; those are still tricky to write directly in one bite, although the reverse argument makes them easy to do by chaining sorts one key at a time).

> (Also: how does timsort perform when fed a (partially) sorted list
> compared to a reversed sorted list?

I'll need a concrete example to figure out exactly what that's intended to mean. The algorithm is equally happy with descending runs as with ascending runs, although the former need a little time to transform them to ascending runs, and the all-equal case counts as an ascending run.

> If there's a significant difference there, then that first reverse
> call may actually hurt performance in some cases. Not that I care
> much about that...)

Say we're doing [1, 2, 3].sort(reverse=True). Raymond first reverses it:

    [3, 2, 1]

In one pass, using two compares (N-1 for a list of length N), the algorithm recognizes that the whole thing is a single descending run. It then reverses it in one pass (swapping elements starting at both ends and moving toward the middle):

    [1, 2, 3]

and it's done. Raymond then reverses it again:

    [3, 2, 1]

So there are 3 reversals in all. Reversals are cheap, since they just swap pointers in a tight little C loop, and never call back into Python.
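(Aside for readers following along today: the semantics Tim describes -- reverse=True flips the comparison but keeps the sort stable -- can be checked directly in any Python with the 2.4-era key/reverse arguments; the data below is made up purely for illustration:)

```python
# Records with equal keys keep their original relative order even with
# reverse=True, exactly because the list is reversed both before and
# after the stable ascending sort.
pairs = [('b', 1), ('a', 2), ('b', 3), ('a', 4)]
pairs.sort(key=lambda p: p[0], reverse=True)
assert pairs == [('b', 1), ('b', 3), ('a', 2), ('a', 4)]
```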
From python at rcn.com Thu Oct 16 10:26:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 10:27:32 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <200310161020.24161.aleaxit@yahoo.com> Message-ID: <003301c393f1$8aa9efa0$e841fea9@oemcomputer>

[Alex Martelli]
> I thought the idea being implemented avoided making a new list --
> i.e., that the idea being implemented is the equivalent of:
>
>     # decorate
>     for i, item in enumerate(thelist):
>         thelist[i] = CleverWrapper((key(item), item))
>
>     # sort (with the new stability guarantee)
>     thelist.sort()
>
>     # undecorate
>     for i, item in enumerate(thelist):
>         thelist[i] = item[1]
>
> where (the equivalent of):
>
>     class CleverWrapper(tuple):
>         def __cmp__(self, other): return cmp(self[0], other[0])
>
> so, there is no allocation of another list -- just (twice) a repopulation
> of the existing one. How _important_ that is to performance, I dunno,
> but wanted to double-check on my understanding of this anyway.

Yes, that is how it works in a nutshell ;-) Of course, it looks more impressive and was harder to write in C. Raymond Hettinger From pje at telecommunity.com Thu Oct 16 10:50:44 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 16 10:50:42 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <1066307494.3f8e8fa687e5a@mcherm.com> Message-ID: <5.1.0.14.0.20031016104316.01ecbec0@mail.telecommunity.com> At 05:31 AM 10/16/03 -0700, Michael Chermside wrote:

>Alex Martelli writes:
> > I think we could extend indexing to mean something different when
> > the [ ] contain a 'for', just like we extended list display to mean
> > something different (list comprehension) when the [ ] contain a
> > 'for'. Syntax such as:
> >
> > Top(10)[ humor(joke) for joke in jokes ]
> >
> > does not suggest a list is _returned_, just like foo[23] doesn't.
>
>I find the syntax a bit confusing.
>Are we subscripting here, or are we juxtaposing one expression
>("Top(10)"), with a list comprehension ("[humor(joke) for joke in jokes]")?
>
>Not totally unreadable, but it rubs me the wrong way. I read [] used
>for subscripting as completely different from [] used for list literals
>and list comprehensions. They just happen to share the same pair of
>symbols. To me, this confuses the two somewhat.

I have to second the syntax confusion, but for a different reason. This:

    Top(10)[ humor(joke) for joke in jokes ]

Looks to me like some kind of *slice* syntax. I would read this as being roughly equivalent to:

    temp = Top(10)
    [temp[humor(joke)] for joke in jokes ]

Top(10) and all the other accumulators proposed are, IMO, nothing more than transformations of a sequence or iterator. Transformations are what functions are for, and function syntax clearly expresses that the function is being applied to the sequence or iterator, and returning a result. Peter's syntax is too magical, and Alex's implies subscripting that doesn't really exist. Both are misleading to a casual reader of the code. From neal at metaslash.com Thu Oct 16 11:52:12 2003 From: neal at metaslash.com (Neal Norwitz) Date: Thu Oct 16 11:52:21 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.158, 2.159 In-Reply-To: References: Message-ID: <20031016155212.GE30467@epoch.metaslash.com> On Wed, Oct 15, 2003 at 08:41:11PM -0700, rhettinger@users.sourceforge.net wrote:

> Index: listobject.c
> ===================================================================
> + static PyObject *
> + cmpwrapper_call(cmpwrapperobject *co, PyObject *args, PyObject *kwds)
> + {
> +     PyObject *x, *y, *xx, *yy;
> +
> +     if (!PyArg_UnpackTuple(args, "", 2, 2, &x, &y))
> +         return NULL;
> +     if (!PyObject_TypeCheck(x, &sortwrapper_type) ||
> +         !PyObject_TypeCheck(x, &sortwrapper_type)) {

The second line should be checking y, not x?
Neal From ianb at colorstudy.com Thu Oct 16 12:17:35 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Thu Oct 16 12:17:46 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <200310161200.04846.aleaxit@yahoo.com> Message-ID: <40DEE1A8-FFF4-11D7-9282-000393C2D67E@colorstudy.com> On Thursday, October 16, 2003, at 05:00 AM, Alex Martelli wrote: > Why not move the for loop, if needed, out of the hard-coded > infrastructure and just have accumulator display syntax such as: > acc[x*x for x in it] > be exactly equivalent to: > a = acc.__accum__(lambda x: x*x, iter(it)) > return a.result() > i.e., pass the callable corresponding to the expression, and the > iterator corresponding to the sequence, to the user-coded > accumulator. Seems simpler if you could get an iterator for [x*x for x in it] that returned (x*x, x), then call acc.__accum__(that_iter). I suppose for some accumulators you could sometimes avoid calling the expression, but that doesn't seem like a big feature. It seems like it complicates the semantics that you have to turn the list comprehension's expression into a function, where (I imagine) it doesn't get turned into a real function otherwise, but is executed without a new scope. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From aahz at pythoncraft.com Thu Oct 16 12:23:41 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 12:23:45 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <200310161200.04846.aleaxit@yahoo.com> References: <5.1.0.14.0.20031015212950.02951120@mail.telecommunity.com> <200310161200.04846.aleaxit@yahoo.com> Message-ID: <20031016162341.GA7305@panix.com> I'm having a difficult time following this discussion. Would someone please write a PEP once things settle down? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan From tim.one at comcast.net Thu Oct 16 12:50:41 2003 From: tim.one at comcast.net (Tim Peters) Date: Thu Oct 16 12:50:41 2003 Subject: [Python-Dev] decorate-sort-undecorate In-Reply-To: <1066306664.3f8e8c687ef61@mcherm.com> Message-ID: [Michael Chermside] > ... > But both sides could probably be happy with: > > aList.sort(chain=True).reverse() > > Right? Probably not: some people want list.sort() to return a (shallow) copy of the list in sorted order, leaving the original list alone, sometimes. Sometimes not. It's all dead easy already, of course. From python at rcn.com Thu Oct 16 13:09:22 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:10:07 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Objects listobject.c, 2.158, 2.159 In-Reply-To: <20031016155212.GE30467@epoch.metaslash.com> Message-ID: <004501c39408$3eed63a0$e841fea9@oemcomputer> > > + !PyObject_TypeCheck(x, &sortwrapper_type)) { [Neal] > The second line should be checking y, not x? Yes. Will checkin a fix. Raymond From python at rcn.com Thu Oct 16 13:21:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:22:02 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <20031016162341.GA7305@panix.com> Message-ID: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> [Aahz] > I'm having a difficult time following this discussion. Would someone > please write a PEP once things settle down? Peter's link is essentially a PEP already and covers all the essentials: http://www.norvig.com/pyacc.html Still, if his ideas aspire to immortality, he should go the last yard and format it for pephood.
Raymond Hettinger From aahz at pythoncraft.com Thu Oct 16 13:45:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 13:45:29 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> References: <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <20031016174526.GA20332@panix.com> On Thu, Oct 16, 2003, Raymond Hettinger wrote: > [Aahz] >> >> I'm having a difficult time following this discussion. Would someone >> please write a PEP once things settle down? > > Peter's link is essentially a PEP already and covers all the essentials: > > http://www.norvig.com/pyacc.html Gotcha. Didn't realize he'd been summarizing the discussion. Well, I'll hold my opinion on the whole proposal pending a PEP, but I'll make two comments on the proposal as it stands: * I'm strongly opposed to the return idea instead of raising StopAccumulation (which should be a subclass of StopIteration). Using return this way is IMO unPythonic. * If we're using bracket notation, I think accumulators must return a list. I think it would be a Bad Idea to permit other types (although I'm willing for leeway to permit list subclasses). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Thu Oct 16 13:49:57 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 13:50:38 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <1066306664.3f8e8c687ef61@mcherm.com> Message-ID: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> [Tim Peters] > > That said, since we're having a fire sale on optional sort arguments in > 2.4, > > I wouldn't oppose an optional Boolean argument you could explicit set to > > have x.sort() return x. For example, [Michael Chermside] > I just wanted to call everyone's attention to the fact that Tim may > (again... ) have come up with a decent idea. 
>
> Seriously... Guido (and apparently Tim and I too) insist that aList.sort()
> must return None since it mutates the list. Meanwhile, Kevin, Barry, and
> perhaps others want to be able to write
> aList.sort().reverse().chainMoreHere().

Are you proposing something like:

    print mylist.sort(inplace=False)  # prints a new, sorted list while
                                      # leaving the original list intact

which would be implemented something like this:

    def inlinesort(alist, *args, **kwds):
        newref = alist[:]
        newref.sort(*args, **kwds)
        return newref

If that is what you're after, I think it is a good idea. It avoids the perils of mutating methods returning self. It is explicit and pleasing to write:

    for elem in mylist.sort(inplace=False):
        . . .

It is extra nice in a list comprehension:

    peckingorder = [d.name for d in duck.sort(key=seniority, inplace=False)]

Instead of "inplace=False", an alternative is "inline=True". Raymond Hettinger From guido at python.org Thu Oct 16 14:03:26 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 16 14:04:03 2003 Subject: [Python-Dev] inline sort option In-Reply-To: Your message of "Thu, 16 Oct 2003 13:49:57 EDT." <004f01c3940d$ea1c1320$e841fea9@oemcomputer> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> Message-ID: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com>

> Are you proposing something like:
>
>     print mylist.sort(inplace=False)  # prints a new, sorted list while
>                                       # leaving the original list intact
>
> which would be implemented something like this:
>
>     def inlinesort(alist, *args, **kwds):
>         newref = alist[:]
>         newref.sort(*args, **kwds)
>         return newref
>
> If that is what you're after, I think it is a good idea. It avoids the
> perils of mutating methods returning self. It is explicit and pleasing
> to write:
>
>     for elem in mylist.sort(inplace=False):
>         . . .
>
> It is extra nice in a list comprehension:
>
>     peckingorder = [d.name for d in duck.sort(key=seniority,
>                                               inplace=False)]
>
> Instead of "inplace=False", an alternative is "inline=True".

*If* we're going to consider this, I would recommend using a different method name rather than a keyword argument. Arguments whose value changes the return type present a problem for program analysis tools like type checkers (and IMO are also easily overseen by human readers). And, it's easier to write l.sorted() rather than l.sort(inline=True).

--Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Thu Oct 16 14:25:13 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Thu Oct 16 14:25:28 2003 Subject: [Python-Dev] Re: inline sort option References: <1066306664.3f8e8c687ef61@mcherm.com> <004f01c3940d$ea1c1320$e841fea9@oemcomputer> Message-ID: In article <004f01c3940d$ea1c1320$e841fea9@oemcomputer>, "Raymond Hettinger" wrote:

> Are you proposing something like:
>
>     print mylist.sort(inplace=False)  # prints a new, sorted list while
>                                       # leaving the original list intact

What's wrong with writing your own three-line function

    def sort(L):
        copy = list(L)
        copy.sort()
        return copy

then you can do

    print sort(mylist)

etc to your heart's content... -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From pje at telecommunity.com Thu Oct 16 14:26:13 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Oct 16 14:26:02 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <20031016174526.GA20332@panix.com> References: <004c01c39409$eb89d0c0$e841fea9@oemcomputer> <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <5.1.0.14.0.20031016140604.02e53260@mail.telecommunity.com> At 01:45 PM 10/16/03 -0400, Aahz wrote: >On Thu, Oct 16, 2003, Raymond Hettinger wrote: > > [Aahz] > >> > >> I'm having a difficult time following this discussion. Would someone > >> please write a PEP once things settle down? > > > > Peter's link is essentially a PEP already and covers all the essentials: > > > > http://www.norvig.com/pyacc.html > >Gotcha. Didn't realize he'd been summarizing the discussion. Well, >I'll hold my opinion on the whole proposal pending a PEP, but I'll make >two comments on the proposal as it stands: > >* I'm strongly opposed to the return idea instead of raising >StopAccumulation (which should be a subclass of StopIteration). Using >return this way is IMO unPythonic. > >* If we're using bracket notation, I think accumulators must return a >list. I think it would be a Bad Idea to permit other types (although I'm >willing for leeway to permit list subclasses). And while we're writing comments for the "Objections" part of the PEP... :)

* This does nothing functions can't do today (with less magic and greater readability) over any iterable

* If you don't want to allocate memory for the whole list, you can always write an iterator object or generator function -- today, even in Python 2.2.

* If you really want a way to create a generator inline, let's just have a way to create a generator inline in 2.4. And any accumulator functions you previously wrote for 2.2 or 2.3 will "just work" with the new kind of generator.

Note too, that inline generators would have other uses besides accumulation expressions. From bac at OCF.Berkeley.EDU Thu Oct 16 14:29:15 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Oct 16 14:29:30 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: References: <3F8B5ECB.4030207@ocf.berkeley.edu> Message-ID: <3F8EE37B.6070803@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>If you get any message from this document, it should be that *anyone* >>can help Python. > > > It should be what? > "...that *anyone* can help with the development of Python"? -Brett From aahz at pythoncraft.com Thu Oct 16 14:51:57 2003 From: aahz at pythoncraft.com (Aahz) Date: Thu Oct 16 14:52:00 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> Message-ID: <20031016185156.GA3580@panix.com> On Thu, Oct 16, 2003, Guido van Rossum wrote: > > *If* we're going to consider this, I would recommend using a different > method name rather than a keyword argument. Arguments whose value > changes the return type present a problem for program analysis tools > like type checkers (and IMO are also easily overseen by human > readers). And, it's easier to write l.sorted() rather than > l.sort(inline=True). Let's make explicit: l.copysort() I'm not a big fan of grammatical suffixes for distinguishing between similar meanings. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Thu Oct 16 15:18:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 16 15:18:51 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <20031016185156.GA3580@panix.com> Message-ID: <006701c3941a$3c5e9d40$e841fea9@oemcomputer> [Guido van Rossum] > > *If* we're going to consider this, I would recommend using a different > > method name rather than a keyword argument.
Arguments whose value > > changes the return type present a problem for program analysis tools > > like type checkers (and IMO are also easily overseen by human > > readers). And, it's easier to write l.sorted() rather than > > l.sort(inline=True). [Aahz] > Let's make explicit: l.copysort() > > I'm not a big fan of grammatical suffixes for distinguishing between > similar meanings. +1 Raymond From theller at python.net Thu Oct 16 15:38:01 2003 From: theller at python.net (Thomas Heller) Date: Thu Oct 16 15:38:05 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <3F8C3DD0.4020400@v.loewis.de> ( =?iso-8859-1?q?Martin_v._L=F6wis's_message_of?= "Tue, 14 Oct 2003 20:17:52 +0200") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: > Thomas Heller wrote: > >> If I look at the file sizes in the DLLs directory, it seems that at >> least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow >> python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? > > No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). "Martin v. Löwis" writes: > I don't see why it matters, though. Adding modules to pythonxy.dll > does not increase the memory consumption if the modules are not > used. It might decrease the memory consumption in case the modules are > used. So, would a patch be accepted (for 2.4, I assume there is no way for 2.3.3) which made everything builtin except for the following modules: _testcapi - not used outside the testsuite _tkinter - needs external stuff anyway pyexpat - may be replaced by a third party module _ssl - needs Python to be built Thomas From FBatista at uniFON.com.ar Thu Oct 16 15:51:40 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Thu Oct 16 15:52:34 2003 Subject: [Python-Dev] inline sort option Message-ID: #- > > like type checkers (and IMO are also easily overseen by human #- > > readers).
And, it's easier to write l.sorted() rather than #- > > l.sort(inline=True). #- #- [Aahz] #- > Let's make explicit: l.copysort() #- > #- > I'm not a big fan of grammatical suffixes for #- distinguishing between #- > similar meanings. #- #- +1 +2, considering that the difference in behaviour with sort and sorted it's no so clear to a non-english speaker. (my first post to the development list, :D ) . Facundo . -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031016/7f1786a9/attachment.html From shane.holloway at ieee.org Thu Oct 16 16:00:31 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Thu Oct 16 16:01:23 2003 Subject: [Python-Dev] inline sort option In-Reply-To: <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> References: <004f01c3940d$ea1c1320$e841fea9@oemcomputer> <200310161803.h9GI3Q404990@12-236-54-216.client.attbi.com> Message-ID: <3F8EF8DF.3030003@ieee.org> Guido van Rossum wrote: > *If* we're going to consider this, I would recommend using a different > method name rather than a keyword argument. Arguments whose value > changes the return type present a problem for program analysis tools > like type checkers (and IMO are also easily overseen by human > readers). And, it's easier to write l.sorted() rather than > l.sort(inline=True). > > --Guido van Rossum (home page: http://www.python.org/~guido/) I'd like to see that as an inplace sort still -- because copysort is easy to get to... l.sorted() # inplace sort, returning self l[:].sorted() # copy sort, returning new list Just my 1/50th of a dollar. ;) -Shane Holloway From guido at python.org Thu Oct 16 16:19:47 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 16 16:20:27 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Thu, 16 Oct 2003 21:38:01 +0200." References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> > So, would a patch be accepted (for 2.4, I assume there is no way for > 2.3.3) which made everything builtin except for the following modules: > > _testcapi - not used outside the testsuite > _tkinter - needs external stuff anyway > pyexpat - may be replaced by a third party module > _ssl - needs Python to be built I'd rather see an explicit list of the "everything" that you want to bundle into the main DLL. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Thu Oct 16 16:46:01 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 16 16:44:20 2003 Subject: [Python-Dev] accumulator display semantics In-Reply-To: <5.1.0.14.0.20031016140604.02e53260@mail.telecommunity.com> References: <20031016174526.GA20332@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> <20031016162341.GA7305@panix.com> <004c01c39409$eb89d0c0$e841fea9@oemcomputer> Message-ID: <5.2.1.1.0.20031016224531.02804310@pop.bluewin.ch> At 14:26 16.10.2003 -0400, Phillip J. Eby wrote: >* If you really want a way to create a generator inline, let's just have a >way to create a generator inline in 2.4. And any accumulator functions >you previously wrote for 2.2 or 2.3 will "just work" with the new kind of >generator. Note too, that inline generators would have other uses besides >accumulation expressions. agreed. From barry at barrys-emacs.org Thu Oct 16 17:25:51 2003 From: barry at barrys-emacs.org (Barry Scott) Date: Thu Oct 16 17:26:01 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: References: <200310160515.h9G5FUqc025443@localhost.localdomain> Message-ID: <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> You said you are using the SP5 DLLs. They are old... We use the ones from vc6redist.exe from Microsoft; they have fixes that you may need. It's also the versions that you will encounter on XP systems, I believe. So long as you have the version checking done right in the installer you will not rewind a DLL backwards. Barry From martin at v.loewis.de Thu Oct 16 17:37:00 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 16 17:37:14 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: <3F8EE37B.6070803@ocf.berkeley.edu> References: <3F8B5ECB.4030207@ocf.berkeley.edu> <3F8EE37B.6070803@ocf.berkeley.edu> Message-ID: "Brett C."
writes: > >>If you get any message from this document, it should be that *anyone* > >>can help Python. > > It should be what? > > > > "...that *anyone* can help with the development of Python"? Ah, ok. I was expecting something like "it should be clear/obvious/doubtful that anyone can help with Python" Regards, Martin From niemeyer at conectiva.com Thu Oct 16 18:50:59 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Thu Oct 16 18:52:10 2003 Subject: [Python-Dev] SRE recursion Message-ID: <20031016225058.GB19133@ibook.distro.conectiva> Hello folks! I'd like to get back to the SRE recursion issue (#757624). Is this a good time to commit the patch? -- Gustavo Niemeyer http://niemeyer.net From bac at OCF.Berkeley.EDU Thu Oct 16 19:25:16 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 16 19:25:21 2003 Subject: [Python-Dev] Draft of an essay on Python development (and how to help) In-Reply-To: References: <3F8B5ECB.4030207@ocf.berkeley.edu> <3F8EE37B.6070803@ocf.berkeley.edu> Message-ID: <3F8F28DC.1000106@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>>>If you get any message from this document, it should be that *anyone* >>>>can help Python. >>> >>>It should be what? >>> >> >>"...that *anyone* can help with the development of Python"? > > > Ah, ok. I was expecting something like > > "it should be clear/obvious/doubtful that anyone can help with Python" > Could, but I don't want to come off as patronizing. Last thing I want to happen is someone to read that line with "obvious" and then have them feel stupid because it didn't come off as obvious. Even if the person isn't that smart they can still give the PSF money so I want to minimize the chance of insulting a possible sugar-daddy for the PSF.
=) -Brett From niemeyer at conectiva.com Thu Oct 16 19:24:44 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Thu Oct 16 19:25:54 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031016103552.H14453@prim.han.de> References: <3F8DB69E.2070406@sabaydi.com> <20031016103552.H14453@prim.han.de> Message-ID: <20031016232444.GA27936@ibook.distro.conectiva> > If anything at all, i'd suggest a std-module which contains e.g. > 'sort', 'reverse' and 'extend' functions which always return > a new list, so that you could write: > > for i in reverse(somelist): > ... You can do reverse with [::-1] now. -- Gustavo Niemeyer http://niemeyer.net From bac at OCF.Berkeley.EDU Thu Oct 16 19:28:36 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 16 19:28:47 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031016225058.GB19133@ibook.distro.conectiva> References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <3F8F29A4.7060904@ocf.berkeley.edu> Gustavo Niemeyer wrote: > Hello folks! > > I'd like to get back to the SRE recursion issue (#757624). Is this > a good time to commit the patch? > I don't see why not. I assume this is only going into the main trunk. Might as well get it in now if you feel it is ready so that there is that much more time for testing and any possible fixing. -Brett From greg at cosc.canterbury.ac.nz Thu Oct 16 20:07:38 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 16 20:08:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310161602.35561.aleaxit@yahoo.com> Message-ID: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Alex Martelli : > Then list comprehensions were introduced and the syntax admitted > inside [ ] got far wider, in "list display" cases only. Why would it be > a problem if now the syntax admitted in the "similar syntax, different > semantics" case of "indexing" got similarly wider? 
List comprehensions extended the semantics of list construction by providing new ways to specify the contents of the list. Extended slice notation extended the semantics of indexing by providing new ways to specify the index. What you're proposing hijacks the indexing syntax and uses it to mean something completely different from indexing, which is a much bigger change, and potentially a very confusing one. So, no, sorry, it doesn't overcome my objection! Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Fri Oct 17 01:49:38 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 01:52:36 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031016225058.GB19133@ibook.distro.conectiva> References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: Gustavo Niemeyer writes: > I'd like to get back to the SRE recursion issue (#757624). Is this > a good time to commit the patch? It would be good if you could find somebody who reviews the patch. However, if nobody volunteers to review, please go ahead - it might well be that you are the last active SRE maintainer left on this planet ... Regards, Martin From gherron at islandtraining.com Fri Oct 17 02:05:27 2003 From: gherron at islandtraining.com (Gary Herron) Date: Fri Oct 17 02:06:28 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <200310162305.27509.gherron@islandtraining.com> On Thursday 16 October 2003 10:49 pm, Martin v. Löwis wrote: > Gustavo Niemeyer writes: > > I'd like to get back to the SRE recursion issue (#757624). Is this > > a good time to commit the patch? > > It would be good if you could find somebody who reviews the > patch.
However, if nobody volunteers to review, please go ahead - it > might well be that you are the last active SRE maintainer left on this > planet ... I jumped into SRE and wallowed around a bit before the last release, then got swamped with real (i.e., money earning) work. I'd be willing to jump in again if it would help. Gustavo, would you like me to review the patch? Or if you submit it, I'll just get it from cvs and poke around it that way. Gary Herron From whisper at oz.net Fri Oct 17 02:55:31 2003 From: whisper at oz.net (David LeBlanc) Date: Fri Oct 17 02:55:35 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: > > So, would a patch be accepted (for 2.4, I assume there is no way for > > 2.3.3) which made everything builtin except for the following modules: > > > > _testcapi - not used outside the testsuite > > _tkinter - needs external stuff anyway > > pyexpat - may be replaced by a third party module > > _ssl - needs Python to be built > > I'd rather see an explicit list of the "everything" that you want to > bundle into the main DLL. > > --Guido van Rossum (home page: http://www.python.org/~guido/) I have no really good technical reason for this, but it gives me a bad feeling - it's Windows, ok? ;) A few things come to mind: What's the cost of mapping the world (all those entry points) at startup? You have to rebuild all of the main dll just to do something to one component. To me, that's maybe the biggest single issue. Any possibility of new bugs? Are app users/programmers going to have a bloat perception? How many of them really understand that a dll is mapped and not loaded at startup? IMO, it contradicts the unix way of smaller, compartmentalized is better. It's not unix we're talking about, but it still makes sense to me, whatever the OS.
On the plus side, it does make some debugging easier if you're working on extension dlls: fewer sources to have to point Vis Studio at. On a related side note: has anyone done any investigation to determine which small percentage of the extensions account for 99% of the dll loads? Maybe there's no such pattern, but experience suggests there probably is and that subset might be a better candidate than the whole world. Dave LeBlanc Seattle, WA USA From greg at electricrain.com Fri Oct 17 03:49:39 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Fri Oct 17 03:49:43 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> Message-ID: <20031017074939.GG32250@zot.electricrain.com> On Thu, Oct 16, 2003 at 09:38:01PM +0200, Thomas Heller wrote: > > Thomas Heller wrote: > > > >> If I look at the file sizes in the DLLs directory, it seems that at > >> least unicodedata.pyd, _bsddb.pyd, and _ssl.pyd would significantly grow > >> python23.dll. Is unicodedata.pyd used by the encoding/decoding methods? > > > > No, but it is used by SRE, and by unicode methods (.lower, .upper, ...). > > "Martin v. Löwis" writes: > > > I don't see why it matters, though. Adding modules to pythonxy.dll > > does not increase the memory consumption if the modules are not > > used. It might decrease the memory consumption in case the modules are > > used. > > So, would a patch be accepted (for 2.4, I assume there is no way for > 2.3.3) which made everything builtin except for the following modules: > > _testcapi - not used outside the testsuite > _tkinter - needs external stuff anyway > pyexpat - may be replaced by a third party module > _ssl - needs Python to be built > I really don't like the idea of linking _bsddb.pyd statically into the main python DLL (or .so on other OSes). It adds significantly to the size of the python DLL which isn't fair to projects not using BerkeleyDB.
Statically linking any BerkeleyDB version into python on linux (and presumably bsd and un*x) means that attempts to use more recent pybsddb modules with an updated version of the BerkeleyDB library built in don't work properly due to symbol conflicts causing the old library to be used with the new module code. I don't know if this problem applies to windows. I don't see any good reason to want fewer .pyd files and a monolithic main DLL. Greg From aleaxit at yahoo.com Fri Oct 17 03:53:55 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 03:54:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Message-ID: <200310170953.55170.aleaxit@yahoo.com> On Friday 17 October 2003 02:07 am, Greg Ewing wrote: > Alex Martelli : > > Then list comprehensions were introduced and the syntax admitted > > inside [ ] got far wider, in "list display" cases only. Why would it be > > a problem if now the syntax admitted in the "similar syntax, different > > semantics" case of "indexing" got similarly wider? > > List comprehensions extended the semantics of list construction by > providing new ways to specify the contents of the list. > > Extended slice notation extended the semantics of indexing by > providing new ways to specify the index. > > What you're proposing hijacks the indexing syntax and uses it to mean > something completely different from indexing, which is a much bigger > change, and potentially a very confusing one. Hmmm -- on this thread I meant to discuss the syntax only, but, OK, let's touch on the semantics. Let's say, then, that my proposed syntax: foo[x*x for x in blah] gets turned into "extending the semantics of indexing" just like, e.g., extended slicing did. 
That basically requires making this syntax correspond to Python calling: type(foo).__getitem__(foo, <index>) just like it does for other possible contents of the parentheses. E.g., today:

>>> class x(object):
...     def __getitem__(self, index): return index
...
>>> a = x()
>>> print a['tanto':'va':'la', 'gatta':'al':'lardo']
(slice('tanto', 'va', 'la'), slice('gatta', 'al', 'lardo'))
>>>

while hypothetically if this syntax (and corresponding semantics) were adopted, we might have: >>> print a[x*x for x in blaap] Of course, it would be up to a's type x to know what to do with that iterator, just as, today, it is to know what to do with that tuple of slice objects with (e.g.) string attributes. Coding objects that support iterators as indices would be slightly harder than having objects receive such indexing via a separate special method, such as the previously proposed __accum__; but then, this just corresponds to the slight hardship we pay for generality in coding objects that support slices as indices via __getitem__ -- the older and less general approach of having a separate special method, quondam __getslice__, was easier for special cases but not as general and extensible as today's. So, if framing what the subject still calls "accumulator displays" as "new ways to specify the index" -- and renaming the whole concept to e.g. "iterators as indices", since there is no necessary connection of the proposed new syntax and semantics to accumulation -- can ease acceptance, that's fine with me.
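Alex's "iterators as indices" idea can already be mimicked today by passing a list (or any iterable) as the subscript; a minimal sketch, with a hypothetical `Summer` class standing in for the proposed `sum[...]` (the class name and semantics here are illustrative assumptions, not anything in Python):

```python
class Summer(object):
    # hypothetical stand-in for the proposed sum[...] semantics:
    # __getitem__ receives whatever object appears between the brackets,
    # and here simply sums it
    def __getitem__(self, index):
        return sum(index)

s = Summer()
# today the "index" has to be an explicit list (or other iterable);
# the proposal would let a bare comprehension appear between the brackets
print(s[[x * x for x in range(4)]])  # -> 14 (0 + 1 + 4 + 9)
```

This also shows why no new special method is strictly needed: the bracketed expression simply arrives as the single argument to `__getitem__`, just as a tuple of slices does.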
So is the collapsing of the arguments into a single iterator, rather than a separate pair of underlying iterator and exp callable to be applied to each item -- this requires changing the Top(10) use case to pass both sort-key and item explicitly: Top(10)[ (humor(joke), joke) for joke in jokes ] with Top having semantics roughly equivalent to (though no doubt easily optimized -- by using a heap -- wrt):

def Top(N):
    class topper(object):
        def __getitem__(self, iter):
            values_and_items = list(iter)
            values_and_items.sort()
            return [ item for value, item in values_and_items[:N] ]
    return topper()

But this may in fact be preferable wrt both my and Peter Norvig's previous ideas as posted on this thread. > So, no, sorry, it doesn't overcome my objection! What about this latest small change, of having the indexing syntax invoke __getitem__ -- just like any other indexing, just with an iterator as the index rather than (e.g.) a tuple of slices etc? What, if anything, is "very confusing" in, e.g., sum[x*x for x in blaap] compared with e.g. the currently accepted: a['tanto':'va':'la', 'gatta':'al':'lardo'] ? Alex From Paul.Moore at atosorigin.com Fri Oct 17 05:47:07 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Fri Oct 17 05:47:52 2003 Subject: [Python-Dev] buildin vs. shared modules Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> From: Gregory P. Smith [mailto:greg@electricrain.com] > I don't see any good reason to want fewer .pyd files and a > monolithic main DLL. Agreed. The arguments on both sides seem weak, so I'd prefer to leave things as they are. My own (weak) argument against a monolithic DLL is that when packaging a standalone distribution (Installer, py2exe, cx_Freeze or whatever) it reduces the distribution size to omit unneeded DLLs. In particular, _tkinter, pyexpat, _bsddb and _ssl are over 100k each.
Maybe only the DLLs which are necessary for Python to start should be built in (eg, zlib for zipfile support, _sre seems impossible to avoid, others I don't know - _winreg?) But as I said, I see no arguments which aren't weak, so why change? Paul From paoloinvernizzi at dmsware.com Fri Oct 17 07:03:00 2003 From: paoloinvernizzi at dmsware.com (Paolo Invernizzi) Date: Fri Oct 17 08:10:37 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060CDB@UKDCX001.uk.int.atosorigin.com> Message-ID: Moore, Paul wrote: > Maybe only the DLLs which are necessary for Python to start should > be built in (eg, zlib for zipfile support, _sre seems impossible to > avoid, others I don't know - _winreg?) _winreg is only 36k and the most valuable use I think is that it is used by distutils for searching for the VC compiler, but I think it can stay out... But I agree for zlib and _sre. With only the core DLL and a zip of necessary modules (os module stuff and so on) you can start a minimal python and import whatever other zip of modules you need... Python DLL is actually 933k, zlib is 61k and _sre is 57k... so it will be around 1050k... --- Paolo Invernizzi > > But as I said, I see no arguments which aren't weak, so why change? > Paul From arigo at tunes.org Fri Oct 17 08:54:29 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 17 08:58:22 2003 Subject: [Python-Dev] Trashing recursive objects comparison? Message-ID: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> Hello all, I'm bringing (again) the subject of comparison of recursive objects to the table because I just happened to write a buggy piece of code:

class X:
    def __eq__(self, other):
        return self.content == other

This code was buggy because 'self.content' could occasionally be 'self'.
In this case it should have triggered an infinite recursion, and I should have got a nice (if a bit long) RuntimeError traceback that told me where the problem was. At least, this is how I would expect my piece of code to misbehave. Instead, the answer was 'True', whatever 'other' actually was. Puzzlement would have gained me if I had no idea about what a bisimulation, or graph isomorphism, is, and what Python's implementation of that idea is. Quoting Tim on bug #625698: > As Erik's latest example shows, the outcome isn't always > particularly well defined either. An alternative to speeding > this > silliness is to raise an exception instead when recursive > objects are detected. There was some hack value in doing > the graph isomorphism bit, but no real practical value I can > see. If the pretty academic subject of graph isomorphisms is well-worn enough to be sent to the trash, I'll submit a patch that just removes all this code and instead uses the existing sys.recursionlimit counter to catch infinite recursions and throw the usual RuntimeError. Armin From andymac at bullseye.apana.org.au Fri Oct 17 08:33:16 2003 From: andymac at bullseye.apana.org.au (Andrew MacIntyre) Date: Fri Oct 17 09:15:08 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: References: <20031016225058.GB19133@ibook.distro.conectiva> Message-ID: <20031017223015.N64463@bullseye.apana.org.au> On Fri, 17 Oct 2003, Martin v. Löwis wrote: > Gustavo Niemeyer writes: > > > I'd like to get back to the SRE recursion issue (#757624). Is this > > a good time to commit the patch? > > It would be good if you could find somebody who reviews the > patch. However, if nobody volunteers to review, please go ahead - it > might well be that you are the last active SRE maintainer left on this > planet ... Because of the stack recursion issue on FreeBSD (in the presence of threads), I tested several of Gustavo's patches. I didn't scrutinise them for style though...
+1 on getting the patch in early in the 2.4 cycle. -- Andrew I MacIntyre "These thoughts are mine alone..." E-mail: andymac@bullseye.apana.org.au (pref) | Snail: PO Box 370 andymac@pcug.org.au (alt) | Belconnen ACT 2616 Web: http://www.andymac.org/ | Australia From guido at python.org Fri Oct 17 10:41:04 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:41:14 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 09:53:55 +0200." <200310170953.55170.aleaxit@yahoo.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> Message-ID: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> I'd just like to pipe into this discussion saying that while Peter Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main reason being that the proposed notation doesn't return a list. I agree that having generator comprehensions would be a more general solution. I don't have a proposal for generator comprehension syntax though, and [yield ...] has the same problem. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 10:46:31 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:46:38 2003 Subject: [Python-Dev] Trashing recursive objects comparison? In-Reply-To: Your message of "Fri, 17 Oct 2003 13:54:29 BST." <20031017125429.GA25854@vicky.ecs.soton.ac.uk> References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> Message-ID: <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com> > I'm bringing (again) the subject of comparison of recursive objects to the > table because I just happened to write a buggy piece of code: > > class X: > def __eq__(self, other): > return self.content == other > > This code was buggy because 'self.content' could occasionally be 'self'.
In > this case it should have triggered an infinite recursion, and I should have > got a nice (if a bit long) RuntimeError traceback that told me where the > problem was. At least, this is how I would expect my piece of code to > misbehave. > > Instead, the answer was 'True', whatever 'other' actually was. Puzzlement > would have gained me if I had no idea about what a bisimulation, or graph > isomorphism, is, and what Python's implementation of that idea is. > > Quoting Tim on bug #625698: > > As Erik's latest example shows, the outcome isn't always > > particularly well defined either. An alternative to speeding > > this > > silliness is to raise an exception instead when recursive > > objects are detected. There was some hack value in doing > > the graph isomorphism bit, but no real practical value I can > > see. > > If the pretty academic subject of graph isomorphisms is well-worn > enough to be sent to the trash, I'll submit a patch that just > removes all this code and instead use the existing > sys.recursionlimit counter to catch infinite recursions and throw > the usual RuntimeError. Go for it, Armin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 10:56:38 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 10:56:42 2003 Subject: [Python-Dev] sort() return value Message-ID: <200310171456.h9HEuc606316@12-236-54-216.client.attbi.com> I'd like to explain once more why I'm so adamant that sort() shouldn't return 'self'. This comes from a coding style (popular in various other languages, I believe especially Lisp revels in it) where a series of side effects on a single object can be chained like this: x.compress().chop(y).sort(z) which would be the same as x.compress() x.chop(y) x.sort(z) I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. 
The second form makes it clear that each of these calls acts on the same object, and so even if you don't know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else. I'd like to reserve chaining for operations that return new values, like string processing operations: y = x.rstrip("\n").lower().split(":") There are a few standard library modules that encourage chaining of side-effect calls (pstat comes to mind). There shouldn't be any new ones; pstat slipped through my filter when it was weak. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Oct 17 11:54:49 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 11:54:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> At 07:41 AM 10/17/03 -0700, Guido van Rossum wrote: >I'd just like to pipe into this discussion saying that while Peter >Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main >reason that the proposed notation doesn't return a list. I agree that >having generator comprehensions would be a more general solution. I >don't have a proposal for generator comprehension syntax though, and >[yield ...] has the same problem. (yield x*2 for x in foo) or maybe: (yield: x*2 for x in foo) would "yield" better visibility that this is a value that *does* something (like lambda). Or perhaps without the parentheses, but I think they're better for clarity, and I'd add them in practice even if they weren't required.
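For reference, the inline-generator idea sketched here is essentially what Python later adopted in 2.4 as generator expressions (PEP 289): parenthesized, but without the `yield` keyword Phillip suggests:

```python
# a generator expression: creates a lazy generator, computing nothing yet
gen = (x * 2 for x in range(4))
assert list(gen) == [0, 2, 4, 6]

# the parentheses may be dropped when the expression is the sole
# argument of a call -- the accumulator-display use case
assert sum(x * 2 for x in range(4)) == 12
```

As predicted in the thread, accumulator functions written for lists "just work" when handed a generator instead.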
The main problem with a gencomp syntax is that some people are going to use it for everything whether they need it or not, even when they have a small list and the frame overhead for the generator is going to make it slower. So it almost wants to be a really awkward ugly thing in order to discourage them... but then again, that way lies Ruby. :) From pje at telecommunity.com Fri Oct 17 12:03:41 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 12:03:43 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310170953.55170.aleaxit@yahoo.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> Message-ID: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> At 09:53 AM 10/17/03 +0200, Alex Martelli wrote: >What about this latest small change, of having the indexing syntax >invoke __getitem__ -- just like any other indexing, just with an >iterator as the index rather than (e.g.) a tuple of slices etc? > >What, if anything, is "very confusing" in, e.g., > > sum[x*x for x in blaap] > >compared with e.g. the currently accepted: > > a['tanto':'va':'la', 'gatta':'al':'lardo'] > >? Because it's arguably bad coding style to use slices or indexes on an object in order to perform a function on the indexes supplied. Wouldn't you find a program where this held true: TimesTwo[2] == 4 to be in bad style? Function calls are for transforming arguments, indexing is for accessing the contents of a *container*. Top(10) is not a container, it has nothing in it, and neither does TimesTwo. I suppose you could argue that TimesTwo is a conceptual infinite sequence of even integers, but for most of the proposed accumulators, similar arguments would be a *big* stretch. Yes, what you propose is certainly *possible*. But again, if you really needed an iterator as an index, you can right now do: sum[ [x*x for x in blaap] ] And if there are gencomps, you could do the same. 
So, why single out subscripting for special consideration with regard to generator comprehensions, thus forcing clever tricks of questionable style in order to do what ought to be function calls? I shudder to think of trying to have to explain Top(10)[...] to a Python newbie, even if they're an experienced programmer. Because Top(10) isn't a *container*. I suppose a C++ veteran might consider it an ugly operator overloading hack... and they'd be right. Top(10,[...]) on the other hand, is crystal clear to anybody that gets the idea of function calls. From guido at python.org Fri Oct 17 12:10:45 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 12:10:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 11:54:49 EDT." <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> Message-ID: <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> > (yield x*2 for x in foo) > > or maybe: > > (yield: x*2 for x in foo) > > would "yield" better visibility that this is a value that *does* > something (like lambda). Or perhaps without the parentheses, but I > think they're better for clarity, and I'd add them in practice even > if they weren't required. Both look decent to me, and in fact the first is what I was thinking of this morning in the shower. :-) > The main problem with a gencomp syntax is that some people are going > to use it for everything whether they need it or not, even when they > have a small list and the frame overhead for the generator is going > to make it slower. So it almost wants to be a really awkward ugly > thing in order to discourage them... but then again, that way lies > Ruby. :) Actually, that's also Python's philosophy, if you turn it around: only things that can be done efficiently should look cute... 
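Eby's preferred Top(10, [...]) spelling, a function that simply takes the iterable as an argument, is easy to sketch; the name `top` and its use of `heapq.nlargest` are illustrative assumptions, not anything proposed in the thread:

```python
import heapq

def top(n, iterable):
    # Plain function-call spelling -- Top(10, [...]) rather than Top(10)[...]:
    # the iterable is an ordinary argument, no __getitem__ tricks involved.
    return heapq.nlargest(n, iterable)

print(top(3, [5, 1, 9, 7, 3]))    # -> [9, 7, 5]
```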
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Fri Oct 17 12:12:45 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:12:56 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <16272.5373.514560.225999@montanaro.dyndns.org> Greg> What you're proposing hijacks the indexing syntax and uses it to Greg> mean something completely different from indexing, which is a much Greg> bigger change, and potentially a very confusing one. Greg> So, no, sorry, it doesn't overcome my objection! I agree. Any expression bracketed by '[' and ']', no matter how many other clues to the ultimate result it might contain, ought to result in a list as far as I'm concerned. Skip From skip at pobox.com Fri Oct 17 12:13:37 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:13:56 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: <16272.5425.470101.367084@montanaro.dyndns.org> >> If anything at all, i'd suggest a std-module which contains e.g. >> 'sort', 'reverse' and 'extend' functions which always return >> a new list, so that you could write: >> >> for i in reverse(somelist): >> ... Gustavo> You can do reverse with [::-1] now. I don't think that is considered "stable" in the sorting sense. If I sort in descending order vs ascending order, they are not mere reversals of each other. I may well still want adjacent records whose sort keys are identical to remain in the same order. What will the new reverse=True keyword arg do? 
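Skip's stability concern can be made concrete: a stable descending sort keeps records with equal keys in their original order, which a mere reversal of an ascending sort does not. The records below are hypothetical, and `reverse=True` here shows the behavior list.sort/sorted adopted, which preserves stability:

```python
records = [("alice", 2), ("bob", 1), ("carol", 2)]

# A stable descending sort keeps equal keys in their original order:
by_key_desc = sorted(records, key=lambda r: r[1], reverse=True)
# -> [('alice', 2), ('carol', 2), ('bob', 1)]

# Merely reversing an ascending sort flips the order of equal keys:
reversed_asc = list(reversed(sorted(records, key=lambda r: r[1])))
# -> [('carol', 2), ('alice', 2), ('bob', 1)]
```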
Skip From aleaxit at yahoo.com Fri Oct 17 12:21:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 12:21:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.5373.514560.225999@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> Message-ID: <200310171821.39895.aleaxit@yahoo.com> On Friday 17 October 2003 06:12 pm, Skip Montanaro wrote: > Greg> What you're proposing hijacks the indexing syntax and uses it to > Greg> mean something completely different from indexing, which is a > much Greg> bigger change, and potentially a very confusing one. > > Greg> So, no, sorry, it doesn't overcome my objection! > > I agree. Any expression bracketed by '[' and ']', no matter how many other > clues to the ultimate result it might contain, ought to result in a list as > far as I'm concerned. Hmmm, how is, e.g. foo[x*x for x in bar] any more an "expression bracketed by [ and ]" than, say, foo = {'wot': 'tow'} foo['wot'] ...? Yet the latter doesn't involve any lists that I can think of. Nor do I see why the former need "mean something completely different from indexing" -- it means to call foo's __getitem__ with the appropriately constructed object, just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] today calls it with a tuple of two weird slice objects (and doesn't happen to involve any lists whatsoever). Alex From skip at pobox.com Fri Oct 17 12:38:07 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 12:38:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171821.39895.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> Message-ID: <16272.6895.233187.510629@montanaro.dyndns.org> >> I agree. Any expression bracketed by '[' and ']', no matter how many >> other clues to the ultimate result it might contain, ought to result >> in a list as far as I'm concerned. Alex> Hmmm, how is, e.g. 
Alex> foo[x*x for x in bar] Alex> any more an "expression bracketed by [ and ]" than, say, Alex> foo = {'wot': 'tow'} Alex> foo['wot'] Alex> ...? When I said "expression bracketed by '[' and ']'" I agree I was thinking of list construction sorts of things like: foo = ['wot'] not indexing sorts of things like: foo['wot'] I'm not in a mood to try and explain anything in more precise terms this morning (for other reasons, it's been a piss poor day so far) and must trust your ability to infer my meaning. I have no idea at this point how to interpret foo[x*x for x in bar] That looks like a syntax error to me. You have what is probably an identifier followed by a list comprehension. Here's a slightly more precise term: If a '['...']' construct exists in a context where a list constructor would be legal today, it ought to evaluate to a list, not to something else. Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... I have absolutely no idea how to interpret this. Is this existing or proposed Python syntax? Skip From seandavidross at hotmail.com Fri Oct 17 12:43:31 2003 From: seandavidross at hotmail.com (Sean Ross) Date: Fri Oct 17 12:43:36 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: Hi. I've not posted to this group before, but I've been following most of the discussions on it with interest for about 6 months. Yesterday I saw this post: Guido van Rossum wrote: >I don't have a proposal for generator comprehension syntax though, and >[yield ...] has the same problem. I actually like the [yield ...] syntax, (I find the intent clear, and have no expectations of its returning a list) but since that doesn't look like it will be happening, I've tried to think of some other possible syntax. I've come up with 16 different possibilities so far, including [yield ...], which I've listed below.
I'm not advocating any one of them (in fact, many of them are abhorrent), I'm just listing some possibilities, in no particular order other than as they occurred to me:

# (1) no delimiter
sumofsquares = sum(yield x*x for x in myList)

# (2) brackets
sumofsquares = sum([yield x*x for x in myList])

# (3) parentheses
sumofsquares = sum((yield x*x for x in myList))

# (4) braces
sumofsquares = sum({yield x*x for x in myList})

# (5) pipes
sumofsquares = sum(|yield x*x for x in myList|)

# (6) slashes
sumofsquares = sum(/yield x*x for x in myList/)

# (7) carets
sumofsquares = sum(^yield x*x for x in myList^)

# (8) angle brackets
sumofsquares = sum(<yield x*x for x in myList>)

# (9) sigil @
sumofsquares = sum(@yield x*x for x in myList@)

# (10) sigil $
sumofsquares = sum($yield x*x for x in myList$)

# (11) question marks
sumofsquares = sum(?yield x*x for x in myList?)

# (12) ellipses
sumofsquares = sum(...yield x*x for x in myList...)

# (13) yield:
sumofsquares = sum(yield:[x*x for x in myList])

# (14) unpacking (*)
sumofsquares = sum(*[x*x for x in myList])

# (15) <-
sumofsquares = sum(<-[x*x for x in myList])

# (16) ^
sumofsquares = sum(^[x*x for x in myList])

These last few suggestions (from (13) on) may require some explanation. The notion I've had for "yield:" is to have it act something like a lambda so that the list comprehension is not evaluated, i.e., no list is constructed in memory. Instead, an iterator is created that can be used to generate the items, one at a time, that would have been in that list. Something like

def squares(myList):
    for x in myList:
        yield x*x

sumofsquares = sum(squares(myList))

The other suggestions, after (13), are based on this same notion. Okay. So, there are some generator comprehension syntax ideas. Hopefully they will be useful, even if they just serve as items to point to and say "we definitely don't want this". I thank you for your time, and I apologize if these unsolicited suggestions are unwanted.
Sean Ross From aleaxit at yahoo.com Fri Oct 17 12:52:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 12:52:39 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: <200310171852.34515.aleaxit@yahoo.com> On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: ... > Because it's arguably bad coding style to use slices or indexes on an > object in order to perform a function on the indexes supplied. Wouldn't > you find a program where this held true: > > TimesTwo[2] == 4 > > to be in bad style? Function calls are for transforming arguments, > indexing is for accessing the contents of a *container*. Top(10) is not a Yes, I would find _gratuitous_ use of indexing where other means are perfectly adequate to be in bad style. On the other hand, where Python 'wants' me to use indexing for other purposes, I already do: >>> class Eval: ... def __getitem__(self, expr): return eval(expr) ... >>> print '2 + 2 is %(2 + 2)s' % Eval() 2 + 2 is 4 and given we don't have a better way to "interpolate expressions in strings", I don't feel particularly troubled by this, either. > container, it has nothing in it, and neither does TimesTwo. I suppose you > could argue that TimesTwo is a conceptual infinite sequence of even > integers, but for most of the proposed accumulators, similar arguments > would be a *big* stretch. Yes; any pure function is mathematically a mapping, but arguing for general confusion on that score between indexing and function calls would be stretchy indeed, I agree.
Before we had iterators and generators, I did use "indexing as pure function call" to get infinite sequences for use in for loops (to be terminated by break or return when appropriate), but I'm much happier with iterators for this purpose (they keep state, so, having dealt with some prefix of a sequence in a for loop, I still have the sequence's tail intact for possible future processing -- that's often VERY useful to me! -- AND it's often SO much easier to compute "the next item" than it is to compute "the i-th item" for an arbitrary natural i). > Yes, what you propose is certainly *possible*. But again, if you really > needed an iterator as an index, you can right now do: > > sum[ [x*x for x in blaap] ] Actually, I need to use parentheses on the outside and brackets only on the inside -- I assume that's what you meant, of course. If the iterator is finite, and memory consumption not an issue, sure. An infinite iterator would in any case not be suitable for sum (but it _might_ be suitable for other uses, of course). I truly dislike the way foo([...]) _looks_, with those ([ and ]) pairs, but, oh well, not _every_ frequently used construct can look nice, after all. > And if there are gencomps, you could do the same. So, why single out > subscripting for special consideration with regard to generator > comprehensions, thus forcing clever tricks of questionable style in order > to do what ought to be function calls? I guess I let my dislike of ([ ... ]) get away with me:-). If gencomps use your proposed syntax, I'll have no problem whatsoever coding sum((yield: x*x for x in blaap)) particularly since the (( ... )) don't look at all bad;-). 
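The "tail intact" property described above is easy to demonstrate (illustrative values):

```python
it = iter(range(10))

for x in it:
    if x >= 3:     # stop once we've seen 3
        break

# The loop consumed only 0..3; the iterator still holds the rest:
tail = list(it)
print(tail)        # -> [4, 5, 6, 7, 8, 9]
```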
Seriously, what I'm after is the functionality: since the _syntax_ seemed to be the stumbling block, I thought of an alternative syntax that seemed fine to me (not any more of a stretch of the concept of indexing than the Eval class above, or the not-so-long-ago use of __getitem__ to get infinite sequences in for loops) and proposed it. If your (yield: ...) syntax is approved instead, I'll be first in line to cheer:-). > I shudder to think of trying to have to explain Top(10)[...] to a Python > newbie, even if they're an experienced programmer. Because Top(10) isn't a > *container*. I suppose a C++ veteran might consider it an ugly operator > overloading hack... and they'd be right. Top(10,[...]) on the other hand, > is crystal clear to anybody that gets the idea of function calls. I agree it's clearer -- a tad less flexible, as you don't get to do separately selector = Top(10) and then somewhere else selector[...] but "oh well", and anyway the issue would be overcome if we had currying (we could be said to have it, but -- I assume you'd consider selector = Top.__get__(10) some kind of abuse, and besides, this 'currying' isn't very general). Alex From nas-python at python.ca Fri Oct 17 12:57:54 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 17 12:56:59 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> References: <200310170953.55170.aleaxit@yahoo.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> Message-ID: <20031017165754.GA22522@mems-exchange.org> On Fri, Oct 17, 2003 at 09:10:45AM -0700, Guido van Rossum wrote: > > (yield x*2 for x in foo) > > > > or maybe: > > > > (yield: x*2 for x in foo) > > Both look decent to me, and in fact the first is what I was thinking > of this morning in the shower. 
:-) So would you write: sum(yield: x*2 for x in foo) or sum((yield: x*2 for x in foo)) At the moment I like the latter better. Neil From aleaxit at yahoo.com Fri Oct 17 13:03:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 13:03:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.6895.233187.510629@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> Message-ID: <200310171903.42578.aleaxit@yahoo.com> On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: ... > Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... > > I have absolutely no idea how to interpret this. Is this existing or > proposed Python syntax? Perfectly valid and current existing Python syntax: >>> class F(object): ... def __getitem__(self, x): return x ... >>> foo=F() >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4))) Not particularly _sensible_, mind you, and I hope nobody's yet written any container that IS to be indexed by such tuples of slices of multifarious nature. But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). Alex From eppstein at ics.uci.edu Fri Oct 17 13:15:10 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 13:15:14 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> Message-ID: In article <16272.6895.233187.510629@montanaro.dyndns.org>, Skip Montanaro wrote: > I'm not in a mood to try and explain anything in more precise terms this > morning (for other reasons, it's been a piss poor day so far) and must trust > your ability to infer my meaning. 
I have no idea at this point how to > interpret > > foo[x*x for x in bar] > > That looks like a syntax error to me. You have a probably identifier > followed by a list comprehension. foo[ anything ] does not look like an identifier followed by a list, it looks like an indexing operation. So I would interpret foo[x*x for x in bar] to equal foo.__getitem__(i) where i is an iterator of x*x for x in bar. In particular if iter.__getitem__ works appropriately, then iter[x*x for x in bar] could be a generator comprehension and iter[1:n] could be an xrange. Similarly sum and max could be given appropriate __getitem__ methods. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From guido at python.org Fri Oct 17 13:15:21 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 13:16:40 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 19:03:42 +0200." <200310171903.42578.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> > But, indexing does stretch quite > far in the current Python syntax and semantics (in Python's > *pragmatics* you're supposed to use it far more restrainedly). Which is why I didn't like the 'sum[x for x in S]' notation much. Let's look for an in-line generator notation instead. I like sum((yield x for x in S)) but perhaps we can make this work: sum(x for x in S) (Somebody posted a whole bunch of alternatives that were mostly picking random delimiters; it didn't look like the right approach.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Oct 17 13:42:59 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 13:43:09 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Thu, 16 Oct 2003 13:19:47 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> So, would a patch be accepted (for 2.4, I assume there is no way for >> 2.3.3) which made everything builtin except for the following modules: >> >> _testcapi - not used outside the testsuite >> _tkinter - needs external stuff anyway >> pyexpat - may be replaced by a third party module >> _ssl - needs Python to be built > > I'd rather see an explicit list of the "everything" that you want to > bundle into the main DLL. Here is the list of Python 2.3 extension modules, in decreasing order of my preference to be converted into a builtin. Needed to start Python - should be builtin: zlib _sre Used by myself every day - would like them to be builtin: _socket _winreg mmap select I'm undecided on these modules, I do not use them now but may in the future - so I'm undecided: _csv winsound datetime bz2 These should remain in separate pyd files for various reasons: _tkinter _bsddb _testcapi pyexpat Don't know what these do, so I cannot really comment: _symtable parser unicodedata And while we're at it, I have looked at sys.builtin_module_names (again, from Python 2.3), and wondered if there aren't too many. I have *never* used any of these (xxsubtype is only a source code example, isn't it): audioop imageop rgbimg xxsubtype and I guess some of these could also be moved out of python.dll (rotor is even deprecated): _hotshot cmath rotor sha md5 xreadlines ---- There may be incompatibilities - that's why I asked about 2.3.3 or 2.4.
The biggest problem would probably be that you would have to download additional sources - zlib is one example. Who cares about the python.dll file getting larger? As Martin explained, this shouldn't increase memory usage, and since zlib and _sre are loaded anyway at Python startup, the startup time should decrease IMO. Let me conclude that I have no pressing need for changing this, but the decision whether an extension module is builtin or in a dll should follow a certain pattern. To reduce the number of files py2exe (or installer) produces, the best way would be to build custom python dlls containing the most popular extensions as builtins. Of course this can be done by everyone owning a C compiler and a text editor. And my own version would certainly include _ctypes ;-) Thomas From pje at telecommunity.com Fri Oct 17 13:53:56 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 13:53:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171852.34515.aleaxit@yahoo.com> References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: >On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: > > Yes, what you propose is certainly *possible*. But again, if you really > > needed an iterator as an index, you can right now do: > > > > sum[ [x*x for x in blaap] ] > >Actually, I need to use parentheses on the outside and brackets only >on the inside -- I assume that's what you meant, of course. No, I meant what I said, which was that if you "really needed an iterator as an *index*" (emphasis added). I suppose I technically should have said, if you really want to provide an *iterable*, since a list is not an iterator. But I figured you'd know what I meant.
:) >I agree it's clearer -- a tad less flexible, as you don't get to do separately > selector = Top(10) >and then somewhere else > selector[...] >but "oh well", and anyway the issue would be overcome if we had currying >(we could be said to have it, but -- I assume you'd consider > selector = Top.__get__(10) >some kind of abuse, and besides, this 'currying' isn't very general). Hmmm... that's a hideously sick hack to perform currying... but I *like* it. :) Not to use inline, of course, I'd wrap it in a 'curry' function. But what a lovely way to *implement* it, under the hood. Of course, I'd actually use 'new.instancemethod', since it would do the same thing for any callable, not just functions. But I never thought of using method objects for providing a currying operation (in the general sense) before, even though I've sometimes used them as part of a framework to pass along extra operators to chained functions. From shane.holloway at ieee.org Fri Oct 17 13:55:53 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 13:56:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <3F902D29.2060109@ieee.org> Guido van Rossum wrote: > but perhaps we can make this work: > > sum(x for x in S) Being able to use generator comprehensions as an expression would be useful. In that case, I assume the following would be possible as well: mygenerator = x for x in S for y in x for x in S: print y return x for x in S Thanks, -Shane Holloway From theller at python.net Fri Oct 17 13:56:48 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 13:56:57 2003 Subject: [Python-Dev] buildin vs.
shared modules In-Reply-To: (David LeBlanc's message of "Thu, 16 Oct 2003 23:55:31 -0700") References: Message-ID: "David LeBlanc" writes: > A few things come to mind: > > What's the cost of mapping the world (all those entry points) at startup? > > You have to rebuild all of the main dll just to do something to one > component. To me, that's maybe the biggest single issue. Hm. How often do you hack the C code of the extension modules included with Python? > Are app users/programmers going to have a bloat perception? How many of them > really understand that a dll is mapped and not loaded at startup? > > IMO, it contradicts the unix way of smaller, compartmentalized is better. > It's not unix we're talking about, but it still makes sense to me, whatever > the OS. Maybe unix solves all this, but on Windows it's called DLL Hell. > On the plus side, it does make some debugging easier if you're working on > extension dlls: fewer sources to have to point Vis Studio at. That's never been a problem for me. It always finds the sources itself, at least for extensions built with distutils (because distutils in debug builds passes absolute pathnames to the compiler). > On a related side note: has anyone done any investigation to determine which > few percentage of the extensions account for 99% of the dll loads? Maybe > there's no such pattern, but experience suggests there probably is and that > subset might be a better candidate than the whole world. That might be. 
Thomas From theller at python.net Fri Oct 17 14:02:21 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 14:02:31 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> (Barry Scott's message of "Thu, 16 Oct 2003 22:25:51 +0100") References: <200310160515.h9G5FUqc025443@localhost.localdomain> <6.0.0.22.0.20031016222201.0221b908@torment.chelsea.private> Message-ID: Barry Scott writes: > You said you are using the SP5 DLLs. They are old... > > We use the ones from vc6redist.exe from microsoft they have fixes that > you may need. Its also the versions that you will encounter on XP > systems I believe. Well, isn't SP5 the latest service pack available for Visual Studio 6.0? I took it from the Oct 2003 MSDN shipment. > So long as you have the version checking done right in the installer > you will not rewind a DLL backwards. The problem in this case was not the installer doing things wrong, the fault was alone on my side: I did use the dlls from my WinXP system directory, and the installer correctly used them to replace the versions on the target computers. If this was a win2k system, the file protection reverted this change, and the users were lucky again (except they had an entry in the event log). Unfortunately win98 and NT4 users were not so happy, for them it broke the system. Thomas From pje at telecommunity.com Fri Oct 17 14:04:52 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 14:04:57 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote: > > But, indexing does stretch quite > > far in the current Python syntax and semantics (in Python's > > *pragmatics* you're supposed to use it far more restrainedly). > >Which is why I didn't like the 'sum[x for x in S]' notation much. >Let's look for an in-line generator notation instead. I like > > sum((yield x for x in S)) > >but perhaps we can make this work: > > sum(x for x in S) Offhand, it seems like the grammar might be rather tricky, but it actually does seem more Pythonic than the "yield" syntax, and it retroactively makes listcomps shorthand for 'list(x for x in s)'. However, if gencomps use this syntax, then what does: for x in y*2 for y in z if y<20: ... mean? ;) It's a little clearer with parentheses, of course, so perhaps they should be required: for x in (y*2 for y in z if y<20): ... It would be more efficient to code that stuff inline in the loop, if the gencomp creates another frame, but it *looks* more efficient to put it in the for statement. But maybe I worry too much, since you could slap a listcomp in a for loop now, and I've never even thought of doing so. From guido at python.org Fri Oct 17 14:04:53 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:06:34 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 19:42:59 +0200." 
References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> > >> So, would a patch be accepted (for 2.4, I assume there is no way for > >> 2.3.3) which made everything builtin except for the following modules: > >> > >> _testcapi - not used outside the testsuite > >> _tkinter - needs external stuff anyway > >> pyexpat - may be replaced by a third party module > >> _ssl - needs Python to be built > > > > I'd rather see an explicit list of the "everything" that you want to > > bundle into the main DLL. > > Here is the list of Python 2.3 extension modules, in decreasing order of > my preference to be converted into a builtin. > > Needed to start Python - should be builtin: > > zlib _sre +1 for _sre. I'd be +1 for zlib, but see bz2 below for a quibble. (How important is this *really* for bootstrap reasons?) > Used by myself every day - would like them to be builtin: > > _socket _winreg mmap select +1 on _winreg and mmap (they're small enough). Long ago, when I first set up the VC5 project, there were still some target systems out there that didn't have a working winsock DLL, and "import socket" or "import select" would fail there for that reason. If this is no longer a problem, I'm +1 on this. > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 I'm -1 on bz2; I think bz2 requires a 3rd party external library; for developers building their own Python who don't want to bother with that, it's much easier to ignore a DLL that can't be built than to have to cut a module out of the core DLL. The same argument applies to zlib -- but I could be swayed by the counterargument that zlib is needed for zipimport bootstrap purposes. (Though is it? you can create zip files without using compression.)
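The parenthetical is correct: the zipfile module can write archives with no compression at all, so an uncompressed bootstrap archive would not need zlib. A minimal sketch (in-memory archive; the file name and contents are illustrative):

```python
import io
import zipfile

buf = io.BytesIO()
# ZIP_STORED stores members uncompressed -- no zlib involved.
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("bootstrap.py", "MAGIC = 42\n")

# Read the member back to show the archive is a valid zip file.
with zipfile.ZipFile(buf) as zf:
    data = zf.read("bootstrap.py")

print(data)    # -> b'MAGIC = 42\n'
```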
> These should remain in separate pyd files for various reasons: > > _tkinter _bsddb _testcapi pyexpat Agreed. > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata _symtable is tiny and can be included. parser is huge but has no external deps; if MvL's argument is correct, the DLL size increase doesn't translate into a memory usage increase, so I'd be +1 on including it; ditto for unicodedata. > And while we're at it, I have looked at sys.builtin_module_names (again, > from Python 2.3), and wondered if there aren't too many. > > I have *never* used any of these (xxsubtype is only a source code > example, isn't it): > > audioop imageop rgbimg xxsubtype They could all be moved out, but why bother? (xxsubtype is just a source code sample module, there's no need to enable it in distributions, but it doesn't hurt anybody either I think!) > and I guess some of these could also be moved out of python.dll (rotor > is even deprecated): > > _hotshot cmath rotor sha md5 xreadlines Ditto. None of these are big. xreadlines should also be deprecated. But let it stay in the DLL until we stop distributing it (again, assuming MvL's argument about memory usage is valid). > ---- > There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. I wouldn't mess with 2.3.3. > The biggest problem would probably be that you would have to download > additional sources - zlib is one example. Right. > Who cares about the python.dll file getting larger? As Martin explained, > this shouldn't increase memory usage, and since zlib and _sre are loaded > anyway at Python startup, the startup time should decrease IMO. Right. > Let me conclude that I have no pressing need for changing this, but the > decision whether an extension module is builtin or in a dll should > follow a certain pattern. "Historical precedent" is a pattern too.
:-) > To reduce the number of files py2exe (or installer) produces the best > way would be to build custom python dlls containing the most popular > extensions as builtins. Of course this can be done by everyone owning a > C compiler and a text editor. And my own version would certainly > include _ctypes ;-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Fri Oct 17 14:06:28 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 17 14:06:40 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz><200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> Message-ID: "Phillip J. Eby" wrote in message news:5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com... > At 09:53 AM 10/17/03 +0200, Alex Martelli wrote: > >What, if anything, is "very confusing" in, e.g., > > sum[x*x for x in blaap] To me, it both *looks* a lot like a Lisp macro ... > >compared with e.g. the currently accepted: > > a['tanto':'va':'la', 'gatta':'al':'lardo'] (this does use ':' and ',', at least) > Because it's arguably bad coding style to use slices or indexes on an > object in order to perform a function on the indexes supplied. and acts like a Lisp macro in plugging code pieces into a template that leads to surprising behavior, given the original form. > I shudder to think of trying to have to explain Top(10)[...] to a Python > newbie, even if they're an experienced programmer. Ditto. Getting the reductive functionality thru a gencomp would be better. Terry J. Reedy From guido at python.org Fri Oct 17 14:08:09 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:08:23 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 19:56:48 +0200." 
References: Message-ID: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> > "David LeBlanc" writes: > > > A few things come to mind: > > > > What's the cost of mapping the world (all those entry points) at startup? > > > > You have to rebuild all of the main dll just to do something to one > > component. To me, that's maybe the biggest single issue. [Thomas Heller] > Hm. How often do you hack the C code of the extension modules included > with Python? There's a small but important group of people who rebuild Python from source with different compiler options (perhaps to enable debugging their own extensions). They often don't want to have to bother with downloading external software that they don't use (like bz2 or bsddb). > > Are app users/programmers going to have a bloat perception? How > > many of them really understand that a dll is mapped and not loaded > > at startup? > > > > IMO, it contradicts the unix way of smaller, compartmentalized is better. > > It's not unix we're talking about, but it still makes sense to me, whatever > > the OS. > > Maybe unix solves all this, but on Windows it's called DLL Hell. It's not DLL hell unless there are version issues. I don't think multiple extension modules contribute to that (they aren't in the general Windows DLL search path anyway, only pythonXY.dll is, for the benefit of Mark Hammond's COM support in win32all). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 14:09:03 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:09:10 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 11:55:53 MDT."
<3F902D29.2060109@ieee.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> Message-ID: <200310171809.h9HI93U06856@12-236-54-216.client.attbi.com> > Guido van Rossum wrote: > > but perhaps we can make this work: > > > > sum(x for x in S) [Shane Holloway] > Being able to use generator comprehensions as an expression would be > useful. In that case, I assume the following would be possible as well: > > mygenerator = x for x in S > > for y in x for x in S: > print y > > return x for x in S You'd probably have to add extra parentheses around (x for x in S) to help the poor parser (and the human reader). --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Fri Oct 17 14:17:50 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Oct 17 14:17:56 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org><200310171821.39895.aleaxit@yahoo.com><16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: "Alex Martelli" wrote in message news:200310171903.42578.aleaxit@yahoo.com... > On Friday 17 October 2003 06:38 pm, Skip Montanaro wrote: > ... > > Alex> ... just as e.g. foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] ... > > > > I have absolutely no idea how to interpret this. Is this existing or > > proposed Python syntax? > > Perfectly valid and current existing Python syntax: > > >>> class F(object): > ... def __getitem__(self, x): return x > ... > >>> foo=F() > >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ] > (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4))) > > Not particularly _sensible_, mind you, and I hope nobody's yet
But, indexing does stretch quite far in the current Python syntax and semantics (in Python's *pragmatics* you're supposed to use it far more restrainedly). In your commercial programming group, would you accept such a slice usage from another programmer, especially without prior agreement of the group? Or would you want to edit, as you would with 'return x (Guido van Rossum's message of "Fri, 17 Oct 2003 11:04:53 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> Message-ID: <4qy7lnuc.fsf@python.net> >> Needed to start Python - should be builtin: >> >> zlib _sre > > +1 for _sre. > > I'd be +1 for zlib, but see bz2 below for a quibble. (How important > is this *really* for bootstrap reasons?) > [...] > I'm -1 on bz2; I think bz2 requires a 3rd party external library; for > developers building their own Python who don't want to bother with > that, it's much easier to ignore a DLL that can't be built than to > have to cut a module out of the core DLL. > > The same argument applies to zlib -- but I could be swayed by the > counterargument that zlib is needed for zipimport bootstrap purposes. > (Though is it? you can create zip files without using compression.) No, it has nothing to do with zipimport's bootstrap. When zlib is available, you can import from compressed zipfiles, when it's not available, you cannot. (Hopefully Just corrects me if I'm wrong) Of course, uncompressed zipfiles would always work - and they may be preferred because they might be even faster. > Long ago, when I first set up the VC5 project, there were still some > target systems out there that didn't have a working winsock DLL, and > "import socket" or "import select" would fail there for that reason. > If this is no longer a problem, I'm +1 on this. Not on the systems that I work on.
To be double sure, _socket could be rewritten to load the winsock dll dynamically. And maybe this becomes an issue again if IPv6 is compiled in. >> There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. > > I wouldn't mess with 2.3.3. Ok. Thomas From arigo at tunes.org Fri Oct 17 14:28:11 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 17 14:32:05 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F902D29.2060109@ieee.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> Message-ID: <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Hello, On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > mygenerator = x for x in S > > for y in x for x in S: > print y > > return x for x in S Interesting but potentially confusing: we could expect the last one to mean that we are executing 'return' repeatedly, i.e. returning a value more than once, which is not what occurs. Similarly, yield x for x in g() in a generator would be quite close to the syntax discussed some time ago to yield all the values yielded by a sub-generator g, but in your proposal it wouldn't have that meaning: it would only yield a single object, which happens to be an iterator with the same elements as g().
Even with parentheses, and assuming a syntax to yield from a sub-generator for performance reasons, the two syntaxes would be dangerously close: yield x for x in g() # means for x in g(): yield x yield (x for x in g()) # means yield g() Armin From barry at python.org Fri Oct 17 14:35:47 2003 From: barry at python.org (Barry Warsaw) Date: Fri Oct 17 14:35:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031017165754.GA22522@mems-exchange.org> References: <200310170953.55170.aleaxit@yahoo.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <200310170953.55170.aleaxit@yahoo.com> <5.1.0.14.0.20031017114814.01ef44b0@mail.telecommunity.com> <200310171610.h9HGAj606439@12-236-54-216.client.attbi.com> <20031017165754.GA22522@mems-exchange.org> Message-ID: <1066415746.18702.131.camel@anthem> On Fri, 2003-10-17 at 12:57, Neil Schemenauer wrote: > sum((yield: x*2 for x in foo)) +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031017/bbb5c200/attachment.bin From shane.holloway at ieee.org Fri Oct 17 14:35:41 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 14:36:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <3F90367D.200@ieee.org> Thomas Heller wrote: > Here is the list of Python 2.3 extension modules, in decreasing order of > my preference to be converted into a builtin. > > Needed to start Python - should be builtin: > > zlib _sre +1 -- would not these speed startup time?
> Used by myself everyday - would like them to be builtin: > > _socket _winreg mmap select +0 -- I use them a lot, but the overhead importing them is definitely acceptable to me. > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 -0 -- Useful modules, but not on my everyday use list. > These should remain in separate pyd files for various reasons: > > _tkinter _bsddb _testcapi pyexpat Definitely agreed. :) > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata Neither do I. Although unicodedata is fairly big. > And while we're at it, I have looked at sys.builtin_module_names (again, > from Python 2.3), and wondered if there aren't too many. > > I have *never* used any of these (xxsubtype is only a source code > example, isn't it): > > audioop imageop rgbimg xxsubtype +1 -- I agree that these would not suffer too badly from being external pyds either. > and I guess some of these could also be moved out of python.dll (rotor > is even deprecated): > > _hotshot cmath rotor sha md5 xreadlines +1 for _hotshot, rotor, and xreadlines -- External would be good. -0 for sha, md5 -- I like these the way they are, but I see your point. -1 for cmath -- complex types are part of the language, and should be builtin, IMO. > ---- > There may be incompatibilities - that's why I asked about 2.3.3 or 2.4. -1 for 2.3.3 or any point release in 2.3 +1 for 2.4 > The biggest problem would probably be that you would have to download > additional sources - zlib is one example. > > Who cares about the python.dll file getting larger? As Martin explained, > this shouldn't increase memory usage, and since zlib and _sre are loaded > anyway at Python startup, the startup time should decrease IMO. Small is beautiful. Fast is good. I don't like the idea of statically linking pyds into python simply because we can. Nor does reducing the number of external files for packagers like py2exe justify it.
I know that pain too, but I don't want python to suffer from too much bloat for that reason. > Let me conclude that I have no pressing need for changing this, but the > decision whether an extension module is builtin or in a dll should > follow a certain pattern. > > To reduce the number of files py2exe (or installer) produces the best > way would be to build custom python dlls containing the most popular > extensions as builtins. Of course this can be done by everyone owning a > C compiler and a text editor. And my own version would certainly > include _ctypes ;-) > > Thomas I love ctypes :) It saves me from doing hard work ;) Thanks for reading :) -Shane Holloway From guido at python.org Fri Oct 17 14:40:53 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:41:06 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 20:29:31 +0200." <4qy7lnuc.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> <4qy7lnuc.fsf@python.net> Message-ID: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> > > The same argument applies to zlib -- but I could be swayed by the > > counterargument that zlib is needed for zipimport bootstrap purposes. > > (Though is it? you can create zip files without using compression.) > > No, it has nothing to do with zipimport's bootstrap. When zlib is > available, you can import from compressed zipfiles, when it's not > available, you cannot. (Hopefully Just corrects me if I'm wrong) > > Of course, uncompressed zipfiles would always work - and they may be > preferred because they might be even faster. Right. Compression should be used to save network bandwidth, but in general, these days, files on disk should be uncompressed.
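Thomas's claim - that zipimport itself does not depend on zlib, only compressed entries do - is easy to check in any later Python, since entries written with ZIP_STORED are read back without any decompression. A minimal self-contained sketch (the module name and greeting string are invented for the demo):

```python
import os
import sys
import tempfile
import zipfile

# Build a zip archive containing a tiny module. ZIP_STORED means the
# entry is left uncompressed, so importing it never touches zlib.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "lib.zip")
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.py", "GREETING = 'hello from a stored zip'\n")

# zipimport handles zip files that appear on sys.path transparently.
sys.path.insert(0, archive)
import hello

assert hello.GREETING == 'hello from a stored zip'
```

This also illustrates Guido's point above: a stored archive still gives you single-file packaging, and compression can be reserved for transport.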
> > Long ago, when I first set up the VC5 project, there were still some > > target systems out there that didn't have a working winsock DLL, and > > "import socket" or "import select" would fail there for that reason. > > If this is no longer a problem, I'm +1 on this. > > Not on the sytems that I work on. To be double sure, _socket could be > rewritten to load the winsock dll dynamically. And maybe this becomes > an issue again if IPv6 is compiled in. I'd rather not have more Windows-specific cruft in the socket and select module source code -- they are bad enough already. Dynamically loading winsock probably would mean that ever call into it has to be coded differently, right? --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Fri Oct 17 14:42:42 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 14:42:51 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Fri, 17 Oct 2003 11:08:09 -0700") References: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> "David LeBlanc" writes: >> >> > A few things come to mind: >> > >> > What's the cost of mapping the world (all those entry points) at startup? >> > >> > You have to rebuild all of the main dll just to do something to one >> > component. To me, that's maybe the biggest single issue. > > [Thomas Heller] >> Hm. How often do you hack the C code of the extension modules included >> with Python? > > There's a small but important group of people who rebuild Python from > source with different compiler options (perhaps to enable debugging > their own extensions). They often don't want to have to bother with > downloading external software that they don't use (like bz2 or bsddb). Well, couldn't there be a mechanism which allows to switch easily between builtin/external? 
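For Unix builds such a mechanism already exists: CPython's Modules/Setup file lets you flip a module between static and shared by moving its line relative to the *shared* marker. A sketch of such a fragment (module lines and linker flags are illustrative and vary by platform; Windows builds instead configure this in the MSVC project files):

```text
# Modules/Setup fragment (illustrative).
# Lines before *shared* are linked statically into the interpreter;
# lines after it are built as separate shared objects.
zlib zlibmodule.c -lz

*shared*
_testcapi _testcapimodule.c
```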
>> > Are app users/programmers going to have a bloat perception? How >> > many of them really understand that a dll is mapped and not loaded >> > at startup? >> > >> > IMO, it contradicts the unix way of smaller, compartmentalized is better. >> > It's not unix we're talking about, but it still makes sense to me, whatever >> > the OS. >> >> Maybe unix solves all this, but on Windows it's called DLL Hell. > > It's not DLL hell unless there are version issues. > I don't think multiple extension modules contribute to that (they > aern't in the general Windows DLL search path anyway, only > pythonXY.dll is, for the benefit of Mark Hammond's COM support in > win32all). I tried to be funny but obviously failed ;-) Although it smells a little bit like DLL hell. Thomas From guido at python.org Fri Oct 17 14:46:34 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:47:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 19:28:11 BST." <20031017182811.GA28889@vicky.ecs.soton.ac.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Message-ID: <200310171846.h9HIkYY06961@12-236-54-216.client.attbi.com> > On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > > mygenerator = x for x in S > > > > for y in x for x in S: > > print y > > > > return x for x in S > > Interesting but potentially confusing: we could expect the last one > to mean that we executing 'return' repeatedly, i.e. returning a > value more than once, which is not what occurs. 
I'm not sure what you mean by executing 'return' repeatedly; the closest thing in Python is returning a sequence, and this is pretty close (for many practical purposes, returning an iterator is just as good as returning a sequence). > Similarily, > > yield x for x in g() > > in a generator would be quite close to the syntax discussed some > time ago to yield all the values yielded by a sub-generator g, but > in your proposal it wouldn't have that meaning: it would only yield > a single object, which happens to be an iterator with the same > elements as g(). IMO this is not at all similar to what it suggests for return, as executing 'yield' multiple times *is* a defined thing. This is why I'd prefer to require extra parentheses; yield (x for x in g()) is pretty clear about how many times yield is executed. > Even with parenthesis, and assuming a syntax to yield from a > sub-generator for performance reason, the two syntaxes would be > dangerously close: > > yield x for x in g() # means for x in g(): yield x > yield (x for x in g()) # means yield g() I don't see why we need yield x for x in g() when we can already write for x in g(): yield x This would be a clear case of "more than one way to do it". --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 14:47:42 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 14:48:02 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Fri, 17 Oct 2003 20:42:42 +0200." References: <200310171808.h9HI89F06844@12-236-54-216.client.attbi.com> Message-ID: <200310171847.h9HIlgL06985@12-236-54-216.client.attbi.com> > > There's a small but important group of people who rebuild Python from > > source with different compiler options (perhaps to enable debugging > > their own extensions). They often don't want to have to bother with > > downloading external software that they don't use (like bz2 or bsddb). 
> > Well, couldn't there be a mechanism which allows one to switch easily > between builtin/external? Of course there *could*, but why bother? What we have works just as well IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From shane.holloway at ieee.org Fri Oct 17 14:50:34 2003 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Fri Oct 17 14:51:20 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <20031017182811.GA28889@vicky.ecs.soton.ac.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <3F902D29.2060109@ieee.org> <20031017182811.GA28889@vicky.ecs.soton.ac.uk> Message-ID: <3F9039FA.8070608@ieee.org> Armin Rigo wrote: > Hello, > > On Fri, Oct 17, 2003 at 11:55:53AM -0600, Shane Holloway (IEEE) wrote: > >> mygenerator = x for x in S >> >> for y in x for x in S: >> print y >> >> return x for x in S > > > Interesting but potentially confusing: we could expect the last one to mean > that we are executing 'return' repeatedly, i.e. returning a value more than once, > which is not what occurs. Similarly, > > yield x for x in g() > > in a generator would be quite close to the syntax discussed some time ago to > yield all the values yielded by a sub-generator g, but in your proposal it > wouldn't have that meaning: it would only yield a single object, which happens > to be an iterator with the same elements as g(). Yes, this is one of the things I was trying to get at -- If gencomps are expressions, then they must be expressions everywhere, or my poor brain will explode. As for the subgenerator "unrolling", I think there has to be something added to the yield statement to accomplish it -- because it is also useful to yield a generator itself and not have it unrolled.
My favorite was "yield *S" for that discussion... -Shane Holloway From skip at pobox.com Fri Oct 17 14:57:46 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 14:57:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <16272.15274.781344.230479@montanaro.dyndns.org> >> But, indexing does stretch quite far in the current Python syntax and >> semantics (in Python's *pragmatics* you're supposed to use it far >> more restrainedly). Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. Guido> Let's look for an in-line generator notation instead. I like Guido> sum((yield x for x in S)) Guido> but perhaps we can make this work: Guido> sum(x for x in S) Forgive my extreme density on this matter, but I don't understand what (yield x for x in S) is supposed to do. Is it supposed to return a generator function which I can assign to a variable (or pass to the builtin function sum() as in your example) and call later, or is it supposed to turn the current function into a generator function (so that each executed yield statement returns a value to the caller of the current function)? Assuming the result is a generator function (a first class object I can assign to a variable then call later), is there some reason the current function notation is inadequate? This seems to me to suffer the same expressive shortcomings as lambda. Lambda seems to be hanging on by the hair on its chinny chin chin. Why is this construct gaining traction? If you don't like lambda, I can't quite see why this syntax is all that appealing. OTOH, if lambda x: x+1 is okay, then why not: yield: x for x in S ?
Skip From niemeyer at conectiva.com Fri Oct 17 14:39:16 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Oct 17 15:04:28 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <16272.5425.470101.367084@montanaro.dyndns.org> References: <16272.5425.470101.367084@montanaro.dyndns.org> Message-ID: <20031017183915.GA29652@ibook.distro.conectiva> > >> If anything at all, i'd suggest a std-module which contains e.g. > >> 'sort', 'reverse' and 'extend' functions which always return > >> a new list, so that you could write: > >> > >> for i in reverse(somelist): > >> ... > > Gustavo> You can do reverse with [::-1] now. > > I don't think that is considered "stable" in the sorting sense. If I > sort in descending order vs ascending order, they are not mere > reversals of each other. I may well still want adjacent records whose > sort keys are identical to remain in the same order. > > What will the new reverse=True keyword arg do? Erm.. what are you talking about!? :-) I was just saying that his reverse(...) method is completely equivalent to [::-1] now, so it could safely be implemented as: reverse = lambda x: x[::-1] I wasn't trying to mention anything about sort nor keyword arguments (perhaps I just wasn't the real target of the message!?). -- Gustavo Niemeyer http://niemeyer.net From pje at telecommunity.com Fri Oct 17 15:20:31 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 15:20:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017151235.034fad20@mail.telecommunity.com> At 01:57 PM 10/17/03 -0500, Skip Montanaro wrote: > >> But, indexing does stretch quite far in the current Python syntax and > >> semantics (in Python's *pragmatics* you're supposed to use it far > >> more restrainedly). > > Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. > Guido> Let's look for an in-line generator notation instead. I like > > Guido> sum((yield x for x in S)) > > Guido> but perhaps we can make this work: > > Guido> sum(x for x in S) > >Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > >is supposed to do. Is it supposed to return a generator function which I >can assign to a variable (or pass to the builtin function sum() as in your >example) and call later, or is it supposed to turn the current function into >a generator function (so that each executed yield statement returns a value >to the caller of the current function)? Neither. It returns an *iterator*, conceptually equivalent to: def temp(): for x in S: yield x temp = temp() Except of course without creating a 'temp' name. I suppose you could also think of it as: (lambda: for x in S: yield x)() except of course that you can't make a generator lambda. If you look at it this way, then you can consider [x for x in S] to be shorthand syntax for list(x for x in S), as they would both produce the same result. 
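Eby's temp() expansion can be written out directly with the parenthesized form Python eventually adopted as generator expressions (PEP 289), which drops the yield keyword from the syntax debated here. A sketch of the equivalence:

```python
S = [1, 2, 3]

# The inline form: an expression that evaluates to an iterator.
gen = (x * 2 for x in S)

# Eby's conceptual expansion: a throwaway generator function, called once.
def _temp():
    for x in S:
        yield x * 2

assert list(gen) == list(_temp()) == [2, 4, 6]

# And list(...) of the expression matches the equivalent list comprehension.
assert list(x * 2 for x in S) == [x * 2 for x in S]
```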
However, IIRC, the current listcomp implementation actually binds 'x' in the current local namespace, whereas the generator version would not. (And the listcomp version might be faster.) From bac at OCF.Berkeley.EDU Fri Oct 17 15:46:52 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Oct 17 15:47:12 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <3F90472C.9060702@ocf.berkeley.edu> Skip Montanaro wrote: > >> But, indexing does stretch quite far in the current Python syntax and > >> semantics (in Python's *pragmatics* you're supposed to use it far > >> more restrainedly). > > Guido> Which is why I didn't like the 'sum[x for x in S]' notation much. > Guido> Let's look for an in-line generator notation instead. I like > > Guido> sum((yield x for x in S)) > > Guido> but perhaps we can make this work: > > Guido> sum(x for x in S) > > Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > > is supposed to do. In an attempt to make sure I understand what is being discussed, I am going to take a stab at this. That way when someone corrects me two people get there questions; two birds, one shotgun. > Is it supposed to return a generator function which I > can assign to a variable (or pass to the builtin function sum() as in your > example) and call later, or is it supposed to turn the current function into > a generator function (so that each executed yield statement returns a value > to the caller of the current function)? > It returns a generator function. 
> Assuming the result is a generator function (a first class object I can > assign to a variable then call later), is there some reason the current > function notation is inadequate? This seems to me to suffer the same > expressive shortcomings as lambda. Lambda seems to be hanging on by the > hair on its chinny chin chin. Why is this construct gaining traction? If > you don't like lambda, I can't quite see why this syntax is all that > appealing. > Extreme shorthand for a common idiom? > OTOH, if lambda x: x+1 is okay, then why not: > > yield: x for x in S > I was actually thinking that myself, but I would rather keep lambda as this weird little child of Python who can always be spotted for its predisposition toward pink hot pants (images of "Miami Vice" flash in my head...). Personally I am not seeing any extreme need for this feature. I mean the example I keep seeing is ``sum((yield x*2 for x in foo))``. But how is this such a huge win over ``sum([x*2 for x in foo])``? I know there is a memory perk since the entire list won't be constructed, but unless there is a better reason I see abuse on the horizon. The misuse of __slots__ has shown that when something is added that seems simple and powerful it will be abused by a lot of programmers thinking it is the best thing to use for anything they can shoehorn it into. I don't see this as such an abuse issue as __slots__, mind you, but I can still see people using it where a list comp may have been better. Or even having people checking themselves on whether to use this or a list comp and just using this because it seems cooler. I know I am personally +0 on this even after my above worries since I don't see my above arguments are back-breakers and those of us who do know how to properly use it will get a perk out of it.
-Brett From aleaxit at yahoo.com Fri Oct 17 16:01:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:01:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.15274.781344.230479@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <200310172201.50930.aleaxit@yahoo.com> On Friday 17 October 2003 08:57 pm, Skip Montanaro wrote: ... > Forgive my extreme density on this matter, but I don't understand what > > (yield x for x in S) > > is supposed to do. Is it supposed to return a generator function which I > can assign to a variable (or pass to the builtin function sum() as in your > example) and call later, or is it supposed to turn the current function > into a generator function (so that each executed yield statement returns a > value to the caller of the current function)? Neither: it returns an iterator, _equivalent_ to the one that would be returned by _calling_ a generator such as def xxx(): for x in S: yield x like xxx() [the result of the CALL to xxx, as opposed to xxx itself], (yield: x for x in S) is not callable; rather, it's loopable-on. > you don't like lambda, I can't quite see why syntax this is all that > appealing. I don't really like the current state of lambda (and it will likely never get any better), I particularly don't like the use of the letter lambda for this idea (Church's work notwithstanding, even Paul Graham in his new lispoid language has chosen a more sensible keyword, 'func' I believe), but I like comprehensions AND iterators, and the use of the word yield in generators. I'm not quite sure what parallels you see between the two cases. Alex From pje at telecommunity.com Fri Oct 17 16:15:04 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 16:15:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F90472C.9060702@ocf.berkeley.edu> References: <16272.15274.781344.230479@montanaro.dyndns.org> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> Message-ID: <5.1.0.14.0.20031017160243.03453220@mail.telecommunity.com> At 12:46 PM 10/17/03 -0700, Brett C. wrote: >Skip Montanaro wrote: >>Is it supposed to return a generator function which I >>can assign to a variable (or pass to the builtin function sum() as in your >>example) and call later, or is it supposed to turn the current function into >>a generator function (so that each executed yield statement returns a value >>to the caller of the current function)? > >It returns a generator function. No, it returns an iterator. Technically a generator-iterator, but definitely not a generator function, just as [x for x in y] doesn't return a function that returns a list. :) >Personally I am not seeing any extreme need for this feature. I mean the >example I keep seeing is ``sum((yield x*2 for x in foo))``. But how is >this such a huge win over ``sum([x*2 for x in foo])``? I know there is a >memory perk since the entire list won't be constructed, but unless there >is a better reason I see abuse on the horizon. It's not an extreme need; if it were, it'd have been added in 2.2, where all extreme Python needs were met. ;) >I know I am personally +0 on this even after my above worries since I >don't see my above arguments are back-breakers and those of us who do know >how to properly to use it will get a perk out of it. I'm sort of +0 myself; there are probably few occasions where I'd use a gencomp. 
But I'm -1 on creating special indexing or listcomp-like accumulator syntax, so gencomps are a fallback position.

I'm not sure gencomp is the right term for these things anyway... calling them iterator expressions probably makes more sense.  Then there's not the confusion with generator functions, which get called.  And this discussion has made it clearer that having 'yield' in the syntax is just plain wrong, because yield is a control flow statement.  These things are really just expressions that act over iterators to return another iterator.

In essence, an iterator expression is just syntax for imap and ifilter, in the same way that a listcomp is syntax for map and filter.  Really, you could now write imap and ifilter as functions that compute iterator expressions, e.g.:

    imap = lambda func,items: func(item) for item in items
    ifilter = lambda func, items: item for item in items if func(item)

Which of course means there'd be little need for imap and ifilter, just as there's now little need for map and filter.

Anyway, if you look at '.. for .. in .. [if ..]' as a ternary or quaternary operator on an iterator (or iterable) that returns an iterator, it makes a lot more sense than thinking of it as having anything to do with generator(s).  (Even if it might be implemented that way.)

From tim.one at comcast.net Fri Oct 17 16:16:24 2003 From: tim.one at comcast.net (Tim Peters) Date: Fri Oct 17 16:16:32 2003 Subject: [Python-Dev] Python-2.3.2 windows binary screwed In-Reply-To: Message-ID: [Thomas Heller]
> ...
> The problem in this case was not the installer doing things wrong, the
> fault was alone on my side: I did use the dlls from my WinXP system
> directory, and the installer correctly used them to replace the
> versions on the target computers.  If this was a win2k system, the
> file protection reverted this change, and the users were lucky again
> (except they had an entry in the event log).
Unfortunately win98 and > NT4 users were not so happy, for them it broke the system. For some of them, and probably a small minority (else we would have been deluged with bug reports about this, not just gotten a handful). For example, there were no problems after installing 2.3.2 on two different Win98SE boxes I use. I *did* note at the time I was surprised installation asked me to reboot (which is a sure sign that Wise detected it needed to replace an in-use DLL), but I forgot to panic about it. Under the theory that the boxes where this broke are the same ones contributing to worm spew exploiting MS bugs that were fixed a year ago, you were doing the world a favor by calling their owners' attention to how out of date they were . From skip at pobox.com Fri Oct 17 16:20:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 16:20:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172201.50930.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> Message-ID: <16272.20246.883506.360730@montanaro.dyndns.org> >> Is it supposed to return a generator function which I can assign to a >> variable (or pass to the builtin function sum() as in your example) >> and call later, or is it supposed to turn the current function into a >> generator function (so that each executed yield statement returns a >> value to the caller of the current function)? Alex> Neither: it returns an iterator, _equivalent_ to the one that Alex> would be returned by _calling_ a generator such as Alex> def xxx(): Alex> for x in S: Alex> yield x All the more reason not to like this. Why not just define the generator function and call it? While Perl sprouts magical punctuation, turning its syntax into line noise, Python seems to be sprouting multiple function-like things. 
We have

    * functions
    * unbound methods
    * bound methods
    * generator functions
    * iterators (currently invisible via syntax, but created by calling a
      generator function?)
    * instances magically callable via __call__

and now this new (rather limited) syntax for creating iterators.  I am beginning to find it all a bit confusing and unsettling.

Skip

From aleaxit at yahoo.com Fri Oct 17 16:21:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:21:48 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> Message-ID: <200310172221.43456.aleaxit@yahoo.com>

On Friday 17 October 2003 08:17 pm, Terry Reedy wrote:
...
> > >>> foo[ 'va':23:2j, {'zip':'zop'}:45:(3,4) ]
> >
> > (slice('va', 23, 2j), slice({'zip': 'zop'}, 45, (3, 4)))
> >
> > Not particularly _sensible_, mind you, and I hope nobody's yet
...
> In your commercial programming group, would you accept such a slice
> usage from another programmer, especially without prior agreement of
> the group?  Or would you want to edit, as you would with 'return x and
> True or False'?  If you would reject it in practice, then it is hardly
> an argument for something arguably even odder.

I'm happy to be using a language which supplies good elementary components and good general "composability", even though it IS possible to overuse the composition and end up with weird constructs.  Personally, I don't think that allowing comprehensions in indices would be particularly odd: just another "good elementary component".  So would "iterator comprehensions", as an alternative.  Both of them are quite usable in composition with other existing components and rules to produce weirdness, sure: but showing that weirdness is already quite possible, whether new constructs are allowed or not, appears to me to be a perfectly valid argument for a new construct that's liable to be used in either good or weird ways.
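The quoted slice example works because of how Python parses subscripts.  A minimal sketch of the mechanics (the Show class is mine, purely for illustration):

```python
# Inside [], each colon-separated group becomes a slice object, and a
# comma-separated series of them reaches __getitem__ as a tuple.  A
# slice's start/stop/step can be arbitrary objects, hence the example.
class Show:
    def __getitem__(self, index):
        return index

foo = Show()
index = foo['va':23:2j, {'zip': 'zop'}:45:(3, 4)]

assert isinstance(index, tuple) and len(index) == 2
assert index[0] == slice('va', 23, 2j)
assert index[1].start == {'zip': 'zop'}
assert index[1].stop == 45 and index[1].step == (3, 4)
```

Nothing here requires the indexed object to treat the slices "sensibly" — it is entirely up to __getitem__, which is what makes both the flexibility and the weirdness possible.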
Alex From pf_moore at yahoo.co.uk Fri Oct 17 16:34:21 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Oct 17 16:34:17 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> Message-ID: "Phillip J. Eby" writes: > At 10:15 AM 10/17/03 -0700, Guido van Rossum wrote: >>Which is why I didn't like the 'sum[x for x in S]' notation much. >>Let's look for an in-line generator notation instead. I like >> >> sum((yield x for x in S)) >> >>but perhaps we can make this work: >> >> sum(x for x in S) I like the look of this. In this context, it looks very natural. > Offhand, it seems like the grammar might be rather tricky, but it > actually does seem more Pythonic than the "yield" syntax, and it > retroactively makes listcomps shorthand for 'list(x for x in s)'. > However, if gencomps use this syntax, then what does: > > for x in y*2 for y in z if y<20: > ... > > mean? ;) It means you're trying to be too clever, and should use parentheses :-) > It's a little clearer with parentheses, of course, so perhaps they > should be required: > > for x in (y*2 for y in z if y<20): > ... I'd rather not require parentheses in general. Guido's example of sum(x for x in S) looks too nice for me to want to give it up without a fight. But I'm happy to have cases where the syntax is ambiguous, or even out-and-out unparseable, without the parentheses. Whether it's possible to express this in a way that Python's grammar can deal with, I don't know. Paul. -- This signature intentionally left blank From pje at telecommunity.com Fri Oct 17 16:38:00 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 16:38:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> At 03:20 PM 10/17/03 -0500, Skip Montanaro wrote: > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by calling a > generator function?) > * instances magically callable via __call__ The last item on the list encompasses at least the first three. But you also left out __init__ and __new__, which are really ClassType.__call__ or type.__call__, though. :) To me (and the interpreter, actually), there's just tp_call, tp_iter, and tp_iternext (or whatever their actual names are). Callability, iterability, and iterator-next. Many kinds of objects may have these aspects, just as many kinds of objects may be addable with '+'. Of the things you mention, however, most don't actually have different syntax for creating them, and some are even the same object type (e.g. unbound and bound methods). And the syntax for *using* them is always uniform: () always calls an object, for ... in ... creates an iterator from an iterable, .next() goes to the next item. >and now this new (rather limited) syntax for creating iterators. Actually, as now being discussed, list comprehensions would be a special case of an iterator expression. >I am beginning to find it all a bit confusing and unsettling. Ironically, with iterator comprehension in place, a list comprehension would now look like a list containing an iterator, which I agree might be confusing. 
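Phillip's worry can be made concrete on a modern interpreter, since this proposal eventually landed as Python 2.4's generator expressions: brackets around an iterator expression give a list *containing* the iterator, while list() consumes it.

```python
s = [1, 2, 3]

listcomp = [x * x for x in s]        # the list itself
via_list = list(x * x for x in s)    # "list(itercomp)" spelling
assert listcomp == via_list == [1, 4, 9]

boxed = [(x * x for x in s)]         # a list containing one generator
assert len(boxed) == 1
assert list(boxed[0]) == [1, 4, 9]   # consuming the boxed generator
```

In practice the `boxed` form is almost never what anyone wants, which is why the visual similarity was worth worrying about.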
Too bad we didn't do iterator comps first, or list(itercomp) would be the idiomatic way to make a listcomp. That's really the only confusing bit I see about itercomps... that you have to be careful where you put your parentheses, in order to make your intentions clear in some contexts. However, that's true for many kinds of expressions even now. From aleaxit at yahoo.com Fri Oct 17 16:40:28 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:40:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <200310172240.28322.aleaxit@yahoo.com> On Friday 17 October 2003 07:53 pm, Phillip J. Eby wrote: > At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: > >On Friday 17 October 2003 06:03 pm, Phillip J. Eby wrote: > > > Yes, what you propose is certainly *possible*. But again, if you > > > really needed an iterator as an index, you can right now do: > > > > > > sum[ [x*x for x in blaap] ] > > > >Actually, I need to use parentheses on the outside and brackets only > >on the inside -- I assume that's what you meant, of course. > > No, I meant what I said, which was that if you "really needed an iterator > as an *index*" (emphasis added). I suppose I technically should have said, > if you really want to provide an *iterable*, since a list is not an > iterator. But I figured you'd know what I meant. :) Ah, no, I didn't get your meaning. But yes, you could of course pass iter([ x*x for x in blaap ]) as an iterator (not just iterable) index to whatever... as long as blaap was a FINITE iterator, of course. 
If you can't count on blaap being finite, you'd need to code and name a separate generator such as:

    def squares(blaap):
        for x in blaap:
            yield x*x

then pass the result of calling squares(blaap), or you could choose to use itertools.imap and a lambda, etc etc.

> >I agree it's clearer -- a tad less flexible, as you don't get to do
> >separately    selector = Top(10)
> >and then somewhere else    selector[...]
> >but "oh well", and anyway the issue would be overcome if we had currying
> >(we could be said to have it, but -- I assume you'd consider
> >    selector = Top.__get__(10)
> >some kind of abuse, and besides, this 'currying' isn't very general).
>
> Hmmm... that's a hideously sick hack to perform currying... but I *like*
> it. :)  Not to use inline, of course, I'd wrap it in a 'curry'
> function.  But what a lovely way to *implement* it, under the hood.  Of
> course, I'd actually use 'new.instancemethod', since it would do the same

Yes,

    def curry(func, arg):
        return new.instancemethod(func, arg, object)

IS indeed way more general than func.__get__(arg) [notably, you get to call it repeatedly to curry more than one argument, from the left].
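The "hideously sick hack" can be demonstrated directly.  Note new.instancemethod is Python 2 only (the new module is gone in Python 3), but the underlying descriptor trick still works there, and functools.partial is the modern general-purpose spelling; add is an illustrative function of mine, not from the thread:

```python
# Functions are descriptors: f.__get__(obj) returns a bound method with
# obj frozen in as the first argument -- the "accidental" currying.
from functools import partial

def add(a, b):
    return a + b

curried = add.__get__(10)        # a "bound method" of the int 10
assert curried(32) == 42

assert partial(add, 10)(32) == 42   # the sanctioned modern equivalent
```

As Alex notes below, chaining __get__ re-binds the underlying function rather than currying a second argument, which is why this is not general currying.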
But if you have to define a curry function anyway, it's not a huge win vs

    def curry(func, arg):
        def curried(*args):
            return func(arg, *args)
        return curried

or indeed more general variations thereof such as

    def curry(func, *curried_args):
        def curried(*args):
            return func(*(curried_args+args))
        return curried

Alex

From pyth at devel.trillke.net Fri Oct 17 16:49:24 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Fri Oct 17 16:49:43 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031016232444.GA27936@ibook.distro.conectiva>; from niemeyer@conectiva.com on Thu, Oct 16, 2003 at 08:24:44PM -0300 References: <3F8DB69E.2070406@sabaydi.com> <20031016103552.H14453@prim.han.de> <20031016232444.GA27936@ibook.distro.conectiva> Message-ID: <20031017224924.L14453@prim.han.de>

Gustavo Niemeyer wrote:
> > If anything at all, i'd suggest a std-module which contains e.g.
> > 'sort', 'reverse' and 'extend' functions which always return
> > a new list, so that you could write:
> >
> >     for i in reverse(somelist):
> >         ...
>
> You can do reverse with [::-1] now.

sure, but it's a bit unintuitive and i mentioned not only reverse :-)

Actually i think that 'reverse', 'sort' and 'extend' algorithms could nicely be put into the new itertools module.  There it's obvious that they wouldn't mutate objects.  And these algorithms (especially extend and reverse) would be very efficient as iterators because they wouldn't create temporary lists/tuples.

cheers, holger

From FBatista at uniFON.com.ar Fri Oct 17 16:49:44 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 17 16:50:36 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID:

Here I send it.  Suggestions and all kinds of recommendations are more than welcomed.  If it all goes ok, it'll be a PEP when I finish writing the code.

Thank you.
Facundo

------------------------------------------------------------------------

PEP: XXXX
Title: Money data type
Version: $Revision: 0.1 $
Last-Modified: $Date: 2003/10/17 17:34:00 $
Author: Facundo Batista
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 17-Oct-2003
Python-Version: 2.3.3

Abstract
========

The idea is to make a Money data type, basically for financial uses, where decimals are needed but floating point is too inexact.  The Money data type should support the Python standard functions and operations.

Rationale
=========

The details of the requirements are in the `Requirements`_ section.  Here I'll include all the decisions made and why, and all the subjects still in discussion.  The requirements will be numbered, to simplify discussion on each point.

As an XP exercise, I'll write the test cases before the class itself, so it'll comply exactly with the requirements of those tests.  Please see them for an exact specification (and if you propose a different behaviour, please propose the corresponding test case if possible, thanks).

Why Not To Use Tim Peters' FixedPoint?
--------------------------------------

As we'll see in Requirements, there are items with which FixedPoint doesn't comply (because it doesn't do something, or does it differently).  It could be extended or modified to comply with the Requirements, but some needs are specific to currency, and some features of FixedPoint are too much for Money, so taking them out will make this class simpler.  Anyway, maybe someday one could be made a subclass of the other, or one could be made from both.

The code of the Money class is based in large part on the code of Tim Peters' FixedPoint: thank you for your (very) valuable ideas.

Items In Discussion
-------------------

6. About repr().  Should ``myMoney == eval(repr(myMoney))``?

Requirements
============

1. The syntax should be ``Money(value, [precision])``.

2.
The value could be of the type:

   - another Money (if you don't include *precision*, it gets inherited)

   - int or long (default *precision*: 0)::

        Money(45): 45
        Money(45, 2): 45.00
        Money(5000000000,3): 5000000000.000

   - float (*precision* must be included)::

        Money(50.33, 3): 50.330

   - string (*precision* gets extracted from the string)::

        Money('25.32'): 25.32
        Money('25.32', 4): 25.3200

   - something that could be coerced by long() or float()

3. Not to support strings with engineering notation (you don't need this when using money).

4. Precision must be a non-negative integer, and after the object is created you cannot change it.

5. Attributes ``decimalSeparator``, ``currencySymbol`` and ``thousandSeparator`` could be overloaded, to easily change them by subclassing.  This same *decimalSeparator* is the one used by the constructor when it receives a string.  Defaults are::

      decimalSeparator = '.'
      currencySymbol = '$'
      thousandSeparator = ''

6. Calling repr() should not return str(self), because if the subclass indicates that ``decimalSeparator=''``, this could lead to confusion.  So, repr() should show a tuple of three values: IntPart, FracPart, Precision.

7. To comply with the test case of Mark McEahern::

      cost = Money('5.99')
      percentDiscount = 10
      months = 3
      subTotal = cost * months
      discount = subTotal * (percentDiscount * 1.0) / 100
      total = subTotal - discount
      assertEqual(total, Money('16.17'))

8. To support the basic arithmetic (``+, -, *, /, //, **, %, divmod``) and the comparisons (``==, !=, <, >, <=, >=, cmp``) in the following cases:

   - Money op Money
   - Money op otherType
   - otherType op Money
   - Money op= otherType

   OtherType could be int, float or long.  It will automatically be converted to Money, inheriting the precision from the other component of the operation (and, in the case of float, maybe losing precision **before** the operation).  When both are Moneys, the result has the larger precision of the two.

9. To support unary operators (``-, +, abs``).

10.
To support the built-in methods:

   - min, max
   - float, int, long (int and long are rounded by Money)
   - str, repr
   - hash
   - copy, deepcopy
   - bool (0 is false, otherwise true)

11. To have methods that return its components.  The value of Money will be ``(int part) + (frac part) / (10 ** precision)``.

    - ``getPrecision()``: the precision
    - ``getFracPart()``: the fractional part (as long)
    - ``getIntPart()``: the int part (as long)

12. The rounding is to be financial.  This means that to round a number at a position, if the digit to the right of that position is bigger than 5, the digit to the left of that position is incremented by one; if it's smaller than 5, it isn't::

       1.123 --> 1.12
       1.128 --> 1.13

    But when the digit to the right of that position is exactly 5, then if the digit to the left of that position is odd, it gets incremented; otherwise it isn't::

       1.125 --> 1.12
       1.135 --> 1.14

Reference Implementation
========================

To be included later:

   - code
   - test code
   - documentation

Copyright
=========

This document has been placed in the public domain.
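Requirement 12 describes round-half-to-even, "banker's rounding".  The decimal module that Facundo later wrote (PEP 327, shipped in Python 2.4) spells this rule ROUND_HALF_EVEN; a sketch of the same examples using it, with money_round as a hypothetical helper rather than anything from the prePEP:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def money_round(value, precision=2):
    # Quantize to 10**-precision using banker's rounding.
    quantum = Decimal(10) ** -precision
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)

assert str(money_round('1.123')) == '1.12'
assert str(money_round('1.128')) == '1.13'
assert str(money_round('1.125')) == '1.12'   # tie: digit to the left (2) is even
assert str(money_round('1.135')) == '1.14'   # tie: digit to the left (3) is odd
```

Constructing the Decimal from a string matters: Decimal(1.125) from a float could already carry binary representation error before the rounding step.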
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031017/b78dd115/attachment-0001.html From pf_moore at yahoo.co.uk Fri Oct 17 16:52:14 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Fri Oct 17 16:52:09 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310170007.h9H07c006569@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <200310171852.34515.aleaxit@yahoo.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: "Phillip J. Eby" writes: > At 06:52 PM 10/17/03 +0200, Alex Martelli wrote: >>I assume you'd consider >> selector = Top.__get__(10) >>some kind of abuse, and besides, this 'currying' isn't very general). > > Hmmm... that's a hideously sick hack to perform currying... but I > *like* it. :) Urk. I just checked, and this works. But I haven't the foggiest idea why! Could someone please explain? If you do, I promise never to reveal who told me :-) Paul.
-- This signature intentionally left blank From niemeyer at conectiva.com Fri Oct 17 16:26:20 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Fri Oct 17 16:54:32 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <200310162305.27509.gherron@islandtraining.com> References: <20031016225058.GB19133@ibook.distro.conectiva> <200310162305.27509.gherron@islandtraining.com> Message-ID: <20031017202619.GA31350@ibook.distro.conectiva> > > > I'd like to get back to the SRE recursion issue (#757624). Is this > > > a good time to commit the patch? > > > > It would be good if you could find somebody who reviews the > > patch. However, if nobody volunteers to review, please go ahead - it > > might well be that you are the last active SRE maintainer left on this > > planet ... > > I jumped into SRE and wallowed around a bit before the last release, > then got swamped with real (i.e., money earning) work. I'd be willing > to jump in again if it would help. Gustavo, would you like me to > review the patch? Or if you submit it, I'll just get it from cvs and > poke around it that way. Great! I'll submit it then. Thanks! -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Fri Oct 17 16:55:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 16:55:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <200310172255.43697.aleaxit@yahoo.com> On Friday 17 October 2003 07:15 pm, Guido van Rossum wrote: > > But, indexing does stretch quite > > far in the current Python syntax and semantics (in Python's > > *pragmatics* you're supposed to use it far more restrainedly). > > Which is why I didn't like the 'sum[x for x in S]' notation much. Let it rest in peace, then. 
> Let's look for an in-line generator notation instead. I like > > sum((yield x for x in S)) So do I, _with_ the mandatory extra parentheses and all, and in fact I think it might be even clearer with the extra colon that Phil had mentioned, i.e. sum((yield: x for x in S)) > but perhaps we can make this work: > > sum(x for x in S) Perhaps the parser can be coerced to make this work, but the mandatory parentheses, the yield keyword, and possibly the colon, too, may all help, it seems to me, in making this syntax stand out more. Yes, some uses may "read" more naturally with as little extras as feasible, notably [examples that might be better done with list comprehensions except for _looks_...]: even_digits = Set(x for x in range(0, 10) if x%2==0) versus even_digits = Set((yield: x for x in range(0, 10) if x%2==0)) but that may be because the former notation leads back to the "set comprehensions" that list comprehensions were originally derived from. I don't think it's that clear in other cases which have nothing to do with sets, such as, e.g., Peter Norvig's original examples of "accumulator displays". And as soon as you consider the notation being used in any situation EXCEPT as the ONLY argument in a call...: foo(x, y for y in glab for x in blag) yes, I know this passes ONE x and one iterator, because to pass one iterator of pairs one would have to write foo((x, y) for y in glab for x in blag) but the distinction between the two seems quite error prone to me. BTW, semantically, it WOULD be OK for these iterator comprehension to NOT "leak" their control variables to the surrounding scope, right...? I do consider the fact that list comprehensions "leak" that way a misfeature, and keep waiting for some fanatic of assignment-as-expression to use it IN EARNEST, e.g., to code his or her desired "while c=beep(): boop(c)", use while [c for c in [beep()] if c]: boop(c) ...:-). 
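Alex's wish on the leak was eventually granted: generator expressions got their own scope from the start, and Python 3 list comprehensions followed, so control variables no longer escape into the surrounding function.  A sketch verifiable on Python 3 (probe is my own illustrative function):

```python
# Neither the listcomp's c nor the genexp's d survives in the enclosing
# function's local namespace on Python 3.
def probe():
    squares = [c * c for c in range(5)]
    total = sum(d * d for d in range(5))
    return squares, total, set(locals())

squares, total, names = probe()
assert squares == [0, 1, 4, 9, 16]
assert total == 30
assert 'c' not in names and 'd' not in names
```

On Python 2, by contrast, the listcomp's c would have leaked, which is exactly the misfeature being complained about here.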
Anyway, back to the subject, those calls to foo seem very error-prone, while: foo(x, (yield: y for y in glab for x in blag)) (mandatory extra parentheses, 'yield', and colon) seems far less likely to cause any such error. Alex From skip at pobox.com Fri Oct 17 16:56:01 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 17 16:56:10 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> Message-ID: <16272.22369.546606.870697@montanaro.dyndns.org> >>> sum((yield x for x in S)) >>> >>> but perhaps we can make this work: >>> >>> sum(x for x in S) Paul> I like the look of this. In this context, it looks very natural. How would it look if you used the optional start arg to sum()? Would either of these work? sum(x for x in S, start=5) sum(x for x in S, 5) or would you have to parenthesize the first arg? sum((x for x in S), start=5) sum((x for x in S), 5) Again, why parens? Why not sum(, start=5) sum(, 5) or something similar? Also, sum(x for x in S) and sum([x for x in S]) look very similar. I don't think it would be obvious to the casual observer what the difference between them was or why the first form didn't raise a SyntaxError. >> It's a little clearer with parentheses, of course, so perhaps they >> should be required: >> >> for x in (y*2 for y in z if y<20): >> ... Paul> I'd rather not require parentheses in general. Parens are required in certain situations within list comprehensions around tuples (probably for syntactic reasons, but perhaps to aid the reader as well) where tuples can often be defined without enclosing parens. 
Here's a contrived example:

    >>> [(a,b) for (a,b) in zip(range(5), range(10))]
    [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
    >>> [a,b for (a,b) in zip(range(5), range(10))]
      File "<stdin>", line 1
        [a,b for (a,b) in zip(range(5), range(10))]
           ^
    SyntaxError: invalid syntax

Paul> Guido's example of sum(x for x in S) looks too nice for me to want
Paul> to give it up without a fight.  But I'm happy to have cases where
Paul> the syntax is ambiguous, or even out-and-out unparseable, without
Paul> the parentheses.  Whether it's possible to express this in a way
Paul> that Python's grammar can deal with, I don't know.

I rather suspect parens would be required for tuples if they were added to the language today.  I see no reason to make an exception here.

Skip

From aleaxit at yahoo.com Fri Oct 17 17:08:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:08:11 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <200310172308.04840.aleaxit@yahoo.com>

On Friday 17 October 2003 10:52 pm, Paul Moore wrote:
...
> >> selector = Top.__get__(10)
...
> Urk.  I just checked, and this works.  But I haven't the foggiest idea
> why!  Could someone please explain?  If you do, I promise never to
> reveal who told me :-)

Functions are descriptors, and func.__get__(obj) returns a bound method with im_self set to obj -- that's how functions become bound methods, in today's Python, when accessed with attribute syntax obj.func on an instance obj of a class which has func in its dict.  But the mechanism is NOT meant for general currying... you could say the latter just works as a weird-ish side effect, and not in too general a way: consider for example:

    >>> def p(s): print s
    ...
    >>> p.__get__('one case').__get__('another')()
    another
    >>>

The second __get__ "replaces" the im_self [[it works on _p_ again, the im_func of the bound method given by the first, NOT on "the bound method itself", as that isn't a descriptor]]... now if we had a marketing dept it could sell this as a feature, "rebindable curried functions", perhaps, but in fact it's an "accidental side effect"...;-)

Alex

From eppstein at ics.uci.edu Fri Oct 17 17:10:20 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 17:10:28 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: In article <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com>, "Phillip J. Eby" wrote:

> >I am beginning to find it all a bit confusing and unsettling.
>
> Ironically, with iterator comprehension in place, a list comprehension
> would now look like a list containing an iterator, which I agree might be
> confusing.

Along with that confusion, (x*x for x in S) would look like a tuple comprehension, rather than a bare iterator.

-- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From Jack.Jansen at cwi.nl Fri Oct 17 17:11:44 2003 From: Jack.Jansen at cwi.nl (Jack Jansen) Date: Fri Oct 17 17:11:59 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <83175E26-00E6-11D8-907C-000A27B19B96@cwi.nl> On 17-okt-03, at 22:20, Skip Montanaro wrote: > All the more reason not to like this. Why not just define the > generator > function and call it? > > While Perl sprouts magical punctuation, turning its syntax into line > noise, > Python seems to be sprouting multiple function-like things. We have > > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by > calling a > generator function?) > * instances magically callable via __call__ > > and now this new (rather limited) syntax for creating iterators. And you even forget lambda:-) I agree with Skip here: there's all this magic that crept into Python since 2.0 (approximately) that really hampers readability to novices. And here I mean novices in the wide sense of the word, i.e. including myself (novice to the new concepts). Some of these look like old concepts but are really something completely different (generators versus functions), some are really little more than keystroke savers (list comprehensions). 
-- Jack Jansen, , http://www.cwi.nl/~jack If I can't dance I don't want to be part of your revolution -- Emma Goldman From eppstein at ics.uci.edu Fri Oct 17 17:14:41 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Fri Oct 17 17:20:17 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <200310171903.42578.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: In article <16272.22369.546606.870697@montanaro.dyndns.org>, Skip Montanaro wrote: > Parens are required in certain situations within list comprehensions around > tuples (probably for syntactic reasons, but perhaps to aid the reader as > well) where tuples can often be defined without enclosing parens. Here's a > contrived example: > > >>> [(a,b) for (a,b) in zip(range(5), range(10))] > [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)] > >>> [a,b for (a,b) in zip(range(5), range(10))] > File "", line 1 > [a,b for (a,b) in zip(range(5), range(10))] > ^ > SyntaxError: invalid syntax This one has bitten me several times. When it does, I discover the error quickly due to the syntax error, but it would be bad if this became valid syntax and returned a list [a,X] where X is an iterator. I don't think you could count on this getting caught by a being unbound, because often the variables in list comprehensions can be single letters that shadow previous bindings. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. 
of California, Irvine, School of Information & Computer Science From guido at python.org Fri Oct 17 17:19:30 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:20:33 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 15:56:01 CDT." <16272.22369.546606.870697@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: <200310172119.h9HLJUF07430@12-236-54-216.client.attbi.com> > Again, why parens? Why not > > sum(<x for x in S>, start=5) > sum(<x for x in S>, 5) Because the parser doesn't know whether the > after S is the end of the <...> brackets or a binary > operator. (Others can answer your other questions.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Fri Oct 17 17:21:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:21:54 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310172321.48818.aleaxit@yahoo.com> On Friday 17 October 2003 10:49 pm, Batista, Facundo wrote: ... > The idea is to make a Money data type, basically for financial uses, where > decimals are needed but floating point is too inexact. The Money data type Good, but the name seems ambiguous -- I would expect 'money' to include a *currency unit*, while these are just numbers. E.g., these days for me a "money amount" of "1000" isn't immediately significant -- does it mean "old liras", Euros, SEK, ...? If a clearer name (perhaps Decimal?) was adopted, the type's purposes would be also clearer, perhaps. > 6. About repr(). Should ``myMoney == eval(repr(myMoney))``? I don't see why not. > 3.
Not to support strings with engineer notation (you don't need this when > using money). Actually, with certain very depreciated currencies exponent notation would be VERY handy to have. E.g., given than a Euro is worth 1670000 Turkish Liras today, you have to count zeros accurately when expressing any substantial amount in Turkish Liras -- exponential notation would help. > 10. To support the built-in methods: I think you mean functions, not methods, in Python terminology. > - min, max > - float, int, long (int and long are rounded by Money) Rounding rather than truncation seems strange to me here. > - str, repr > - hash > - copy, deepcopy > - bool (0 is false, otherwise true) > > 11. To have methods that return its components. The value of Money will be > ``(int part) + (frac part) / (10 ** precision)``. > > - ``getPrecision()``: the precision > - ``getFracPart()``: the fractional part (as long) > - ``getIntPart()``: the int part (as long) Given we're talking about Python and not Java, I would suggest read-only accessors (like e.g. the complex type has) rather than accessor methods. E.g., x.precision , x.fraction and x.integer rather than x.getPrecision() etc. > 12. The rounding to be financial. This means that to round a number in a > position, if the digit at the right of that position is bigger than 5, > the digit at the left of that position is incremented by one, if it's > smaller than 5 isn't:: > > 1.123 --> 1.12 > 1.128 --> 1.13 > > But when the digit at the right of that position is ==5. There, if the > digit at the left of that position is odd, it gets incremented, > otherwise > isn't:: > > 1.125 --> 1.12 > 1.135 --> 1.14 I don't think these are the rules in the European Union (they're popular in statistics, but, I suspect, not legally correct in accounting). I can try to research that, if you need me to. 
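(For concreteness, here is the round-half-to-even rule under discussion, sketched with the decimal module that later entered the standard library; money_round is a hypothetical helper for illustration, not part of the prePEP:)

```python
from decimal import Decimal, ROUND_HALF_EVEN

def money_round(value, places=2):
    # Ties go to the nearest even digit ("banker's rounding").
    exp = Decimal(10) ** -places          # e.g. Decimal('0.01')
    return Decimal(value).quantize(exp, rounding=ROUND_HALF_EVEN)

print(money_round('1.123'))  # 1.12  (below the halfway point)
print(money_round('1.128'))  # 1.13  (above it)
print(money_round('1.125'))  # 1.12  (tie: 2 is even, so it stays)
print(money_round('1.135'))  # 1.14  (tie: 3 is odd, so it rounds up)
```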
Alex From aleaxit at yahoo.com Fri Oct 17 17:28:23 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:28:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> References: <200310172201.50930.aleaxit@yahoo.com> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: <200310172328.23057.aleaxit@yahoo.com> On Friday 17 October 2003 10:38 pm, Phillip J. Eby wrote: ... > Ironically, with iterator comprehension in place, a list comprehension > would now look like a list containing an iterator, which I agree might be > confusing. Too bad we didn't do iterator comps first, or list(itercomp) > would be the idiomatic way to make a listcomp. Yes. But don't mind me, I'm still sad that we have range and xrange when iter(a:b) and list(a:b:c) would be SUCH good replacements for them if slicing-notation was accepted elsewhere than in indexing, or iter[a:b] and list[a:b:c] if some people didn't so strenuously object to certain perfectly harmless uses of indexing...;-) > That's really the only confusing bit I see about itercomps... that you > have to be careful where you put your parentheses, in order to make your > intentions clear in some contexts. However, that's true for many kinds of > expressions even now. Yes. But since iterator comprehensions are being designed from scratch I think we can MANDATE parentheses around them, and a 'yield' right after the open parenthesis for good measure, to ensure they are not ambiguous to human readers as well as to parsers. Alex From FBatista at uniFON.com.ar Fri Oct 17 17:33:48 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 17 17:34:34 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: #- Good, but the name seems ambiguous -- I would expect 'money' #- to include #- a *currency unit*, while these are just numbers. 
E.g., #- these days for me a #- "money amount" of "1000" isn't immediately significant -- #- does it mean "old #- liras", Euros, SEK, ...? If a clearer name (perhaps #- Decimal?) was adopted, #- the type's purposes would be also clearer, perhaps. Specifically it doesn't differentiate it. It is printed with a '$' prefix, but that's all. The name really is a problem. Decimal doesn't imply the different rounding. #- > 6. About repr(). Should ``myMoney == eval(repr(myMoney))``? #- #- I don't see why not. OK, should. But must? #- > 3. Not to support strings with engineer notation (you #- don't need this when #- > using money). #- #- Actually, with certain very depreciated currencies exponent #- notation would #- be VERY handy to have. E.g., given than a Euro is worth #- 1670000 Turkish #- Liras today, you have to count zeros accurately when expressing any #- substantial amount in Turkish Liras -- exponential notation #- would help. You got me. Taking note. #- > 10. To support the built-in methods: #- #- I think you mean functions, not methods, in Python terminology. #- #- > - min, max #- > - float, int, long (int and long are rounded by Money) #- #- Rounding rather than truncation seems strange to me here. To me too. It could be truncated, and if you want to round m to zero precision, you can always use Money(m, 0). #- > 11. To have methods that return its components. The value #- of Money will be #- > ``(int part) + (frac part) / (10 ** precision)``. #- > #- > - ``getPrecision()``: the precision #- > - ``getFracPart()``: the fractional part (as long) #- > - ``getIntPart()``: the int part (as long) #- #- Given we're talking about Python and not Java, I would #- suggest read-only #- accessors (like e.g. the complex type has) rather than #- accessor methods. #- E.g., x.precision , x.fraction and x.integer rather than #- x.getPrecision() etc. Nice. #- > But when the digit at the right of that position is #- ==5.
There, if the #- > digit at the left of that position is odd, it gets incremented, #- > otherwise #- > isn't:: #- > #- > 1.125 --> 1.12 #- > 1.135 --> 1.14 #- #- I don't think these are the rules in the European Union #- (they're popular #- in statistics, but, I suspect, not legally correct in #- accounting). I can try #- to research that, if you need me to. Please. Because I found it in FixedPoint, and researching, think that in Argentina that's the way banks get rounded money. From guido at python.org Fri Oct 17 17:45:43 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:46:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 22:55:43 +0200." <200310172255.43697.aleaxit@yahoo.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <200310172255.43697.aleaxit@yahoo.com> Message-ID: <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> [Guido] > > Let's look for an in-line generator notation instead. I like > > > > sum((yield x for x in S)) [Alex] > So do I, _with_ the mandatory extra parentheses and all, and in > fact I think it might be even clearer with the extra colon that Phil > had mentioned, i.e. > > sum((yield: x for x in S)) > > > but perhaps we can make this work: > > > > sum(x for x in S) > > Perhaps the parser can be coerced to make this work, but the > mandatory parentheses, the yield keyword, and possibly the colon, > too, may all help, it seems to me, in making this syntax stand > out more. Hm. I'm not sure that it *should* stand out more. The version with the yield keyword and the colon draws undue attention to the mechanism. 
I bet that if you showed sum(x for x in range(10)) to a newbie they'd have no problem understanding it (their biggest problem would be that range(10) is [0, 1, ..., 9] rather than [1, 2, ..., 10]) but if you showed them sum((yield: x for x in S)) they would probably scratch their heads. I also note that if it wasn't for list comprehensions, the form <expr> for <var> in <seq> poses absolutely no problems to the parser, since it's just a ternary operator (though the same is true for the infamous <x> if <cond> else <y> :-). List comprehensions make this a bit difficult because they use the same form in a specific context for something different; at the very best this would mean that [x for x in S] and [(x for x in S)] are completely different beasts: the first would be equivalent to list(S) while the second would be equivalent to [iter(S)] i.e. a list whose only element is an iterator over S (not a very useful thing to have, except perhaps if you had a function taking a list of iterators as an argument). > Yes, some uses may "read" more naturally with as > little extras as feasible, notably [examples that might be better > done with list comprehensions except for _looks_...]: > > even_digits = Set(x for x in range(0, 10) if x%2==0) > > versus > > even_digits = Set((yield: x for x in range(0, 10) if x%2==0)) > > but that may be because the former notation leads back to > the "set comprehensions" that list comprehensions were > originally derived from. I don't think it's that clear in other > cases which have nothing to do with sets, such as, e.g., > Peter Norvig's original examples of "accumulator displays".
Let's go over the examples from http://www.norvig.com/pyacc.html : [Sum: x*x for x in numbers] sum(x*x for x in numbers) [Product: Prob_spam(word) for word in email_msg] product(Prob_spam(word) for word in email_msg) [Min: temp(hour) for hour in range(24)] min(temp(hour) for hour in range(24)) [Mean: f(x) for x in data] mean(f(x) for x in data) [Median: f(x) for x in data] median(f(x) for x in data) [Mode: f(x) for x in data] mode(f(x) for x in data) So far, these can all be written as simple functions that take an iterable argument, and they look as good with an iterator comprehension as with a list argument. [SortBy: abs(x) for x in (-2, -4, 3, 1)] This one is a little less obvious, because it requires the feature from Norvig's PEP that if add() takes a second argument, the unadorned loop control variable is passed in that position. It could be done with this: sortby((abs(x), x) for x in (-2, 3, 4, 1)) but I think that Raymond's code in CVS is just as good. :-) Norvig's Top poses no problem: top(humor(joke) for joke in jokes) In conclusion, I think this syntax is pretty cool. (It will probably die the same death as the ternary expression though.) > And as soon as you consider the notation being used in > any situation EXCEPT as the ONLY argument in a call...: Who said that? I fully intended it to be an expression, acceptable everywhere, though possibly requiring parentheses to avoid ambiguities (in list comprehensions) or excessive ugliness (e.g. to the right of 'in' or 'yield'). > foo(x, y for y in glab for x in blag) > > yes, I know this passes ONE x and one iterator, because > to pass one iterator of pairs one would have to write > > foo((x, y) for y in glab for x in blag) > > but the distinction between the two seems quite error > prone to me. 
It would require extra parentheses here: foo(x, (y for y in glab for x in blag)) > BTW, semantically, it WOULD be OK for > these iterator comprehension to NOT "leak" their > control variables to the surrounding scope, right...? Yes. (I think list comprehensions shouldn't do this either; it's just a pain to introduce a new scope; maybe such control variables should simply be renamed to "impossible" names like the names used for the anonymous first argument to f below: def f((a, b), c): ... > I > do consider the fact that list comprehensions "leak" that > way a misfeature, and keep waiting for some fanatic of > assignment-as-expression to use it IN EARNEST, e.g., > to code his or her desired "while c=beep(): boop(c)", use > > while [c for c in [beep()] if c]: > boop(c) > > ...:-). Yuck. Fortunately that would be quite slow, and the same fanatics usually don't like that. :-) > Anyway, back to the subject, those calls to foo seem > very error-prone, while: > > foo(x, (yield: y for y in glab for x in blag)) > > (mandatory extra parentheses, 'yield', and colon) seems > far less likely to cause any such error. I could live with the extra parentheses. Then we get: (x for x in S) # iter(S) [x for x in S] # list(S) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Fri Oct 17 17:45:26 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 17:46:19 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <20031017224924.L14453@prim.han.de> Message-ID: <002c01c394f7$fa270500$e841fea9@oemcomputer> [Gustavo Niemeyer wrote] > > > If anything at all, i'd suggest a std-module which contains e.g. > > > 'sort', 'reverse' and 'extend' functions which always return > > > a new list, so that you could write: > > > > > > for i in reverse(somelist): > > > ... sort: This is being addressed by the proposed list.copysort() method reverse: This is being addressed by PEP-0322.
When I get a chance, the PEP will be revised to propose a builtin instead of various methods attached to specific sequence objects. extend: How would this differ from itertools.chain() ? > > You can do reverse with [::-1] now. [Holger Krekel] > sure, but it's a bit unintuitive and i mentioned not only reverse :-) > > Actually i think that 'reverse', 'sort' and 'extend' algorithms > could nicely be put into the new itertools module. > > There it's obvious that they wouldn't mutate objects. And these > algorithms > (especially extend and reverse) would be very efficient as iterators > because > they wouldn't create temporary lists/tuples. To be considered as a possible itertool, an ideal candidate should: * work well in combination with other itertools * be a fundamental building block * accept all iterables as inputs * return only an iterator as an output * run lazily so as not to force the inputs to run to completion unless externally requested by list() or some such. * consume constant memory (this rule was bent for itertools.cycle(), but should be followed as much as possible). * run finitely if some of the inputs are finite (itertools.repeat(), count() and cycle() are the only intentionally infinite tools) There is no chance for isort(). Once you've sorted the whole list, there is no advantage to returning an iterator instead of a list. The problem with ireverse() is that it only works with objects that support __getitem__() and len(). That pretty much precludes generators, user defined class based iterators, and the outputs from other itertools. So, while it may make a great builtin (which is what PEP-322 is going to propose), it doesn't fit in with other itertools. Raymond Hettinger From guido at python.org Fri Oct 17 17:46:32 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:46:52 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 14:10:20 PDT." 
References: <200310172201.50930.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> Message-ID: <200310172146.h9HLkWB07494@12-236-54-216.client.attbi.com> > Along with that confusion, (x*x for x in S) would look like a tuple > comprehension, rather than a bare iterator. Well, () is already heavily overloaded, so I can live with that. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 17:48:33 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:48:40 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 14:14:41 PDT." References: <200310171903.42578.aleaxit@yahoo.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017135553.03e20eb0@mail.telecommunity.com> <16272.22369.546606.870697@montanaro.dyndns.org> Message-ID: <200310172148.h9HLmXk07520@12-236-54-216.client.attbi.com> > > >>> [(a,b) for (a,b) in zip(range(5), range(10))] > > [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)] > > >>> [a,b for (a,b) in zip(range(5), range(10))] > > File "", line 1 > > [a,b for (a,b) in zip(range(5), range(10))] > > ^ > > SyntaxError: invalid syntax > > This one has bitten me several times. > > When it does, I discover the error quickly due to the syntax error, Generally, when we talk about something "biting", we mean something that *doesn't* give a syntax error, but silently does something quite different than what you'd naively expect. This was made a syntax error specifically because of this ambiguity. 
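(That rejection is easy to demonstrate by compiling the ambiguous form directly -- a sketch, using the behavior as it still stands in modern Python:)

```python
# The unparenthesized form is refused at compile time, so it can never
# silently produce a list whose second element is an iterator.
ambiguous = "[a, b for (a, b) in zip(range(5), range(10))]"
try:
    compile(ambiguous, "<example>", "eval")
    print("accepted")
except SyntaxError:
    print("rejected")  # this branch runs

# The parenthesized form is fine:
print(eval("[(a, b) for (a, b) in zip(range(5), range(10))]")[:3])
```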
> but it would be bad if this became valid syntax and returned a list > [a,X] where X is an iterator. I don't think you could count on this > getting caught by a being unbound, because often the variables in > list comprehensions can be single letters that shadow previous > bindings. No, [a,X] would be a syntax error if X was an iterator comprehension. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 17 17:50:34 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 17:50:57 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 23:28:23 +0200." <200310172328.23057.aleaxit@yahoo.com> References: <200310172201.50930.aleaxit@yahoo.com> <5.1.0.14.0.20031017162527.03eb58a0@mail.telecommunity.com> <200310172328.23057.aleaxit@yahoo.com> Message-ID: <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> > Yes. But don't mind me, I'm still sad that we have range and xrange > when iter(a:b) and list(a:b:c) would be SUCH good replacements for > them if slicing-notation was accepted elsewhere than in indexing, This has been proposed more than once (I think the last time by Paul Dubois, who wanted x:y:z to be a general expression), and has a certain elegance, but is probably too terse. --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Fri Oct 17 17:51:42 2003 From: janssen at parc.com (Bill Janssen) Date: Fri Oct 17 17:52:05 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 13:20:38 PDT." <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <03Oct17.145145pdt."58611"@synergy1.parc.xerox.com> > All the more reason not to like this. Why not just define the generator > function and call it? +1. 
Bill From aleaxit at yahoo.com Fri Oct 17 17:54:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 17:54:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16272.20246.883506.360730@montanaro.dyndns.org> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172201.50930.aleaxit@yahoo.com> <16272.20246.883506.360730@montanaro.dyndns.org> Message-ID: <200310172354.32838.aleaxit@yahoo.com> On Friday 17 October 2003 10:20 pm, Skip Montanaro wrote: ... > Alex> Neither: it returns an iterator, _equivalent_ to the one that > Alex> would be returned by _calling_ a generator such as > > Alex> def xxx(): > Alex> for x in S: > Alex> yield x > > All the more reason not to like this. Why not just define the generator > function and call it? The usual problems: having to use several separate statements, and name something that you are only interested in using once, is a bit conceptually cumbersome when you could use a clear inline expression "right where you need it" for the same purpose. Moreover, it seems a bit strange to be able to use the well-liked comprehension syntax only at the price of storing all intermediate steps in memory -- and have to zoom up to several separate statements + a name if you'd rather avoid the memory overhead, e.g.: sum( [x+x*x for x in short_sequence if x >0] ) is all right, BUT if the sequence becomes too long then def gottagiveitaname(): for x in long_sequence: if x>0: yield x+x*x sum( gottagiveitaname() ) That much being said, I entirely agree that the proposal is absolutely NOT crucial to Python -- it will not enormously expand its power nor its range of applicability. I don't think it's SO terribly complicated to require application of such extremely high standards, though. 
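(The memory tradeoff Alex describes can be made concrete with the parenthesized syntax Python 2.4 eventually adopted as generator expressions -- a sketch:)

```python
import sys

n = 1_000_000
squares_list = [x + x * x for x in range(n) if x > 0]  # materializes every item now
squares_gen = (x + x * x for x in range(n) if x > 0)   # produces items on demand

# The list pays for n stored results; the generator is a small fixed-size object.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True
print(sum(squares_gen) == sum(squares_list))                     # True: same values
```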
But if the consensus is that ONLY lists are important enough to deserve the beauty of comprehensions, and EVERY other case must either pay the memory price of a list or the conceptual one of calling and then invoking a one-use-only generator, so be it, I guess. > While Perl sprouts magical punctuation, turning its syntax into line noise, > Python seems to be sprouting multiple function-like things. We have > > * functions > * unbound methods > * bound methods > * generator functions > * iterators (currently invisible via syntax, but created by calling a > generator function?) > * instances magically callable via __call__ Every one of these was in Python when I first met it, except generators -- and iterators, which are NOT function-like in the least, nor "invisible" (often, an iterator is an instance of an explicitly coded class or type with a next() method). You seem to have forgotten lambda, though -- and classes/types (all callable -- arguably via __call__ in some sense, but you could say just the same of functions &c). Which ALSO were in Python when I first met it. So, I see no "sprouting" -- Python has "always" (from my POV) had a wide variety of callables. > and now this new (rather limited) syntax for creating iterators. ...which isn't function-like either, neither in syntax nor in semantics. Yes, it's limited -- basically to the same cases as list comprehensions, except that (being an iterator and not a list) there is no necessary implication of finiteness. > I am beginning to find it all a bit confusing and unsettling. I hear you, and I worry about this general effect on you, but I do not seem to be able to understand the real reasons. Any such generalized objection from an experienced Pythonista like you is well worthy of making everybody sit up and care, it seems to me. But exactly because of that, it might help if you were able to articulate your unease more precisely. 
Python MAY well have accumulated a few too many things in its long, glorious story -- because (and for good reason!) we keep the old cruft around for backwards compatibility, any change means (alas) growth. Guido is on record as declaring that release 3.0 will be about simplification: removing some of the cruft, taking advantage of the 2->3 bump in release number to break a little bit (not TOO much) backwards compatibility. Is Python so large today that we can't afford another release, 2.4, with _some_ kind of additions to the language proper, without confusing and unsettling long-time, experienced, highly skilled Pythonistas like you? Despite the admirable _stationariety_ of the language proper throughout the 2.2 and 2.3 eras...? If something like that is your underlying feeling, it may be well worth articulating -- and perhaps we need to sit back and listen and take stock (hey, I'd get to NOT have to write another edition of the Nutshell for a while -- maybe I should side strongly with this thesis!-). If it's something else, more specific to this set of proposals for accumulators / comprehensions, then maybe there's some _area_ in which any change is particularly unwelcome? But I can't guess with any accuracy... Alex From python at rcn.com Fri Oct 17 17:55:11 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 17:55:54 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172255.43697.aleaxit@yahoo.com> Message-ID: <002d01c394f9$56b6e460$e841fea9@oemcomputer> [GvR] > > Which is why I didn't like the 'sum[x for x in S]' notation much. [Alex] > Let it rest in peace, then. Goodbye, weird __getitem__ hack! [GvR] > > Let's look for an in-line generator notation instead. I like > > > > sum((yield x for x in S)) [Alex] > So do I, _with_ the mandatory extra parentheses and all, and in > fact I think it might be even clearer with the extra colon that Phil > had mentioned, i.e. 
> > sum((yield: x for x in S)) +1 [David Eppstein, in a separate note] > Along with that confusion, (x*x for x in S) would look like a tuple > comprehension, rather than a bare iterator. Phil's idea cleans that up pretty well: (yield: x*x for x in S) This is no more tuple-like than any expression surrounded by parens. Raymond Hettinger From pje at telecommunity.com Fri Oct 17 17:58:06 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 17:58:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172255.43697.aleaxit@yahoo.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> At 10:55 PM 10/17/03 +0200, Alex Martelli wrote: >while [c for c in [beep()] if c]: > boop(c) > >..:-). That is positively *evil*. Good thing you didn't post it on python-list. :) >Anyway, back to the subject, those calls to foo seem >very error-prone, while: > >foo(x, (yield: y for y in glab for x in blag)) > >(mandatory extra parentheses, 'yield', and colon) seems >far less likely to cause any such error. And also much uglier. Even though I originally proposed it, I like Guido's version (sans yield) much better. OTOH, I can also see where the "tuple comprehension" and other possible confusing uses seem to shoot it down. Hm. What if list comprehensions returned a "lazy list", that if you took an iterator of it, you'd get a generator-iterator, but if you tried to use it as a list, it would populate itself? Then there'd be no need to ever *not* use a listcomp, and only one syntax would be necessary. More specifically, if all you did with the list was iterate over it, and then throw it away, it would never actually populate itself. 
The principal drawback to this idea from a semantic viewpoint is that listcomps can be done over expressions that have side-effects. :( From martin at v.loewis.de Fri Oct 17 18:00:54 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:01:19 2003 Subject: [Python-Dev] SRE recursion In-Reply-To: <20031017223015.N64463@bullseye.apana.org.au> References: <20031016225058.GB19133@ibook.distro.conectiva> <20031017223015.N64463@bullseye.apana.org.au> Message-ID: Andrew MacIntyre writes: > Because of the stack recursion issue on FreeBSD (in the presence of > threads), I tested several of Gustavo's patches. I didn't scrutinise them > for style though... It's not primarily style that I'm concerned about, but hard-to-find-in-testing bugs, such as memory leaks, bad decrefs, incompatibilities in boundary cases, and so on. Regards, Martin From aleaxit at yahoo.com Fri Oct 17 18:05:24 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:06:09 2003 Subject: Currying with instancemethod (was Re: [Python-Dev] accumulator display syntax) In-Reply-To: <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> References: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> Message-ID: <200310180005.24686.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Phillip J. Eby wrote: ... > At 10:40 PM 10/17/03 +0200, Alex Martelli wrote: > >Yes, def curry(func, arg): return new.instancemethod(func, arg, object) ... > >def curry(func, arg): > > def curried(*args): return func(arg, *args) > > return curried ... > It is a big win if the curried function will be used in a > performance-sensitive way. Instance method objects don't pay for setting > up an extra frame object, and for the single curried argument, the > interpreter even shortcuts some of the instancemethod overhead!
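(For reference, the two currying styles being compared, restated for modern Python -- this is the editor's translation, not part of the original exchange: types.MethodType stands in for the long-removed new.instancemethod, and functools.partial is today's idiomatic spelling:)

```python
import types
from functools import partial

def f(a, b, c):
    return a, b, c

def curry_closure(func, arg):
    # The closure-based version from the thread.
    def curried(*args):
        return func(arg, *args)
    return curried

bound = types.MethodType(f, 23)   # binds 23 as f's first argument
closed = curry_closure(f, 23)
part = partial(f, 23)

print(bound(45, 67))   # (23, 45, 67)
print(closed(45, 67))  # (23, 45, 67)
print(part(45, 67))    # (23, 45, 67)
```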
So, if I You're right: the instancemethod version has impressively better performance (should the curried function be used in a bottleneck, of course) -- i.e., given a.py: import new def curry1(func, arg): return new.instancemethod(func, arg, object) def curry2(func, arg): def curried(*args): return func(arg, *args) return curried def f(a, b, c): return a, b, c I've measured: [alex@lancelot ba]$ timeit.py -c -s' import a g = a.curry2(a.f, 23) ' 'g(45, 67)' 100000 loops, best of 3: 2 usec per loop [alex@lancelot ba]$ timeit.py -c -s' import a g = a.curry1(a.f, 23) ' 'g(45, 67)' 1000000 loops, best of 3: 1.09 usec per loop I sure didn't expect an almost 2:1 ratio, while you did predict it. Alex From martin at v.loewis.de Fri Oct 17 18:07:33 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:07:51 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: References: Message-ID: "David LeBlanc" writes: > What's the cost of mapping the world (all those entry points) at startup? I believe it is measurable. It also adds maintenance costs to have extension modules, both in terms of the build procedure, and in packaging. > You have to rebuild all of the main dll just to do something to one > component. To me, that's maybe the biggest single issue. When did you last wish to rebuild one of the modules without having a PCBuild directory in the first place? If that ever happened, which module did you wish to rebuild and why? > Any possibility of new bugs? Not likely. > Are app users/programmers going to have a bloat perception? This is possible; it appears that all readers who, in this thread, have spoken in favour of keeping the status quo have done so because of a bloat perception. > IMO, it contradicts the unix way of smaller, compartmentalized is better.
I dislike the usage of shared libraries on Unix, and still hope that the Python build procedure becomes sane again by reducing its usage of shared extension modules, in favour of a single complete binary. > It's not unix we're talking about, but it still makes sense to me, whatever > the OS. It makes no sense to me whatsoever. > On a related side note: has anyone done any investigation to > determine which few percentage of the extensions account for 99% of > the dll loads? Do you have any specific concerns beyond FUD? Regards, Martin From seandavidross at hotmail.com Fri Oct 17 18:08:08 2003 From: seandavidross at hotmail.com (Sean Ross) Date: Fri Oct 17 18:08:27 2003 Subject: [Python-Dev] accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <200310172255.43697.aleaxit@yahoo.com> Message-ID: Hello. Perhaps looking at some examples of what nested itercomps might look like (because they _will_ be used if they're available...) using each of the leading syntaxes would be useful in trying to decide which form, if any, is most acceptable (or least unacceptable, whichever the case may be): # (1) without parentheses: B(y) for y in A(x) for x in myIterable # (2) for clarity, we'll add some optional parentheses: B(y) for y in (A(x) for x in myIterable) # (3) OK. Now, with required parentheses: (B(y) for y in (A(x) for x in myIterable)) # (4) And, now with the required "yield:" and parentheses: (yield: B(y) for y in (yield: A(x) for x in myIterable)) #(5) And, finally, for completeness, using the rejected PEP 289 syntax: [yield B(y) for y in [yield A(x) for x in myIterable]] Hope that's useful, Sean p.s. I'm only a Python user, and not a developer, so if my comments are not welcome here, please let me know, and I will refrain in future. Thanks for your time. 
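[Editor's note: for reference, the nested pipeline that all five of Sean's spellings denote can be written with plain generator functions, the only lazy tool already in the language; `A`, `B`, and `myIterable` are hypothetical stand-ins, just as in his examples.]

```python
def apply_lazily(func, iterable):
    # Equivalent of (func(x) for x in iterable): one result at a time.
    for x in iterable:
        yield func(x)

# Hypothetical stand-ins for A, B, and myIterable:
def A(x):
    return x + 1

def B(y):
    return y * 2

myIterable = range(3)

# B(y) for y in (A(x) for x in myIterable)
pipeline = apply_lazily(B, apply_lazily(A, myIterable))
print(list(pipeline))  # [2, 4, 6]
```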
From aleaxit at yahoo.com Fri Oct 17 18:09:09 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:09:13 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> References: <200310172201.50930.aleaxit@yahoo.com> <200310172328.23057.aleaxit@yahoo.com> <200310172150.h9HLoYj07532@12-236-54-216.client.attbi.com> Message-ID: <200310180009.09257.aleaxit@yahoo.com> On Friday 17 October 2003 11:50 pm, Guido van Rossum wrote: > > Yes. But don't mind me, I'm still sad that we have range and xrange > > when iter(a:b) and list(a:b:c) would be SUCH good replacements for > > them if slicing-notation was accepted elsewhere than in indexing, > > This has been proposed more than once (I think the last time by Paul > Dubois, who wanted x:y:z to be a general expression), and has a > certain elegance, but is probably too terse. Perhaps mandatory parentheses around it (as sole argument in a function call, say) might make it un-terse enough for acceptance...? The frequence of counted loops IS such that replacing for x in range(9): ... with for x in (0:9): ... WOULD pay for itself soon in reduced wear and tear on keyboards...;-) [Using iter(0:9) instead would be only "conceptually neat", no typing advantage on range -- conceded]. Alex From martin at v.loewis.de Fri Oct 17 18:10:49 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:11:10 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: Thomas Heller writes: > I'm undecided on these modules, I do not use them now but may in the > future - so I'm undecided: > > _csv winsound datetime bz2 I think Guido's point that you should be able to build pythonxy.dll without downloading additional source is good, so _csv, winsound, datetime would go in, and bz2 would stay out. Regards, Martin From martin at v.loewis.de Fri Oct 17 18:14:30 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 17 18:14:49 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <3F90367D.200@ieee.org> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <3F90367D.200@ieee.org> Message-ID: "Shane Holloway (IEEE)" writes: > > Don't know what these do, so I cannot really comment: > > _symtable parser unicodedata > > Neither do I. Although unicodedata is fairly big. As I tried to explain: the size of the library is relatively irrelevant, atleast for performance (it might matter for py2exe-style standalone binary production). What matters (as Guido explains) is whether you need additional libraries to download or link with, which is not the case for either of these modules. Regards, Martin From aleaxit at yahoo.com Fri Oct 17 18:14:51 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:14:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> Message-ID: <200310180014.51336.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Guido van Rossum wrote: ... 
> In conclusion, I think this syntax is pretty cool. (It will probably > die the same death as the ternary expression though.) Ah well -- in this case I guess I won't go to the bother of deciding whether I like your preferred "lighter" syntax or the "stands out more" one. The sad, long, lingering death of the ternary expression was too painful to repeat -- let's put this one out of its misery sooner. Alex From aleaxit at yahoo.com Fri Oct 17 18:18:15 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:18:23 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002c01c394f7$fa270500$e841fea9@oemcomputer> References: <002c01c394f7$fa270500$e841fea9@oemcomputer> Message-ID: <200310180018.15836.aleaxit@yahoo.com> On Friday 17 October 2003 11:45 pm, Raymond Hettinger wrote: ... > To be considered as a possible itertool, an ideal candidate should: Very nice set of specs! Which reminds me: why don't we have take(n, it) and drop(n, it) there? I find myself rewriting those quite often. Alex From pje at telecommunity.com Fri Oct 17 17:45:32 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 18:24:57 2003 Subject: Currying with instancemethod (was Re: [Python-Dev] accumulator display syntax) In-Reply-To: <200310172240.28322.aleaxit@yahoo.com> References: <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> <5.1.0.14.0.20031017115726.0398c210@mail.telecommunity.com> <5.1.0.14.0.20031017134657.01ec1a70@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017173618.02fe7820@mail.telecommunity.com> At 10:40 PM 10/17/03 +0200, Alex Martelli wrote: >Yes, def curry(func, arg): return new.instancemethod(func, arg, object) >IS indeed way more general than func.__get__(arg) [notably, you get to >call it repeatedly to curry more than one argument, from the left].
But >if you have to define a curry function anyway, it's not a huge win vs > >def curry(func, arg): > def curried(*args): return func(arg, *args) > return curried > >or indeed more general variations thereof such as > >def curry(func, *curried_args): > def curried(*args): return func(*(curried_args+args)) > return curried It is a big win if the curried function will be used in a performance-sensitive way. Instance method objects don't pay for setting up an extra frame object, and for the single curried argument, the interpreter even shortcuts some of the instancemethod overhead! So, if I were taking the time to write a currying function, I'd probably implement your latter version by chaining instance methods. (Of course, I'd also want to test to find out how many I could chain before the frame overhead was less than the chaining overhead.) Whenever I've run into a performance problem in Python (usually involving loops over 10,000+ items), I've almost invariably found that the big culprit is how many (Python) function calls happen in the loop. In such cases, execution time is almost linearly proportional to how many function calls happen, and inlining functions or resorting to a Pyrex version of the same function can often eliminate the performance problem on that basis alone. (For the Pyrex conversion, I have to use PyObject_GetAttr() in place of Pyrex's native attribute access, because it otherwise uses GetAttrString(), which seems to often make up for the lack of frame creation overhead.) 
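[Editor's note: a minimal sketch of the bound-method curry Phillip describes, including the chaining he mentions; it uses `types.MethodType`, the modern spelling of the era's `new.instancemethod(func, arg, object)`.]

```python
import types

def curry(func, arg):
    # Bind arg as the "self" of func: calling the result prepends arg
    # without the overhead of an extra Python stack frame.
    return types.MethodType(func, arg)

def f(a, b, c):
    return a, b, c

g = curry(f, 23)
print(g(45, 67))   # (23, 45, 67)

# Chaining binds further arguments from the left:
h = curry(g, 45)
print(h(67))       # (23, 45, 67)
```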
From pyth at devel.trillke.net Fri Oct 17 18:27:42 2003 From: pyth at devel.trillke.net (Holger Krekel) Date: Fri Oct 17 18:27:50 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <002c01c394f7$fa270500$e841fea9@oemcomputer>; from python@rcn.com on Fri, Oct 17, 2003 at 05:45:26PM -0400 References: <20031017224924.L14453@prim.han.de> <002c01c394f7$fa270500$e841fea9@oemcomputer> Message-ID: <20031018002742.M14453@prim.han.de> Raymond Hettinger wrote: > [Gustavo Niemeyer wrote] > > > > If anything at all, i'd suggest a std-module which contains e.g. > > > > 'sort', 'reverse' and 'extend' functions which always return > > > > a new list > > > > a new list, so that you could write: > > > > > > > > for i in reverse(somelist): > > > > ... > > sort: This is being addressed by the proposed list.copysort() method > reverse: This is being addressed by PEP-0322. When I get a chance, > the PEP will be revised to propose a builtin instead of > various methods attached to specific sequence objects. > extend: How would this differ from itertools.chain() ? pointing someone to these three different specific (somewhat limited) solutions for the "i want reverse/sort/extend/... not to work inplace but on-the-fly" requirement seems tedious. > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. Providing a uniform concept counts as an advantage IMO. Agreed, performance wise there probably is no advantage with the current sorting-algorithm. > The problem with ireverse() is that it only works with objects that > support __getitem__() and len(). That pretty much precludes > generators, user defined class based iterators, and the outputs > from other itertools. So, while it may make a great builtin (which > is what PEP-322 is going to propose), it doesn't fit in with other > itertools. 
I wouldn't mind if reverse would - as a fallback - suck all elements and then spit them out in reverse order. After all, you sometimes want to process yielded values from an iterator in reverse order and there is not much else you can do than to exhaust the iterator. cheers, holger From Scott.Daniels at Acm.Org Fri Oct 17 18:30:44 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Fri Oct 17 18:30:58 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: References: Message-ID: <3F906D94.5030902@Acm.Org> [Raymond Hettinger] > To be considered as a possible itertool, an ideal candidate should: > * work well in combination with other itertools > * be a fundamental building block > * accept all iterables as inputs > * return only an iterator as an output > * run lazily so as not to force the inputs to run to completion > unless externally requested by list() or some such. > * consume constant memory (this rule was bent for itertools.cycle(), > but should be followed as much as possible). > * run finitely if some of the inputs are finite (itertools.repeat(), > count() and cycle() are the only intentionally infinite tools) > > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. Actually, some case can be made: loading prepares a heap, iterating extracts the heap top. sit = isort(someiter) sit.next() is the winner. Then sit.next() is second-place (or a tie with the winner). q = sit.next() [q] + takewhile(lambda x: x==q, sit) is all who tied with the runner-up. Which isn't to say I think it fits. But there are reasons to get everything and then dole out parts. -Scott David Daniels Scott.Daniels@Acm.Org From theller at python.net Fri Oct 17 18:42:26 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 17 18:42:31 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: (Martin v.'s message of "18 Oct 2003 00:10:49 +0200") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> Message-ID: <8ynj1o6l.fsf@python.net> martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) writes: > Thomas Heller writes: > >> I'm undecided on these modules, I do not use them now but may in the >> future - so I'm undecided: >> >> _csv winsound datetime bz2 > > I think Guido's point that you should be able to build pythonxy.dll > without downloading additional source is good, so _csv, winsound, > datetime would go in, and bz2 would stay out. Yes, and _ssl would also stay out (it seems I forgot to list it). The only module needing external source is zlib - and this is one I care about because it may be useful for zipimport of compressed modules. Can't we simply import the zlib sources into Python's CVS? Thomas From aleaxit at yahoo.com Fri Oct 17 18:44:47 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 18:44:51 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310180044.47083.aleaxit@yahoo.com> On Friday 17 October 2003 11:33 pm, Batista, Facundo wrote: ... > #- I don't think these are the rules in the European Union > #- (they're popular > #- in statistics, but, I suspect, not legally correct in > #- accounting). I can try > #- to research that, if you need me to. > > Please. Because I found it in FixedPoint, and researching, think that in > Argentina that's the way banks get rounded money. Found it -- article 5 of the Council Regulation which established the Euro a few years ago is titled "Rounding" and specifies (I quote selectively): """ shall be rounded up or down to the nearest cent ... If ... a result ... is exactly half-way, the sum shall be rounded up. 
""" The regulation goes on to show cases in which two conversions back and forth (EUR to/from older currencies) can lose or gain a cent, and specifies: """ such difference cannot be invoked to dispute the correctness of payments. The difference must be allowed as a 'tolerance' insofar as it results from the application of the European Regulation. This 'tolerance' should also be incorporated in data processing programmes, especially accounting programmes, in order to avoid problems connected with the reconciliation of amounts. """ The Visual Basic FAQ, for example, explicitly warns that VB does *NOT* respect the legal requirements of Euro conversion rules. The Euro rules are summarized in the FAQ as: """ When rounding to an x number of decimals, the last decimal must be: - Rounded down (i.e. left alone) when the following decimal (if any) is 4 or less. - Rounded up when the following decimal is 5 or more. """ while VB's rules are: """ If after the digit that is to be rounded, the digits following are exactly equal to 5, the value is rounded to the NEAREST EVEN NUMBER. """ (I _think_ it means "the digit ... is", NOT "the digits ... are"). In fact, follow-ons clarify that VB isn't fully coeherent on these rules (hah). But the point remains: rounding half a cent to even rather than always up violates European Union law; nor can the "tolerance rule" be invoked, because it's specifically limited to one-cent discrepancies that "result from the application of the European Regulation", while this one would result from the _violation_ thereof. Oh BTW, other sites quite explicitly state that the rule applies throughout the EU, _not_ only to countries that have adopted the Euro. FWIW, Rogue Wave's Money class lets you specify _either_ rounding approach -- ROUND_PLAIN specifies EU-rules-compliant rounding, ROUND_BANKERS specifies round-to-even, for exactly in-between amounts. 
Offhand, it would seem impossible to write an accounting program that respects the law in Europe AND the praxis you mention at the same time, unless you somehow tell it what rule to use. Sad, and seems weird to go to such trouble for a cent, but accountants live and die by such minutiae: I think it would not be wise to ignore them, PARTICULARLY if we name the type so as to make it appear to the uninitiated that it "will do the right thing" regarding rounding... when there isn't ONE right thing, it depends on locale &c:-(. Alex From python at rcn.com Fri Oct 17 18:46:54 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 18:47:36 2003 Subject: [Python-Dev] RE: itertools, was RE: list.sort In-Reply-To: <200310180018.15836.aleaxit@yahoo.com> Message-ID: <003201c39500$9006a8c0$e841fea9@oemcomputer> [Raymond] > > To be considered as a possible itertool, an ideal candidate should: [Alex] > Very nice set of specs! Thanks! > Which reminds me: why don't we have take(n, it) > and drop(n, it) there? I find myself rewriting those quite often. Yeah, me too. When you write them, do they return lists or iterators? For me, take() has been most useful in list form, but my point of view is biased because I use it to experiment with itertool suggestions and need an easy way manifest a portion of a potentially infinite iterator. My misgivings about drop() and take() are, firstly, that they are expressible in-terms of islice() so they don't really add any new capability. Secondly, the number of tools needs to be kept to a minimum -- already, the number of tools is large enough to complicate the task of figuring out how to use them in combination -- the examples page in the docs is intended, in part, to record the best discoveries so they won't have to be continually re-invented. Raymond Hettinger P.S. 
Itertool tip for the day: to generate a stream of random numbers, write: starmap(random.random, repeat(())) From guido at python.org Fri Oct 17 18:47:28 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 18:48:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 17:58:06 EDT." <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <200310172247.h9HMlSI07690@12-236-54-216.client.attbi.com> > Hm. What if list comprehensions returned a "lazy list", that if you took > an iterator of it, you'd get a generator-iterator, but if you tried to use > it as a list, it would populate itself? Then there'd be no need to ever > *not* use a listcomp, and only one syntax would be necessary. > > More specifically, if all you did with the list was iterate over it, and > then throw it away, it would never actually populate itself. The principle > drawback to this idea from a semantic viewpoint is that listcomps can be > done over expressions that have side-effects. :( I don't think this can be done without breaking b/w compatibility. Example: a = [x**2 for x in range(10)] for i in a: print i print a Your proposed semantics would throw away the values in the for loop, so what would it print in the third line? --Guido van Rossum (home page: http://www.python.org/~guido/) From mike at nospam.com Fri Oct 17 19:01:45 2003 From: mike at nospam.com (Mike Rovner) Date: Fri Oct 17 19:01:44 2003 Subject: [Python-Dev] Re: prePEP: Money data type References: Message-ID: Batista, Facundo wrote: > #- Good, but the name seems ambiguous -- I would expect 'money' > #- to include > #- a *currency unit*, while these are just numbers.
E.g., > #- these days for me a > #- "money amount" of "1000" isn't immediately significant -- > #- does it mean "old > #- liras", Euros, SEK, ...? If a clearer name (perhaps > #- Decimal?) was adopted, > #- the type's purposes would be also clearer, perhaps. > > Specifically it doesn't diferenciate it. It is printed with a '$' > prefix, but that's all. >From the prePEP it's not clear (for me) the purpose of curencySymbol. If it's intended for localisation, then prefix isn't enough, some countries use suffix or even such format Money(123.45, 2) --> 123 FF 45 GG where FF is suffix1 and GG is suffix2. Regards, Mike PS. If it's not appropriate to post such comments to c.l.p.dev, just tell me. From pje at telecommunity.com Fri Oct 17 19:06:22 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 17 19:06:25 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310172247.h9HMlSI07690@12-236-54-216.client.attbi.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031017185718.0256f490@mail.telecommunity.com> At 03:47 PM 10/17/03 -0700, Guido van Rossum wrote: > > Hm. What if list comprehensions returned a "lazy list", that if you took > > an iterator of it, you'd get a generator-iterator, but if you tried to use > > it as a list, it would populate itself? Then there'd be no need to ever > > *not* use a listcomp, and only one syntax would be necessary. > > > > More specifically, if all you did with the list was iterate over it, and > > then throw it away, it would never actually populate itself. The > principle > > drawback to this idea from a semantic viewpoint is that listcomps can be > > done over expressions that have side-effects. 
:( > >I don't think this can be done without breaking b/w compatibility. Example: > > a = [x**2 for x in range(10)] > for i in a: print i > print a > >Your proposed semantics would throw away the values in the for loop, >so what would it print in the third line? I should've been more specific... some pseudocode: class LazyList(list): materialized = False def __init__(self, generator_func): self.generator = generator_func def __iter__(self): # When iterating, use the generator, unless # we've already computed contents. if self.materialized: return super(LazyList,self).__iter__() else: return self.generator() def __getitem__(self,index): if not self.materialized: self[:] = list(self.generator()) self.materialized = True return super(LazyList,self).__getitem__(index) def __len__(self): if not self.materialized: self[:] = list(self.generator()) self.materialized = True return super(LazyList,self).__len__() # etc. So, the problem isn't that the code you posted would fail on 'print a', it's that the generator function would be run *twice*, which would be a no-no if it had side effects, and would also take longer. It was just a throwaway idea, in the hopes that maybe it would lead to an idea that would actually work. Ah well, maybe in Python 3.0, there'll just be itercomps, and we'll use list(itercomp) when we want a list. From aleaxit at yahoo.com Fri Oct 17 19:29:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 19:29:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> References: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017175033.02fea090@mail.telecommunity.com> Message-ID: <200310180129.42506.aleaxit@yahoo.com> On Friday 17 October 2003 11:58 pm, Phillip J. Eby wrote: ... > Hm. 
What if list comprehensions returned a "lazy list", that if you took > an iterator of it, you'd get a generator-iterator, but if you tried to use > it as a list, it would populate itself? Then there'd be no need to ever > *not* use a listcomp, and only one syntax would be necessary. > > More specifically, if all you did with the list was iterate over it, and > then throw it away, it would never actually populate itself. The principle > drawback to this idea from a semantic viewpoint is that listcomps can be > done over expressions that have side-effects. :( The big problem I see is e.g. as follows: l1 = range(6) lc = [ x for x in l1 ] for a in lc: l1.append(a) (or insert the LC inline in the for, same thing either way I'd sure hope). Today, this is perfectly well-defined, since the LC "takes a snapshot" when evaluated -- l1 becomes a 12-elements list, as if I had done l1 *= 2. But if lc _WASN'T_ "populated"... shudder... it would be as nonterminating as "for a in l1:" same loop body. Unfortunately, it seems to me that turning semantics from strict to lazy is generally unfeasible because of such worries (even if one could somehow ignore side effects). Defining semantics as lazy in the first place is fine: as e.g. "for a in iter(l1):" has always produced a nonterminating loop for that body (iter has always been lazy), people just don't use it. But once it has been defined as strict, going to lazy is probably unfeasible. Pity... Alex From aleaxit at yahoo.com Fri Oct 17 19:43:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 17 19:43:42 2003 Subject: [Python-Dev] Re: itertools, was RE: list.sort In-Reply-To: <003201c39500$9006a8c0$e841fea9@oemcomputer> References: <003201c39500$9006a8c0$e841fea9@oemcomputer> Message-ID: <200310180143.36999.aleaxit@yahoo.com> On Saturday 18 October 2003 12:46 am, Raymond Hettinger wrote: ... 
> My misgivings about drop() and take() are, firstly, that they > are expressible in-terms of islice() so they don't really add > any new capability. Secondly, the number of tools needs to be True. I gotta remember that -- I find it unintuitive, maybe it's islice's odious range-like ordering of arguments. Alex From martin at v.loewis.de Fri Oct 17 18:51:03 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri Oct 17 19:57:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <8ynj1o6l.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <8ynj1o6l.fsf@python.net> Message-ID: <3F907257.1030406@v.loewis.de> Thomas Heller wrote: > Yes, and _ssl would also stay out (it seems I forgot to list it). The > only module needing external source is zlib - and this is one I care > about because it may be useful for zipimport of compressed > modules. Can't we simply import the zlib sources into Python's CVS? I would advise against that: On Unix, it wouldn't be used, because people would ask that the platform's zlib shared library is used. It appears that in this specific case, Guido is willing to compromise that downloading zlib source to build pythonxy.dll could be acceptable. Also, in this specific case, making it easy to remove zlib support would be possible: add a HAVE_ZLIB in pyconfig.h, and put HAVE_ZLIB around the reference in config.c. Anybody who does not want to download zlib would need to edit pyconfig.h (or perhaps the pythoncore project). Regards, Martin From guido at python.org Fri Oct 17 19:57:45 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 19:57:54 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: Your message of "Sat, 18 Oct 2003 00:42:26 +0200." 
<8ynj1o6l.fsf@python.net> References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <8ynj1o6l.fsf@python.net> Message-ID: <200310172357.h9HNvjC07788@12-236-54-216.client.attbi.com> > The only module needing external source is zlib - and this is one I > care about because it may be useful for zipimport of compressed > modules. Can't we simply import the zlib sources into Python's CVS? I don't like that very much; there are always licensing issues. (Even though we did do this for expat.) --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Fri Oct 17 21:10:05 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 21:11:36 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <000201c39514$ac006f20$e841fea9@oemcomputer> [GvR] > I'd just like to pipe into this discussion saying that while Peter > Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main > reason that the proposed notation doesn't return a list. I agree that > having generator comprehensions would be a more general solution. I > don't have a proposal for generator comprehension syntax though, and > [yield ...] has the same problem. Is Phil's syntax acceptable to everyone? (yield: x*x for x in roots) I think this form works nicely. looking-for-resolution-and-consensus-ly yours, Raymond From python at rcn.com Fri Oct 17 21:25:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 17 21:26:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Message-ID: <000401c39516$c440b520$e841fea9@oemcomputer> [GvR] > > > like type checkers (and IMO are also easily overseen by human > > > readers).? And, it's easier to write l.sorted() rather than > > > l.sort(inline=True). 
[Aahz] > > Let's make explicit: l.copysort() > > > > I'm not a big fan of grammatical suffixes for > distinguishing between > > similar meanings. > > +1 [Facundo] > +2, considering that the difference in behaviour with sort and > sorted it's no so clear to a non-english speaker. FWIW, I've posted a patch to implement list.copysort() that includes a news announcement, docs, and unittests: www.python.org/sf/825814 Raymond Hettinger From guido at python.org Fri Oct 17 23:20:31 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 23:20:42 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Fri, 17 Oct 2003 21:25:50 EDT." <000401c39516$c440b520$e841fea9@oemcomputer> References: <000401c39516$c440b520$e841fea9@oemcomputer> Message-ID: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> > FWIW, I've posted a patch to implement list.copysort() that > includes a news announcement, docs, and unittests: > > www.python.org/sf/825814 Despite my suggesting a better name, I'm not in favor of this (let's say -0). For one, this will surely make lots of people write for key in D.keys().copysort(): ... which makes an unnecessary copy of the keys. I'd rather continue to write keys = D.keys() keys.sort() for key in keys: ... --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Fri Oct 17 23:28:39 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Oct 17 23:28:47 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> At 09:10 PM 10/17/03 -0400, Raymond Hettinger wrote: >[GvR] > > I'd just like to pipe into this discussion saying that while Peter > > Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main > > reason that the proposed notation doesn't return a list. I agree that > > having generator comprehensions would be a more general solution. I > > don't have a proposal for generator comprehension syntax though, and > > [yield ...] has the same problem. > >Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) Ironically, I'm opposed. :)

* Yield is a control flow statement, this is an expression
* yield: looks like lambda, and this is not a function
* Yield only makes sense if you come into this thinking about generators
* Yield distracts from the purpose of the expression

To put it another way, Python is "executable pseudocode". Listcomps are pseudocode. Yield in a generator is pseudocode. (x*x for x in roots) is pseudocode. But (yield: x*x for x in roots) looks like some kind of weird programming language gibberish. :) I think the worst misinterpretation I could have about the yield-less syntax is that I might think it was a "tuple comprehension" or something that returned a sequence instead of an iterator. However, I'll find out it's not a sequence or tuple if I try to do anything with it that requires a sequence or tuple. My worst case problem is re-execution of the iterator. Which, by the way, brings up a question: should iterator comps be reiterable? I don't see any reason right now why they shouldn't be, and can think of situations where reiterability would be useful.
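For reference, the one-shot behaviour at issue, and the "preserve reiterability" idea being raised, can be sketched using the parenthesized generator-expression spelling that appears later in the thread. The ReiterableComp wrapper name below is invented for illustration only, not something proposed verbatim:

```python
class ReiterableComp(object):
    """Wrap a no-argument generator function; each iter() call
    re-invokes it, so the result is reiterable whenever the
    underlying iterable is."""
    def __init__(self, genfunc):
        self.genfunc = genfunc

    def __iter__(self):
        return self.genfunc()

roots = [1, 2, 3]

# A bare generator expression is one-shot:
g = (x * x for x in roots)
once = list(g)       # [1, 4, 9]
again = list(g)      # [] (already exhausted)

# The wrapper restarts the underlying generator on every iteration:
squares = ReiterableComp(lambda: (x * x for x in roots))
first = list(squares)
second = list(squares)
```

Note that the wrapper is only as reiterable as its base: if the lambda closed over a one-shot iterator instead of a list, the second pass would come up empty.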
From aahz at pythoncraft.com Fri Oct 17 23:54:46 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 17 23:54:50 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <000201c39514$ac006f20$e841fea9@oemcomputer> Message-ID: <20031018035445.GA14929@panix.com> On Fri, Oct 17, 2003, Raymond Hettinger wrote: > [GvR] >> >> I'd just like to pipe into this discussion saying that while Peter >> Norvig's pre-PEP is neat, I'd reject it if it were a PEP; the main >> reason that the proposed notation doesn't return a list. I agree that >> having generator comprehensions would be a more general solution. I >> don't have a proposal for generator comprehension syntax though, and >> [yield ...] has the same problem. > > Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) I'm not sure. Let's try it out:

    for square in (yield: x*x for x in roots):
        print square

That doesn't look *too* bad. Okay, how about this:

    def grep(pattern, iter):
        pattern = re.compile(pattern)
        for item in iter:
            if pattern.search(str(item)):
                yield item

    for item in grep("1", (yield: x*x for x in roots) ):
        print item

Now that looks disgusting. OTOH, I doubt any syntax for a generator comprehension could improve that. On the gripping hand, I'm concerned that we're going in Lisp's direction with too many parens. At least with the listcomp you have more of a visual cue:

    for item in grep("1", [x*x for x in roots] ):

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code."
--Bill Harlan From aahz at pythoncraft.com Fri Oct 17 23:57:05 2003 From: aahz at pythoncraft.com (Aahz) Date: Fri Oct 17 23:57:07 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <20031018035705.GB14929@panix.com> On Fri, Oct 17, 2003, Guido van Rossum wrote: >Raymond: >> >> FWIW, I've posted a patch to implement list.copysort() that >> includes a news announcement, docs, and unittests: > > Despite my suggesting a better name, I'm not in favor of this (let's > say -0). I'm actually -1, particularly with your clear argument; I just didn't like your suggestion of l.sorted(). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From guido at python.org Fri Oct 17 23:57:21 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 17 23:57:38 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: Your message of "Fri, 17 Oct 2003 23:28:39 EDT." <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> Message-ID: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> > Which, by the way, brings up a question: should iterator comps be > reiterable? I don't see any reason right now why they shouldn't be, and > can think of situations where reiterability would be useful. Oh, no. Not reiterability again. How can you promise something to be reiterable if you don't know whether the underlying iterator can be reiterated? Keeping a hidden buffer would be a bad idea. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sat Oct 18 02:07:26 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 02:07:36 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <20031018035705.GB14929@panix.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <20031018035705.GB14929@panix.com> Message-ID: <3F90D89E.3060706@ocf.berkeley.edu> Aahz wrote: > On Fri, Oct 17, 2003, Guido van Rossum wrote: > >>Raymond: >> >>>FWIW, I've posted a patch to implement list.copysort() that >>>includes a news announcement, docs, and unittests: >> >>Despite my suggesting a better name, I'm not in favor of this (let's >>say -0). > > > I'm actually -1, particularly with your clear argument; I just didn't > like your suggestion of l.sorted(). I'm -1 as well. Lists do not need to grow a method for something that only replaces two lines of code that are not tricky in any form of the word. -Brett From python at rcn.com Sat Oct 18 02:52:46 2003 From: python at rcn.com (Raymond Hettinger) Date: Sat Oct 18 02:53:29 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <001f01c39544$70314de0$e841fea9@oemcomputer> [Raymond] > > FWIW, I've posted a patch to implement list.copysort() that > > includes a news announcement, docs, and unittests: > > > > www.python.org/sf/825814 [Guido] > Despite my suggesting a better name, I'm not in favor of this (let's > say -0). > > For one, this will surely make lots of people write > > for key in D.keys().copysort(): > ... > > which makes an unnecessary copy of the keys. I'd rather continue to > write > > keys = D.keys() > keys.sort() > for key in keys: > ... Interesting that you saw this at the same time I was fretting about it over dinner. 
The solution is to bypass the copy step for the common case of:

    for elem in somelistmaker().copysort():
        . . .

The revised patch is at: www.python.org/sf/825814 The technique is to re-use the existing list whenever the refcount is one. This keeps the mutation invisible. Advantages of a copysort() method:

* Avoids creating an unnecessary, stateful variable that remains visible after the sort is no longer needed. In the above example, the definition of the "keys" variable changes from unsorted to sorted. Also, the lifetime of the variable extends past the loop where it was intended to be used. In longer code fragments, this unnecessarily increases code complexity, code length, and the number of variables, and increases the risk of using a variable in the wrong state, which is a common source of programming errors.

* By avoiding control flow (the assignments in the current approach), an inline sort becomes usable anywhere an expression is allowed. This includes important places like function call arguments and list comprehensions:

    todo = [t for t in tasks.copysort() if due_today(t)]
    genhistory(date, events.copysort(key=incidenttime))

Spreading these out over multiple lines is an unnecessary distraction from the problem domain, resulting in code that is harder to read, write, visually verify, grok, or debug. Raymond Hettinger P.S. There are probably better names than copysort, but the idea still holds.
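For illustration, the copysort() semantics under discussion reduce to a few lines of pure Python. This sketch omits the C-level refcount trick (which cannot be expressed at the Python level); keyword options are simply forwarded to list.sort():

```python
def copysort(iterable, **kwds):
    """Return a new sorted list, leaving the input untouched.

    A Python-level sketch of the proposed copysort, minus the
    refcount optimization: this version always copies.
    """
    lst = list(iterable)   # snapshot the input
    lst.sort(**kwds)       # same options as list.sort()
    return lst

tasks = ["deploy", "build", "test"]
ordered = copysort(tasks)   # tasks itself is unchanged
```

Because the copy is explicit, the function works on any iterable (sets, dict keys, generators), not just lists.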
From aleaxit at yahoo.com Sat Oct 18 05:20:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 05:20:55 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> Message-ID: <200310181120.45477.aleaxit@yahoo.com> On Saturday 18 October 2003 05:57 am, Guido van Rossum wrote: > > Which, by the way, brings up a question: should iterator comps be > > reiterable? I don't see any reason right now why they shouldn't be, and > > can think of situations where reiterability would be useful. > > Oh, no. Not reiterability again. How can you promise something to be > reiterable if you don't know whether the underlying iterator can be > reiterated? Keeping a hidden buffer would be a bad idea. I agree it would be bad to have "black magic" performed by every iterator to fulfil a contract that may or may not be useful to clients and might be costly to fulfil. IF "reiterability" is useful (and I'd need to see some use cases, because I don't particularly recall pining for it in Python) it should be exposed as a separate protocol that may or may not be offered by any given iterator type. E.g., the presence of a special method __reiter__ could indicate that this iterator IS able to supply another iterator which retraces the same steps from the start; and perhaps iter(xxx, reiterable=True) could strive to provide a reiterable iterator for xxx, which might justify building one that keeps a hidden buffer as a last resort. But first, I'd like use cases... There ARE other features I'd REALLY have liked to get from iterators in some applications. 
A "snapshot" -- providing me two iterators, the original one and another, which will step independently over the same sequence of items -- would have been really handy at times. And a "step back" facility ("undo" of the last call to next) -- sometimes one level would suffice, sometimes not; often I could have provided the item to be "pushed back" so the iterator need not retain memory of it independently, but that wouldn't always be handy. Now any of these can be built as a wrapper over an existing iterator, of course -- just like 'reiterability' could (and you could in fact easily implement reiterability in terms of snapshotting, by just ensuring a snapshot is taken at the start and further snapshotted but never disturbed); but not knowing the abilities of the underlying iterator would mean these wrappers would often duplicate functionality needlessly. E.g.:

    class snapshottable_sequence_iter(object):
        def __init__(self, sequence, i=0):
            self.sequence = sequence
            self.i = i

        def __iter__(self):
            return self

        def next(self):
            try:
                result = self.sequence[self.i]
            except IndexError:
                raise StopIteration
            self.i += 1
            return result

        def snapshot(self):
            return self.__class__(self.sequence, self.i)

Here, snapshotting is quite cheap, requiring just a new counter and another reference to the same underlying sequence. So would be restarting and stepping back, directly implemented. But if we need to wrap a totally generic iterator to provide "snapshottability", we inevitably end up keeping a list (or the like) of items so far seen from one but not both 'independent' iterators obtained by a snapshot -- all potentially redundant storage, not to mention the possible coding trickiness in maintaining that FIFO queue. As I said I do have use cases for all of these.
Simplest is the ability to push back the last item obtained by next, since a frequent pattern is:

    for item in iterator:
        if isok(item):
            process(item)
        else:
            # need to push item back onto iterator, then break
    else:
        # all items were OK, iterator exhausted, blah blah

...and later...

    for item in iterator:
        # process some more items

Of course, as long as just a few levels of pushback are enough, THIS one is an easy and light-weight wrapper to write:

    class pushback_wrapper(object):
        def __init__(self, it):
            self.it = it
            self.pushed_back = []

        def __iter__(self):
            return self

        def next(self):
            try:
                return self.pushed_back.pop()
            except IndexError:
                return self.it.next()

        def pushback(self, item):
            self.pushed_back.append(item)

A "snapshot" would be useful whenever more than one pass on a sequence _or part of it_ is needed (more useful than a "restart" because of the "part of it" provision). And a decent wrapper for it is a bear... Alex From mrussell at verio.net Sat Oct 18 05:44:35 2003 From: mrussell at verio.net (Mark Russell) Date: Sat Oct 18 05:46:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <3F90D89E.3060706@ocf.berkeley.edu> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <20031018035705.GB14929@panix.com> <3F90D89E.3060706@ocf.berkeley.edu> Message-ID: <1066470275.1346.25.camel@straylight> On Sat, 2003-10-18 at 07:07, Brett C. wrote: > I'm -1 as well. Lists do not need to grow a method for something that > only replaces two lines of code that are not tricky in any form of the word. And don't forget that the trivial function will sort any iterable, not just lists. I think

    for member in copysort(someset):

is better than

    for member in list(someset).copysort():

I'm against list.copysort(), and for either leaving things unchanged or adding copysort() as a builtin (especially if it can use the reference count trick to avoid unnecessary copies).
Mark Russell From aleaxit at yahoo.com Sat Oct 18 07:31:17 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 07:31:23 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066470275.1346.25.camel@straylight> References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F90D89E.3060706@ocf.berkeley.edu> <1066470275.1346.25.camel@straylight> Message-ID: <200310181331.17795.aleaxit@yahoo.com> On Saturday 18 October 2003 11:44 am, Mark Russell wrote: > On Sat, 2003-10-18 at 07:07, Brett C. wrote: > > I'm -1 as well. Lists do not need to grow a method for something that > > only replaces two lines of code that are not tricky in any form of the > > word. > > And don't forget that the trivial function will sort any iterable, not > just lists. I think > > for member in copysort(someset): > > is better than > > for member in list(someset).copysort(): > > I'm against list.copysort(), and for either leaving things unchanged or > adding copysort() as a builtin (especially if it can use the reference > count trick to avoid unnecessary copies). The trick would need to check that the argument is a list, of course, as well as checking that the reference to it on the stack is the only one around. But given this, yes, I guess a built-in would be "better" by occasionally saving the need to type a few extra characters (though maybe "worse" by enlarging the built-in module rather than remaining inside the smaller namespace of the list type...?). The built-in, or method, 'copysort', would have to accept the same optional arguments as the sort method of lists has just grown, of course. 
Alex From aleaxit at yahoo.com Sat Oct 18 08:26:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 08:26:45 2003 Subject: [Python-Dev] The Trick (was Re: copysort patch, was Re: inline sort option) In-Reply-To: <200310181331.17795.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <1066470275.1346.25.camel@straylight> <200310181331.17795.aleaxit@yahoo.com> Message-ID: <200310181426.39116.aleaxit@yahoo.com> Wondering about the trick of copysort not copying a singly-referenced list I decided to try it out in a tiny extension module, and, yes, it is just as trivial as one might wish (haven't dealt with optional args to sort, just wanting to check performance &c):

    static PyObject*
    copysort(PyObject* self, PyObject* args)
    {
        PyObject *sequence, *listresult;
        if(!PyArg_ParseTuple(args, "O", &sequence))
            return 0;
        if(PyList_CheckExact(sequence) && sequence->ob_refcnt==1) {
            listresult = sequence;
            Py_INCREF(listresult);
        } else {
            listresult = PySequence_List(sequence);
        }
        if(listresult) {
            if(PyList_Sort(listresult) == -1) {
                Py_DECREF(listresult);
                listresult = 0;
            }
        }
        return listresult;
    }

and performance on an equally trivial testcase:

    x = dict.fromkeys(range(99999))

    def looponsorted1(x):
        keys = x.keys()
        keys.sort()
        for k in keys:
            pass

    def looponsorted2(x, c=copysort.copysort):
        for k in c(x.keys()):
            pass

turns out to be identical between the two _with_ The Trick (4.4e+04 usec with timeit.py -c on my box) while without The Trick copysort would slow down to about 5.5e+04 usec. But, this reminds me -- function filter, in bltinmodule.c, uses just about the same trick too (to modify in-place when possible rather than making a new list -- even though when it does make a new list it's an empty one, not a copy, so the gain is less). There must be other cases of applicability which just haven't been considered. So... Shouldn't The Trick be embodied in PySequence_List itself...?
So, the whole small tricky part above:

    if(PyList_CheckExact(sequence) && sequence->ob_refcnt==1) {
        listresult = sequence;
        Py_INCREF(listresult);
    } else {
        listresult = PySequence_List(sequence);
    }

would collapse to a single PySequence_List call -- *AND* potential calls from Python code such as "x=list(somedict.keys())" might also be speeded up analogously... [Such a call looks silly when shown like this, but in some cases one might not know, in polymorphic use, whether a method returns a new or potentially shared list, or other sequence, and a call to list() on the method's result then may be needed to ensure the right semantics in all cases]. Is there any hidden trap in The Trick that makes it unadvisable to insert it in PySequence_List? Can't think of any, but I'm sure y'all will let me know ASAP what if anything I have overlooked...;-). One might even be tempted to reach down all the way to PyList_GetSlice, placing THERE The Trick in cases of all-list slicing of a singly-referenced list (PyList_GetSlice is what PySequence_List uses, so it would also get the benefit), but that might be overdoing it -- and encouraging list(xxx) instead of xxx[:], by making the former a bit faster in some cases, would be no bad thing IMHO (already I'm teaching newbies to prefer using list(...) rather than ...[:] strictly for legibility and clarity, being able to mention possible future performance benefits might well reinforce the habit...;-).
Alex From marktrussell at btopenworld.com Sat Oct 18 08:32:53 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Sat Oct 18 08:35:14 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310181331.17795.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F90D89E.3060706@ocf.berkeley.edu> <1066470275.1346.25.camel@straylight> <200310181331.17795.aleaxit@yahoo.com> Message-ID: <1066480373.1942.50.camel@straylight> On Sat, 2003-10-18 at 12:31, Alex Martelli wrote: > The built-in, or method, 'copysort', would have to accept the same > optional arguments as the sort method of lists has just grown, of course. Yes. In fact one point in its favour is that there aren't any choices to be made - the interface should track that of list.sort(), so it avoids the usual objection to trivial functions that there are many possible variants. Mark Russell From pf_moore at yahoo.co.uk Sat Oct 18 09:18:30 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sat Oct 18 09:18:20 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.15274.781344.230479@montanaro.dyndns.org> <16272.5373.514560.225999@montanaro.dyndns.org> <200310171821.39895.aleaxit@yahoo.com> <16272.6895.233187.510629@montanaro.dyndns.org> <200310171903.42578.aleaxit@yahoo.com> <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> <16272.15274.781344.230479@montanaro.dyndns.org> <3F90472C.9060702@ocf.berkeley.edu> <5.1.0.14.0.20031017160243.03453220@mail.telecommunity.com> Message-ID: <8yniofa1.fsf@yahoo.co.uk> "Phillip J. Eby" writes: > Which of course means there'd be little need for imap and ifilter, > just as there's now little need for map and filter. > > Anyway, if you look at '.. for .. in .. [if ..]' as a ternary or > quaternary operator on an iterator (or iterable) that returns an > iterator, it makes a lot more sense than thinking of it as having > anything to do with generator(s).
(Even if it might be implemented > that way.) I've reached the point of skimming this discussion, but this struck a chord. I think the original proposal (for special syntax for accumulators) is too limited, and if anything is needed (not clear on that) it should be a generalised iterator comprehension construct. In that context, it seems to me that iterator comprehensions bear a very similar relationship to imap/ifilter to the relationship between map/filter and list comprehensions. Paul. -- This signature intentionally left blank From pje at telecommunity.com Sat Oct 18 10:04:52 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Oct 18 10:05:04 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> Message-ID: <5.1.0.14.0.20031018094933.03ce4780@mail.telecommunity.com> At 08:57 PM 10/17/03 -0700, Guido van Rossum wrote: > > Which, by the way, brings up a question: should iterator comps be > > reiterable? I don't see any reason right now why they shouldn't be, and > > can think of situations where reiterability would be useful. > >Oh, no. Not reiterability again. How can you promise something to be >reiterable if you don't know whether the underlying iterator can be >reiterated? Keeping a hidden buffer would be a bad idea. I think I phrased my question poorly. What I should have said was: "Should iterator expressions preserve the reiterability of the base expression?" I don't want to make them guarantee reiterability, only to preserve it if it already exists. Does that make more sense? In essence, this would be done by having an itercomp expression resolve to an object whose __iter__ method calls the underlying generator, returning a generator-iterator. 
Thus, any iteration over the itercomp is equivalent to calling a no-arguments generator. The result is reiterable if the base iterable is reiterable, otherwise not. I suppose technically, this means the itercomp doesn't return an iterator, but an iterable, which I suppose could be confusing if you try to call its 'next()' method. But then, it could have a next() method that raises an error saying "call 'iter()' on me first". From niemeyer at conectiva.com Sat Oct 18 10:47:03 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sat Oct 18 10:48:14 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <20031018144703.GA10212@ibook> The SRE recursion removal patch is finally in. Please, let me know if you find any problems. -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Sat Oct 18 11:14:19 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 11:14:25 2003 Subject: [Python-Dev] The Trick (was Re: copysort patch, was Re: inline sort option) In-Reply-To: <200310181426.39116.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181331.17795.aleaxit@yahoo.com> <200310181426.39116.aleaxit@yahoo.com> Message-ID: <200310181714.19688.aleaxit@yahoo.com> On Saturday 18 October 2003 02:26 pm, Alex Martelli wrote: ...oops... > x = dict.fromkeys(range(99999)) here, x.keys() IS already sorted, so the importance of The Trick is emphasized because the sort itself has little work to do: > turns out to be identical between the two _with_ The Trick (4.4e+04 usec > with timeit.py -c on my box) while without The Trick copysort would slow > down to about 5.5e+04 usec. 
I've changed the initialization of x to > x = dict.fromkeys(map(str,range(99999))) so that x.keys() is not already sorted (still has several runs that the sort will exploit -- perhaps representative of some real-world sorts...;-) and the numbers change to about 240 milliseconds with The Trick (or with separate statements to get and sort the keys), 265 without -- so, more like 10% advantage, NOT 20%-ish (a list.copysort method, from Raymond's patch, has 240 milliseconds too -- indeed it's just about the same code I was using in the standalone function I posted, give or take some level of indirectness in C calls that clearly don't matter much here). Of course, the % advantage will vary with the nature of the list (how many runs that sort can exploit) and be bigger for smaller lists (given we're comparing O(N) copy efforts vs O(N log N) sorting efforts). Alex From skip at pobox.com Sat Oct 18 11:16:00 2003 From: skip at pobox.com (Skip Montanaro) Date: Sat Oct 18 11:16:14 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> Message-ID: <16273.22832.456737.861600@montanaro.dyndns.org> Guido> For one, this will surely make lots of people write Guido> for key in D.keys().copysort(): Guido> ... Guido> which makes an unnecessary copy of the keys. It might be viewed as unnecessary if you intend to change D's keys within the loop. Guido> keys = D.keys() Guido> keys.sort() Guido> for key in keys: Guido> ... Current standard practice is also fine. 
Skip From aleaxit at yahoo.com Sat Oct 18 11:43:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 11:43:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <16273.22832.456737.861600@montanaro.dyndns.org> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <16273.22832.456737.861600@montanaro.dyndns.org> Message-ID: <200310181743.38959.aleaxit@yahoo.com> On Saturday 18 October 2003 05:16 pm, Skip Montanaro wrote: > Guido> For one, this will surely make lots of people write > > Guido> for key in D.keys().copysort(): > Guido> ... > > Guido> which makes an unnecessary copy of the keys. > > It might be viewed as unnecessary if you intend to change D's keys within > the loop. D.keys() makes a _snapshot_ of the keys of D -- it doesn't matter what you do to D in the loop's body. Admittedly, that's anything but immediately obvious (quite apart from copysorting or whatever) -- I've seen people change perfectly good code of the form: for k in D.keys(): vastly_alter_a_dictionary(D, k) into broken code of the form: for k in D: vastly_alter_a_dictionary(D, k) because of having missed this crucial difference -- snapshot in the first case, but NOT in the second one. And viceversa, I've seen people carefully copy.copy(D.keys()) or the equivalent to make sure they did not suffer from modifying D in the loop's body -- the latter is in a sense even worse, because the bad effects of the former tend to show up pretty fast as weird bugs and broken unit-tests, while the latter is "just" temporarily wasting some memory and copying time. Anyway, copysort with The Trick, either as a method or function, has no performance problems - exactly the same performance as: > Guido> keys = D.keys() > Guido> keys.sort() > Guido> for key in keys: > Guido> ... > > Current standard practice is also fine. Nolo contendere. It DOES feel a bit like boilerplate, that's all. 
Alex From pf_moore at yahoo.co.uk Sat Oct 18 11:47:54 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sat Oct 18 11:47:38 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> <200310180014.51336.aleaxit@yahoo.com> Message-ID: <4qy6o8d1.fsf@yahoo.co.uk> Alex Martelli writes: > On Friday 17 October 2003 11:45 pm, Guido van Rossum wrote: > ... >> In conclusion, I think this syntax is pretty cool. (It will probably >> die the same death as the ternary expression though.) > > Ah well -- in this case I guess I won't go to the bother of deciding > whether I like your preferred "lighter" syntax or the "stands our more" > one. The sad, long, lingering death of the ternary expression was > too painful to repeat -- let's put this one out of its misery sooner. The saddest thing about the ternary operator saga (and it may be the fate of this as well) was that the people who wanted the *semantics* destroyed their own case by arguing over *syntax*. I suspect that the only way out of this would be for someone to have just implemented it, with whatever syntax they preferred. Then it either goes in or not, with Guido's final veto applying, as always. Possibly the same is the case here. Unless someone implements iterator comprehensions, with whatever syntax they feel happiest with, arguments about syntax are sterile, and merely serve to fragment the discussion, obscuring the more fundamental question of whether the semantics is wanted or not. Paul -- This signature intentionally left blank From guido at python.org Sat Oct 18 12:27:50 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 12:27:58 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Sat, 18 Oct 2003 16:47:54 BST." 
<4qy6o8d1.fsf@yahoo.co.uk> References: <16272.5373.514560.225999@montanaro.dyndns.org> <200310172255.43697.aleaxit@yahoo.com> <200310172145.h9HLjh807477@12-236-54-216.client.attbi.com> <200310180014.51336.aleaxit@yahoo.com> <4qy6o8d1.fsf@yahoo.co.uk> Message-ID: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> > The saddest thing about the ternary operator saga (and it may be the > fate of this as well) was that the people who wanted the *semantics* > destroyed their own case by arguing over *syntax*. I don't see it that way. There were simply too many people who didn't want it in *any* form (and even if they weren't a strict majority, there were certainly too many to ignore). > I suspect that the only way out of this would be for someone to have > just implemented it, with whatever syntax they preferred. Then it > either goes in or not, with Guido's final veto applying, as always. It was implemented (several times). That wasn't the point at all. > Possibly the same is the case here. Unless someone implements iterator > comprehensions, with whatever syntax they feel happiest with, > arguments about syntax are sterile, and merely serve to fragment the > discussion, obscuring the more fundamental question of whether the > semantics is wanted or not. Not true. There are only two major syntax variations contending (with or without yield) and some quibble about parentheses, and everybody here seems to agree that either version could work. The real issue is whether it adds enough to make it worthwhile to change the language (again). My current opinion is that it isn't: for small datasets, the extra cost of materializing the list using a list comprehension is negligeable, so there's no need for a new feature, and if you need to support truly large datasets, you can afford the three extra lines of code it takes to make a custom iterator or generator. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 12:33:49 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 12:34:05 2003 Subject: [Python-Dev] The Trick In-Reply-To: Your message of "Sat, 18 Oct 2003 17:14:19 +0200." <200310181714.19688.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181331.17795.aleaxit@yahoo.com> <200310181426.39116.aleaxit@yahoo.com> <200310181714.19688.aleaxit@yahoo.com> Message-ID: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> I don't like the trick of avoiding the copy if the refcount is one; AFAIK it can't be done in Jython. I think the application area is too narrow to warrant a built-in, *and* lists shouldn't grow two similar methods. Let's keep the language small! (I know, by that argument several built-ins shouldn't exist. Well, they might be withdrawn in 3.0; let's not add more.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 13:17:40 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 13:17:52 2003 Subject: [Python-Dev] Reiterability In-Reply-To: Your message of "Sat, 18 Oct 2003 11:20:45 +0200." <200310181120.45477.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> Message-ID: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> [Guido] > >Oh, no. Not reiterability again. How can you promise something to be > >reiterable if you don't know whether the underlying iterator can be > >reiterated? Keeping a hidden buffer would be a bad idea. [Alex] > I agree it would be bad to have "black magic" performed by every > iterator to fulfil a contract that may or may not be useful to > clients and might be costly to fulfil. 
>
> IF "reiterability" is useful (and I'd need to see some use cases,
> because I don't particularly recall pining for it in Python) it
> should be exposed as a separate protocol that may or may not be
> offered by any given iterator type.  E.g., the presence of a special
> method __reiter__ could indicate that this iterator IS able to
> supply another iterator which retraces the same steps from the
> start; and perhaps iter(xxx, reiterable=True) could strive to
> provide a reiterable iterator for xxx, which might justify building
> one that keeps a hidden buffer as a last resort.  But first, I'd
> like use cases...

In cases where reiterability can be implemented without much effort,
there is already an underlying object representing the sequence
(e.g. a collection object, or an object defining a numerical series).
Reiteration comes for free if you hold on to that underlying object
rather than passing an iterator around.

[Phillip]
> I think I phrased my question poorly.  What I should have said was:
>
> "Should iterator expressions preserve the reiterability of the base
> expression?"

(An iterator expression being something like

  (f(x) for x in S)

right?)

> I don't want to make them guarantee reiterability, only to preserve
> it if it already exists.  Does that make more sense?
>
> In essence, this would be done by having an itercomp expression
> resolve to an object whose __iter__ method calls the underlying
> generator, returning a generator-iterator.  Thus, any iteration over
> the itercomp is equivalent to calling a no-arguments generator.  The
> result is reiterable if the base iterable is reiterable, otherwise
> not.

OK, I think I understand what you're after.  The code for an iterator
expression has to create a generator function behind the scenes, and
call it.  For example:

  A = (f(x) for x in S)

could be translated into:

  def gen(seq):
      for x in seq:
          yield f(x)
  A = gen(S)

(Note that S could be an arbitrary expression and should be evaluated
only once.
This translation does that correctly.)

This allows one to iterate once over A (a generator function doesn't
allow reiteration).  What you are asking looks like it could be done
like this (never mind the local names):

  def gen(seq):
      for x in seq:
          yield f(x)
  class Helper:
      def __init__(self, seq):
          self.seq = seq
      def __iter__(self):
          return gen(self.seq)
  A = Helper(S)

Then every time you use iter(A) gen() will be called with the saved
value of S as argument.

> I suppose technically, this means the itercomp doesn't return an
> iterator, but an iterable, which I suppose could be confusing if you
> try to call its 'next()' method.  But then, it could have a next()
> method that raises an error saying "call 'iter()' on me first".

I don't mind that so much, but I don't think all the extra machinery
is worth it; the compiler generally can't tell if it is needed so it
has to produce the reiterable code every time.  If you *want* to
have an iterable instead of an iterator, it's usually easy enough to do
(especially given knowledge about the type of S).

[Alex again]
> There ARE other features I'd REALLY have liked to get from iterators
> in some applications.
>
> A "snapshot" -- providing me two iterators, the original one and
> another, which will step independently over the same sequence of
> items -- would have been really handy at times.  And a "step back"
> facility ("undo" of the last call to next) -- sometimes one level
> would suffice, sometimes not; often I could have provided the item
> to be "pushed back" so the iterator need not retain memory of it
> independently, but that wouldn't always be handy.
Now any of these
> can be built as a wrapper over an existing iterator, of course --
> just like 'reiterability' could (and you could in fact easily
> implement reiterability in terms of snapshotting, by just ensuring a
> snapshot is taken at the start and further snapshotted but never
> disturbed); but not knowing the abilities of the underlying iterator
> would mean these wrappers would often duplicate functionality
> needlessly.

I don't see how it can be done without an explicit request for such a
wrapper in the calling code.  If the underlying iterator is ephemeral
(is not reiterable) the snapshotter has to save a copy of every item,
and that would defeat the purpose of iterators if it was done
automatically.  Or am I misunderstanding?

> E.g.:
>
> class snapshottable_sequence_iter(object):
>     def __init__(self, sequence, i=0):
>         self.sequence = sequence
>         self.i = i
>     def __iter__(self): return self
>     def next(self):
>         try: result = self.sequence[self.i]
>         except IndexError: raise StopIteration
>         self.i += 1
>         return result
>     def snapshot(self):
>         return self.__class__(self.sequence, self.i)
>
> Here, snapshotting is quite cheap, requiring just a new counter and
> another reference to the same underlying sequence.  So would be
> restarting and stepping back, directly implemented.  But if we need
> to wrap a totally generic iterator to provide "snapshottability", we
> inevitably end up keeping a list (or the like) of items so far seen
> from one but not both 'independent' iterators obtained by a snapshot
> -- all potentially redundant storage, not to mention the possible
> coding trickiness in maintaining that FIFO queue.

I'm not sure what you are suggesting here.  Are you proposing that
*some* iterators (those which can be snapshotted cheaply) sprout a new
snapshot() method?

> As I said I do have use cases for all of these.
Simplest is the
> ability to push back the last item obtained by next, since a frequent
> pattern is:
>
> for item in iterator:
>     if isok(item): process(item)
>     else:
>         # need to push item back onto iterator, then
>         break
> else:
>     # all items were OK, iterator exhausted, blah blah
>
> ...and later...
>
> for item in iterator:    # process some more items
>
> Of course, as long as just a few levels of pushback are enough, THIS
> one is an easy and light-weight wrapper to write:
>
> class pushback_wrapper(object):
>     def __init__(self, it):
>         self.it = it
>         self.pushed_back = []
>     def __iter__(self): return self
>     def next(self):
>         try: return self.pushed_back.pop()
>         except IndexError: return self.it.next()
>     def pushback(self, item):
>         self.pushed_back.append(item)

This definitely sounds like you'd want to create an explicit wrapper
for this; there is too much machinery here to make this a standard
feature.

Perhaps a snapshottable iterator could also have a backup() method
(which would decrement self.i in your first example) or a prev()
method (which would return self.sequence[self.i] and decrement
self.i).

> A "snapshot" would be useful whenever more than one pass on a
> sequence _or part of it_ is needed (more useful than a "restart"
> because of the "part of it" provision).  And a decent wrapper for it
> is a bear...

Such wrappers for specific container types (or maybe just one for
sequences) could be in a standard library module.  Is more needed?
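Alex's pushback_wrapper needs only the Python 3 spelling of the iterator protocol (next becomes __next__) to run today. A minimal runnable sketch, with the class name kept and the demo data invented:

```python
class pushback_wrapper:
    """Wrap any iterator so consumed items can be pushed back onto it."""
    def __init__(self, it):
        self.it = iter(it)
        self.pushed_back = []          # LIFO stack of pushed-back items

    def __iter__(self):
        return self

    def __next__(self):
        # Serve pushed-back items first, newest first.
        if self.pushed_back:
            return self.pushed_back.pop()
        return next(self.it)

    def pushback(self, item):
        self.pushed_back.append(item)


it = pushback_wrapper([3, 5, 8, 2, 7])
kept = []
for item in it:
    if item < 6:
        kept.append(item)
    else:
        it.pushback(item)   # not ours to consume; put it back
        break

print(kept)        # [3, 5]
print(list(it))    # [8, 2, 7] -- the pushed-back 8 comes out first
```

This is exactly the "frequent pattern" from the quoted text: stop at the first unacceptable item without losing it.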
--Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at iinet.net.au Sat Oct 18 13:18:07 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Oct 18 13:18:05 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310181743.38959.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com> <16273.22832.456737.861600@montanaro.dyndns.org> <200310181743.38959.aleaxit@yahoo.com> Message-ID: <3F9175CF.3040408@iinet.net.au> Alex Martelli strung bits together to say: >> Guido> keys = D.keys() >> Guido> keys.sort() >> Guido> for key in keys: >> Guido> ... >> >>Current standard practice is also fine. > > Nolo contendere. It DOES feel a bit like boilerplate, that's all. Hi, While I'm not an active Python contributor (yet), I've been lurking on python-dev since March. Something was bugging me about the whole l.copysort() ('sortedcopy'?) idea. For whatever reason, the above comment crystalised it - if there's going to be a special 'sortedcopy' to allow minimalist chaining, then what about 'reversedcopy' or 'sortedreversedcopy', or any of the other list methods that may be considered worth chaining? 
Particularly since the following trick seems to work:
==============
>>> def chain(method, *args, **kwds):
	method(*args, **kwds)
	return method.__self__

>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print chain(mylist.sort)
[1, 1, 2, 2, 3, 3]
>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print chain(chain(mylist.sort).reverse)
[3, 3, 2, 2, 1, 1]
>>> print mylist
[3, 3, 2, 2, 1, 1]
>>> mylist = [1, 2, 3, 3, 2, 1]
>>> print mylist
[1, 2, 3, 3, 2, 1]
>>> print chain(chain(list(mylist).sort).reverse)
[3, 3, 2, 2, 1, 1]
>>> print mylist
[1, 2, 3, 3, 2, 1]
>>>
==============
(Tested with Python 2.3rc2, which is what is currently installed on my
home machine)

Not exactly the easiest to read, but it does do the job of "sorted copy
as an expression", as well as letting you chain arbitrary methods of
any object.

Regards,
Nick.

--
Nick Coghlan           |   Brisbane, Australia
ICQ#: 68854767         |   ncoghlan@email.com
Mobile: 0409 573 268   |   http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From python at rcn.com  Sat Oct 18 13:53:17 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sat Oct 18 13:54:01 2003
Subject: [Python-Dev] in-line sort
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <002e01c395a0$b65d9c40$e841fea9@oemcomputer>

> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

As Alex demonstrated, the time savings for an O(n) operation inside
an O(n log n) function is irrelevant anyway.

> I think the application area is too narrow to warrant a built-in,
> *and* lists shouldn't grow two similar methods.  Let's keep the
> language small!

Not to be hard headed here, but if dropped now, it will never be
considered again.  Did you have a chance to look at the rationale for
change in my previous note and in the comments added to the patch?
I think they offer some examples and reasons stronger than "saving a
little typing":  www.python.org/sf/825814

Raymond

From jacobs at penguin.theopalgroup.com  Sat Oct 18 14:00:28 2003
From: jacobs at penguin.theopalgroup.com (Kevin Jacobs)
Date: Sat Oct 18 14:01:58 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: 

On Sat, 18 Oct 2003, Guido van Rossum wrote:
> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

There is also a problem with the strategy if it gets called by a C
extension.  It is perfectly feasible for a C extension to hold the
only reference to an object, call the copying sort (directly or
indirectly), and then be very surprised that the copy did not take
place.

-Kevin

--
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (440) 871-6725 x 19       E-mail: jacobs@theopalgroup.com
Fax:   (440) 871-6722            WWW:    http://www.theopalgroup.com/

From martin at v.loewis.de  Sat Oct 18 14:13:28 2003
From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: Sat Oct 18 14:13:54 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: <20031018144703.GA10212@ibook>
References: <20031018144703.GA10212@ibook>
Message-ID: 

Gustavo Niemeyer writes:

> The SRE recursion removal patch is finally in. Please, let me know
> if you find any problems.

What is the purpose of the USE_RECURSION #define? It looks to me like
you have added a lot of dead code; I recommend to remove all this code.

Regards,
Martin

From niemeyer at conectiva.com  Sat Oct 18 14:22:16 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sat Oct 18 14:23:27 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: 
References: <20031018144703.GA10212@ibook>
Message-ID: <20031018182215.GA10756@ibook>

> > The SRE recursion removal patch is finally in. Please, let me know
> > if you find any problems.
>
> What is the purpose of the USE_RECURSION #define? It looks to me like
> you have added a lot of dead code; I recommend to remove all this code.

If you enable USE_RECURSION it will become recursive again, so it's
nice to see if some problem is related to the non-recursive algorithm
or not, and makes it easy to understand the change made.

The "dead" code you're talking about is probably the unused macros,
right?  I've used them in some ideas, and gave up later.  OTOH, they may
be used in further extensions.  If you don't mind, I'd rather leave them
there, than thinking about it again if I need it.  But if they're really
a problem, well, I'll remove.  Just let me know.

--
Gustavo Niemeyer
http://niemeyer.net

From pje at telecommunity.com  Sat Oct 18 14:32:56 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sat Oct 18 14:33:12 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com>
 <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com>
 <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com>
 <200310181120.45477.aleaxit@yahoo.com>
Message-ID: <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>

At 10:17 AM 10/18/03 -0700, Guido van Rossum wrote:
>[Phillip]
> > I think I phrased my question poorly.  What I should have said was:
> >
> > "Should iterator expressions preserve the reiterability of the base
> > expression?"
>
>(An iterator expression being something like
>
> (f(x) for x in S)
>
>right?)

Yes.

> > In essence, this would be done by having an itercomp expression
> > resolve to an object whose __iter__ method calls the underlying
> > generator, returning a generator-iterator.  Thus, any iteration over
> > the itercomp is equivalent to calling a no-arguments generator.  The
> > result is reiterable if the base iterable is reiterable, otherwise
> > not.
>
>OK, I think I understand what you're after.
The code for an iterator
>expression has to create a generator function behind the scenes, and
>call it.  For example:
>
> A = (f(x) for x in S)
>
>could be translated into:
>
> def gen(seq):
>     for x in seq:
>         yield f(x)
> A = gen(S)
>
>(Note that S could be an arbitrary expression and should be evaluated
>only once.  This translation does that correctly.)

Interesting.  That wasn't the semantics I envisioned.  I was thinking
(implicitly, anyway) that an iterator comprehension was a closure.
That is, that S would be evaluated each time.  However, if S is a
sequence, you don't need to reevaluate it, and if S is another
iterator expression that preserves reiterability, you still don't need
to.  So, in that sense there's never a need to reevaluate S.

>This allows one to iterate once over A (a generator function doesn't
>allow reiteration).  What you are asking looks like it could be done
>like this (never mind the local names):

Yes, that's actually what I said, but I guess I was once again unclear.

> def gen(seq):
>     for x in seq:
>         yield f(x)
> class Helper:
>     def __init__(self, seq):
>         self.seq = seq
>     def __iter__(self):
>         return gen(self.seq)
> A = Helper(S)
>
>Then every time you use iter(A) gen() will be called with the saved
>value of S as argument.

Yes, except of course Helper would be a builtin type.

>I don't mind that so much, but I don't think all the extra machinery
>is worth it; the compiler generally can't tell if it is needed so it
>has to produce the reiterable code every time.

It has to produce the generator every time, anyway, presumably as a
nested function with access to the current locals.  The only question
is whether it can be invoked more than once, and whether you create
the helper object.  But maybe that's what you mean, and now you're
being unclear instead of me. ;)

> If you *want* to
I just tend to wish that I didn't have to think about whether iterators are reiterable or not, as it forces me to expose to callers of a function whether the value they pass must be an iterator or an iterable. But I don't want to reopen the entire reiterability discussion, as I don't have any better solutions and the previously proposed solutions make my head hurt just trying to make sure I understand the implications. From martin at v.loewis.de Sat Oct 18 15:01:52 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 15:02:30 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031018182215.GA10756@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> Message-ID: Gustavo Niemeyer writes: > If you enable USE_RECURSION it will become recursive again, so it's > nice to see if some problem is related to the non-recursive algorithm > or not, and makes it easy to understand to change made. Hmm. Either you trust that your code is basically correct or you don't. If you trust that it is basically correct, you should remove the old code, and trust that any problems in SRE (be they related to your code or independent) can be fixed, in which case maintaining the old code would be pointless. Or, if you don't trust that your code is basically correct, you should not have applied the patch. > The "dead" code you're talking about is probably the unused macros, > right? No, I'm talking about the now-disabled recursive code. I also wonder whether the code performing recursion checks has any function still. So I wonder whether USE_STACKCHECK, USE_RECURSION_LIMIT are "essentially" dead. > But if they're really a problem, well, I'll remove. Just let me > know. IMO, any unused code in SRE is a problem, because it makes already difficult-to-follow code more difficult to follow. 
It is ok to maintain dead code if the code might be used in the
future, but only if there are specific plans to actually use it in a
foreseeable future.  It is not ok otherwise.

Regards,
Martin

From da-x at gmx.net  Sat Oct 18 15:13:19 2003
From: da-x at gmx.net (Dan Aloni)
Date: Sat Oct 18 15:13:31 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <3F9175CF.3040408@iinet.net.au>
References: <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310180320.h9I3KVm07959@12-236-54-216.client.attbi.com>
 <16273.22832.456737.861600@montanaro.dyndns.org>
 <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au>
Message-ID: <20031018191319.GA23071@callisto.yi.org>

On Sun, Oct 19, 2003 at 03:18:07AM +1000, Nick Coghlan wrote:
> Alex Martelli strung bits together to say:
> >> Guido> keys = D.keys()
> >> Guido> keys.sort()
> >> Guido> for key in keys:
> >> Guido>     ...
> >>
> >>Current standard practice is also fine.
> >
> >Nolo contendere.  It DOES feel a bit like boilerplate, that's all.
> [...]
>
> Particularly since the following trick seems to work:
> ==============
> >>> def chain(method, *args, **kwds):
> 	method(*args, **kwds)
> 	return method.__self__
>
> >>> mylist = [1, 2, 3, 3, 2, 1]
> >>> print chain(mylist.sort)
> [1, 1, 2, 2, 3, 3]
> >>> mylist = [1, 2, 3, 3, 2, 1]
> >>> print chain(chain(mylist.sort).reverse)
> [...]
>
> (Tested with Python 2.3rc2, which is what is currently installed on my
> home machine)
>
> Not exactly the easiest to read, but it does do the job of "sorted copy
> as an expression", as well as letting you chain arbitrary methods of
> any object.

(I'm new on this list)

Actually, there is a way to do this out-of-the-box without the chain()
function:

>>> a = [1,2,3,3,2,1]
>>> (a, (a, a.sort())[0].reverse())[0]
[3, 3, 2, 2, 1, 1]

And there is also one for copysort():

>>> a
[1, 2, 3, 3, 2, 1]
>>> (lambda x:(x, x.sort())[0])(list(a))
[1, 1, 2, 2, 3, 3]
>>> a
[1, 2, 3, 3, 2, 1]

But that's probably not more readable.
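With hindsight, the "sorted copy as an expression" problem these tuple tricks work around was solved by the sorted() built-in, added in Python 2.4 shortly after this thread. A quick comparison, using the same sample list:

```python
a = [1, 2, 3, 3, 2, 1]

# Sorted copy as an expression, leaving `a` untouched:
print(sorted(a))                # [1, 1, 2, 2, 3, 3]

# Sort-then-reverse as a single expression:
print(sorted(a, reverse=True))  # [3, 3, 2, 2, 1, 1]

# The original list is unchanged:
print(a)                        # [1, 2, 3, 3, 2, 1]
```

Unlike list.sort(), which sorts in place and returns None, sorted() accepts any iterable and returns a new list, which is what makes it chainable inside larger expressions.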
Architect's Sketch...

--
Dan Aloni
da-x@gmx.net

From michel at dialnetwork.com  Sat Oct 18 15:43:52 2003
From: michel at dialnetwork.com (Michel Pelletier)
Date: Sat Oct 18 15:17:50 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310181331.17795.aleaxit@yahoo.com>
 <200310181426.39116.aleaxit@yahoo.com>
 <200310181714.19688.aleaxit@yahoo.com>
 <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <3630.67.160.160.177.1066506232.squirrel@squirrel.dialnetwork.com>

> I don't like the trick of avoiding the copy if
> the refcount is one;
> AFAIK it can't be done in Jython.

It may be possible with the java.lang.ref package using a somewhat
similar trick by (I imagine) holding a soft reference and examining
the object's reachability to the collector.

-Michel

From pedronis at bluewin.ch  Sat Oct 18 15:29:56 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Sat Oct 18 15:27:45 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <3630.67.160.160.177.1066506232.squirrel@squirrel.dialnetwork.com>
References: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
 <000401c39516$c440b520$e841fea9@oemcomputer>
 <200310181331.17795.aleaxit@yahoo.com>
 <200310181426.39116.aleaxit@yahoo.com>
 <200310181714.19688.aleaxit@yahoo.com>
 <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <5.2.1.1.0.20031018212309.027fb348@pop.bluewin.ch>

At 14:43 18.10.2003 -0500, Michel Pelletier wrote:
> > I don't like the trick of avoiding the copy if
> > the refcount is one;
> > AFAIK it can't be done in Jython.
>
>It may be possible with the java.lang.ref
>package using a somewhat similar trick by (I
>imagine) holding a soft reference and examining
>the object's reachability to the collector.
>-Michel no, if you put the last reference to an object in a weak-ref and trigger a GC (which is btw expensive), well you can discover that there was just one reference but you have also lost the object. Now if you have a tiny wrapper/contents organization you can overcome this, playing the trick with the wrapper and keeping the contents, OTOH as I said, the explicit GC is expensive, likely more than allocating and copying. regards. From guido at python.org Sat Oct 18 15:35:19 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 15:35:31 2003 Subject: [Python-Dev] in-line sort In-Reply-To: Your message of "Sat, 18 Oct 2003 13:53:17 EDT." <002e01c395a0$b65d9c40$e841fea9@oemcomputer> References: <002e01c395a0$b65d9c40$e841fea9@oemcomputer> Message-ID: <200310181935.h9IJZJ609921@12-236-54-216.client.attbi.com> > > I think the application area is too narrow to warrant a built-in, > > *and* lists shouldn't grow two similar methods. Let's keep the > > language small! > > Not to be hard headed here, but if dropped now, it will never > be considered again. Did you have a chance to look at the > rationale for change in my previous note and added in the > comments for the patch? I think they offer some examples > and reasons stronger than "saving a little typing": > www.python.org/sf/825814 I'm taking that into account. It still smells funny. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 18 15:37:12 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 15:37:44 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: Your message of "Sat, 18 Oct 2003 15:22:16 -0300." <20031018182215.GA10756@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> Message-ID: <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> > > What is the purpose of the USE_RECURSION #define? It looks to me like > > you have added a lot of dead code; I recommend to remove all this code. 
> > If you enable USE_RECURSION it will become recursive again, so it's > nice to see if some problem is related to the non-recursive algorithm > or not, and makes it easy to understand to change made. That's okay. > The "dead" code you're talking about is probably the unused macros, > right? I've used them in some ideas, and gave up later. OTOH, they may > be used in further extensions. If you don't mind, I'd rather leave them > there, than thinking about it again if I need it. But if they're really > a problem, well, I'll remove. Just let me know. That is *not* okay. Dead code is a distraction for future maintainers. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 15:42:53 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 15:43:08 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > > If you enable USE_RECURSION it will become recursive again, so it's > > nice to see if some problem is related to the non-recursive algorithm > > or not, and makes it easy to understand to change made. > > That's okay. There is no interface to enable USE_RECURSION except for editing _sre.c, and I cannot see why anybody would do that (except to see whether a bug goes away if it is enabled). So isn't then the old code essentially dead as well? 
Regards, Martin From aleaxit at yahoo.com Sat Oct 18 15:46:20 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 15:46:26 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> Message-ID: <200310182146.20751.aleaxit@yahoo.com> On Saturday 18 October 2003 07:17 pm, Guido van Rossum wrote: ... > > offered by any given iterator type. E.g., the presence of a special > > method __reiter__ could indicate that this iterator IS able to > > supply another iterator which retraces the same steps from the ... > In cases where reiterabiliy can be implemented without much effort, > there is already an underlying object representing the sequence > (e.g. a collection object, or an object defining a numerical series). ...or a generator that needs to be called again, with the same parameters. > Reiteration comes for free if you hold on to that underlying object > rather than passing an iterator to them around. Yes, but you need to pass around a somewhat complicated thing -- the iterator (to have the "current state in the iteration"), the callable that needs to be called to generate the iterator again (iter, or the generator, or the class whose instances are numerical series, ...) and the arguments for that callable (the sequence, the generator's arguments, the parameters with which to instantiate the class, ...). 
Nothing terrible, admittedly, and that's presumably how I'd architect
things IF I ever met a use case for a "reiterable iterator":

class ReiterableIterator(object):
    def __init__(self, thecallable, *itsargs, **itskwds):
        self.c, self.a, self.k = thecallable, itsargs, itskwds
        self.it = thecallable(*itsargs, **itskwds)
    def __iter__(self): return self
    def next(self): return self.it.next()
    def reiter(self): return self.__class__(self.c, *self.a, **self.k)

typical toy example use:

def printwice(n, reiter):
    for i, x in enumerate(reiter):
        if i>=n: break
        print x
    for i, x in enumerate(reiter.reiter()):
        if i>=n: break
        print x

def evens():
    x = 0
    while 1:
        yield x
        x += 2

printwice(5, ReiterableIterator(evens))

> > "Should iterator expressions preserve the reiterability of the base
> > expression?"
>
> (An iterator expression being something like
>
> (f(x) for x in S)
>
> right?)
...
> OK, I think I understand what you're after.  The code for an iterator
> expression has to create a generator function behind the scenes, and
> call it.  For example:

Then if I am to be able to plug it into ReiterableIterator or some such
mechanism, I need to be able to get at said generator function in order
to stash it away (and call it again), right?  Hmmm, maybe an iterator
built by a generator could keep a reference to the generator it's a
child of... but that still wouldn't give the args to call it with,
darn... and I doubt it makes sense to burden every generator-made
iterator with all those thingies, for the one-in-N need to possibly
reiterate on it...

> def gen(seq):
>     for x in seq:
>         yield f(x)
> class Helper:
>     def __init__(self, seq):
>         self.seq = seq
>     def __iter__(self):
>         return gen(self.seq)
> A = Helper(S)
>
> Then every time you use iter(A) gen() will be called with the saved
> value of S as argument.

Yes, that would let ReiterableIterator(iter, A) work, of course.
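Alex's ReiterableIterator needs only the Python 3 spelling of the iterator protocol (next becomes __next__) to run today. A sketch with a small demo; the islice counts are arbitrary:

```python
import itertools

class ReiterableIterator:
    """Remember the callable and arguments that produced an iterator,
    so an equivalent fresh iterator can be made on demand."""
    def __init__(self, thecallable, *args, **kwds):
        self.c, self.a, self.k = thecallable, args, kwds
        self.it = thecallable(*args, **kwds)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.it)

    def reiter(self):
        # Rebuild a fresh iterator from the remembered call.
        return self.__class__(self.c, *self.a, **self.k)


def evens():
    x = 0
    while True:
        yield x
        x += 2

it = ReiterableIterator(evens)
print(list(itertools.islice(it, 3)))           # [0, 2, 4]
print(list(itertools.islice(it.reiter(), 3)))  # [0, 2, 4] -- restarted copy
print(list(itertools.islice(it, 3)))           # [6, 8, 10] -- original resumed
```

Note that reiter() does not rewind the original iterator; it returns an independent restarted one, which is exactly the separation the thread is debating.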
> > I suppose technically, this means the itercomp doesn't return an > > iterator, but an iterable, which I suppose could be confusing if you > > try to call its 'next()' method. But then, it could have a next() > > method that raises an error saying "call 'iter()' on me first". > > I don't mind that so much, but I don't think all the extra machinery > is worth it; the compiler generally can't tell if it is needed so it > has to produce the reiterable code every time. If you *want* to > have an iterable instead of an iterator, it's usually easy enough do > (especially given knowledge about the type of S). Yeah, that seems sensible to me. > [Alex again] > > > There ARE other features I'd REALLY have liked to get from iterators > > in some applications. > > > > A "snapshot" -- providing me two iterators, the original one and > > another, which will step independently over the same sequence of > > items -- would have been really handy at times. And a "step back" ... > > disturbed); but not knowing the abilities of the underlying iterator > > would mean these wrappers would often duplicate functionality > > needlessly. > > I don't see how it can be done without an explicit request for such a > wrapper in the calling code. If the underlying iterator is ephemeral > (is not reiterable) the snapshotter has to save a copy of every item, > and that would defeat the purpose of iterators if it was done > automatically. Or am I misunderstanding? No, you're not. But, if the need to snapshot (or reiterate, very different thing) was deemed important (and I have my doubts if either of them IS important enough -- I suspect snapshot perhaps, reiterable not, but I don't _know_), we COULD have those iterators which "know how to snapshot themselves" expose a .snapshot or __snapshot__ method. Then a function make_a_snapshottable(it) [the names are sucky, sorry, bear with me] would return it if that method was available, otherwise the big bad wrapper around it. 
Basically, by exposing suitable methods an iterator could "make its
abilities known" to functions that may or may not need to wrap it in
order to achieve certain semantics -- so the functions can build only
those wrappers which are truly indispensable for the purpose.  Roughly
the usual "protocol" approach -- functions use an object's ability IF
that object exposes methods providing that ability, and otherwise fake
it on their own.

> I'm not sure what you are suggesting here.  Are you proposing that
> *some* iterators (those which can be snapshotted cheaply) sprout a
> new snapshot() method?

If snapshottability (eek!) is important enough, yes, though
__snapshot__ might perhaps be more traditional (but for iterators we
do have the precedent of method next without __underscores__).

> > As I said I do have use cases for all of these.  Simplest is the
> > ability to push back the last item obtained by next, since a
> > frequent

Yeah, that's really easy to provide by a lightweight wrapper, which
was my not-so-well-clarified intended point.

> This definitely sounds like you'd want to create an explicit wrapper

Absolutely.

> Perhaps a snapshottable iterator could also have a backup() method
> (which would decrement self.i in your first example) or a prev()
> method (which would return self.sequence[self.i] and decrement
> self.i).

It seems to me that the ability to back up and that of snapshotting
are somewhat independent.

> > A "snapshot" would be useful whenever more than one pass on a
> > sequence _or part of it_ is needed (more useful than a "restart"
> > because of the "part of it" provision).  And a decent wrapper for
> > it is a bear...
>
> Such wrappers for specific container types (or maybe just one for
> sequences) could be in a standard library module.  Is more needed?
I think that if it's worth providing a wrapper it's also worth having
those iterators that don't need the wrapper (because they already
intrinsically have the needed ability) sprout the relevant method or
special method; "factory functions" provided with the wrappers could
then just return the already-satisfactory iterator, or a wrapper built
around it, depending.

Problem is, I'm NOT sure if "it's worth providing a wrapper" in each
of these cases.  snapshottingability (:-) is the one case where, if I
had to decide myself right now, I'd say "go for it"... but that may be
just because it's the one case for which I happened to stumble on some
use cases in production (apart from "undoing", which isn't too bad to
handle in other ways anyway).

Alex

From guido at python.org Sat Oct 18 15:55:48 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 15:56:01 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Sat, 18 Oct 2003 14:32:56 EDT." <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031017231830.01ed3ec0@mail.telecommunity.com> <200310180357.h9I3vLE08067@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <5.1.0.14.0.20031018142209.0388a5a0@mail.telecommunity.com>
Message-ID: <200310181955.h9IJtmP10005@12-236-54-216.client.attbi.com>

> >OK, I think I understand what you're after.  The code for an iterator
> >expression has to create a generator function behind the scenes, and
> >call it.  For example:
> >
> >    A = (f(x) for x in S)
> >
> >could be translated into:
> >
> >    def gen(seq):
> >        for x in seq:
> >            yield f(x)
> >    A = gen(S)
> >
> >(Note that S could be an arbitrary expression and should be evaluated
> >only once.  This translation does that correctly.)
>
> Interesting.  That wasn't the semantics I envisioned.  I was thinking
> (implicitly, anyway) that an iterator comprehension was a closure.
> That is, that S would be evaluated each time.

We must be miscommunicating.  In

    A = [f(x) for x in S]

I certainly don't expect S to be evaluated more than once!  Did you
mean "each time through the loop" or "each time we reach this
statement" or "each time someone loops over A" ???

Also note that I was giving the NON-reiterable semantics.  I don't
think there's any other way to do it (of course 'gen' should be an
anonymous function).

> However, if S is a sequence, you don't need to reevaluate it, and if
> S is another iterator expression that preserves reiterability, you
> still don't need to.  So, in that sense there's never a need to

> >This allows one to iterate once over A (a generator function doesn't
> >allow reiteration).  What you are asking looks like it could be done
> >like this (never mind the local names):

Yes, that's actually what I said, but I guess I was once again unclear.

> >    def gen(seq):
> >        for x in seq:
> >            yield f(x)
> >    class Helper:
> >        def __init__(seq):
> >            self.seq = seq
> >        def __iter__(self):
> >            return gen(self.seq)
> >    A = Helper(S)
> >
> >Then every time you use iter(A) gen() will be called with the saved
> >value of S as argument.
>
> Yes, except of course Helper would be a builtin type.

Sure, and its constructor would take 'gen' as an argument:

class Helper:
    def __init__(self, seq, gen):
        self.seq = seq
        self.gen = gen
    def __iter__(self):
        return self.gen(self.seq)

def gen(seq):
    for x in seq:
        yield f(x)

A = Helper(S, gen)
> ;)

I meant creation of the Helper instance.  Given that in most practical
situations if you *need* reiterability you can provide it using
something much simpler, I don't like using a Helper instance.  But in
fact I don't even like having the implicit generator function.  I
guess that's one reason I'm falling down on the -0 side of this
anyway...

> > If you *want* to have an iterable instead of an iterator, it's
> > usually easy enough to do (especially given knowledge about the
> > type of S).
>
> I just tend to wish that I didn't have to think about whether
> iterators are reiterable or not, as it forces me to expose to callers
> of a function whether the value they pass must be an iterator or an
> iterable.

To me that's a perfectly reasonable requirement, as long as functions
taking an iterator also take an iterable (i.e. they call iter() on
their argument), so a caller who has only iterables doesn't have to
care about the difference.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Sat Oct 18 15:58:57 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 15:59:07 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: Your message of "18 Oct 2003 21:42:53 +0200."
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <200310181937.h9IJbCi09945@12-236-54-216.client.attbi.com>
Message-ID: <200310181958.h9IJwvo10028@12-236-54-216.client.attbi.com>

> > > If you enable USE_RECURSION it will become recursive again, so
> > > it's nice to see if some problem is related to the non-recursive
> > > algorithm or not, and makes it easy to understand the change made.
> >
> > That's okay.
>
> There is no interface to enable USE_RECURSION except for editing
> _sre.c, and I cannot see why anybody would do that (except to see
> whether a bug goes away if it is enabled).  So isn't then the old
> code essentially dead as well?
Given that we're talking about a very complicated change to extremely
delicate code, and we're pre-alpha, and we've explicitly discussed
giving the code the benefit of the doubt because nobody has the guts
to review it, I find it perfectly reasonable to leave the old code in
with a quick way to re-enable it in case someone produces a test case
that they claim breaks with the new code.  The old code can be phased
out once we're certain the new code is rock solid.  I don't mind
having the #ifdef in for one release cycle.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Sat Oct 18 16:01:52 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 16:01:57 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <20031018191319.GA23071@callisto.yi.org>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F9175CF.3040408@iinet.net.au> <20031018191319.GA23071@callisto.yi.org>
Message-ID: <200310182201.52605.aleaxit@yahoo.com>

On Saturday 18 October 2003 09:13 pm, Dan Aloni wrote:
...
> > >> Guido> keys = D.keys()
> > >> Guido> keys.sort()
> > >> Guido> for key in keys: ...
>
> Actually, there is a way to do this out-of-the-box without the
> chain() function:
>
> >>> a = [1,2,3,3,2,1]
> >>> (a, (a, a.sort())[0].reverse())[0]

This cannot be applied to D.keys() for some dictionary D.

> >>> (lambda x:(x, x.sort())[0])(list(a))

This one can, because the lambda lets you give a temporary name x to
the otherwise-unnamed list returned by D.keys().  It can be made a
_little_ better, too, I think:

>>> D=dict.fromkeys('ciao')
>>> D.keys()
['i', 'a', 'c', 'o']
>>> (lambda x: x.sort() or x)(D.keys())
['a', 'c', 'i', 'o']

and if you want it reversed after sorting,

>>> (lambda x: x.sort() or x.reverse() or x)(D.keys())
['o', 'i', 'c', 'a']

> But that's probably not more readable.

You have a gift for understatement.
Still, probably more readable than the classic list comprehension hack:

>>> [x for x in [D.keys()] for y in [x.sort(), x] if y][0]
['a', 'c', 'i', 'o']

also, the lambda hack doesn't leak names into the surrounding scope,
while the list comprehension hack, alas, does.

BTW, welcome to python-dev!

Alex

From whisper at oz.net Sat Oct 18 16:03:14 2003
From: whisper at oz.net (David LeBlanc)
Date: Sat Oct 18 16:03:24 2003
Subject: [Python-Dev] buildin vs. shared modules
In-Reply-To:
Message-ID:

> IOW, I have *never* seen anybody who wanted to rebuild a stock Python
> module without having to download the entire Python source code, and
> rebuild everything.  After years of listening to python-help, I found
> that the most common application of rebuilding parts or all of Python
> is building debug binaries on Windows, to debug your own extension
> modules.  This requires rebuilding all of Python, and people accept
> that (even though they don't like it).

Are we talking about the same thing?  I'm by no means suggesting that
the python.x.x.tar.gz source be broken up!  I'm not aware of any
ability to download component sources any other way, nor would I want
that!

I have had the experience of building and rebuilding specific
extensions of Python to find a bug.  I found it and reported it.  I'm
proud to say that I found a very obscure bug in Python that affected
how Zope was written at the time.  My 2 seconds of fame ;)

> > Perception does count for a lot, especially when reviewers are
> > making gross comparisons of executable (including dlls) sizes.
>
> I, personally, would not make technical decisions on grounds of
> perception which I know would be unfounded.  I can see how other
> people would let their decisions guide by incorrect perception, and
> I find that unfortunate (but can accept it).
>
> I would, personally, strive to correct incorrect perception by means
> of education.  I know this is a tedious process.
Yes, me too, but there are no end to the number of people who will look at the surface and never dig deeper. If perception turns people away from Python, that is not a good thing. > > Yes. What big benefit does this offer compared to the status quo? Aren't > > there more important things to devote resources to? > > The big benefit is that it simplifies packaging and deployment, and I > believe this is the reason why Thomas Heller, who has just taken over > Windows packaging, wants to see it implemented. It simplifies his life, > and wasting volunteer time should not be taken lightly. As from above, I'm not, nor do I think it was the original poster's intent, suggesting that the Python distro be broken up. At this point, I download 2 files: the python binary and the python source. I thought this was about merging all the .pyd files into a single python dll? As far as I can see, that's not a distribution issue per se. > It also simplifies my life, as I plan to maintain a Win64 port. I have > to perform manual adjustments in each project file - the fewer project > files, the better. > > There are certainly more important things to devote resources to, like > fixing bugs. Unfortunately, there are no volunteers for these > important things, and the volunteers tend to look into things that are > not important but fun. > > > If there is enough feeling about it, would it be possible to create an > > alternate VS project that could do the all in one dll instead of pulling > > everyone along one path because a few like an idea? > > That would cause DLL hell - there must not be competing versions of > python23.dll. If I build it one way on my machine, how would that cause dll hell? > There is also the issue of converting the VS6 projects to VS7.1; when > that happens, a re-organization might be in place. > > > Why do you find it necessary to characterize some honest > questions as FUD > > instead of speaking to the merits or demerits of the discussion? 
> Because it is: this was not meant as a critique, or bad-mouthing, or
> some such; if you have taken it in this sense, I apologize.  The only
> rationale for leaving things as-is was that users might fear things
> that you also knew were unfounded fears because of uncertainty - FUD.
>
> Regards,
> Martin

David LeBlanc
Seattle, WA USA

From aleaxit at yahoo.com Sat Oct 18 16:11:24 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 16:11:34 2003
Subject: [Python-Dev] The Trick
In-Reply-To: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com>
Message-ID: <200310182211.24635.aleaxit@yahoo.com>

On Saturday 18 October 2003 06:33 pm, Guido van Rossum wrote:
> I don't like the trick of avoiding the copy if the refcount is one;
> AFAIK it can't be done in Jython.

No, but if it's only a small optimization, who cares?  Anyway, the
objection that these functions might be called by _C_ code who's
holding the only reference to a PyObject* probably kills The Trick
(particularly my hope of moving it into PySequence_List whether
copysort survived or not).

> I think the application area is too narrow to warrant a built-in,
> *and* lists shouldn't grow two similar methods.  Let's keep the
> language small!

Aye aye, captain.

Can we dream of a standard library module of "neat hacks that don't
really warrant a built-in" in which to stash some of these
general-purpose, no-specific-appropriate-module, useful functions and
classes?  Pluses: would save some people reimplementing them over and
over and sometimes incorrectly; would remove any pressure to add
not-perfectly-appropriate builtins.  Minuses: one more library module
(the, what, 211th?  doesn't seem like a biggie).  Language unchanged
-- just library.  Pretty please?

> (I know, by that argument several built-ins shouldn't exist.
Well, > they might be withdrawn in 3.0; let's not add more.) "Amen and Hallelujah" to the hope of slimming language and built-ins in 3.0 (presumably the removed built-ins will go into a "legacy curiosa" module, allowing a "from legacy import *" to ease making old code run in 3.0? seems cheap & sensible). Alex From aleaxit at yahoo.com Sat Oct 18 16:15:28 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 16:15:46 2003 Subject: [Python-Dev] The Trick In-Reply-To: References: Message-ID: <200310182215.28092.aleaxit@yahoo.com> On Saturday 18 October 2003 08:00 pm, Kevin Jacobs wrote: > On Sat, 18 Oct 2003, Guido van Rossum wrote: > > I don't like the trick of avoiding the copy if the refcount is one; > > AFAIK it can't be done in Jython. > > There is also a problem with the strategy if if gets called by a C > extension. It is perfectly feasible for a C extension to hold the only > reference to an object, call the copying sort (directly or indirectly), and > then be very surprised that the copy did not take place. Alas, I fear you're right. Darn -- so much for a possible little but cheap optimization (which might have been neat in PySequence_List even if copysort never happens and the optimization is only for CPython -- I don't see why an optimization being impossible in Jython should stop CPython from making it, as long as semantics remain compatible). It's certainly possible for C code to call PySequence_List or whatever while holding the only reference, and count on the returned and argument objects being distinct:-(. 
Alex From aleaxit at yahoo.com Sat Oct 18 16:24:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 18 16:24:38 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <3F9175CF.3040408@iinet.net.au> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au> Message-ID: <200310182224.33499.aleaxit@yahoo.com> On Saturday 18 October 2003 07:18 pm, Nick Coghlan wrote: > Alex Martelli strung bits together to say: > >> Guido> keys = D.keys() > >> Guido> keys.sort() > >> Guido> for key in keys: > >> Guido> ... > >> > >>Current standard practice is also fine. > > > > Nolo contendere. It DOES feel a bit like boilerplate, that's all. > > Hi, > > While I'm not an active Python contributor (yet), I've been lurking on > python-dev since March. Hi Nick! > Something was bugging me about the whole l.copysort() ('sortedcopy'?) idea. > For whatever reason, the above comment crystalised it - if there's going to > be a special 'sortedcopy' to allow minimalist chaining, then what about > 'reversedcopy' or 'sortedreversedcopy', or any of the other list methods > that may be considered worth chaining? sort has just (in CVS right now) sprouted an optional reverse=True parameter that makes sort-reverse chaining a non-issue (thanks to the usual indefatigable Raymond, too). But, for the general case: the BDFL has recently Pronounced that he does not LIKE chaining and doesn't want to encourage it in the least. Yes, your trick does allow chaining, but the repeated chain(...) calls are cumbersome enough to not count as an encouragement IMHO;-). A generalized wrapping system that wraps all methods so that any "return None" is transformed into a "return self" WOULD constitute encouragement, and thus a case of Lese BDFLity, which would easily risk the wrath of the PSU (of COURSE there ain't no such thing!)... 
Alex

From niemeyer at conectiva.com Sat Oct 18 16:28:54 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sat Oct 18 16:30:05 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook>
Message-ID: <20031018202854.GA22482@ibook>

> Hmm.  Either you trust that your code is basically correct or you
> don't.  If you trust that it is basically correct, you should remove
> the old code, and trust that any problems in SRE (be they related to
> your code or independent) can be fixed, in which case maintaining the
> old code would be pointless.
>
> Or, if you don't trust that your code is basically correct, you
> should not have applied the patch.

Hey.. Martin, are you ok?  What's going on?  You're being extremely
aggressive without an apparent reason.  I'm putting a prize on my head
for hacking the *hairy* code in SRE and removing a serious limitation,
and that's your reaction!?  I'm disappointed.

> I also wonder whether the code performing recursion checks has any
> function still.  So I wonder whether USE_STACKCHECK,
> USE_RECURSION_LIMIT are "essentially" dead.

Yeah.. I can clean it.  Let's please wait a little bit to see the new
code working?

> IMO, any unused code in SRE is a problem, because it makes already
> difficult-to-follow code more difficult to follow.  It is ok to
> maintain dead code if the code might be used in the future, but only
> if there are specific plans to actually use it in a foreseeable
> future.  It is not ok

Dead *debug* code is something common all over the world.  Should we
remove VERBOSE usage as well!?
:-)

--
Gustavo Niemeyer
http://niemeyer.net

From da-x at gmx.net Sat Oct 18 16:54:41 2003
From: da-x at gmx.net (Dan Aloni)
Date: Sat Oct 18 16:54:53 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310182201.52605.aleaxit@yahoo.com>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <3F9175CF.3040408@iinet.net.au> <20031018191319.GA23071@callisto.yi.org> <200310182201.52605.aleaxit@yahoo.com>
Message-ID: <20031018205441.GA24562@callisto.yi.org>

On Sat, Oct 18, 2003 at 10:01:52PM +0200, Alex Martelli wrote:
> >
> > >>> (lambda x:(x, x.sort())[0])(list(a))
>
> This one can, because the lambda lets you give a temporary name x to
> the otherwise-unnamed list returned by D.keys().  It can be made a
> _little_ better, too, I think:
>
> >>> D=dict.fromkeys('ciao')
> >>> D.keys()
> ['i', 'a', 'c', 'o']
> >>> (lambda x: x.sort() or x)(D.keys())
> ['a', 'c', 'i', 'o']
>
> and if you want it reversed after sorting,
>
> >>> (lambda x: x.sort() or x.reverse() or x)(D.keys())
> ['o', 'i', 'c', 'a']

Good, so this way the difference between copied and not copied is
minimized:

>>> (lambda x: x.sort() or x)(a)

And:

>>> (lambda x: x.sort() or x)(list(a))

Nice, this lambda hack is a cleaner, more specific, and simple
deviation of the chain() function.  Perhaps it could be made more
understandable like:

>>> sorted = lambda x: x.sort() or x
>>> sorted(list(a))
['a', 'c', 'i', 'o']

And:

>>> sorted(a)
['a', 'c', 'i', 'o']

The only problem is that you assume .sort() always returns a non True
value.  If some time in the future .sort() would return self, your
code would break and then the rightful usage would be:

>>> a = ['c', 'i', 'a', 'o']
>>> list(a).sort()
['a', 'c', 'i', 'o']
>>> a
['c', 'i', 'a', 'o']

And:

>>> a.sort()
['a', 'c', 'i', 'o']
>>> a
['a', 'c', 'i', 'o']

I didn't see the beginning of this discussion, but it looks to me that
sort() returning self is much better than adding a .copysort().

> BTW, welcome to python-dev!

Thanks!
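[A historical footnote, not part of the original message: the
hand-rolled sorted = lambda x: x.sort() or x above is, name and all,
essentially what Python 2.4 later shipped as the sorted() builtin -- a
copy-and-sort that returns a new list and leaves its argument alone:]

```python
# sorted() returns a new sorted list; the argument is untouched.
a = ['c', 'i', 'a', 'o']
print(sorted(a))                # ['a', 'c', 'i', 'o']
print(a)                        # ['c', 'i', 'a', 'o'] -- unchanged
print(sorted(a, reverse=True))  # ['o', 'i', 'c', 'a']
```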
-- Dan Aloni da-x@gmx.net From pje at telecommunity.com Sat Oct 18 17:00:17 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Oct 18 17:00:35 2003 Subject: [Python-Dev] The Trick In-Reply-To: <200310182211.24635.aleaxit@yahoo.com> References: <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031018165255.0399fd90@mail.telecommunity.com> At 10:11 PM 10/18/03 +0200, Alex Martelli wrote: >Can we dream of a standard library module of "neat hacks that >don't really warrant a built-in" in which to stash some of these >general-purpose, no-specific-appropriate-module, useful functions >and classes? Pluses: would save some people reimplementing >them over and over and sometimes incorrectly; would remove >any pressure to add not-perfectly-appropriate builtins. Minuses: >one more library module (the, what, 211th? doesn't seem like >a biggie). Language unchanged -- just library. Pretty please? Hmmm. import tricky.hacks from dont_try_this_at_home_kids import * I suppose 'shortcuts' would probably be a less contentious name. :) The downside to having such a module would be that it would entertain ongoing pressure to add more things to it. I suppose it'd be better to have a huge shortcuts module (or maybe shortcuts package, divided by subject matter) than to keep adding builtins. > > (I know, by that argument several built-ins shouldn't exist. Well, > > they might be withdrawn in 3.0; let's not add more.) > >"Amen and Hallelujah" to the hope of slimming language and >built-ins in 3.0 (presumably the removed built-ins will go into a >"legacy curiosa" module, allowing a "from legacy import *" to >ease making old code run in 3.0? seems cheap & sensible). I like it. Or, for symmetry, maybe 'from __past__ import lambda'. ;-) Say, in 3.0, will there be perhaps *no* builtins? 
After all, you don't need builtins to import things.  Nah, that'd be
too much like Java, and not enough like pseudocode.

Ah well, time for me to stop making suggestions on what color to paint
the bicycle shed, and start doing some real work today. :)

From aleaxit at yahoo.com Sat Oct 18 17:12:38 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 18 17:12:43 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <20031018205441.GA24562@callisto.yi.org>
References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310182201.52605.aleaxit@yahoo.com> <20031018205441.GA24562@callisto.yi.org>
Message-ID: <200310182312.38553.aleaxit@yahoo.com>

On Saturday 18 October 2003 10:54 pm, Dan Aloni wrote:
...
> Perhaps it could be made more understandable like:
>
> >>> sorted = lambda x: x.sort() or x
> >>> sorted(list(a))

No fair -- that's not a single expression any more!-)

> The only problem is that you assume .sort() always returns a non
> True value.  If some time in the future .sort() would return self,
> your code would break and then the rightful usage would be:

Why do you think it would break?  It would do a _tiny_ amount of
avoidable work, but still return the perfectly correct result.  Sure
you don't think I'd post an unreadable inline hack that would break in
the unlikely case the BDFL ever made a change he's specifically
Pronounced against, right?-)

> I didn't see the beginning of this discussion, but it looks to me
> that sort() returning self is much better than adding a .copysort().

The BDFL has Pronounced against it: he doesn't LIKE chaining.

Alex

From guido at python.org Sat Oct 18 17:22:00 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 17:22:06 2003
Subject: [Python-Dev] The Trick
In-Reply-To: Your message of "Sat, 18 Oct 2003 22:11:24 +0200."
<200310182211.24635.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181714.19688.aleaxit@yahoo.com> <200310181633.h9IGXoV09658@12-236-54-216.client.attbi.com> <200310182211.24635.aleaxit@yahoo.com> Message-ID: <200310182122.h9ILM0m10190@12-236-54-216.client.attbi.com> > Can we dream of a standard library module of "neat hacks that > don't really warrant a built-in" in which to stash some of these > general-purpose, no-specific-appropriate-module, useful functions > and classes? Pluses: would save some people reimplementing > them over and over and sometimes incorrectly; would remove > any pressure to add not-perfectly-appropriate builtins. Minuses: > one more library module (the, what, 211th? doesn't seem like > a biggie). Language unchanged -- just library. Pretty please? Modules should be about specific applications, or algorithms, or data types, or some other unifying principle. I think "handy" doesn't qualify. :-) > > (I know, by that argument several built-ins shouldn't exist. Well, > > they might be withdrawn in 3.0; let's not add more.) > > "Amen and Hallelujah" to the hope of slimming language and > built-ins in 3.0 (presumably the removed built-ins will go into a > "legacy curiosa" module, allowing a "from legacy import *" to > ease making old code run in 3.0? seems cheap & sensible). Let's not speculate yet about how to get old code to run in 3.0. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 17:27:43 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 17:28:06 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031018202854.GA22482@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> Message-ID: Gustavo Niemeyer writes: > Hey.. Martin, are you ok? What's going on? You're being extremelly > aggressive without an aparent reason. 
> I'm putting a prize on my head for hacking the *hairy* code in SRE
> and removing a serious limitation, and that's your reaction!?  I'm
> disappointed.

Please accept my apologies; I don't want to diminish your efforts, and
I do appreciate them.

However, I'm concerned that track is completely lost as to how SRE
works - is it or is it not the case that the current implementation
which is in CVS is recursive, with arbitrary deep nesting?  If it is
not recursive anymore (which the subject suggests), then why is the
'level' argument still in?  Can we or can we not remove the ad-hoc
determination of USE_RECURSION_LIMIT?

> Yeah.. I can clean it.  let's please wait a little bit to
> see the new code working?

Certainly.  However, I was hoping that we have better means of finding
out whether the code still does what it is supposed to do than
testing.  Perhaps that is an illusion.

Regards,
Martin

From guido at python.org Sat Oct 18 18:05:38 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 18 18:06:16 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Sat, 18 Oct 2003 21:46:20 +0200." <200310182146.20751.aleaxit@yahoo.com>
References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310181120.45477.aleaxit@yahoo.com> <200310181717.h9IHHeI09703@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com>
Message-ID: <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com>
> > Nothing terrible, admittedly, and that's presumably how I'd architect > things IF I ever met a use case for a "reiterable iterator": > > class ReiterableIterator(object): > def __init__(self, thecallable, *itsargs, **itskwds): > self.c, self.a, self.k = thecallable, itsargs, itskwds > self.it = thecallable(*itsargs, **itskwds) > def __iter__(self): return self > def next(self): return self.it.next() > def reiter(self): return self.__class__(self.c, *self.a, **self.k) Why put support for a callable with arbitrary arguments in the ReiterableIterator class? Why not say it's called without args, and if the user has a need to use something with args, they can use one of the many approaches to currying? > typical toy example use: > > def printwice(n, reiter): > for i, x in enumerate(reiter): > if i>=n: break > print x > for i, x in enumerate(reiter.reiter()): > if i>=n: break > print x > > def evens(): > x = 0 > while 1: > yield x > x += 2 > > printwice(5, ReiterableIterator(evens)) Are there any non-toy examples? I'm asking because I can't remember ever having had this need myself. > > [Alex again] > > > > > There ARE other features I'd REALLY have liked to get from iterators > > > in some applications. > > > > > > A "snapshot" -- providing me two iterators, the original one and > > > another, which will step independently over the same sequence of > > > items -- would have been really handy at times. And a "step back" > ... > > > disturbed); but not knowing the abilities of the underlying iterator > > > would mean these wrappers would often duplicate functionality > > > needlessly. > > > > I don't see how it can be done without an explicit request for such a > > wrapper in the calling code. If the underlying iterator is ephemeral > > (is not reiterable) the snapshotter has to save a copy of every item, > > and that would defeat the purpose of iterators if it was done > > automatically. Or am I misunderstanding? > > No, you're not. 
But, if the need to snapshot (or reiterate, very > different thing) was deemed important (and I have my doubts if > either of them IS important enough -- I suspect snapshot perhaps, > reiterable not, but I don't _know_), we COULD have those iterators > which "know how to snapshot themselves" expose a .snapshot or > __snapshot__ method. Then a function make_a_snapshottable(it) [the > names are sucky, sorry, bear with me] would return it if that method > was available, otherwise the big bad wrapper around it. A better name would be clone(); copy() would work too, as long as it's clear that it copies the iterator, not the underlying sequence or series. (Subtle difference!) Reiteration is a special case of cloning: simply stash away a clone before you begin. > Basically, by exposing suitable methods an iterator could "make its > abilities known" to functions that may or may not need to wrap it in > order to achieve certain semantics -- so the functions can build > only those wrappers which are truly indispensable for the purpose. > Roughly the usual "protocol" approach -- functions use an object's > ability IF that object exposes methods providing that ability, and > otherwise fake it on their own. In this case I'm not sure if it is desirable to do this automatically. If I request a clone of an iterator for a data stream coming from a pipe or socket, it would have to start buffering everything. Sure, I can come up with a buffering class that throws away buffered data that none of the existing clones can reach, but I very much doubt if it's worth it; a customized buffering scheme for the application at hand would likely be more efficient than a generic solution. > > I'm not sure what you are suggesting here. Are you proposing that > > *some* iterators (those which can be snapshotted cheaply) sprout a new > > snapshot() method? > > If snapshottability (eek!)
is important enough, yes, though __snapshot__ > might perhaps be more traditional (but for iterators we do have the > precedent of method next without __underscores__). (Which I've admitted before was a mistake.) A problem I have with making iterator cloning a standard option is that this would pretty much require that all iterators for which cloning can be implemented should implement clone(). That in turn means that iterator implementors have to work harder (sometimes cloning can be done cheaply, but it might require a different refactoring of the iterator implementation). Another issue is that it would make generators second-class citizens, since they cannot be cloned. (It would seem to be possible to copy a stack frame, but then the question begs whether to use shallow or deep copying -- if a local variable in a generator references a list, should the list be copied or not? And if it should be copied, should it be a deep or shallow copy? There's no good answer without knowing the intention of the programmer.) > > > As I said I do have use cases for all of these. Simplest is the > > > ability to push back the last item obtained by next, since a > > > frequent > > Yeah, that's really easy to provide by a lightweight wrapper, which > was my not-so-well-clarified intended point. > > > This definitely sounds like you'd want to create an explicit wrapper > > Absolutely. > > > Perhaps a snapshottable iterator could also have a backup() method > > (which would decrement self.i in your first example) or a prev() > > method (which would return self.sequence[self.i] and decrement > > self.i). > > It seems to me that the ability to back up and that of snapshotting > are somewhat independent. Backing up suggests a strictly limited buffer; cloning suggests a potentially arbitrarily large buffer. If backing up is what you really need, it's easy to provide a wrapper for it (with a buffer limit argument). 
Since the buffer is only limited, keeping a few copies of items that aren't strictly necessary won't hurt; it doesn't have the issue of wasting space with a full copy of an existing sequence (or worse, of an easily regenerated series). > > > A "snapshot" would be useful whenever more than one pass on a > > > sequence _or part of it_ is needed (more useful than a "restart" > > > because of the "part of it" provision). And a decent wrapper > > > for it is a bear... > > > > Such wrappers for specific container types (or maybe just one for > > sequences) could be in a standard library module. Is more needed? > > I think that if it's worth providing a wrapper it's also worth > having those iterators that don't need the wrapper (because they > already intrinsically have the needed ability) sprout the relevant > method or special method; "factory functions" provided with the > wrappers could then just return the already-satisfactory iterator, > or a wrapper built around it, depending. > > Problem is, I'm NOT sure if "it's worth providing a wrapper" in each > of these cases. snapshottingability (:-) is the one case where, if > I had to decide myself right now, I'd say "go for it"... but that > may be just because it's the one case for which I happened to > stumble on some use cases in production (apart from "undoing", which > isn't too bad to handle in other ways anyway). I'd like to hear more about those cases, to see if they really need cloning (:-) or can live with a fixed limited backup capability. I think a standard backup wrapper would be a useful thing to have (maybe in itertools?); since generator functions can't be cloned, I'm going to push back on the need for cloning for now until I see a lot more non-toy evidence. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sat Oct 18 18:22:59 2003 From: martin at v.loewis.de (Martin v. 
=?iso-8859-15?q?L=F6wis?=) Date: Sat Oct 18 18:23:21 2003 Subject: [Python-Dev] Be Honest about LC_NUMERIC Message-ID: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> What happened to this PEP? I can't find it in the PEP list. Personally, I am satisfied with the patch that evolved from the discussion (#774665), and I would be willing to apply it even without a PEP. Thoughts? Regards, Martin From Scott.Daniels at Acm.Org Sat Oct 18 18:44:46 2003 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat Oct 18 18:45:00 2003 Subject: [Python-Dev] Re: Python-Dev Digest, Vol 3, Issue 59 In-Reply-To: References: Message-ID: <3F91C25E.2050409@Acm.Org> [Alex Martelli] >[Kevin Jacobs] >>[Guido van Rossum] >>>I don't like the trick of avoiding the copy if the refcount is one; >>>AFAIK it can't be done in Jython. >> >>There is also a problem with the strategy if it gets called by a >>C only extension. It is perfectly feasible for a C extension to >>hold the reference to an object, call the copying sort (directly >>or indirectly), and then be very surprised that the copy did not >>take place. > > Alas, I fear you're right. Darn--so much for a possible little but > cheap optimization (which might have been neat in PySequence_List > even if copysort never happens and the optimization is only for > CPython -- I don't see why an optimization being impossible in > Jython should stop CPython from making it, as long as semantics > remain compatible). It's certainly possible for C code to call > PySequence_List or whatever while holding the only reference, > and count on the returned and argument objects being distinct:-(. I'm afraid I'm confused here. If the C code is like:

    ... at this point PTR refers to an object with refcount 1
    OTHER = (PTR)
    ... Then it might be that PTR == OTHER ...

What possible harm could come? The C code should expect a sortcopy method to recycle the object referred to by PTR if "the Trick" isn't used.
I am a trifle confused about what harm occurs. Seems to me that list(v) (and alist[:]) could quite happily implement "the Trick" without fear of failure. -Scott David Daniels Scott.Danies@Acm.Org From bac at OCF.Berkeley.EDU Sat Oct 18 22:30:27 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 22:30:37 2003 Subject: [Python-Dev] How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: <200310090503.h99533G00867@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> Message-ID: <3F91F743.6090801@ocf.berkeley.edu> Guido van Rossum wrote: >>Guido van Rossum writes: >> >> >>>Maybe PyBool_FromLong() itself could make this unneeded by adding >>>something like >>> >>> if (ok < 0 && PyErr_Occurred()) >>> return NULL; >>> >>>to its start? > > > [MvL] > >>That would an incompatible change. I would expect PyBool_FromLong(i) >>do the same thing as bool(i). > > > Well, it still does, *except* if you have a pending exception. IMO > what happens when you make a Python API call while an exception is > pending is pretty underspecified, so it's doubtful whether this > incompatibility matters. > > >>>Maybe a pair of macros Py_return_True and Py_return_False would make >>>sense? >> >>You should, of course, add Py_return_None to it, as well. >> >>Then you will find that some contributor goes on a crusade to use >>these throughout very quickly :-) > > > There's the minor issue of how to spell it (Mark Hammond may have a > different suggestion) but that certain contributor has my approval > once we get the spelling agreed upon. > So I just grepped the source and checked the patch manager and don't see any resolution on this. I know there was no objections from anyone to do this beyond just coming up with an agreed spelling. 
So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the all-caps for macros, but I can easily live with the first suggestion as well. -Brett From guido at python.org Sat Oct 18 22:40:46 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 18 22:40:54 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: Your message of "Sat, 18 Oct 2003 19:30:27 PDT." <3F91F743.6090801@ocf.berkeley.edu> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> Message-ID: <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> > So I just grepped the source and checked the patch manager and don't see > any resolution on this. I know there was no objections from anyone to > do this beyond just coming up with an agreed spelling. > > So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the > all-caps for macros, but I can easily live with the first suggestion as > well. Py_RETURN_NONE, _FALSE, _TRUE are fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sat Oct 18 23:23:17 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 23:23:25 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> Message-ID: <3F9203A5.2030407@ocf.berkeley.edu> Guido van Rossum wrote: >>So I just grepped the source and checked the patch manager and don't see >>any resolution on this. 
I know there was no objections from anyone to >>do this beyond just coming up with an agreed spelling. >> >>So Py_return_None or Py_RETURN_NONE ? I am with Mark in liking the >>all-caps for macros, but I can easily live with the first suggestion as >>well. > > > Py_RETURN_NONE, _FALSE, _TRUE are fine. > OK, great. I can code them up, but fair warning, I have not done C macros in a *long* time so if someone would rather do it then please do so. Regardless, Brett Newbie Question time: what file should they go in? -Brett From bac at OCF.Berkeley.EDU Sat Oct 18 23:38:21 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Oct 18 23:38:29 2003 Subject: [Python-Dev] python-dev Summary for 2003-10-01 through 2003-10-15 [draft] Message-ID: <3F92072D.70601@ocf.berkeley.edu> python-dev Summary for 2003-10-01 through 2003-10-15 ++++++++++++++++++++++++++++++++++++++++++++++++++++ This is a summary of traffic on the `python-dev mailing list`_ from October 1, 2003 through October 15, 2003. It is intended to inform the wider Python community of on-going developments on the list. To comment on anything mentioned here, just post to `comp.lang.python`_ (or email python-list@python.org which is a gateway to the newsgroup) with a subject line mentioning what you are discussing. All python-dev members are interested in seeing ideas discussed by the community, so don't hesitate to take a stance on something. And if all of this really interests you then get involved and join `python-dev`_! This is the twenty-seventh summary written by Brett Cannon (about to turn a quarter century old; so young yet so wise =). All summaries are archived at http://www.python.org/dev/summary/ . Please note that this summary is written using reStructuredText_ which can be found at http://docutils.sf.net/rst.html . 
Any unfamiliar punctuation is probably markup for reST_ (otherwise it is probably regular expression syntax or a typo =); you can safely ignore it, although I suggest learning reST; it's simple and is accepted for `PEP markup`_ and gives some perks for the HTML output. Also, because of the wonders of programs that like to reformat text, I cannot guarantee you will be able to run the text version of this summary through Docutils_ as-is unless it is from the original text file. .. _PEP Markup: http://www.python.org/peps/pep-0012.html The in-development version of the documentation for Python can be found at http://www.python.org/dev/doc/devel/ and should be used when looking up any documentation on something mentioned here. PEPs (Python Enhancement Proposals) are located at http://www.python.org/peps/ . To view files in the Python CVS online, go to http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/ . Reported bugs and suggested patches can be found at the SourceForge_ project page. .. _python-dev: http://www.python.org/dev/ .. _SourceForge: http://sourceforge.net/tracker/?group_id=5470 .. _python-dev mailing list: http://mail.python.org/mailman/listinfo/python-dev .. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python .. _Docutils: http://docutils.sf.net/ .. _reST: .. _reStructuredText: http://docutils.sf.net/rst.html .. contents:: .. _last summary: http://www.python.org/dev/summary/2003-09-01_2003-09-15.html ===================== Summary Announcements ===================== Python-dev had a major explosion in emails thanks to some proposed changes to list.sort (summarized in `Decorate-sort-undecorate eye for the list.sort guy`_). That got covered. Some behind-the-scenes stuff that would not interest the general Python community was left out for my personal sanity. It looks like I will not have major issues continuing writing the Summaries in terms of school interfering. 
The only big issue will be how long past their closure date it takes me to get them out. In other words, unless my schoolwork load suddenly becomes heavy continuously I should be able to keep doing the Summaries until my personal sanity gives out. This summary is brought to you by the song "Insanity_" by `Liz Phair`_ and "`Harder to Breathe`_" by `Maroon 5`_. .. _Insanity: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewAlbum?playlistId=1760071&selectedItemId=1759480 .. _Liz Phair: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewArtist?artistId=22707 .. _Harder to Breathe: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewAlbum?playlistId=1798612&selectedItemId=1798604 .. _Maroon 5: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewArtist?artistId=1798556 ========= Summaries ========= -------------------------------------------------------------------- I gave a talk at PyCon 2004 and all I got was respect and admiration -------------------------------------------------------------------- I summarized this last month, but this is important so I am doing it again (and will continue to mention it until no more proposals are being accepted). PyCon_ is ramping up for 2004 and is putting out a `Call for Proposals`_. Since PyCon is meant to be very broad-reaching you can propose anything from a scientific paper to a tutorial. If you have any inkling to give a talk please send in a proposal. It can be rough; the key is that what you want to discuss can be understood from the proposal. So take a look at the link and consider coming to PyCon as a speaker and not just an attendee. .. _PyCon: http://www.python.org/pycon/dc2004/ .. _Call for Proposals: http://www.python.org/pycon/dc2004/cfp.html Contributing threads: `PyCon DC 2004: Call for Proposals `__ --------------- Web-SIG started --------------- As stated on the SIGs page, "The Python `Web SIG`_ is dedicated to improving Python's support for interacting with World Wide Web services and clients."
If there is some web-related functionality that you think Python should have, this is the place to discuss it. If you think an existing Python module could stand a redesign then this is the proper forum for your ideas. .. _Web SIG: http://www.python.org/sigs/web-sig/ Contributing threads: `Any movement on a SIG for web lib enchancements? `__ -------------------------------------------- I have seen the future and it includes 2.3.3 -------------------------------------------- Anthony Baxter, release manager for Python `2.3.1`_ and `2.3.2`_, is already planning a 2.3.3 release in about three months' time. He initially suggested that the goal of this release should be to have Python build on as many platforms as possible. Michael Hudson listed "HPUX/ia64, various oddities on Irix" as the major troublemakers. He suggested that a sustained push to fix these build problems happen instead of trying to do it last-minute. Michael also thought it would be a good idea to try to find experts on the trouble platforms instead of having someone getting access to some machine and floundering since they don't know the OS. Skip Montanaro quickly chimed in with http://www.python.org/cgi-bin/moinmoin/PythonTesters which is a wiki page that lists people who are available to help with testing on various OSs. Please have a look and if you think you could help out on an OS add yourself. .. _2.3.1: http://www.python.org/2.3.1/ .. _2.3.2: http://www.python.org/2.3.2/ Contributing threads: `2.3.3 plans `__ ------------------- Helping you help us ------------------- In response to Martin v. Löwis' email on how to handle patches, Michael Bartl expressed his disappointment that nothing had happened to his patches. It was explained to him that because of time constraints on python-dev it can take time for people to get to all of the patches, but that his work was greatly appreciated and would eventually be looked at. The question of searching on SourceForge_ through the tracker items also came up.
There is a search box on the left side of the page, but it is not extensive. Better than nothing. I also posted an essay I wrote that is meant to act as a guide to how Python is developed and how anyone can help with the development regardless of abilities. You can look at the email in the "Draft of an essay on Python development" thread referenced below in "Contributing threads". Hopefully it will end up on python.org once it is in its final form. Contributing threads: `Patches & Bug help revisited `__ `Draft of an essay on Python development (and how to help) `__ -------------------------------------------- Making DLLs fatter for lower file dependency -------------------------------------------- Thomas Heller suggested adding more modules to the Windows DLL as built-in so as to cut back on the number of files required to get Python to run (py2exe_ stands to benefit from this). The issue of having a larger DLL to have to load into memory was brought up, but Martin v. Löwis said that DLLs only load into memory what is needed to run and not the entire DLL. The issue of making the overall DLL larger in terms of disk space was brought up, but the worry was partially minimized when the list of modules to add was limited to small modules that do not have external dependencies. But zlib might break that last rule in order to allow importation from compressed zip files. The idea of integrating the zlib source into the Python tree was brought up, but shot down for licensing issues on top of keeping the code synchronized. .. _py2exe: http://py2exe.sf.net/ Contributing threads: `buildin vs.
shared modules `__ -------------------------------------------------- Decorate-sort-undecorate eye for the list.sort guy -------------------------------------------------- Raymond Hettinger suggested adding built-in support for the decorate-sort-undecorate (DSU) sorting idiom to list.sort (see the Python Cookbook recipe at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52234 which is recipe 2.3 in the dead tree version or Tim Peters' intro to chapter 2 for details on the idiom). After a long discussion on the technical merits of various ways to do this, list.sort gained the keyword arguments 'key' and 'reverse'. 'key' takes in a function that accepts one argument and returns what the item should be sorted based on. So running ``[(0,0), (0,2), (0,1)].sort(key=lambda x: x[1])`` will sort the list based on the second item in each tuple. Technically the sort algorithm just runs the item it is currently looking at through the function and then handles the sorting. This avoids having to actually allocate another list. 'reverse' does what it sounds like based on whether its argument is true or false. list.sort also became guaranteed to be stable (this includes 'reverse'). A discussion of whether list.sort should return self came up and was *very* quickly squashed by Guido. The idea of having a second method, though, that did sort and returned a copy of the sorted list is still being considered. Contributing threads: `decorate-sort-undecorate `__ `list.sort `__ ------------------------------- New Python 2.3.2 Windows binary ------------------------------- Some invalid DLLs made it into the 2.3.2 Windows binary distribution by accident. It seems to mostly affect Windows 98 and NT 4 users. The binary has been fixed and put up online. You can tell if you downloaded the fixed version by checking the filename; the new one is named Python-2.3.2-1.exe (notice the "-1").
Contributing threads: `Python-2.3.2 windows binary screwed `__ From martin at v.loewis.de Sun Oct 19 03:37:11 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 03:37:29 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <3F9203A5.2030407@ocf.berkeley.edu> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> <3F9203A5.2030407@ocf.berkeley.edu> Message-ID: "Brett C." writes: > OK, great. I can code them up, but fair warning, I have not done C > macros in a *long* time so if someone would rather do it then please > do so. Regardless, Brett Newbie Question time: what file should they > go in? I would put them along with the things they are returning, i.e. Py_RETURN_TRUE into boolobject.h, Py_RETURN_NONE into object.h (after the Py_None definition). Regards, Martin From martin at v.loewis.de Sun Oct 19 03:40:50 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 03:41:07 2003 Subject: [Python-Dev] python-dev Summary for 2003-10-01 through 2003-10-15 [draft] In-Reply-To: <3F92072D.70601@ocf.berkeley.edu> References: <3F92072D.70601@ocf.berkeley.edu> Message-ID: "Brett C." writes: > In response to Martin v. L?wis' email on how to handle patches, > Michael Bartl expressed his disappointment that nothing had happened > to his patches. It was explained to him that because of time > restraints on python-dev that it can take time for people to get to > all of the patches, but that his work was greatly appreciated and > would eventually be looked at. Follow-up: I have accepted one of his patches. The other I consider incorrect, waiting for him to further comment (or withdraw the patch). 
Regards, Martin From aleaxit at yahoo.com Sun Oct 19 06:05:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 06:06:07 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com> <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> Message-ID: <200310191205.57016.aleaxit@yahoo.com> On Sunday 19 October 2003 00:05, Guido van Rossum wrote: ... > > class ReiterableIterator(object): > > def __init__(self, thecallable, *itsargs, **itskwds): ... > Why put support for a callable with arbitrary arguments in the > ReiterableIterator class? Why not say it's called without args, and > if the user has a need to use something with args, they can use one of > the many approaches to currying? The typical and most frequent case would be that generating a new iterator requires calling iter(asequence) -- i.e., the typical case does require arguments. So, just like e.g. for threading.Thread, atexit.register, and other callables that take a callable argument, it makes more sense to NOT require the user to invent a currying approach (note btw that iter does NOT support the iter.__get__ trick, of course, as it's a builtin function and not a Python function). It would be different if Python supported a curry built-in, but it doesn't. > > typical toy example use: ... > Are there any non-toy examples? I have not met any, yet -- whence my interest in hearing about use cases from anybody who might have. > I'm asking because I can't remember ever having had this need myself. Right, me neither. > A better name would be clone(); copy() would work too, as long as it's > clear that it copies the iterator, not the underlying sequence or > series. (Subtle difference!) > > Reiteration is a special case of cloning: simply stash away a clone > before you begin. Good name, and good point. 
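[The "stash away a clone before you begin" idea quoted just above later got a standard-library spelling: itertools.tee, added in Python 2.4 after this thread, which splits one iterator into independent buffered iterators. A minimal sketch in today's Python, with tee playing the role of the clone:]

```python
import itertools

def evens():
    x = 0
    while True:
        yield x
        x += 2

# "Reiteration is a special case of cloning: simply stash away
# a clone before you begin."
snap, it = itertools.tee(evens())

first_pass = [next(it) for _ in range(5)]
second_pass = [next(snap) for _ in range(5)]  # independent re-run of the same items
```

[tee's internal buffer holds only items one branch has consumed and the other has not yet reached -- essentially the "buffering class that throws away buffered data that none of the existing clones can reach" Guido describes.]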
> > Roughly the usual "protocol" approach -- functions use an object's > > ability IF that object exposes methods providing that ability, and > > otherwise fake it on their own. > > In this case I'm not sure if it is desirable to do this automatically. Ah, yes, the automatism might be a performance trap -- good point. > If I request a clone of an iterator for a data stream coming from a > pipe or socket, it would have to start buffering everything. Sure, I > can come up with a buffering class that throws away buffered data that > none of the existing clones can reach, but I very much doubt if it's > worth it; a customized buffering scheme for the application at hand > would likely be more efficient than a generic solution. Then clone(it) should raise an exception if it does NOT expose a method supplying "easy cloning" (or more simply it.clone() could do it, e.g. an AttributeError:-) alerting the user of the need to use such a "buffering class" wrapper:

    try:
        clo = it.clone()
    except AttributeError:
        clo = BufferingWrapper(it)

But if no existing iterator supplies the .clone -- even when it would be very easy for it to do so -- this would bufferingwrap everything. > > > I'm not sure what you are suggesting here. Are you proposing that > > > *some* iterators (those which can be snapshotted cheaply) sprout a > > > new snapshot() method? > > > > If snapshottability (eek!) is important enough, yes, though > > __snapshot__ might perhaps be more traditional (but for iterators we do > > have the precedent of method next without __underscores__). > > (Which I've admitted before was a mistake.) Ah, I didn't recall that admission, sorry. OK, underscores everywhere then. > A problem I have with making iterator cloning a standard option is > that this would pretty much require that all iterators for which > cloning can be implemented should implement clone().
That in turn > means that iterator implementors have to work harder (sometimes > cloning can be done cheaply, but it might require a different > refactoring of the iterator implementation). Making iterator authors aware of their clients' possible need to clone doesn't sound bad to me. There's no _compulsion_ to provide the functionality, but some "social pressure" to do it if a refactoring can afford it, well, why not? > Another issue is that it would make generators second-class citizens, > since they cannot be cloned. (It would seem to be possible to copy a > stack frame, but then the question begs whether to use shallow or deep > copying -- if a local variable in a generator references a list, > should the list be copied or not? And if it should be copied, should > it be a deep or shallow copy? There's no good answer without knowing > the intention of the programmer.) Hmmm, there's worse -- if a generator uses an iterator the latter should be cloned, not copied, to produce the generator-clone effect, e.g.

    def by2(it):
        for x in it:
            yield x*2

If it is a list I don't think this is a problem -- already now the user cannot change it for the lifetime of iterators produced by by2(it) without weird effects, e.g. "for x in by2(L): L.append(x)" gives an infinite loop. But if it is an iterator it should be cloned at the time an iterator produced by by2(it) is cloned. Eeep. No, you're right, in the general case I cannot see how to clone generator-produced iterators. > > It seems to me that the ability to back up and that of snapshotting > > are somewhat independent. > > Backing up suggests a strictly limited buffer; cloning suggests a Unless you need to provide "unlimited undo", yes, but that's a harder problem anyway (needing different architecture). > > may be just because it's the one case for which I happened to > > stumble on some use cases in production (apart from "undoing", which > > isn't too bad to handle in other ways anyway).
> > I'd like to hear more about those cases, to see if they really need > cloning (:-) or can live with a fixed limited backup capability. I have an iterator it whose items, after an arbitrary prefix terminated by the first empty item, are supposed to be each 'yes' or 'no'. I need to process it with different functions depending if it has certain proportions of 'yes'/'no' (and yet another function if it has any invalid items) -- each of those functions needs to get the iterator from right after that 'first empty item'. Today, I do:

    def dispatchyesno(it, any_invalid, selective_processing):
        # skip the prefix
        for x in it:
            if not x: break
        # snapshot the rest
        snap = list(it)
        it = iter(snap)
        # count and check
        yeses = noes = 0
        for x in it:
            if x=='yes': yeses += 1
            elif x=='no': noes += 1
            else: return any_invalid(snap)
        total = float(yeses+noes)
        if not total:
            raise ValueError, "sequence empty after prefix"
        ratio = yeses / total
        for threshold, function in selective_processing:
            if ratio <= threshold: return function(snap)
        raise ValueError, "no function to deal with a ratio of %s" % ratio

(yes, I could use bisect, but the number of items in selective_processing is generally quite low so I didn't bother). Basically, I punt and "snapshot" by making a list out of what is left of my iterator after the prefix. That may be the best I can do in some cases, but in others it's a waste. (Oh well, at least infinite iterators are not a consideration here, since I do need to exhaust the iterator to get the ratio:-). What I plan to do if this becomes a serious problem in the future is add something like an optional 'clone=None' argument so I can code:

    if clone is None:
        snap = list(it)
        it = iter(snap)
    else:
        snap = clone(it)

instead of what I have hardwired now.
But, I _would_ like to just do, e.g.:

    try: snap = it.clone()
    except AttributeError:
        snap = list(it)
        it = iter(snap)

using some standardized protocol for "easily clonable iterators" rather
than requiring such awareness of the issue on the caller's part.

> I think a standard backup wrapper would be a useful thing to have
> (maybe in itertools?); since generator functions can't be cloned, I'm
> going to push back on the need for cloning for now until I see a lot
> more non-toy evidence.

Very reasonable, sure.  I suspect the discussion of backup wrapper is
best moved to another thread, given this msg is so long and there are
all the usual finicky details to nail down....

Alex

From aleaxit at yahoo.com  Sun Oct 19 06:25:34 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 19 06:25:40 2003
Subject: [Python-Dev] why The Trick can't work
In-Reply-To: <3F91C25E.2050409@Acm.Org>
References: <3F91C25E.2050409@Acm.Org>
Message-ID: <200310191225.34383.aleaxit@yahoo.com>

On Sunday 19 October 2003 00:44, Scott David Daniels wrote:
   ...
> >>There is also a problem with the strategy if it gets called by a
> >>C only extension.  It is perfectly feasible for a C extension to
> >>hold the reference to an object, call the copying sort (directly
> >>or indirectly), and then be very surprised that the copy did not
> >>take place.
> >
> > Alas, I fear you're right.  Darn--so much for a possible little but
> > cheap optimization (which might have been neat in PySequence_List
   ...
> I'm afraid I'm confused here.  If the C code is like:
>
>      ... at this point PTR refers to an object with refcount 1
>      OTHER = (PTR)
>      ... Then it might be that PTR == OTHER ...
>
> What possible harm could come?  The C code should expect a
> sortcopy method to recycle the object referred to by PTR
> if "the Trick" isn't used.

No!  The point of a sorted *copy* is to NOT "recycle the object",
else you'd just call PyList_Sort.
There is no precedent for a C function documented as "may steal a reference if it feels like it but need not". Without The Trick, PTR* is unchanged -- so it cannot be changed by The Trick without exposing weirdness for such a C-coded extension in these circumstances. I think such a C-coded extension must ALREADY avoid calling filter under such circumstances (haven't tested yet) -- and that undocumented issue is bad enough already... > I am a trifle confused about > what harm occurs. Seems to me that list(v) (and alist[:]) > could quite happily implement "the Trick" without fear of > failure. Yes, but they're not "C-coded extensions". These Python expressions cause PySequence_List and PyList_CopySlice respectively to be called (in the end it always goes rapidly to the copy-slice one), and it's not trivial to find a spot that is NOT callable by a C-coded extension where The Trick could live safely. Alex From ncoghlan at iinet.net.au Sun Oct 19 06:28:45 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Oct 19 06:29:32 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310182224.33499.aleaxit@yahoo.com> References: <000401c39516$c440b520$e841fea9@oemcomputer> <200310181743.38959.aleaxit@yahoo.com> <3F9175CF.3040408@iinet.net.au> <200310182224.33499.aleaxit@yahoo.com> Message-ID: <3F92675D.4070406@iinet.net.au> Alex Martelli strung bits together to say: > But, for the general case: the BDFL has recently Pronounced that he does > not LIKE chaining and doesn't want to encourage it in the least. Yes, your > trick does allow chaining, but the repeated chain(...) calls are cumbersome > enough to not count as an encouragement IMHO;-). Well, yes, that was sort of the point. For those who _really_ like chaining (I'm not one of them - I agree with Guido that it is less readable and harder to maintain), the 'chain' function provides a way to do it with what's already in the language. Cheers, Nick. 
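[Editor's note: the 'chain' function Nick mentions is not shown in this excerpt. A plausible reconstruction — the name comes from the thread, the exact signature is guessed — is a helper that calls a None-returning mutator such as list.sort and hands the object back:]

```python
def chain(obj, methodname, *args, **kwds):
    # Guessed reconstruction: invoke a mutating method that returns
    # None (e.g. list.sort, list.append) and return the object so
    # calls can be strung together.
    getattr(obj, methodname)(*args, **kwds)
    return obj
```

A use such as `chain(chain(list(tasks), 'sort'), 'reverse')` illustrates exactly the cumbersome repeated chain(...) calls Alex objects to above.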
--
Nick Coghlan               |  Brisbane, Australia
ICQ#: 68854767             |  ncoghlan@email.com
Mobile: 0409 573 268       |  http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."

From skip at manatee.mojam.com  Sun Oct 19 08:00:48 2003
From: skip at manatee.mojam.com (Skip Montanaro)
Date: Sun Oct 19 08:00:54 2003
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200310191200.h9JC0m9G003255@manatee.mojam.com>


Bug/Patch Summary
-----------------

545 open / 4255 total bugs (+44)
210 open / 2423 total patches (+13)

New Bugs
--------

tarfile exception on large .tar files  (2003-10-13)
        http://python.org/sf/822668

Telnet.read_until() timeout parameter misleading  (2003-10-13)
        http://python.org/sf/822974

cmath.log doesn't have the same interface as math.log.  (2003-10-13)
        http://python.org/sf/823209

urllib2 digest auth is broken  (2003-10-14)
        http://python.org/sf/823328

os.strerror doesn't understand windows error codes  (2003-10-14)
        http://python.org/sf/823672

ntpath.expandvars doesn't expand Windows-style variables.  (2003-10-15)
        http://python.org/sf/824371

exception with Message.get_filename()  (2003-10-15)
        http://python.org/sf/824417

Package Manager Scrolling Behavior  (2003-10-15)
        http://python.org/sf/824430

bad value of INSTSONAME in Makefile  (2003-10-15)
        http://python.org/sf/824565

dict.__init__ doesn't call subclass's __setitem__.  (2003-10-16)
        http://python.org/sf/824854

Memory error on AIX in email.Utils._qdecode  (2003-10-16)
        http://python.org/sf/824977

code.InteractiveConsole interprets escape chars incorrectly  (2003-10-17)
        http://python.org/sf/825676

reference to Built-In Types section in file() documentation  (2003-10-17)
        http://python.org/sf/825810

Class Problem with repr and getattr on PY2.3.2  (2003-10-18)
        http://python.org/sf/826013

New Patches
-----------

add option to NOT use ~/.netrc in nntplib.NNTP()  (2003-10-13)
        http://python.org/sf/823072

Updated .spec file.
  (2003-10-14)
        http://python.org/sf/823259

use just built python interp. to build the docs.  (2003-10-14)
        http://python.org/sf/823775

Add additional isxxx functions to string object.  (2003-10-16)
        http://python.org/sf/825313

telnetlib timeout fix (bug 822974)  (2003-10-17)
        http://python.org/sf/825417

let's get rid of cyclic object comparison  (2003-10-17)
        http://python.org/sf/825639

Add list.copysort()  (2003-10-17)
        http://python.org/sf/825814

cmath.log optional base argument, fixes #823209  (2003-10-18)
        http://python.org/sf/826074

Closed Bugs
-----------

tempfile.mktemp() for directories  (2002-11-22)
        http://python.org/sf/642391

MacOS.Error for platform.mac_ver under OS X  (2003-07-30)
        http://python.org/sf/780461

access fails on Windows with Unicode file name  (2003-08-17)
        http://python.org/sf/789995

a bug in IDLE on Python 2.3 i think  (2003-08-17)
        http://python.org/sf/790162

mkstemp doesn't return abspath  (2003-09-21)
        http://python.org/sf/810408

Google kills socket lookup  (2003-10-04)
        http://python.org/sf/817611

installer wakes up Windows File Protection  (2003-10-05)
        http://python.org/sf/818029

PythonIDE interactive window Unicode bug  (2003-10-08)
        http://python.org/sf/819860

tkinter's 'after' and 'threads' on multiprocessor  (2003-10-09)
        http://python.org/sf/820605

reduce docs neglect a very important piece of information.
  (2003-10-11)
        http://python.org/sf/821701

Closed Patches
--------------

642391: tempfile.mktemp() docs to include dir info  (2003-01-04)
        http://python.org/sf/662475

Documentation for platform module  (2003-08-08)
        http://python.org/sf/785752

Tidying error messages in compile.c  (2003-08-21)
        http://python.org/sf/792869

Windows installer changes for 2.3.1  (2003-08-28)
        http://python.org/sf/796919

Mention behavior of seek() on text files  (2003-09-19)
        http://python.org/sf/809535

fix for mkstemp with relative paths (bug #810408)  (2003-09-22)
        http://python.org/sf/810914

fix doc typos  (2003-10-10)
        http://python.org/sf/821093

From niemeyer at conectiva.com  Sun Oct 19 10:06:39 2003
From: niemeyer at conectiva.com (Gustavo Niemeyer)
Date: Sun Oct 19 10:07:50 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook>
Message-ID: <20031019140638.GA23157@ibook>

> However, I'm concerned that track is completely lost as to how SRE
> works -

You may have lost it. I haven't.

> is it or is it not the case that the current implementation
> which is in CVS is recursive, with arbitrary deep nesting? If it is

It is *NOT* the case. Do the following test: set USE_RECURSION_LIMIT
to *2*, and run the tests.

There's a single case of single recursion, that's why it can't be 1:
when SRE_COUNT() is called from the main loop, and then it calls
SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never*
recurse again, unless there's a serious bug in the expression compiler
(which is not the case right now). To give an example of when this
single recursion happens, suppose the following expression: "[ab]*".

> not recursive anymore (which the subject suggests), then why is
> the 'level' argument still in? Can we or can we not remove the ad-hoc
> determination of USE_RECURSION_LIMIT?

USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already
discussed this in other messages.
> > Yeah.. I can clean it. let's please wait a little bit to
> > see the new code working?
>
> Certainly. However, I was hoping that we have better means of finding
> out whether the code still does what it is supposed to do than
> testing. Perhaps that is an illusion.

I'm shocked. Do you really believe that I've done all the changes and
past fixes in SRE without knowing how it works? I thought my
credibility was a little higher.

--
Gustavo Niemeyer
http://niemeyer.net

From jacobs at penguin.theopalgroup.com  Sun Oct 19 10:33:00 2003
From: jacobs at penguin.theopalgroup.com (Kevin Jacobs)
Date: Sun Oct 19 10:33:05 2003
Subject: [Python-Dev] why The Trick can't work
In-Reply-To: <200310191225.34383.aleaxit@yahoo.com>
Message-ID:

On Sun, 19 Oct 2003, Alex Martelli wrote:
> On Sunday 19 October 2003 00:44, Scott David Daniels wrote:
> > I'm afraid I'm confused here.  If the C code is like:
> >
> >      ... at this point PTR refers to an object with refcount 1
> >      OTHER = (PTR)
> >      ... Then it might be that PTR == OTHER ...
> >
> > What possible harm could come?  The C code should expect a
> > sortcopy method to recycle the object referred to by PTR
> > if "the Trick" isn't used.
>
> No!  The point of a sorted *copy* is to NOT "recycle the object",
> else you'd just call PyList_Sort.  There is no precedent for a C
> function documented as "may steal a reference if it feels like it
> but need not".  Without The Trick, PTR* is unchanged -- so it
> cannot be changed by The Trick without exposing weirdness
> for such a C-coded extension in these circumstances.

Even worse, the C code may not know that it is calling a copy sort, and
assume that the new object is distinct from the original.  I'm not
saying that such C code is optimal, or even correct, but I suspect that
a great deal of it does exist.

Even worse than that is the possibility that a list subclass can change
the behavior of copysort in such a way that it changes the refcount of
the object within the copysort call.
This could confound any code naively attempting to use refcounts to detect the "optimization", either by masking a copy or masking the lack of a copy. So I'm not saying that it isn't a neat idea, but it does present more problems than it solves. Besides that, I find it no burden to copy then sort, or write a utility function that does so -- and I manage a project with well over 1 million lines of Python/C-extension code at the moment. -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From guido at python.org Sun Oct 19 12:30:15 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 12:30:28 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 12:05:56 +0200." <200310191205.57016.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310182146.20751.aleaxit@yahoo.com> <200310182205.h9IM5cj10229@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> Message-ID: <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> > > A problem I have with making iterator cloning a standard option is > > that this would pretty much require that all iterators for which > > cloning can be implemented should implement clone(). That in turn > > means that iterator implementors have to work harder (sometimes > > cloning can be done cheaply, but it might require a different > > refactoring of the iterator implementation). > > Making iterator authors aware of their clients' possible need to clone > doesn't sound bad to me. There's no _compulsion_ to provide the > functionality, but some "social pressure" to do it if a refactoring can > afford it, well, why not? Well, since it can't be done for the very important class of generators, I think it's better to prepare the users of all iterators for their non-reiterability. 
It would surely be a shame if the social pressure to provide cloning
ended up making generators second-class citizens!

> > I'd like to hear more about those cases, to see if they really need
> > cloning (:-) or can live with a fixed limited backup capability.
>
> I have an iterator it whose items, after an arbitrary prefix terminated by
> the first empty item, are supposed to be each 'yes' or 'no'.

This is a made-up toy example, right?  Does it correspond with
something you've had to do in real life?

> I need to process it with different functions depending if it has certain
> proportions of 'yes'/'no' (and yet another function if it has any invalid
> items) -- each of those functions needs to get the iterator from right
> after that 'first empty item'.
>
> Today, I do:
>
> def dispatchyesno(it, any_invalid, selective_processing):
>     # skip the prefix
>     for x in it:
>         if not x: break
>     # snapshot the rest
>     snap = list(it)
>     it = iter(snap)
>     # count and check
>     yeses = noes = 0
>     for x in it:
>         if x=='yes': yeses += 1
>         elif x=='no': noes += 1
>         else: return any_invalid(snap)
>     total = float(yeses+noes)
>     if not total: raise ValueError, "sequence empty after prefix"
>     ratio = yeses / total
>     for threshold, function in selective_processing:
>         if ratio <= threshold: return function(snap)
>     raise ValueError, "no function to deal with a ratio of %s" % ratio
>
> (yes, I could use bisect, but the number of items in selective_processing
> is generally quite low so I didn't bother).
>
> Basically, I punt and "snapshot" by making a list out of what is left of
> my iterator after the prefix.  That may be the best I can do in some cases,
> but in others it's a waste.  (Oh well, at least infinite iterators are not a
> consideration here, since I do need to exhaust the iterator to get the
> ratio:-).
> What I plan to do if this becomes a serious problem in the
> future is add something like an optional 'clone=None' argument so I
> can code:
>
>     if clone is None:
>         snap = list(it)
>         it = iter(snap)
>     else: snap = clone(it)
>
> instead of what I have hardwired now.  But, I _would_ like to just do, e.g.:
>
>     try: snap = it.clone()
>     except AttributeError:
>         snap = list(it)
>         it = iter(snap)
>
> using some standardized protocol for "easily clonable iterators" rather
> than requiring such awareness of the issue on the caller's part.

Is this from a real app?  What it most reminds me of is parsing email
messages that can come either from a file or from a pipe; often you
want to scan the body to find the end of its MIME structure and then go
back and do things to the various MIME parts.  If you know it comes
from a real file, it's easy to save the file offsets for the parts as
you parse them; but when it's a pipe, that doesn't work.  In practice,
these days, the right thing to do is probably to save the data read
from a pipe to a temp file first, and then parse the temp file; or if
you insist on parsing it as it comes in, copy the data to a temp file
as you go and save file offsets in the temp file.

But I'm not sure that abstracting this away all the way to an iterator
makes sense.  For one, the generic approach to cloning if the iterator
doesn't have __clone__ would be to make a memory copy, but in this app
a disk copy is desirable (I can invent something that overflows to disk
above a certain limit, but it's cumbersome, and you have cleanup
issues, and it needs parameterization since not everybody agrees on
when to spill to disk).
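[Editor's note: the "overflows to disk above a certain limit" idea Guido waves at might be sketched like this for an iterator of text lines — the threshold, the helper name, and leaving cleanup to the caller are all invented here:]

```python
import itertools
import tempfile

def snapshot_lines(it, spill_after=1000):
    """Hypothetical snapshot helper: keep up to spill_after text lines
    in memory; past that, spill everything to a temp file and iterate
    from disk.  Closing the file is left to the caller -- exactly the
    cleanup issue Guido points out."""
    head = list(itertools.islice(it, spill_after))
    extra = next(it, None)
    if extra is None:
        return iter(head)              # small enough: stay in memory
    f = tempfile.TemporaryFile('w+')   # disk copy above the limit
    f.writelines(head)
    f.write(extra)
    f.writelines(it)
    f.seek(0)
    return iter(f)
```

Note how the parameterization problem shows up immediately: there is no universally right value for `spill_after`, which is part of Guido's argument against baking this into a generic iterator protocol.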
Another issue is that the application doesn't require iterating over
the clone and the original iterator simultaneously, but a generic
auto-cloner can't assume that; for files, this would either mean that
each clone must have its own file descriptor (and dup() doesn't cut it
because it shares the file offset), or each clone must keep a file
offset, but now you lose the performance effect of a streaming buffer
unless you code up something extremely hairy with locks etc.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com  Sun Oct 19 12:50:24 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 19 12:51:08 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310191205.57016.aleaxit@yahoo.com>
Message-ID: <003f01c39661$17d5fd80$e841fea9@oemcomputer>

> > A better name would be clone(); copy() would work too, as long as it's
> > clear that it copies the iterator, not the underlying sequence or
> > series.  (Subtle difference!)
> >
> > Reiteration is a special case of cloning: simply stash away a clone
> > before you begin.

So far, all of my needs for re-iteration have been met by storing some
of the iterator's data.  If all of it needs to be saved, I use list(it).
If only a portion needs to be saved, then I use the code from the tee()
example in the itertools documentation:

    def tee(iterable):
        "Return two independent iterators from a single iterable"
        def gen(next, data={}, cnt=[0]):
            dpop = data.pop
            for i in itertools.count():
                if i == cnt[0]:
                    item = data[i] = next()
                    cnt[0] += 1
                else:
                    item = dpop(i)
                yield item
        next = iter(iterable).next
        return (gen(next), gen(next))

Raymond Hettinger

From itamar at itamarst.org  Sun Oct 19 13:05:39 2003
From: itamar at itamarst.org (Itamar Shtull-Trauring)
Date: Sun Oct 19 13:06:32 2003
Subject: [Python-Dev] Fw: [Fwd: Re: Python-Dev Digest, Vol 3, Issue 37]
Message-ID: <20031019130539.7a689a26.itamar@itamarst.org>

(Glyph is having some issues sending mail to mail.python.org, so I'm
forwarding this for him.)
-------------- next part --------------
An embedded message was scrubbed...
From: Glyph Lefkowitz
Subject: Re: Python-Dev Digest, Vol 3, Issue 37
Date: Wed, 15 Oct 2003 20:57:52 -0400
Size: 1725
Url: http://mail.python.org/pipermail/python-dev/attachments/20031019/95f7b658/attachment.mht

From python at rcn.com  Sun Oct 19 13:23:40 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 19 13:24:24 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <3F92675D.4070406@iinet.net.au>
Message-ID: <004701c39665$bd6ff440$e841fea9@oemcomputer>

[Alex Martelli]
> > But, for the general case: the BDFL has recently Pronounced that he does
> > not LIKE chaining and doesn't want to encourage it in the least.  Yes,
> > your trick does allow chaining, but the repeated chain(...) calls are
> > cumbersome enough to not count as an encouragement IMHO;-).

[Nick Coghlan]
> Well, yes, that was sort of the point. For those who _really_ like
> chaining (I'm not one of them - I agree with Guido that it is less
> readable and harder to maintain), the 'chain' function provides a way
> to do it with what's already in the language.
Remember, list.copysort() isn't about chaining or even "saving a line
or two".  It is about using an expression instead of a series of
statements.  That makes it possible to use it wherever expressions are
allowed, including function call arguments and list comprehensions.

Here are some examples taken from the patch comments:

    genhistory(date, events.copysort(key=incidenttime))

    todo = [t for t in tasks.copysort() if due_today(t)]

To break these back into multiple statements is to cloud their intent
and take away their expressiveness.  Using multiple statements requires
introducing auxiliary, state-changing variables that remain visible
longer than necessary.  State-changing variables are a classic source
of programming errors.  In contrast, the examples above are clean and
show their correctness without having to mentally decrypt them.

Scanning through the sort examples in the standard library, I see that
the multi-line, statement form is sometimes further clouded by having a
number of statements in-between.  In SimpleHTTPServer.py, for example:

    list = os.listdir(path)
    . . . (yada, yada)
    list.sort(key=lambda a: a.lower())
    . . . (yada, yada, yada)
    for name in list:
        . . .

You see other examples using os.environ and such.

The forces working against introducing an in-line sort are:
* the time to copy the list (which Alex later showed to be irrelevant),
* having two list methods with a similar purpose, and
* the proposed method names are less than sublime

If someone could come up with a name more elegant than "copysort",
the idea would be much more appetizing.

Raymond Hettinger

From martin at v.loewis.de  Sun Oct 19 13:36:04 2003
From: martin at v.loewis.de (Martin v.
=?iso-8859-15?q?L=F6wis?=) Date: Sun Oct 19 13:36:51 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <20031019140638.GA23157@ibook> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: Gustavo Niemeyer writes: > You may have lost it. I haven't. Very good. > There's a single case of single recursion, that's why it can't be 1: > when SRE_COUNT() is called from the main loop, and then it calls > SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never* > recurse again, unless there's a serious bug in the expression compiler > (which is not the case right now). I see. So there is a guarantee that level will never be larger than 2? > USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already > discussed this in other messages. Maybe we have, but it was not clear to me. It even still isn't: Wouldn't it be possible to leave USE_RECURSION in, and remove USE_RECURSION_LIMIT, and the level argument? > > Certainly. However, I was hoping that we have better means of finding > > out whether the code still does what it is supposed to do than > > testing. Perhaps that is an illusion. > > I'm shocked. Do you really belive that I've done all the changes and > past fixes in SRE without knowing how it works? I thought my > credibility was a little higher. I was relying on your credibility, so I was surprised that you are interested to leave the old code in - that suggests that you feel there are problems with your code. I'm trying to find out what you think these problems are. However, getting all trust in the SRE code from the trust that I have in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing to do with you personally. I feel bad if important code is so unmaintainable that only a single person understands it. I made the remark as a comment to you saying "let's please wait a little bit to see the new code working?" 
which suggested that we actually have to *see* how the code works, in
order to determine whether it works.

Regards,
Martin

From aahz at pythoncraft.com  Sun Oct 19 14:40:16 2003
From: aahz at pythoncraft.com (Aahz)
Date: Sun Oct 19 14:40:57 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To:
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook>
Message-ID: <20031019184016.GA21377@panix.com>

On Sun, Oct 19, 2003, Martin v. Löwis wrote:
> Gustavo Niemeyer writes:
>>
>> I'm shocked. Do you really believe that I've done all the changes and
>> past fixes in SRE without knowing how it works? I thought my
>> credibility was a little higher.
>
> I was relying on your credibility, so I was surprised that you are
> interested to leave the old code in - that suggests that you feel
> there are problems with your code. I'm trying to find out what you
> think these problems are.
>
> However, getting all trust in the SRE code from the trust that I have
> in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing
> to do with you personally. I feel bad if important code is so
> unmaintainable that only a single person understands it. I made the
> remark as a comment to you saying
>
> "let's please wait a little bit to see the new code working?"
>
> which suggested that we actually have to *see* how the code works, in
> order to determine whether it works.

As a datapoint, it sounded to me more like Gustavo saying, "I'm pretty
sure I know what I'm doing here, but this is hairy code and I'd like to
keep the old code around for a bit as a cross-check in case it turns
out I'm wrong."  But if Gustavo's code already passes the regex
regression suite, I would say it's just him being suspenders-and-belt
-- which in my book as a tech support-type person is a Good Thing.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From skip at pobox.com Sun Oct 19 14:48:47 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 19 14:49:02 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: <16274.56463.542862.174939@montanaro.dyndns.org> Martin> I feel bad if important code is so unmaintainable that only a Martin> single person understands it. I've never looked at sre or any of the other regular expression engines which have made their way into or been considered for inclusion in Python over the years, but my impression has been that for most there's never more than a small handful of people -- and sometimes only one person -- who truly understands it at any given time. Skip From niemeyer at conectiva.com Sun Oct 19 15:11:44 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sun Oct 19 15:12:57 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> Message-ID: <20031019191144.GA27007@ibook> > > There's a single case of single recursion, that's why it can't be 1: > > when SRE_COUNT() is called from the main loop, and then it calls > > SRE_MATCH() again. OTOH, this second call of SRE_MATCH() will *never* > > recurse again, unless there's a serious bug in the expression compiler > > (which is not the case right now). > > I see. So there is a guarantee that level will never be larger than 2? Yep. > > USE_RECURSION_LIMIT is a friend of USE_RECURSION. We have already > > discussed this in other messages. > > Maybe we have, but it was not clear to me. 
> It even still isn't:
> Wouldn't it be possible to leave USE_RECURSION in, and remove
> USE_RECURSION_LIMIT, and the level argument?

Of course we can. It depends totally on what we want to do. We can
remove it, and then if we enable USE_RECURSION, it may blow up the
stack without being caught.

> I was relying on your credibility, so I was surprised that you are
> interested to leave the old code in - that suggests that you feel
> there are problems with your code. I'm trying to find out what you
> think these problems are.

I don't think there are problems.

> However, getting all trust in the SRE code from the trust that I have
> in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing
> to do with you personally. I feel bad if important code is so
> unmaintainable that only a single person understands it. I made the
> remark as a comment to you saying
>
> "let's please wait a little bit to see the new code working?"
>
> which suggested that we actually have to *see* how the code works, in
> order to determine whether it works.

It's the first time in my entire life I'm being blamed for being careful.

--
Gustavo Niemeyer
http://niemeyer.net

From martin at v.loewis.de  Sun Oct 19 15:33:57 2003
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Oct 19 15:34:08 2003
Subject: [Python-Dev] SRE recursion removed
In-Reply-To: <20031019191144.GA27007@ibook>
References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> <20031019191144.GA27007@ibook>
Message-ID: <3F92E725.5030308@v.loewis.de>

Gustavo Niemeyer wrote:
> Of course we can. It depends totally on what we want to do. We can
> remove it, and then if we enable USE_RECURSION, it may blow up the
> stack without being caught.

Right.
If the intended usage of USE_RECURSION is only to activate it when
looking into SRE bug reports (to see if the bug goes away when this is
activated), it might not matter that it blows up the stack.

The question at hand is what to do with patch 813391: accept, reject,
defer? I was hoping that we can reject it as outdated, but perhaps
it is not outdated.

Regards,
Martin

From aleaxit at yahoo.com  Sun Oct 19 15:40:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 19 15:40:48 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <004701c39665$bd6ff440$e841fea9@oemcomputer>
References: <004701c39665$bd6ff440$e841fea9@oemcomputer>
Message-ID: <200310192140.43084.aleaxit@yahoo.com>

On Sunday 19 October 2003 07:23 pm, Raymond Hettinger wrote:
   ...
> The forces working against introducing an in-line sort are:
> * the time to copy the list (which Alex later showed to be irrelevant),
> * having two list methods with a similar purpose, and
> * the proposed method names are less than sublime

Good summary (including the parts I snipped).

> If someone could come up with a name more elegant than "copysort",
> the idea would be much more appetizing.

I still think that having it in some module is a bit better than having
it as a method of lists.  The BDFL has already Pronounced that it's too
narrow in applicability for a builtin (and he's right -- as usual), and
that we won't have "grab-bag" module of shortcuts that don't fit well
anywhere else (ditto), and seems very doubtful despite your urgings to
reconsider his stance against adding it as a list method (the
two-methods-similar-purpose issue seems quite relevant).

So, back to what I see as a key issue: a module needs to be "about"
something reasonably specific, such as a data type.  Built-in data
types have no associated module except the builtin one, which is
crowded and needs VERY high threshold for any addition.
So, if I came up with an otherwise wonderful function that works on
sets, arrays, ..., I'd be in clover (there's an obvious module to house
it)... but if the function worked on lists, dicts, files, ..., I'd be
hosed.  Note that module string STILL exists, and still is the ONLY way
to call maketrans, an important function that was deemed inappropriate
as a string method; a function just as important, and just as
inappropriate as a method, that worked on lists, or dicts, or files, or
slices, or ... would be "homeless" and might end up just not entering
the standard library.  In a way this risks making built-in types
"second-class citizens" when compared to types offered by other modules
in the standard library!

I think we SHOULD have modules corresponding to built-in types, if
there are important functions connected with those types but not
appropriate as methods to populate them.  Perhaps we could use the
User*.py modules for the purpose, but making new ones seems better.
Rather than being kept together just by naming conventions, as the
User*.py are, they might be grouped in a package.  Good names are
always a problem, but, say, "tools.lists" might be the module with the
auxiliary tools dealing with lists, if "tools" was the package name --
"tools.dicts", "tools.files", etc, if needed -- "tools.sequences" for
tools equally well suited to all sequences (not just lists) -- of
course, that would retroactively suggest "tools.iters" for your
itertools, oh well, pity, we sure can't rename it without breaking
backwards compatibility:-).

If we had module tools.lists (or utils.lists, whatever) then I think
copysort (by whatever name) would live well there.  copyreverse and
copyextend might perhaps also go there and make Barry happy?-)

Alternatively - we could take a different tack.  copysort is NOT so
much a tool that works on an existing list -- as shown in the code I
posted, thanks to PySequence_List, it's just as easy to make it work on
any sequence (finite iterator or iterable).
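[Editor's note: the module-level helper Alex is arguing a home for is tiny in pure Python. The body is uncontroversial — it mirrors what the C patch does via PySequence_List plus sort — and only its name and location are what the thread is actually debating; 'copysort' is just the thread's working name:]

```python
def copysort(seq, *args, **kwds):
    """Return a new sorted list built from any iterable, leaving the
    original untouched; sort options are passed straight through."""
    new = list(seq)          # accepts any sequence or iterator
    new.sort(*args, **kwds)
    return new
```

(Historically, this is essentially the functionality that landed as the `sorted()` builtin in Python 2.4.)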
So what does it do? It BUILDS a new list object (a sorted one) from any sequence. So -- it's a FACTORY FUNCTION of the list type. Just like, say, dict.fromkeys is a factory function of the dict type. Now, factory functions are "by nature" classmethods of their type object, no? So, we should package THIS factory function just like others -- as a classmethod on list, list.somename, just like dict.fromkeys is a classmethod on dict. In this light, we surely don't want "copy" as a part of the name -- a factory method should be thought of as building a new list, not as copying an old one (particularly because it will work on any sequence as its argument, of course). Maybe list.buildsorted if we want to emphasize the "build" part. Or list.newsorted to emphasize that a new object is returned. Or maybe, like in dict.fromkeys, we don't want to emphasize either the building or the newness, but then I wouldn't know what to suggest except the list.sorted that's already drawn catcalls (though it drew them when it was proposed as an instance method of lists -- maybe as a classmethod it will look better?-) I want the functionality -- any sensible name that might let the functionality into the standard library would be ok by me (so would one putting the functionality in as a builtin or as an instance method of lists, actually, but I _do_ believe those would not be the best places for this functionality, by far). 
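[Editorial aside: the factory-classmethod idea above can be sketched in a few lines. The name "newsorted" is one of the hypothetical candidates discussed, hung off an illustrative list subclass rather than the builtin; the code is shown in runnable form, not as any actual patch. (Python 2.4 eventually settled on a sorted() builtin instead.)]

```python
# Illustrative sketch only: "newsorted" and the xlist subclass are
# hypothetical names, standing in for the proposed list classmethod.
class xlist(list):
    def newsorted(cls, iterable):
        # build a NEW list from any sequence or iterator
        # (the same effect PySequence_List gives at the C level),
        # then sort it in place and return it
        result = cls(iterable)
        result.sort()
        return result
    newsorted = classmethod(newsorted)   # pre-decorator spelling, 2.3-era

print(xlist.newsorted('cba'))           # -> ['a', 'b', 'c']
print(xlist.newsorted(iter((3, 1, 2)))) # works on any iterable, too
```

Like dict.fromkeys, it is called on the type, not on an instance, which is what makes it read as a factory rather than as a copy operation.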
I hope the "tools package" idea and/or the classmethod one find favour...!-) Alex From niemeyer at conectiva.com Sun Oct 19 15:40:25 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Sun Oct 19 15:41:36 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <3F92E725.5030308@v.loewis.de> References: <20031018144703.GA10212@ibook> <20031018182215.GA10756@ibook> <20031018202854.GA22482@ibook> <20031019140638.GA23157@ibook> <20031019191144.GA27007@ibook> <3F92E725.5030308@v.loewis.de> Message-ID: <20031019194025.GA27237@ibook> > The question at hand is what to do with patch 813391: accept, reject, > defer? I was hoping that we can reject it as outdated, but perhaps > it is not outdated. IMO, reject it as outdated. This problem doesn't exist anymore. -- Gustavo Niemeyer http://niemeyer.net From aleaxit at yahoo.com Sun Oct 19 15:47:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 15:47:37 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <003f01c39661$17d5fd80$e841fea9@oemcomputer> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> Message-ID: <200310192147.30590.aleaxit@yahoo.com> On Sunday 19 October 2003 06:50 pm, Raymond Hettinger wrote: ... > If only a portion needs to be saved, then I use the code from the tee() > example in the itertools documentation: VERY very neat indeed. OK, I've gotta re-read that documentation -- mea culpa for NOT recalling such juicy contents. I'd also _really_ like to have this anything-but-trivial code available for non-copy-and-paste reuse (i.e., in the library rather than just in its docs)...! OK, I retract my requests for snapshottability -- and change them into a request to have this 'tee' in the library!-) Alex From sholden at holdenweb.com Sun Oct 19 15:53:37 2003 From: sholden at holdenweb.com (Steve Holden) Date: Sun Oct 19 15:58:19 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: Message-ID: [Gustavo] > > > > I'm shocked. 
Do you really believe that I've done all the changes and > past fixes in SRE without knowing how it works? I thought my > credibility was a little higher. > [Martin] > I was relying on your credibility, so I was surprised that you are > interested to leave the old code in - that suggests that you feel > there are problems with your code. I'm trying to find out what you > think these problems are. > > However, getting all trust in the SRE code from the trust that I have > in you is not enough for me - and, PLEASE UNDERSTAND, this has nothing > to do with you personally. I feel bad if important code is so > unmaintainable that only a single person understands it. I made the > remark as a comment to you saying > > "let's please wait a little bit to see the new code working?" > > which suggested that we actually have to *see* how the code works, in > order to determine whether it works. > Martin: I suspect that Gustavo is suffering from an excess of care and modesty: after all, with CVS controlling the code it isn't hard to back out a patch if it turns out to be a bad idea. But it won't, will it, Gustavo ;-)? regards -- Steve Holden +1 703 278 8281 http://www.holdenweb.com/ Improve the Internet http://vancouver-webpages.com/CacheNow/ Python Web Programming http://pydish.holdenweb.com/pwp/ Interview with GvR August 14, 2003 http://www.onlamp.com/python/ From aleaxit at yahoo.com Sun Oct 19 16:16:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 16:16:49 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> Message-ID: <200310192216.44849.aleaxit@yahoo.com> On Sunday 19 October 2003 06:30 pm, Guido van Rossum wrote: ... 
> > I have an iterator it whose items, after an arbitrary prefix terminated > > by the first empty item, are supposed to be each 'yes' or 'no'. > > This is a made-up toy example, right? Does it correspond with > something you've had to do in real life? Yes, but I signed an NDA, and thus made irrelevant changes sufficient to completely mask the application area &c (how the prefix's end is found, how the rest of the stream is analyzed to determine how to process it). > But I'm not sure that abstracting this away all the way to an iterator Perhaps I over-abstracted it, but I just love abstracting streams as iterators whenever I can get away with it -- I love the clean, reusable program structure I often get that way, I love the reusable functions it promotes. I guess I'll just build my iterators by suitable factory functions (including "optimized tee-ability" when feasible), tweak Raymond's "tee" to use "optimized tee-ability" when supplied, and tell my clients to build the iterators with my factories if they need memory-optimal tee-ing. As long as I can't share that code more widely, having to use e.g. richiters.iter instead of the built-in iter isn't too bad, anyway. > makes sense. For one, the generic approach to cloning if the iterator > doesn't have __clone__ would be to make a memory copy, but in this app > a disk copy is desirable (I can invent something that overflows to An iterator that knows it's coming from disk or pipe can provide that disk copy (or reuse the existing file) as part of its "optimized tee-ability". > offset), or each clone must keep a file offset, but now you lose the > performance effect of a streaming buffer unless you code up something > extremely hairy with locks etc. ??? when one clone iterates to the end, on a read-only disk file, its seeks (which happen always to be to the current offset) don't remove the benefits of read-ahead done on its behalf by the OS. Maybe you mean something else by "lose the performance effect"? 
As for locks, why? An iterator in general is not thread-safe: if two threads iterate on the same iterator, without providing their own locking, boom. So why should clones imply stricter thread-safety? Alex From skip at pobox.com Sun Oct 19 17:32:06 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 19 17:32:16 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: References: Message-ID: <16275.726.346088.324819@montanaro.dyndns.org> bcannon> Defined macros Py_RETURN_(TRUE|FALSE|NONE) as helper functions bcannon> for returning the specified value. All three Py_INCREF the bcannon> singleton and then return it. ... bcannon> + /* Macro for returning Py_None from a function */ bcannon> + #define Py_RETURN_NONE Py_INCREF(Py_None); return Py_None; ... bcannon> + /* Macros for returning Py_True or Py_False, respectively */ bcannon> + #define Py_RETURN_TRUE Py_INCREF(Py_True); return Py_True; bcannon> + #define Py_RETURN_FALSE Py_INCREF(Py_False); return Py_False; These don't look right to me. First, you have no protection against them being called like this: if (!error) Py_RETURN_TRUE; Second, any time you expect to use a macro in a statement context, I don't think you want to terminate it with a semicolon (the programmer will do that). I would have coded them as #define Py_RETURN_NONE do {Py_INCREF(Py_None); return Py_None;} while (0) #define Py_RETURN_TRUE do {Py_INCREF(Py_True); return Py_True;} while (0) #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) Skip From bac at OCF.Berkeley.EDU Sun Oct 19 17:40:36 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Sun Oct 19 17:41:04 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <16275.726.346088.324819@montanaro.dyndns.org> References: <16275.726.346088.324819@montanaro.dyndns.org> Message-ID: <3F9304D4.70407@ocf.berkeley.edu> Skip Montanaro wrote: > bcannon> Defined macros Py_RETURN_(TRUE|FALSE|NONE) as helper functions > bcannon> for returning the specified value. All three Py_INCREF the > bcannon> singleton and then return it. > > ... > bcannon> + /* Macro for returning Py_None from a function */ > bcannon> + #define Py_RETURN_NONE Py_INCREF(Py_None); return Py_None; > ... > bcannon> + /* Macros for returning Py_True or Py_False, respectively */ > bcannon> + #define Py_RETURN_TRUE Py_INCREF(Py_True); return Py_True; > bcannon> + #define Py_RETURN_FALSE Py_INCREF(Py_False); return Py_False; > > These don't look right to me. First, you have no protection against them > being called like this: > > if (!error) > Py_RETURN_TRUE; > Realized that after my first commit. Already fixed. > Second, any time you expect to use a macro in a statement context, I don't > think you want to terminate it with a semicolon (the programmer will do > that). I would have coded them as > > #define Py_RETURN_NONE do {Py_INCREF(Py_None); return Py_None;} while (0) > #define Py_RETURN_TRUE do {Py_INCREF(Py_True); return Py_True;} while (0) > #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) > Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why bother with the do/while loop? 
-Brett From aleaxit at yahoo.com Sun Oct 19 18:31:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 19 18:31:39 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <3F9304D4.70407@ocf.berkeley.edu> References: <16275.726.346088.324819@montanaro.dyndns.org> <3F9304D4.70407@ocf.berkeley.edu> Message-ID: <200310200031.32498.aleaxit@yahoo.com> On Sunday 19 October 2003 11:40 pm, Brett C. wrote: ... > #define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) > > Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a > curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why Not in C: the extra semicolon is an empty statement. So, for example if(...) { } ; else is a syntax error. > bother with the do/while loop? To let the user put a semicolon after the macro and get correct C code. Alex From bac at OCF.Berkeley.EDU Sun Oct 19 18:40:34 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sun Oct 19 18:41:19 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Include object.h, 2.121, 2.122 boolobject.h, 1.4, 1.5 In-Reply-To: <200310200031.32498.aleaxit@yahoo.com> References: <16275.726.346088.324819@montanaro.dyndns.org> <3F9304D4.70407@ocf.berkeley.edu> <200310200031.32498.aleaxit@yahoo.com> Message-ID: <3F9312E2.8050807@ocf.berkeley.edu> Alex Martelli wrote: > On Sunday 19 October 2003 11:40 pm, Brett C. wrote: > ... > >>#define Py_RETURN_FALSE do {Py_INCREF(Py_False); return Py_False;} while (0) >> >>Isn't {Py_INCREF(Py_None); return Py_None} enough? I thought ending a >>curly brace with a semi-colon is harmless (equivalent of a NO-OP). Why > > > Not in C: the extra semicolon is an empty statement. So, for example > > if(...) { > } ; else > > is a syntax error. > > >>bother with the do/while loop? > > > To let the user put a semicolon after the macro and get correct C code. > > Nuts. Time for another commit... 
-Brett From greg at cosc.canterbury.ac.nz Sun Oct 19 19:23:43 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:24:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> Message-ID: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz> Guido: > the proposed notation doesn't return a list. > ... > I don't have a proposal for generator comprehension syntax though, and > [yield ...] has the same problem. How about just leaving off the brackets? gen = yield x*x for x in stuff Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tdelaney at avaya.com Sun Oct 19 19:34:47 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:34:54 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED1A@au3010avexu1.global.avaya.com> > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > I think we SHOULD have modules corresponding to built-in types, > if there are important functions connected with those types but not > appropriate as methods to populate them. Perhaps we could use the > User*.py modules for the purpose, but making new ones seems > better. Well, we already have a precedent for this - the 'Sets' module. So if we use the same naming convention ... 
For discrete types: Lists Dicts Tuples Sets for interfaces: Iterators Iterables and for a catch-all Objects Then we just have to argue over what goes where ;) Tim Delaney From tdelaney at avaya.com Sun Oct 19 19:40:57 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:41:03 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> > From: Steve Holden [mailto:sholden@holdenweb.com] > > > Martin: I suspect that Gustavo is suffering for an excess of care and > modesty: after all, with CVS controlling the code it isn't > hard to back > out a patch if it turns out to be a bad idea. But it won't, will it, > Gustavo ;-)? Perhaps a comment that the patch won't be accepted until the dead code has been removed, but that the dead code is there for ease of regression testing during the initial testing period? Essentially, this is an alpha-level patch. When the dead code is removed it becomes a beta-level patch. Tim Delaney From tdelaney at avaya.com Sun Oct 19 19:44:29 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Sun Oct 19 19:44:35 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFED2C@au3010avexu1.global.avaya.com> > From: Delaney, Timothy C (Timothy) > > Lists > Dicts > Tuples > Sets And for symmetry with Sets, each module should also provide an import of the type that it is about e.g. from Lists import list OK - so it's Monday morning ... 
;) Tim DElaney From greg at cosc.canterbury.ac.nz Sun Oct 19 19:45:04 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:45:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Message-ID: <200310192345.h9JNj4I23163@oma.cosc.canterbury.ac.nz> Sean Ross generated: > # (3) parentheses > sumofsquares = sum((yield x*x for x in myList)) I think this one illustrates why requiring parentheses around a bare "yield..." would be a bad idea. > # (14) unpacking (*) > sumofsquares = sum(*[x*x for x in myList]) That already has a meaning (you're passing the result of a list comp as a * argument to the function). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sun Oct 19 19:54:27 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 19:54:37 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310171715.h9HHFLW06678@12-236-54-216.client.attbi.com> Message-ID: <200310192354.h9JNsRK23207@oma.cosc.canterbury.ac.nz> Guido: > but perhaps we can make this work: > > sum(x for x in S) But if "x for x in S" were a legal expression on its own, returning a generator, then [x for x in S] would have to be a 1-element list containing a generator. Unless you're suggesting that it should be a special feature of the function call syntax? That would be bizarre... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sun Oct 19 20:08:31 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:08:45 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.0.14.0.20031017151235.034fad20@mail.telecommunity.com> Message-ID: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> "Phillip J. Eby" : > If you look at it this way, then you can consider [x for x in S] to be > shorthand syntax for list(x for x in S), as they would both produce the > same result. However, IIRC, the current listcomp implementation actually > binds 'x' in the current local namespace, whereas the generator version > would not. Are we sure about that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 20:23:12 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 20:23:21 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 12:50:24 EDT." <003f01c39661$17d5fd80$e841fea9@oemcomputer> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> Message-ID: <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> > So far, all of my needs for re-iteration have been met by storing some > of the iterator's data. If all of it needs to be saved, I use list(it). 
> If only a portion needs to be saved, then I use the code from the tee() > example in the itertools documentation: > > def tee(iterable): > "Return two independent iterators from a single iterable" > def gen(next, data={}, cnt=[0]): > dpop = data.pop > for i in itertools.count(): > if i == cnt[0]: > item = data[i] = next() > cnt[0] += 1 > else: > item = dpop(i) > yield item > next = iter(iterable).next > return (gen(next), gen(next)) Ouch. That required hard work to understand! :-) And it doesn't generalize straightforwardly to three or more iterators. This approach is nice if you expect the two iterators to remain close together. But if they go far apart (without degenerating to the list(it) case like Alex's example) I imagine that a different data structure than a dict would be more efficient to hold the queue. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sun Oct 19 20:34:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:35:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Message-ID: <200310200034.h9K0YsQ23385@oma.cosc.canterbury.ac.nz> Sean Ross : > # (1) without parentheses: > B(y) for y in A(x) for x in myIterable Er, excuse me, but that had better *not* be equivalent to > # (2) for clarity, we'll add some optional parentheses: > B(y) for y in (A(x) for x in myIterable) because the former ought to be a single iterator expression with two nested loops (albeit an erroneous one, since x is being used before it's bound). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 20:40:37 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 20:40:43 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 22:16:44 +0200." <200310192216.44849.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310191205.57016.aleaxit@yahoo.com> <200310191630.h9JGUF219501@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> Message-ID: <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> > > > I have an iterator it whose items, after an arbitrary prefix > > > terminated by the first empty item, are supposed to be each > > > 'yes' or 'no'. > > > > This is a made-up toy example, right? Does it correspond with > > something you've had to do in real life? > > Yes, but I signed an NDA, and thus made irrelevant changes > sufficient to completely mask the application area &c (how is the > prefix's end is found, how the rest of the stream is analyzed to > determine how to process it). OK, but that does make it harder to judge its value for making the case for iterator cloning, because you're not saying anything about the (range of) characteristics of the input iterator. > > But I'm not sure that abstracting this away all the way to an iterator > > Perhaps I over-abstracted it, but I just love abstracting streams as > iterators whenever I can get away with it -- I love the clean, > reusable program structure I often get that way, I love the reusable > functions it promotes. But when you add more behavior to the iterator protocol, a lot of the cleanliness goes away; any simple transformation of an iterator using a generator function loses all the optional functionality. 
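[Editorial aside: Guido's point -- that any simple generator-based transformation strips an iterator of optional protocol extensions -- is easy to demonstrate with a toy iterator. The clone() method below is hypothetical, standing in for the proposed __clone__; the code is shown in present-day syntax.]

```python
class CloneableRange:
    # toy iterator sporting a hypothetical clone() extension
    def __init__(self, n, pos=0):
        self.n, self.pos = n, pos
    def __iter__(self):
        return self
    def __next__(self):
        if self.pos >= self.n:
            raise StopIteration
        self.pos += 1
        return self.pos - 1
    def clone(self):
        # a clone resumes from the current position, independently
        return CloneableRange(self.n, self.pos)

def doubled(it):
    # the most natural transformation: a plain generator function...
    for x in it:
        yield 2 * x

it = CloneableRange(3)
hasattr(it, 'clone')            # True
hasattr(doubled(it), 'clone')   # False -- the extension is gone
```

The generator object returned by doubled() speaks only the bare iterator protocol, so any tee/clone optimization the wrapped iterator offered is unreachable downstream.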
> I guess I'll just build my iterators by suitable factory functions > (including "optimized tee-ability" when feasible), tweak Raymond's > "tee" to use "optimized tee-ability" when supplied, and tell my > clients to build the iterators with my factories if they need > memory-optimal tee-ing. As long as I can't share that code more > widely, having to use e.g. richiters.iter instead of the built-in > iter isn't too bad, anyway. But you can't get the for-loop to use richiters.iter (you'd have to add an explicit call to it). And you can't use any third party or standard library code for manipulating iterators; you'd have to write your own clone of itertools. > > makes sense. For one, the generic approach to cloning if the > > iterator doesn't have __clone__ would be to make a memory copy, > > but in this app a disk copy is desirable (I can invent something > > that overflows to > > An iterator that knows it's coming from disk or pipe can provide > that disk copy (or reuse the existing file) as part of its > "optimized tee-ability". At considerable cost. > > offset), or each clone must keep a file offset, but now you lose > > the performance effect of a streaming buffer unless you code up > > something extremely hairy with locks etc. > > ??? when one clone iterates to the end, on a read-only disk file, > its seeks (which happen always to be to the current offset) don't > remove the benefits of read-ahead done on its behalf by the OS. > Maybe you mean something else by "lose the performance effect"? I wasn't thinking about the OS read-ahead, I was thinking of stdio buffering, and the additional buffering done by file.next(). (See readahead_get_line_skip() in fileobject.c.) This buffering has made "for line in file" in 2.3 faster than any way of iterating over the lines of a file previously available. Also, on many systems, every call to fseek() drops the stdio buffer, even if the seek position is not actually changed by the call. 
It could be done, but would require incredibly hairy code. > As for locks, why? An iterator in general is not thread-safe: if > two threads iterate on the same iterator, without providing their > own locking, boom. So why should clones imply stricter > thread-safety? I believe I was thinking of something else; the various iterators iterating over the same file would somehow have to communicate to each other who used the file last, so that repeated next() calls on the same iterator could know they wouldn't have to call seek() and hence lose the readahead buffer. This doesn't require locking in the thread sense, but feels similar. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sun Oct 19 20:45:00 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 20:45:24 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <000201c39514$ac006f20$e841fea9@oemcomputer> Message-ID: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> Raymond Hettinger : > Is Phil's syntax acceptable to everyone? > > (yield: x*x for x in roots) I could probably live with it, but it would be so much nicer if the "yield" could be dispensed with. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sun Oct 19 21:04:14 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 19 21:04:23 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Sun, 19 Oct 2003 17:23:12 PDT." 
<200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> Message-ID: <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com> FWIW, I partially withdraw my observation that reiterability is a special case of cloneability. It is true that if you have cloneability you have reiterability. But I hadn't realized that reiterability is sometimes easier than cloneability! Cloning a generator function at an arbitrary point is not doable; but cloning a generator function at the start would be as easy as saving the function and its arguments. But this doesn't make me any more comfortable with the idea of adding reiterability as an iterator feature (even optional). An iterator represents the rest of the sequence of values it will generate. But if we add reiterability into the mix, an iterator represents two different sequences: its "full" sequence, accessible via its reiter() method (or whatever it would be called), and its "current" sequence. The latter may be different, because when you get passed an iterator, whoever passed it might already have consumed some items; this affects the "current" sequence but not the sequence returned by reiter(). (Cloning doesn't have this problem, but its other problems make up for this.) If you prefer to see a code sample explaining the problem: consider a consumer of a reiterable iterator: def printwice(it): for x in it: print x for x in it.reiter(): print x Now suppose the following code that calls it: r = range(10) it = iter(r) # assume this is reiterable it.next() # skip first item printwice(it) This prints 1...9 followed by 0...9 !!! 
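[Editorial aside: the pitfall is easy to reproduce with a toy reiterable. The reiter() method is the hypothetical API under discussion, not anything Python provides; present-day syntax.]

```python
class ReiterableIter:
    # hypothetical "reiterable" iterator: iteration consumes items,
    # but reiter() always restarts from the full underlying sequence
    def __init__(self, seq):
        self._seq = seq
        self._it = iter(seq)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._it)
    def reiter(self):
        return ReiterableIter(self._seq)

it = ReiterableIter(range(10))
next(it)              # the caller consumes the first item...
list(it)              # "current" sequence: [1, ..., 9]
list(it.reiter())     # "full" sequence: [0, ..., 9] -- the surprise
```

The two passes disagree exactly as described: the iterator is carrying two different sequences at once.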
The solution using cloning wouldn't have this problem: def printwice(it): it2 = it.clone() for x in it: print x for x in it2: print x With reiter() it becomes hard to explain what the input requirements are for the function to work correctly; effectively, it would require a "virginal" (== has never been used :-) reiterable iterator. So we might as well require a container! If you don't have a container but you have a description of a series, Alex's Reiterable can easily fix this: class Reiterable: def __init__(self, func, *args): self.func, self.args = func, args def __iter__(self): return self.func(*self.args) This should be called with e.g. a generator function and an argument list for it. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Sun Oct 19 21:26:59 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sun Oct 19 21:27:10 2003 Subject: [Python-Dev] Re: How to spell Py_return_None and friends In-Reply-To: <200310200000.h9K006h19965@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <200310090503.h99533G00867@12-236-54-216.client.attbi.com> <3F91F743.6090801@ocf.berkeley.edu> <200310190240.h9J2ekX10384@12-236-54-216.client.attbi.com> <3F9203A5.2030407@ocf.berkeley.edu> <200310191433.h9JEXSL19256@12-236-54-216.client.attbi.com> <3F930373.8010809@ocf.berkeley.edu> <200310200000.h9K006h19965@12-236-54-216.client.attbi.com> Message-ID: <3F9339E3.30605@ocf.berkeley.edu> Guido van Rossum wrote: >>Now, where do the macros get documented? In the Python/C API docs all I >>see is docs for None in 7.1.2 . Is that the proper place to document >>Py_RETURN_NONE? Where are the docs for Py_True and Py_False? > > > Um, maybe Martin has an idea? I've not looked at the doc structure > for years. If Py_True/False aren't documented, maybe they should be > added? Otherwise I suggest you throw this back to python-dev and hope > Fred responds. 
:-) > Argh! Mis-clicked and hit Reply instead of Reply-All. Joys of a new email client. I couldn't find any docs for Py_True/False both in terms of the TOC and the index. This has turned out to be such a crap day in so many ways it is unbelievable. -Brett From greg at cosc.canterbury.ac.nz Sun Oct 19 23:49:49 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 19 23:50:33 2003 Subject: [Python-Dev] How to spell Py_return_None and friends (was: RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245) In-Reply-To: <3F91F743.6090801@ocf.berkeley.edu> Message-ID: <200310200349.h9K3nnY24004@oma.cosc.canterbury.ac.nz> "Brett C." : > So Py_return_None or Py_RETURN_NONE ? PyReturn_None, PyReturn_True, PyReturn_False Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 20 00:44:45 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 20 00:45:32 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> Message-ID: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Guido on iterator comprehensions: > The real issue is whether it adds enough to make it worthwhile to > change the language (again). > > My current opinion is that it isn't Maybe it's time to get back to what started all this, which was a desire for an accumulation syntax. (Actually it was a proposal to abuse a proposed accumulation syntax to get sorting, if I remember correctly, but let's ignore that detail for now...) Most of us seem to agree that having list comprehensions available as a replacement for map() and filter() is a good thing. But what about reduce()? Are there equally strong reasons for wanting an alternative to that, too? 
If not, why not? And if we do, maybe a general iterator comprehension
syntax isn't the best way to go. It seemed that way at first, but that
seems to have led us into a bit of a quagmire.

So, taking the original accumulator display idea, and incorporating
some of the ideas that have come up along the way, such as getting rid
of the square brackets, how about

    sum of x*x for x in xvalues
    average of g for g in grades
    maximum of f(x, y) for x in xrange for y in yrange
    top(10) of humour(joke) for joke in comedy

etc.?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From martin at v.loewis.de Mon Oct 20 01:52:33 2003
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon Oct 20 01:52:37 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
References: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
Message-ID: <3F937821.7050908@v.loewis.de>

Greg Ewing wrote:
> How about just leaving off the brackets?
>
> gen = yield x*x for x in stuff

I think this has a dangling else problem:

    gen = yield x*x for x in yield y+y for y in stuff if x > y

In this expression, how would you put parentheses, and why?
Regards, Martin From martin at v.loewis.de Mon Oct 20 01:54:15 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Oct 20 01:54:30 2003 Subject: [Python-Dev] SRE recursion removed In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFED28@au3010avexu1.global.avaya.com> Message-ID: <3F937887.3070505@v.loewis.de> Delaney, Timothy C (Timothy) wrote: > Perhaps a comment that the patch won't be accepted until the dead code > has been removed, but that the dead code is there for ease of regression > testing during the initial testing period? OTOH, the patch has been already committed to CVS head. So it is already accepted. Regards, Martin From aleaxit at yahoo.com Mon Oct 20 02:28:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 02:29:02 2003 Subject: [Python-Dev] generator comprehension syntax, was: accumulator display syntax In-Reply-To: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> References: <200310200045.h9K0j0q23393@oma.cosc.canterbury.ac.nz> Message-ID: <200310200828.57856.aleaxit@yahoo.com> On Monday 20 October 2003 02:45 am, Greg Ewing wrote: > Raymond Hettinger : > > Is Phil's syntax acceptable to everyone? > > > > (yield: x*x for x in roots) > > I could probably live with it, but it would be > so much nicer if the "yield" could be dispensed > with. I've changed my mind, too, btw (pondering on Guido's last msg on the subject): mandatory parentheses but no "yield:" would be quite fine. I realized I didn't bother to say so because of Guido's prediction (no pronouncement yet) that this issue will anyway die "like the ternary operator" -- I focused on that one rather than on the detail of _what_ syntax exactly should be NOT adopted:-). 
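With hindsight: the spelling Alex endorses here -- mandatory parentheses,
no "yield" -- is exactly what PEP 289 generator expressions adopted in
Python 2.4. A minimal illustration (modern Python, not part of the
original message):

```python
# PEP 289 generator expressions: parentheses only, no "yield" keyword.
roots = [1, 2, 3]
gen = (x * x for x in roots)   # a generator, not a list
print(list(gen))               # consuming it yields the squares
print(list(gen))               # a generator is one-shot: now exhausted
```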
Alex

From python at rcn.com Mon Oct 20 02:35:49 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 20 02:36:34 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310192323.h9JNNhs23070@oma.cosc.canterbury.ac.nz>
Message-ID: <001b01c396d4$66a490c0$a426c797@oemcomputer>

[Guido]
> > the proposed notation doesn't return a list.
> > ...
> > I don't have a proposal for generator comprehension syntax though, and
> > [yield ...] has the same problem.

[Greg Ewing]
> How about just leaving off the brackets?
>
> gen = yield x*x for x in stuff

Heck no! Right now, the only way to tell if a function is a generator
is to read through the code looking for a yield. If we do get a
generator comprehension syntax, it *must* be distinctively set off
from the surrounding code.

Brackets accomplish set-off but look too much like lists. Parens
aren't strong enough unless the yield is followed by a colon. Someone
suggested paired angle brackets (lt and gt) but that was promptly shot
down for some reason I can't recall. Curly braces and quotes are
probably out of the question. That leaves only dollar signs and other
perlisms, yuck.

The best so far is (yield: x*x for x in stuff) but someone very
important said they hated it for some reason. Perhaps someone can come
up with some clever, self-explanatory use of --> or some such.

Raymond

From aleaxit at yahoo.com Mon Oct 20 02:46:09 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 02:46:14 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com>
References: <003f01c39661$17d5fd80$e841fea9@oemcomputer> <200310200023.h9K0NCp20046@12-236-54-216.client.attbi.com> <200310200104.h9K14Ev20120@12-236-54-216.client.attbi.com>
Message-ID: <200310200846.09300.aleaxit@yahoo.com>

On Monday 20 October 2003 03:04 am, Guido van Rossum wrote:
> FWIW, I partially withdraw my observation that reiterability is a
> special case of cloneability.
> It is true that if you have
> cloneability you have reiterability. But I hadn't realized that
> reiterability is sometimes easier than cloneability!

Hmmm, I thought I had shown a simple wrapper (holding a callable and
args for it, as you show later) that implied how to wrap, at creation
time, iterators built by iter(sequence) or by generators for
reiterability (but not for cloneability). So, sure, cloneability is
more general (you can use it to implement reiterability, but not VV)
and harder to implement; reiterability IS "a special case" and thus
it's less general but easier to implement.

> But this doesn't make me any more comfortable with the idea of adding
> reiterability as an iterator feature (even optional).

Sure. "Relatively easy to implement" doesn't mean "should be in the
language". Ease of learning, breadth and appropriateness of use, risk
of misuse, ease of substitution if not in the language -- there are so
many considerations!

> With reiter() it becomes hard to explain what the input requirements
> are for the function to work correctly; effectively, it would require
> a "virginal" (== has never been used :-) reiterable iterator. So we

Yes, very good point -- and possibly the explanation of why I never
met a use case for reiterability as such. It's unlikely I want "an
iterator that may already be partly consumed but I can restart from an
unknown-to-me ``previous'' point in its lifetime" -- then I probably
just want an iterable, just as you say.

> might as well require a container! If you don't have a container but
> you have a description of a series, Alex's Reiterable can easily fix
> this:
>
>     class Reiterable:
>         def __init__(self, func, *args):
>             self.func, self.args = func, args
>         def __iter__(self):
>             return self.func(*self.args)
>
> This should be called with e.g. a generator function and an argument
> list for it.
Yes, or the function that needs a callable + args anyway might require the callable and the args as its own arguments instead of wanting them packaged up as an iterable (an iterable's probably better if the typical use case is passing e.g. a list, though -- asking for the "iterator factory" callable might help when the typical use case is passing a generator). Yet another nail in the coffin of "reiterable as a concept in the language", methinks. Alex From aleaxit at yahoo.com Mon Oct 20 03:40:43 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 03:40:49 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> Message-ID: <200310200940.43021.aleaxit@yahoo.com> On Monday 20 October 2003 02:40 am, Guido van Rossum wrote: ... > > Perhaps I over-abstracted it, but I just love abstracting streams as > > iterators whenever I can get away with it -- I love the clean, > > reusable program structure I often get that way, I love the reusable > > functions it promotes. > > But when you add more behavior to the iterator protocol, a lot of the > cleanliness goes away; any simple transformation of an iterator using > a generator function loses all the optional functionality. It loses the optimization on clonability, only, as far as I can see; i.e. cloning becomes potentially memory-expensive if what I'm cloning (tee-ing, whatever) can't give me an optimized way. I can still code the higher levels based on clean "tee-able streams", and possibly optimize some iterator-factories later if profiling shows they're needed (yet another case where one dreams of a way of profiling MEMORY use, oh well:-). 
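As an aside for later readers: the "tee"-ing being discussed here did
land in the standard library as itertools.tee in Python 2.4. A minimal
sketch of the semantics under debate -- including the internal
buffering whose memory cost Alex worries about for long streams:

```python
from itertools import tee

def squares(n):
    # a one-shot generator: without tee, a second pass would see nothing
    for i in range(n):
        yield i * i

a, b = tee(squares(4))
print(list(a))  # the first branch consumes the underlying stream...
print(list(b))  # ...and the second still sees every item, from tee's buffer
```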
BTW, playing around with some of this it seems to me that the
inability to just copy.copy (or copy.deepcopy) anything produced by
iter(sequence) is more of a bother -- quite apart from clonability (a
similar but separate concept), couldn't those iterators be copy'able
anyway? I.e. just expose underlying sequence and index as their state
for getting and setting? Otherwise to get copyable iterators I have to
reimplement iter "by hand":

    class Iter(object):
        def __init__(self, seq):
            self.seq = seq
            self.idx = 0
        def __iter__(self):
            return self
        def next(self):
            try:
                result = self.seq[self.idx]
            except IndexError:
                raise StopIteration
            self.idx += 1
            return result

and I don't understand the added value of requiring the user to code
this no-added-value, slow-things-down boilerplate.

> > I guess I'll just build my iterators by suitable factory functions
> > (including "optimized tee-ability" when feasible), tweak Raymond's
> > "tee" to use "optimized tee-ability" when supplied, and tell my
> > clients to build the iterators with my factories if they need
> > memory-optimal tee-ing. As long as I can't share that code more
> > widely, having to use e.g. richiters.iter instead of the built-in
> > iter isn't too bad, anyway.
>
> But you can't get the for-loop to use richiters.iter (you'd have to
> add an explicit call to it). And you can't use any third party or

No problem, as the iterator built by the for loop is not exposed in a
way that would ever let me try to tee it anyway.

> standard library code for manipulating iterators; you'd have to write
> your own clone of itertools.

For those itertools functions that may preserve "cheap tee-ability"
only, yes.

> > > makes sense.
For one, the generic approach to cloning if the > > > iterator doesn't have __clone__ would be to make a memory copy, > > > but in this app a disk copy is desirable (I can invent something > > > that overflows to > > > > An iterator that knows it's coming from disk or pipe can provide > > that disk copy (or reuse the existing file) as part of its > > "optimized tee-ability". > > At considerable cost. I'm not sure I see that cost, yet. > > > offset), or each clone must keep a file offset, but now you lose > > > the performance effect of a streaming buffer unless you code up > > > something extremely hairy with locks etc. > > > > ??? when one clone iterates to the end, on a read-only disk file, > > its seeks (which happen always to be to the current offset) don't > > remove the benefits of read-ahead done on its behalf by the OS. > > Maybe you mean something else by "lose the performance effect"? > > I wasn't thinking about the OS read-ahead, I was thinking of stdio > buffering, and the additional buffering done by file.next(). (See > readahead_get_line_skip() in fileobject.c.) This buffering has made > "for line in file" in 2.3 faster than any way of iterating over the Ah, if you're iterating by LINE, yes. I was iterating by fixed-size blocks on binary files in my tests, so I didn't see that effect. > lines of a file previously available. Also, on many systems, every > call to fseek() drops the stdio buffer, even if the seek position is > not actually changed by the call. It could be done, but would require > incredibly hairy code. 
The call to fseek probably SHOULD drop the buffer in a typical C implementation _on a R/W file_, because it's used as the way to signal the file that you're moving from reading to writing or VV (that's what the C standard says: you need a seek between an input op and an immediately successive output op or viceversa, even a seek to the current point, else, undefined behavior -- which reminds me, I don't know if the _Python_ wrapper maintains that "clever" requirement for ITS R/W files, but I think it does). I can well believe that for simplicity a C-library implementor would then drop the buffer on a R/O file too, needlessly but understandably. So, hmmm, wouldn't it suffice to guard the seek call with a condition that the current point in the file isn't already what we want...? [testing, testing...] nope, even just the guard slows things down a LOT. Hmmm, I think .tell IS implemented by a "dummy" .seek, isn't it? So, yes, quite some hairiness (credible or not;-) would be needed to make an iterated-by-lines file good for optimized tee-ability. > > As for locks, why? An iterator in general is not thread-safe: if > > two threads iterate on the same iterator, without providing their > > own locking, boom. So why should clones imply stricter > > thread-safety? > > I believe I was thinking of something else; the various iterators > iterating over the same file would somehow have to communicate to each > other who used the file last, so that repeated next() calls on the > same iterator could know they wouldn't have to call seek() and hence > lose the readahead buffer. This doesn't require locking in the thread > sense, but feels similar. Interesting intuition. 
The "who used this last" code doesn't feel similar to a lock, to me:
i.e., just transforming a plain iterator

    class Lines1(object):
        def __init__(self, f):
            self.f = f
        def __iter__(self):
            return self
        def next(self):
            line = self.f.next()
            return line

into a somewhat more complicated one:

    class Lines(object):
        wholast = {}
        def __init__(self, f):
            self.f = f
            self.wp = f.tell()
        def __iter__(self):
            return self
        def next(self):
            if self.wholast.get(self.f) is not self:
                self.f.seek(self.wp)
                self.wholast[self.f] = self
            line = self.f.next()
            self.wp += len(line)
            return line

(assuming seek "resyncs"). However, a loop using Lines (over
/usr/share/dict/words) [though twice as fast as my previous attempt
using tell each time] is over twice as slow as one with Lines1, which
in turn is 3 times as slow as with a tiny generator:

    def Lines2(flob):
        for line in flob:
            yield line

The deuced "for line in flob:" is so deucedly optimized that trying to
compete with it, even with something as apparently trivial as Lines1,
is apparently a lost cause;-).

OK, then I guess that an iterator by lines on a textfile can't easily
be optimized for teeability by these "share the file object"
strategies; rather, the best way to tee such a disk file would seem to
be:

    def tee_diskfile(f):
        result = file(f.name, f.mode)
        result.seek(f.tell())
        return f, result

Alex

From aleaxit at yahoo.com Mon Oct 20 03:44:30 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 03:44:36 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
Message-ID: <200310200944.30482.aleaxit@yahoo.com>

On Monday 20 October 2003 02:08 am, Greg Ewing wrote:
> "Phillip J. Eby" :
> > If you look at it this way, then you can consider [x for x in S] to be
> > shorthand syntax for list(x for x in S), as they would both produce the
> > same result.
> > However, IIRC, the current listcomp implementation actually
> > binds 'x' in the current local namespace, whereas the generator version
> > would not.
>
> Are we sure about that?

We are indeed sure (sadly) that list comprehensions leak control
variable names. We can hardly be sure of what iterator comprehensions
would be defined to do, given they don't exist, but surely we can HOPE
that in an ideal world where iterator comprehensions were part of
Python they would not be similarly leaky:-).
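Postscript for later readers: this is how things actually played out --
generator expressions (Python 2.4) got their own scope from the start,
and Python 3 list comprehensions stopped leaking too. A quick check
under Python 3 (not part of the original message):

```python
# In Python 3, the comprehension control variable no longer leaks
# into the enclosing scope.
squares = [x * x for x in range(5)]
try:
    x                      # the loop variable should not exist out here
    leaked = True
except NameError:
    leaked = False
print(squares, "leaked:", leaked)
```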
Alex From aleaxit at yahoo.com Mon Oct 20 04:07:46 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 04:08:00 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Message-ID: <200310201007.46829.aleaxit@yahoo.com> On Monday 20 October 2003 06:44 am, Greg Ewing wrote: ... > So, taking the original accumulator display idea, and > incorporating some of the ideas that have come up along > the way, such as getting rid of the square brackets, > how about > > sum of x*x for x in xvalues > average of g for g in grades > maximum of f(x, y) for x in xrange for y in yrange > top(10) of humour(joke) for joke in comedy Wow. I'm speechless. [later, having recovered speech] IF (big if) we could pull THAT off, it WOULD be well worth making 'of' a keyword (and thus requiring a "from __future__ import"). It's SO beautiful, SO pythonic, the only risk I can see is that we'd have newbie people coding: sum of the_values rather than: sum(the_values) or: sum of x for x in the_values We could (and hopefully will) quibble about the corresponding semantics (particularly for the top(10) example, implicitly requiring some "underlying sequence" to be made available while all other uses require no such black magic). But this is the first proposed new syntax I've seen in a long time -- not just on this thread -- that is SO pretty it makes me want it in the language FOR ITSELF -- to reinforce the "Python is executable pseudocode" idea!!! -- rather than just as a means to the end of having the underlying semantics available. I can but hope others share my fascination with it... in any case, whatever happens to it, *BRAVO*, Greg!!! 
Alex From Paul.Moore at atosorigin.com Mon Oct 20 05:47:38 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 20 05:48:25 2003 Subject: [Python-Dev] Re: Reiterability Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > Basically, by exposing suitable methods an iterator could "make its > abilities know" to functions that may or may not need to wrap it in > order to achieve certain semantics -- so the functions can build > only those wrappers which are truly indispensable for the purpose. > Roughly the usual "protocol" approach -- functions use an object's > ability IF that object exposes methods providing that ability, and > otherwise fake it on their own. I'm glad you pointed this out. This whole thing was starting to sound very like the sort of thing that the adaptation PEP was intended to cover. Can the people who need this get the capability via a suitable adaptation approach? I'm not familiar enough with the technique to be sure. If so, wouldn't that be a more general technique (as well as being already available in 3rd party modules like PyProtocols). Paul. From ncoghlan at iinet.net.au Mon Oct 20 08:58:35 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Oct 20 08:58:42 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <004701c39665$bd6ff440$e841fea9@oemcomputer> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> Message-ID: <3F93DBFB.3010507@iinet.net.au> Raymond Hettinger strung bits together to say: > Remember, list.copysort() isn't about chaining or even "saving a line or > two". It is about using an expression instead of a series of > statements. > That makes it possible to use it wherever expressions are allowed, > including function call arguments and list comprehensions. 
> > Here are some examples taken from the patch comments: > > genhistory(date, events.copysort(key=incidenttime)) > > todo = [t for t in tasks.copysort() if due_today(t)] 'chain' may be a bad name then, since all that function really does is take an arbitrary bound method, execute it and then return the object that the method was bound to. If we used a name like 'method_as_expr' (instead of 'chain'), then the above examples would be: genhistory(date, method_as_expr(list(events).sort, key=incidenttime)) todo = [t for t in method_as_expr(list(tasks).sort) if due_today(t)] Granted, it's not quite as clear (some might say it's positively arcane!), but it also isn't using anything that's not already in the language/standard library. > The forces working against introducing an in-line sort are: > * the time to copy the list (which Alex later showed to be irrelevant), > * having two list methods with a similar purpose, and > * the proposed method names are less than sublime > > If someone could come-up with a name more elegant than "copysort", I > the idea would be much more appetizing. Would something like 'sortedcopy' be an improvement? Although Alex's suggestion of a class method like dict.fromkeys() also sounded good - naming it is still an issue, though. I'm not entirely opposed to the idea (the 'method_as_expr' approach feels like something of a hack, even to me) - but the object method just doesn't seem to fit cleanly into the design of the basic types. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." 
From aleaxit at yahoo.com Mon Oct 20 09:17:07 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 09:17:16 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310201517.07902.aleaxit@yahoo.com> On Monday 20 October 2003 11:47 am, Moore, Paul wrote: > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > > Basically, by exposing suitable methods an iterator could "make its > > abilities know" to functions that may or may not need to wrap it in > > order to achieve certain semantics -- so the functions can build > > only those wrappers which are truly indispensable for the purpose. > > Roughly the usual "protocol" approach -- functions use an object's > > ability IF that object exposes methods providing that ability, and > > otherwise fake it on their own. > > I'm glad you pointed this out. This whole thing was starting to sound > very like the sort of thing that the adaptation PEP was intended to > cover. Darn -- one more underground attempt to foist adaptation into Python foiled by premature discovery... must learn to phrase things less overtly, the people around here are too clever!!! > Can the people who need this get the capability via a suitable > adaptation approach? I'm not familiar enough with the technique to > be sure. If so, wouldn't that be a more general technique (as well > as being already available in 3rd party modules like PyProtocols). Yes, it would be more general and perfectly adequate for this task too, but would still require SOME level of cooperation from built-in types, such as the iterators returned by built-in iter. Adaptation is no black magic, just a systematic, clean, general way to use some capabilities if a type offers them and perhaps kludge them up with a wrapper if a type doesn't offer them but such a wrapper is possible. 
If an iterator built by iter(sequence) just won't let me know about what sequence it's iterating on and what its current index on it is, in SOME way or other, there's no way I can prise that information "by force" out of it -- I must treat it just like any other iterator that only exposes a .next() method and nothing more (because that's what it DOES expose). Alex From ncoghlan at iinet.net.au Mon Oct 20 09:22:21 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Mon Oct 20 09:22:27 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201007.46829.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201007.46829.aleaxit@yahoo.com> Message-ID: <3F93E18D.5010708@iinet.net.au> Alex Martelli strung bits together to say: > On Monday 20 October 2003 06:44 am, Greg Ewing wrote: > ... > >>So, taking the original accumulator display idea, and >>incorporating some of the ideas that have come up along >>the way, such as getting rid of the square brackets, >>how about >> >> sum of x*x for x in xvalues >> average of g for g in grades >> maximum of f(x, y) for x in xrange for y in yrange >> top(10) of humour(joke) for joke in comedy > > > Wow. > > I'm speechless. > > [later, having recovered speech] IF (big if) we could pull THAT off, it > WOULD be well worth making 'of' a keyword (and thus requiring a > "from __future__ import"). It's SO beautiful, SO pythonic, the only > risk I can see is that we'd have newbie people coding: > sum of the_values > rather than: > sum(the_values) > or: > sum of x for x in the_values Except, if it was defined such that you wrote: sum of [x*x for x in the_values] then: sum of the_values would actually be a valid expression, and Greg's examples would become: sum of xvalues average of grades maximum of [f(x, y) for x in xrange for y in yrange] top(10) of [humour(joke) for joke in comedy] Either way, that's some seriously pretty executable psuedocode he has happening! 
And a magic method "__of__" that takes a list as an argument might be enough to do the trick, too. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From aleaxit at yahoo.com Mon Oct 20 10:01:08 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 10:01:15 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93E18D.5010708@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201007.46829.aleaxit@yahoo.com> <3F93E18D.5010708@iinet.net.au> Message-ID: <200310201601.08440.aleaxit@yahoo.com> On Monday 20 October 2003 03:22 pm, Nick Coghlan wrote: ... > >> sum of x*x for x in xvalues > >> average of g for g in grades > >> maximum of f(x, y) for x in xrange for y in yrange > >> top(10) of humour(joke) for joke in comedy ... > > "from __future__ import"). It's SO beautiful, SO pythonic, the only > > risk I can see is that we'd have newbie people coding: > > sum of the_values > > rather than: > > sum(the_values) > > or: > > sum of x for x in the_values > > Except, if it was defined such that you wrote: > sum of [x*x for x in the_values] > > then: > sum of the_values > > would actually be a valid expression, and Greg's examples would become: Yes, you COULD extend the syntax from Greg's NAME 'of' listmaker to _also_ accept NAME 'of' test or thereabouts (in the terms of dist/src/Grammar/Grammar of course), I don't think it would have any ambiguity. As to whether it's worth it, I dunno. > sum of xvalues Nope, he's summing the _squares_ -- sum of x*x for x in xvalues it says. > average of grades Yes, this one would then work. > maximum of [f(x, y) for x in xrange for y in yrange] Yes, you could put brackets there, but why? > top(10) of [humour(joke) for joke in comedy] Ditto -- and it doesn't do the job unless the magic becomes even blacker. 
top(N) is supposed to return jokes, not their humor values; so it needs to get an iterable or iterator of (humor(joke), joke) PAIRS -- I think it would DEFINITELY be better to have this spelled out, and in fact I'd prefer: top(10, key=humour) of comedy or top(10, key=humour) of joke for joke in comedy using the same neat syntax "key=" just sprouted by lists' sort method. > Either way, that's some seriously pretty executable psuedocode he has > happening! And a magic method "__of__" that takes a list as an argument > might be enough to do the trick, too. Agreed on the prettiness. I would prefer to have the special method be defined to receive "an iterator or iterable" -- so we can maybe put together a prototype where we just make and pass it a list, BUT keep the door open to passing it an "iterator comprehension" in the future. Or maybe make it always an iterator (in the prototype we can just build the list and call iter on it anyway... so it's not any harder to get started playing with it). Oh BTW, joining another still-current thread -- for x in sorted_copy of mylist: ... now doesn't THAT read just wonderfully, too...?-) Alex From tim.one at comcast.net Mon Oct 20 10:15:40 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 20 10:15:46 2003 Subject: [Python-Dev] New warnings in _sre.c Message-ID: MSVC complains when a signed int is compared to an unsigned int. I'm glad it does, because the compiler silently casts the signed int to unsigned, which doesn't do what the author probably intended if the signed int is less than 0: #include void main() { int i = -1; unsigned int j = 0; printf("%d\n", i < j); } That prints 0, i.e. it is not the case that -1 < 0U. 
_sre.c(852) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1021) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1035) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1109) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1131) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1192) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1230) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1267) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1285) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1287) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1294) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1314) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1344) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1362) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1384) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1476) : warning C4018: '<' : signed/unsigned mismatch _sre.c(1492) : warning C4018: '<' : signed/unsigned mismatch From guido at python.org Mon Oct 20 10:30:37 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 10:30:50 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 09:44:30 +0200." <200310200944.30482.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> Message-ID: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> > We are indeed sure (sadly) that list comprehensions leak control variable > names. But they shouldn't. It can be fixed by renaming them (e.g. numeric names with a leading dot). > We can hardly be sure of what iterator comprehensions would be > defined to do, given they don't exist, but surely we can HOPE that > in an ideal world where iterator comprehensions were part of Python > they would not be similarly leaky:-). 
It's highly likely that the implementation will have to create a
generator function under the hood, so they will be safely contained in
that frame.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From andrew-pythondev at puzzling.org Mon Oct 20 10:30:56 2003
From: andrew-pythondev at puzzling.org (Andrew Bennetts)
Date: Mon Oct 20 10:31:04 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com>
	<200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
Message-ID: <20031020143056.GE28665@frobozz>

Greg Ewing wrote:
> how about
>
> sum of x*x for x in xvalues
> average of g for g in grades
> maximum of f(x, y) for x in xrange for y in yrange
> top(10) of humour(joke) for joke in comedy

I've thought about this, and I don't think I like it. "of" just seems like
a new and confusingly different way to spell a function call. E.g., if I
read this

    max([f(x,y) for x in xrange for y in yrange])

out loud, I'd say: "the maximum of f of x and y for x in xrange, and y in
yrange". So perhaps that third example should be spelt:

    maximum of f of x, y for x in xrange for y in yrange

This particularly struck me when I read Alex's comment:

> for x in sorted_copy of mylist:
> ...
>
> now doesn't THAT read just wonderfully, too...?-)

Actually, that strikes me as an odd way of spelling:

    for x in sorted_copy(mylist):
        ...

I think the lazy iteration syntax approach was probably a better idea. I
don't like the proposed use of "yield" to signify it, though -- "yield" is
a flow control statement, so the examples using it in this thread look odd
to me. Perhaps it would be best to simply use the keyword "lazy" -- after
all, that's the key distinguishing feature. I think my preferred syntax
would be:

    sum([lazy x*x for x in sequence])

But use of parens instead of brackets, and/or a colon to make the keyword
stand out (and look reminiscent of a lambda!
which *is* a related concept, in a way -- it also defers evaluation), e.g.:

    sum((lazy: x*x for x in sequence))

Would be fine with me as well.

-Andrew.

From FBatista at uniFON.com.ar Mon Oct 20 10:34:08 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 20 10:35:28 2003
Subject: [Python-Dev] Re: prePEP: Money data type
Message-ID:

#- From the prePEP it's not clear (for me) the purpose of
#- currencySymbol.
#- If it's intended for localisation, then prefix isn't enough,
#- some countries use suffix or even such format

The idea is to keep currencySymbol, thousandSeparator and decimalSeparator
separate, in such a way that if you want to change one of those, you just
subclass Money and change it.

In a money amount shown as $1,234.56, '$' is the currencySymbol, ',' is the
thousandSeparator, and '.' is the decimalSeparator. These three elements
are useful when working with strings: not only when showing the amount with
str(), they're also important when parsing at creation time:

    #standard creation
    m = Money('12.35')

    #subclassing
    class MyMoney(Money):
        decimalSeparator = ','

    #wrong!
    m = MyMoney('12.35')

    #right...
    m = MyMoney('12,35')

#- Money(123.45, 2) --> 123 FF 45 GG
#-
#- where FF is suffix1 and GG is suffix2.

This maybe could be addressed by having a currencyPrefix and a
currencySuffix (the latter's default would be '') instead of just one
currencySymbol.

From ncoghlan at iinet.net.au Mon Oct 20 10:37:48 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 20 10:37:58 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310201601.08440.aleaxit@yahoo.com>
References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz>
	<200310201007.46829.aleaxit@yahoo.com> <3F93E18D.5010708@iinet.net.au>
	<200310201601.08440.aleaxit@yahoo.com>
Message-ID: <3F93F33C.9070702@iinet.net.au>

Alex Martelli strung bits together to say:
> On Monday 20 October 2003 03:22 pm, Nick Coghlan wrote:
> Yes, you COULD extend the syntax from Greg's
>
> NAME 'of' listmaker
>
> to _also_ accept
>
> NAME 'of' test
>
> or thereabouts (in the terms of dist/src/Grammar/Grammar of course), I don't
> think it would have any ambiguity. As to whether it's worth it, I dunno.

Actually, I was suggesting that if 'of' is simply designated as taking a
list* on the right hand side, then you can just write a list comprehension
there, without needing the parser to understand the 'for' syntax in that
case. But I don't know enough about the parser to really know if that
would be a saving worth making.
(* a list is what I was thinking, but as you point out, an iterable would
be better)

>> sum of xvalues
>
> Nope, he's summing the _squares_ --
> sum of x*x for x in xvalues
> it says.

D'oh - and I got that one right higher up, too. Ah, well.

>> maximum of [f(x, y) for x in xrange for y in yrange]
>
> Yes, you could put brackets there, but why?

I thought it would be easier on the parser (only accepting a list/iterable
on the right hand side). I don't know if that's actually true, though.

>> top(10) of [humour(joke) for joke in comedy]
>
> Ditto -- and it doesn't do the job unless the magic becomes even blacker.
> top(N) is supposed to return jokes, not their humor values; so it needs to
> get an iterable or iterator of (humor(joke), joke) PAIRS -- I think it would
> DEFINITELY be better to have this spelled out, and in fact I'd prefer:
>
> top(10, key=humour) of comedy
>
> or
>
> top(10, key=humour) of joke for joke in comedy
>
> using the same neat syntax "key=" just sprouted by lists' sort
> method.

Yes, that would make it a lot clearer what was going on.

> Agreed on the prettiness. I would prefer to have the special method be
> defined to receive "an iterator or iterable" -- so we can maybe put together
> a prototype where we just make and pass it a list, BUT keep the door open to
> passing it an "iterator comprehension" in the future. Or maybe make it always
> an iterator (in the prototype we can just build the list and call iter on it
> anyway... so it's not any harder to get started playing with it).

Well, I think we've established that at least two people on the planet love
this idea. . . and agreed on the iterator/iterable vs lists, too. I only
thought of that distinction after I'd already hit send :)

> Oh BTW, joining another still-current thread --
>
> for x in sorted_copy of mylist:
> ...
>
> now doesn't THAT read just wonderfully, too...?-)

Not to mention:

    for x in sorted_copy of reversed_copy of my_list:
        ...

    for x in sorted_copy(key=len) of my_list:
        ...
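For what it's worth, Python 2.4 did grow builtins that spell these chains directly -- sorted() and reversed() (PEP 322), both of which accept any iterable -- so Nick's examples can be approximated without 'of'. A small sketch (my_list is just illustrative data):

```python
# The hypothetical "of" chains map onto builtins Python actually grew
# shortly after this thread: reversed() (PEP 322) and sorted(), both
# new in 2.4 and both accepting any iterable.
my_list = ["pear", "fig", "banana", "kiwi"]

# "sorted_copy of reversed_copy of my_list" -- though, as Alex notes
# below, sorting makes the inner reversal irrelevant:
assert sorted(reversed(my_list)) == sorted(my_list)

# "sorted_copy(key=len) of my_list" (stable sort, so ties keep order):
assert sorted(my_list, key=len) == ["fig", "pear", "kiwi", "banana"]

# "sorted_copy(reverse=True) of my_list":
assert sorted(my_list, reverse=True) == ["pear", "kiwi", "fig", "banana"]
```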
Indeed, _that_ is a solution that looks truly Pythonic!

Hmm, just had a strange thought:

    y = copy of x

How would that be for executable pseudocode? It's entirely possible to do
all the iterator related things without having this last example work. But
what if it did?

Cheers, Nick.
__of__: just a single-argument function call?

--
Nick Coghlan | Brisbane, Australia
ICQ#: 68854767 | ncoghlan@email.com
Mobile: 0409 573 268 | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From FBatista at uniFON.com.ar Mon Oct 20 10:41:23 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 20 10:42:15 2003
Subject: [Python-Dev] prePEP: Money data type
Message-ID:

#- FWIW, Rogue Wave's Money class lets you specify _either_ rounding
#- approach -- ROUND_PLAIN specifies EU-rules-compliant rounding,
#- ROUND_BANKERS specifies round-to-even, for exactly in-between
#- amounts. Offhand, it would seem impossible to write an accounting
#- program that respects the law in Europe AND the praxis you mention
#- at the same time, unless you somehow tell it what rule to use.
#-
#- Sad, and seems weird to go to such trouble for a cent, but accountants
#- live and die by such minutiae: I think it would not be wise to ignore
#- them, PARTICULARLY if we name the type so as to make it appear to the
#- uninitiated that it "will do the right thing" regarding rounding...
#- when there isn't ONE right thing, it depends on locale &c:-(.

Seems to me that the best would be to have two functions (I like the names
roundPlain and roundBankers), with the behaviour to be specified by the
user. But here I found two approaches:

- By argument: Redefine the syntax as Money(value, [precision], [round]),
  having a specified default for round.

- By subclassing: Just make:

    class MyMoney(Money):
        moneyround = roundPlain

The first is better in that you use Money directly, but you *always* need
to specify the rounding.
In the second way you have to subclass it once, but then all the work is
done (anyway, maybe you were already subclassing Money to change its
decimalSeparator or something).

Personally, I go for the second choice.

.	Facundo

From mwh at python.net Mon Oct 20 11:02:29 2003
From: mwh at python.net (Michael Hudson)
Date: Mon Oct 20 11:02:32 2003
Subject: [Python-Dev] Re: itertools, was RE: list.sort
In-Reply-To: <200310180143.36999.aleaxit@yahoo.com> (Alex Martelli's message
	of "Sat, 18 Oct 2003 01:43:36 +0200")
References: <003201c39500$9006a8c0$e841fea9@oemcomputer>
	<200310180143.36999.aleaxit@yahoo.com>
Message-ID: <2my8vgosu2.fsf@starship.python.net>

Alex Martelli writes:
> On Saturday 18 October 2003 12:46 am, Raymond Hettinger wrote:
> ...
>> My misgivings about drop() and take() are, firstly, that they
>> are expressible in terms of islice() so they don't really add
>> any new capability. Secondly, the number of tools needs to be
>
> True. I gotta remember that -- I find it unintuitive, maybe it's
> islice's odious range-like ordering of arguments.

Yes, that rubs me the wrong way too. That and I always read it is-lice
(and imap always makes me think of mail...).

Cheers,
mwh

--
Need to Know is usually an interesting UK digest of things that happened
last week or might happen next week. [...] This week, nothing happened,
and we don't care. -- NTK Now, 2000-12-29, http://www.ntk.net/

From aleaxit at yahoo.com Mon Oct 20 11:16:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 11:16:45 2003
Subject: [Python-Dev] Re: itertools, was RE: list.sort
In-Reply-To: <2my8vgosu2.fsf@starship.python.net>
References: <003201c39500$9006a8c0$e841fea9@oemcomputer>
	<200310180143.36999.aleaxit@yahoo.com> <2my8vgosu2.fsf@starship.python.net>
Message-ID: <200310201716.36611.aleaxit@yahoo.com>

On Monday 20 October 2003 05:02 pm, Michael Hudson wrote:
...
> > islice's odious range-like ordering of arguments.
>
> Yes, that rubs me the wrong way too.
That and I always read it > is-lice (and imap always makes me think of mail...). is-lice might be useful in a debugger, though. Alex From mwh at python.net Mon Oct 20 11:23:09 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 20 11:23:12 2003 Subject: [Python-Dev] Re: itertools, was RE: list.sort In-Reply-To: <200310201716.36611.aleaxit@yahoo.com> (Alex Martelli's message of "Mon, 20 Oct 2003 17:16:36 +0200") References: <003201c39500$9006a8c0$e841fea9@oemcomputer> <200310180143.36999.aleaxit@yahoo.com> <2my8vgosu2.fsf@starship.python.net> <200310201716.36611.aleaxit@yahoo.com> Message-ID: <2mptgsorvm.fsf@starship.python.net> Alex Martelli writes: > On Monday 20 October 2003 05:02 pm, Michael Hudson wrote: > ... >> > islice's odious range-like ordering of arguments. >> >> Yes, that rubs me the wrong way too. That and I always read it >> is-lice (and imap always makes me think of mail...). > > is-lice might be useful in a debugger, though. *groan* -- ZAPHOD: OK, so ten out of ten for style, but minus several million for good thinking, eh? -- The Hitch-Hikers Guide to the Galaxy, Episode 2 From aleaxit at yahoo.com Mon Oct 20 11:26:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 11:27:06 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310201726.57890.aleaxit@yahoo.com> On Monday 20 October 2003 04:41 pm, Batista, Facundo wrote: > #- FWIW, Rogue Wave's Money class lets you specify _either_ rounding > #- approach -- ROUND_PLAIN specifies EU-rules-compliant rounding, > #- ROUND_BANKERS specifies round-to-even, for exactly in-between ... > #- isn't ONE right thing, it depends on locale &c:-(. > > Seems to me that the best would be to have two functions (liked the names > roundPlain and roundBankers), and the behaviour to be specified by the Sure, rounding IS best set by function, though you may want more than two (roundForbid to raise exceptions when rounding tries to happen, roundTruncate, etc). 
> user. But here I found two approaches:
>
> - By argument: Redefine the syntax as Money(value, [precision],
> [round]), having a specified default for round.
>
> - By subclassing: Just make:
> class MyMoney(Money):
> moneyround = roundPlain
>
> The first is better in that you use Money directly, but you need to
> specify *always* the rounding.

They're not at all incompatible!

    class Money:
        round = staticmethod(roundWhateverDefault)
        precision = someDefaultPrecision
        def __init__(self, value, precision=None, round=None):
            self.value = value
            if precision is not None:
                self.precision = precision
            if round is not None:
                self.round = round

then use self.precision and self.round in all further methods -- they'll
correctly go to either the INSTANCE attribute, if specifically set, or the
CLASS attribute, if no instance attribute is set. A useful part of how
Python works, btw. So you can subclass Money and change the default
rounding without any problem whatsoever.

> once, but then all the work is done (anyway, maybe you were already
> subclassing Money to change its decimalSeparator or something).

I do NOT think any advanced formatting should be part of the
responsibilities of class Money itself. I would focus on correct and
complete arithmetic with good handling of exact precision and rounding
rules: I contend THAT is the really necessary part. One can always
subclass Money to ADD data and methods (e.g. with appropriately designed
mix-ins), but remember subclassing cannot REMOVE capabilities: so, avoid
the "fat base class" syndrome, a well-recognized anti-pattern, and make
sure what you put in a base class is what's needed for ALL uses of it.
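Alex's sketch runs as-is once the placeholder names are filled in. Here is a runnable version -- roundPlain and roundBankers are simplistic stand-ins (not the prePEP's actual rounding rules); the point is only the instance-then-class attribute lookup he describes:

```python
def roundPlain(x):
    # stand-in for "round half up" (EU-style) rounding
    return int(x + 0.5)

def roundBankers(x):
    # stand-in: Python 3's round() already does round-half-to-even
    return round(x)

class Money:
    round = staticmethod(roundPlain)   # class-level default
    precision = 2                      # class-level default

    def __init__(self, value, precision=None, round=None):
        self.value = value
        if precision is not None:
            self.precision = precision  # instance attribute shadows class
        if round is not None:
            self.round = round

default = Money(1.5)
custom = Money(2.5, precision=4, round=roundBankers)

assert default.round(2.5) == 3   # falls back to the class default (half up)
assert custom.round(2.5) == 2    # instance override (half to even)
assert (default.precision, custom.precision) == (2, 4)
```

Note that the instance-level override stores a plain function, so it is called without an implicit self -- which is exactly why the class-level default needs the staticmethod() wrapper.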
Alex From aleaxit at yahoo.com Mon Oct 20 11:41:19 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 11:41:26 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93F33C.9070702@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201601.08440.aleaxit@yahoo.com> <3F93F33C.9070702@iinet.net.au> Message-ID: <200310201741.19295.aleaxit@yahoo.com> On Monday 20 October 2003 04:37 pm, Nick Coghlan wrote: ... > Well, I think we've established that at least two people on the planet love Right, hopefully 3 with Greg (though it's not unheard of for posters to this list to change their minds about their own proposals. So I told myself I should stay out of the thread to let others voice their opinion, BUT...: > for x in sorted_copy of reversed_copy of my_list: Ooops -- sorting a reversed copy of my_list is just like sorting my_list... I think for x in sorted_copy(reverse=True) of my_list: ... (again borrowing brand-new keyword syntax from lists' sort method) is likely to work better...:-) > Hmm, just had a strange thought: > > y = copy of x > > How would that be for executable pseudocode? It's entirely possible to do Awesomely pseudocoder (what a comparative...!-) wrt the current "y = copy.copy(x)". You WOULD need to "from copy import copy" first, presumably, but still... > all the iterator related things without having this last example work. But > what if it did? Then the special method would have to be passed the right-hand operand verbatim, NOT an iterator on it, for the "NAME 'of' test" case; otherwise, this would be a terrible "attractive nuisance" in such cases as x = copy of my_dict (if the hypothetical special method was passed iter(my_dict), it would only get the KEYS -- shudder -- so x would presumably end up as a list -- a trap for the unwary, and one I wouldn't want to have to explain to newbies!-). 
However, if I had to choose, I would forego this VERY attractive syntax
sugar, and go for Greg's original suggestion -- 'of' for iterator
comprehensions only. Syntax sugar is all very well (at least in this case),
but if it _only_ amounts to a much neater-looking way of doing what is
already quite possible, it's a case of "more-than-one-way-to-do-itis".

[Just to make sure I argue both sides: introducing "if key in mydict:" as a
better way to express "if mydict.has_key(key):" was a HUGE win, and so was
letting "if needle in haystack:" be used as a better way to express
"haystack.find(needle) >= 0" for substring checks -- so, 'mere' syntax
sugar DOES sometimes make an important difference...]

Alex

From aleaxit at yahoo.com Mon Oct 20 11:45:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 20 11:46:01 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
	<200310200944.30482.aleaxit@yahoo.com>
	<200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
Message-ID: <200310201745.36226.aleaxit@yahoo.com>

On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote:
> > We are indeed sure (sadly) that list comprehensions leak control variable
> > names.
>
> But they shouldn't. It can be fixed by renaming them (e.g. numeric
> names with a leading dot).

Hmmm, sorry?

    >>> [.2 for .2 in range(3)]
    SyntaxError: can't assign to literal

I think I don't understand what you mean.

> > We can hardly be sure of what iterator comprehensions would be
> > defined to do, given they don't exist, but surely we can HOPE that
> > in an ideal world where iterator comprehensions were part of Python
> > they would not be similarly leaky:-).
>
> It's highly likely that the implementation will have to create a
> generator function under the hood, so they will be safely contained in
> that frame.
And there will be much rejoicing...!-) Alex From Paul.Moore at atosorigin.com Mon Oct 20 11:51:04 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Mon Oct 20 11:51:50 2003 Subject: [Python-Dev] Re: accumulator display syntax Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] >> Hmm, just had a strange thought: >> >> y = copy of x >> >> How would that be for executable pseudocode? It's entirely possible to do > Awesomely pseudocoder (what a comparative...!-) wrt the current "y = > copy.copy(x)". You WOULD need to "from copy import copy" first, presumably, > but still... Did I miss April 1st? We seem to be discussing the merits of f of arg as an alternative form of f(arg) While I'm sure Cobol had some good points, I don't believe that this was one of them... If there is any merit to this proposal, it's very rapidly being lost in examples of rewriting things which are simple function calls. Paul. From pje at telecommunity.com Mon Oct 20 12:11:30 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 12:12:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201745.36226.aleaxit@yahoo.com> References: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> Message-ID: <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> At 05:45 PM 10/20/03 +0200, Alex Martelli wrote: >On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: > > > We are indeed sure (sadly) that list comprehensions leak control variable > > > names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > >Hmmm, sorry? > > >>> [.2 for .2 in range(3)] >SyntaxError: can't assign to literal > >I think I don't understand what you mean. 
He was talking about having the bytecode compiler generate "hidden" names for the variables... ones that can't be used from Python. There's one drawback there, however... If you're stepping through the listcomp generation with a debugger, you won't be able to print the current item in the list, as (I believe) is possible now. From aleaxit at yahoo.com Mon Oct 20 12:23:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 12:23:51 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310201823.45392.aleaxit@yahoo.com> On Monday 20 October 2003 05:51 pm, Moore, Paul wrote: ... > Did I miss April 1st? We seem to be discussing the merits of > > f of arg > > as an alternative form of > > f(arg) > > While I'm sure Cobol had some good points, I don't believe that this was > one of them... I may disagree, but it's sure too late to redesign Python today in that respect;-). > If there is any merit to this proposal, it's very rapidly being lost in > examples of rewriting things which are simple function calls. Agreed, and I pointed that out in my latest msg to this thread -- just like e.g. rewriting the simple function call mydict.has_key(k) as the cool, readable "k in mydict", quite identically rewriting the simple function call sum(numbers) as the cool, readable "sum of numbers" would be mere syntax sugar, "more than one way to do it", etc. So, limiting the discussion to Greg's original idea of using 'of' for iterator comprehensions will be wiser and more prudent (just like one would never dare suggesting 'in' as an alternative to calling has_key, say:-). 
That 'of' thingy is just SO pretty it's making some of us lose their heads, that's all...!-) Alex From guido at python.org Mon Oct 20 12:37:17 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 12:37:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 17:45:36 +0200." <200310201745.36226.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> Message-ID: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> > On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: > > > We are indeed sure (sadly) that list comprehensions leak control variable > > > names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > Hmmm, sorry? > > >>> [.2 for .2 in range(3)] > SyntaxError: can't assign to literal > > I think I don't understand what you mean. I meant that the compiler should rename it. Just like when you use a tuple argument: def f(a, (b, c), d): ... this actually defines a function of three (!) arguments whose second argument is named '.2'. And the body starts with something equivalent to b, c = .2 For list comps, the compiler could maintain a mapping for the listcomp control variables so that if you write [x for x in range(3)] it knows to generate bytecode as if x was called '.7'; at the bytecode level there's no requirement for names to follow the identifier syntax. 
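Python 3 eventually settled on exactly this kind of containment -- a list comprehension compiles to a hidden function scope of its own -- so the control variable no longer leaks. A quick check:

```python
# In Python 3 a list comprehension runs in its own (hidden) function
# scope, so its control variable does not leak into the enclosing
# namespace -- the containment Guido describes above.
x = "outer"
squares = [x * x for x in range(3)]

assert squares == [0, 1, 4]
assert x == "outer"   # untouched by the comprehension

# Under Python 2.3, by contrast, x would now be 2.
```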
--Guido van Rossum (home page: http://www.python.org/~guido/)

From skip at pobox.com Mon Oct 20 12:38:51 2003
From: skip at pobox.com (Skip Montanaro)
Date: Mon Oct 20 12:38:58 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz>
	<200310200944.30482.aleaxit@yahoo.com>
	<200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com>
Message-ID: <16276.3995.177704.754136@montanaro.dyndns.org>

>> We can hardly be sure of what iterator comprehensions would be
>> defined to do, given they don't exist, but surely we can HOPE that in
>> an ideal world where iterator comprehensions were part of Python they
>> would not be similarly leaky:-).

Guido> It's highly likely that the implementation will have to create a
Guido> generator function under the hood, so they will be safely
Guido> contained in that frame.

Which suggests they aren't likely to be a major performance win over list
comprehensions. If nothing else, they would push the crossover point
between list comprehensions and iterator comprehensions toward much longer
lists.

Is performance the main reason this addition is being considered? They
don't seem any more expressive than list comprehensions to me.

Skip

From guido at python.org Mon Oct 20 12:40:04 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 20 12:41:13 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Mon, 20 Oct 2003 16:51:04 BST."
	<16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
Message-ID: <200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>

> Did I miss April 1st? We seem to be discussing the merits of
>
> f of arg
>
> as an alternative form of
>
> f(arg)
>
> While I'm sure Cobol had some good points, I don't believe that this was one
> of them...
> > If there is any merit to this proposal, it's very rapidly being lost in > examples of rewriting things which are simple function calls. Amen. *If* we were to introduce 'of' as an operator, at least it should introduce some as-yet-unsupported parameter passing semantics, like call-by-name. :-) And in fact, I think that sum(x for x in range(10)) reads *better* than sum of x for x in range(10) and certainly better than sum of x for x in range of 10 because when you squint, it just becomes a series of undistinguished words, like xxx xx x xxx x xx xxxxx xx xx --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 12:43:15 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 12:43:21 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 12:11:30 EDT." <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> References: <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <5.1.1.6.0.20031020120841.03337e00@telecommunity.com> Message-ID: <200310201643.h9KGhFM21321@12-236-54-216.client.attbi.com> > There's one drawback there, however... If you're stepping through the > listcomp generation with a debugger, you won't be able to print the current > item in the list, as (I believe) is possible now. Good point. But this could be addressed in many ways; the debugger could grow a way to quote nonstandard variable names, or it could know about the name mapping, or we could use a different name-mangling scheme (e.g. prefix the original name with an underscore, and optionally append _1 or _2 etc. as needed to distinguish it from a real local with the same name). Or we could simply state this as a deficiency (I'm not sure I've ever needed to debug that situation). 
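The bracket-free spelling Guido prefers here is essentially what PEP 289 later standardized as generator expressions in Python 2.4:

```python
# PEP 289 (Python 2.4) adopted this spelling: the parentheses of a call
# double as the generator expression's own, so no brackets are needed
# when it is the sole argument.
total = sum(x * x for x in range(10))
assert total == 285

# Elsewhere -- e.g. on the right of an assignment -- the expression
# needs its own parentheses:
gen = (x * x for x in range(10))
assert list(gen) == [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```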
--Guido van Rossum (home page: http://www.python.org/~guido/)

From eppstein at ics.uci.edu Mon Oct 20 13:03:50 2003
From: eppstein at ics.uci.edu (David Eppstein)
Date: Mon Oct 20 13:03:54 2003
Subject: [Python-Dev] Re: accumulator display syntax
References: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com>
	<200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>
Message-ID:

In article <200310201640.h9KGe4I21305@12-236-54-216.client.attbi.com>,
Guido van Rossum wrote:

> And in fact, I think that
>
> sum(x for x in range(10))
>
> reads *better* than
>
> sum of x for x in range(10)
>
> and certainly better than
>
> sum of x for x in range of 10

I also think

    sum(x for x in range(10))

reads much better than

    sum(yield x for x in range(10))
    sum(yield: x for x in range(10))

or even

    sum([x for x in range(10)])

(The yield-based syntaxes also have the problem of confusing the reader
into thinking the function containing them might be a generator.) It is
enough better that the "tuple comprehension" issue is a non-problem for me.

I'm assuming this syntax would need surrounding parens inside lists,
tuples, and dicts (to avoid confusion with list/dict comprehensions and for
the same reason [x,x for x in S] is currently invalid syntax) but avoiding
the extra parens in other contexts like function calls looks like a win.

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

From guido at python.org Mon Oct 20 13:08:45 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 20 13:08:53 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: Your message of "Mon, 20 Oct 2003 15:17:07 +0200."
<200310201517.07902.aleaxit@yahoo.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> Message-ID: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> > Darn -- one more underground attempt to foist adaptation into Python > foiled by premature discovery... must learn to phrase things less > overtly, the people around here are too clever!!! :-) I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly because I expect that it will have such a big impact on coding practices. I want to have a better feel for what that impact is and whether it is altogether healthy. IOW I'm a bit worried that adaptation might become too attractive of a hammer for all sorts of problems, whether or not there are better-suited solutions. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 13:21:10 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:21:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 11:38:51 CDT." <16276.3995.177704.754136@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <16276.3995.177704.754136@montanaro.dyndns.org> Message-ID: <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> > Guido> It's highly likely that the implementation will have to create a > Guido> generator function under the hood, so they will be safely > Guido> contained in that frame. [Skip] > Which suggests they aren't likely to be a major performance win over > list comprehensions. If nothing else, they would push the crossover > point between list comprehensions and iterator comprehensions toward > much longer lists. > > Is performance is the main reason this addition is being considered? 
> They don't seem any more expressive than list comprehensions to me. They are more expressive in one respect: you can't use a list comprehension to express an infinite sequence (that's truncated by the consumer). They are more efficient in a related situation: a list comprehension buffers all its items before the next processing step begins; an iterator comprehension doesn't need to do any buffering. So iterator comprehensions win if you're pipelining operations just like Unix pipes are a huge win over temporary files in some situations. This is particularly important when the consumer is some accumulator like 'average' or 'sum'. Whether there is an actual gain in speed depends on how large the list is. You should be able to time examples like sum([x*x for x in R]) vs. def gen(R): for x in R: yield x*x sum(gen(R)) for various lengths of R. (The latter would be a good indication of how fast an iterator generator could run.) --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 20 13:31:35 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 20 13:31:39 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <20031020173134.GA29040@panix.com> On Mon, Oct 20, 2003, Guido van Rossum wrote: >Alex Martelli: >> >> We are indeed sure (sadly) that list comprehensions leak control variable >> names. > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > names with a leading dot). ?!?! When listcomps were introduced, you were strongly against any changes that would make it difficult to switch back and forth between a listcomp and its corresponding equivalent for loop. Are you changing your position or are you suggesting that for loops should grow private names? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan From guido at python.org Mon Oct 20 13:43:25 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:43:36 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Mon, 20 Oct 2003 09:40:43 +0200." <200310200940.43021.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310192216.44849.aleaxit@yahoo.com> <200310200040.h9K0ebP20072@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> Message-ID: <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> > BTW, playing around with some of this it seems to me that the > inability to just copy.copy (or copy.deepcopy) anything produced by > iter(sequence) is more of a bother -- quite apart from clonability > (a similar but separate concept), couldn't those iterators be > copy'able anyway? I.e. just expose underlying sequence and index as > their state for getting and setting? I'm not sure why you say it's separate from cloning; it seems to me that copy.copy(iter(range(10))) should return *exactly* what we'd want the proposed clone operation to return. > Otherwise to get copyable > iterators I have to reimplement iter "by hand": > > class Iter(object): > def __init__(self, seq): > self.seq = seq > self.idx = 0 > def __iter__(self): return self > def next(self): > try: result = self.seq[self.idx] > except IndexError: raise StopIteration > self.idx += 1 > return result > > and I don't understand the added value of requiring the user to > code this no-added-value, slow-things-down boilerplate. I see this as a plea to add __copy__ and __deepcopy__ methods to all standard iterators for which it makes sense. (Or maybe only __copy__ -- I'm not sure what value __deepcopy__ would add.) I find this a reasonable request for the iterators belonging to standard containers (list, tuple, dict). I guess that some of the iterators in itertools might also support this easily.
Perhaps this would be the road to supporting iterator cloning? > > > An iterator that knows it's coming from disk or pipe can provide > > > that disk copy (or reuse the existing file) as part of its > > > "optimized tee-ability". > > > > At considerable cost. > > I'm not sure I see that cost, yet. Mostly complexity of the code to implement it, and things like making sure that the disk file is deleted (not an easy problem cross-platform!). > > lines of a file previously available. Also, on many systems, > > every call to fseek() drops the stdio buffer, even if the seek > > position is not actually changed by the call. It could be done, > > but would require incredibly hairy code. > > The call to fseek probably SHOULD drop the buffer in a typical > C implementation _on a R/W file_, because it's used as the way > to signal the file that you're moving from reading to writing or VV > (that's what the C standard says: you need a seek between an > input op and an immediately successive output op or viceversa, > even a seek to the current point, else, undefined behavior -- which > reminds me, I don't know if the _Python_ wrapper maintains that > "clever" requirement for ITS R/W files, but I think it does). Yes it does: file_seek() calls drop_readahead(). > I can well believe that for simplicity a C-library implementor would > then drop the buffer on a R/O file too, needlessly but > understandably. For any stdio implementation supporting fileno(), fseek() is also used to synch up the seek positions maintained by stdio and by the underlying OS or file descriptor implementation. > The deuced "for line in flob:" is so deucedly optimized that trying > to compete with it, even with something as apparently trivial as > Lines1, is apparently a lost cause;-). 
OK, then I guess that an > iterator by lines on a textfile can't easily be optimized for teeability > by these "share the file object" strategies; rather, the best way to > tee such a disk file would seem to be: > def tee_diskfile(f): > result = file(f.name, f.mode) > result.seek(f.tell()) > return f, result Right, except you might want to change the mode to a read-only mode (without losing the 'b' or 'U' property). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 13:48:00 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 13:48:08 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 13:31:35 EDT." <20031020173134.GA29040@panix.com> References: <20031020173134.GA29040@panix.com> Message-ID: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> > >> We are indeed sure (sadly) that list comprehensions leak control > >> variable names. > > > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > ?!?! When listcomps were introduced, you were strongly against any > changes that would make it difficult to switch back and forth between a > listcomp and its corresponding equivalent for loop. I don't recall what I said then. Did I say it was a feature that L = [x for x in R] print x would print the last item of R? > Are you changing your position or are you suggesting that for loops > should grow private names? No, only list comps. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 20 13:52:30 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 20 13:52:35 2003 Subject: [Python-Dev] listcomps vs. 
for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <20031020175230.GA7307@panix.com> On Mon, Oct 20, 2003, Guido van Rossum wrote: >Aahz: >> >> ?!?! When listcomps were introduced, you were strongly against any >> changes that would make it difficult to switch back and forth between a >> listcomp and its corresponding equivalent for loop. > > I don't recall what I said then. Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? What I remember you saying was that it was an unfortunate but necessary consequence so that it would work the same as L = [] for x in R: L.append(x) print x You didn't want to have different semantics for two such similar constructs ("there's only one way"). You also didn't want to push a stack frame for listcomps. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From mcherm at mcherm.com Mon Oct 20 14:06:52 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Oct 20 14:06:58 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <1066673212.3f94243c2c03c@mcherm.com> Alex: > We are indeed sure (sadly) that list comprehensions leak control > variable names. Guido: > But they shouldn't. It can be fixed by renaming them (e.g. numeric > names with a leading dot). Aahz: > ?!?! When listcomps were introduced, you were strongly against [...] > Are you changing your position[...]? Guido: > Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? Well, I don't care much about the history of what you may have said... let's get it out in the open: The fact that listcomps leak their variable (thus providing a handy name-binding expression for the evil-minded among us) is a BAD THING. 
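[The leak being objected to is easy to demonstrate. A minimal sketch: under the Python 2.x semantics discussed in this thread the control variable escaped the comprehension, which is exactly the behavior Python 3 later removed by giving comprehensions their own scope.]

```python
# Under the 2.x listcomp semantics discussed here, the control variable
# leaked into the enclosing scope:
#
#     L = [x for x in range(5)]
#     print x        # printed 4: x escaped the comprehension
#
# Python 3 eventually gave comprehensions their own scope, so the same
# probe now reports no leak:
L = [x for x in range(5)]
try:
    x
    leaked = True
except NameError:
    leaked = False
```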
I'd love to see that (mis)feature removed someday. I'd love to have that made possible by Guido's _immediately_ and _officially_ declaring it to be an unsupported (and deprecated) feature. Then maybe *someday* we could get rid of them. Even now, people are writing code that (ab)uses this, and making it ever harder to ever change this in the future. -- Michael Chermside From guido at python.org Mon Oct 20 14:08:07 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:08:22 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 13:52:30 EDT." <20031020175230.GA7307@panix.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> Message-ID: <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> > What I remember you saying was that it was an unfortunate but necessary > consequence so that it would work the same as > > L = [] > for x in R: > L.append(x) > print x > > You didn't want to have different semantics for two such similar > constructs ("there's only one way"). You also didn't want to push a > stack frame for listcomps. Then I guess I *have* changed my mind. I guess I didn't think of the renaming solution way back when. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:15:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:15:33 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Mon, 20 Oct 2003 17:44:45 +1300." <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> Message-ID: <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> > Most of us seem to agree that having list comprehensions > available as a replacement for map() and filter() is a good > thing. But what about reduce()? 
Are there equally strong > reasons for wanting an alternative to that, too? If not, > why not? If anything, the desire there is *more* pressing. Except for operator.add, expressions involving reduce() are notoriously hard to understand (except to experienced APL or Scheme hackers :-). Things like sum, max, average etc. are expressed very elegantly with iterator comprehensions. I think the question is more one of frequency of use. List comps have nothing over e.g. result = [] for x in S: result.append(x**2) except compactness of exprssion. How frequent is result = 0.0 for x in S: result += x**2 ??? (I've already said my -1 about your 'sum of ...' proposal.) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:22:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:22:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Sun, 19 Oct 2003 21:40:42 +0200." <200310192140.43084.aleaxit@yahoo.com> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> <200310192140.43084.aleaxit@yahoo.com> Message-ID: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> > Or maybe, like in dict.fromkeys, we don't want to emphasize > either the building or the newness, but then I wouldn't know what > to suggest except the list.sorted that's already drawn catcalls > (though it drew them when it was proposed as an instance > methods of lists -- maybe as a classmethod it will look better?-) list.sorted as a list factory looks fine to me. Maybe whoever pointed out the problem with l.sorted() vs. l.sort() for non-native-English speakers can shed some light on how list.sorted(x) fares compared to x.sort()? But the argument that it wastes a copy still stands (even though that's only O(N) vs. O(N log N) for the sort). 
> I want the functionality -- any sensible name that might let the > functionality into the standard library would be ok by me (so > would one putting the functionality in as a builtin or as an instance > method of lists, actually, but I _do_ believe those would not be > the best places for this functionality, by far). I hope the "tools > package" idea and/or the classmethod one find favour...!-) I'm still unclear why this is so important to have in the library when you can write it yourself in two lines. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 14:23:15 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 14:23:24 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Mon, 20 Oct 2003 11:06:52 PDT." <1066673212.3f94243c2c03c@mcherm.com> References: <1066673212.3f94243c2c03c@mcherm.com> Message-ID: <200310201823.h9KINF921648@12-236-54-216.client.attbi.com> > Alex: > > We are indeed sure (sadly) that list comprehensions leak control > > variable names. > > Guido: > > But they shouldn't. It can be fixed by renaming them (e.g. numeric > > names with a leading dot). > > Aahz: > > ?!?! When listcomps were introduced, you were strongly against [...] > > Are you changing your position[...]? > > Guido: > > Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? > > Well, I don't care much about the history of what you may have said... > let's get it out in the open: The fact that listcomps leak their > variable (thus providing a handy name-binding expression for the evil-minded > among us) is a BAD THING. > > I'd love to see that (mis)feature removed someday. I'd love to have that > made possible by Guido's _immediately_ and _officially_ declaring it to be > an unsupported (and deprecated) feature. Make it so. > Then maybe *someday* we could
Even now, people are writing code that (ab)uses this, > and making it ever harder to ever change this in the future. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Mon Oct 20 14:31:24 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 14:31:22 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031020141447.021bc7b0@telecommunity.com> At 10:08 AM 10/20/03 -0700, Guido van Rossum wrote: >I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly >because I expect that it will have such a big impact on coding >practices. I want to have a better feel for what that impact is and >whether it is altogether healthy. IOW I'm a bit worried that >adaptation might become too attractive of a hammer for all sorts of >problems, whether or not there are better-suited solutions. FWIW, it occurred to me recently that other languages/systems (e.g CLOS, Dylan) solve the problems that adaptation solves by using generic functions. So, by analogy, one could simply ask whether generic functions are too attractive a hammer in those types of languages. :) The other comparison that might be made is to downcast operations in e.g. Java, or conversion constructors (is that the right name?) in C++. In some ways, adaptation seems more Pythonic to me than generic functions, because it results in objects that support an interface. To do the same with generic functions, one would have to curry in the "self". OTOH, generic functions in CLOS and Dylan support multiple dispatch, which is certainly better for implementing binary (or N-ary) operations. So there are tradeoffs either way. Sometimes, when I define an interface with just one method in it, it looks like it would be cleaner as a generic function. 
But when there's more than one method, I tend to prefer interface+adaptation. I don't have a generic function implementation I'm happy with at present, though, so I stick with adaptation for now. One other issue with generic functions is that languages with generic functions usually have open type systems that allow e.g. union types or predicate types. Python doesn't have that, so it's hard to e.g. "adapt from one interface to another" with generic functions. It can be done, certainly, it's just hard to do it declaratively in a manner open to extension. From skip at pobox.com Mon Oct 20 14:32:13 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 20 14:32:22 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <16276.10797.81884.996776@montanaro.dyndns.org> >> ?!?! When listcomps were introduced, you were strongly against any >> changes that would make it difficult to switch back and forth between >> a listcomp and its corresponding equivalent for loop. Guido> I don't recall what I said then. Did I say it was a feature that Guido> L = [x for x in R] Guido> print x Guido> would print the last item of R? I suspect the lack of a PEP at the time list comprehensions were added to the language allowed this to slip through. PEP 202 was mostly written after list comprehensions were checked into CVS I think (opened 2000-07-13, marked final 2001-08-14, yes 2001!). At just 84 lines it's one of the shortest PEPs. The patch I opened on SF (#400654, opened 2000-06-28, closed 2000-08-14) was essentially Greg Ewing's experimental patch, which relied heavily on the existing for loop code generation. Had there been a PEP with the usual fanfare, I suspect we'd have caught (or at least considered) variable leakage, and perhaps suppressed it. 
I don't recall the topic ever coming up until after list comps were part of the language. It certainly seems to be the most controversial aspect, after one accepts the idea of adding them to the language. Missing such an obvious point of contention is perhaps one of the strongest arguments for the current PEP process. Skip From lists at webcrunchers.com Mon Oct 20 14:46:17 2003 From: lists at webcrunchers.com (John D.) Date: Mon Oct 20 14:46:28 2003 Subject: [Python-Dev] dbm bugs? Message-ID: #!/usr/local/bin/python #2003-10-19. Feedback import dbm print """ Python dbm bugs summary: 1. Long strings cause weirdness. 2. Long keys fail without returning error. This demonstrates serious bugs in the Python dbm module. Present in OpenBSD versions 2.2, 2.3, and 2.3.2c1. len(key+string)>61231 results in the item being 'lost', without warning. If the key or string is one character shorter, it is fine. Writing multiple long strings causes unpredictable results (none, some, or all of the items are lost without warning). Curiously, keys of length 57148 return an error, but longer keys fail without warning (sounds like an = instead of a > somewhere). 
""" mdb=dbm.open("mdb","n") print "Writing 1 item to database, but upon reading," k='k' v='X'*61230 #Long string mdb[k]=v mdb.close() md=dbm.open("mdb","r") print "database contains %i items"%len(md.keys()) md.close() From ianb at colorstudy.com Mon Oct 20 15:14:48 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Mon Oct 20 15:14:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> Message-ID: On Monday, October 20, 2003, at 01:22 PM, Guido van Rossum wrote: >> I want the functionality -- any sensible name that might let the >> functionality into the standard library would be ok by me (so >> would one putting the functionality in as a builtin or as an instance >> method of lists, actually, but I _do_ believe those would not be >> the best places for this functionality, by far). I hope the "tools >> package" idea and/or the classmethod one find favour...!-) > > I'm still unclear why this so important to have in the library when > you can write it yourself in two lines. Probably "there should only be one way to do something." It's something that is recreated over and over, mostly the same way but sometimes with slight differences (e.g., copy-and-sort versus sort-in-place). Like dict() growing keyword arguments, a copy/sort method (function, classmethod, whatever) will standardize something that is very commonly reimplemented. Another analogs might be True and False (which before being built into Python may have been spelled true/false, TRUE/FALSE, or just 0/1). These don't add any real features, but they standardize these simplest of idioms. I think I've seen people in this thread say that they've written Big Python Programs, and they didn't have any problem with this -- but this is a feature that's most important for Small Python Programs. Defining a sort() function becomes boilerplate when you write small programs. 
Or alternatively you create some util module that contains these little functions, which becomes like a site.py only somewhat more explicit. A util module feels like boilerplate as well, because it is a module without any conceptual integrity, shared between projects only for convenience, or not shared as it grows organically. "from util import sort" just feels like cruft-hiding, not real modularity. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From guido at python.org Mon Oct 20 15:24:33 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 15:24:47 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 14:14:48 CDT." References: Message-ID: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> > > I'm still unclear why this so important to have in the library when > > you can write it yourself in two lines. > > Probably "there should only be one way to do something." It's > something that is recreated over and over, mostly the same way but > sometimes with slight differences (e.g., copy-and-sort versus > sort-in-place). Like dict() growing keyword arguments, a copy/sort > method (function, classmethod, whatever) will standardize something > that is very commonly reimplemented. Another analogs might be True and > False (which before being built into Python may have been spelled > true/false, TRUE/FALSE, or just 0/1). These don't add any real > features, but they standardize these simplest of idioms. > > I think I've seen people in this thread say that they've written Big > Python Programs, and they didn't have any problem with this -- but this > is a feature that's most important for Small Python Programs. Defining > a sort() function becomes boilerplate when you write small programs. > Or alternatively you create some util module that contains these little > functions, which becomes like a site.py only somewhat more explicit. 
A > util module feels like boilerplate as well, because it is a module > without any conceptual integrity, shared between projects only for > convenience, or not shared as it grows organically. "from util import > sort" just feels like cruft-hiding, not real modularity. That's one of the best ways I've seen this formulated. If Alex's proposal to have list.sorted() as a factory function is acceptable to the non-English-speaking crowd, I think we can settle on that. (Hm, an alternative would be to add a "sort=True" keyword argument to list()...) --Guido van Rossum (home page: http://www.python.org/~guido/) From fdrake at acm.org Mon Oct 20 15:30:59 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Mon Oct 20 15:31:51 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> Message-ID: <16276.14323.383711.943996@grendel.zope.com> Guido van Rossum writes: > (Hm, an alternative would be to add a "sort=True" keyword > argument to list()...) My immediate expectation on seeing that would be that the keyword args for l.sort() would also be present. It feels better to isolate that stuff; keeping list.sorted(...) makes more sense I think. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From Martin.McGreal at anheuser-busch.com Mon Oct 20 15:34:35 2003 From: Martin.McGreal at anheuser-busch.com (McGreal, Martin P.) Date: Mon Oct 20 15:34:43 2003 Subject: [Python-Dev] to the maintainer of python's configure script Message-ID: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> Hello, I need to make these modifications to the configure script every time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume 5.1 as well), so I figured I might as well submit them to you. Everything works fine without my changes except for the readline detection.
To get readline detection to work I must... 1. AIX doesn't have a termcap library, so any reference to -ltermcap must be changed to -lcurses. 2. The prototype in the sample code at line 18237 is different from the prototype in <readline/readline.h>, so it should simply be removed from the sample code. 3. The sample code header doesn't include <stdio.h>, so both it and <readline/readline.h> should be included. 34d33 < $as_unset ENV MAIL MAILPATH 18222c18221 < LIBS="-lreadline -ltermcap $LIBS" --- > LIBS="-lreadline -lcurses $LIBS" 18225a18225,18226 > #include <stdio.h> > #include <readline/readline.h> 18237d18237 < char rl_pre_input_hook (); 18286c18286 < LIBS="-lreadline -ltermcap $LIBS" --- > LIBS="-lreadline -lcurses $LIBS" 18929d18928 < $as_unset ENV MAIL MAILPATH My configure command is ../configure -C --includedir=/usr/local/include --with-libs=-L/usr/local/lib --disable-ipv6 --with-threads My readline is version 4.3, and is installed under /usr/local: # find /usr/local/include -type f |egrep "readline|history" /usr/local/include/readline/chardefs.h /usr/local/include/readline/history.h /usr/local/include/readline/keymaps.h /usr/local/include/readline/readline.h /usr/local/include/readline/rlconf.h /usr/local/include/readline/rlstdc.h /usr/local/include/readline/rltypedefs.h /usr/local/include/readline/tilde.h # find /usr/local/lib -type f |egrep "readline|history" /usr/local/lib/libhistory.a /usr/local/lib/libreadline.a If I do not make the changes in the configure script for the readline checks, the following errors are produced: [rl_pre_input_hook check before changing -ltermcap to -lcurses]: configure:18215: checking for rl_pre_input_hook in -lreadline configure:18246: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -ltermcap -L/usr/local/lib -ldl >&5 ld: 0706-006 Cannot find or open library file: -l termcap ld:open(): No such file or directory [rl_completion_matches check before changing -ltermcap to -lcurses] configure:18279: checking for rl_completion_matches in -lreadline configure:18310: cc_r -o conftest -g
-I/usr/local/include conftest.c -lreadline -ltermcap -L/usr/local/lib -ldl >&5 ld: 0706-006 Cannot find or open library file: -l termcap ld:open(): No such file or directory configure:18313: $? = 255 [rl_pre_input_hook check after changing -ltermcap to -lcurses]: configure:18215: checking for rl_pre_input_hook in -lreadline configure:18246: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -lcurses -L/usr/local/lib -ldl >&5 ld: 0711-317 ERROR: Undefined symbol: .rl_pre_input_hook ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information. configure:18249: $? = 8 [rl_completion_matches check is ok after changing -ltermcap to -lcurses] [rl_pre_input_hook check after adding <stdio.h> and <readline/readline.h> to example code] configure:18215: checking for rl_pre_input_hook in -lreadline configure:18249: cc_r -o conftest -g -I/usr/local/include conftest.c -lreadline -lcurses -L/usr/local/lib -ldl >&5 "configure", line 18233.9: 1506-236 (W) Macro name _ALL_SOURCE has been redefined. "configure", line 18233.9: 1506-358 (I) "_ALL_SOURCE" is defined on line 129 of /usr/include/standards.h. "configure", line 18432.6: 1506-343 (S) Redeclaration of rl_pre_input_hook differs from previous declaration on line 526 of "/usr/local/include/readline/readline.h". "configure", line 18432.6: 1506-382 (I) The type "unsigned char()" of identifier rl_pre_input_hook differs from previous type "int(*)()". configure:18252: $? = 1 [rl_pre_input_hook check is ok after deleting redeclaration of rl_pre_input_hook] This output was produced on an H50 running AIX 5.2 ML1. The same output can be produced on AIX 4.3.3 ML11 (tested on an S7A), except that there is a libtermcap, so -ltermcap doesn't have to be changed to -lcurses (and consequently, the rl_completion_matches check goes right the first time).
And on 4.3.3 for some reason the --includedir=/usr/local/include doesn't work so instead I had to use CPPFLAGS="-I/usr/local/include" ./configure -C --with-libs=-L/usr/local/lib --disable-ipv6 --with-threads Thanks! Martin McGreal PS: I also remove the unsetting of ENV from lines 34 and 18929 because on our systems ENV is readonly, which makes the configure script choke. From pje at telecommunity.com Mon Oct 20 15:42:20 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Oct 20 15:42:29 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: Message-ID: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> At 12:24 PM 10/20/03 -0700, Guido van Rossum wrote: > > > > [Ian cites "preferably only one obvious way to do it" to justify a sort > idiom] > >That's one of the best ways I've seen this formulated. Does this extend by analogy to other requests for short functions that are commonly reimplemented? Not that any spring to mind at the moment; it just seems to me that inline sorting is one of a set of perennially requested such functions or methods, where the current standard answer is "but you can do it yourself in only X lines!". >If Alex's proposal to have list.sorted() as a factory function is >acceptable to the non-English-speaking crowd, I think we can settle on >that. (Hm, an alternative would be to add a "sort=True" keyword >argument to list()...) Wouldn't it need to grow key and cmpfunc, too? 
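[Presumably yes: a factory would forward the same keyword arguments as sort(). A minimal sketch of the behavior under discussion, spelled as a plain function since list.sorted() did not exist at the time; note that the key and reverse keywords themselves only reached list.sort() in Python 2.4, and this is essentially what later shipped as the sorted() builtin.]

```python
def list_sorted(iterable, key=None, reverse=False):
    # accept any iterable, return a new sorted list; the input
    # (if it is a list) is left untouched
    result = list(iterable)
    result.sort(key=key, reverse=reverse)
    return result

print(list_sorted("cab"))                    # ['a', 'b', 'c']
print(list_sorted([3, 1, 2], reverse=True))  # [3, 2, 1]
```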
From nas-python at python.ca Mon Oct 20 15:51:38 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Mon Oct 20 15:50:40 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> References: <200310201924.h9KJOXc21813@12-236-54-216.client.attbi.com> Message-ID: <20031020195138.GA30478@mems-exchange.org> On Mon, Oct 20, 2003 at 12:24:33PM -0700, Guido van Rossum wrote: > (Hm, an alternative would be to add a "sort=True" keyword argument > to list()...) Yuck. -1. Neil From martin at v.loewis.de Mon Oct 20 16:14:25 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:15:11 2003 Subject: [Python-Dev] New warnings in _sre.c In-Reply-To: References: Message-ID: "Tim Peters" writes: > MSVC complains when a signed int is compared to an unsigned int. I'm glad > it does, because the compiler silently casts the signed int to unsigned, > which doesn't do what the author probably intended if the signed int is less > than 0: FWIW, gcc complains about the same thing. Regards, Martin From martin at v.loewis.de Mon Oct 20 16:15:46 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:17:07 2003 Subject: [Python-Dev] dbm bugs? In-Reply-To: References: Message-ID: "John D." writes: > #!/usr/local/bin/python > #2003-10-19. Feedback Can you please submit bug report to sf.net/projects/python? Thanks, Martin From guido at python.org Mon Oct 20 16:17:23 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 16:17:31 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 15:42:20 EDT." <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> References: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> Message-ID: <200310202017.h9KKHNU21889@12-236-54-216.client.attbi.com> > >That's one of the best ways I've seen this formulated. 
> > Does this extend by analogy to other requests for short functions > that are commonly reimplemented? Not that any spring to mind at the > moment; it just seems to me that inline sorting is one of a set of > perennially requested such functions or methods, where the current > standard answer is "but you can do it yourself in only X lines!". Only if there's some quirk to reimplementing them correctly, and only if the need is truly common. Most recently we did this for sum(). > >If Alex's proposal to have list.sorted() as a factory function is > >acceptable to the non-English-speaking crowd, I think we can settle on > >that. (Hm, an alternative would be to add a "sort=True" keyword > >argument to list()...) > > Wouldn't it need to grow key and cmpfunc, too? Yes, but list.sorted() would have to support these too. It might become slightly inelegant because we'd probably have to say that sorted defaults to False except it defaults to True if either of cmp and key is specified. Note that reverse=True would not imply sorting, so that list(range(10), reverse=True) would yield [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] But Raymond has a different proposal in mind for that (he still needs to update PEP 322 though). So maybe list.sorted() is better because it doesn't lend itself to such generalizations (mostly because of the TOOWTDI rule). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Mon Oct 20 16:19:14 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Mon Oct 20 16:19:43 2003 Subject: [Python-Dev] to the maintainer of python's configure script In-Reply-To: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> References: <09C096BBD0CB2244B0B176300BEDD65A098DF2@STLEXGUSR32.abc.corp.anheuser-busch.com> Message-ID: "McGreal, Martin P."
writes: > I need to make these modifications to the configure script every > time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume > 5.1 as well), so I figured I might as well submit them to you. Dear Martin, Please understand that the patches are likely ignored if sent to python-dev. Instead, please submit them to sf.net/projects/python. It would be good if a) you could send unified (-u) or context (-c) diffs, and b) the patches would be generally applicable to all systems, or, if this is not feasible, c) patches specific to AIX would not harm operation of other systems Regards, Martin From Martin.McGreal at anheuser-busch.com Mon Oct 20 16:21:15 2003 From: Martin.McGreal at anheuser-busch.com (McGreal, Martin P.) Date: Mon Oct 20 16:21:30 2003 Subject: [Python-Dev] to the maintainer of python's configure script Message-ID: <09C096BBD0CB2244B0B176300BEDD65A029141E5@STLEXGUSR32.abc.corp.anheuser-busch.com> Ok, will do. Thanks! -----Original Message----- From: Martin v. Löwis [mailto:martin@v.loewis.de] Sent: Monday, October 20, 2003 3:19 PM To: python-dev@python.org Cc: McGreal, Martin P. Subject: Re: [Python-Dev] to the maintainer of python's configure script "McGreal, Martin P." writes: > I need to make these modifications to the configure script every > time I compile Python on AIX (both AIX 4.3.3 and 5.2 -- so I assume > 5.1 as well), so I figured I might as well submit them to you. Dear Martin, Please understand that the patches are likely ignored if sent to python-dev. Instead, please submit them to sf.net/projects/python.
It would be good if a) you could send unified (-u) or context (-c) diffs, and b) the patches would be generally applicable to all systems, or, if this is not feasible, c) patches specific to AIX would not harm operation of other systems Regards, Martin From marktrussell at btopenworld.com Mon Oct 20 17:11:02 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Mon Oct 20 17:13:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> References: <004701c39665$bd6ff440$e841fea9@oemcomputer> <200310192140.43084.aleaxit@yahoo.com> <200310201822.h9KIMMX21628@12-236-54-216.client.attbi.com> Message-ID: <1066684262.17163.24.camel@straylight> On Mon, 2003-10-20 at 19:22, Guido van Rossum wrote: > But the argument that it wastes a copy still stands (even though > that's only O(N) vs. O(N log N) for the sort). That would be irrelevant in most of the cases where I would use it - typically sorting short lists or dicts where the overhead is unmeasurable. > I'm still unclear why this is so important to have in the library when > you can write it yourself in two lines. For little standalone scripts it gets a bit tedious to write this again and again. It doesn't take much code to write dict.fromkeys() manually, but I'm glad that it's there. I'd say list.sorted (or whatever it gets called) has at least as much claim to exist. Mark Russell From tdelaney at avaya.com Mon Oct 20 17:34:20 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:34:29 2003 Subject: [Python-Dev] SRE recursion removed Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF59@au3010avexu1.global.avaya.com> > From: "Martin v.
Löwis" [mailto:martin@v.loewis.de] > > Delaney, Timothy C (Timothy) wrote: > > > Perhaps a comment that the patch won't be accepted until > the dead code > > has been removed, but that the dead code is there for ease > of regression > > testing during the initial testing period? > > OTOH, the patch has been already committed to CVS head. So it > is already accepted. True. Too many different bug tracking and source control systems ... I think it would be very useful (and important) to document this requirement though - perhaps a separate bug report, with a comment on the patch pointing to it? Tim Delaney From tdelaney at avaya.com Mon Oct 20 17:35:48 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:35:56 2003 Subject: [Python-Dev] RE: modules for builtin types (was Re: copysort patch) Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5B@au3010avexu1.global.avaya.com> > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > Which is actually "sets" (lowercase leading s). You're right ... I had a brain fart, thinking we used:

    from Sets import set

but of course it's:

    from sets import Set

Damn. There goes a beautifully-crafted proposal ;) Tim Delaney From python at rcn.com Mon Oct 20 17:43:59 2003 From: python at rcn.com (Raymond Hettinger) Date: Mon Oct 20 17:44:47 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <5.1.1.6.0.20031020153611.01f00140@telecommunity.com> Message-ID: <000301c39753$45a18980$e841fea9@oemcomputer> Let's see what the use cases look like under the various proposals:

todo = [t for t in tasks.copysort() if due_today(t)]
todo = [t for t in list.sorted(tasks) if due_today(t)]
todo = [t for t in list(tasks, sorted=True) if due_today(t)]

genhistory(date, events.copysort(key=incidenttime))
genhistory(date, list.sorted(events, key=incidenttime))
genhistory(date, list(events, sorted=True, key=incidenttime))

for f in os.listdir().copysort(): . . .
for f in list.sorted(os.listdir()): . . .
for f in list(os.listdir(), sorted=True): . . . To my eye, the first form reads much better in every case. It still needs a better name though. [Phillip J. Eby in a separate note] > Wouldn't it need to grow key and cmpfunc, too? Now that "key" and "reverse" are available, there is no need for "cmp" in any new methods. [Guido in a separate note] > list(range(10), reverse=True) > >would yield > > [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] > > But Raymond has a different proposal in mind for that (he still needs to > > update PEP 322 though). I'll get to it soon; there won't be any surprises. Raymond Hettinger From tdelaney at avaya.com Mon Oct 20 17:55:05 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 17:55:13 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > I'd love to see that (mis)feature removed someday. I'd love > to have that > > made possible by Guido's _immediately_ and _officially_ > declaring it to be > > an unsupported (and deprecated) feature. > > Make it so. Should someone raise a bug report against the docs for this then?
Tim Delaney From aleaxit at yahoo.com Mon Oct 20 17:56:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 17:56:36 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> References: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <200310202356.30050.aleaxit@yahoo.com> On Monday 20 October 2003 11:43 pm, Raymond Hettinger wrote: > Let's see what the use cases look like under the various proposals: > > todo = [t for t in tasks.copysort() if due_today(t)] > todo = [t for t in list.sorted(tasks) if due_today(t)] > todo = [t for t in list(tasks, sorted=True) if due_today(t)] > > genhistory(date, events.copysort(key=incidenttime)) > genhistory(date, list.sorted(events, key=incidenttime)) > genhistory(date, list(events, sorted=True, key=incidenttime)) > > for f in os.listdir().copysort(): . . . > for f in list.sorted(os.listdir()): . . . > for f in list(os.listdir(), sorted=True): . . . > > To my eye, the first form reads much better in every case. > It still needs a better name though. You're forgetting the cases in which (e.g.) tasks is not necessarily a list, but any finite sequence (iterable or iterator). Then. e.g. the first job becomes: todo = [t for t in list(tasks).copysort() if due_today(t)] todo = [t for t in list.sorted(tasks) if due_today(t)] todo = [t for t in list(tasks, sorted=True) if due_today(t)] and I think you'll agree that the first construct isn't that good then (quite apart from the probably negligible overhead of an unneeded copy -- still, we HAVE determined that said small overhead needs to be paid sometimes, and needing to code list(x).copysort() when x is not a list or you don't KNOW if x is a list adds one copy then). > [Phillip J. Eby in a separate note] > > > Wouldn't it need to grow key and cmpfunc, too? > > Now, that "key" and "reverse" are available, > there is no need for "cmp" in any new methods. 
Sorry, but much as I dislike cmpfunc it's still opportune at times, e.g. I'd rather code: def Aup_Bdown(x, y): return cmp(x.A, y.A) or cmp(y.B, x.B) for a in list.sorted(foo, cmp=Aup_Bdown): ... than for a in list.sorted( list.sorted(foo, key=lambda x:x.B, reverse=True), key=lambda x: x.A): ... or even for a in list(foo).copysort(key=lambda x:x.B, reverse=True ).copysort(key=lambda x: x.A): ... Alex From marktrussell at btopenworld.com Mon Oct 20 18:04:12 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Mon Oct 20 18:06:49 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> References: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <1066687451.16391.45.camel@straylight> On Mon, 2003-10-20 at 22:43, Raymond Hettinger wrote: > Let's see what the use cases look like under the various proposals: > > [1] todo = [t for t in tasks.copysort() if due_today(t)] > [2] todo = [t for t in list.sorted(tasks) if due_today(t)] > [3] todo = [t for t in list(tasks, sorted=True) if due_today(t)] Well, #3 is (I hope) a non-starter, given the need for the extra sort keyword arguments. And the instance method is less capable - it can't sort a non-list iterable (except via list(xxx).copysort()). So I would definitely prefer #2, especially as I would tend to put: sort = list.sorted at the top of my modules where needed. Then I'd have: todo = [t for t in sort(tasks) if due_today(t)] genhistory(date, sort(events, key=incidenttime)) for f in sort(os.listdir()): . . . which to me looks enough like pseudocode that I'm happy. This might seem like an argument for having sort() as a builtin, but I think it's still better as a list constructor. Adding "sort = list.sorted" to the modules that need it is a small price to pay in boilerplate for the big win of not cluttering the builtin namespace. 
Mark Russell From aleaxit at yahoo.com Mon Oct 20 18:09:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:09:43 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> Message-ID: <200310210009.39256.aleaxit@yahoo.com> On Monday 20 October 2003 07:43 pm, Guido van Rossum wrote: ... > I'm not sure why you say it's separate from cloning; it seems to me > that copy.copy(iter(range(10))) should return *exactly* what we'd want > the proposed clone operation to return. I'd be tickled pink if it did, but I would have expected a shallow copy to return an iterator that's not necessarily independent from the starting one. Maybe I have a bad mental model of the "depth" (indirectness) of iterators? > I see this as a plea to add __copy__ and __deepcopy__ methods to all > standard iterators for which it makes sense. (Or maybe only __copy__ > -- I'm not sure what value __deepcopy__ would add.) Hmmm, copy the underlying sequence as well? Don't have any use case for it, but that's what would feel unsurprising to me (as I have already mentioned I may not have the right mental model of an iterator...). > I find this a reasonable request for the iterators belonging to > standard containers (list, tuple, dict). I guess that some of the > iterators in itertools might also support this easily. Perhaps this > would be the road to supporting iterator cloning? It would surely do a lot to let me clone things, yes, and in fact doing it with the existing __copy__ protocol sounds much better than sprouting a new one. (Thanks for the confirmations and clarifications on file internals. btw, any news of the new experimental file things you were playing with back at PythonUK...?)
Alex From aleaxit at yahoo.com Mon Oct 20 18:15:23 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:15:28 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> Message-ID: <200310210015.23591.aleaxit@yahoo.com> On Monday 20 October 2003 07:21 pm, Guido van Rossum wrote: ... > 'average' or 'sum'. Whether there is an actual gain in speed depends > on how large the list is. You should be able to time examples like > > sum([x*x for x in R]) > > vs. > > def gen(R): > for x in R: > yield x*x > sum(gen(R)) > > for various lengths of R. (The latter would be a good indication of > how fast an iterator generator could run.)

with a.py having:

def asum(R):
    sum([ x*x for x in R ])

def gen(R):
    for x in R: yield x*x
def gsum(R, gen=gen):
    sum(gen(R))

I measure:

[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)'
10000 loops, best of 3: 96 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)'
10000 loops, best of 3: 60 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)'
1000 loops, best of 3: 930 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)'
1000 loops, best of 3: 590 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)'
100 loops, best of 3: 1.28e+04 usec per loop
[alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)'
100 loops, best of 3: 8.4e+03 usec per loop

not sure why gsum's advantage ratio over asum seems to be roughly constant, but, this IS what I measure!-)

Alex From aleaxit at yahoo.com Mon Oct 20 18:24:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 20 18:24:09 2003 Subject: [Python-Dev]
Re: Reiterability In-Reply-To: <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> References: <16E1010E4581B049ABC51D4975CEDB8802C098E6@UKDCX001.uk.int.atosorigin.com> <200310201517.07902.aleaxit@yahoo.com> <200310201708.h9KH8jF21377@12-236-54-216.client.attbi.com> Message-ID: <200310210024.04660.aleaxit@yahoo.com> On Monday 20 October 2003 07:08 pm, Guido van Rossum wrote:

> > Darn -- one more underground attempt to foist adaptation into Python
> > foiled by premature discovery... must learn to phrase things less
> > overtly, the people around here are too clever!!!
> >
> :-)
>
> I'm all for adaptation, I'm just hesitant to adapt it wholeheartedly
> because I expect that it will have such a big impact on coding
> practices. I want to have a better feel for what that impact is and
> whether it is altogether healthy. IOW I'm a bit worried that

Wise as usual. I suspect adaptation should enter Python when interfaces or protocols or however we wanna call them do, and I remember your explanations about wanting to see real-world experience with that stuff, because there will be ONE chance to get them into Python "right".

> adaptation might become too attractive of a hammer for all sorts of
> problems, whether or not there are better-suited solutions.

Well, OO has that problem too -- I see people (mostly coming from Java:-) STARTING with designing a class, by reflex, even when a couple of functions are more suitable. It generally doesn't take ALL that much to wean them from such "premature complexity" if they work with some non-OObsessed Pythonistas. Protocol adaptation is "an attractive hammer" much like OO is, without the further issue of there being very popular "protocol adaptation oriented languages" around:-), so I don't think the worry is really justified.
I've seen another poster use a similar analogy with generic functions and multimethods (which btw we DO have in pypy as an implementation strategy, see http://codespeak.net/ and browse or download at will), and perhaps that's equally suitable too. Alex From tjreedy at udel.edu Mon Oct 20 18:42:18 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 20 18:42:26 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com><200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <20031020143056.GE28665@frobozz> Message-ID: "Andrew Bennetts" wrote in message ... > I think the lazy iteration syntax approach was probably a better idea. I > don't like the proposed use of "yield" to signify it, though -- "yield" is a > flow control statement, so the examples using it in this thread look odd to > me. Same here. > Perhaps it would be best to simply use the keyword "lazy" -- after all, > that's the key distinguishing feature. I think my preferred syntax would > be: > > sum([lazy x*x for x in sequence]) I like this the best of suggestions so far. Easy to understand, easy to teach: [lazy ...] = iter([...]) but produced more efficiently > But use of parens instead of brackets, and/or a colon to make the keyword > stand out (and look reminisicent to a lambda! which *is* a related concept, > in a way -- it also defers evaluation), e.g.: > > sum((lazy: x*x for x in sequence)) I prefer sticking with [...] for 'make a (possibly virtual) list'. Having removed ':' when abbreviating _[] for i in seq: _.append[expr] as an expression, it seems odd to bring it back for a special case. I wish ':' could have also been removed from the lambda abbreviation of def. Terry J. 
Reedy From tjreedy at udel.edu Mon Oct 20 18:47:49 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 20 18:47:54 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310201815.h9KIFM821583@12-236-54-216.client.attbi.com... > > Most of us seem to agree that having list comprehensions > > available as a replacement for map() and filter() is a good > > thing. But what about reduce()? Are there equally strong > > reasons for wanting an alternative to that, too? If not, > > why not? > > If anything, the desire there is *more* pressing. Except for > operator.add, expressions involving reduce() are notoriously hard to > understand (except to experienced APL or Scheme hackers :-). > > Things like sum, max, average etc. are expressed very elegantly with > iterator comprehensions. > > I think the question is more one of frequency of use. List comps have > nothing over e.g. > > result = [] > for x in S: > result.append(x**2) > > except compactness of expression. How frequent is > > result = 0.0 > for x in S: > result += x**2 > > ??? > > (I've already said my -1 about your 'sum of ...' proposal.) > > --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 18:49:41 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 18:49:50 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Mon, 20 Oct 2003 23:04:12 BST."
<1066687451.16391.45.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <1066687451.16391.45.camel@straylight> Message-ID: <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> > I would tend to put: > > sort = list.sorted > > at the top of my modules where needed. Really? That would seem to just obfuscate things for the reader (who would have to scroll back potentially many pages to find the one-line definition of sort). Why be so keen on saving 7 keystrokes? How many calls to list.sorted do you expect to have in your average module? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 20 18:51:22 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 18:51:32 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 07:55:05 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFEF5F@au3010avexu1.global.avaya.com> Message-ID: <200310202251.h9KMpMn22142@12-236-54-216.client.attbi.com> > > From: Guido van Rossum [mailto:guido@python.org] > > > > > I'd love to see that (mis)feature removed someday. I'd love to > > > have that made possible by Guido's _immediately_ and > > > _officially_ declaring it to be an unsupported (and deprecated) > > > feature. > > > > Make it so. > > Should someone raise a bug report against the docs for this then? Please do (unless you can check the fix in yourself :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Mon Oct 20 19:44:48 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 20 19:44:55 2003 Subject: [Python-Dev] listcomps vs. 
for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFEFBE@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > > From: Guido van Rossum [mailto:guido@python.org] > > > > > > > I'd love to see that (mis)feature removed someday. I'd love to > > > > have that made possible by Guido's _immediately_ and > > > > _officially_ declaring it to be an unsupported (and deprecated) > > > > feature. > > > > > > Make it so. > > > > Should someone raise a bug report against the docs for this then? > > Please do (unless you can check the fix in yourself :-). Raised request 827209: http://sourceforge.net/tracker/index.php?func=detail&aid=827209&group_id=5470&atid=105470 http://tinyurl.com/ro6g I'd have a go at it, but we're in crunch mode here at the moment - hoping to do a release candidate this week - so I don't have time to set up my environment or anything :( Tim Delaney From FBatista at uniFON.com.ar Mon Oct 20 16:54:55 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Oct 20 21:57:55 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID:

#- Sure, rounding IS best set by function, though you may want more than
#- two (roundForbid to raise exceptions when rounding tries to happen,
#- roundTruncate, etc).

So far, there are 4 different kinds:

roundPlain
roundBanker
roundTruncate
roundForbid

#- class Money:
#-     round = staticmethod(roundWhateverDefault)
#-     precision = someDefaultPrecision
#-     def __init__(self, value, precision=None, round=None):
#-         self.value = value
#-         if precision is not None: self.precision = precision
#-         if round is not None: self.round = round
#-
#- then use self.precision and self.round in all further
#- methods -- they'll
#- correctly go to either the INSTANCE attribute, if
#- specifically set, or
#- the CLASS attribute, if no instance attribute is set. A
#- useful part of
#- how Python works, btw.

Wow!
This is the difference between a python newbie and a python guru, :)

#- I do NOT think any advanced formatting should be part of the
#- responsibilities
#- of class Money itself. I would focus on correct and
#- complete arithmetic with
#- good handling of exact precision and rounding rules: I
#- contend THAT is the
#- really necessary part.

Trimming formatting, adding different types of rounding, and allowing strings with engineering notation. Maybe it is better to build a Decimal class (kind of FixedPoint) easily subclassable to make a Money one. From guido at python.org Mon Oct 20 23:44:30 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 20 23:44:51 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Tue, 21 Oct 2003 00:09:39 +0200." <200310210009.39256.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310200940.43021.aleaxit@yahoo.com> <200310201743.h9KHhPZ21469@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> Message-ID: <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> > On Monday 20 October 2003 07:43 pm, Guido van Rossum wrote: > ... > > I'm not sure why you say it's separate from cloning; it seems to me > > that copy.copy(iter(range(10))) should return *exactly* what we'd want > > the proposed clone operation to return. [Alex] > I'd be tickled pink if it did, but I would have expected a shallow > copy to return an iterator that's not necessarily independent from > the starting one. Maybe I have a bad mental model of the "depth" > (indirectness) of iterators? Hm.
Let's consider a Python implementation of a sequence iterator (I think you've given a similar class before):

class SeqIter:
    def __init__(self, seq, i=0):
        self.seq = seq
        self.i = i
    def __iter__(self):
        return self # Obligatory self-returning __iter__
    def next(self):
        try:
            x = self.seq[self.i]
        except IndexError:
            raise StopIteration
        else:
            self.i += 1
            return x

All we care about really is that this is an instance with two instance variables, seq and i. A shallow copy creates a new instance (with a new __dict__!) with the same two instance variable names, referencing the same two objects. Since i is immutable, the copy/clone is independent from the original iterator; but both reference the same underlying sequence object. Clearly this is the copy semantics that would be expected from a sequence iterator object implemented in C. Ditto for the dict iterator. Now if someone wrote a tree class with a matching iterator class (which might keep a stack of nodes visited in a list), the default copy.copy() semantics might not be right, but such a class could easily provide a __copy__ method that did the right thing (making a shallow copy of the stack). > > I see this as a plea to add __copy__ and __deepcopy__ methods to all > > standard iterators for which it makes sense. (Or maybe only __copy__ > > -- I'm not sure what value __deepcopy__ would add.) > > Hmmm, copy the underlying sequence as well? Don't have any use > case for it, but that's what would feel unsurprising to me (as I have > already mentioned I may not have the right mental model of an > iterator...). Right. I have no use case for this either, although it's close to pickling, and who knows if someday it might be useful to be able to pickle iterators along with their containers. > > I find this a reasonable request for the iterators belonging to > > standard containers (list, tuple, dict). I guess that some of the > > iterators in itertools might also support this easily.
Perhaps this > > would be the road to supporting iterator cloning? > > It would surely do a lot to let me clone things, yes, and in fact > doing it with the existing __copy__ protocol sounds much better > than sprouting a new one. Right, so that's settled. We don't need an iterator cloning protocol, we can just let iterators support __copy__. (There's no C-level slot for this.) > (Thanks for the confirmations and clarifications on file internals. > btw, any news of the new experimental file things you were > playing with back at PythonUK...?) No; I donated it to pypy and I think it's in their subversion depot. I haven't had time to play with it further. It would be great to do a rewrite from the ground up of the file object without using stdio, but it would be a lot of work to get it right on all platforms; I guess an engineering team of volunteers should be formed to tackle this issue. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 00:09:11 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 00:09:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 00:15:23 +0200." 
<200310210015.23591.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> Message-ID: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> > with a.py having: > def asum(R): > sum([ x*x for x in R ]) > > def gen(R): > for x in R: yield x*x > def gsum(R, gen=gen): > sum(gen(R)) > > I measure: > > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.asum(R)' > 10000 loops, best of 3: 96 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(100)' 'a.gsum(R)' > 10000 loops, best of 3: 60 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.asum(R)' > 1000 loops, best of 3: 930 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(1000)' 'a.gsum(R)' > 1000 loops, best of 3: 590 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.asum(R)' > 100 loops, best of 3: 1.28e+04 usec per loop > [alex@lancelot auto]$ timeit.py -c -s'import a' -s'R=range(10000)' 'a.gsum(R)' > 100 loops, best of 3: 8.4e+03 usec per loop > > not sure why gsum's advantage ratio over asum seems to be roughly > constant, but, this IS what I measure!-) Great! This is a plus for iterator comprehensions (we need a better term BTW). I guess that building up a list using repeated append() calls slows things down more than the frame switching used by generator functions; I knew the latter was fast but this is a pleasant result. BTW, if I use a different function that calculates list() instead of sum(), the generator version is a few percent slower than the list comprehension. But that's because list(a) has a shortcut in case a is a list, while sum(a) always uses PyIter_Next(). So this is actually consistent: despite the huge win of the shortcut, the generator version is barely slower. 
I think the answer lies in the bytecode: >>> def lc(a): return [x for x in a] >>> import dis >>> dis.dis(lc) 2 0 BUILD_LIST 0 3 DUP_TOP 4 LOAD_ATTR 0 (append) 7 STORE_FAST 1 (_[1]) 10 LOAD_FAST 0 (a) 13 GET_ITER >> 14 FOR_ITER 16 (to 33) 17 STORE_FAST 2 (x) 20 LOAD_FAST 1 (_[1]) 23 LOAD_FAST 2 (x) 26 CALL_FUNCTION 1 29 POP_TOP 30 JUMP_ABSOLUTE 14 >> 33 DELETE_FAST 1 (_[1]) 36 RETURN_VALUE 37 LOAD_CONST 0 (None) 40 RETURN_VALUE >>> def gen(a): for x in a: yield x >>> dis.dis(gen) 2 0 SETUP_LOOP 18 (to 21) 3 LOAD_FAST 0 (a) 6 GET_ITER >> 7 FOR_ITER 10 (to 20) 10 STORE_FAST 1 (x) 13 LOAD_FAST 1 (x) 16 YIELD_VALUE 17 JUMP_ABSOLUTE 7 >> 20 POP_BLOCK >> 21 LOAD_CONST 0 (None) 24 RETURN_VALUE >>> The list comprehension executes 7 bytecodes per iteration; the generator version only 5 (this could be more of course if the expression was more complicated than 'x'). The YIELD_VALUE does very little work; falling out of the frame is like falling off a log; and gen_iternext() is pretty sparse code too. On the list comprehension side, calling the list's append method has a bunch of overhead. (Some of which could be avoided if we had a special-purpose opcode which called PyList_Append().) But the executive summary remains: the generator wins because it doesn't have to materialize the whole list. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Tue Oct 21 03:27:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:27:20 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: Message-ID: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> > > sum([lazy x*x for x in sequence]) > > I like this the best of suggestions so far. Easy to understand, easy > to teach: > [lazy ...] = iter([...]) but produced more efficiently -1. An iterator is not a lazy list. A lazy list would support indexing, slicing, etc. while calculating its items on demand. 
An iterator is inherently sequential and single-use -- a different concept. But maybe some other keyword could be added to ease any syntactic problems, such as "all" or "every": sum(all x*x for x in xlist) sum(every x*x for x in xlist) The presence of the extra keyword would then distinguish an iterator comprehension from the innards of a list comprehension. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 21 03:43:06 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:43:15 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> Guido: > But the executive summary remains: the generator wins because it > doesn't have to materialize the whole list. But what would happen if the generator were replaced with in-line code that computes the values and feeds them to an accumulator object, such as might result from an accumulator syntax that gets inline-expanded in the same way as a list comp? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 21 03:43:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 03:43:38 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> References: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> Message-ID: <200310210943.31574.aleaxit@yahoo.com> On Tuesday 21 October 2003 09:27 am, Greg Ewing wrote: ... > But maybe some other keyword could be added to ease any > syntactic problems, such as "all" or "every": > > sum(all x*x for x in xlist) > sum(every x*x for x in xlist) > > The presence of the extra keyword would then distinguish > an iterator comprehension from the innards of a list > comprehension. Heh, you ARE a volcano of cool syntactic ideas these days, Greg. As between them, to me 'all' sort of connotes 'all at once' while 'every' connotes 'one by one' (so would a third possibility, 'each'); so 'all' is the one I like least. Besides accumulators &c we should also think of normal loops: for a in all x*x for x in xlist: ... for a in every x*x for x in xlist: ... for a in each x*x for x in xlist: ... Of these three, 'every' looks best to me, personally. Alex From greg at cosc.canterbury.ac.nz Tue Oct 21 03:55:23 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 03:55:43 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8802C098EB@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310210755.h9L7tNH02963@oma.cosc.canterbury.ac.nz> > Did I miss April 1st? We seem to be discussing the merits of > > f of arg > > as an alternative form of > > f(arg) > > While I'm sure Cobol had some good points, I don't believe that this was one > of them... No, some people were *abusing* my suggested accumulator syntax for things that could have been done more directly using a function call. 
It was not meant to be used for copying or sorting! I may have misled people a bit by using "sum" in one of the examples, since there is currently a function by that name, which wouldn't be directly usable that way. Just to be clear, y = accum of f(x) for x in seq would be equivalent to something like a = accum() for x in seq: a.__consume__(f(x)) y = a.__result__() which, as you can see, is rather more than just a function call. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Tue Oct 21 04:01:29 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 04:01:47 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F93E18D.5010708@iinet.net.au> Message-ID: <200310210801.h9L81Tk02971@oma.cosc.canterbury.ac.nz> > Except, if it was defined such that you wrote: > sum of [x*x for x in the_values] I don't think that would be a good idea, because the square brackets make it look less efficient than it really is, and leave you wondering why you shouldn't just write a function call with a listcomp as argument instead. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Tue Oct 21 05:13:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 05:13:47 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310211113.41658.aleaxit@yahoo.com> On Monday 20 October 2003 10:54 pm, Batista, Facundo wrote: ... 
> Triming formatting, adding different types of rounding, and allowing > strings with engineering notation. Maybe is better to build a Decimal class > (kind of FixedPoint) easily subclassable to make a Money one. Sure, arithmetic (including rounding) is what we most need. If we call it Decimal or whatever, that may be preferable to Money, I don't know. Alex From aleaxit at yahoo.com Tue Oct 21 06:02:26 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 06:02:35 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> Message-ID: <200310211202.26677.aleaxit@yahoo.com> On Tuesday 21 October 2003 05:44, Guido van Rossum wrote: ... > All we care about really is that this is an instance with two instance > variables, seq and i. A shallow copy creates a new instance (with a > new __dict__!) with the same two instance variable names, referencing > the same two objects. Since i is immutable, the copy/clone is Ah -- *right*! The index can be taken as IMMUTABLE -- so the fact that the copy is shallow, so gets "the same index object", is a red herring -- as soon as either the copy or the original "increment" the index, they're in fact creating a new index object for themselves while still leaving their brother's index object unchanged. I get it now -- I was thinking too abstractly and generally, in terms of a more general "index" which might be mutable, thus shared (including its changes) after a shallow copy. > Now if someone wrote a tree class with a matching iterator class > (which might keep a stack of nodes visited in a list), the default > copy.copy() semantics might not be right, but such a class could > easily provide a __copy__ method that did the right thing (making a > shallow copy of the stack). 
Yes, if we specify an iter's __copy__ makes an independent iterator, which is surely the most useful semantics for it, then any weird iterator whose index is in fact mutable can copy not-quite-shallowly and offer the same useful semantics. I'm not sure where that leaves generator made iterators, which don't really know which parts of the state in their saved frame are "index", but having them just punt and refuse to copy themselves shallowly might be ok. > > > I see this as a plea to add __copy__ and __deepcopy__ methods to all > > > standard iterators for which it makes sense. (Or maybe only __copy__ > > > -- I'm not sure what value __deepcopy__ would add.) > > > > Hmmm, copy the underlying sequence as well? Don't have any use > > case for it, but that's what would feel unsurprising to me (as I have > > already mentioned I may not have the right mental model of an > > iterator...). > > Right. I have no use case for this either, although it's close to > pickling, and who knows if someday it might be useful to be able to > pickle iterators along with their containers. Sure, it might. Perhaps the typical use case would be one in which an iterator gets deepcopied "incidentally" as part of the deepcopy of some other object which "happens" to hold an iterator; if iterators knew how to deepcopy themselves that would save some work on the part of the other object's author. No huge win, sure. But once the copy gets deep, generator-made iterators should also have no problem actually doing it, and that may be another middle-size win. > > > would be the road to supporting iterator cloning? > > > > It would surely do a lot to let me clone things, yes, and in fact > > doing it with the existing __copy__ protocol sounds much better > > than sprouting a new one. > > Right, so that's settled. We don't need an iterator cloning protocol, > we can just let iterators support __copy__. (There's no C-level slot > for this.) Right. 
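(The point about shallow copies and an immutable index, as a toy Python class rather than the C-level iterator — a hypothetical sketch; the iterator method was spelled next() in the Python of 2003:)

```python
import copy

class seqiter:
    """Toy sequence iterator; copy.copy() yields an independent iterator."""
    def __init__(self, seq):
        self.seq = seq  # shared with copies: only ever read, never mutated
        self.i = 0      # an int is immutable, so rebinding it never leaks to a copy
    def __iter__(self):
        return self
    def __next__(self):  # spelled next() in 2003-era Python
        if self.i >= len(self.seq):
            raise StopIteration
        value = self.seq[self.i]
        self.i += 1
        return value

it = seqiter("abc")
next(it)               # consume 'a'
clone = copy.copy(it)  # default shallow copy: same seq, same (immutable) i
assert next(it) == 'b' and next(clone) == 'b'  # independent from here on
```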
> > (Thanks for the confirmations and clarifications on file internals. > > btw, any news of the new experimental file things you were > > playing with back at PythonUK...?) > > No; I donated it to pypy and I think it's in their subversion depot. > I haven't had time to play with it further. It would be great to do a > rewrite from the ground up of the file object without using stdio, but > it would be a lot of work to get it right on all platforms; I guess an > engineering team of volunteers should be formed to tackle this issue. Right, and doing so as part of pypy is surely right, since it's one of the many things pypy definitely needs to become fully self-hosting. Alex From aleaxit at yahoo.com Tue Oct 21 06:03:49 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 06:03:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> Message-ID: <200310211203.49395.aleaxit@yahoo.com> On Monday 20 October 2003 18:37, Guido van Rossum wrote: ... > > >>> [.2 for .2 in range(3)] > > > > SyntaxError: can't assign to literal > > > > I think I don't understand what you mean. > > I meant that the compiler should rename it. Just like when you use a <> I'm being rather thick these days, I guess. Thanks for clarifying! 
Alex From marktrussell at btopenworld.com Tue Oct 21 06:31:28 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Tue Oct 21 06:34:18 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> References: <000301c39753$45a18980$e841fea9@oemcomputer> <1066687451.16391.45.camel@straylight> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> Message-ID: <1066732288.18847.17.camel@straylight> On Mon, 2003-10-20 at 23:49, Guido van Rossum wrote: > Really? That would seem to just obfuscate things for the reader (who > would have to scroll back potentially many pages to find the one-line > definition of sort). I think most readers would probably be able to guess what for key in sort(d.keys()): would do. If not then it's no worse than a user-defined function. It's also a matter of proportion -- the important thing about the code above is that it's walking over a dictionary. In most of my uses, the sort() is just a detail to ensure reproducible behaviour. In a new language I think you could make a case for the default behaviour for dict iteration to be sorted, with a walk-in-unspecified-order method for the cases where the speed really does matter. Back in the real world, how about: for key, value in d.sort(): (i.e. adding a sort() instance method to dict equivalent to: def sort(d, cmp=None, key=None, reverse=False): l = list(d.items()) l.sort(cmp, key, reverse) return l ). At least there's no question of an in-place sort for dicts! > Why be so keen on saving 7 keystrokes? It's not totally trivial - for me a list comprehension is noticeably less readable when split over more than one line. > How many calls to list.sorted do you expect to have in your average > module? Er, about 0.3 :-) In the project I'm working on, there are 52 sortcopy() calls in 162 modules (about 18K LOC). Not enough to justify a built-in sort(), but enough I think to make list.sorted() worthwhile. 
Mark Russell From aleaxit at yahoo.com Tue Oct 21 07:00:42 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 07:00:59 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066732288.18847.17.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> <1066732288.18847.17.camel@straylight> Message-ID: <200310211300.42187.aleaxit@yahoo.com> On Tuesday 21 October 2003 12:31 pm, Mark Russell wrote: > On Mon, 2003-10-20 at 23:49, Guido van Rossum wrote: > > Really? That would seem to just obfuscate things for the reader (who > > would have to scroll back potentially many pages to find the one-line > > definition of sort). > > I think most readers would probably be able to guess what > > for key in sort(d.keys()): > > would do. If not then it's no worse than a user-defined function. Incidentally, for k in list.sorted(d): will be marginally faster, e.g. (using the copysort I posted here, without The Trick -- it should be just about identical to the list.sorted classmethod): import copysort x = dict.fromkeys(map(str,range(99999))) def looponkeys(x, c=copysort.copysort): for k in c(x.keys()): pass def loopondict(x, c=copysort.copysort): for k in c(x): pass [alex@lancelot ext]$ timeit.py -c -s'import t' 't.loopondict(t.x)' 10 loops, best of 3: 2.84e+05 usec per loop [alex@lancelot ext]$ timeit.py -c -s'import t' 't.looponkeys(t.x)' 10 loops, best of 3: 2.67e+05 usec per loop i.e., about 10% better for this size of list and number of runs (quite a few, eyeball x.keys()...:-). Nothing crucial, of course, but still. Moreover, "list.sorted(d)" and "sort(d.keys())" are the same length, and the former is conceptually simpler (one [explicit] method call, vs one method call and one function call). Of course, if each keystroke count, you may combine both "abbreviations" and just use "sort(d)". > for key, value in d.sort(): > > (i.e. 
adding a sort() instance method to dict equivalent to: Why should multiple data types acquire separate .sort methods with subtly different semantics (one works in-place and returns None, one doesn't mutate the object and returns a list, ...) when there's no real added value wrt ONE classmethod of list...? Particularly with cmp, key, and reverse on each, seems cumbersome to me. Truly, is list.sorted(d.iteritems()) [or d.items() if you'd rather save 4 chars than a small slice of time:-)] SO "unobvious"? I just don't get it. Alex From marktrussell at btopenworld.com Tue Oct 21 07:18:16 2003 From: marktrussell at btopenworld.com (Mark Russell) Date: Tue Oct 21 07:21:04 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310211300.42187.aleaxit@yahoo.com> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310202249.h9KMnfe22122@12-236-54-216.client.attbi.com> <1066732288.18847.17.camel@straylight> <200310211300.42187.aleaxit@yahoo.com> Message-ID: <1066735096.18849.33.camel@straylight> On Tue, 2003-10-21 at 12:00, Alex Martelli wrote: > Why should multiple data types acquire separate .sort methods with > subtly different semantics (one works in-place and returns None, one > doesn't mutate the object and returns a list, ...) when there's no real > added value wrt ONE classmethod of list...? I agree that the different semantics for lists and dicts are a strike against this. The argument for it is that walking over a dictionary in sorted order is (at least to me) a missing idiom in python. Does this never come up when you're teaching the language? I wouldn't advocate adding this to other types (e.g. Set) because they're much less commonly used than dicts, so I don't think there's a danger of a creeping plague of sort methods. Not a big deal though - list.sorted() is the real win. 
Mark Russell PS: I'm really not an anal-retentive keystroke counter :-) From mwh at python.net Tue Oct 21 07:41:00 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 07:41:06 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Mon, 20 Oct 2003 10:48:00 -0700") References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> Message-ID: <2mad7uq0mr.fsf@starship.python.net> Guido van Rossum writes: > I don't recall what I said then. Did I say it was a feature that > > L = [x for x in R] > print x > > would print the last item of R? A problem with such code irrespective of anything else is that it fails when R is empty. Cheers, mwh -- Whaaat? That is the most retarded thing I have seen since, oh, yesterday -- Kaz Kylheku, comp.lang.lisp From aleaxit at yahoo.com Tue Oct 21 07:55:02 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 07:55:57 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066735096.18849.33.camel@straylight> References: <000301c39753$45a18980$e841fea9@oemcomputer> <200310211300.42187.aleaxit@yahoo.com> <1066735096.18849.33.camel@straylight> Message-ID: <200310211355.02326.aleaxit@yahoo.com> On Tuesday 21 October 2003 01:18 pm, Mark Russell wrote: > On Tue, 2003-10-21 at 12:00, Alex Martelli wrote: > > Why should multiple data types acquire separate .sort methods with > > subtly different semantics (one works in-place and returns None, one > > doesn't mutate the object and returns a list, ...) when there's no real > > added value wrt ONE classmethod of list...? > > I agree that the different semantics for lists and dicts are a strike > against this. The argument for it is that walking over a dictionary in > sorted order is (at least to me) a missing idiom in python. Does this It's a frequently used idiom (actually more than one) -- it's not "missing".
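(The "more than one" idiom in question — walking a dict sorted by key versus by value — in the sorted()/key= spelling that Python eventually settled on in 2.4; a sketch, not code from the thread:)

```python
freq = {"the": 12, "cat": 3, "sat": 3, "mat": 1}

# alphabetically, by key
by_key = [(w, freq[w]) for w in sorted(freq)]

# most frequent first, by value -- using the key= construct
by_freq = sorted(freq.items(), key=lambda item: item[1], reverse=True)

print(by_key[0])   # first word alphabetically
print(by_freq[0])  # most frequent word
```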
> never come up when you're teaching the language? Sure, and I have a good time explaining that half the time you want to sort on KEYS and half the time on VALUES. An example I often use is building and displaying a word-frequency index: now it's pretty obvious that you may want to display it just as easily by frequency (most frequent words first) OR alphabetically. The key= construct IS a huge win, btw. I just wish there WAS an easier way to express the TYPICAL keys one wants to use than lambda x: x[N] for some N or lambda x: x.A for some A. getattr and operator.getitem are no use, alas, even when curried, because they take x first:-(. I'd rather not teach lambda (at least surely not early on!) so I'll end up with lots of little def's (whose need had sharply decreased with list comprehensions, as map and filter moved into a corner to gather dust). Ah well. > I wouldn't advocate adding this to other types (e.g. Set) because > they're much less commonly used than dicts, so I don't think there's a Actually, I was thinking of presenting them BEFORE dicts next time I have an opportunity of teaching Python from scratch. They ARE simpler and more fundamental, after all. > danger of a creeping plague of sort methods. Not a big deal though - > list.sorted() is the real win. I concur. > Mark Russell > > PS: I'm really not an anal-retentive keystroke counter :-) OK, sorry for the digs, it just _looked_ that way for a sec;-).
Alex From mwh at python.net Tue Oct 21 07:59:50 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 07:59:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Mon, 20 Oct 2003 09:37:17 -0700") References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> Message-ID: <2m65iipzrd.fsf@starship.python.net> Guido van Rossum writes: >> On Monday 20 October 2003 04:30 pm, Guido van Rossum wrote: >> > > We are indeed sure (sadly) that list comprehensions leak control variable >> > > names. >> > >> > But they shouldn't. It can be fixed by renaming them (e.g. numeric >> > names with a leading dot). >> >> Hmmm, sorry? >> >> >>> [.2 for .2 in range(3)] >> SyntaxError: can't assign to literal >> >> I think I don't understand what you mean. > > I meant that the compiler should rename it. Implementing this might be entertaining. In particular what happens if the iteration variable is a local in the frame anyway? I presume that would inhibit the renaming, but then there's a potentially confusing dichotomy as to whether renaming gets done. Of course you could *always* rename, but then code like def f(x): r = [x+1 for x in range(x)] return r, x becomes even more incomprehensible (and changes in behaviour). And what about horrors like [([x for x in range(10)],x) for x in range(10)] vs: [([x for x in range(10)],y) for y in range(10)] ? I suppose you could make a case for throwing out (or warning about) all these cases at compile time, but that would require significant effort as well (I think). Cheers, mwh -- This song is for anyone ... fuck it. Shut up and listen. 
-- Eminem, "The Way I Am" From ncoghlan at iinet.net.au Tue Oct 21 09:33:20 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 21 09:33:25 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201741.19295.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201601.08440.aleaxit@yahoo.com> <3F93F33C.9070702@iinet.net.au> <200310201741.19295.aleaxit@yahoo.com> Message-ID: <3F9535A0.9060803@iinet.net.au> Alex Martelli strung bits together to say: > On Monday 20 October 2003 04:37 pm, Nick Coghlan wrote: >> for x in sorted_copy of reversed_copy of my_list: > > Ooops -- sorting a reversed copy of my_list is just like sorting my_list... > I think > for x in sorted_copy(reverse=True) of my_list: > ... > (again borrowing brand-new keyword syntax from lists' sort method) is > likely to work better...:-) (slightly OT for this thread, but. . .) I got the impression that: l.sort(reverse=True) was stable w.r.t. items that sort equivalently, while: l.reverse() l.sort() was not. I.e. the "reverse" in the sort arguments refers to reversing the order of the arguments to the comparison operation, rather than to reversing the list. > However, if I had to choose, I would forego this VERY attractive syntax > sugar, and go for Greg's original suggestion -- 'of' for iterator > comprehensions only. Syntax sugar is all very well (at least in this case), > but if it _only_ amounts to a much neater-looking way of doing what is already > quite possible, it's a "more-than-one-way-to-do-itis". Yes - quite pretty, but ultimately confusing, I think (as a few people have pointed out). However, getting back to Greg's original point - that our goal is to find a syntax that does for "reduce" what list comprehensions did for "map" and "filter", I realised last night that this "of" syntax isn't it. The "of" syntax requires us to have an existing special operator to handle the accumulation (e.g. 
sum or max), whereas what reduce does is let us take an existing binary function (e.g. operator.add), and feed it a sequence element-by-element, accumulating the result. If we already have a method that can extract the result we want from a sequence, then list comprehensions and method calls are perfectly adequate. (starts thinking about this from the basics of what the goal is) So what we're really talking about is syntactic sugar for: y = 0 for x in xvalues: if (x > 0): y = y + (x*x) We want to be able to specify the object to iterate over, the condition for which elements to consider (filter), an arbitrary function involving the element (map), and the method we want to use to accumulate the elements (reduce). If we had a list comprehension: squares_of_positives = [x*x for x in xvalues if x > 0] the original unrolled form would have been: squares_of_positives = [] for x in xvalues: if (x > 0): squares_of_positives.append(x*x) So list comprehensions just fix the accumulation method (appending to the result list). So what we need is a way to describe how to accumulate the result, as well as the ability to initialise the cumulative result: y = y + x*x from y = 0 for x in xvalues if x > 0 Yuck. Looks like an assignment, but is actually an accumulation expression. Ah, how about: y + x*x from y = 0 for x in xvalues if x > 0 The 'from' clause identifies the accumulation variable, in just the same way the 'for' clause identifies the name of the current value from the iterable. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions."
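(What the proposed clause abbreviates, next to the reduce() spelling it would sugar — an illustrative sketch, not code from the thread:)

```python
from functools import reduce  # reduce was a builtin in the Python of 2003

xvalues = [3, -1, 4, -1, 5]

# the explicit loop the proposed syntax would abbreviate
y = 0
for x in xvalues:
    if x > 0:
        y = y + x * x

# the same accumulation via reduce over the filtered/mapped sequence
y2 = reduce(lambda acc, x: acc + x * x, [x for x in xvalues if x > 0], 0)

assert y == y2
print(y)  # 9 + 16 + 25 = 50
```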
From ncoghlan at iinet.net.au Tue Oct 21 09:41:39 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 21 09:41:44 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> Message-ID: <3F953793.1000208@iinet.net.au> Guido van Rossum strung bits together to say: > except compactness of expression. How frequent is > > result = 0.0 > for x in S: > result += x**2 > > ??? > > (I've already said my -1 about your 'sum of ...' proposal.) Just so this suggestion doesn't get buried in the part of the thread where I was getting rather carried away about Greg's 'of' syntax (sorry!). What about: result + x**2 from result = 0.0 for x in S Essentially short for: result = 0.0 for x in S: result = result + x**2 Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From skip at pobox.com Tue Oct 21 09:43:41 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 09:43:51 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: References: <200310181627.h9IGRoP09636@12-236-54-216.client.attbi.com> <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <20031020143056.GE28665@frobozz> Message-ID: <16277.14349.844999.220166@montanaro.dyndns.org> Terry> "Andrew Bennetts" wrote in message >> ... I think the lazy iteration syntax approach was probably a better >> idea. I don't like the proposed use of "yield" to signify it, though >> -- >> "yield" is a flow control statement, so the examples using it in this >> thread look odd to me. Terry> Same here. And probably contributed to my initial confusion about what the proposed construct was supposed to do.
(I'm still not keen on it, but at least I understand it better.) Skip From skip at pobox.com Tue Oct 21 09:57:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 09:57:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <16277.15186.392757.583785@montanaro.dyndns.org> >> [Alex measures speed improvements] Guido> Great! This is a plus for iterator comprehensions (we need a Guido> better term BTW). Here's an alternate suggestion. Instead of inventing new syntax, why not change the semantics of list comprehensions to be lazy? They haven't been in use that long, and while they are popular, the semantic tweakage would probably cause minimal disruption. In situations where laziness wasn't wanted, the most that a particular use would have to change (I think) is to pass it to list(). Skip From exarkun at intarweb.us Tue Oct 21 10:28:16 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Tue Oct 21 10:28:36 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <2mad7uq0mr.fsf@starship.python.net> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: <20031021142816.GA25455@intarweb.us> On Tue, Oct 21, 2003 at 12:41:00PM +0100, Michael Hudson wrote: > Guido van Rossum writes: > > > I don't recall what I said then. Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? > > A problem with such code irrespective of anything else is that it > fails when R is empty. > Not when x is properly initialized. 
Anyway, this is no different from the problem of: for x in R: ... print x In any case, are there plans to also have the compiler emit warnings about potential reliance on this feature? Jp -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/31c2666d/attachment.bin From tjreedy at udel.edu Tue Oct 21 10:29:32 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Oct 21 10:29:40 2003 Subject: [Python-Dev] Re: listcomps vs. for loops References: <20031020173134.GA29040@panix.com><200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: "Michael Hudson" wrote in message news:2mad7uq0mr.fsf@starship.python.net... > Guido van Rossum writes: > > > I don't recall what I said then. Did I say it was a feature that > > > > L = [x for x in R] > > print x > > > > would print the last item of R? Someone more-or-less did -- in the tutorial. See bottom below. > A problem with such code irrespective of anything else is that it > fails when R is empty. Same would be true of for loops, except that typical after-for usage, such as searching for item in list, has else clause to set control variable to default in 'not found' cases, which include empty lists. The Ref Manual currently says nothing about leakage or overwriting. That should make leakage fair game for plugging. On the other hand, Tutorial 5.1.4 List Comprehensions says: ''' To make list comprehensions match the behavior of for loops, assignments to the loop variable remain visible outside of the comprehension: >>> x = 100 # this gets overwritten >>> [x**3 for x in range(5)] [0, 1, 8, 27, 64] >>> x # the final value for range(5) 4 ''' (Pointed out by John Roth in response to my c.l.py posting.) I have added note to SF 827209. Terry J. 
Reedy From mwh at python.net Tue Oct 21 10:45:00 2003 From: mwh at python.net (Michael Hudson) Date: Tue Oct 21 10:45:08 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <20031021142816.GA25455@intarweb.us> (Jp Calderone's message of "Tue, 21 Oct 2003 10:28:16 -0400") References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> Message-ID: <2mwuayodjn.fsf@starship.python.net> Jp Calderone writes: > On Tue, Oct 21, 2003 at 12:41:00PM +0100, Michael Hudson wrote: >> Guido van Rossum writes: >> >> > I don't recall what I said then. Did I say it was a feature that >> > >> > L = [x for x in R] >> > print x >> > >> > would print the last item of R? >> >> A problem with such code irrespective of anything else is that it >> fails when R is empty. >> > > Not when x is properly initialized. Obviously. > Anyway, this is no different from the > problem of: > > for x in R: > ... > print x Well, yes. I still think it's dubious code. > In any case, are there plans to also have the compiler emit warnings about > potential reliance on this feature? I would hope that we wouldn't make changes without emitting such a warning. I'm not sure how hard it would be to implement, tho'. (It would be /nice/ to implement a warning whenever there's a possibility of the UnboundLocalError exception, but that *definitely* requires control flow analysis and that is *definitely* a heap of work, unless the ast-branch gets some attention). Cheers, mwh -- We did requirements and task analysis, iterative design, and user testing. You'd almost think programming languages were an interface between people and computers. 
-- Steven Pemberton (one of the designers of Python's direct ancestor ABC) From aleaxit at yahoo.com Tue Oct 21 11:49:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 11:49:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.15186.392757.583785@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> Message-ID: <200310211749.21152.aleaxit@yahoo.com> On Tuesday 21 October 2003 03:57 pm, Skip Montanaro wrote: > >> [Alex measures speed improvements] > > Guido> Great! This is a plus for iterator comprehensions (we need a > Guido> better term BTW). > > Here's an alternate suggestion. Instead of inventing new syntax, why not > change the semantics of list comprehensions to be lazy? They haven't been > in use that long, and while they are popular, the semantic tweakage would > probably cause minimal disruption. In situations where laziness wasn't > wanted, the most that a particular use would have to change (I think) is to > pass it to list(). Well, yes, the _most_ one could ever have to change is move from [ ... ] to list([ ... ]) to get back today's semantics. But any use NOT so changed may break, in general; any perfectly correct program coded with Python 2.1 to Python 2.3 -- several years' worth of "current Python", by the time 2.4 comes out -- might break. I think we should keep the user-observable semantics as now, BUT maybe an optimization IS possible if all the user code does with the LC is loop on it (or anyway just get its iter(...) and nothing else). Perhaps a _variant_ of "The Trick" MIGHT be practicable (since I don't believe the "call from C holding just one ref" IS a real risk here). Again it would be based on reference-count being 1 at a certain point. The LC itself _might_ just build a generator and wrap it in a "pseudolist" object.
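A rough pure-Python sketch of that idea follows; the real proposal lives at the C level (the tp_iter slot plus a reference count of one, which pure Python cannot safely observe), so the class and method names here are purely illustrative:

```python
class PseudoList:
    """Wrap a generator; build the real list only when some operation
    actually needs list behaviour."""

    def __init__(self, gen):
        self._gen = gen
        self._list = None

    def _unfold(self):
        # Materialize once; every later operation reuses the real list.
        if self._list is None:
            self._list = list(self._gen)
        return self._list

    def __iter__(self):
        # The C-level trick would hand out the raw generator only when
        # the refcount proves nobody else holds the wrapper; that test
        # has no safe pure-Python equivalent, so we always unfold.
        return iter(self._unfold())

    def __len__(self):
        return len(self._unfold())

    def __getitem__(self, i):
        return self._unfold()[i]


x = PseudoList(a * a for a in [1, 2, 3, 4])
first = [y for y in x]    # iter(x) must not throw the data away...
second = [z for z in x]   # ...because x is iterated again here
assert first == second == [1, 4, 9, 16]
```

Note the sketch materializes the list on every operation, including iteration; the refcount-of-one test that would let a single iteration stay lazy is exactly the part that cannot be expressed above the C level.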
Said pseudolist object, IF reacting to a tp_iter when its reference count is one, NEED NOT "unfold" itself. But for ANY other operation, it must generate the real list and "get out of the way" as much as possible. Note that this includes a tp_iter WITH rc>1. For example: x = [ a.strip().upper() for a in thefile if len(a)>7 ] for y in x: blah(y) for z in x: bluh(z) the first 'for' implicitly calls iter(x) but that must NOT be allowed to "consume" thefile in a throwaway fashion -- because x can be used again later (e.g. in the 2nd for). This works fine today and has worked for years, and I would NOT like it to break in 2.4... if LC's had been lazy from the start (just as they are in Haskell), that would have been wonderful, but, alas, we didn't have the iterator protocol then...:-( As to whether the optimization is worth this complication, I dunno. I'd rather have "iterator literals", I think -- simpler and more explicit. That way when I see [x.bah() for x in someiterator] I KNOW the iterator is consumed right then and there, I don't need to look at the surrounding context... context-dependent semantics is not Python's most normal and usual approach, after all... Alex From pje at telecommunity.com Tue Oct 21 11:59:19 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 11:59:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.15186.392757.583785@montanaro.dyndns.org> References: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <16276.3995.177704.754136@montanaro.dyndns.org> <200310201721.h9KHLAa21428@12-236-54-216.client.attbi.com> <200310210015.23591.aleaxit@yahoo.com> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> Message-ID: <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> At 08:57 AM 10/21/03 -0500, Skip Montanaro wrote: > >> [Alex measures speed improvements] > > Guido> Great!
This is a plus for iterator comprehensions (we need a > Guido> better term BTW). > >Here's an alternate suggestion. Instead of inventing new syntax, why not >change the semantics of list comprehensions to be lazy? They haven't been >in use that long, and while they are popular, the semantic tweakage would >probably cause minimal disruption. In situations where laziness wasn't >wanted, the most that a particular use would have to change (I think) is to >pass it to list(). If you make it a list that's lazy, it doesn't lose the memory allocation overhead for the list. If I understand Alex's benchmarks, making a lazy list would end up being *slower* than list comprehension is now. I previously proposed a different solution earlier in this thread, where you get a pseudo-list that, if iterated, runs the underlying generator function. But there were issues with possible side-effects (not to mention reiterability) of the underlying iterator on which the comprehension was based. From theller at python.net Tue Oct 21 12:26:13 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 12:26:28 2003 Subject: [Python-Dev] buildin vs. shared modules In-Reply-To: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Fri, 17 Oct 2003 11:40:53 -0700") References: <3F872FE9.9070508@v.loewis.de> <3F8C3DD0.4020400@v.loewis.de> <200310162019.h9GKJli05194@12-236-54-216.client.attbi.com> <200310171804.h9HI4rJ06803@12-236-54-216.client.attbi.com> <4qy7lnuc.fsf@python.net> <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: [about making _socket a builtin module instead of an extension] >> > Long ago, when I first set up the VC5 project, there were still some >> > target systems out there that didn't have a working winsock DLL, and >> > "import socket" or "import select" would fail there for that reason. >> > If this is no longer a problem, I'm +1 on this. >> >> Not on the sytems that I work on. 
To be double sure, _socket could be >> rewritten to load the winsock dll dynamically. And maybe this becomes >> an issue again if IPv6 is compiled in. > > I'd rather not have more Windows-specific cruft in the socket and > select module source code -- they are bad enough already. Dynamically > loading winsock probably would mean that every call into it has to be > coded differently, right? Yes. Yet another approach would be to use the delay_load feature of MSVC, it allows dynamic loading of the dlls at runtime, ideally without changing the source code. So far I have never tried that, does anyone know if this really works? Thomas From fdrake at acm.org Tue Oct 21 12:29:42 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 21 12:30:01 2003 Subject: [Python-Dev] Expat 1.95.7 in Python 2.3.x? Message-ID: <16277.24310.600396.856699@grendel.zope.com> I released Expat 1.95.7 yesterday, and updated the Python and PyXML projects to use the new version. It fixes a number of bugs in Expat as well as cleaning up some build issues that caused Python and PyXML to ship a slightly modified version. (It may also prove a little faster in some applications, since it's now using a string hash function based on Python's.) I'd be interested in hearing if there are any objections to updating the Python 2.3.x maintenance tree to also use the new version. I think it's safe (no new features) and allows us to ship an unmodified Expat. Please let me know if you know of any objections. -Fred -- Fred L. Drake, Jr.
PythonLabs at Zope Corporation From skip at pobox.com Tue Oct 21 12:34:24 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 12:34:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211749.21152.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> Message-ID: <16277.24592.805548.835843@montanaro.dyndns.org> >> Here's an alternate suggestion. Instead of inventing new syntax, why >> not change the semantics of list comprehensions to be lazy? Alex> Well, yes, the _most_ one could ever have to change is move from [ Alex> ... ] to list([ ... ]) to get back today's semantics. But any use Alex> NOT so changed may break, in general; any perfectly correct Alex> program coded with Python 2.1 to Python 2.3 -- several years' Alex> worth of "current Python", by the time 2.4 comes out -- might Alex> break. I understand all that. Still, the "best" syntax for these so-called iterator comprehensions might have been the current list comprehension syntax. I don't know how hard it would be to fix existing code, probably not a massive undertaking, but the bugs lazy list comprehensions introduced would probably be a bit subtle. Let's perform a little thought experiment. We already have the current list comprehension syntax and the people thinking about lazy list comprehensions seem to be struggling a bit to find syntax for them which doesn't appear cobbled together. Direct your attention to Python 3.0 where one of the things Guido has said he would like to do is to eliminate some bits of the language he feels are warts. Given two similar language constructs implementing two similar sets of semantics, I'd have to think he would like to toss one of each.
The list comprehension syntax seems the more obvious (to me) syntax to keep while it would appear there are some advantages to the lazy list comprehension semantics (enumerate (parts of) infinite sequences, better memory usage, some performance improvements). I don't know when 3.0 alpha will (conceptually) become the CVS trunk. Guido may not know either, but it is getting nearer every day. Unless he likes one of the proposed new syntaxes well enough to conclude now that he will keep both syntaxes and both sets of semantics in 3.0, I think we should look at other alternatives which don't introduce new syntax, including morphing list comprehensions into lazy list comprehensions or leaving lazy list comprehensions out of the language, at least in 2.x. As I think people learned when considering ternary operators and switch statements, adding constructs to the language in a Pythonic way is not always possible, no matter how compelling the feature might be. In those situations it makes sense to leave the construct out for now and see if syntax restructuring in 3.0 will make addition of such desired features possible. Anyone for [x for x in S]L ? Skip From aleaxit at yahoo.com Tue Oct 21 12:41:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:41:52 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F953793.1000208@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> Message-ID: <200310211841.45711.aleaxit@yahoo.com> On Tuesday 21 October 2003 03:41 pm, Nick Coghlan wrote: --- > What about: > > result + x**2 from result = 0.0 for x in S > > Essentially short for: > result = 0.0 > for x in S: > result = result + x**2 Not bad, but I'm not sure I like the strict limitation to "A = A + f(x)" forms (possibly with some other operator in lieu of + etc, of course). 
Say I want to make a sets.Set out of the iterator, for example: result.union([ x**2 ]) from result = sets.Set() for x in theiter now that's deucedly _inefficient_, consarn it!, because it maps to a loop of: result = result.union([ x**2 ]) so I may be tempted to try, instead: real_result = sets.Set() real_result.union_update([ x**2 ]) from fake_result = None for x in theiter and hoping the N silly rebindings of fake_result to None cost me less than not having to materialize a list from theiter would cost if I did real_result = sets.Set([ x**2 for x in theiter ]) I don't think we should encourage that sort of thing with the "implicit assignment" in accumulation. So, if it's an accumulation syntax we're going for, I'd much rather find ways to express whether we want [a] no assignment at all (as e.g. for union_update), [b] plain assignment, [c] augmented assignment such as += or whatever. Sorry, no good idea comes to my mind now, but I _do_ think we'd want all three possibilities... Alex From aleaxit at yahoo.com Tue Oct 21 12:50:25 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.24592.805548.835843@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <200310211850.25376.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:34 pm, Skip Montanaro wrote: ... > would like to toss one of each. The list comprehension syntax seems the > more obvious (to me) syntax to keep while it would appear there are some > advantages to the lazy list comprehension semantics (enumerate (parts of) > infinite sequences, better memory usage, some performance improvements). Yes to both points. Hmmm...
> should look at other alternatives which don't introduce new syntax, > including morphing list comprehensions into lazy list comprehensions or ...as long as this can be done WITHOUT breaking a ton of my code... > leaving lazy list comprehensions out of the language, at least in 2.x. As Eeek. Maybe. Sigh. 3 years or so (best case, assuming 2.4 is the last of the 2.*'s) before I can teach and deploy lazy comprehensions?-( Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-) Alex From guido at python.org Tue Oct 21 12:53:38 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:53:48 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 17:49:21 +0200." <200310211749.21152.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> Message-ID: <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> [Skip] > > Here's an alternate suggestion. Instead of inventing new syntax, > > why not change the semantics of list comprehensions to be lazy? > > They haven't been in use that long, and while they are popular, > > the semantic tweakage would probably cause minimal disruption. In > > situations where laziness wasn't wanted, the most that a > > particular use would have to change (I think) is to pass it to > > list(). Sorry, too late. You're hugely underestimating the backwards compatibility issues. And they have been in use at least since 2000 (they were introduced in 2.0). [Alex] > I think we should keep the user-observable semantics as now, BUT > maybe an optimization IS possible if all the user code does with the > LC is loop on it (or anyway just get its iter(...) and nothing else). But that's not very common, so I don't see the point of putting in the effort, plus it's not safe. 
Using a LC as the sequence of a for loop is ugly, and usually for x in [y for y in S if P(y)]: ... means the same as for x in S: if P(x): ... except when it doesn't, and then making the list comprehension lazy can be a mistake: the following example for key in [k for k in d if d[k] is None]: del d[key] is *not* the same as for key in d: if d[key] is None: del d --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 12:56:31 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:56:41 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 15:45:00 BST." <2mwuayodjn.fsf@starship.python.net> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> Message-ID: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> > > Anyway, this is no different from the > > problem of: > > > > for x in R: > > ... > > print x > > Well, yes. I still think it's dubious code. > > > In any case, are there plans to also have the compiler emit > > warnings about potential reliance on this feature? > > I would hope that we wouldn't make changes without emitting such a > warning. I'm not sure how hard it would be to implement, tho'. Warning about what? I have no intent to make the example quoted above illegal; a regular for loop control variable's scope will extend beyond the loop. It's only list comprehensions where I plan to remove x from the scope after the comprehension is finished. Do you need a warning for that change too? Code that relies on it is pretty sick IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 12:57:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 12:58:02 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> Message-ID: <200310211857.57783.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:26 pm, Thomas Heller wrote: ... > Yet another approach would be to use the delay_load feature of MSVC, it > allows dynamic loading of the dlls at runtime, ideally without changing > the source code. > > So far I have never tried that, does anyone know if this really works? Yes, back when I was in think3 we experimented extensively as soon as it was available (still in a beta of -- I don't recall if it was the SDK or VStudio 6), and except for a few specific libraries that gave some trouble (MSVCRT.DLL and the MFC one, only -- I think because they did something to the memory allocation mechanisms, MSVCRT having it and MFC changing it -- perhaps it was because we were ALSO using other memory-related tools in DLL's, e.g. leak-detectors), it always worked smoothly and "spread" the load, making the app startup faster. So we set the two DLL's that gave us trouble for load and startup and the rest for delayed load and lived happily ever after (I don't even recall exactly HOW we did that, it WAS years ago...). Alex From guido at python.org Tue Oct 21 12:58:23 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 12:58:34 2003 Subject: [Python-Dev] Re: listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 10:29:32 EDT." References: <20031020173134.GA29040@panix.com><200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> Message-ID: <200310211658.h9LGwN124275@12-236-54-216.client.attbi.com> > > > I don't recall what I said then. Did I say it was a feature that > > > > > > L = [x for x in R] > > > print x > > > > > > would print the last item of R? > > Someone more-or-less did -- in the tutorial. See bottom below. Oh bah! > > A problem with such code irrespective of anything else is that it > > fails when R is empty. 
> > Same would be true of for loops, except that typical after-for usage, > such as searching for item in list, has else clause to set control > variable to default in 'not found' cases, which include empty lists. The regular for loop won't change. > The Ref Manual currently says nothing about leakage or overwriting. > That should make leakage fair game for plugging. Unfortunately the Ref Manual is notoriously incomplete. > On the other hand, Tutorial 5.1.4 List Comprehensions says: > ''' > To make list comprehensions match the behavior of for loops, > assignments to the loop variable remain visible outside of the > comprehension: > > >>> x = 100 # this gets overwritten > >>> [x**3 for x in range(5)] > [0, 1, 8, 27, 64] > >>> x # the final value for range(5) > 4 > ''' > (Pointed out by John Roth in response to my c.l.py posting.) > I have added note to SF 827209. Sigh. What a bummer to put this in a tutorial. :-( But it won't stop me from deprecating the feature. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 13:04:14 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 13:05:57 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:41:39 +1000." <3F953793.1000208@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> Message-ID: <200310211704.h9LH4E324322@12-236-54-216.client.attbi.com> > What about: > > result + x**2 from result = 0.0 for x in S > > Essentially short for: > result = 0.0 > for x in S: > result = result + x**2 You're kidding right? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 13:24:25 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 13:24:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> Message-ID: <200310211924.25711.aleaxit@yahoo.com> On Tuesday 21 October 2003 06:53 pm, Guido van Rossum wrote: ... > > maybe an optimization IS possible if all the user code does with the > > LC is loop on it (or anyway just get its iter(...) and nothing else). > > But that's not very common, so I don't see the point of putting in the It IS common, at least in the code I write, e.g.: d = dict([ (f(a), g(a)) for a in S ]) s = sets.Set([ a*a for a in S ]) totsq = sum([ x*x for x in S ]) etc. I detest the look of those ([ ... ]), but that's the closest I get to dict comprehensions, set comprehensions, etc. > except when it doesn't, and then making the list comprehension lazy > can be a mistake: the following example > > for key in [k for k in d if d[k] is None]: > del d[key] > > is *not* the same as > > for key in d: > if d[key] is None: > del d Well, no, but even if that last statement was "del d[key]" you'd still be right:-). Even in a situation where the list comp is only looped over once, code MIGHT still be relying on the LC having "snapshotted" and/or exhausted iterators IT uses. I was basically thinking of passing the LC as argument to something -- the typical cases where I use LC now and WISH they were lazy, as above -- rather about for loops. And even when the LC _is_ an argument there might be cases where its current strict (nonlazy) semantics are necessary. Oh well! 
Alex From exarkun at intarweb.us Tue Oct 21 13:31:25 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Tue Oct 21 13:31:44 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> Message-ID: <20031021173125.GA27127@intarweb.us> On Tue, Oct 21, 2003 at 09:56:31AM -0700, Guido van Rossum wrote: > > > Anyway, this is no different from the > > > problem of: > > > > > > for x in R: > > > ... > > > print x > > > > Well, yes. I still think it's dubious code. > > > > > In any case, are there plans to also have the compiler emit > > > warnings about potential reliance on this feature? > > > > I would hope that we wouldn't make changes without emitting such a > > warning. I'm not sure how hard it would be to implement, tho'. > > Warning about what? > > I have no intent to make the example quoted above illegal; a regular > for loop control variable's scope will extend beyond the loop. > Sorry, my ordering could have been a little more clear. I only meant a warning for the list comprehension case. > [snip] > > Do you need a warning for that change too? Code that relies on it is > pretty sick IMO. > I agree, and I try never to write such code. But having Python point out any places I foolishly did so makes the job of fixing any bugs this change introduces into my code that much easier. It also serves to point out to people who *don't* realize how sick this construct is that a potentially large chunk of their software will break in Python X.Y (3.0?), where it will break, and why it will break. Jp -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/59d36242/attachment.bin From aleaxit at yahoo.com Tue Oct 21 13:39:45 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 13:39:50 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> References: <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <5.1.1.6.0.20031021115620.023e2300@telecommunity.com> Message-ID: <200310211939.45800.aleaxit@yahoo.com> On Tuesday 21 October 2003 05:59 pm, Phillip J. Eby wrote: ... > If you make it a list that's lazy, it doesn't lose the memory allocation > overhead for the list. If I understand Alex's benchmarks, making a lazy > list would end up being *slower* than list comprehension is now. No, my benchmarks show that NOT having to "incarnate" the list, when all you do is loop on it, is a modest but repeatable win (20%-30% or so). Alex From fdrake at acm.org Tue Oct 21 14:04:38 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Oct 21 14:04:52 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: References: Message-ID: <16277.30006.67329.572905@grendel.zope.com> fdrake@users.sourceforge.net writes: > Modified Files: > libplatform.tex > Log Message: > - make this section format > - start cleaning up the markup for consistency > - comment out the reference to a MS KnowledgeBase article that doesn't > seem to be present at msdn.microsoft.com; hopefully someone can > point out an alternate source for the relevant information I forgot to mention in the checkin message that this is *not* ready to be backported to the 2.3.x maintenance branch yet. I hope to make another substantial pass through this later this week. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From guido at python.org Tue Oct 21 14:08:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:08:31 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 11:13:41 +0200." <200310211113.41658.aleaxit@yahoo.com> References: <200310211113.41658.aleaxit@yahoo.com> Message-ID: <200310211808.h9LI8Kt24464@12-236-54-216.client.attbi.com> > Sure, arithmetic (including rounding) is what we most need. If we call > it Decimal or whatever, that may be preferable to Money, I don't know. Remember, a Decimal implementation following the IEEE 854 specs and Mike Cowlishaw's design and tests exists in the nondist part of the Python source tree, thanks to Eric Pierce (and some early work by Aahz, and encouraging words by Tim Peters). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 14:09:33 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:09:56 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 09:43:31 +0200." <200310210943.31574.aleaxit@yahoo.com> References: <200310210727.h9L7R3V02910@oma.cosc.canterbury.ac.nz> <200310210943.31574.aleaxit@yahoo.com> Message-ID: <200310211809.h9LI9XV24477@12-236-54-216.client.attbi.com> > On Tuesday 21 October 2003 09:27 am, Greg Ewing wrote: > ... > > But maybe some other keyword could be added to ease any > > syntactic problems, such as "all" or "every": > > > > sum(all x*x for x in xlist) > > sum(every x*x for x in xlist) > > > > The presence of the extra keyword would then distinguish > > an iterator comprehension from the innards of a list > > comprehension. > > Heh, you ARE a volcano of cool syntactic ideas these days, Greg. > > As between them, to me 'all' sort of connotes 'all at once' while > 'every' connotes 'one by one' (so would a third possibility, 'each'); > so 'all' is the one I like least. 
> > Besides accumulators &c we should also think of normal loops: > > for a in all x*x for x in xlist: ... > > for a in every x*x for x in xlist: ... > > for a in each x*x for x in xlist: ... > > Of these three, 'every' looks best to me, personally. > > > Alex I'd rather reserve these keywords for conditions using quantifiers, like in ABC. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 14:11:39 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 14:11:46 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 20:43:06 +1300." <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> References: <200310210743.h9L7h6k02941@oma.cosc.canterbury.ac.nz> Message-ID: <200310211811.h9LIBdD24492@12-236-54-216.client.attbi.com> > > But the executive summary remains: the generator wins because it > > doesn't have to materialize the whole list. > > But what would happen if the generator were replaced with > in-line code that computes the values and feeds them to > an accumulator object, such as might result from an > accumulator syntax that gets inline-expanded in the > same way as a list comp? I'd worry that writing an accumulator would become much less natural. The cool thing of iterators and generators is that you can write both the source (generator) and the destination (iterator consumer) as a simple loop, which is how you usually think about it. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Tue Oct 21 14:25:12 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 14:26:07 2003 Subject: [Python-Dev] buildin vs.
shared modules In-Reply-To: <200310211857.57783.aleaxit@yahoo.com> (Alex Martelli's message of "Tue, 21 Oct 2003 18:57:57 +0200") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> Message-ID: Alex Martelli writes: > On Tuesday 21 October 2003 06:26 pm, Thomas Heller wrote: > ... >> Yet another approach would be to use the delay_load feature of MSVC, it >> allows dynamic loading of the dlls at runtime, ideally without changing >> the source code. >> >> So far I have never tried that, does anyone know if this really works? > > Yes, back when I was in think3 we experimented extensively as soon as > it was available (still in a beta of -- I don't recall if it was the SDK or > VStudio 6), and except for a few specific libraries that gave some trouble > (MSVCRT.DLL and the MFC one, only -- I think because they did > something to the memory allocation mechanisms, MSVCRT having it > and MFC changing it -- perhaps it was because we were ALSO using > other memory-related tools in DLL's, e.g. leak-detectors), it always worked > smoothly and "spread" the load, making the app startup faster. So we > set the two DLL's that gave us trouble for load and startup and the rest > for delayed load and lived happily ever after (I don't even recall exactly > HOW we did that, it WAS years ago...). After installing MSVC6 on a win98 machine, where I could rename wsock32.dll away (which was not possible on XP due to file system protection), I was able to change socketmodule.c to use delay loading of the winsock dll. I had to wrap up the WSAStartup() call inside a __try {} __except {} block to catch the exception thrown. With this change, _socket (and maybe also select) could then also be converted into builtin modules. Guido, what do you think? Thomas PS: Here's the exception raised when loading of wsock32.dll fails: >>> import _socket Traceback (most recent call last): File "<stdin>", line 1, in ?
ImportError: WSAStartup failed: error code -1066598274 and here's the tiny patch: Index: socketmodule.c =================================================================== RCS file: /cvsroot/python/python/dist/src/Modules/socketmodule.c,v retrieving revision 1.271.6.5 diff -c -r1.271.6.5 socketmodule.c *** socketmodule.c 20 Oct 2003 14:34:47 -0000 1.271.6.5 --- socketmodule.c 21 Oct 2003 18:21:39 -0000 *************** *** 3381,3387 **** WSADATA WSAData; int ret; char buf[100]; ! ret = WSAStartup(0x0101, &WSAData); switch (ret) { case 0: /* No error */ Py_AtExit(os_cleanup); --- 3381,3391 ---- WSADATA WSAData; int ret; char buf[100]; ! __try { ! ret = WSAStartup(0x0101, &WSAData); ! } __except (ret = GetExceptionCode(), EXCEPTION_EXECUTE_HANDLER) { ! ; ! } switch (ret) { case 0: /* No error */ Py_AtExit(os_cleanup); From aahz at pythoncraft.com Tue Oct 21 14:57:49 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 21 14:57:54 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> Message-ID: <20031021185748.GA18869@panix.com> On Tue, Oct 21, 2003, Guido van Rossum wrote: > > It's only list comprehensions where I plan to remove x from the scope > after the comprehension is finished. > > Do you need a warning for that change too? Code that relies on it is > pretty sick IMO. Yes, it's sick, but since you made clear previously that listcomps have semantics equivalent to the corresponding for loop, I wouldn't be surprised to discover that someone converted a for loop to a listcomp without fixing that sickness. So yes, it needs a warning.
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From python at rcn.com Tue Oct 21 14:58:55 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 14:59:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <004601c39805$60dd9a60$e841fea9@oemcomputer> [Skip Montanaro] > I understand all that. Still, the "best" syntax for these so-called > iterator comprehensions might have been the current list comprehension > syntax. Skip is right about returning to the basics. Before considering some of the wild syntaxes that have been submitted, I suggest re-examining the very first proposal with brackets and yield. At one time, I got a lot of feedback on this from comp.lang.python. Just about everyone found the brackets to be helpful and not misleading, the immediate presence of "yield" was more than enough to signal that an iterator was being returned instead of a list: g = [yield (len(line),line) for line in file if len(line)>5] This syntax is instantly learnable from existing knowledge about list comprehensions and generators. The advantage of a near zero learning curve should not be easily dismissed. Also, this syntax makes it trivially easy to convert an existing list comprehension into an iterator comprehension if needed to help the application scale-up or to improve performance.
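The bracketed-yield form quoted above never became legal syntax, but its intended meaning can be sketched with an ordinary generator function. This is a hypothetical expansion runnable on a modern Python; the sample lines are made up:

```python
def iter_comp(file):
    # Hypothetical expansion of the proposed (never-adopted) syntax:
    #   g = [yield (len(line), line) for line in file if len(line) > 5]
    for line in file:
        if len(line) > 5:
            yield (len(line), line)

lines = ["ok\n", "a longer line\n"]  # made-up sample data
g = iter_comp(lines)                 # nothing is computed until g is iterated
result = list(g)                     # only the long line survives the filter
```

The scale-up argument in the message holds for the sketch too: nothing is materialized until the consumer asks for it.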
Raymond Hettinger From FBatista at uniFON.com.ar Tue Oct 21 15:07:12 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 15:08:13 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Guido van Rossum wrote: #- Remember, a Decimal implementation following the IEEE 854 specs and #- Mike Cowlishaw's design and tests exists in the nondist part of the #- Python source tree, thanks to Eric Pierce (and some early work by #- Aahz, and encouraging words by Tim Peters). Meaning that I should extend/finish it or meaning that Money should not repeat that work and get specific with money issues? Can't find it in the CVS, specific path? Thank you! . Facundo WARNING The information contained in this message and any file attached to it is for the exclusive use of the addressee and may contain confidential or proprietary information, the disclosure of which is punishable by law. If you are not one of the named addressees or the person responsible for delivering this message to them, you are not authorized to disclose, copy, distribute, or retain the information (or any part of it) contained in this message. Please notify us by replying to the sender, then delete the original message and any copies (printed or stored on any magnetic medium) you may have made of it.
All opinions contained in this mail are the author's own and do not necessarily coincide with those of Telefónica Comunicaciones Personales S.A. or any associated company. Electronic messages can be altered, for which reason Telefónica Comunicaciones Personales S.A. will not accept any liability whatever the outcome of this message. Thank you very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/c3d57d21/attachment-0001.html From ianb at colorstudy.com Tue Oct 21 15:08:30 2003 From: ianb at colorstudy.com (Ian Bicking) Date: Tue Oct 21 15:08:31 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: On Tuesday, October 21, 2003, at 01:58 PM, Raymond Hettinger wrote: > At one time, I got a lot of feedback on this from comp.lang.python. > Just about everyone found the brackets to be helpful and not > misleading, > the immediate presence of "yield" was more than enough to signal that > an iterator was being returned instead of a list: > > g = [yield (len(line),line) for line in file if len(line)>5] FWIW, that g is an iterator is *far* less surprising than the fact that yield turns a function into a generator. If it's okay that a yield in the body of a function change the function, why can't a yield in the body of a list comprehension change the list comprehension? It's a lot more noticeable, and people should know that "yield" signals something a little more tricky is going on. Also has good symmetry with the current meaning of yield. -- Ian Bicking | ianb@colorstudy.com | http://blog.ianbicking.org From fdrake at acm.org Tue Oct 21 15:21:02 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue Oct 21 15:21:14 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <16277.34590.271610.511919@grendel.zope.com> Batista, Facundo writes: > Can't find it in the CVS, specific path? The main body of the Python sources is in CVS as python/dist/src/; the rest is in python/nondist/. The decimal package is in python/nondist/sandbox/decimal/. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From guido at python.org Tue Oct 21 15:30:49 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:31:04 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 16:07:12 -0300." References: Message-ID: <200310211930.h9LJUnS24625@12-236-54-216.client.attbi.com> > #- Remember, a Decimal implementation following the IEEE 854 specs and > #- Mike Cowlishaw's design and tests exists in the nondist part of the > #- Python source tree, thanks to Eric Pierce (and some early work by > #- Aahz, and encouraging words by Tim Peters). > > Meaning that I should extend/finish it or meaning that Money should not > repeat that work and get specific with money issues? Meaning that you should use it if possible rather than reinventing that particular wheel. And yes, if the Decimal class still needs work, if you want to help fix it that would be great! --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 21 15:31:10 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 15:31:17 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Message-ID: [Guido] > Remember, a Decimal implementation following the IEEE 854 specs and > Mike Cowlishaw's design and tests exists in the nondist part of the > Python source tree, thanks to Eric Pierce ... s/Pierce/Price/ [Batista, Facundo] > Meaning that I should extend/finish it or meaning that Money should > not repeat that work and get specific with money issues?
Meaning that there's an existing body of work that's already been informed by years of design debate (IBM's proposed decimal standard), and an involved Python implementation of that. What happens next depends on who can make time to do something next. > Can't find it in the CVS, specific path? IBM's proposed standard: http://www2.hursley.ibm.com/decimal/ Eric's implementation: http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ From guido at python.org Tue Oct 21 15:36:40 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:36:53 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 14:58:55 EDT." <004601c39805$60dd9a60$e841fea9@oemcomputer> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> > Skip is right about returning to the basics. Before considering > some of the wild syntaxes that have been submitted, I suggest > re-examining the very first proposal with brackets and yield. > > At one time, I got a lot of feedback on this from comp.lang.python. > Just about everyone found the brackets to be helpful and not misleading, > the immediate presence of "yield" was more than enough to signal that > an iterator was being returned instead of a list: > > g = [yield (len(line),line) for line in file if len(line)>5] > > This syntax is instantly learnable from existing knowledge about > list comprehensions and generators. The advantage of a near zero > learning curve should not be easily dismissed. > > Also, this syntax makes it trivially easy to convert an existing > list comprehension into an iterator comprehension if needed to > help the application scale-up or to improve performance. -1. I expect that most iterator comprehensions (we need a better term!) are not stored in a variable but passed as an argument to something that takes an iterable, e.g.
sum(len(line) for line in file if line.strip()) I find that in such cases, the 'yield' distracts from what is going on by focusing attention on the generator (which is really just an implementation detail). We can quibble about whether double parentheses are needed, but this syntax is just so much clearer than the version with square brackets and yield, that there is no contest IMO. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:37:42 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:37:52 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Tue, 21 Oct 2003 14:57:49 EDT." <20031021185748.GA18869@panix.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <2mad7uq0mr.fsf@starship.python.net> <20031021142816.GA25455@intarweb.us> <2mwuayodjn.fsf@starship.python.net> <200310211656.h9LGuVQ24250@12-236-54-216.client.attbi.com> <20031021185748.GA18869@panix.com> Message-ID: <200310211937.h9LJbg624656@12-236-54-216.client.attbi.com> > > It's only list comprehensions where I plan to remove x from the scope > > after the comprehension is finished. > > > > Do you need a warning for that change too? Code that relies on it is > > pretty sick IMO. > > Yes, it's sick, but since you made clear previously that listcomps > semantics equivalent to the corresponding for loop, I wouldn't be > surprised to discover that someone converted a for loop to a listcomp > without fixing that sickness. So yes, it needs a warning. OK, fair enough. Someone update the doc bug report for this. Initially, we're just going to document it as deprecated behavior (or maybe "despised" behavior :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:40:56 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:41:04 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: Your message of "Tue, 21 Oct 2003 20:25:12 +0200." References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> Message-ID: <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> > After installing MSVC6 on a win98 machine, where I could rename > wsock32.dll away (which was not possible on XP due to file system > protection), I was able to change socketmodule.c to use delay loading of > the winsock dll. I had to wrap up the WSAStartup() call inside a > __try {} __except {} block to catch the exception thrown. > > With this change, _socket (and maybe also select) could then also be > converted into builtin modules. > > Guido, what do you think? I think now is a good time to try this in 2.4. I don't think I'd want to do this (or any of the proposed reorgs) in 2.3 though. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 15:46:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:46:52 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: Your message of "Tue, 21 Oct 2003 14:04:38 EDT." <16277.30006.67329.572905@grendel.zope.com> References: <16277.30006.67329.572905@grendel.zope.com> Message-ID: <200310211946.h9LJkKP24720@12-236-54-216.client.attbi.com> > > - comment out the reference to a MS KnowledgeBase article that doesn't > > seem to be present at msdn.microsoft.com; hopefully someone can > > point out an alternate source for the relevant information Bizarre. It seems MS has removed all traces of that article; I found lots of pointers to it in Google but they all point to the same dead link. Google's cache is your best bet... 
--Guido van Rossum (home page: http://www.python.org/~guido/) From FBatista at uniFON.com.ar Tue Oct 21 15:48:39 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 15:49:35 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Guido van Rossum wrote: #- > Meaning that I should extend/finish it or meaning that #- Money should not #- > repeat that work and get specific with money issues? #- #- Meaning that you should use it if possible rather than reinventing #- that particular wheel. I'll study it and see if I can subclass it or something. #- And yes, if the Decimal class still needs work, if you want to help #- fix it that would be great! I'll do my best. . Facundo
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/6f710a96/attachment.html From guido at python.org Tue Oct 21 15:50:03 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:24:25 +0200." <200310211924.25711.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <200310211653.h9LGrcC24239@12-236-54-216.client.attbi.com> <200310211924.25711.aleaxit@yahoo.com> Message-ID: <200310211950.h9LJo3i24741@12-236-54-216.client.attbi.com> > On Tuesday 21 October 2003 06:53 pm, Guido van Rossum wrote: > ... > > > maybe an optimization IS possible if all the user code does with the > > > LC is loop on it (or anyway just get its iter(...) and nothing else). > > > > But that's not very common, so I don't see the point of putting in the > > It IS common, at least in the code I write, e.g.: > > d = dict([ (f(a), g(a)) for a in S ]) > > s = sets.Set([ a*a for a in S ]) > > totsq = sum([ x*x for x in S ]) > > etc. I detest the look of those ([ ... ]), but that's the closest I get to dict comprehensions, set comprehensions, etc. OK, but you have very little hope of optimizing the incarnation away by the compiler (especially since our attempts at warning about surreptitious changes to builtins had to be withdrawn before 2.3 went out). > > except when it doesn't, and then making the list comprehension lazy > > can be a mistake: the following example > > > > for key in [k for k in d if d[k] is None]: > > del d[key] > > > > is *not* the same as > > > > for key in d: > > if d[key] is None: > > del d > > Well, no, but even if that last statement was "del d[key]" you'd still be right:-).
:-( > Even in a situation where the list comp is only looped over once, > code MIGHT still be relying on the LC having "snapshotted" and/or > exhausted iterators IT uses. I was basically thinking of passing the LC > as argument to something -- the typical cases where I use LC now and > WISH they were lazy, as above -- rather than about for loops. And even > when the LC _is_ an argument there might be cases where its current > strict (nonlazy) semantics are necessary. Oh well! Yes, this is why iterator comprehensions (we need a better term!!!) would be so cool to have (I think much cooler than conditional expressions). --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Tue Oct 21 15:52:23 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 15:52:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> Message-ID: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> At 12:36 PM 10/21/03 -0700, Guido van Rossum wrote: >I expect that most iterator comprehensions (we need a better term!) >are not stored in a variable but passed as an argument to something >that takes an iterable, e.g. Iterator expression? From guido at python.org Tue Oct 21 15:53:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 15:53:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 18:50:25 +0200." <200310211850.25376.aleaxit@yahoo.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310211850.25376.aleaxit@yahoo.com> Message-ID: <200310211953.h9LJrHg24771@12-236-54-216.client.attbi.com> > Hmmm... what about skipping 2.4, and making a beeline for 3.0...?-) Not until I can quit my job at ES and spend a year or so on PSF funds on it.
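The eager-versus-lazy hazard in the earlier del-while-iterating exchange can be demonstrated directly. A minimal check on a modern Python, using the generator-expression spelling that 2.4 eventually adopted for the lazy case:

```python
# Eager list comprehension: the matching keys are snapshotted before
# the loop body runs, so mutating the dict is safe.
d1 = {"a": None, "b": 1, "c": None}
for key in [k for k in d1 if d1[k] is None]:
    del d1[key]

# Lazy generator expression: the dict is still being iterated while the
# loop body deletes from it, which CPython rejects at runtime.
d2 = {"a": None, "b": 1, "c": None}
raised = False
try:
    for key in (k for k in d2 if d2[k] is None):
        del d2[key]
except RuntimeError:  # "dictionary changed size during iteration"
    raised = True
```

This is exactly why making existing list comprehensions silently lazy would have broken code that relies on the snapshot.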
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Oct 21 15:56:08 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 15:56:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: <16277.36696.353007.168363@montanaro.dyndns.org> Guido> I expect that most iterator comprehensions (we need a better Guido> term!) You didn't like "lazy list comprehensions"? Guido> We can quibble about whether double parentheses are needed, ... You haven't convinced me that you're not going to want to toss out one of the two comprehension syntaxes and only retain the lazy semantics in Py3k. If that's the case and the current list comprehension syntax is better than the current crop of proposals, why even add (lazy list|iterator) comprehensions now? Just make do without them until Py3k and make all list comprehensions lazy at that point. There will be enough other bullets to bite that this shouldn't be a big deal (many programs will probably require significant rewriting anyway). Skip From python at rcn.com Tue Oct 21 16:00:46 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 16:01:50 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: <200310211703.h9LH3X124310@12-236-54-216.client.attbi.com> Message-ID: <000f01c3980e$047d7200$e841fea9@oemcomputer> It is clear now that tee() is a fundamental building block and that a C implementation has decisive advantages over its pure python counterpart. So ... tee() will become a standard itertools function in Py2.4. Raymond Hettinger From guido at python.org Tue Oct 21 16:02:01 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:02:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 11:34:24 CDT." 
<16277.24592.805548.835843@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> Message-ID: <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> [Skip] > I understand all that. Still, the "best" syntax for these so-called > iterator comprehensions might have been the current list > comprehension syntax. I don't know how hard it would be to fix > existing code, probably not a massive undertaking, but the bugs lazy > list comprehensions introduced would probably be a bit subtle. > > Let's perform a little thought experiment. We already have the > current list comprehension syntax and the people thinking about lazy > list comprehensions seem to be struggling a bit to find syntax > for them which doesn't appear cobbled together. Direct your > attention to Python 3.0 where one of the things Guido has said he > would like to do is to eliminate some bits of the language he feels > are warts. Given two similar language constructs implementing two > similar sets of semantics, I'd have to think he would like to toss > one of each. The list comprehension syntax seems the more obvious > (to me) syntax to keep while it would appear there are some > advantages to the lazy list comprehension semantics (enumerate > (parts of) infinite sequences, better memory usage, some performance > improvements). > > I don't know when 3.0 alpha will (conceptually) become the CVS > trunk. Guido may not know either, but it is getting nearer every > day. Not necessarily. Maybe the time machine's stuck.
:-) > Unless he likes one of the proposed new syntaxes well enough to > conclude now that he will keep both syntaxes and both sets of > semantics in 3.0, I think we should look at other alternatives which > don't introduce new syntax, including morphing list comprehensions > into lazy list comprehensions or leaving lazy list comprehensions > out of the language, at least in 2.x. As I think people learned > when considering ternary operators and switch statements, adding > constructs to the language in a Pythonic way is not always possible, > no matter how compelling the feature might be. In those situations > it makes sense to leave the construct out for now and see if syntax > restructuring in > 3.0 will make addition of such desired features possible. > > Anyone for > > [x for x in S]L > > ? Thanks for trying to bang some sense into this. Personally, I still like the idea best to make (x for x in S) be an iterator comprehension and [x for x in S] syntactic sugar for the common operation list((x for x in S)) I'm not 100% sure about requiring the double parentheses, but I certainly want to require extra parentheses if there's a comma on either side, so that if we want to pass a 2-argument function a list comprehension, it will have to be parenthesized, e.g. foo((x for x in S), 42) bar(42, (x for x in S)) This makes me think that it's probably fine to also require sum((x for x in S)) Of course, multiple for clauses and if clauses are still supported just like in current list comprehensions; they add no new syntactic issues, except if we also were to introduce conditional expressions. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Tue Oct 21 16:03:33 2003 From: theller at python.net (Thomas Heller) Date: Tue Oct 21 16:03:48 2003 Subject: [Python-Dev] buildin vs. 
shared modules In-Reply-To: <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Tue, 21 Oct 2003 12:40:56 -0700") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: >> After installing MSVC6 on a win98 machine, where I could rename >> wsock32.dll away (which was not possible on XP due to file system >> protection), I was able to change socketmodule.c to use delay loading of >> the winsock dll. I had to wrap up the WSAStartup() call inside a >> __try {} __except {} block to catch the exception thrown. >> >> With this change, _socket (and maybe also select) could then also be >> converted into builtin modules. >> >> Guido, what do you think? > > I think now is a good time to try this in 2.4. I don't think I'd want > to do this (or any of the proposed reorgs) in 2.3 though. Yes, I understood this already. Thomas From fincher.8 at osu.edu Tue Oct 21 17:05:15 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Tue Oct 21 16:06:50 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: References: Message-ID: <200310211705.15094.fincher.8@osu.edu> On Tuesday 21 October 2003 03:31 pm, Tim Peters wrote: > Eric's implementation: > > http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ Just out of curiosity, why isn't this distributed with Python? Jeremy From guido at python.org Tue Oct 21 16:06:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:07:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 12:59:50 BST." 
<2m65iipzrd.fsf@starship.python.net> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> Message-ID: <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> > > I meant that the compiler should rename it. > > Implementing this might be entertaining. In particular what happens > if the iteration variable is a local in the frame anyway? I presume > that would inhibit the renaming, but then there's a potentially > confusing dichotomy as to whether renaming gets done. Of course > you could *always* rename, but then code like > > def f(x): > r = [x+1 for x in range(x)] > return r, x > > becomes even more incomprehensible (and changes in behaviour). Here's the rule I'd propose for iterator comprehensions, which list comprehensions would inherit: [<expression> for <vars> in <iterable>] The variables in <vars> should always be simple variables, and their scope only extends to <expression>. If there's a variable with the same name in an outer scope (including the function containing the comprehension) it is not accessible (at least not by name) in <expression>. <iterable> is not affected. In comprehensions you won't be able to do some things you can do with regular for loops: a = [1,2] for a[0] in range(10): print a > And what about horrors like > > [([x for x in range(10)],x) for x in range(10)] > > vs: > > [([x for x in range(10)],y) for y in range(10)] > > ? > > I suppose you could make a case for throwing out (or warning about) > all these cases at compile time, but that would require significant > effort as well (I think). I think the semantics are crisply defined, users who write these deserve what they get (confusion and the wrath of their readers).
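The scoping rule proposed here is, in substance, what Python 3 eventually implemented: the comprehension gets its own scope, so the loop variable no longer leaks into (or clobbers) the surrounding function. A quick check on a modern interpreter:

```python
x = "outer"
squares = [x * x for x in range(5)]  # this x is local to the comprehension
leaked = x                           # still the outer binding, untouched
```

Under the pre-3.0 rules being debated in this thread, `x` would have been rebound to 4 after the comprehension ran.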
--Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Oct 21 16:16:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 16:16:28 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: <200310211705.15094.fincher.8@osu.edu> References: <200310211705.15094.fincher.8@osu.edu> Message-ID: <16277.37890.616759.734494@montanaro.dyndns.org> >> http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ Jeremy> Just out of curiosity, why isn't this distributed with Python? 'cuz it's still in the sandbox (the place where people can play with code ideas). Skip From FBatista at uniFON.com.ar Tue Oct 21 16:16:41 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 16:17:38 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Jeremy Fincher wrote: #- On Tuesday 21 October 2003 03:31 pm, Tim Peters wrote: #- > Eric's implementation: #- > #- > #- http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ #- #- Just out of curiosity, why isn't this distributed with Python? Nice question! I think testDecimal.py is not finishing well. At least that's where I'll start after studying the class itself. . Facundo
-------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031021/a810f090/attachment.html From tim.one at comcast.net Tue Oct 21 16:19:33 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:19:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: [Guido] > I expect that most iterator comprehensions (we need a better term!) > ... Well, calling it an iterator Aussonderungsaxiom would continue emphasizing the wrong thing <wink>. "Set comprehensions" in a programming language originated with SETL, and are named in honor of the set-theoretic Axiom of Comprehension (Aussonderungsaxiom). In its well-behaved form, that says roughly that given a set X, then for any predicate P(x), there exists a subset of X whose elements consist of exactly those elements x of X for which P(x) is true (in its ill-behaved form, it leads directly to Russell's Paradox -- the set of all sets that don't contain themselves). So "comprehension" emphasizes the "if" part of list comprehension syntax, which often isn't the most interesting thing. More interesting more often are (a) the computation done on the objects gotten from the for-iterator, and (b) that the results are generated one at a time.
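The one-result-at-a-time behavior Tim emphasizes is easy to see with the spelling Python 2.4 eventually adopted under the very name proposed here, "generator expression" (modern syntax shown):

```python
gen = (n * n for n in range(4))  # a generator expression: no squares computed yet
first = next(gen)                # each result is produced only on demand
rest = list(gen)                 # the remaining results, generated one at a time
```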
Put that all in a pot and stir, and the name "generator expression" seems natural and useful to me. In the Icon language, *all* expressions are generators, so maybe I'm biased by that. OTOH, "the results are generated one at a time" is close to plain English, and "generator expression" then brings to my mind an expression capable of delivering a sequence of results. Or you could call it an Orlijn flourish. From pf_moore at yahoo.co.uk Tue Oct 21 16:20:48 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Tue Oct 21 16:19:55 2003 Subject: [Python-Dev] Re: buildin vs. shared modules References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> Message-ID: <7k2ywden.fsf@yahoo.co.uk> Guido van Rossum writes: >> After installing MSVC6 on a win98 machine, where I could rename >> wsock32.dll away (which was not possible on XP due to file system >> protection), I was able to change socketmodule.c to use delay loading of >> the winsock dll. I had to wrap up the WSAStartup() call inside a >> __try {} __except {} block to catch the exception thrown. >> >> With this change, _socket (and maybe also select) could then also be >> converted into builtin modules. >> >> Guido, what do you think? > > I think now is a good time to try this in 2.4. I don't think I'd want > to do this (or any of the proposed reorgs) in 2.3 though. One (very mild) point - this is highly MSVC-specific. I don't know if there is ever going to be any interest in (for example) getting Python to build with Mingw/gcc on Windows, but there's no equivalent of this in Mingw (indeed, Mingw doesn't, as far as I know, support __try/__except either). But in the absence of anyone who is working on a Mingw build, this is pretty much irrelevant... Paul. 
-- This signature intentionally left blank
From tim.one at comcast.net Tue Oct 21 16:22:11 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:22:18 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: <200310211705.15094.fincher.8@osu.edu> Message-ID:

>> http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/

[Jeremy Fincher]
> Just out of curiosity, why isn't this distributed with Python?

Because it's not finished. Finish it, then ask again <wink>.
From skip at pobox.com Tue Oct 21 16:23:15 2003 From: skip at pobox.com (Skip Montanaro) Date: Tue Oct 21 16:23:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> Message-ID: <16277.38323.854588.570453@montanaro.dyndns.org>

Guido> Here's the rule I'd propose for iterator comprehensions, which list
Guido> comprehensions would inherit:
Guido>     [ <expression> for <vars> in <iterable> ]
Guido> The variables in <vars> should always be simple variables, and
Guido> their scope only extends to <expression>. If there's a variable with
Guido> the same name in an outer scope (including the function
Guido> containing the comprehension) it is not accessible (at least not
Guido> by name) in <expression>. <iterable> is not affected.

I thought the definition for list comprehension syntax was something like

    '[' <expression>
        for <vars> in <iterable>
        [ for <vars> in <iterable> ] *
        [ if <condition> ] *
    ']'

The loop <vars> in an earlier for clause should be visible in all nested for clauses and conditional clauses, not just in the first <expression>.
Skip
From guido at python.org Tue Oct 21 16:33:13 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:33:22 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Tue, 21 Oct 2003 12:02:26 +0200." <200310211202.26677.aleaxit@yahoo.com> References: <200310171441.h9HEf4E06247@12-236-54-216.client.attbi.com> <200310210009.39256.aleaxit@yahoo.com> <200310210344.h9L3iVh23308@12-236-54-216.client.attbi.com> <200310211202.26677.aleaxit@yahoo.com> Message-ID: <200310212033.h9LKXDk24952@12-236-54-216.client.attbi.com>

> Yes, if we specify an iter's __copy__ makes an independent iterator,
> which is surely the most useful semantics for it, then any weird iterator
> whose index is in fact mutable can copy not-quite-shallowly and offer
> the same useful semantics. I'm not sure where that leaves generator
> made iterators, which don't really know which parts of the state in their
> saved frame are "index", but having them just punt and refuse to copy
> themselves shallowly might be ok.

I thought we already established before that attempting to guess which parts of a generator function to copy and which parts to share is hopeless. Generator-made iterators won't be __copy__-able, period. I think this is the weakness of this cloning business, because it either makes generators second-class iterators, or it makes cloning a precarious thing to attempt when generators are used. (You can make a non-cloneable iterator cloneable by wrapping it into something that buffers just those items that are still reachable by clones, but this can still require arbitrary amounts of buffer space.) The problem is that using a generator as a filter in a pipeline of iterators makes the result non-cloneable, even if the underlying iterator is cloneable.
I'm thinking of situations like

    def odd(it):
        while True:
            it.next()
            yield it.next()

    it = odd(range(1000))
    it2 = clone(it)

Here we'd wish the result could be the same as that of

    tmp = range(1000)
    it = odd(tmp)
    it2 = odd(tmp)

but that can't be realized.

> Sure, it might. Perhaps the typical use case would be one in which
> an iterator gets deepcopied "incidentally" as part of the deepcopy
> of some other object which "happens" to hold an iterator; if
> iterators knew how to deepcopy themselves that would save some work
> on the part of the other object's author. No huge win, sure. But
> once the copy gets deep, generator-made iterators should also have
> no problem actually doing it, and that may be another middle-size
> win.

I still don't think deep-copying stack frames is a business I'd like to be in. Too many tricky issues. --Guido van Rossum (home page: http://www.python.org/~guido/)
From mcherm at mcherm.com Tue Oct 21 16:38:33 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 21 16:38:39 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <1066768713.3f959949dc764@mcherm.com>

Skip writes:
> I thought the definition for list comprehension syntax was something like
>
> '[' <expression>
>     for <vars> in <iterable>
>     [ for <vars> in <iterable> ] *
>     [ if <condition> ] *
> ']'

Nope:

>>> [x*y for x in 'aBcD' if x.islower() for y in range(4) if y%2]
['a', 'aaa', 'c', 'ccc']

-- Michael Chermside
From guido at python.org Tue Oct 21 16:46:42 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:47:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 15:52:23 EDT." <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> Message-ID: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com>

> Iterator expression?

Better.

> Or perhaps generator expression?
> To maintain the link with generator functions, since the underlying
> mechanism *will* be mostly the same.

Yes, I like that even better. BTW, while Alex has shown that a generator function with no free variables runs quite fast, a generator expression that uses variables from the surrounding scope will have to use the nested scopes machinery to access those, unlike a list comprehension; not only does this run slower, but it also slows down all other uses of that variable in the surrounding scope (because it becomes a "cell" throughout the scope). Someone could time how well

    y = 1
    sum([x*y for x in R])

fares compared to

    y = 1
    def gen():
        for x in R: yield y*y
    sum(gen())

for R in (range(N) for N in (100, 1000, 10000)). --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 16:50:16 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:50:34 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 14:56:08 CDT." <16277.36696.353007.168363@montanaro.dyndns.org> References: <004601c39805$60dd9a60$e841fea9@oemcomputer> <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> <16277.36696.353007.168363@montanaro.dyndns.org> Message-ID: <200310212050.h9LKoGM25025@12-236-54-216.client.attbi.com>

> Guido> I expect that most iterator comprehensions (we need a better
> Guido> term!)
>
> You didn't like "lazy list comprehensions"?

No, because list comprehensions are no longer the fundamental building blocks. Generator expression sounds good to me now.

> Guido> We can quibble about whether double parentheses are needed, ...
>
> You haven't convinced me that you're not going to want to toss out
> one of the two comprehension syntaxes and only retain the lazy
> semantics in Py3k.

Too many double negatives. :-) Right now I feel like keeping both syntaxes, but declaring list comprehensions syntactic sugar for list(generator expression).
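The timing experiment Guido sketches above can be written out directly in today's Python; here is a minimal stand-alone version (the function names are invented, and both variants compute x*y here, where the message as archived has the generator yield y*y, presumably a transcription slip):

```python
import timeit

R = list(range(100))

def with_listcomp():
    y = 1
    return sum([x * y for x in R])

def with_genfunc():
    y = 1
    def gen():
        # y is a free variable: accessed through the nested-scopes machinery
        for x in R:
            yield x * y
    return sum(gen())

# Both forms compute the same total.
assert with_listcomp() == with_genfunc() == sum(R)

# Rough relative timing; absolute numbers vary by machine.
t_lc = timeit.timeit(with_listcomp, number=2000)
t_gen = timeit.timeit(with_genfunc, number=2000)
print("listcomp %.4fs  genfunc %.4fs" % (t_lc, t_gen))
```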
> If that's the case and the current list comprehension syntax is
> better than the current crop of proposals, why even add (lazy
> list|iterator) comprehensions now? Just make do without them until
> Py3k and make all list comprehensions lazy at that point. There
> will be enough other bullets to bite that this shouldn't be a big
> deal (many programs will probably require significant rewriting
> anyway).

It's likely that generator expressions won't make it into Python 2.x for any x, just because of the effort to get the community to accept new syntax in general. --Guido van Rossum (home page: http://www.python.org/~guido/)
From pf_moore at yahoo.co.uk Tue Oct 21 16:30:41 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Tue Oct 21 16:50:48 2003 Subject: [Python-Dev] Re: prePEP: Money data type References: Message-ID: <3cdmwcy6.fsf@yahoo.co.uk>

"Tim Peters" writes:
> Meaning that there's an existing body of work that's already been informed
> by years of design debate (IBM's proposed decimal standard), and an involved
> Python implementation of that. What happens next depends on who can make
> time to do something next.

While I'm little more than an interested bystander, I'm not clear what *could* happen next. Can what's in nondist simply (!) be documented and migrated to the standard library? Is there a need for a C implementation (much like datetime started in Python but became C before release)? The module TODO comment just mentions "cleanup, hunt and kill bugs". So it certainly sounds like it's nearly there... Paul. -- This signature intentionally left blank
From guido at python.org Tue Oct 21 16:52:00 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:52:11 2003 Subject: [Python-Dev] prePEP: Money data type In-Reply-To: Your message of "Tue, 21 Oct 2003 17:05:15 EDT."
<200310211705.15094.fincher.8@osu.edu> References: <200310211705.15094.fincher.8@osu.edu> Message-ID: <200310212052.h9LKq1G25047@12-236-54-216.client.attbi.com> > > http://cvs.sf.net/viewcvs.py/python/python/nondist/sandbox/decimal/ > > Just out of curiosity, why isn't this distributed with Python? Because it's not seen any actual usage, is AFAIK undocumented, and one can have quibbles about the API. --Guido van Rossum (home page: http://www.python.org/~guido/) From tim.one at comcast.net Tue Oct 21 16:55:01 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 16:55:07 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> Message-ID: [Guido] > ... > BTW, while Alex has shown that a generator function with no free > variables runs quite fast, a generator expression that uses variables > from the surrounding scope will have to use the nested scopes > machinery to access those, unlike a list comprehension; not only does > this run slower, but it also slows down all other uses of that > variable in the surrounding scope (because it becomes a "cell" > throughout the scope). The implementation could synthesize a generator function abusing default arguments to give the generator's frame locals with the same names. From guido at python.org Tue Oct 21 16:55:37 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:55:51 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:19:33 EDT." References: Message-ID: <200310212055.h9LKtcC25068@12-236-54-216.client.attbi.com> > Well, calling it an iterator Aussonderungsaxiom would continue emphasizing > the wrong thing . > > "Set comprehensions" in a programming language originated with SETL, > and are named in honor of the set-theoretic Axiom of Comprehension > (Aussonderungsaxiom). 
In its well-behaved form, that says roughly > that given a set X, then for any predicate P(x), there exists a > subset of X whose elements consist of exactly those elements x of X > for which P(x) is true (in its ill-behaved form, it leads directly > to Russell's Paradox -- the set of all sets that don't contain > themselves). > > So "comprehension" emphasizes the "if" part of list comprehension > syntax, which often isn't the most interesting thing. More > interesting more often are (a) the computation done on the objects > gotten from the for-iterator, and (b) that the results are generated > one at a time. > > Put that all in a pot and stir, and the name "generator expression" > seems natural and useful to me. In the Icon language, *all* > expressions are generators, so maybe I'm biased by that. OTOH, "the > results are generated one at a time" is close to plain English, and > "generator expression" then brings to my mind an expression capable > of delivering a sequence of results. Thanks for an independent validation of "generator expressions"! It's a perfect term. > Or you could call it an Orlijn flourish. No, that term is already reserved for something else (the details of which I'll spare you, as they involve intimate details about toddler hygiene :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 16:57:03 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 16:57:16 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 15:23:15 CDT." 
<16277.38323.854588.570453@montanaro.dyndns.org> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310200944.30482.aleaxit@yahoo.com> <200310201430.h9KEUbD21012@12-236-54-216.client.attbi.com> <200310201745.36226.aleaxit@yahoo.com> <200310201637.h9KGbHU21287@12-236-54-216.client.attbi.com> <2m65iipzrd.fsf@starship.python.net> <200310212006.h9LK6jt24859@12-236-54-216.client.attbi.com> <16277.38323.854588.570453@montanaro.dyndns.org> Message-ID: <200310212057.h9LKv3425092@12-236-54-216.client.attbi.com>

> I thought the definition for list comprehension syntax was something like
>
> '[' <expression>
>     for <vars> in <iterable>
>     [ for <vars> in <iterable> ] *
>     [ if <condition> ] *
> ']'
>
> The loop <vars> in an earlier for clause should be visible in all nested for
> clauses and conditional clauses, not just in the first <expression>.

Absolutely, good point! --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 17:04:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 17:04:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:55:01 EDT." References: Message-ID: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com>

> [Guido]
> > ...
> > BTW, while Alex has shown that a generator function with no free
> > variables runs quite fast, a generator expression that uses variables
> > from the surrounding scope will have to use the nested scopes
> > machinery to access those, unlike a list comprehension; not only does
> > this run slower, but it also slows down all other uses of that
> > variable in the surrounding scope (because it becomes a "cell"
> > throughout the scope).

[Tim]
> The implementation could synthesize a generator function abusing default
> arguments to give the generator's frame locals with the same names.

Yes, I think that could work -- I see no way that something invoked by the generator expression could possibly modify a variable binding in the surrounding scope.
Argh, someone *could* pass around a copy of locals() and make an assignment into that. But I think we're already deprecating non-read-only use of locals(), so I'd like to ban that as abuse. --Guido van Rossum (home page: http://www.python.org/~guido/)
From aleaxit at yahoo.com Tue Oct 21 17:16:22 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 17:16:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> References: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> Message-ID: <200310212316.22749.aleaxit@yahoo.com>

On Tuesday 21 October 2003 10:46 pm, Guido van Rossum wrote:
> y = 1
> sum([x*y for x in R])
>
> fares compared to
>
> y = 1
> def gen():
>     for x in R: yield y*y
> sum(gen())

module a.py being:

    R = [range(N) for N in (10, 100, 10000)]

    def lc(R):
        y = 1
        sum([x*y for x in R])

    def gen1(R):
        y = 1
        def gen():
            for x in R: yield y*y
        sum(gen())

    def gen2(R):
        y = 1
        def gen(R=R, y=y):
            for x in R: yield y*y
        sum(gen())

I measure:

for N=10:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[0])'
100000 loops, best of 3: 12.3 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[0])'
100000 loops, best of 3: 10.4 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[0])'
100000 loops, best of 3: 9.7 usec per loop

for N=100:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[1])'
10000 loops, best of 3: 93 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[1])'
10000 loops, best of 3: 59 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen2(a.R[1])'
10000 loops, best of 3: 55 usec per loop

for N=10000:
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.lc(a.R[2])'
100 loops, best of 3: 9.4e+03 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import a' 'a.gen1(a.R[2])'
100 loops, best of 3: 5.6e+03 usec per loop
[alex@lancelot bo]$ timeit.py -c
-s'import a' 'a.gen2(a.R[2])'
100 loops, best of 3: 5.2e+03 usec per loop

I think it's well worth overcoming some "community resistance to new syntax" to get this kind of advantage easily. The trick of binding outer-scope variables as default args is neat but buys less than the pure idea of just using a generator rather than a list comprehension. Alex
From pedronis at bluewin.ch Tue Oct 21 17:33:30 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 17:31:09 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com> References: Message-ID: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch>

At 14:04 21.10.2003 -0700, Guido van Rossum wrote:
> > [Guido]
> > > ...
> > > BTW, while Alex has shown that a generator function with no free
> > > variables runs quite fast, a generator expression that uses variables
> > > from the surrounding scope will have to use the nested scopes
> > > machinery to access those, unlike a list comprehension; not only does
> > > this run slower, but it also slows down all other uses of that
> > > variable in the surrounding scope (because it becomes a "cell"
> > > throughout the scope).
>
>[Tim]
> > The implementation could synthesize a generator function abusing default
> > arguments to give the generator's frame locals with the same names.
>
>Yes, I think that could work -- I see no way that something invoked by
>the generator expression could possibly modify a variable binding in
>the surrounding scope.
so this, if I understand:

    def h():
        y = 0
        l = [1,2]
        it = (x+y for x in l)
        y = 1
        for v in it:
            print v

will print 1,2 and not 2,3 unlike:

    def h():
        y = 0
        l = [1,2]
        def gen(S):
            for x in S:
                yield x+y
        it = gen(l)
        y = 1
        for v in it:
            print v

From allison at sumeru.stanford.EDU Tue Oct 21 17:33:49 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Tue Oct 21 17:34:06 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc/lib libplatform.tex, 1.1, 1.2 In-Reply-To: <200310211946.h9LJkKP24720@12-236-54-216.client.attbi.com> Message-ID:

Or Brewster Kahle's web archive, www.archive.org

On Tue, 21 Oct 2003, Guido van Rossum wrote:
> > > - comment out the reference to a MS KnowledgeBase article that doesn't
> > > seem to be present at msdn.microsoft.com; hopefully someone can
> > > point out an alternate source for the relevant information
>
> Bizarre. It seems MS has removed all traces of that article; I found
> lots of pointers to it in Google but they all point to the same dead
> link. Google's cache is your best bet...
>
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From pje at telecommunity.com Tue Oct 21 17:57:39 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 17:57:42 2003 Subject: [Python-Dev] locals() (was Re: accumulator display syntax) In-Reply-To: <200310212104.h9LL4oH25172@12-236-54-216.client.attbi.com> References: Message-ID: <5.1.1.6.0.20031021174739.01f60e00@telecommunity.com>

At 02:04 PM 10/21/03 -0700, Guido van Rossum wrote:
>Argh, someone *could* pass around a copy of locals() and make an
>assignment into that.

Not when the locals() is that of a CPython function, and I expect the same is true of Jython functions.
> But I think we're already deprecating
>non-read-only use of locals(), so I'd like to ban that as abuse.

FWIW, both Zope 3 and PEAK currently make use of 'locals()' (actually, sys._getframe()) to modify locals of a class or module scope (i.e. non-functions). For both class and module scopes, it seems to be implied by the language definition that the local namespace is the __dict__ of the corresponding object. So, is this deprecated usage for class and module objects too?
From python at rcn.com Tue Oct 21 17:59:56 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 18:00:43 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310211936.h9LJaeO24644@12-236-54-216.client.attbi.com> Message-ID: <001001c3981e$aa78f340$e841fea9@oemcomputer>

[Guido]
> I expect that most iterator comprehensions (we need a better term!)
> are not stored in a variable but passed as an argument to something
> that takes an iterable, e.g.
>
> sum(len(line) for line in file if line.strip())

That is somewhat beautiful. So, I drop my request for bracketed yields and throw my tiny weight behind this idea for an iterator expression.

> We can quibble about whether double parentheses are needed

I vote for not requiring the outer parentheses unless there is an adjacent comma. Requiring them would unnecessarily complicate the simple, elegant proposal. Otherwise, I would anticipate frequent questions to the help list or tutor list on why something coded like your example doesn't work. Also, the double paren form just looks funny, like there is something wrong with it but you can't tell what.

Timing
------

Based on the extensive comp.lang.python discussions when I first floated a PEP on the subject, I conclude that the user community will very much accept the new form and that there is no reason to not include it in Py2.4.
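Guido's sum() example above runs unchanged under the syntax that was eventually adopted; a small self-contained check (the lines list is invented sample data standing in for the open file):

```python
# Sample data standing in for the open file in the example.
lines = ["first line\n", "   \n", "second\n", "\n", "third line here\n"]

# Total length of the non-blank lines, via a generator expression
# passed directly to sum() -- no extra parentheses needed.
total = sum(len(line) for line in lines if line.strip())
assert total == len("first line\n") + len("second\n") + len("third line here\n")
```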
If there is any doubt on that score, I would be happy to update the PEP to match the current proposal for iterator expressions and solicit more community feedback. Raymond Hettinger
From barry at python.org Tue Oct 21 18:07:22 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 18:08:17 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <001001c3981e$aa78f340$e841fea9@oemcomputer> References: <001001c3981e$aa78f340$e841fea9@oemcomputer> Message-ID: <1066774041.5750.255.camel@anthem>

On Tue, 2003-10-21 at 17:59, Raymond Hettinger wrote:
> [Guido]
> > I expect that most iterator comprehensions (we need a better term!)
> > are not stored in a variable but passed as an argument to something
> > that takes an iterable, e.g.
> >
> > sum(len(line) for line in file if line.strip())
>
> That is somewhat beautiful.

Indeed, as is the term "generator expression" and the relegation to syntactic sugar of list comprehensions.

> > We can quibble about whether double parentheses are needed
>
> I vote for not requiring the outer parentheses unless there is an
> adjacent comma.

I like that too. It mirrors other situations where the parentheses aren't needed except to disambiguate syntax. In the above example, there's no ambiguity. -Barry
From guido at python.org Tue Oct 21 18:11:17 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:11:23 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:16:22 +0200."
<200310212316.22749.aleaxit@yahoo.com> References: <5.1.1.6.0.20031021155150.0240c6b0@telecommunity.com> <200310212046.h9LKkgp25011@12-236-54-216.client.attbi.com> <200310212316.22749.aleaxit@yahoo.com> Message-ID: <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com>

> I think it's well worth overcoming some "community resistance
> to new syntax" to get this kind of advantage easily. The trick
> of binding outer-scope variables as default args is neat but
> buys less than the pure idea of just using a generator rather
> than a list comprehension.

Thanks for the measurements! Is someone interested in writing up a PEP and taking it to the community? Or do I have to do it myself (and risk another newsgroup meltdown)? --Guido van Rossum (home page: http://www.python.org/~guido/)
From guido at python.org Tue Oct 21 18:14:19 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:14:27 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:33:30 +0200." <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com>

> >[name withheld]
> > > The implementation could synthesize a generator function abusing default
> > > arguments to give the generator's frame locals with the same names.

[Guido]
> >Yes, I think that could work -- I see no way that something invoked by
> >the generator expression could possibly modify a variable binding in
> >the surrounding scope.

[Samuele]
> so this, if I understand:
>
>     def h():
>         y = 0
>         l = [1,2]
>         it = (x+y for x in l)
>         y = 1
>         for v in it:
>             print v
>
> will print 1,2 and not 2,3
>
> unlike:
>
>     def h():
>         y = 0
>         l = [1,2]
>         def gen(S):
>             for x in S:
>                 yield x+y
>         it = gen(l)
>         y = 1
>         for v in it:
>             print v

Argh. Of course.

No, I think it should use the actual value of y, just like a nested function.

Never mind that idea then.
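For the record, PEP 289 as eventually accepted sided with the behavior of Samuele's second snippet: only the outermost iterable of a generator expression is evaluated eagerly, while free variables such as y are looked up when the generator runs. A quick check in modern Python (with print v replaced by returning a list):

```python
def h():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # l is evaluated here; y is looked up lazily
    y = 1
    return list(it)

# The generator sees the rebound y == 1, matching the nested-function version.
assert h() == [2, 3]
```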
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 18:18:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:18:37 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 17:59:56 EDT." <001001c3981e$aa78f340$e841fea9@oemcomputer> References: <001001c3981e$aa78f340$e841fea9@oemcomputer> Message-ID: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> > I vote for not requiring the outer parentheses unless there is an > adjacent comma. That would unnecessarily complicate the simple, > elegant proposal. > > Otherwise, I would anticipate frequent questions to the help list > or tutor list on why something coded like your example doesn't work. > > Also, the double paren form just looks funny, like there is something > wrong with it but you can't tell what. OK. I think I can pull it off in the Grammar. > Timing > ------ > > Based on the extensive comp.lang.python discussions when I first > floated a PEP on the subject, I conclude that the user community > will very much accept the new form and that there is no reason > to not include it in Py2.4. > > If there is any doubt on that score, I would be happy to update > the PEP to match the current proposal for iterator expressions > and solicit more community feedback. Wonderful! Rename PEP 289 to "generator expressions" and change the contents to match this proposal. Thanks for being the fall guy! --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 18:25:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:25:44 2003 Subject: [Python-Dev] locals() (was Re: accumulator display syntax) In-Reply-To: Your message of "Tue, 21 Oct 2003 17:57:39 EDT." 
<5.1.1.6.0.20031021174739.01f60e00@telecommunity.com> References: <5.1.1.6.0.20031021174739.01f60e00@telecommunity.com> Message-ID: <200310212225.h9LMPSv25371@12-236-54-216.client.attbi.com> > >Argh, someone *could* pass around a copy of locals() and make an > >assignment into that. > > Not when the locals() is that of a CPython function, and I expect the same > is true of Jython functions. Well, the effect is undefined; there may be things you can do that would force the changes out to the real local variables. > > But I think we're already deprecating > >non-read-only use of locals(), so I'd like to ban that as abuse. > > FWIW, both Zope 3 and PEAK currently make use of 'locals()' > (actually, sys._getframe()) to modify locals of a class or module > scope (i.e. non-functions). For both class and module scopes, it > seems to be implied by the language definition that the local > namespace is the __dict__ of the corresponding object. > > So, is this deprecated usage for class and module objects too? It isn't. I'm not sure it shouldn't be; at some point it might be attractive to lock down the namespace of certain modules and classes, and in fact new-style classes already attempt to lock down their __dict__. Fortunately the __dict__ you see when executing a function during the class definition phase is not the class dict; the class dict is a copy of it taken by the class creation code. 
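Guido's closing observation (that the namespace a class body executes in is copied, not shared, at class-creation time) still holds in modern CPython; a minimal sketch:

```python
class C:
    x = 1
    ns = locals()  # the namespace the class body is executing in

# The class dict is a snapshot taken when the class is created, so
# mutating the namespace object the body saw does not affect the class.
C.ns["y"] = 2
assert C.x == 1
assert not hasattr(C, "y")
```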
--Guido van Rossum (home page: http://www.python.org/~guido/)
From walter at livinglogic.de Tue Oct 21 18:28:51 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 18:29:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> Message-ID: <3F95B323.9010405@livinglogic.de>

Guido van Rossum wrote:
> [...]
> Thanks for trying to bang some sense into this.
>
> Personally, I still like the idea best to make
>
> (x for x in S)
>
> be an iterator comprehension
>
> and
>
> [x for x in S]
>
> syntactic sugar for the common operation
>
> list((x for x in S))

Would this mean: [x for x in S] is a list comprehension and [(x for x in S)] is a list containing one generator expression? Bye, Walter Dörwald
From guido at python.org Tue Oct 21 18:31:58 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:32:11 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 00:28:51 +0200." <3F95B323.9010405@livinglogic.de> References: <200310200008.h9K08VE23310@oma.cosc.canterbury.ac.nz> <200310210409.h9L49Bu23354@12-236-54-216.client.attbi.com> <16277.15186.392757.583785@montanaro.dyndns.org> <200310211749.21152.aleaxit@yahoo.com> <16277.24592.805548.835843@montanaro.dyndns.org> <200310212002.h9LK21624815@12-236-54-216.client.attbi.com> <3F95B323.9010405@livinglogic.de> Message-ID: <200310212231.h9LMVwR25409@12-236-54-216.client.attbi.com>

> Would this mean:
> [x for x in S] is a list comprehension and
> [(x for x in S)] is a list containing one generator expression?

Yes.
(Raymond, you might mention this in the PEP.) --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Tue Oct 21 18:43:49 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 18:41:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> At 15:14 21.10.2003 -0700, Guido van Rossum wrote: > > >[name withheld] > > > > The implementation could synthesize a generator function abusing > default > > > > arguments to give the generator's frame locals with the same names. > >[Guido] > > >Yes, I think that could work -- I see no way that something invoked by > > >the generator expression could possibly modify a variable binding in > > >the surrounding scope. > >[Samuele] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > >Argh. Of course. > >No, I think it should use the actual value of y, just like a nested >function. > >Never mind that idea then. this is a bit OT and too late, but given that our closed over variables are read-only, I'm wondering whether, having a 2nd chance, using cells and following mutations in the enclosing scopes is really worth it, we kind of mimic Scheme and relatives but there outer scope variables are also rebindable. Maybe copying semantics not using cells for our closures would not be too insane, and people would not be burnt by trying things like this: for msg in msgs: def onClick(e): print msg panel.append(Button(msg,onClick=onClick)) which obviously doesn't do what one could expect today. 
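For the record, the behaviour Guido argues for here is what PEP 289 finally specified: only the outermost iterable is evaluated immediately, and free variables such as y are looked up when the generator actually runs. In today's Python, Samuele's first h() therefore behaves exactly like his second:

```python
def h():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)   # l is evaluated now; y is looked up lazily
    y = 1
    return list(it)

print(h())    # [2, 3]: the rebinding of y is seen, just as with a nested function
```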
OTOH as for general mutability, using a mutable object (list,...) would allow for mutability when one really need it (rarely). From pedronis at bluewin.ch Tue Oct 21 18:49:23 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 18:47:08 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> References: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031022004621.027cd230@pop.bluewin.ch> At 00:43 22.10.2003 +0200, Samuele Pedroni wrote: >this is a bit OT and too late, but given that our closed over variables >are read-only, I'm wondering whether, having a 2nd chance, using cells and >following mutations in the enclosing scopes is really worth it, we kind of >mimic Scheme and relatives but there outer scope variables are also >rebindable. Maybe copying semantics not using cells for our closures would >not be too insane, and people would not be burnt by trying things like this: > >for msg in msgs: > def onClick(e): > print msg > panel.append(Button(msg,onClick=onClick)) > >which obviously doesn't do what one could expect today. OTOH as for >general mutability, using a mutable object (list,...) would allow for >mutability when one really need it (rarely). of course OTOH cells make it easier to cope with recursive references: def g(): def f(x): ... f refers to f ... return f but this seem more an implementation detail, although not using cells would make this rather trickier to support. From guido at python.org Tue Oct 21 18:51:59 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 18:52:07 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 00:43:49 +0200." 
<5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> Message-ID: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> [Changing the subject.] [Samuele] > this is a bit OT and too late, but given that our closed over > variables are read-only, I'm wondering whether, having a 2nd chance, > using cells and following mutations in the enclosing scopes is > really worth it, we kind of mimic Scheme and relatives but there > outer scope variables are also rebindable. Maybe copying semantics > not using cells for our closures would not be too insane, and people > would not be burnt by trying things like this: > > for msg in msgs: > def onClick(e): > print msg > panel.append(Button(msg,onClick=onClick)) > > which obviously doesn't do what one could expect today. OTOH as for > general mutability, using a mutable object (list,...) would allow > for mutability when one really need it (rarely). It was done this way because not everybody agreed that closed-over variables should be read-only, and the current semantics allow us to make them writable (as in Scheme, I suppose?) if we can agree on a syntax to declare an "intermediate scope" global. Maybe "global x in f" would work? 
def outer(): x = 1 def intermediate(): x = 2 def inner(): global x in outer x = 42 inner() print x # prints 2 intermediate() print x # prints 42 --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 21 19:05:07 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:05:22 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> References: <200310212316.22749.aleaxit@yahoo.com> <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> Message-ID: <200310220105.08017.aleaxit@yahoo.com> On Wednesday 22 October 2003 00:11, Guido van Rossum wrote: > > I think it's well worth overcoming come "community resistance > > to new syntax" to get this kind of advantage easily. The trick > > of binding outer-scope variables as default args is neat but > > buys less than the pure idea of just using a generator rather > > than a list comprehension. > > Thanks for the measurements! > > Is someone interested in writing up a PEP and taking it to the > community? Or do I have to do it myself (and risk another newsgroup > meltdown)? I'm interested, if it can wait until next week (in a few hours I'm flying off for a trip and I won't even have my laptop along). What's the procedure for requesting a PEP number, again? Alex From walter at livinglogic.de Tue Oct 21 19:06:30 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:06:38 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <3F95BBF6.1090900@livinglogic.de> Guido van Rossum wrote: > [...] > Maybe "global x in f" would work? 
> > def outer(): > x = 1 > def intermediate(): > x = 2 > def inner(): > global x in outer > x = 42 > inner() > print x # prints 2 > intermediate() > print x # prints 42 Why not make local variables attributes of the function, i.e. replace: def inner(): global x in outer x = 42 with: def inner(): outer.x = 42 Global variables could then be assigned via: global.x = 42 Could this be made backwards compatible? Bye, Walter Dörwald From guido at python.org Tue Oct 21 19:07:43 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:07:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 01:05:07 +0200." <200310220105.08017.aleaxit@yahoo.com> References: <200310212316.22749.aleaxit@yahoo.com> <200310212211.h9LMBH925278@12-236-54-216.client.attbi.com> <200310220105.08017.aleaxit@yahoo.com> Message-ID: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> > > Is someone interested in writing up a PEP and taking it to the > > community? Or do I have to do it myself (and risk another newsgroup > > meltdown)? > > I'm interested, if it can wait until next week (in a few hours I'm > flying off for a trip and I won't even have my laptop along). What's > the procedure for requesting a PEP number, again? Raymond is going to give PEP 289 an overhaul. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:09:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:09:37 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:06:30 +0200." <3F95BBF6.1090900@livinglogic.de> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <3F95BBF6.1090900@livinglogic.de> Message-ID: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> > Why not make local variables attributes of the function, i.e.
> replace: > > def inner(): > global x in outer > x = 42 > > with: > > def inner(): > outer.x = 42 Because this already means something! outer.x refers to the attribute x of function outer. That's quite different than local variable x of the most recent invocation of outer on the current thread's call stack! > Global variables could then be assigned via: > global.x = 42 This has a tiny bit of appeal, but not enough to bother. --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Tue Oct 21 19:11:44 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:11:52 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> > From: Jp Calderone [mailto:exarkun@intarweb.us.avaya.com] > > Not when x is properly initialized. Anyway, this is no > different from the > problem of: > > for x in R: > ... > print x For which reason I propose that Python 3.0 have the control name in any for expression be "local" to the expression ;) Hmm - actually this does raise another issue. >>> x = 1 >>> y = [1, 2, 3] >>> y = [x for x in y] Using the current semantics: >>> print x 3 Using the new semantics: >>> print x 1 Is this a problem? Are the new semantics going to cause confusion? Tim Delaney From pje at telecommunity.com Tue Oct 21 19:13:43 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 21 19:13:49 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212214.h9LMEJB25302@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> At 03:14 PM 10/21/03 -0700, Guido van Rossum wrote: >[Samuele] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > >Argh. Of course. > >No, I think it should use the actual value of y, just like a nested >function. Why? >Never mind that idea then. Actually, I consider Samuele's example a good argument in *favor* of the idea. Because of the similarity between listcomps and generator expressions (gen-X's? ;) ) it seems late binding of locals would lead to people thinking the behavior is a bug. Since a genex is not a function (at least in form) a late binding would be very non-obvious and counterintuitive relative to other kinds of expressions. From tdelaney at avaya.com Tue Oct 21 19:15:46 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:15:51 2003 Subject: [Python-Dev] Re: buildin vs. shared modules Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF21E@au3010avexu1.global.avaya.com> > From: Paul Moore [mailto:pf_moore@yahoo.co.uk] > > But in the absence of anyone who is working on a Mingw build, this is > pretty much irrelevant... Well, Gerhard has periodically worked on getting Mingw to work. I've had a quick go myself, but don't know the ins and outs enough. I would like Mingw to work, as I don't have access to MSVC at home, and don't have time to work on Python at work :( Since this is definitely MSVC-specific, I think it should be in an #ifdef block. 
Other Windows implementations (Mingw, etc) would not get the delay loading. Tim Delaney From aleaxit at yahoo.com Tue Oct 21 19:21:52 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:22:00 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <200310220121.52789.aleaxit@yahoo.com> On Wednesday 22 October 2003 00:51, Guido van Rossum wrote: ... > Maybe "global x in f" would work? Actually, I would rather like to DO AWAY with the anomalous 'global' statement and its weird anomalies such as: x = 23 def f1(u): if u: global x x = 45 def f2(): if 0: global x x = 45 print x f2() print x f1(0) print x "if u:" when u is 0, and "if 0:", should have the same effect to avoid violating the least-astonishment rule -- but when the if's body has a global in it, they don't. Eeek. Plus. EVERY newbie makes the mistake of taking "global" to mean "for ALL modules" rather than "for THIS module", uselessly using global in toplevel, etc. It's a wart and I'd rather work to remove it than to expand it, even though I _would_ like rebindable outers. I'd rather have a special name that means "this module" available for import (yes, I can do that with an import hook today). Say that __this_module__ was deemed acceptable for this. Then, import __this_module__ __this_module__.x = 23 lets me rebind the global-to-this-module variable x without 'global' and its various ills. Yeah, the name isn't _too_ cool. But I like the idea, and when I bounced it experimentally in c.l.py a couple weeks ago the reaction was mildly positive and without flames. Making globals a TAD less handy to rebind from within a function would not be exactly bad, either. (Of course 'global' would stay until 3.0 at least, but having an alternative I could explain it as obsolescent:-). 
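Alex's `__this_module__` needs no import hook in modern Python: `sys.modules` already maps `__name__` to the module object currently executing, and attribute assignment on it rebinds a module global. A sketch of the idea:

```python
import sys

x = 23

def rebind():
    this_module = sys.modules[__name__]   # the module currently executing
    this_module.x = 45                    # rebinds module-level x, no 'global' needed

rebind()
print(x)    # 45
```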
Extending this idea (perhaps overstretching it), some other name "special for import" might indicate outer scopes. Though reserving the whole family of names __outer___ is probably overdoing it... plus, the object thus 'imported' would not be a module and would raise errors if you tried setattr'ing in it a name that's NOT a local variable of (the import itself would fail if you were not lexically nested inside a function called ). Thus this would allow *re-binding* existing local outer names but not *adding* new ones, which feels just fine to me (but maybe not to all). OK, this is 1/4-baked for the closure issue. BUT -- I'd STILL love to gradually ease 'global' out, think the "import __this_module__" idea is 3/4-baked (lacks a good special name...), and would hate to see 'global' gain a new lease of life for sophisticated uses...;-) Alex From aleaxit at yahoo.com Tue Oct 21 19:26:58 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:27:04 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> References: <200310220105.08017.aleaxit@yahoo.com> <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com> Message-ID: <200310220126.59005.aleaxit@yahoo.com> On Wednesday 22 October 2003 01:07, Guido van Rossum wrote: > > > Is someone interested in writing up a PEP and taking it to the > > > community? Or do I have to do it myself (and risk another newsgroup > > > meltdown)? > > > > I'm interested, if it can wait until next week (in a few hours I'm > > flying off for a trip and I won't even have my laptop along). What's > > the procedure for requesting a PEP number, again? > > Raymond is going to give PEP 289 an overhaul. Wonderful! Much the best idea. Alex From guido at python.org Tue Oct 21 19:27:28 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:27:42 2003 Subject: [Python-Dev] listcomps vs. 
for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 09:11:44 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> Message-ID: <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> > > for x in R: > > ... > > print x > > For which reason I propose that Python 3.0 have the control name in > any for expression be "local" to the expression ;) What expression? If you're talking about making x = None for x in R: pass print x # last item of R illegal, forget it. That's too darn useful. > Hmm - actually this does raise another issue. > > >>> x = 1 > >>> y = [1, 2, 3] > >>> y = [x for x in y] > > Using the current semantics: > > >>> print x > 3 > > Using the new semantics: > > >>> print x > 1 > > Is this a problem? Are the new semantics going to cause confusion? No, and no; we already went over this (but I don't blame you for not reading every msg in this thread :-). It does mean that we have to start issuing proper deprecation warnings, and maybe we won't be able to properly fix the LC scope thing before 3.0. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:30:50 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:31:00 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:13:43 EDT." <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> Message-ID: <200310212330.h9LNUop25640@12-236-54-216.client.attbi.com> > Actually, I consider Samuele's example a good argument in *favor* of > the idea. Because of the similarity between listcomps and generator > expressions (gen-X's? ;) ) it seems late binding of locals would > lead to people thinking the behavior is a bug. 
Since a genex is not > a function (at least in form) a late binding would be very > non-obvious and counterintuitive relative to other kinds of > expressions. Hm. We do late binding of globals. Why shouldn't we do late binding of locals? There are lots of corners of the language where if you expect something else the actual behavior feels like a bug, until someone explains it to you. That's no reason to compromise. It's an opportunity for education about scopes! --Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Tue Oct 21 19:39:10 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 21 19:38:01 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <20031021233910.GA2091@mems-exchange.org> Guido: > Personally, I still like the idea best to make > > (x for x in S) > > be an iterator comprehension > > and > > [x for x in S] > > syntactic sugar for the common operation > > list((x for x in S)) FWIW, that's enough to switch my vote for generator expressions from -0 to +0. If they work this way then there is essentially no extra complexity in the language. It's important to look at things from the perspective of a new Python programmer, I think. Another nice thing is that we have tuple and dict comprehensions for free: tuple(x for x in S) dict((k, v) for k, v in S) Set(x for x in S) Aside from the bit of syntactic sugar, everything is nice and regular.
Neil From walter at livinglogic.de Tue Oct 21 19:38:55 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:39:00 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <3F95BBF6.1090900@livinglogic.de> <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> Message-ID: <3F95C38F.4040201@livinglogic.de> Guido van Rossum wrote: >>Why not make local variables attributes of the function, i.e. >>replace: >> >> def inner(): >> global x in outer >> x = 42 >> >>with: >> >> def inner(): >> outer.x = 42 > > > Because this already means something! outer.x refers to the attribute > x of function outer. That's quite different than local variable x of > the most recent invocation of outer on the current thread's call stack! I guess unifying them both (somewhat like the instance attribute lookup rule) won't work. >>Global variables could then be assigned via: >> global.x = 42 > > > This has a tiny bit of appeal, but not enough to bother. Bye, Walter Dörwald From tdelaney at avaya.com Tue Oct 21 19:39:03 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:39:12 2003 Subject: [Python-Dev] listcomps vs. for loops Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com> > From: Guido van Rossum [mailto:guido@python.org] > > > > for x in R: > > > ... > > > print x > > > > For which reason I propose that Python 3.0 have the control name in > > any for expression be "local" to the expression ;) > > What expression? Sorry - I meant statement. > If you're talking about making > > x = None > for x in R: pass > print x # last item of R > > illegal, forget it. That's too darn useful.
Note the winking smiley above :) Although I do find the scope limiting in: for (int i=0; i < 10; ++i) { } to be a nice feature of C++ (good god - did I just say that?) and hate that the implementation in MSVC is broken and the control variable leaks. > No, and no; we already went over this (but I don't blame you for not > reading every msg in this thread :-). It does mean that we have to > start issuing proper deprecation warnings, and maybe we won't be able > to properly fix the LC scope thing before 3.0. Yeah - I realised later that the discussion was hidden in the accumulator syntax thread. I definitely wouldn't find it confusing, but I've been a proponent of not leaking the control variable all along :) Tim Delaney From guido at python.org Tue Oct 21 19:40:34 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:40:47 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:21:52 +0200." <200310220121.52789.aleaxit@yahoo.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> Message-ID: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> > Actually, I would rather like to DO AWAY with the anomalous 'global' > statement and its weird anomalies such as: > > x = 23 > > def f1(u): > if u: > global x > x = 45 > > def f2(): > if 0: > global x > x = 45 > > print x > f2() > print x > f1(0) > print x > > "if u:" when u is 0, and "if 0:", should have the same effect to avoid > violating the least-astonishment rule -- but when the if's body has > a global in it, they don't. Eeek. Eek. Global statement inside flow control should be deprecated, not abused to show that global is evil. :-) > Plus. EVERY newbie makes the mistake of taking "global" to mean > "for ALL modules" rather than "for THIS module", Only if they've been exposed to languages that have such globals. 
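The anomaly Alex and Guido are discussing is still present today: `global` is a compile-time declaration that covers the whole function body regardless of control flow, so an `if 0:` branch makes no difference. A minimal sketch:

```python
x = 23

def f2():
    if 0:
        global x    # never reached at run time, yet it still applies to all of f2
    x = 45

f2()
print(x)    # 45, because the assignment in f2 rebound the module-level x
```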
> uselessly using global in toplevel, Which the parser should reject. > etc. It's a wart and I'd rather work to remove it than to expand > it, even though I _would_ like rebindable outers. > > I'd rather have a special name that means "this module" available > for import (yes, I can do that with an import hook today). Say that > __this_module__ was deemed acceptable for this. Then, > import __this_module__ > __this_module__.x = 23 > lets me rebind the global-to-this-module variable x without 'global' > and its various ills. Yeah, the name isn't _too_ cool. But I like the > idea, and when I bounced it experimentally in c.l.py a couple weeks > ago the reaction was mildly positive and without flames. Making > globals a TAD less handy to rebind from within a function would > not be exactly bad, either. (Of course 'global' would stay until 3.0 > at least, but having an alternative I could explain it as obsolescent:-). I think it's not unreasonable to want to replace global with attribute assignment of *something*. I don't think that "something" should have to be imported before you can use it; I don't even think it deserves to have leading and trailing double underscores. Walter suggested 'global.x = 23' which looks reasonable; unfortunately my parser can't do this without removing the existing global statement from the Grammar: after seeing the token 'global' it must be able to make a decision about whether to expand this to a global statement or an assignment without peeking ahead, and that's impossible. > Extending this idea (perhaps overstretching it), some other name > "special for import" might indicate outer scopes. Though reserving > the whole family of names __outer___ is probably overdoing > it... plus, the object thus 'imported' would not be a module and would > raise errors if you tried setattr'ing in it a name that's NOT a local > variable of (the import itself would fail if you were not lexically > nested inside a function called ). 
Thus this would allow > *re-binding* existing local outer names but not *adding* new ones, > which feels just fine to me (but maybe not to all). > > OK, this is 1/4-baked for the closure issue. BUT -- I'd STILL love > to gradually ease 'global' out, think the "import __this_module__" > idea is 3/4-baked (lacks a good special name...), and would hate > to see 'global' gain a new lease of life for sophisticated uses...;-) If we removed global from the language, how would you spell assignment to a variable in an outer function scope? Remember, you can *not* use 'outer.x' because that already refers to a function attribute. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 19:42:20 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 19:42:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 16:39:10 PDT." <20031021233910.GA2091@mems-exchange.org> References: <20031021233910.GA2091@mems-exchange.org> Message-ID: <200310212342.h9LNgKa25725@12-236-54-216.client.attbi.com> > FWIW, that's enough to switch my vote for generator expressions from > -0 to +0. Thanks for the support! I value your judgement. > If they work this way then there is essentially no extra > complexity in the language. It's important to look at things from > the perspective of a new Python programmer, I think. > > Another nice thing is that we have tuple and dict comprehensions > for free: > > tuple(x for x in S) > dict((k, v) for k, v in S) > Set(x for x in S) Yes, this is nice. > Aside from the bit of syntactic sugar, everything is nice an > regular. Exactly. We should thank Peter Norvig for starting this discussion! 
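Neil's constructor-based spellings all work in today's Python, and two of them later grew dedicated literal syntax: set and dict comprehensions, added in Python 3.0 and backported to 2.7. A sketch:

```python
S = [("a", 1), ("b", 2)]

t = tuple(x for x in S)       # tuple() consuming a generator expression
d = {k: v for k, v in S}      # dict comprehension literal
s = {v for k, v in S}         # set comprehension literal

print(t)             # (('a', 1), ('b', 2))
print(d)             # {'a': 1, 'b': 2}
print(s == {1, 2})   # True
```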
--Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Tue Oct 21 19:42:41 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 19:43:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <1066735096.18849.33.camel@straylight> Message-ID: <200310212342.h9LNgfb10069@oma.cosc.canterbury.ac.nz> Mark Russell : > The argument for it is that walking over a dictionary in sorted order > is (at least to me) a missing idiom in python. Does this never come > up when you're teaching the language? Maybe dicts should have a .sortedkeys() method? The specialised method name would help stave off any temptation to add varied sort methods to other types. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From pje at telecommunity.com Tue Oct 21 19:47:58 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 19:48:01 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212330.h9LNUop25640@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> Message-ID: <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> At 04:30 PM 10/21/03 -0700, Guido van Rossum wrote: > > Actually, I consider Samuele's example a good argument in *favor* of > > the idea. Because of the similarity between listcomps and generator > > expressions (gen-X's? ;) ) it seems late binding of locals would > > lead to people thinking the behavior is a bug. Since a genex is not > > a function (at least in form) a late binding would be very > > non-obvious and counterintuitive relative to other kinds of > > expressions. > >Hm. We do late binding of globals. 
Why shouldn't we do late binding >of locals? Wha? Oh, you mean in a function. But that's what I'm saying, it's *not* a function. Sure, it's implemented as one under the hood, but it doesn't *look* like a function. In any normal (non-lambda) expression, whether a variable is local or global, its value is retrieved immediately. Also, even though there's a function under the hood, that function is *called* and its value returned immediately. This seems consistent with an immediate binding of parameters. > There are lots of corners or the language where if you >expect something else the actual behavior feels like a bug, until >someone explains it to you. That's no reason to compromise. It's an >opportunity for education about scopes! So far, I haven't seen you say any reason why the "arguments" approach is bad, or why the "closure" approach is good. Both are certainly Pythonic in some circumstances, but why do you feel that one is better than the other, here? I will state one pragmatic reason for using the default arguments approach: code converted from using a listcomp to a genex can immediately have bugs as a result of rebinding a local. Those bugs won't happen if rebinding the local has no effect on the genex's evaluation. (Obviously, an aliasing problem can still be created if one modifies a mutable used in the genex, but there's no way to remove that possibility and still end up with a lazy iterator.) Given that one of the big arguments in favor of genexes is to make "upgrading" from listcomps easy, it shouldn't fail so quickly and obviously. E.g., converting from: x = {} for i in range(10): x[i] = [y^i for y in range(10)] to: x = {} for i in range(10): x[i] = (y^i for y in range(10)) Shouldn't result in all of x's elements iterating over the same values! 
From FBatista at uniFON.com.ar Tue Oct 21 15:50:32 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 21 19:49:37 2003 Subject: [Python-Dev] prePEP: Money data type Message-ID: Tim Peters wrote: #- Meaning that there's an existing body of work that's already #- been informed #- by years of design debate (IBM's proposed decimal standard), #- and an involved #- Python implementation of that. What happens next depends on #- who can make #- time to do something next. I'm urged to have a Money data type, but I'll see if I can get it through Decimal, improving/fixing/extending Decimal and saving effort at the same time. . Facundo From greg at cosc.canterbury.ac.nz Tue Oct 21 19:49:48 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 19:50:53 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <2m65iipzrd.fsf@starship.python.net> Message-ID: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> Michael Hudson : > In particular what happens if the iteration variable is a local in the > frame anyway? I presume that would inhibit the renaming Why? > but then code like > > def f(x): > r = [x+1 for x in range(x)] > return r, x > > becomes even more incomprehensible (and changes in behaviour). Anyone who writes code like that *deserves* to have the behaviour changed on them! If this is really a worry, an alternative would be to simply forbid using a name for the loop variable that's used for anything else outside the loop. That could break existing code too, but at least it would break it in a very obvious way by making it fail to compile. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From walter at livinglogic.de Tue Oct 21 19:51:05 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:51:10 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <3F95C669.4080706@livinglogic.de> Guido van Rossum wrote: > [...] > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. Couldn't this be solved by making 'global.' a token? Should {get|has}attr(global, 'foo') be possible? Bye, Walter Dörwald From tdelaney at avaya.com Tue Oct 21 19:53:01 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 21 19:53:10 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DECFF246@au3010avexu1.global.avaya.com> > From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz] > > Maybe dicts should have a .sortedkeys() method? The specialised > method name would help stave off any temptation to add varied sort > methods to other types. -1. I think that: d = {1: 2, 3: 4} for i in list.sorted(d): print i or d = {1: 2, 3: 4} for i in list.sorted(d.iterkeys()): print i looks very clean and unambiguous. And is even better at staving off temptation to add varied sort methods to other types.
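[Editor's sketch: what eventually shipped, as the builtin sorted() in Python 2.4 and later rather than a list classmethod, covers this dict use case directly.]

```python
d = {3: 4, 1: 2}

# sorted() accepts any iterable and returns a new list, so a dict's
# keys come out sorted without needing a .sortedkeys() method.
assert sorted(d) == [1, 3]

# Key/value pairs can be sorted the same way.
assert sorted(d.items()) == [(1, 2), (3, 4)]
```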
Tim Delaney From walter at livinglogic.de Tue Oct 21 19:57:20 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Oct 21 19:57:25 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <3F95C7E0.4030608@livinglogic.de> Guido van Rossum wrote: > [...] > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. Another idea: We could replace the function globals() with an object that provides __call__ for backwards compatibility, but also has a special __setattr__. Then global assignment would be 'globals.x = 23'. Would this be possible? Bye, Walter Dörwald From aleaxit at yahoo.com Tue Oct 21 19:58:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 21 19:58:27 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <200310220158.21389.aleaxit@yahoo.com> On Wednesday 22 October 2003 01:40, Guido van Rossum wrote: ... > Eek. Global statement inside flow control should be deprecated, not > abused to show that global is evil. :-) OK, let's (deprecate them), shall we...? > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > "for ALL modules" rather than "for THIS module", > > Only if they've been exposed to languages that have such globals.
Actually, I've seen that happen to complete newbies too. "global" is a VERY strong word -- or at least perceived as such. > > uselessly using global in toplevel, > > Which the parser should reject. Again: can we do that in 2.4? > I think it's not unreasonable to want to replace global with attribute > assignment of *something*. I don't think that "something" should have > to be imported before you can use it; I don't even think it deserves > to have leading and trailing double underscores. Using attribute assignment is my main drive here. I was doing it via import only to be able to experiment with that in today's Python;-). > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > my parser can't do this without removing the existing global statement > from the Grammar: after seeing the token 'global' it must be able to > make a decision about whether to expand this to a global statement or > an assignment without peeking ahead, and that's impossible. So it can't be global, as it must stay a keyword for backwards compatibility at least until 3.0. What about: this_module current_module sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] __module__ ...? > If we removed global from the language, how would you spell assignment > to a variable in an outer function scope? Remember, you can *not* use > 'outer.x' because that already refers to a function attribute. scope(outer).x , making 'scope' a suitable built-in factory function. I do think this deserves a built-in. If we have this, maybe scope could also be reused as e.g. scope(global).x = 23 ? I think the reserved keyword 'global' SHOULD give the parser no problem in this one specific use (but, I'm guessing...!). Alex From guido at python.org Tue Oct 21 20:19:40 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:19:53 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:51:05 +0200." 
<3F95C669.4080706@livinglogic.de> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C669.4080706@livinglogic.de> Message-ID: <200310220019.h9M0JeS25829@12-236-54-216.client.attbi.com> > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > my parser can't do this without removing the existing global statement > > from the Grammar: after seeing the token 'global' it must be able to > > make a decision about whether to expand this to a global statement or > > an assignment without peeking ahead, and that's impossible. > > Couldn't this be solved by making 'global.' a token? > > Should {get|has}attr(global, 'foo') be possible? Yes, I think if we go this path, global should behave as a predefined variable. Maybe we should call it __globals__ after all, consistent with __file__ and __name__ (it would create a cycle, but we have plenty of those already). Though I still wish it didn't need underscores. Maybe 'globals' could sprout __getattribute__ and __setattr__ methods that would delegate to the current global module? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:20:15 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:20:38 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 01:57:20 +0200." 
<3F95C7E0.4030608@livinglogic.de> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> Message-ID: <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> > Another idea: We could replace the function globals() with an object > that provides __call__ for backwards compatibility, but also has a > special __setattr__. Then global assignment would be 'globals.x = 23'. > Would this be possible? Yes, I just proposed this in my previous response. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:23:48 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:23:56 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 19:47:58 EDT." <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> Message-ID: <200310220023.h9M0Nmg25868@12-236-54-216.client.attbi.com> > > > Actually, I consider Samuele's example a good argument in *favor* of > > > the idea. Because of the similarity between listcomps and generator > > > expressions (gen-X's? ;) ) it seems late binding of locals would > > > lead to people thinking the behavior is a bug. Since a genex is not > > > a function (at least in form) a late binding would be very > > > non-obvious and counterintuitive relative to other kinds of > > > expressions. > > > >Hm. We do late binding of globals. Why shouldn't we do late binding > >of locals? > > Wha? Oh, you mean in a function. No, everywhere. 
Globals in generator expressions also have late binding: A = 1 def f(): return (x+A for x in range(3)) g = f() A = 2 print list(g) # prints [2, 3, 4]; not [1, 2, 3] > But that's what I'm saying, it's *not* a > function. Sure, it's implemented as one under the hood, but it doesn't > *look* like a function. In any normal (non-lambda) expression, whether a > variable is local or global, its value is retrieved immediately. That's because the expression is evaluated immediately. When passing generator expressions around that reference free variables (whether global or from a function scope), the expression is evaluated when it is requested. Note that even under your model, A = [] g = (A for x in range(3)) A.append(42) print list(g) # prints [[42], [42], [42]] > Also, even though there's a function under the hood, that function > is *called* and its value returned immediately. This seems > consistent with an immediate binding of parameters. But it's a generator function, and the call suspends immediately, and continues to execute only when the next() method on the result is called. > > There are lots of corners of the language where if you > >expect something else the actual behavior feels like a bug, until > >someone explains it to you. That's no reason to compromise. It's an > >opportunity for education about scopes! > > So far, I haven't seen you say any reason why the "arguments" > approach is bad, or why the "closure" approach is good. Both are > certainly Pythonic in some circumstances, but why do you feel that > one is better than the other, here? Unified semantic principles. I want to be able to explain generator expressions as a shorthand for defining and calling generator functions. Invoking default argument semantics makes the explanation less clean: we would have to go through the trouble of finding all references to free variables. Do you want globals to be passed via default arguments as well? And what about builtins?
(Note that the compiler currently doesn't know the difference.) > I will state one pragmatic reason for using the default arguments > approach: code converted from using a listcomp to a genex can > immediately have bugs as a result of rebinding a local. Those bugs > won't happen if rebinding the local has no effect on the genex's > evaluation. (Obviously, an aliasing problem can still be created if > one modifies a mutable used in the genex, but there's no way to > remove that possibility and still end up with a lazy iterator.) > > Given that one of the big arguments in favor of genexes is to make > "upgrading" from listcomps easy, it shouldn't fail so quickly and > obviously. E.g., converting from: > > x = {} > for i in range(10): > x[i] = [y^i for y in range(10)] > > to: > > x = {} > for i in range(10): > x[i] = (y^i for y in range(10)) > > Shouldn't result in all of x's elements iterating over the same values! Hm. I think most generator expressions should be finished before moving on to the next line, as in for n in range(4): print sum(x**n for x in range(1, 11)) Saving a generator expression for later use should be something you rarely do, and you should really think of it as a shorthand for a generator function just as lambda is a shorthand for a regular function. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 20:42:05 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 21 20:42:17 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 01:58:21 +0200." <200310220158.21389.aleaxit@yahoo.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> Message-ID: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> (Changing the subject yet again) > > > Plus. 
EVERY newbie makes the mistake of taking "global" to mean > > > "for ALL modules" rather than "for THIS module", > > > > Only if they've been exposed to languages that have such globals. > > > > Actually, I've seen that happen to complete newbies too. "global" is > > a VERY strong word -- or at least perceived as such. We can't expect everybody to guess the rules of the language purely based on the symbols used. But I appreciate the argument; 'global' comes from ABC's SHARE, but ABC doesn't have modules. (It does have workspaces, but AFAIR there is no communication at all between workspaces, so it isn't unreasonable that a SHAREd name in one workspace isn't visible in another workspace.) > > > uselessly using global in toplevel, > > > > Which the parser should reject. > > Again: can we do that in 2.4? Submit a patch. It'll probably break plenty of code though (I bet you including Zope :-), so you'll have to start with a warning in 2.4. > > I think it's not unreasonable to want to replace global with > > attribute assignment of *something*. I don't think that > > "something" should have to be imported before you can use it; I > > don't even think it deserves to have leading and trailing double > > underscores. > > Using attribute assignment is my main drive here. I was doing it > via import only to be able to experiment with that in today's Python;-). You could have written an import hook that simply inserted __globals__ in each imported module. :-) > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > my parser can't do this without removing the existing global statement > > from the Grammar: after seeing the token 'global' it must be able to > > make a decision about whether to expand this to a global statement or > > an assignment without peeking ahead, and that's impossible. > > So it can't be global, as it must stay a keyword for backwards compatibility > at least until 3.0.
What about: > this_module > current_module > sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] > __module__ > ...? __module__ can't work because it has to be a string. (I guess it could be a str subclass but that would be too perverse.) Walter and I both suggested hijacking the 'globals' builtin. What do you think of that? > > If we removed global from the language, how would you spell assignment > > to a variable in an outer function scope? Remember, you can *not* use > > 'outer.x' because that already refers to a function attribute. > > scope(outer).x , making 'scope' a suitable built-in factory > function. I do think this deserves a built-in. Hm. I want it to be something that the compiler can know about reliably, and a built-in function doesn't work (yet). The compiler currently knows enough about nested scopes so that it can implement locals that are shared with inner functions differently (using cells). It's also too asymmetric -- *using* x would continue to be just x. Hmm. That's also a problem I have with changing global assignment -- I think the compiler should know about it, just like it knows about *using* globals. And it's not just the compiler. I think it requires more mental gymnastics of the human reader to realize that def outer(): def f(): scope(outer).x = 42 print x return f outer()() prints 42 rather than being an error. But how does the compiler know to reserve space for x in outer's scope? Another thing is that your proposed scope() is too dynamic -- it would require searching the scopes that (statically) enclose the call for a stack frame belonging to the argument. But there's no stack by the time f gets called in the last example! (The current machinery for nested scopes doesn't reference stack frames; it only passes cells.) > If we have this, maybe scope could also be reused as e.g. > scope(global).x = 23 > ?
I think the reserved keyword 'global' SHOULD give the parser no > problem in this one specific use (but, I'm guessing...!). I don't want to go there. :-) (If it wasn't clear, I'm struggling with this subject -- I think there are good reasons for why I'm resisting your proposal, but I haven't found them yet. The more I think about it, the less I like 'globals.x = 42'.) --Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Tue Oct 21 20:44:45 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Oct 21 20:44:48 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <20031021233910.GA2091@mems-exchange.org> Message-ID: In article <20031021233910.GA2091@mems-exchange.org>, Neil Schemenauer wrote: > Another nice thing is that we have tuple and dict comprehensions > for free: > > tuple(x for x in S) > dict((k, v) for k, v in S) > Set(x for x in S) Who cares about tuple comprehensions, but I would like similar syntactic sugar for dict comprehensions as for lists: {k:v for k,v in S} (PEP 274). -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tim.one at comcast.net Tue Oct 21 21:06:31 2003 From: tim.one at comcast.net (Tim Peters) Date: Tue Oct 21 21:06:36 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> Message-ID: [Samuele Pedroni] > so this, if I understand: > > def h(): > y = 0 > l = [1,2] > it = (x+y for x in l) > y = 1 > for v in it: > print v > > will print 1,2 and not 2,3 That is what I had in mind, and that if the first assignment to "y" were commented out, the assignment to "it" would raise UnboundLocalError.
> unlike: > > def h(): > y = 0 > l = [1,2] > def gen(S): > for x in S: > yield x+y > it = gen(l) > y = 1 > for v in it: > print v Yes, but like it if you replaced the "def gen" and the line following it with: def gen(y=y, l=l): for x in l: yield x+y it = gen() This is worth some thought. My intuition is that we *don't* want "a closure" here. If generator expressions were reiterable, then (probably obnoxiously) clever code could make some use of tricking them into using different inherited bindings on different (re)iterations. But they're one-shot things, and divorcing the values actually used from the values in force at the definition site sounds like nothing but trouble to me (error-prone and surprising). They look like expressions, after all, and after x = 5 y = x**2 x = 10 print y it would be very surprising to see 100 get printed. In the rare cases that's desirable, creating an explicit closure is clear(er): x = 5 y = lambda: x**2 x = 10 print y() I expect creating a closure instead would bite hard especially when building a list of generator expressions (one of the cases where delaying generation of the results is easily plausible) in a loop. The loop index variable will probably play some role (directly or indirectly) in the intended operation of each generator expression constructed, and then you severely want *not* for each generator expression to see "the last" value of the index variable.
BTW, Icon can give no guidance here: in that language, the generation of a generator's result sequence is inextricably bound to the lexical occurrence of the generator. The question arises in Python because definition site and generation can be divorced. From barry at python.org Tue Oct 21 21:21:51 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 21:21:57 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <20031021233910.GA2091@mems-exchange.org> Message-ID: <1066785710.5750.333.camel@anthem> On Tue, 2003-10-21 at 20:44, David Eppstein wrote: > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} > (PEP 274). +1 :) -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/63df6654/attachment.bin From pedronis at bluewin.ch Tue Oct 21 21:27:14 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Tue Oct 21 21:24:51 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> Message-ID: <5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> At 17:42 21.10.2003 -0700, Guido van Rossum wrote: >(Changing the subject yet again) > > > > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > > > "for ALL modules" rather than "for THIS module", > > > > > > Only if they've been exposed to languages that have such globals. > > > > Actually, I've seen that happen to complete newbies too. "global" is > > a VERY strong word -- or at least perceived as such. 
> >We can't expect everybody to guess the rules of the language purely >based on the symbols used. > >But I appreciate the argument; 'global' comes from ABC's SHARE, but >ABC doesn't have modules. (It does have workspaces, but AFAIR there >is no communication at all between workspaces, so it isn't >unreasonable that a SHAREd name in one workspace isn't visible in >another workspace.) > > > > > uselessly using global in toplevel, > > > > > > Which the parser should reject. > > > > Again: can we do that in 2.4? > >Submit a patch. It'll probably break plenty of code though (I bet you >including Zope :-), so you'll have to start with a warning in 2.4. > > > > I think it's not unreasonable to want to replace global with > > > attribute assignment of *something*. I don't think that > > > "something" should have to be imported before you can use it; I > > > don't even think it deserves to have leading and trailing double > > > underscores. > > > > Using attribute assignment is my main drive here. I was doing it > > via import only to be able to experiment with that in today's Python;-). > >You could have writen an import hook that simply inserted __globals__ >in each imported module. :-) > > > > Walter suggested 'global.x = 23' which looks reasonable; unfortunately > > > my parser can't do this without removing the existing global statement > > > from the Grammar: after seeing the token 'global' it must be able to > > > make a decision about whether to expand this to a global statement or > > > an assignment without peeking ahead, and that's impossible. > > > > So it can't be global, as it must stay a keyword for backwards > compatibility > > at least until 3.0. What about: > > this_module > > current_module > > sys.modules[__name__] [[hmmm this DOES work today, but...;-)]] > > __module__ > > ...? > >__module__ can't work because it has to be a string. (I guess it >could be a str subclass but that would be too perverse.) 
> >Walter and I both suggested hijacking the 'globals' builtin. What do >you think of that? > > > > If we removed global from the language, how would you spell assignment > > > to a variable in an outer function scope? Remember, you can *not* use > > > 'outer.x' because that already refers to a function attribute. > > > > scope(outer).x , making 'scope' a suitable built-in factory > > function. I do think this deserves a built-in. > >Hm. I want it to be something that the compiler can know about >reliably, and a built-in function doesn't work (yet). The compiler >currently knows enough about nested scopes so that it can implement >locals that are shared with inner functions differently (using >cells). It's also too asymmetric -- *using* x would continue to be >just x. > >Hmm. That's also a problem I have with changing global assignment -- >I think the compiler should know about it, just like it knows about >*using* globals. > >And it's not just the compiler. I think it requires more mental >gymnastics of the human reader to realize that > > def outer(): > def f(): > scope(outer).x = 42 > print x > return f > outer()() > >prints 42 rather than being an error. But how does the compiler know >to reserve space for x in outer's scope? > >Another thing is that your proposed scope() is too dynamic -- it >would require searching the scopes that (statically) enclose the call >for a stack frame belonging to the argument. But there's no stack by >the time f gets called in the last example! (The curernt machinery >for nested scopes doesn't reference stack frames; it only passes >cells.) > > > If we have this, maybe scope could also be reused as e.g. > > scope(global).x = 23 > > ? I think the reserved keyword 'global' SHOULD give the parser no > > problem in this one specific use (but, I'm guessing...!). > >I don't want to go there. 
:-) > >(If it wasn't clear, I'm struggling with this subject -- I think there >are good reasons for why I'm resisting your proposal, but I haven't >found them yet. The more I think about it, the less I like >'globals.x = 42' . . suggests runtime, for compile time then maybe global::x=42 module::x=42 outer::x=42 (I don't like those, and personally I don't see the need to get rebinding for closed-over variables but anyway) another possibility is that today is a syntax error, so maybe global x = 42 or module x = 42 they would not be statements, this for symmetry would also be legal: y = module x + 1 then outer x = 42 and also y = g x + 1 the problems are also clear, in some other languages x y is function application, etc.. From kiko at async.com.br Tue Oct 21 21:43:46 2003 From: kiko at async.com.br (Christian Robottom Reis) Date: Tue Oct 21 21:43:57 2003 Subject: [Python-Dev] Re: Be Honest about LC_NUMERIC In-Reply-To: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> References: <200310182222.h9IMMx1X004861@mira.informatik.hu-berlin.de> Message-ID: <20031022014346.GI2977@async.com.br> On Sun, Oct 19, 2003 at 12:22:59AM +0200, Martin v. Löwis wrote: > What happened to this PEP? I can't find it in the PEP list. Sorry, I've been completely distracted by real life, lately. I can put some effort into the text this week, but I'm not sure what I should do beyond sending the PEP to the list and waiting for comments (which were pretty sparse!) > Personally, I am satisfied with the patch that evolved from the > discussion (#774665), and I would be willing to apply it even without > a PEP. I would really appreciate a comment from Tim outlining his opinion (so I'm adding him to the To: list). Just to recapitulate, the patch Gustavo has posted doesn't use the thread-safe glibc functions, which means that we won't be safe from runtime locale switching.
I suppose I should also point out that runtime locale switching is useful in certain obscure situations; for instance, formatting a number with periods grouping the thousands can be done by setting to the da_DK locale temporarily. Whether this hack is to be encouraged or shelved for something better is yet unknown to me, though. Take care, -- Christian Robottom Reis | http://async.com.br/~kiko/ | [+55 16] 261 2331 From barry at python.org Tue Oct 21 21:55:36 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 21:55:48 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> Message-ID: <1066787735.5750.343.camel@anthem> On Tue, 2003-10-21 at 20:20, Guido van Rossum wrote: > > Another idea: We could replace the function globals() with an object > > that provides __call__ for backwards compatibility, but also has a > > special __setattr__. Then global assignment would be 'globals.x = 23'. > > Would this be possible? > > Yes, I just proposed this in my previous response. :-) So maybe the idea of using function attributes isn't totally nuts, if you use a special name. E.g. outer.__locals__.x and outer.__globals__.x -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/6088ac50/attachment.bin From greg at cosc.canterbury.ac.nz Tue Oct 21 23:14:56 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 21 23:15:19 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> Message-ID: <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz> Guido: > > def inner(): > > outer.x = 42 > > Because this already means something! Hmmm, maybe x of outer = 42 Determined-to-get-an-'of'-into-the-language-somehow-ly, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jeremy at zope.com Tue Oct 21 23:08:10 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Tue Oct 21 23:15:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> Message-ID: <1066792089.19270.29.camel@localhost.localdomain> On Tue, 2003-10-21 at 20:42, Guido van Rossum wrote: > (If it wasn't clear, I'm struggling with this subject -- I think there > are good reasons for why I'm resisting your proposal, but I haven't > found them yet. The more I think about it, the less I like > 'globals.x = 42' . I think it's good that attribute assignment and variable assignment look different. An object's attributes are more dynamic than the variables in a module. I don't see much benefit to conflating two distinct concepts.
Jeremy From python at rcn.com Tue Oct 21 23:17:26 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 21 23:18:13 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Message-ID: <000001c3984b$052cd820$e841fea9@oemcomputer> > Neil Schemenauer wrote: > > Another nice thing is that we have tuple and dict comprehensions > > for free: > > > > tuple(x for x in S) > > dict((k, v) for k, v in S) > > Set(x for x in S) [David Eppstein] > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} > (PEP 274). -1 Let's keep just one way to do it. That construct saves a few characters just to get a little cuteness and another special case to remember and maintain. Once you have iterator expressions, you've already gotten 99% of the benefits of PEP 274. List comprehensions, on the other hand, already exist, so they *have* to be supported. Raymond From tim_one at email.msn.com Tue Oct 21 23:03:14 2003 From: tim_one at email.msn.com (Tim Peters) Date: Tue Oct 21 23:21:35 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: [Guido] > Urgh, we need this sorted out before Raymond can rewrite PEP 289 and > present it to c.l.py... That would be good. I don't feel a sense of urgency, though, and will be out of town the rest of the week. I sure *expect* that most generator expressions will "get consumed" immediately, at their definition site, so that there's no meaningful question to answer then (as in, e.g., the endless sum(generator_expression) examples, assuming the builtin sum). That means people have to think of plausible use cases where evaluation is delayed.
There are some good examples of lists-of-generators in test_generators.py, and I'll just note that they use the default-arg mechanism to force a particular loop-variant non-local value, or use an instance variable, and/or use lexical scoping but know darned well that the up-level binding will never change over the life of each generator. That's all the concrete stuff I have to stare at now (& recalling that the question can't be asked in Icon -- no "divorce" possible there, and no lexical nesting even if it were possible to delay generation). > ... > So, do you want *all* free variables to be passed using the > default-argument trick (even globals and builtins), or only those that > correspond to variables in the immediately outer scope, or only those > corresponding to function scopes (as opposed to globals)? All or none make sense to me, as semantic models (not ruling out that a clever implementation may take shortcuts). I'm not having a hard time imagining that "all" will be useful; I haven't yet managed to dream up a plausible use case where "none" actually helps. > n = 0 > def f(): > global n > n += 1 > return n > print list(n+f() for x in range(10)) Like I just said . There's no question that semantics can differ between "all" and "none" (and at several points between to boot). Stick a "global f" inside f() and rebind f based on the current value of n too, if you like. I'm having a hard time imagining something *useful* coming out of such tricks combined with "none". Under "all", I look at the print and think "f is f, and n is 0, and that's it". I'm not sure it's "a feature" that print [n+f() for x in range(10)] looks up n and f anew on each iteration -- if I saw a listcomp that actually relied on this, I'd be eager to avoid inheriting any of author's code. 
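The default-arg mechanism Tim mentions is easy to demonstrate. A minimal sketch in present-day syntax (the helper names are invented for illustration): a closure looks up its free variable when it runs, while a default argument freezes the value at definition time.

```python
def make_gens_late():
    """Each generator closes over the loop variable i (late lookup)."""
    gens = []
    for i in range(3):
        def gen():
            yield i * 10  # i is looked up only when the generator runs
        gens.append(gen)
    return [next(g()) for g in gens]  # i is 2 by now, so every gen sees 2

def make_gens_pinned():
    """The default-arg trick captures the value of i at definition time."""
    gens = []
    for i in range(3):
        def gen(i=i):
            yield i * 10  # i is the frozen default, not the loop variable
        gens.append(gen)
    return [next(g()) for g in gens]

print(make_gens_late())    # [20, 20, 20]
print(make_gens_pinned())  # [0, 10, 20]
```

Every generator in the first list sees the final value of i; the default-argument trick in the second gives each generator its own frozen copy, which is exactly the idiom test_generators.py relies on.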
From tjreedy at udel.edu Tue Oct 21 23:22:09 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Oct 21 23:22:16 2003 Subject: [Python-Dev] Re: closure semantics References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch><200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com... > Eek. Global statement inside flow control should be deprecated, not > abused to show that global is evil. :-) Is there any good reason to ever use globals anywhere other than as the first statement (after doc string) of a function? If not, could its usage be so restricted (like __future__ import)? > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > "for ALL modules" rather than "for THIS module", Part of my brain still thinks that, and another part has to say, 'no, just modular or mod_vars()'. > Only if they've been exposed to languages that have such globals. Like Python with __builtins__? which I think of as the true globals. Do C or Fortran count as such a source of 'infection'? > > uselessly using global in toplevel, > > Which the parser should reject. Good. The current nonrejection sometimes leads beginners astray because they think it must be doing something. While I use global/s() just fine, I still don't like the names. I decided a while ago that they must predate import, when the current module scope would have been 'global'. >[from another post] But I appreciate the argument; 'global' comes from ABC's >SHARE, but ABC doesn't have modules. Aha! Now I can use this explanation as fact instead of speculation. Terry J.
Reedy From eppstein at ics.uci.edu Tue Oct 21 23:27:05 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Tue Oct 21 23:27:11 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <000001c3984b$052cd820$e841fea9@oemcomputer> References: <000001c3984b$052cd820$e841fea9@oemcomputer> Message-ID: <13803476.1066768024@[192.168.1.101]> On 10/21/03 11:17 PM -0400 Raymond Hettinger wrote: > -1 > > Let's keep just one way to do it. > > That construct saves a few characters just to get a little > cuteness and another special case to remember and maintain. > > Once you have iterator expressions, you've already gotten 99% of > the benefits of PEP 274. Currently, I am using expressions like pos2d = dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2])) for s in positions]) Once I have iterator expressions, I can simplify it by dropping a whole two characters (the brackets) and get an unimportant time savings. But with PEP 274, I could write pos2d = {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2]) for s in positions} Instead of five levels of nested parens+brackets, I would need only three, and each level would be a different type of paren or bracket, which I think together with the shorter overall length would contribute significantly to readability. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ.
of California, Irvine, School of Information & Computer Science From jeremy at alum.mit.edu Tue Oct 21 23:00:43 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Oct 21 23:29:54 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> Message-ID: <1066791643.19270.25.camel@localhost.localdomain> On Tue, 2003-10-21 at 18:51, Guido van Rossum wrote: > [Samuele] > > this is a bit OT and too late, but given that our closed over > > variables are read-only, I'm wondering whether, having a 2nd chance, > > using cells and following mutations in the enclosing scopes is > > really worth it, we kind of mimic Scheme and relatives but there > > outer scope variables are also rebindable. Maybe copying semantics > > not using cells for our closures would not be too insane, and people > > would not be burnt by trying things like this: > > > > for msg in msgs: > > def onClick(e): > > print msg > > panel.append(Button(msg,onClick=onClick)) > > > > which obviously doesn't do what one could expect today. OTOH as for > > general mutability, using a mutable object (list,...) would allow > > for mutability when one really need it (rarely). I think copying semantics would be too surprising. > It was done this way because not everybody agreed that closed-over > variables should be read-only, and the current semantics allow us to > make them writable (as in Scheme, I suppose?) if we can agree on a > syntax to declare an "intermediate scope" global. > > Maybe "global x in f" would work? Woo hoo. I'm happy to hear you've had a change of heart on this topic. I think a simple, declarative statement would be clearer than assigning to an attribute of a special object. 
If a special object, like __global__, existed, could you create an alias, like: surprise = __global__ surprise.x = 1 print __global__.x ? It would apparently also allow you to use a local and global variable with the same name in the same scope. That's odd, although I suppose it would be clear from context whether the local or global was intended. > def outer(): > x = 1 > def intermediate(): > x = 2 > def inner(): > global x in outer > x = 42 > inner() > print x # prints 2 > intermediate() > print x # prints 42 I would prefer to see a separate statement similar to global that meant "look for the nearest enclosing binding." Rather than specifying that you want to use x from outer, you could only say you don't want x to be local. That means you'd always get intermediate. I think this choice is more modular. If you can re-bind a non-local variable, then the name of the function where it is initially bound isn't that interesting. It would be safe, for example, to move it to another place in the function hierarchy without affecting the semantics of the program -- except that in the case of "global x in outer" you'd have to change all the referring global statements. Or would the semantics be to create a binding for x in outer, even if it didn't already exist? Jeremy From pje at telecommunity.com Tue Oct 21 22:05:55 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 21 23:31:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220023.h9M0Nmg25868@12-236-54-216.client.attbi.com> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.1.1.6.0.20031021190931.01ec41d0@telecommunity.com> <5.1.1.6.0.20031021193219.01df0c60@telecommunity.com> Message-ID: <5.1.0.14.0.20031021214942.026ca320@mail.telecommunity.com> At 05:23 PM 10/21/03 -0700, Guido van Rossum wrote: >Unified semantic principles. I want to be able to explain generator >expressions as a shorthand for defining and calling generator >functions. 
For a technical explanation, I would say, "any name that is not defined by the generator expression itself has the binding that was in effect for that name at the time the generator expression occurs." (Note that this statement is equally true for any other non-lambda expression.) For a non-technical explanation, I wouldn't say anything, because I don't think anybody who doesn't already have the mental model that "this is a shortcut for a generator function" is going to assume the late-binding behavior. IOW, the issue I see here is that if somebody runs into the problem, they need to learn about the free variables and closures concept in order to understand why their code is breaking. But, if it doesn't break, then why do they need to learn that? >Invoking default argument semantics makes the explanation >less clean: we would have to go through the trouble of finding all >references to free variables. Do you want globals to be passed via >default arguments as well? And what about builtins? (Note that the >compiler currently doesn't know the difference.) This sounds like "if the implementation is hard to explain" grounds, which I agree with in principle. I'm not positive it's that hard to explain, though, mainly because I don't see how anyone would *question* it in the first place. I find it hard to imagine somebody *wanting* changes to the variable bindings to affect an iterator expression, and thus the issue of why that doesn't work should be *much* rarer than the other way around. Past this point I think I'll be duplicating either my or Tim's arguments for this, so I'll leave off now.
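The rule Python eventually shipped (in 2.4, with the final PEP 289) is a hybrid of the positions argued in this thread: the outermost iterable is evaluated immediately when the generator expression is defined, while every other free variable is looked up late, when the generator runs. Samuele's example can be checked directly in present-day Python (the function names here are invented for illustration):

```python
def late_y():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # l is evaluated here; y is not
    y = 1                    # rebinding y *does* affect the generator
    return list(it)

def captured_iterable():
    y = 0
    l = [1, 2]
    it = (x + y for x in l)  # the outermost iterable l is captured now
    l = [10, 20]             # rebinding l has no effect on it
    y = 1
    return list(it)

print(late_y())             # [2, 3]
print(captured_iterable())  # [2, 3] -- still iterates the original [1, 2]
```

So late_y() prints 2 and 3, not 1 and 2: the generator sees y == 1, exactly the late-binding outcome Samuele's example illustrates.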
From barry at python.org Tue Oct 21 23:34:50 2003 From: barry at python.org (Barry Warsaw) Date: Tue Oct 21 23:35:01 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <1066791643.19270.25.camel@localhost.localdomain> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> Message-ID: <1066793689.5750.376.camel@anthem> On Tue, 2003-10-21 at 23:00, Jeremy Hylton wrote: > I would prefer to see a separate statement similar to global that meant > "look for the nearest enclosing binding." Rather than specifying that > you want to use x from outer, you could only say you don't want x to be > local. That means you'd always get intermediate. Would those "up" bindings chain? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031021/eaa90213/attachment.bin From jeremy at alum.mit.edu Tue Oct 21 23:46:00 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Tue Oct 21 23:48:32 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <1066793689.5750.376.camel@anthem> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> <1066793689.5750.376.camel@anthem> Message-ID: <1066794359.19270.31.camel@localhost.localdomain> On Tue, 2003-10-21 at 23:34, Barry Warsaw wrote: > On Tue, 2003-10-21 at 23:00, Jeremy Hylton wrote: > > > I would prefer to see a separate statement similar to global that meant > > "look for the nearest enclosing binding." 
Rather than specifying that > > you want to use x from outer, you could only say you don't want x to be > > local. That means you'd always get intermediate. > > Would those "up" bindings chain? Yes. If a block had an up declaration and it contained a nested block with an up declaration for the same variable, both blocks would refer to an outer binding. Jeremy From guido at python.org Tue Oct 21 22:15:15 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:08:01 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 21:55:36 EDT." <1066787735.5750.343.camel@anthem> References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <3F95C7E0.4030608@livinglogic.de> <200310220020.h9M0KF825841@12-236-54-216.client.attbi.com> <1066787735.5750.343.camel@anthem> Message-ID: <200310220215.h9M2FFc26081@12-236-54-216.client.attbi.com> > So maybe the idea of using function attributes isn't totally nuts, if > you use a special name. E.g. outer.__locals__.x and outer.__globals__.x -1. Way too ugly. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 21 22:23:07 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:08:13 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 21:06:31 EDT." References: Message-ID: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Urgh, we need this sorted out before Raymond can rewrite PEP 289 and present it to c.l.py... 
> [Samuele Pedroni] > > so this, if I understand: > > > > def h(): > > y = 0 > > l = [1,2] > > it = (x+y for x in l) > > y = 1 > > for v in it: > > print v > > > > will print 1,2 and not 2,3 > > That is what I had in mind, and that if the first assignment to "y" were > commented out, the assignment to "it" would raise UnboundLocalError. > > > unlike: > > > > def h(): > > y = 0 > > l = [1,2] > > def gen(S): > > for x in S: > > yield x+y > > it = gen(l) > > y = 1 > > for v in it: > > print v > > Yes, but like it if you replaced the "def gen" and the line following it > with: > > def gen(y=y, l=l): > for x in l: > yield x+y > it = gen() > > This is worth some thought. My intuition is that we *don't* want "a > closure" here. If generator expressions were reiterable, then (probably > obnoxiously) clever code could make some of use of tricking them into using > different inherited bindings on different (re)iterations. But they're > one-shot things, and divorcing the values actually used from the values in > force at the definition site sounds like nothing but trouble to me > (error-prone and surprising). They look like expressions, after all, and > after > > x = 5 > y = x**2 > x = 10 > print y > > it would be very surprising to see 100 get printed. In the rare cases > that's desirable, creating an explicit closure is clear(er): > > x = 5 > y = lambda: x**2 > x = 10 > print y() > > I expect creating a closure instead would bite hard especially when building > a list of generator expressions (one of the cases where delaying generation > of the results is easily plausible) in a loop. The loop index variable will > probably play some role (directly or indirectly) in the intended operation > of each generator expression constructed, and then you severely want *not* > for each generator expression to see "the last" value of the index vrlbl. Right. 
> For concreteness, test_generators.Queens.__init__ creates a list of rowgen() > generators, and rowgen uses the default-arg trick to give each generator a > different value for rowuses; it would be an algorithmic disaster if they all > used the same value. > > Generator expressions are too limited to do what rowgen() does (it needs to > create and undo side effects as backtracking proceeds), so it's not > perfectly relevant as-is. I *suspect* that if people work at writing > concrete use cases, though, a similar thing will hold. > > BTW, Icon can give no guidance here: in that language, the generation of a > generator's result sequence is inextricably bound to the lexical occurrence > of the generator. The question arises in Python because definition site and > generation can be divorced. So, do you want *all* free variables to be passed using the default-argument trick (even globals and builtins), or only those that correspond to variables in the immediately outer scope, or only those corresponding to function scopes (as opposed to globals)? n = 0 def f(): global n n += 1 return n print list(n+f() for x in range(10)) --Guido van Rossum (home page: http://www.python.org/~guido/) From eppstein at ics.uci.edu Wed Oct 22 00:11:24 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 00:11:31 2003 Subject: [Python-Dev] Re: closure semantics References: <200310212309.h9LN9Sk25548@12-236-54-216.client.attbi.com> <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz> Message-ID: In article <200310220314.h9M3Euk11066@oma.cosc.canterbury.ac.nz>, Greg Ewing wrote: > Guido: > > > > def inner(): > > > outer.x = 42 > > > > Because this already means something! > > Hmmm, maybe > > x of outer = 42 > > Determined-to-get-an-'of'-into-the-language-somehow-ly, scope(outer).x = 42 Almost implementable now by using the inspect module to find the first matching scope, except that inspect can't change the local variable values, only look at them. 
-- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tdelaney at avaya.com Wed Oct 22 00:44:13 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Wed Oct 22 00:44:21 2003 Subject: [Python-Dev] Re: accumulator display syntax Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6A8FF@au3010avexu1.global.avaya.com> > From: David Eppstein [mailto:eppstein@ics.uci.edu] > > Once I have iterator expressions, I can simplify it by > dropping a whole two > characters (the brackets) and get an unimportant time > savings. But with > PEP 274, I could write > > pos2d = > {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*posi > tions[s][2]) > for s in positions} Don't be evil ... Tim Delaney From guido at python.org Wed Oct 22 00:48:44 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 00:48:53 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 23:00:43 EDT." <1066791643.19270.25.camel@localhost.localdomain> References: <5.2.1.1.0.20031021232738.027c3e38@pop.bluewin.ch> <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch> <200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <1066791643.19270.25.camel@localhost.localdomain> Message-ID: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> [Guido] > > Maybe "global x in f" would work? [Jeremy] > Woo hoo. I'm happy to hear you've had a change of heart on this topic. > I think a simple, declarative statement would be clearer than assigning > to an attribute of a special object. Right. > If a special object, like __global__, existed, could you create an > alias, like: > > surprise = __global__ > surprise.x = 1 > print __global__.x > > ? > > It would apparently also allow you to use a local and global variable > with the same name in the same scope. That's odd, although I suppose it > would be clear from context whether the local or global was intended. 
I don't care about that argument; it's no more confusing to have globals.x and x than it is to have self.x and x, and the latter happens all the time. > > def outer(): > > x = 1 > > def intermediate(): > > x = 2 > > def inner(): > > global x in outer > > x = 42 > > inner() > > print x # prints 2 > > intermediate() > > print x # prints 42 > > I would prefer to see a separate statement similar to global that meant > "look for the nearest enclosing binding." Rather than specifying that > you want to use x from outer, you could only say you don't want x to be > local. That means you'd always get intermediate. That would be fine; I think that code where you have a choice of more than one outer variable with the same name is seriously insane. An argument for naming the outer function is that explicit is better than implicit, and it might help the reader if there is more than one level; OTOH it is a pain if you decide to rename the outer function (easily caught by the parser, but creates unnecessary work). I admit that I chose this mostly because the syntax 'global x in outer' reads well and doesn't require new keywords. > I think this choice is more modular. If you can re-bind a non-local > variable, then the name of the function where it is initially bound > isn't that interesting. It would be safe, for example, to move it to > another place in the function hierarchy without affecting the semantics > of the program I'm not sure what you mean here. Move x around, or move outer around? In both cases I can easily see how the semantics *would* change, in general. > -- except that in the case of "global x in outer" you'd > have to change all the referring global statements. Yes, that's the main downside. > Or would the semantics be to create a binding for x in outer, even > if it didn't already exist?
That would be the semantics, right; just like the current global statement doesn't care whether the global variable already exists in the module or not; it will create it if necessary. But a relative global statement would be fine too; it would be an error if there's no definition of the given variable in scope. But all this is moot unless someone comes up with a way to spell this that doesn't require a new keyword or change the meaning of 'global x' even if there's an x at an intermediate scope (i.e. you can't change 'global x' to mean "search for the next outer scope that defines x"). And we still have to answer Alex's complaint that newbies misinterpret the word 'global'. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 01:00:21 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:00:38 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Tue, 21 Oct 2003 23:22:09 EDT." References: <5.2.1.1.0.20031022002931.027c3e38@pop.bluewin.ch><200310212252.h9LMq0f25469@12-236-54-216.client.attbi.com> <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> Message-ID: <200310220500.h9M50Ln26397@12-236-54-216.client.attbi.com> > Is there any good reason to ever use globals anywhere other than as > the first statement (after doc string) of a function? If the use of the global is fairly localized, I sometimes like to have the global declaration immediately precede the first use, assuming all other uses are in the same indented block. (This means that I sometimes *do* have global inside flow control, but then all uses are also inside the same branch.) But I'm not sure this is a *good* reason. > If not, could its usage be so restricted (like __future__ import)? This would break way too much stuff. It would have been a good idea for 0.1.
But then I was trying to keep the grammar small while keeping syntactic checks out of the compilation phase if at all possible, and I thought "screw it -- if import can go anywhere, so can global." > > > Plus. EVERY newbie makes the mistake of taking "global" to mean > > > "for ALL modules" rather than "for THIS module", > > Part of my brain still thinks that, and another part has to say, > 'no, just modular or mod_vars()'. > > > Only if they've been exposed to languages that have such globals. > > Like Python with __builtins__? which I think of as the true globals. Hardly, since they aren't normally thought of as variables. > Do C or Fortran count as such a source of 'infection'? C, definitely -- it has the concept and the terminology. In Fortran, it's called common blocks (similar in idea to ABC's SHARE). > > > uselessly using global in toplevel, > > > > Which the parser should reject. > > Good. The current nonrejection sometimes leads beginners astray > because they think it must be doing something. Just like x + 1 I suppose. I'm sure PyChecker catches this. > While I use global/s() just fine, I still don't like the names. I > decided awhile ago that they must predate import, when the current > module scoop would have been 'global'. No, they were both there from day one. Frankly, I don't think in this case newbie confusion is enough of a reason to switch from global to some other keyword or mechanism. Yes, this means I'm retracting my support for Alex's "replace-global-with-attribute-assignment" proposal -- Jeremy's objection made me realize why I don't like it much. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 01:02:56 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:03:11 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 03:27:14 +0200."
<5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <5.2.1.1.0.20031022031539.027f6fc0@pop.bluewin.ch> Message-ID: <200310220503.h9M52um26419@12-236-54-216.client.attbi.com> [Samuele] > . suggests runtime, for compile time then maybe Right, that's what I don't like about it. > global::x=42 > module::x=42 > > outer::x=42 > > (I don't like those, and personally I don't see the need to get rebinding > for closed-over variables but anyway) I don't like these either. > another possibility is that today is a syntax error, so maybe > > global x = 42 or > module x = 42 > > they would not be statements, this for symmetry would also be legal: > > y = module x + 1 > > then > > outer x = 42 > > and also > > y = g x + 1 > > the problems are also clear, in some other languages x y is function > application, etc.. Juxtaposition of names opens a whole lot of cans of worms -- for one, it makes many more typos pass the parser. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Wed Oct 22 01:16:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 01:20:07 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Message-ID: <200310220516.h9M5GXV11525@oma.cosc.canterbury.ac.nz> David Eppstein : > Who cares about tuple comprehensions, but I would like similar syntactic > sugar for dict comprehensions as for lists: > {k:v for k,v in S} If you have *that*, as well as generator expressions, someone is going to want k:v for k,v in S as a bare expression to be some sort of generator. What exactly it would generate isn't clear... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From python at rcn.com Wed Oct 22 01:19:58 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 01:20:58 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> Message-ID: <000a01c3985c$2345dc60$e841fea9@oemcomputer> [Raymond] > > If there is any doubt on that score, I would be happy to update > > the PEP to match the current proposal for iterator expressions > > and solicit more community feedback. [Guido] > Wonderful! Rename PEP 289 to "generator expressions" and change the > contents to match this proposal. Thanks for being the fall guy! Here is a rough draft of the resurrected PEP. I'm sure it contains many flaws and I welcome suggested amendments. In particular, the following needs attention:

* Precise specification of the syntax, including the edge cases with commas where enclosing parentheses are required.
* Making sure the acknowledgements are correct and complete.
* Verifying my understanding of the issues surrounding late binding, modification of locals, and returning generator expressions.
* Clear articulation of the expected benefits. There are so many, it was difficult to keep it focused.

Raymond Hettinger

----------------------------------------------------------------------

PEP: 289
Title: Generator Expressions
Version: $Revision: 1.2 $
Last-Modified: $Date: 2003/08/30 23:57:36 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003

Abstract

    This PEP introduces generator expressions as a high performance, memory efficient generalization of list comprehensions and generators.

Rationale

    Experience with list comprehensions has shown their wide-spread utility throughout Python. However, many of the use cases do not need to have a full list created in memory.
    Instead, they only need to iterate over the elements one at a time. For instance, the following dictionary constructor code will build a full item list in memory, iterate over that item list, and, when the reference is no longer needed, delete the list:

        d = dict([(k, func(k)) for k in keylist])

    Time, clarity, and memory are conserved by using a generator expression instead:

        d = dict((k, func(k)) for k in keylist)

    Similar benefits are conferred on the constructors for other container objects:

        s = Set(word for line in page for word in line.split())

    Having a syntax similar to list comprehensions makes it easy to switch to a generator expression when scaling up an application. Generator expressions are especially useful in functions that reduce an iterable input to a single value:

        sum(len(line) for line in file if len(line) > 5)

    Accordingly, generator expressions are expected to partially eliminate the need for reduce(), which is notorious for its lack of clarity. And, there are additional speed and clarity benefits from writing expressions directly instead of using lambda. List comprehensions greatly reduced the need for filter() and map(). Likewise, generator expressions are expected to minimize the need for itertools.ifilter() and itertools.imap(). In contrast, the utility of other itertools will be enhanced by generator expressions:

        dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

BDFL Pronouncements

    The previous version of this PEP was REJECTED. The bracketed yield syntax left something to be desired; the performance gains had not been demonstrated; and the range of use cases had not been shown. After much discussion on the python-dev list, the PEP has been resurrected in its present form. The impetus for the discussion was an innovative proposal from Peter Norvig.
The Gory Details

    1) In order to achieve a performance gain, generator expressions need to be run in the local stackframe; otherwise, the improvement in cache performance gets offset by the time spent switching stackframes. The upshot of this is that generator expressions need to be both created and consumed within the context of a single stackframe. Accordingly, the generator expression cannot be returned to another function:

        return (k, func(k)) for k in keylist

    2) The loop variable is not exposed to the surrounding function. This both facilitates the implementation and makes typical use cases more reliable. In some future version of Python, list comprehensions will also hide the induction variable from the surrounding code (and, in Py2.4, warnings will be issued for code accessing the induction variable).

    3) Variable references in generator expressions will exhibit late binding just like other Python code. In the following example, the iterator runs *after* the value of y is set to one:

        def h():
            y = 0
            l = [1,2]
            def gen(S):
                for x in S:
                    yield x+y
            it = gen(l)
            y = 1
            for v in it:
                print v

    4) List comprehensions will remain unchanged. So, [x for x in S] is a list comprehension and [(x for x in S)] is a list containing one generator expression.

    5) It is prohibited to use locals() for other than read-only use in generator expressions. This simplifies the implementation and precludes a certain class of obfuscated code.

Acknowledgements:

    Peter Norvig resurrected the discussion with his proposal for "accumulation displays". Alex Martelli provided critical measurements that proved the performance benefits of generator expressions. Samuele Pedroni provided the example of late binding. Guido van Rossum suggested the bracket-free, yield-free syntax. Raymond Hettinger first proposed "generator comprehensions" in January 2002.
References [1] PEP 255 Simple Generators http://python.sourceforge.net/peps/pep-0255.html [2] PEP 202 List Comprehensions http://python.sourceforge.net/peps/pep-0202.html [3] Peter Norvig's Accumulation Display Proposal http:///www.norvig.com/pyacc.html Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil fill-column: 70 End: From guido at python.org Wed Oct 22 01:27:42 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 01:27:55 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Tue, 21 Oct 2003 23:03:14 EDT." References: Message-ID: <200310220527.h9M5Rgr26465@12-236-54-216.client.attbi.com> [Tim] > I'm not sure it's "a feature" that > > print [n+f() for x in range(10)] > > looks up n and f anew on each iteration -- if I saw a listcomp that > actually relied on this, I'd be eager to avoid inheriting any of > author's code. It's just a direct consequence of Python's general rule for name lookup in all contexts: variables are looked up when used, not before. (Note: lookup is different from scope determination, which is done mostly at compile time. Scope determination tells you where to look; lookup gives you the actual value of that location.) If n is a global and calling f() changes n, f()+n differs from n+f(), and both are well-defined due to the left-to-right rule. That's not good or bad, that's just *how it is*. Despite having some downsides, the simplicity of the rule is good; I'm sure we could come up with downsides of other rules too. Despite the good case that's been made for what would be most useful, I'm loathe to drop the evaluation rule for convenience in one special case. Next people may argue that in Python 3.0 lambda should also do this; arguably it's more useful than the current semantics there too. And then what next -- maybe all nested functions should copy their free variables? 
Oh, and then maybe outermost functions should copy their globals into locals too -- that will speed up a lot of code. :-) There are other places in Python where some rule is applied to "all free variables of a given piece of code" (the distinction between locals and non-locals in functions is made this way). But there are no other places where implicit local *copies* of all those free variables are taken. I'd need to find a unifying principle to warrant doing that beyond utility. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Wed Oct 22 01:38:26 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 01:38:39 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: <200310220538.h9M5cQn11555@oma.cosc.canterbury.ac.nz> Guido van Rossum : > So, do you want *all* free variables to be passed using the > default-argument trick (even globals and builtins), or only those that > correspond to variables in the immediately outer scope, or only those > corresponding to function scopes (as opposed to globals)? And what about foo = (f(x) for x in stuff) def f(x): ... for blarg in foo: ... ? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Wed Oct 22 02:02:28 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 02:02:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 01:19:58 EDT." <000a01c3985c$2345dc60$e841fea9@oemcomputer> References: <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> > Here is a rough draft on the resurrected PEP. Thanks -- that was quick! 
> I'm sure it contains many flaws and I welcome suggested amendments. > In particular, the follow needs attention: > > * Precise specification of the syntax including the edge cases > with commas where enclosing parentheses are required. > > * Making sure the acknowledgements are correct and complete. > > * Verifying my understanding of the issues surrounding late binding, > modification of locals, and returning generator expressions. > > * Clear articulation of the expected benefits. There are so many, > it was difficult to keep it focused. > > > Raymond Hettinger > > ---------------------------------------------------------------------- > > PEP: 289 > Title: Generator Expressions > Version: $Revision: 1.2 $ > Last-Modified: $Date: 2003/08/30 23:57:36 $ > Author: python@rcn.com (Raymond D. Hettinger) > Status: Active > Type: Standards Track > Created: 30-Jan-2002 > Python-Version: 2.3 > Post-History: 22-Oct-2003 > > > Abstract > > This PEP introduces generator expressions as a high performance, > memory efficient generalization of list expressions and > generators. Um, please change "list expressions" back to "list comprehensions" everywhere. Global substitute gone awry? :-) > Rationale > > Experience with list expressions has shown their wide-spread > utility throughout Python. However, many of the use cases do > not need to have a full list created in memory. Instead, they > only need to iterate over the elements one at a time. 
> > For instance, the following dictionary constructor code will > build a full item list in memory, iterate over that item list, > and, when the reference is no longer needed, delete the list: > > d = dict([(k, func(v)) for k in keylist]) I'd prefer to use the example sum([x*x for x in range(10)]) > Time, clarity, and memory are conserved by using an generator > expession instead: > > d = dict((k, func(v)) for k in keylist) which becomes sum(x*x for x in range(10)) (I find the dict constructor example sub-optimal because it starts with two parentheses, and visually finding the match for the second of those is further complicated by the use of func(v) for the value.) > Similar benefits are conferred on the constructors for other > container objects: (Here you can use the dict constructor example.) > s = Set(word for line in page for word in line.split()) > > Having a syntax similar to list comprehensions makes it easy to > switch to an iterator expression when scaling up application. ^^^^^^^^ generator > Generator expressions are especially useful in functions that reduce > an iterable input to a single value: > > sum(len(line) for line.strip() in file if len(line)>5) ^^^^^^^^^^^^ That's not valid syntax; my example was something like sum(len(line) for line in file if line.strip()) > Accordingly, generator expressions are expected to partially > eliminate the need for reduce() which is notorious for its lack > of clarity. And, there are additional speed and clarity benefits > from writing expressions directly instead of using lambda. > > List expressions greatly reduced the need for filter() and > map(). Likewise, generator expressions are expected to minimize > the need for itertools.ifilter() and itertools.imap(). In > contrast, the utility of other itertools will be enhanced by > generator expressions: > > dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector)) > > > BDFL Pronouncements > > The previous version of this PEP was REJECTED. 
The bracketed > yield syntax left something to be desired; the performance gains > had not been demonstrated; and the range of use cases had not > been shown. After, much discussion on the python-dev list, the > PEP has been resurrected its present form. The impetus for the > discussion was an innovative proposal from Peter Norvig. > > > The Gory Details > > 1) In order to achieve a performance gain, generator expressions need > to be run in the local stackframe; otherwise, the improvement in > cache performance gets offset by the time spent switching > stackframes. The upshot of this is that generator expressions > need to be both created and consumed within the context of a > single stackframe. Accordingly, the generator expression cannot > be returned to another function: > > return (k, func(v)) for k in keylist Heh? Did you keep this from the old PEP? Performance tests show that a generator function is already faster than a list comprehension, and the semantics are now defined as equivalent to creating an anonymous generator function and calling it. (There's still discussion about whether that generator function should copy the current value of all free variables into default arguments.) We need a Gory Detail item explaining the exact syntax. I propose that a generator expression always needs to be inside a set of parentheses and cannot have a comma on either side. Unfortunately this is different from list comprehensions; while [1, x for x in R] is illegal, [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)]. With reference to the file Grammar/Grammar in CVS, I think these changes are suitable: (1) The rule atom: '(' [testlist] ')' changes to atom: '(' [listmaker1] ')' where listmaker1 is almost the same as listmaker, but only allows a single test after 'for' ... 'in'. (2) The rule for arglist is similarly changed so that it can be either a bunch of arguments possibly followed by *xxx and/or **xxx, or a single generator expression. 
This is even hairier, so I'm not going to present the exact changes here; I'm confident that it can be done though using the same kind of breakdown as used for listmaker. Yes, maybe the compiler may have to work a little harder to distinguish all the cases. :-) > 2) The loop variable is not exposed to the surrounding function. > This both facilates the implementation and makes typical use > cases more reliable. In some future version of Python, list > comprehensions will also hide the induction variable from the > surrounding code (and, in Py2.4, warnings will be issued for > code accessing the induction variable). > > 3) Variables references in the generator expressions will > exhibit late binding just like other Python code. In the > following example, the iterator runs *after* the value of y is > set to one: > > def h(): > y = 0 > l = [1,2] > def gen(S): > for x in S: > yield x+y > it = gen(l) > y = 1 > for v in it: > print v There is still discussion about this one. > 4) List comprehensions will remain unchanged. > So, [x for x in S] is a list comprehension and > [(x for x in S)] is a list containing one generator expression. > > 5) It is prohibited to use locals() for other than read-only use > in generator expressions. This simplifies the implementation and > precludes a certain class of obfuscated code. I wouldn't mention this. assigning into locals() has an undefined effect anyway. > Acknowledgements: > > Peter Norvig resurrected the discussion proposal for "accumulation > displays". Can you do inline URLs in the final version? Maybe an opportunity to learn reST. :-) Or else at least add [3] to the text. > Alex Martelli provided critical measurements that proved the > the performance benefits of generator expressions. And also argued with great force that this was a useful thing to have (as have several others). > Samuele Pedroni provided the example of late binding. (But he wanted generator expressions *not* to use late binding!) 
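Samuele's late-binding example, quoted flattened above, laid out as runnable code (the printed values are collected into a list here so the result is easy to inspect; the behavior shown is exactly the late binding the draft describes):

```python
def h():
    y = 0
    l = [1, 2]
    def gen(S):
        for x in S:
            yield x + y   # y is looked up each time the generator runs
    it = gen(l)
    y = 1                 # y is rebound *before* iteration begins
    return list(it)

# Late binding means the generator sees y == 1, so h() returns [2, 3],
# not [1, 2].
```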
> Guido van Rossum suggested the bracket free, yield free syntax. I don't need credits, and I wouldn't be surprised if someone else had suggested it first. > Raymond Hettinger first proposed "generator comprehensions" in > January 2002. Phillip Eby suggested "iterator expressions" as the name and subsequently Tim Peters suggested "generator expressions". > References > > [1] PEP 255 Simple Generators > http://python.sourceforge.net/peps/pep-0255.html > > [2] PEP 202 List Comprehensions > http://python.sourceforge.net/peps/pep-0202.html > > [3] Peter Norvig's Accumulation Display Proposal > http:///www.norvig.com/pyacc.html I'd point to the thread in python-dev too. BTW I think the idea of having some iterators support __copy__ as a way to indicate they can be cloned is also PEPpable; we've pretty much reached closure on that one. PEP 1 explains how to get a PEP number. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Wed Oct 22 03:31:31 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 03:31:59 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <7k2ywden.fsf@yahoo.co.uk> (Paul Moore's message of "Tue, 21 Oct 2003 21:20:48 +0100") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> Message-ID: <65ihlodo.fsf@python.net> [Thomas] >>> After installing MSVC6 on a win98 machine, where I could rename >>> wsock32.dll away (which was not possible on XP due to file system >>> protection), I was able to change socketmodule.c to use delay loading of >>> the winsock dll. I had to wrap up the WSAStartup() call inside a >>> __try {} __except {} block to catch the exception thrown. >>> >>> With this change, _socket (and maybe also select) could then also be >>> converted into builtin modules. >>> >>> Guido, what do you think? 
>> [Guido] >> I think now is a good time to try this in 2.4. I don't think I'd want >> to do this (or any of the proposed reorgs) in 2.3 though. > [Paul] > One (very mild) point - this is highly MSVC-specific. I don't know if > there is ever going to be any interest in (for example) getting Python > to build with Mingw/gcc on Windows, but there's no equivalent of this > in Mingw (indeed, Mingw doesn't, as far as I know, support > __try/__except either). The whole delayload/__try/__except stuff may be unneeded in 2.4, because it will most probably be compiled with MSVC7.1, installed via an msi installer, and all systems where the msi actually could be installed would already have a winsock (or winsock2) dll. At least that is my impression on what I hear about systems older than (or including?) win98SE these days. Thomas From python at rcn.com Wed Oct 22 03:57:45 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 03:58:33 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> Message-ID: <000001c39872$2dece760$e841fea9@oemcomputer> Guido, thanks for the quick edits of the first draft. Here is a link to the second: http://users.rcn.com/python/download/pep-0289.html The reST version is attached. [Guido] > BTW I think the idea of having some iterators support __copy__ as a > way to indicate they can be cloned is also PEPpable; we've pretty much > reached closure on that one. PEP 1 explains how to get a PEP number. That one sounds like a job for Alex. Raymond Hettinger ------------------------------------------------------------------ PEP: 289 Title: Generator Expressions Version: $Revision: 1.3 $ Last-Modified: $Date: 2003/08/30 23:57:36 $ Author: python@rcn.com (Raymond D. 
Hettinger)
Status: Active
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003


Abstract
========

This PEP introduces generator expressions as a high performance,
memory efficient generalization of list comprehensions [1]_ and
generators [2]_.


Rationale
=========

Experience with list comprehensions has shown their wide-spread
utility throughout Python.  However, many of the use cases do not
need to have a full list created in memory.  Instead, they only
need to iterate over the elements one at a time.

For instance, the following summation code will build a full list of
squares in memory, iterate over those values, and, when the reference
is no longer needed, delete the list::

    sum([x*x for x in range(10)])

Time, clarity, and memory are conserved by using a generator
expression instead::

    sum(x*x for x in range(10))

Similar benefits are conferred on constructors for container
objects::

    s = Set(word for line in page for word in line.split())
    d = dict( (k, func(v)) for k in keylist)

Generator expressions are especially useful in functions that reduce
an iterable input to a single value::

    sum(len(line) for line in file if line.strip())

Accordingly, generator expressions are expected to partially
eliminate the need for reduce() which is notorious for its lack
of clarity.  And, there are additional speed and clarity benefits
from writing expressions directly instead of using lambda.

List comprehensions greatly reduced the need for filter() and
map().  Likewise, generator expressions are expected to minimize
the need for itertools.ifilter() and itertools.imap().  In
contrast, the utility of other itertools will be enhanced by
generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

Having a syntax similar to list comprehensions also makes it easy
to convert existing code into a generator expression when scaling
up an application.
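As a side note, the memory contrast in the rationale can be observed directly using the generator-function form that the PEP defines generator expressions in terms of. This is an illustrative sketch, not part of the draft; the names are invented:

```python
import sys

# A list comprehension materializes every element up front:
squares = [x * x for x in range(1000)]
list_bytes = sys.getsizeof(squares)           # grows with the input size

# The generator-function equivalent holds only the iteration state:
def gen_squares(n):
    for x in range(n):
        yield x * x

gen_bytes = sys.getsizeof(gen_squares(1000))  # small, fixed overhead
total = sum(gen_squares(1000))                # same result, no list built
```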
BDFL Pronouncements
===================

The previous version of this PEP was REJECTED.  The bracketed
yield syntax left something to be desired; the performance gains
had not been demonstrated; and the range of use cases had not
been shown.  After much discussion on the python-dev list, the
PEP has been resurrected in its present form.  The impetus for
the discussion was an innovative proposal from Peter Norvig [3]_.


The Gory Details
================

1. The semantics of a generator expression are equivalent to
   creating an anonymous generator function and calling it.
   There's still discussion about whether that generator function
   should copy the current value of all free variables into
   default arguments.

2. The syntax requires that a generator expression always needs
   to be inside a set of parentheses and cannot have a comma on
   either side.  Unfortunately, this is different from list
   comprehensions.  While [1, x for x in R] is illegal,
   [x for x in 1, 2, 3] is legal, meaning [x for x in (1,2,3)].

   With reference to the file Grammar/Grammar in CVS, two rules
   change:

   a) The rule::

        atom: '(' [testlist] ')'

      changes to::

        atom: '(' [listmaker1] ')'

      where listmaker1 is almost the same as listmaker, but only
      allows a single test after 'for' ... 'in'.

   b) The rule for arglist needs similar changes.

3. The loop variable is not exposed to the surrounding function.
   This facilitates the implementation and makes typical use cases
   more reliable.  In some future version of Python, list
   comprehensions will also hide the induction variable from the
   surrounding code (and, in Py2.4, warnings will be issued for
   code accessing the induction variable).

4. There is still discussion about whether variables referenced
   in generator expressions will exhibit late binding just like
   other Python code.  In the following example, the iterator runs
   *after* the value of y is set to one::

       def h():
           y = 0
           l = [1,2]
           def gen(S):
               for x in S:
                   yield x+y
           it = gen(l)
           y = 1
           for v in it:
               print v

5.
List comprehensions will remain unchanged::

       [x for x in S]    # This is a list comprehension.
       [(x for x in S)]  # This is a list containing one generator
                         # expression.


Acknowledgements
================

* Raymond Hettinger first proposed the idea of "generator
  comprehensions" in January 2002.

* Peter Norvig resurrected the discussion in his proposal for
  Accumulation Displays [3]_.

* Alex Martelli provided critical measurements that proved the
  performance benefits of generator expressions.  He also provided
  strong arguments that they were a desirable thing to have.

* Phillip Eby suggested "iterator expressions" as the name.

* Subsequently, Tim Peters suggested the name "generator
  expressions".

* Samuele Pedroni argued against late binding and provided the
  example shown above.


References
==========

.. [1] PEP 202 List Comprehensions
       http://python.sourceforge.net/peps/pep-0202.html

.. [2] PEP 255 Simple Generators
       http://python.sourceforge.net/peps/pep-0255.html

.. [3] Peter Norvig's Accumulation Display Proposal
       http://www.norvig.com/pyacc.html


Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:

From fincher.8 at osu.edu  Wed Oct 22 05:45:48 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Wed Oct 22 04:47:25 2003
Subject: [Python-Dev] PEP 289: Generator Expressions (second draft)
In-Reply-To: <000001c39872$2dece760$e841fea9@oemcomputer>
References: <000001c39872$2dece760$e841fea9@oemcomputer>
Message-ID: <200310220545.49319.fincher.8@osu.edu>

On Wednesday 22 October 2003 03:57 am, Raymond Hettinger wrote:
> Accordingly, generator expressions are expected to partially eliminate
> the need for reduce() which is notorious for its lack of clarity.  And,
> there are additional speed and clarity benefits from writing expressions
> directly instead of using lambda.
I probably missed it in this monster of a thread, but how do generator
expressions do this?  It seems that they'd only make reduce more
efficient, but it would still be just as needed as before.

Jeremy

From mwh at python.net  Wed Oct 22 07:03:18 2003
From: mwh at python.net (Michael Hudson)
Date: Wed Oct 22 07:03:21 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> (Greg
	Ewing's message of "Wed, 22 Oct 2003 12:49:48 +1300 (NZDT)")
References: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz>
Message-ID: <2moew9o7pl.fsf@starship.python.net>

Greg Ewing writes:

> Michael Hudson :
>
>> In particular what happens if the iteration variable is a local in the
>> frame anyway?  I presume that would inhibit the renaming
>
> Why?

Well, because then you have the same name for two different bindings.

>> but then code like
>>
>>   def f(x):
>>       r = [x+1 for x in range(x)]
>>       return r, x
>>
>> becomes even more incomprehensible (and changes in behaviour).
>
> Anyone who writes code like that *deserves* to have the
> behaviour changed on them!

This was not my impression of the Python way.  I know I'd be pretty
pissed if this broke my app.  I have no objection to breaking the
above code, just to breaking it silently!  Having code *silently
change in behaviour* (not die with an exception, post a warning at
compile time or fail to compile at all) is about as evil a change as
it's possible to contemplate, IMO.

> If this is really a worry, an alternative would be to
> simply forbid using a name for the loop variable that's
> used for anything else outside the loop.  That could
> break existing code too, but at least it would break
> it in a very obvious way by making it fail to compile.

This would be infinitely preferable!

Cheers,
mwh

-- 
  I like silliness in a MP skit, but not in my APIs.
:-) -- Guido van Rossum, python-dev From gmccaughan at synaptics-uk.com Wed Oct 22 07:15:27 2003 From: gmccaughan at synaptics-uk.com (Gareth McCaughan) Date: Wed Oct 22 07:16:10 2003 Subject: [Python-Dev] accumulator display syntax Message-ID: <200310221215.27570.gmccaughan@synaptics-uk.com> Tim Peters wrote: > "Set comprehensions" in a programming language originated with SETL, > and are named in honor of the set-theoretic Axiom of Comprehension > (Aussonderungsaxiom). In its well-behaved form, that says roughly that > given a set X, then for any predicate P(x), there exists a subset of X whose > elements consist of exactly those elements x of X for which P(x) is true (in > its ill-behaved form, it leads directly to Russell's Paradox -- the set of > all sets that don't contain themselves). "Aussonderungsaxiom" is the axiom of *separation*[1], which is a weakened version of the (disastrous) axiom of *comprehension*. In terms of Python's listcomps, comprehension would be [x if P(x)] and separation [x for x in S if P(x)]. So we should be calling them "list separations", really :-). [1] Hence the name; compare English "sunder". For the record, I like "generator expressions" too, or "iterator expressions". -- Gareth McCaughan From pedronis at bluewin.ch Wed Oct 22 08:12:35 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Wed Oct 22 08:10:19 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220602.h9M62Su26531@12-236-54-216.client.attbi.com> References: <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <5.2.1.1.0.20031022140258.027b3d80@pop.bluewin.ch> At 23:02 21.10.2003 -0700, Guido van Rossum wrote: > > Samuele Pedroni provided the example of late binding. > >(But he wanted generator expressions *not* to use late binding!) > > to be honest no, I was just arguing for coherent behavior between generator expressions and closures, Tim and Phillip J. Eby argued (are arguing) against late binding. 
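The closure behavior Samuele wants kept coherent with generator expressions is the familiar late-binding one; a minimal sketch (names invented for illustration):

```python
def make_adders():
    adders = []
    for i in range(3):
        # i is a free variable here: it is looked up at call time,
        # not captured at the moment the lambda is created.
        adders.append(lambda x: x + i)
    return adders

# All three closures see the final value of i (2), not 0, 1, 2:
results = [f(10) for f in make_adders()]   # [12, 12, 12]
```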
It is true that successively, in an OT way, I mildly proposed
non-late-binding semantics, but for _all_ closures wrt free variables
apart from globals.  But I got that a fraction of people still would
like rebinding support for closed-over vars (something I don't miss
personally), and there are subtle issues wrt recursive references,
which while solvable would make the semantics rather DWIMish, not a
good thing.

Samuele.

From skip at pobox.com  Wed Oct 22 08:42:08 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 08:42:17 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com>
References: <200310212316.22749.aleaxit@yahoo.com>
	<200310212211.h9LMBH925278@12-236-54-216.client.attbi.com>
	<200310220105.08017.aleaxit@yahoo.com>
	<200310212307.h9LN7hY25523@12-236-54-216.client.attbi.com>
Message-ID: <16278.31520.908436.64862@montanaro.dyndns.org>

    Guido> Raymond is going to give PEP 289 an overhaul.

Since you rejected PEP 289 at one point, it might be worth having a
short explanation of why you've changed your mind.

Skip

From skip at pobox.com  Wed Oct 22 08:48:52 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 08:49:04 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <20031021233910.GA2091@mems-exchange.org>
References: <20031021233910.GA2091@mems-exchange.org>
Message-ID: <16278.31924.243308.981142@montanaro.dyndns.org>

    Neil> Another nice thing is that we have tuple and dict comprehensions
    Neil> for free:

    Neil> tuple(x for x in S)
    Neil> dict((k, v) for k, v in S)
    Neil> Set(x for x in S)

    Neil> Aside from the bit of syntactic sugar, everything is nice and
    Neil> regular.

Maybe in 3.0 the syntactic sugar for list comprehensions should
disappear then.
Skip From ncoghlan at iinet.net.au Wed Oct 22 09:02:52 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Oct 22 09:02:56 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310211841.45711.aleaxit@yahoo.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> Message-ID: <3F967FFC.6040507@iinet.net.au> Alex Martelli strung bits together to say: > I don't think we should encourage that sort of thing with the "implicit > assignment" in accumulation. > > So, if it's an accumulation syntax we're going for, I'd much rather find > ways to express whether we want [a] no assignment at all (as e.g for > union_update), [b] plain assignment, [c] augmented assignment such > as += or whatever. Sorry, no good idea comes to my mind now, but > I _do_ think we'd want all three possibilities... I had a similar thought about 5 minutes after turning my computer off last night. The alternative I came up with was: y = (from result = 0.0 do result += x**2 for x in values if x > 0) The two extra clauses (from & do) are pretty much unavoidable if we want to be able to express both the starting point, and the method of accumulation. And hopefully those clauses would be enough to disambiguate this from the new syntax for generator expressions. The 'from' clause would allow a single plain assignment statement. It names the accumulation variable, and also gives it an initial value (if you don't want an initial value, an explicit assignment to None should suffice) The 'do' clause would allow single plain or augmented assignment statements, as well as allowing any expression. 'from' is already a keyword (thanks to 'from ... import ...') and it might be possible to avoid making 'do' a keyword (in the same way that 'as' is not a keyword despite its use in 'from ... import ... 
as ...')

(And I'll add my vote to pointing out that generator expressions don't
magically eliminate the use of the reduce function or accumulation
loops any more than list comprehensions did.  We still need the
ability to express the starting value and the accumulation method).

Cheers,
Nick.

P.S. I'm heading off to Canberra early tomorrow morning, so I won't be
catching up on this discussion until the weekend.

-- 
Nick Coghlan           |  Brisbane, Australia
ICQ#: 68854767         |  ncoghlan@email.com
Mobile: 0409 573 268   |  http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From skip at pobox.com  Wed Oct 22 09:09:48 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 09:09:57 2003
Subject: [Python-Dev] Time for py3k@python.org or a Py3K Wiki?
Message-ID: <16278.33180.5190.95094@montanaro.dyndns.org>

These various discussions are moving along a bit too rapidly for me to
keep up.  We have been discussing language issues which are going to
impact Python 3.0, either by deprecating current language constructs
which can't be eliminated until then (e.g., the global statement) or
by tossing around language construct ideas which will have to wait
until then for their implementation (other mechanisms for variable
access in outer scopes).  Unfortunately, I'm afraid these things are
going to get lost in the sea of other python-dev topics and be
forgotten about when the time is ripe.

Maybe this would be a good time to create a py3k@python.org mailing
list with more restrictions than python-dev (posting by members only?
membership by invitation?) so we can more easily separate these ideas
from shorter term issues and keep track of them in a separate Mailman
archive.  I'd suggest starting a Wiki, but that seems a bit too
"global".  You can restrict Wiki mods in MoinMoin to users who are
logged in, but I'm not sure you can restrict signups very well.
I also think Guido wants to make a significant leap on his own at
Python 3.0, but that is going to require a considerable amount of
uninterrupted full-time available for that effort.  Given that my
21-year old only recently fled the nest and my 20-year old keeps
returning, I'd say Guido's going to have to wait for quite awhile for
the "uninterrupted" qualifier to become unconditionally true. ;-)  In
the meantime, a mailing list archive or Wiki would provide a good
place to keep notes which he could refer to.

Skip

From skip at pobox.com  Wed Oct 22 09:15:16 2003
From: skip at pobox.com (Skip Montanaro)
Date: Wed Oct 22 09:15:24 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com>
References: <200310220121.52789.aleaxit@yahoo.com>
	<200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com>
	<200310220158.21389.aleaxit@yahoo.com>
	<200310220042.h9M0g5225903@12-236-54-216.client.attbi.com>
Message-ID: <16278.33508.677499.127119@montanaro.dyndns.org>

    Guido> (If it wasn't clear, I'm struggling with this subject -- I think
    Guido> there are good reasons for why I'm resisting your proposal, but I
    Guido> haven't found them yet.  The more I think about it, the less I
    Guido> like 'globals.x = 42'.

How about __.x = 42 ?

Skip

From ark-mlist at att.net  Wed Oct 22 09:24:15 2003
From: ark-mlist at att.net (Andrew Koenig)
Date: Wed Oct 22 09:22:03 2003
Subject: [Python-Dev] Re: Reiterability
In-Reply-To: <200310212033.h9LKXDk24952@12-236-54-216.client.attbi.com>
Message-ID: <009401c3989f$cb269030$6402a8c0@arkdesktop>

> I thought we already established before that attempting to guess which
> parts of a generator function to copy and which parts to share is
> hopeless.  generator-made iterators won't be __copy__-able, period.

> I think this is the weakness of this cloning business, because it
> either makes generators second-class iterators, or it makes cloning a
> precarious thing to attempt when generators are used.
(You can make a > non-cloneable iterator cloneable by wrapping it into something that > buffers just those items that are still reachable by clones, but this > can still require arbitrary amounts of buffer space. However, the buffering can be done in a way that uses only as much buffer space as is truly needed. Just maintain the buffer as a singly linked list in which new elements are inserted at the *tail* of the list. Then whenever the head becomes unreachable (e.g. because no iterators refer to it), it will be garbage collected. From skip at pobox.com Wed Oct 22 09:26:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Wed Oct 22 09:26:19 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <13803476.1066768024@[192.168.1.101]> References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> Message-ID: <16278.34154.245725.959203@montanaro.dyndns.org> >>>>> "David" == David Eppstein writes: David> Currently, I am using expressions like David> pos2d = David> dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s David> ][2])) David> for s in positions]) which I would have written something like pos2d = dict([(s,(positions[s][0]+dx*positions[s][2], positions[s][1]+dy*positions[s][2])) for s in positions]) so that I could see the relationship between the two tuple elements. [ skipping the avoidance of listcomp syntactic sugar ] David> But with PEP 274, I could write David> pos2d = David> {s:(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2]) David> for s in positions} David> Instead of five levels of nested parens+brackets, I would need David> only three, and each level would be a different type of paren or David> bracket, which I think together with the shorter overall length David> would contribute significantly to readability.
which I would still find unreadable and would recast in a more obvious (to me) way as pos2d = {s: (positions[s][0]+dx*positions[s][2], positions[s][1]+dy*positions[s][2]) for s in positions} The extra characters required today are less of a problem if the expression is laid out sensibly. Skip From aahz at pythoncraft.com Wed Oct 22 09:49:13 2003 From: aahz at pythoncraft.com (Aahz) Date: Wed Oct 22 09:49:18 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> Message-ID: <20031022134913.GA21755@panix.com> On Tue, Oct 21, 2003, Guido van Rossum wrote: > > If you're talking about making > > x = None > for x in R: pass > print x # last item of R > > illegal, forget it. That's too darn useful. Not illegal, but perhaps for 3.0 we should consider making that print display "None". The question is to what extent Python should continue having unified semantics across constructs. While I agree that listcomps should definitely have a local scope ("expressions should not have side-effects"), I think that there would be advantages to the control variable in a for loop also having local scope that are magnified by having compatible semantics between listcomps and for loops. In other words, consider x = None [x for x in R] print x Why should the two behave differently? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From guido at python.org Wed Oct 22 10:42:26 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 10:42:38 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: Your message of "Wed, 22 Oct 2003 09:31:31 +0200." 
<65ihlodo.fsf@python.net> References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> > The whole delayload/__try/__except stuff may be unneeded in 2.4, because > it will most probably be compiled with MSVC7.1, installed via an msi > installer, Is anyone working on that? I have the VC7.1 compiler too, but haven't tried to use it yet. Maybe someone should check in a project (separate from the VC6 project, so people don't *have to* switch yet)? Are the tools needed to build an MSI installer included in VC7.1? If not, are they a free download? --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 11:02:09 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:02:15 2003 Subject: [Python-Dev] Re: Reiterability In-Reply-To: Your message of "Wed, 22 Oct 2003 09:24:15 EDT." <009401c3989f$cb269030$6402a8c0@arkdesktop> References: <009401c3989f$cb269030$6402a8c0@arkdesktop> Message-ID: <200310221502.h9MF29U27337@12-236-54-216.client.attbi.com> > > I thought we already established before that attempting to guess which > > parts of a generator function to copy and which parts to share is > > hopeless. generator-made iterators won't be __copy__-able, period. > > > I think this is the weakness of this cloning business, because it > > either makes generators second-class iterators, or it makes cloning a > > precarious thing to attempt when generators are used. (You can make a > > non-cloneable iterator cloneable by wrapping it into something that > > buffers just those items that are still reachable by clones, but this > > can still require arbitrary amounts of buffer space. > > However, the buffering can be done in a way that uses only as much > buffer space as is truly needed.
Just maintain the buffer as a > singly linked list in which new elements are inserted at the *tail* > of the list. Then whenever the head becomes unreachable > (e.g. because no iterators refer to it), it will be garbage > collected. Correct. For this reason, Raymond will make a leak-proof version of his tee() function part of itertools. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 11:06:43 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:07:01 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 09:49:13 EDT." <20031022134913.GA21755@panix.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> Message-ID: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> > > If you're talking about making > > > > x = None > > for x in R: pass > > print x # last item of R > > > > illegal, forget it. That's too darn useful. > > Not illegal, but perhaps for 3.0 we should consider making that print > display "None". The question is to what extent Python should continue > having unified semantics across constructs. While I agree that listcomps > should definitely have a local scope ("expressions should not have > side-effects"), I think that there would be advantages to the control > variable in a for loop also having local scope that are magnified by > having compatible semantics between listcomps and for loops. In other > words, consider > > x = None > [x for x in R] > print x > > Why should the two behave differently? The variable of a for *statement* must be accessible after the loop because you might want to break out of the loop with a specific value. This is a common pattern that I have no intent of breaking. So it can't introduce a new scope; then it might as well keep the last value assigned to it. 
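The "break out of the loop with a specific value" pattern Guido is defending can be sketched like this (an editorial illustration, not code from the thread; the function name is made up):

```python
def first_negative(values):
    x = None
    for x in values:
        if x < 0:
            break          # leave the loop; x keeps the matching value
    else:
        x = None           # loop finished without a break: no match
    return x               # usable here only because 'for' adds no scope

print(first_negative([3, 1, -4, 5]))   # -4
```

If the for statement introduced its own scope, this common idiom would stop working, which is exactly the breakage Guido says he has no intent of allowing.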
List comprehensions and generator expressions don't have 'break'. (You could cause an exception and catch it, but it's not a common pattern to use the control variable afterwards -- only the debugger would need access somehow.) --Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Wed Oct 22 11:08:16 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 22 11:07:10 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <000a01c3985c$2345dc60$e841fea9@oemcomputer> References: <200310212218.h9LMIS725333@12-236-54-216.client.attbi.com> <000a01c3985c$2345dc60$e841fea9@oemcomputer> Message-ID: <20031022150816.GA4161@mems-exchange.org> On Wed, Oct 22, 2003 at 01:19:58AM -0400, Raymond Hettinger wrote: > Experience with list expressions has shown their wide-spread > utility throughout Python. However, many of the use cases do > not need to have a full list created in memory. Instead, they > only need to iterate over the elements one at a time. I see generator expressions as making the iterator guts of list comprehensions available as a first-class object. The list() call is not always wanted. > 1) In order to achieve a performance gain, generator expressions need > to be run in the local stackframe [...] > Accordingly, the generator expression cannot be returned > to another function: That would be unacceptable, IMHO. Generator expressions should be first class. Luckily, generator functions are speedy little buggers. :-) Neil From guido at python.org Wed Oct 22 11:07:50 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 11:08:08 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Wed, 22 Oct 2003 05:45:48 EDT."
<200310220545.49319.fincher.8@osu.edu> References: <000001c39872$2dece760$e841fea9@oemcomputer> <200310220545.49319.fincher.8@osu.edu> Message-ID: <200310221507.h9MF7od27394@12-236-54-216.client.attbi.com> > I probably missed it in this monster of a thread, but how do > generator expressions do this? It seems that they'd only make > reduce more efficient, but it would still be just as needed as > before. All we need is more standard accumulator functions like sum(). There are many useful accumulator functions that aren't easily expressed as a binary operator but are easily done with an explicit iterator argument, so I am hopeful that the need for reduce will disappear. 99% of use cases for reduce were with operator.add, and that's replaced by sum() already. --Guido van Rossum (home page: http://www.python.org/~guido/) From mcherm at mcherm.com Wed Oct 22 11:11:08 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 22 11:11:18 2003 Subject: [Python-Dev] closure semantics Message-ID: <1066835468.3f969e0c7b3a2@mcherm.com> Guido writes: > But all this is moot unless someone comes up with a way to spell this > that doesn't require a new keyword or change the meaning of 'global x' > even if there's an x at an intermediate scope (i.e. you can't change > 'global x' to mean "search for the next outer scope that defines x"). > > And we still have to answer Alex's complaint that newbies misinterpret > the word 'global'. I've always thought that "global" statements ought to resemble "import" statements. Let me explain. Python doesn't have an "import" statement. It has SEVERAL import statements. There's "import m", "import m as n", "from m import n", "from m import n as n2", even the dreaded "import *". However, this profusion of different statements is NOT usually confusing to people for three reasons: (1) All have the same primary keyword "import", suggesting that they're all related. (2) All are ultimately concerned with doing the same thing...
ensuring a module is in sys.modules and binding a name in the current environment so it can be used. (3) All of these read like English. Now, it seems to me that "global" is an ideal candidate for similar treatment. Rather like "import", there is a single thing we want to control... specifying, when we use the unadorned variable name, which namespace we wish it to refer to. There are several things we might want. First, what we already have: (a) Refers to local namespace. This is the most commonly used version, and should be (and is!) the default when no "global" statement is used. (b) Refers to the module-global namespace. This is the second-most commonly used scope, and so I'd say it deserves the simplest form of the "global" statement (rather like "import m" is the simplest form of "import"). That would be "global x", and that's already how Python works. And a few others we might want: (c) Refers to nearest enclosing nested-scope namespace in which a binding of that name already exists. (d) Refers to the getattr() namespace (normally __dict__) of the first argument of the function. This is for the "don't like typing 'self.'" crowd. (e) Refers to a truly-global (across all modules) namespace (built-ins I suppose). This is what Alex says newbies guess that "global" means. (f) Refers to a specific enclosing nested-scope namespace, in cases where the nearest nested-scope namespace isn't the one you want. Personally, I have no use for (d), (e), and (f), and I'd vote c:+1, d:-0 e:-1, f:-1 on including these. But my point is, that a slightly different form of the "global" statement would satisfy both readability AND parsability. I'm not feeling particularly creative, so please try to improve on these phrasings (some aren't parsable... we need forms that are parsable AND read well): (c) -- "global x in def" (d) -- "global x in " (e) -- "global global x" (f) -- "x is global in " Okay... all four of those are lousy.
But I still think seeking some alternate "phrases" or "forms" for the global statement has merit. -- Michael Chermside From guido at python.org Wed Oct 22 12:02:43 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 12:03:08 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Wed, 22 Oct 2003 08:15:16 CDT." <16278.33508.677499.127119@montanaro.dyndns.org> References: <200310220121.52789.aleaxit@yahoo.com> <200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com> <200310220158.21389.aleaxit@yahoo.com> <200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> <16278.33508.677499.127119@montanaro.dyndns.org> Message-ID: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> > How about > > __.x = 42 Too much line-noise, so too Perlish. :-) I don't like to use a mysterious symbol like __ for a common thing like a global variable. I also don't think I want global variable assignments to look like attribute assignments. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 12:05:58 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 12:07:03 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 23:02:52 +1000." <3F967FFC.6040507@iinet.net.au> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> <3F967FFC.6040507@iinet.net.au> Message-ID: <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> > I had a similar thought about 5 minutes after turning my computer off last > night. 
The alternative I came up with was: > y = (from result = 0.0 do result += x**2 for x in values if x > 0) I think you're aiming for the wrong thing here; I really see no reason why you'd want to avoid writing this out as a real for loop if you don't have an existing accumulator function (like sum()) to use. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Wed Oct 22 12:11:37 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Wed Oct 22 12:11:42 2003 Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_... Message-ID: <20031022161137.96353.qmail@web40513.mail.yahoo.com> I'm traveling, but I did manage to briefly peek at python-dev -- and, with some luck, I think I'll manage to post from here too. It's probably gonna be the weekend before I get mail access again, but, in the meantime...: I think we're trying to stretch the meaning of a keyword, "global" (that wasn't a particularly appropriate one for Python anyway, as opposed to ABC), "beyond reason". Yes, the temptation is huge, because adding a keyword is problematic. But this is like C's extending "static" to mean "private to this module as opposed to visible from other modules too" and later C++ further stretching it to mean "pertaining to the class rather than to the instance". Seriously, I weep inside whenever I have to explain "staticmethod" in terms of "once upon a time, there was a language (which has just about nothing to do with Python) which stretched the meaning of a word, which an older language had already stretched for a vaguely related purpose, and ..." :-( Well, nothing we can do about THAT -- that particular abuse of "static" is widely enough engrained in too many _programmers'_ minds, so newbies will just have to put up with it. But, "global" _isn't_ similarly engrained.
It sits oddly in Python -- a statement that doesn't actually DO things, but rather "flags" something for the compiler's benefit; in other words, _a declaration_ -- the one and only (expletive deleted) declaration we have in Python, though we call it "a statement" probably in an attempt at decency:-). I've seen it called "a declarative statement" on this thread, which is something of an oxymoron to me -- statements DO things. All but "global", that is. And it *rubs in* the "compile-time" vs "runtime" distinction that Python is mostly SO good at seamlessly hiding...! And all for what purpose? To let that one particular case of "assignment to bare name" actually mean setattr on some object (the current module) "blackmagically". I've seen it argued on this thread that "variables in a module and attributes in an object are different things" -- but they *AREN'T*! Just like, say, in a class. Inside a class C's body ("toplevel" in it, not nested inside a def &c) I can write x = 23 and it means C.x = 23 (for a normal metaclass). Once the class object C is created, if I want to tweak that attribute of C I have to write e.g. C.x = 42 after getting ahold of some reference to C (by its name, or say in a method of C by self.__class__.x = 42, etc). Inside a module M's body ("toplevel" in it, not nested inside a def &c) I can write x = 23 and it means M.x = 23 (unconditionally). Once the module object M is created, if I want to tweak that attribute of M I have to write e.g. M.x = 42 after getting ahold of some reference to M (say by an "import M", or say in a function of M by sys.modules[__name__].x = 42, etc). *OR* in the second case I can alternatively use the magical "compile-time declarative statement" ``global x'' -- a weird "special case" with a weird keyword going with it. Why? The two cases aren't all that different; bright learners catch on to them most easily when I draw the parallels explicitly (before going on to explain the differences, of course).
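The parallel Alex draws between class bodies and module bodies can be checked directly (an editorial sketch; `this_module` is just an illustrative name):

```python
import sys

class C:
    x = 23                     # bare-name assignment in the class body: C.x

C.x = 42                       # later rebinding goes through the object

x = 23                         # bare-name assignment at module top level: M.x
this_module = sys.modules[__name__]
this_module.x = 42             # later rebinding likewise via the module object

print(C.x, x)                  # 42 42
```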
If "assigning to bare names not in the current scope" _must_ be supported by a dedicated keyword, I think that keyword should be 'scope'. NOT 'global'. Let's not do the "see how far we can stretch 'static' and then some" gig, please... Alternatively, assigning to an attribute of some particular object still feels a better approach to me -- no new kwd, no stretching of bad old ones, actually a chance to let bad old 'global' fade slowly into the sunset. If there's any chance to salvage THAT approach -- if it only needs a good neat syntax to get that "particular object" -- I'll be glad to participate in brainstorming to find it. But before we spend energy on that, I'd like to know it's not sure to be wasted, because it just 'must' be a keyword; this subdebate is warranted only if it's _conceivable_ that attribute assignment be again deemed acceptable for this purpose. If it HAS to be a keyword, I think it should be 'scope' AND it should not be a STATEMENT but rather an "operator". E.g., a "bare name" COULD be "x scope foo" or "(x in module scope)" or some such construct (I think it could syntactically resemble attribute assignment and that would be USEFUL, e.g. scope(foo).x, even if scope was not a function but a keyword). But _this_ subdebate, I think, is warranted only if it's _conceivable_ that a keyword could be added for this (be it 'scope' or something even better that I haven't thought of). If both conditions fail -- it MUST be a keyword, and that keyword MUST be 'global' no matter how horrid that is, then once that is etched in stone I don't think there is much worth debating, because I don't think the result can be decent -- it seems an overconstrained system. So, in just the spirit of "print>>bah, x", I would suggest: global>>outer, x There!
What could be better?-) Alex From eppstein at ics.uci.edu Wed Oct 22 12:23:43 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 12:23:47 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> <16278.34154.245725.959203@montanaro.dyndns.org> Message-ID: In article <16278.34154.245725.959203@montanaro.dyndns.org>, Skip Montanaro wrote: [ re a long expression of mine ] ... > pos2d = dict([(s,(positions[s][0]+dx*positions[s][2], > positions[s][1]+dy*positions[s][2])) > for s in positions]) ... > pos2d = {s: (positions[s][0]+dx*positions[s][2], > positions[s][1]+dy*positions[s][2]) > for s in positions} > > The extra characters required today are less of a problem if the expression > is laid out sensibly. I have to admit, your indentation is better than mine, even if you ignore the problems caused by my using lines wider than 80 characters. But I still feel the second of your two alternatives more clearly expresses the intent of the expression. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From jeremy at zope.com Wed Oct 22 12:54:20 2003 From: jeremy at zope.com (Jeremy Hylton) Date: Wed Oct 22 12:54:22 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> Message-ID: > But all this is moot unless someone comes up with a way to spell this > that doesn't require a new keyword or change the meaning of 'global x' > even if there's an x at an intermediate scope (i.e. you can't change > 'global x' to mean "search for the next outer scope that defines x"). > > And we still have to answer Alex's complaint that newbies misinterpret > the word 'global'. I'm not averse to introducing a new keyword, which would address both concerns. 
yield was introduced with apparently little problem, so it seems possible to add a keyword without causing too much disruption. If we decide we must stick with global, then it's very hard to address Alex's concern about global being a confusing word choice. Jeremy From dave at pythonapocrypha.com Wed Oct 22 13:07:38 2003 From: dave at pythonapocrypha.com (Dave Brueck) Date: Wed Oct 22 13:07:43 2003 Subject: [Python-Dev] replacing 'global' References: <200310220121.52789.aleaxit@yahoo.com><200310212340.h9LNeYq25691@12-236-54-216.client.attbi.com><200310220158.21389.aleaxit@yahoo.com><200310220042.h9M0g5225903@12-236-54-216.client.attbi.com> <16278.33508.677499.127119@montanaro.dyndns.org> <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> Message-ID: <1b5501c398be$ff1832d0$891e140a@YODA> > > How about > > > > __.x = 42 > > Too much line-noise, so too Perlish. :-) > > I don't like to use a mysterious symbol like __ for a common thing > like a global variable. I also don't think I want global variable > assignments to look like attribute assignments. Go easy on me for piping up here, but aren't they attribute assignments or at least used as such? After reading the other posts in this thread I wonder if it would be helpful to have more information on how "global" is used in practice (and maybe some of those practices will be deemed bad, who knows). From my (a user of Python) perspective, "global" has two uses: 1) Attaching objects to the module, so that other modules do a module.name to get at the object 2) Putting objects in some namespace visible to the rest of the module. Now whether or not #1 is "good" or "bad" - I don't know, but it sure looks like attribute assignment to me.
Again, please regard this as just feedback from a user, but especially outside of the module it looks and acts like attribute assignment, so I would expect the same to be true inside the module, and any distinction would seem arbitrary or artificial (consider, for example, that it is not an uncommon practice to write a module instead of a class if the class would be a singleton). As for #2, I personally don't use global at all because it just rubs me the wrong way (the same way it would if you removed "self." in a method and made bind-to-instance implicit like in C++). Instead, many of my modules have this at the top: class GV: someVar1 = None someVar2 = 5 (where GV = "global variables") I felt _really_ guilty doing this the first few times and I continue to think it's yucky, but I don't know of a better alternative, and this approach reads better, especially compared to: global foo foo = 10 Seeing GV.foo = 10 adds a lot to readability. From this user's perspective, both problems #1 and #2 would be solved by an object named "module" that refers to this module (but please don't name it "global" or "globals" - that word has a different expected meaning). Shutting up, -Dave From guido at python.org Wed Oct 22 13:21:15 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:21:22 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Sat, 09 Dec 2000 14:27:37 EST." References: Message-ID: <200310221721.h9MHLFl27628@12-236-54-216.client.attbi.com> [Guido] > > But all this is moot unless someone comes up with a way to spell this > > that doesn't require a new keyword or change the meaning of 'global x' > > even if there's an x at an intermediate scope (i.e. you can't change > > 'global x' to mean "search for the next outer scope that defines x"). > > > > And we still have to answer Alex's complaint that newbies misinterpret > > the word 'global'. [Jeremy] > I'm not averse to introducing a new keyword, which would address both > concerns.
yield was introduced with apparently little problem, so it seems > possible to add a keyword without causing too much disruption. > > If we decide we must stick with global, then it's very hard to address > Alex's concern about global being a confusing word choice . OK, the tension is mounting. Which keyword do you have in mind? And would you use the same keyword for module-globals as for outer-scope variables? --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Oct 22 13:21:56 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Oct 22 13:22:02 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> Message-ID: <3F96BCB4.1000002@livinglogic.de> Guido van Rossum wrote: > [...] > The variable of a for *statement* must be accessible after the loop > because you might want to break out of the loop with a specific > value. This is a common pattern that I have no intent of breaking. > So it can't introduce a new scope; then it might as well keep the last > value assigned to it. > > List comprehensions and generator expressions don't have 'break'. > (You could cause an exception and catch it, but it's not a common > pattern to use the control variable afterwards -- only the debugger > would need access somehow.) How about an until keyword in generator expressions: sum(len(line) for line in file if not line.startswith("#") until not line.strip()) Would the order of "if" and "until" be significant? 
And we could have accumulators first() and last(): def first(it): return it.next() def last(it): for value in it: pass return value first(line for line in file if line.startswith("#")) if not last(file): # last line not terminated Bye, Walter Dörwald From guido at python.org Wed Oct 22 13:30:09 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:30:22 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 12:03:18 BST." <2moew9o7pl.fsf@starship.python.net> References: <200310212349.h9LNnm710076@oma.cosc.canterbury.ac.nz> <2moew9o7pl.fsf@starship.python.net> Message-ID: <200310221730.h9MHU9w27692@12-236-54-216.client.attbi.com> > >> def f(x): > >> r = [x+1 for x in range(x)] > >> return r, x > >> > >> becomes even more incomprehensible (and changes in behaviour). > > > > Anyone who writes code like that *deserves* to have the > > behaviour changed on them! > > This was not my impression of the Python way. I know I'd be pretty > pissed if this broke my app. > > I have no objection to breaking the above code, just to breaking it > silently! Having code *silently change in behaviour* (not die with an > exception, post a warning at compile time or fail to compile at all) > is about as evil a change as it's possible to contemplate, IMO. > > > If this is really a worry, an alternative would be to > > simply forbid using a name for the loop variable that's > > used for anything else outside the loop. That could > > break existing code too, but at least it would break > > it in a very obvious way by making it fail to compile. > > This would be infinitely preferable! Not so fast. We introduced nested scopes, which could similarly subtly change the meaning of code without giving an error. Instead, we had at least one release that *warned* about situations that would change meaning silently under the new semantics. The same release also implemented the new semantics if you used a __future__ import.
We should do that here too (both the warning and the __future__). I don't want this kind of code to cause an error; it's not Pythonic to flag an error when a variable name in an inner scope shields a variable of the same name in an outer scope. --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Wed Oct 22 13:32:56 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Wed Oct 22 13:30:59 2003 Subject: [Python-Dev] closure semantics In-Reply-To: References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> Message-ID: <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> At 14:27 09.12.2000 -0500, Jeremy Hylton wrote: > > But all this is moot unless someone comes up with a way to spell this > > that doesn't require a new keyword or change the meaning of 'global x' > > even if there's an x at an intermediate scope (i.e. you can't change > > 'global x' to mean "search for the next outer scope that defines x"). > > > > And we still have to answer Alex's complaint that newbies misinterpret > > the word 'global'. > >I'm not averse to introducing a new keyword, which would address both >concerns. yield was introduced with apparently little problem, so it seems >possible to add a keyword without causing too much disruption. > >If we decide we must stick with global, then it's very hard to address >Alex's concern about global being a confusing word choice . why exactly do we want write access to outer scopes? for completeness, to avoid the overhead of introducing a class here and there, to facilitate people using Scheme textbooks with Python? so far I have not been missing it, I don't find: def accgen(n): def acc(i): global n in accgen n += i return n return acc particularly more compelling than: class accgen: def __init__(self, n): self.n = n def __call__(self, i): self.n += i return self.n I'm not asking in order to polemicize, I just would like to see the rationale spelled out. regards.
From mcherm at mcherm.com Wed Oct 22 13:34:19 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Wed Oct 22 13:34:21 2003 Subject: [Python-Dev] closure semantics Message-ID: <1066844059.3f96bf9b1240f@mcherm.com> [Jeremy] > I'm not averse to introducing a new keyword, which would address both > concerns. yield was introduced with apparently little problem, so it seems > possible to add a keyword without causing too much disruption. > > If we decide we must stick with global, then it's very hard to address > Alex's concern about global being a confusing word choice . [Guido] > OK, the tension is mounting. Which keyword do you have in mind? And > would you use the same keyword for module-globals as for outer-scope > variables? Surely the most appropriate keyword is "scope", right? As in scope a is global scope b is nested scope c is self scope d is myDict Okay... maybe I'm getting too ambitious with the last couple... -- Michael Chermside From guido at python.org Wed Oct 22 13:47:42 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:48:07 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 19:21:56 +0200." <3F96BCB4.1000002@livinglogic.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> Message-ID: <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> > How about an until keyword in generator expressions: New keywords are not on the table for generator expressions. You could do this with 'while' (which is just 'until not' -- note that your example uses that :-) but I'd be against making this part of the syntax more complex. You can do that with itertools.takewhile or dropwhile anyway. 
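[Guido's takewhile/dropwhile alternative, sketched on a small made-up input (the list of lines here stands in for the file object in the thread):]

```python
from itertools import takewhile, dropwhile

lines = ["# header\n", "alpha\n", "beta\n", "\n", "trailing\n"]

# Take lines until the first blank one, then filter out the comments.
body = [ln for ln in takewhile(lambda l: l.strip(), lines)
        if not ln.startswith("#")]
print(body)  # ['alpha\n', 'beta\n']

# dropwhile is the mirror image: skip the leading comment block.
rest = list(dropwhile(lambda l: l.startswith("#"), lines))
print(rest)  # ['alpha\n', 'beta\n', '\n', 'trailing\n']
```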
> And we could have accumulators first() and last(): > > def first(it): > return it.next() This begs for using a plain old loop statement with a 'break'. > def last(it): > for value in it: > pass > return value What if it is empty? > first(line for line in file if line.startswith("#")) > > if not last(file): > # last line not terminated The comment is incorrect. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 13:57:18 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 13:57:26 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 19:32:56 +0200." <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com> [Samuele] > why exactly do we want write access to outer scopes? > > for completeness, to avoid the overhead of introducing a class here > and there, to facilitate people using Scheme textbooks with Python? Probably the latter; I think Jeremy Hylton does know more Scheme than I do. :-) > so far I have not been missing it, > > I don't find: > > def accgen(n): > def acc(i): > global n in accgen > n += i > return n > return acc > > particulary more compelling than: > > class accgen: > def __init__(self, n): > self.n = n > > def __call__(self, i): > self.n += i > return self.n Some people have "fear of classes". Some people think that a function's scope can be cheaper than an object (someone should time this). 
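[The "someone should time this" aside is easy to act on with `timeit`; a rough sketch comparing instance state against closure state (absolute numbers are machine-dependent, so none are claimed here):]

```python
import timeit

setup = """
class Acc:
    def __init__(self):
        self.n = 0
    def __call__(self, i):
        self.n += i
        return self.n

def make_acc():
    # Closure version: state kept in a mutable cell.
    cell = [0]
    def acc(i):
        cell[0] += i
        return cell[0]
    return acc

obj = Acc()
fun = make_acc()
"""

t_obj = timeit.timeit("obj(1)", setup=setup, number=200000)
t_fun = timeit.timeit("fun(1)", setup=setup, number=200000)
print("instance: %.4fs  closure: %.4fs" % (t_obj, t_fun))
```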
Looking at the last example in the itertools docs: def tee(iterable): "Return two independent iterators from a single iterable" def gen(next, data={}, cnt=[0]): dpop = data.pop for i in count(): if i == cnt[0]: item = data[i] = next() cnt[0] += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) This would have been clearer if the author didn't have to resort to representing his counter variable as a list of one element. Using 'global* x' to mean 'find x in an outer scope', and also moving data into the outer scope, again to emphasize that it is shared between multiple calls of gen() without abusing default arguments, it would become: def tee(iterable): "Return two independent iterators from a single iterable" data = {} cnt = 0 def gen(next): global* cnt dpop = data.pop for i in count(): if i == cnt: item = data[i] = next() cnt += 1 else: item = dpop(i) yield item next = iter(iterable).next return (gen(next), gen(next)) which is IMO more readable. But in 2.4 this will become a real object implemented in C. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From walter at livinglogic.de Wed Oct 22 14:05:12 2003 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Oct 22 14:05:41 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> Message-ID: <3F96C6D8.8040507@livinglogic.de> Guido van Rossum wrote: > [...] > >How about an until keyword in generator expressions: > > > New keywords are not on the table for generator expressions. 
You > could do this with 'while' (which is just 'until not' -- note that > your example uses that :-) You're right, using while would be better. > but I'd be against making this part of the > syntax more complex. You can do that with itertools.takewhile or > dropwhile anyway. But sum(len(line) for line in file if not line.startswith("#") while line.strip()) looks simpler than sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file if not line.startswith("#"))) >>def last(it): >> for value in it: >> pass >> return value > > What if it is empty? This should raise an exception. (It does, but not the correct one! ;)) >>first(line for line in file if line.startswith("#")) >> >>if not last(file): >> # last line not terminated > > > The comment is incorrect. That should have been: if not last(file).endswith("\n"): Bye, Walter Dörwald From python at rcn.com Wed Oct 22 14:30:53 2003 From: python at rcn.com (Raymond Hettinger) Date: Wed Oct 22 14:31:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000301c39753$45a18980$e841fea9@oemcomputer> Message-ID: <005501c398ca$a07a6f20$e841fea9@oemcomputer> Did the discussion of a sort() expression get resolved? The last I remember was that the list.sorted() classmethod had won the most support because it accepted the broadest range of inputs. I could live with that though I still prefer the more limited (list-only) copysort() method. Raymond Hettinger > Let's see what the use cases look like under the various proposals: > > todo = [t for t in tasks.copysort() if due_today(t)] > todo = [t for t in list.sorted(tasks) if due_today(t)] > todo = [t for t in list(tasks, sorted=True) if due_today(t)] > > genhistory(date, events.copysort(key=incidenttime)) > genhistory(date, list.sorted(events, key=incidenttime)) > genhistory(date, list(events, sorted=True, key=incidenttime)) > > for f in os.listdir().copysort(): . . . > for f in list.sorted(os.listdir()): . . .
> for f in list(os.listdir(), sorted=True): . . . > > To my eye, the first form reads much better in every case. > It still needs a better name though. From theller at python.net Wed Oct 22 14:38:18 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 14:38:39 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Wed, 22 Oct 2003 07:42:26 -0700") References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> Message-ID: <4qy1qfs5.fsf@python.net> Guido van Rossum writes: >> The whole delayload/__try/__except stuff may be unneeded in 2.4, because >> it will most probably be compiled with MSVC7.1, installed via an msi >> installer, > > Is anyone working on that? I have the VC7.1 compiler too, but haven't > tried to use it yet. Maybe someone should check in a project > (separate from the VC6 project, so people don't *have to* switch yet)? No, nobody is working on that AFAIK. VC7 can convert VC6 workspace and project files into its own format, but there is no way back. You cannot use VC7 files (they are called solution instead of workspace) in VC6 anymore. MvL suggested to convert the files once and then deprecate using the VC6 workspace. > Are the tools needed to build an MSI installer included in VC7.1? If > not, are they a free download? Yes, there are tools included. A colleague of mine tried to use them, and we quickly switched to Wise for Windows Installer (this is not the same as the Wise version used in Python 2.3) which does also create msi files. But this also has its own range of problems.
MvL again has the idea to create the msi (which is basically a database) programmatically with Python - either via COM, a custom Python extension or maybe ctypes. Thomas From guido at python.org Wed Oct 22 14:45:47 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 14:45:57 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 22 Oct 2003 20:05:12 +0200." <3F96C6D8.8040507@livinglogic.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com> <200310212327.h9LNRSX25612@12-236-54-216.client.attbi.com> <20031022134913.GA21755@panix.com> <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> <3F96BCB4.1000002@livinglogic.de> <200310221747.h9MHlgK27769@12-236-54-216.client.attbi.com> Message-ID: <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com> > sum(len(line) for line in file if not line.startswith("#") while > line.strip()) > > looks simpler than > > sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file > if not line.startswith("#"))) I think both are much harder to read and understand than n = 0 for line in file: if not line.strip(): break if not line.startswith("#"): n += len(line) --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Oct 22 14:53:21 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 14:53:31 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Wed, 22 Oct 2003 14:30:53 EDT." <005501c398ca$a07a6f20$e841fea9@oemcomputer> References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> Message-ID: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com> > Did the discussion of a sort() expression get resolved? > > The last I remember was that the list.sorted() classmethod had won the > most support because it accepted the broadest range of inputs.
> > I could live with that though I still prefer the more limited > (list-only) copysort() method. list.sorted() has won, but we are waiting for feedback from the person who didn't like having both sort() and sorted() as methods, to see if his objection still holds when one is a method and the other a factory function. --Guido van Rossum (home page: http://www.python.org/~guido/) From theller at python.net Wed Oct 22 15:33:05 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 15:33:33 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: (fdrake@users.sourceforge.net's message of "Tue, 21 Oct 2003 13:02:38 -0700") References: Message-ID: fdrake@users.sourceforge.net writes: > Update of /cvsroot/python/python/dist/src/Modules/expat > In directory sc8-pr-cvs1:/tmp/cvs-serv7002 > > Modified Files: > Tag: release23-maint > asciitab.h expat.h iasciitab.h internal.h latin1tab.h > utf8tab.h winconfig.h xmlparse.c xmlrole.c xmltok.c > xmltok_impl.c > Added Files: > Tag: release23-maint > macconfig.h > Removed Files: > Tag: release23-maint > expat.h.in > Log Message: > Update to Expat 1.95.7; there are no changes to the Expat sources. I'm getting compile errors on Windows (in the release23-maint branch, haven't tried in the trunk yet): C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error C1189: #error : memmove does not exist on this platform, nor is a substitute available Thomas From fdrake at acm.org Wed Oct 22 15:44:41 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed Oct 22 15:45:38 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: References: Message-ID: <16278.56873.730705.729001@grendel.zope.com> Thomas Heller writes: > I'm getting compile errors on Windows (in the release-23maint branch, > haven't tried in the trunk yet): I'll bet they match. ;-) > C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error > C1189: #error : memmove does not exist on this platform, nor is a > substitute available Hmm. I see PC\pyconfig.h doesn't define HAVE_MEMMOVE; this gets defined in the configure-generated pyconfig.h for the Linux systems I tested this on. Doesn't Windows always have memmove()? (I *think* it does based on a quick look at msdn.microsoft.com, but who knows for sure...) I'm not sure how extension building works on Windows; if setup.py is used, you should be able to define HAVE_MEMMOVE in PC\pyconfig.h, otherwise you can define it in the relevant .dsp file. -Fred -- Fred L. Drake, Jr. 
PythonLabs at Zope Corporation From theller at python.net Wed Oct 22 15:57:12 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 15:57:49 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: <16278.56873.730705.729001@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 22 Oct 2003 15:44:41 -0400") References: <16278.56873.730705.729001@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > I'm getting compile errors on Windows (in the release-23maint branch, > > haven't tried in the trunk yet): > > I'll bet they match. ;-) > > > C:\sf\python\dist\src-maint23\Modules\expat\xmlparse.c(76) : fatal error > > C1189: #error : memmove does not exist on this platform, nor is a > > substitute available > > Hmm. I see PC\pyconfig.h doesn't define HAVE_MEMMOVE; this gets > defined in the configure-generated pyconfig.h for the Linux systems I > tested this on. > > Doesn't Windows always have memmove()? (I *think* it does based on a > quick look at msdn.microsoft.com, but who knows for sure...) Windows? MSVC has it. > I'm not sure how extension building works on Windows; if setup.py is > used, you should be able to define HAVE_MEMMOVE in PC\pyconfig.h, > otherwise you can define it in the relevant .dsp file. setup.py isn't used, and PC\pyconfig.h is manually maintained. So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. I don't know anything about watcom, borland, or other compilers. 
Let's add it in the file and see what happens ;-) Thomas From fdrake at acm.org Wed Oct 22 16:07:03 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 22 16:07:24 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: References: <16278.56873.730705.729001@grendel.zope.com> Message-ID: <16278.58215.241781.151437@grendel.zope.com> Thomas Heller writes: > setup.py isn't used, and PC\pyconfig.h is manually maintained. > So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. > I don't know anything about watcom, borland, or other compilers. > Let's add it in the file and see what happens ;-) Not quite, I think. The setup.py script will load it from the pyconfig.h file and pass it along for Expat, but if that isn't used, it needs to be added to the .dsp used to build pyexpat.pyd. Not sure what to do about other C runtimes. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From tim.one at comcast.net Wed Oct 22 16:10:10 2003 From: tim.one at comcast.net (Tim Peters) Date: Wed Oct 22 16:10:42 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1. In-Reply-To: Message-ID: memmove is a standard ANSI C function so can be used freely (Python requires ANSI C). 
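[Tim's point — that `memmove` is part of ANSI C — is what lets Expat rely on it; its defining feature is that it tolerates overlapping source and destination, which `memcpy` does not. A small illustration of the C function's semantics via `ctypes` (purely for demonstration; this is not how Expat calls it):]

```python
import ctypes

buf = ctypes.create_string_buffer(b"abcdef")
# Overlapping move: shift the first five bytes one position right.
# With memcpy this overlap would be undefined behaviour; memmove
# is required by the C standard to handle it correctly.
ctypes.memmove(ctypes.byref(buf, 1), buf, 5)
print(buf.raw)  # b'aabcde\x00'
```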
From theller at python.net Wed Oct 22 16:11:16 2003 From: theller at python.net (Thomas Heller) Date: Wed Oct 22 16:11:41 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules/expat macconfig.h, NONE, 1.1.2.1 asciitab.h, 1.1.1.1, 1.1.1.1.16.1 expat.h, 1.5, 1.5.12.1 iasciitab.h, 1.1.1.1, 1.1.1.1.16.1 internal.h, 1.1, 1.1.14.1 latin1tab.h, 1.1.1.1, 1.1.1.1.16.1 utf8tab.h, 1.1.1.1, 1.1.1.1.16.1 winconfig.h, 1.1.1.1, 1.1.1.1.16.1 xmlparse.c, 1.5, 1.5.12.1 xmlrole.c, 1.5, 1.5.12.1 xmltok.c, 1.3, 1.3.12.1 xmltok_impl.c, 1.2, 1.2.12.1 expat.h.in, 1.1.1.1, NONE In-Reply-To: <16278.58215.241781.151437@grendel.zope.com> (Fred L. Drake, Jr.'s message of "Wed, 22 Oct 2003 16:07:03 -0400") References: <16278.56873.730705.729001@grendel.zope.com> <16278.58215.241781.151437@grendel.zope.com> Message-ID: "Fred L. Drake, Jr." writes: > Thomas Heller writes: > > setup.py isn't used, and PC\pyconfig.h is manually maintained. > > So HAVE_MEMMOVE has to be defined in this file, at least for MSVC6. > > I don't know anything about watcom, borland, or other compilers. > > Let's add it in the file and see what happens ;-) > > Not quite, I think. > > The setup.py script will load it from the pyconfig.h file and pass it > along for Expat, but if that isn't used, it needs to be added to the > .dsp used to build pyexpat.pyd. Ah, you mean pyconfig.h is not included by the expat files? Ok, in this case it will have to go into the .dsp. > Not sure what to do about other C runtimes. Neither do I. Thomas From fdrake at acm.org Wed Oct 22 16:43:31 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Wed Oct 22 16:43:48 2003 Subject: [Python-Dev] memmove() in Expat In-Reply-To: References: <16278.56873.730705.729001@grendel.zope.com> <16278.58215.241781.151437@grendel.zope.com> Message-ID: <16278.60403.851100.38638@grendel.zope.com> Tim Peters writes: > memmove is a standard ANSI C function so can be used freely (Python requires > ANSI C). Cool; thanks! 
Thomas Heller writes: > Ah, you mean pyconfig.h is not included by the expat files? > Ok, in this case it will have to go into the .dsp. That's right; the problem isn't pyexpat.c, which imports pyconfig.h via Python.h. The #error is in the Expat sources, which we're using unmodified, and Expat is more tolerant of non-ANSI platforms. I'd rather see HAVE_MEMMOVE added to the .dsp than use modified Expat sources. -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From eppstein at ics.uci.edu Wed Oct 22 19:03:06 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 19:03:14 2003 Subject: [Python-Dev] Re: closure semantics References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: In article <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch>, Samuele Pedroni wrote: > why exactly do we want write access to outer scopes? > > for completeness, to avoid the overhead of introducing a class here and there, > to facilitate people using Scheme textbooks with Python? I am currently working on implementing an algorithm with the following properties: - It is an algorithm, not a data structure; that is, you run it, it returns an answer, and it doesn't leave any persistent state afterwards. - It is sufficiently complex that I prefer to break it into several different functions or methods. - These functions or methods need to share various state variables. If I implement it as a collection of separate functions, then there's a lot of unnecessary code complexity involved in passing the state variables from one function to the next, returning the changes to the variables, etc. Also, it doesn't present a modular interface to the rest of the project -- code outside this algorithm is not prevented from calling the internal subroutines of the algorithm. 
If I implement it as a collection of methods of an object, I then have to include a separate function which creates an instance of the object and immediately destroys it. This seems clumsy and also doesn't fit with my intuition about what objects are for (representing persistent structure). Also, again, modularity is violated -- outside code should not be making instances of this object or accessing its methods. What I would like to do is to make an outer function, which sets up the state variables, defines inner functions, and then calls those functions. Currently, this sort of works: most of the state variables consist of mutable objects, so I can mutate them without rebinding them. But some of the state is immutable (in this case, an int) so I need to somehow encapsulate it in mutable objects, which is again clumsy. Write access to outer scopes would let me avoid this encapsulation problem. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From greg at cosc.canterbury.ac.nz Wed Oct 22 19:20:28 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 19:21:25 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310221215.27570.gmccaughan@synaptics-uk.com> Message-ID: <200310222320.h9MNKSh18932@oma.cosc.canterbury.ac.nz> Gareth McCaughan : > "Aussonderungsaxiom" is the axiom of *separation*[1], which is > a weakened version of the (disastrous) axiom of *comprehension*. > In terms of Python's listcomps, comprehension would be [x if P(x)] Actually, my original implementation of list comps *did* allow you to write that -- although it didn't try to loop over all possible values of x, fortunately. :-) It was Guido who (probably fairly wisely, even though I didn't agree at the time) decided there had to be a "for" in there. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Wed Oct 22 19:59:05 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 19:59:28 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Wed, 22 Oct 2003 16:03:06 PDT." References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <200310222359.h9MNx6L28417@12-236-54-216.client.attbi.com> > I am currently working on implementing an algorithm with the following > properties: > - It is an algorithm, not a data structure; that is, you run it, > it returns an answer, and it doesn't leave any persistent state > afterwards. > - It is sufficiently complex that I prefer to break it into several > different functions or methods. > - These functions or methods need to share various state variables. > > If I implement it as a collection of separate functions, then there's a > lot of unnecessary code complexity involved in passing the state > variables from one function to the next, returning the changes to the > variables, etc. Also, it doesn't present a modular interface to the > rest of the project -- code outside this algorithm is not prevented from > calling the internal subroutines of the algorithm. > > If I implement it as a collection of methods of an object, I then have > to include a separate function which creates an instance of the object > and immediately destroys it. This seems clumsy and also doesn't fit > with my intuition about what objects are for (representing persistent > structure). Also, again, modularity is violated -- outside code should > not be making instances of this object or accessing its methods. 
> > What I would like to do is to make an outer function, which sets up the > state variables, defines inner functions, and then calls those > functions. Currently, this sort of works: most of the state variables > consist of mutable objects, so I can mutate them without rebinding them. > But some of the state is immutable (in this case, an int) so I need to > somehow encapsulate it in mutable objects, which is again clumsy. > Write access to outer scopes would let me avoid this encapsulation > problem. I know the problem, I've dealt with this many times. Personally I would much rather define a class than a bunch of nested functions. I'd have a separate master function that creates the instance, calls the main computation, and then extracts and returns the result. Yes, the class may be accessible at the toplevel in the module. I don't care: I just add a comment explaining that it's not part of the API, or give it a name starting with "_". My problem with the nested functions is that it is much harder to get a grasp of what the shared state is -- any local variable in the outer function *could* be part of the shared state, and the only way to tell for sure is by inspecting all the subfunctions. With the class, there's a strong convention that all state is initialized in __init__(), so __init__() is self-documenting. --Guido van Rossum (home page: http://www.python.org/~guido/)
From greg at cosc.canterbury.ac.nz Wed Oct 22 20:15:48 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 20:16:02 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310222359.h9MNx6L28417@12-236-54-216.client.attbi.com> Message-ID: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> Guido: > My problem with the nested functions is that it is much harder to get > a grasp of what the shared state is -- any local variable in the outer > function *could* be part of the shared state, and the only way to tell > for sure is by inspecting all the subfunctions. That would be solved if, instead of marking variables in inner scopes that refer to outer scopes, it were the other way round, and variables in the outer scope were marked as being rebindable in inner scopes. def f(): rebindable x def inc_x_by(i): x += i # rebinds outer x x = 39 inc_x_by(3) return x Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 20:56:24 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 20:56:59 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <3F967FFC.6040507@iinet.net.au> Message-ID: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
The only way out of this I can see (short of dropping the whole idea) is to cut out some of the degrees of freedom by restrict ourselves to targeting the most common cases. Thinking about the way this works in APL, where you can say things like total = + / numbers one reason it's so compact is that the system knows what the identity is for each operator, so you don't have to specify the starting value explicitly. Another is the use of a binary operator. So if we postulate a "reducing protocol" that requires function objects to have a __div__ method that performs reduction with a suitable identity, then we can write total = operator.add / numbers Does that look succinct enough? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Wed Oct 22 21:02:43 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 22 21:02:54 2003 Subject: [Python-Dev] Time for py3k@python.org or a Py3K Wiki? In-Reply-To: <16278.33180.5190.95094@montanaro.dyndns.org> References: <16278.33180.5190.95094@montanaro.dyndns.org> Message-ID: <3F9728B3.2070809@ocf.berkeley.edu> Skip Montanaro wrote: > These various discussions are moving along a bit too rapidly for me to keep > up. We have been discussing language issues which are going to impact > Python 3.0, either by deprecating current language constructs which can't be > eliminated until then (e.g., the global statement) or by tossing around > language construct ideas which will have to wait until then for their > implementation (other mechanisms for variable access in outer scopes). > Unfortunately, I'm afraid these things are going to get lost in the sea of > other python-dev topics and be forgotten about then the time is ripe. 
> > Maybe this would be a good time to create a py3k@python.org mailing list > with more restrictions than python-dev (posting by members only? membership > by invitation?) so we can more easily separate these ideas from shorter term > issues and keep track of them in a separate Mailman archive. I'd suggest > starting a Wiki, but that seems a bit too "global". You can restrict Wiki > mods in MoinMoin to users who are logged in, but I'm not sure you can > restrict signups very well. > I would support doing *something*. My personal hell that is all of these threads seems to be getting deeper and deeper. But making it invitation-only might stifle some ideas. But then again it might be a while before Python 3 has to be worried about so maybe that is not that big of an issue now. God I hope Raymond has PEP 289 updated before the next summary is due. -Brett From tim_one at email.msn.com Wed Oct 22 21:18:42 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 21:18:47 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310220223.h9M2N7l26105@12-236-54-216.client.attbi.com> Message-ID: I had a large file today, and needed to find lines matching several patterns simultaneously. It seemed a natural application for generator expressions, so let's see how that looks. Generalized a bit: Given: "source", an iterable producing elements (like a file producing lines) "predicates", a sequence of one-argument functions, mapping element to truth (like a regexp search returning a match object or None) Create: a generator producing the elements of source for which each predicate is true This is-- or should be --an easy application for pipelining generator expressions. Like so: pipe = source for p in predicates: # add a filter over the current pipe, and call that the new pipe pipe = e for e in pipe if p(e) Now I hope that for e in pipe: print e prints the desired elements. 
It will if the "p" and "pipe" in the generator expression use the
bindings in effect at the time the generator expression is assigned to
pipe. If the generator expression is instead a closure, it's a subtle
disaster.

You can play with this today like so:

    pipe = source
    for p in predicates:
        # pipe = e for e in pipe if p(e)
        def g(pipe=pipe, p=p):
            for e in pipe:
                if p(e):
                    yield e
        pipe = g()

    for e in pipe:
        print e

Those are the semantics for which "it works".

If "p=p" is removed (so that the implementation of the generator
expression acts like a closure wrt p), the effect is to ignore all but
the last predicate. Instead predicates[-1] is applied to source, and
then applied redundantly to the survivors len(predicates)-1 times
each. It's not obvious then that the result is wrong, and for some
inputs may even be correct.

If "pipe=pipe" is removed instead, it should produce a "generator
already executing" exception, since the "pipe" in the final for-loop
is bound to the same object as the "pipe" inside g then (all of the
g's, but only the last g matters).

From greg at cosc.canterbury.ac.nz Wed Oct 22 21:36:41 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Oct 22 21:37:48 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
Message-ID: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>

I have just had the experience of writing a bunch of expressions of
the form

    "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params

and found myself getting quite confused by all the parentheses and "s"
suffixes. I would *really* like to be able to write this as

    "create index %{table}_lid1_idx on %{table}(%{lid1})" % params

which I find to be much easier on the eyes.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | A citizen of NewZealandCorp, a |
Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. |
greg@cosc.canterbury.ac.nz +--------------------------------------+

From fincher.8 at osu.edu Wed Oct 22 22:37:50 2003
From: fincher.8 at osu.edu (Jeremy Fincher)
Date: Wed Oct 22 21:40:33 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To:
References:
Message-ID: <200310222237.50142.fincher.8@osu.edu>

On Wednesday 22 October 2003 09:18 pm, Tim Peters wrote:
> Those are the semantics for which "it works".

I'm convinced; not only that free variables should be frozen as if
they'd been passed into a generator function as keyword arguments, but
of the utility of generator expressions as a whole -- that code is
just beautiful :)

Jeremy

From raymond.hettinger at verizon.net Wed Oct 22 21:43:05 2003
From: raymond.hettinger at verizon.net (Raymond Hettinger)
Date: Wed Oct 22 21:43:56 2003
Subject: [Python-Dev] product()
Message-ID: <002401c39907$0176f5a0$e841fea9@oemcomputer>

In the course of writing up PEP 289, it became clear that the future
has a number of accumulator functions in store. Each of these is
useful with iterators of all stripes and each helps eliminate a reason
for using reduce().

Some like average() and stddev() will likely end up in a statistics
module. Others like nbiggest(), nsmallest(), anytrue(), alltrue(), and
such may end up somewhere else.

The product() accumulator is the one destined to be a builtin. Though
it is not nearly as common as sum(), it does enjoy some popularity.
Having it available will help dispense with reduce(operator.mul, data, 1).

Would there be any objections to my adding product() to Py2.4? The
patch was simple and it is ready to go unless someone has some major
issue with it.
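Concretely, the proposed builtin would behave like the reduce() spelling it replaces; a minimal sketch follows, where the name and signature are assumptions for illustration, not the final API.

```python
import operator
from functools import reduce  # reduce() was a builtin in the 2.x line

def product(iterable, start=1):
    # Same result as reduce(operator.mul, data, 1), including an
    # empty iterable, which yields the multiplicative identity.
    return reduce(operator.mul, iterable, start)
```

Like sum(), it accepts any iterable, so it composes naturally with generator expressions.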
Raymond Hettinger

From greg at cosc.canterbury.ac.nz Wed Oct 22 21:49:45 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed Oct 22 21:50:20 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To:
Message-ID: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz>

Tim Peters :
> It will if the "p" and "pipe" in the generator expression use the
> bindings in effect at the time the generator expression is assigned to
> pipe.

Lying awake thinking about this sort of thing last night, I found
myself wondering if there should be a way of explicitly requesting
that a name be evaluated at closure creation time, e.g.

    pipe = source
    for p in predicates:
        pipe = e for e in pipe if ^p(e)

where the ^ means that p is evaluated in the enclosing scope when the
closure is created, and bound to a slot which behaves like a
default-argument slot (but is separate from the default arguments).

This would allow the current delayed-evaluation semantics to be kept
as the default, while eliminating any need for using the
default-argument hack when you don't want delayed evaluation.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury, | A citizen of NewZealandCorp, a |
Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. |
greg@cosc.canterbury.ac.nz +--------------------------------------+

From tim_one at email.msn.com Wed Oct 22 22:07:48 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Wed Oct 22 22:07:56 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310220527.h9M5Rgr26465@12-236-54-216.client.attbi.com>
Message-ID:

[Tim]
>> I'm not sure it's "a feature" that
>>
>>     print [n+f() for x in range(10)]
>>
>> looks up n and f anew on each iteration -- if I saw a listcomp that
>> actually relied on this, I'd be eager to avoid inheriting any of
>> author's code.
[Guido] > It's just a direct consequence of Python's general rule for name > lookup in all contexts: variables are looked up when used, not before. > (Note: lookup is different from scope determination, which is done > mostly at compile time. Scope determination tells you where to look; > lookup gives you the actual value of that location.) If n is a global > and calling f() changes n, f()+n differs from n+f(), and both are > well-defined due to the left-to-right rule. That's not good or bad, > that's just *how it is*. Despite having some downsides, the > simplicity of the rule is good; I'm sure we could come up with > downsides of other rules too. Sorry, but none of that follows unless you first insist that a listcomp is semantically equivalent to a particular for-loop. Which we did do at the start, and which is now being abandoned in part ("well, except for the for target(s) -- well, OK, they still work like exactly like the for-loop would work if the target(s) were renamed in a particular 'safe' way"). I don't mind the renaming trick there, but by the same token there's nothing to stop explaining the meaning of a generator expression as a particular way of writing a generator function either. It's hardly a conceptual strain to give the function default arguments, or even to eschew that technical implementation trick and just say the generator's frame gets some particular initialized local variables (which is the important bit, not the trick used to get there). > Despite the good case that's been made for what would be most useful, I don't see that any good case had been made for or against it: the only cases I care about are real use cases. A thing stands or falls by that, purity be damned. 
I have since posted the first plausible use case that occurred to me while thinking about real work, and "closure semantics" turned out to be disastrous in that example (see other email), while "capture the current binding" semantics turned out to be exactly right in that example. I suspected that would be so, but I still want to see more not-100%-fabricated examples. > I'm loathe to drop the evaluation rule for convenience in one special > case. Next people may argue that in Python 3.0 lambda should also do > this; arguably it's more useful than the current semantics there too. It's not analogous: when I'm writing a lambda, I can *choose* which bindings to capture at lambda definition time, and which to leave free. Unless generator expressions grow more hair, I have no choice when writing one of those, so the implementation-forced choice had better be overwhelmingly most useful most often. I can't judge the latter without plausible use cases, though. > And then what next -- maybe all nested functions should copy their > free variables? Same objection as to the lambda example. > Oh, and then maybe outermost functions should copy their globals into > locals too -- that will speed up a lot of code. :-) It would save Jim a lot of thing=thing arglist typing in Zope code too . > There are other places in Python where some rule is applied to "all > free variables of a given piece of code" (the distinction between > locals and non-locals in functions is made this way). But there are > no other places where implicit local *copies* of all those free > variables are taken. I didn't suggest to copy anything, just to capture the bindings in use at the time a generator expression is evaluated. This is easy to explain, and trivial to explain for people familiar with the default-argument trick. 
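The default-argument trick referred to here, in its classic form; this is a contrived minimal example rather than code from the thread. Default values are evaluated once, at function-definition time, so a default argument snapshots the current binding, while a plain free variable is looked up only when the function is finally called.

```python
# Each lambda in late_bound shares the loop variable i as a free
# variable; each lambda in frozen gets its own snapshot via a default.
late_bound = [lambda: i for i in range(3)]      # i looked up at call time
frozen     = [lambda i=i: i for i in range(3)]  # i captured per lambda

late_results = [f() for f in late_bound]    # every f sees the final i
frozen_results = [f() for f in frozen]      # each f keeps its own i
```

The first list calls all report the last value of i; the second preserves the value in effect when each lambda was defined, which is the behavior being argued for in generator expressions.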
Whenever I've written a list-of-generators, or in the recent example a
generator pipeline, I have found it semantically necessary, without
exception so far, to capture the bindings of the variables whose
bindings wouldn't otherwise be invariant across the life of the
generator. If it turns out that this is always, or almost always, the
case, across future examples too, then it would just be goofy not to
implement generator expressions that way ("well, yes, the
implementation does do a wrong thing in every example we had, but what
you're not seeing is that the explanation would have been a line
longer had the implementation done a useful thing instead" ).

> I'd need to find a unifying principle to warrant doing that beyond
> utility.

No you don't -- you just think you do .

From pje at telecommunity.com Wed Oct 22 22:12:09 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 22 22:11:32 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz>
References:
Message-ID: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>

At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
>This would allow the current delayed-evaluation semantics
>to be kept as the default, while eliminating any need
>for using the default-argument hack when you don't
>want delayed evaluation.

Does anybody actually have a use case for delayed evaluation? Why
would you ever *want* it to be that way? (Apart from the BDFL's desire
to have the behavior resemble function behavior.)

And, if there's no use case for delayed evaluation, why make people
jump through hoops to get the immediate binding?
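For what it's worth, one everyday pattern does depend on late lookup: a closure that is meant to track a setting rebound after the closure is created. A contrived sketch, not from the thread:

```python
# With late binding, the helper sees the *current* value of threshold
# each time it is called; freezing the binding at definition time
# would silently pin the original value instead.
threshold = 10
over = lambda x: x > threshold  # threshold looked up at call time

before = over(15)   # compares against 10
threshold = 20
after = over(15)    # compares against the rebound 20
```

Whether that pattern ever arises inside a generator *expression*, as opposed to an ordinary closure, is exactly the open question here.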
From tim_one at email.msn.com Wed Oct 22 22:19:06 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:19:12 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> Message-ID: [Greg Ewing] > Lying awake thinking about this sort of thing last night, > I found myself wondering if there should be a way of > explicitly requesting that a name be evaluated at closure > creation time, e.g. > > pipe = source > for p in predicates: > pipe = e for e in pipe if ^p(e) > > where the ^ means that p is evaluated in the enclosing > scope when the closure is created, and bound to a slot > which behaves like a default-argument slot (but is > separate from the default arguments). As explained in the original email, the example is also a disaster if pipe's binding isn't captured at creation-time too. > This would allow the current delayed-evaluation semantics > to be kept as the default, while eliminating any need > for using the default-argument hack when you don't > want delayed evaluation. Well, I have yet to see an example where delayed evaluation is of any use in a generator expression, except for a 100%-contrived example that simply illustrated that the semantics can in fact differ (which I hope isn't something anyone questioned to begin with ). Try writing a real example. If it needs delayed evaluation in a plausible way, great. I'm still batting 0 at trying to find such a thing; I confess I wasn't moved by the it = f(x) for x in whatever def f(x): blah example (there being no apparent need to contort the order of the assignments except, again, to illustrate that semantics have consequences). From guido at python.org Wed Oct 22 22:28:59 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:29:29 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 14:36:41 +1300." 
<200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> Message-ID: <200310230229.h9N2SxA28642@12-236-54-216.client.attbi.com> > I have just had the experience of writing a bunch > of expressions of the form > > "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params > > and found myself getting quite confused by all the parentheses > and "s" suffixes. I would *really* like to be able to write > this as > > "create index %{table}_lid1_idx on %{table}(%{lid1})" % params > > which I find to be much easier on the eyes. Wouldn't this be even better? "create index ${table}_lid1_idx on $table($lid1)" % params --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Oct 22 22:30:05 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:30:11 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310221215.27570.gmccaughan@synaptics-uk.com> Message-ID: [Gareth McCaughan] > > "Aussonderungsaxiom" is the axiom of *separation*[1], which is > a weakened version of the (disastrous) axiom of *comprehension*. Ya, sez you . Seriously, I don't think the usage is as consistent as you would have us believe here. When listcomps were introduced, I suggested at the time that "list separations" would be a better name for them (for the reason you gave), but the historical precedent set by SETL, and carried over into Haskell, means "comprehension" will stick forever in this context. I don't think the distinction is consistent across math texts either. > In terms of Python's listcomps, comprehension would be [x if P(x)] > and separation [x for x in S if P(x)]. So we should be > calling them "list separations", really :-). Yes, we should. SETL and Haskell also required specifying a base set (or list) from which elements are chosen, so they also should have called them separations. > [1] Hence the name; compare English "sunder". 
> > For the record, I like "generator expressions" too, or "iterator expressions". > Good! Guido has decided you love the former, and I agree . From guido at python.org Wed Oct 22 22:34:08 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:34:29 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 14:49:45 +1300." <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> References: <200310230149.h9N1njU19481@oma.cosc.canterbury.ac.nz> Message-ID: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> > Lying awake thinking about this sort of thing last night, > I found myself wondering if there should be a way of > explicitly requesting that a name be evaluated at closure > creation time, e.g. > > pipe = source > for p in predicates: > pipe = e for e in pipe if ^p(e) > > where the ^ means that p is evaluated in the enclosing > scope when the closure is created, and bound to a slot > which behaves like a default-argument slot (but is > separate from the default arguments). Bah. Arbitrary semantics bound to line-noise characters. Guess what that reminds me of. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tim_one at email.msn.com Wed Oct 22 22:40:23 2003 From: tim_one at email.msn.com (Tim Peters) Date: Wed Oct 22 22:40:33 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> Message-ID: [Guido] > Bah. Arbitrary semantics bound to line-noise characters. Guess what > that reminds me of. :-) I sure hope the answer isn't "Python 3"! well-you-*did*-move-to-california-ly y'rs - tim From guido at python.org Wed Oct 22 22:48:40 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 22 22:48:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 13:56:24 +1300." 
<200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
References: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz>
Message-ID: <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com>

> Thinking about the way this works in APL, where you can say things
> like
>
>     total = + / numbers
>
> one reason it's so compact is that the system knows what the identity
> is for each operator, so you don't have to specify the starting value
> explicitly. Another is the use of a binary operator.
>
> So if we postulate a "reducing protocol" that requires function
> objects to have a __div__ method that performs reduction with a
> suitable identity, then we can write
>
>     total = operator.add / numbers
>
> Does that look succinct enough?

It still suffers from my main problem with reduce(), which is not its
verbosity (far from it) but that except for some special cases (mainly
sum and product) I have to stand on my head to understand what it
does. This is even the case for examples like

    reduce(lambda x, y: x + y.foo, seq)

which is hardly the epitome of complexity. Who here knows for sure it
shouldn't rather be

    reduce(lambda x, y: x.foo + y, seq)

without going through an elaborate step-by-step execution? This is
inherent in the definition of reduce, and no / notation makes it go
away for me.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From sean at datamage.net Wed Oct 22 22:54:32 2003
From: sean at datamage.net (Sean Legassick)
Date: Wed Oct 22 23:00:53 2003
Subject: [Python-Dev] Re: listcomps vs. for loops
References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com>
Message-ID:

In message
<338366A6D2E2CA4C9DAEAE652E12A1DECFF239@au3010avexu1.global.avaya.com>,
"Delaney, Timothy C (Timothy)"
>Note the winking smiley above :) Although I do find the scope limiting in:
>
> for (int i=0; i < 10; ++i)
> {
> }
>
>to be a nice feature of C++ (good god - did I just say that?)
and hate >that the implementation in MSVC is broken and the control variable >leaks. Me too, but then that's because it's so much more maintainable to be able to repeat such for loops ad nauseum using the same loop var name without removing the 'int' type declarator. And happily that's not an issue in Python. (Hmmm, jumping out of lurk mode with a comment concerning C++. Apologies for the bad form but I am somewhat of a Python newbie, albeit an increasingly addicted one). Sean -- Sean Legassick sean@datamage.net http://www.informage.net - bloggin' along From greg at cosc.canterbury.ac.nz Wed Oct 22 23:32:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:32:48 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230229.h9N2SxA28642@12-236-54-216.client.attbi.com> Message-ID: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> Guido: > Wouldn't this be even better? > > "create index ${table}_lid1_idx on $table($lid1)" % params I wouldn't object to that. I'd have expected *you* to object to it, though, since it re-defines the meaning of "$" in an interpolated string. I was just trying to suggest something that would be backward-compatible. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 19:35:52 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:34:03 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310221506.h9MF6hM27365@12-236-54-216.client.attbi.com> Message-ID: <200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> Guido: > The variable of a for *statement* must be accessible after the loop > because you might want to break out of the loop with a specific > value. 
This is a common pattern that I have no intent of breaking. It wouldn't be a great hardship if the loop variable weren't accessible after the break, because you can always write for x in stuff: if meets_condition(x): result = x break do_something_with(result) which is arguably a clearer way to write it, anyway. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 20:36:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:34:11 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310221507.h9MF7od27394@12-236-54-216.client.attbi.com> Message-ID: <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> Guido: > > I probably missed it in this monster of a thread, but how do > > generator expressions do this? It seems that they'd only make > > reduce more efficient, but it would still be just as needed as > > before. > > All we need is more standard accumulator functions like sum(). There > are many useful accumulator functions that aren't easily expressed as > a binary operator but are easily done with an explicit iterator > argument, so I am hopeful that the need for reduce will disappear. But this would still be true even if we introduced such functions *without* generator expressions, i.e. given some new standard accumulator foo_accumulator which accumulates using foo_function, you can write r = foo_accumulator(some_seq) instead of r = reduce(foo_function, some_seq) regardless of whether some_seq is a regular list or a generator expression. So it seems to me that generator expressions have *no* effect on the need or otherwise for reduce, and any suggestion to that effect should be removed from the PEP as misleading and confusing. 
Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Wed Oct 22 23:37:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 22 23:37:45 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <200310230234.h9N2Y8V28680@12-236-54-216.client.attbi.com> Message-ID: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> > > pipe = source > > for p in predicates: > > pipe = e for e in pipe if ^p(e) > > Bah. Arbitrary semantics bound to line-noise characters. Guess what > that reminds me of. :-) If anyone can think of anything less line-noisy, I'm open to suggestions. The important thing is the idea of explicitly capturing an enclosing binding, however it's expressed. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From eppstein at ics.uci.edu Wed Oct 22 23:59:46 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Wed Oct 22 23:59:51 2003 Subject: [Python-Dev] Re: product() References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: In article <002401c39907$0176f5a0$e841fea9@oemcomputer>, "Raymond Hettinger" wrote: > In the course of writing up Pep 289, it became clear that > the future has a number of accumulator functions in store. > Each of these is useful with iterators of all stripes and > each helps eliminate a reason for using reduce(). Maybe it would be useful to get some feeling for how much other functions get used in reduce? 
I took a look through some of my own code, and found: - three loops with |= and &= that could have been done as a reduction on a generator expression (but for now will stay loops) - one call reduce(f,...) where f is not known until run time - no products. My guess is that, after sum, the functions used in reduce get a lot more diverse, and that trying to replace all of them with builtins is not feasible. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From tjreedy at udel.edu Thu Oct 23 00:03:29 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 00:03:35 2003 Subject: [Python-Dev] Re: closure semantics References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com><5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: "David Eppstein" wrote in message news:eppstein-567571.16030622102003@sea.gmane.org... > If I implement it as a collection of methods of an object, I then have > to include a separate function which creates an instance of the object > and immediately destroys it. This seems clumsy and also doesn't fit > with my intuition about what objects are for (representing persistent > structure). Also, again, modularity is violated -- outside code should > not be making instances of this object or accessing its methods. So why not define the class inside the master function to keep it private? For a complex algorithm, re-setup time should be relatively negligible. Terry J. Reedy From greg at cosc.canterbury.ac.nz Thu Oct 23 00:07:03 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 00:07:43 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com> Message-ID: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> > I have to stand on my head to understand what it > does. 
This is even the case for examples like > > reduce(lambda x, y: x + y.foo, seq) It occurs to me that, with generator expressions, such cases could be rewritten as reduce(lambda x, y: x + y, (z.foo for z in seq)) i.e. any part of the computation that only depends on the right argument can be factored out into the generator. So I might have to take back some of what I said earlier about generator comprehensions being independent of reduce. But if I understand you correctly, what you're saying is that the interesting cases are the ones where there isn't a ready-made binary function that does what you want, in which case you're going to have to spell everything out explicitly anyway one way or another. In that case, the most you could gain from a reduce syntax would be that it's an expression rather than a sequence of statements. But the same could be said of list comprehensions -- and *was* said quite loudly by many people in the early days, if I recall correctly. What's the point, people asked, when writing out a set of nested loops is just about as easy? Somehow we came to the conclusion that being able to write a list comprehension as an expression was a valuable thing to have, even if it wasn't significantly shorter or clearer. What about reductions? Do we feel differently? If so, why? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 00:16:19 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:15:55 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 16:32:19 +1300." 
<200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> Message-ID: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> > > Wouldn't this be even better? > > > > "create index ${table}_lid1_idx on $table($lid1)" % params > > I wouldn't object to that. I'd have expected *you* to > object to it, though, since it re-defines the meaning > of "$" in an interpolated string. I was just trying > to suggest something that would be backward-compatible. Correct, my proposal can't be backward-compatible. :-( But somehow I think that, for various cultural reasons (not just Perl :-) $ is a better character to use for interpolation than % -- this is pretty arbitrary, but it seems that $foo is just much more common than %foo as a substitution indicator, across various languages. (% is more common for C-style format strings of course.) There have been many proposals in this area, even a PEP (PEP 215, which I don't like that much, despite its use of $). Many people have also implemented something along these lines, using a function to request interpolation (or using template files etc.), and using various things (from dicts to namespaces) as the source for names. Anyway, I think this is something that can wait until 3.0, and I'd rather not have too many discussions here at once, so I'd rather unhelpfully punt than take this on for real (also for the benefit of Brett, who has to sort through all of this for his python-dev summary). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:17:28 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:17:05 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Thu, 23 Oct 2003 12:35:52 +1300." 
<200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> References: <200310222335.h9MNZqW19000@oma.cosc.canterbury.ac.nz> Message-ID: <200310230417.h9N4HSn01539@12-236-54-216.client.attbi.com> > Guido: > > The variable of a for *statement* must be accessible after the loop > > because you might want to break out of the loop with a specific > > value. This is a common pattern that I have no intent of breaking. [Greg] > It wouldn't be a great hardship if the loop variable > weren't accessible after the break, because you can > always write > > for x in stuff: > if meets_condition(x): > result = x > break > do_something_with(result) > > which is arguably a clearer way to write it, anyway. I don't know. It seems to add clutter. I don't see the big urge to limit the scope of loop control variables. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:20:30 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:19:58 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Thu, 23 Oct 2003 13:36:20 +1300." <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> References: <200310230036.h9N0aK319192@oma.cosc.canterbury.ac.nz> Message-ID: <200310230420.h9N4KUl01570@12-236-54-216.client.attbi.com> > Guido: > > > I probably missed it in this monster of a thread, but how do > > > generator expressions do this? It seems that they'd only make > > > reduce more efficient, but it would still be just as needed as > > > before. > > > > All we need is more standard accumulator functions like sum(). There > > are many useful accumulator functions that aren't easily expressed as > > a binary operator but are easily done with an explicit iterator > > argument, so I am hopeful that the need for reduce will disappear. > > But this would still be true even if we introduced such functions > *without* generator expressions, i.e. 
given some new standard > accumulator foo_accumulator which accumulates using foo_function, you > can write > > r = foo_accumulator(some_seq) > > instead of > > r = reduce(foo_function, some_seq) > > regardless of whether some_seq is a regular list or a generator > expression. > > So it seems to me that generator expressions have *no* effect on the > need or otherwise for reduce, and any suggestion to that effect should > be removed from the PEP as misleading and confusing. After some thinking, I agree. The only (indirect) link is that generator expressions make it more attractive to start writing accumulator functions, and having more accumulator functions available eliminates the need for reduce(). I'll update the PEP as needed (Raymond already toned down its mention of reduce()). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:25:49 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:25:38 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 16:37:11 +1300." <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> References: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> Message-ID: <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com> > > > pipe = source > > > for p in predicates: > > > pipe = e for e in pipe if ^p(e) > > > > Bah. Arbitrary semantics bound to line-noise characters. Guess what > > that reminds me of. :-) > > If anyone can think of anything less line-noisy, I'm > open to suggestions. The important thing is the idea of > explicitly capturing an enclosing binding, however it's > expressed. I think that no matter what notation you invent, this will remain an unpythonic thing. I can't quite explain why I feel that way. Maybe it's because it feels very strongly like a directive to the compiler -- Python's compiler likes to stay out of the way and not need help. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 00:29:13 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:28:34 2003 Subject: [Python-Dev] Re: product() In-Reply-To: Your message of "Wed, 22 Oct 2003 20:59:46 PDT." References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <200310230429.h9N4TD301617@12-236-54-216.client.attbi.com> > My guess is that, after sum, the functions used in reduce get a lot more > diverse, and that trying to replace all of them with builtins is not > feasible. That matches my intuition. I figure even if we just started deprecating reduce() without offering a replacement there wouldn't be many complaints. reduce() just doesn't get enough mileage. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Thu Oct 23 00:41:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 00:41:28 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> Guido: > Many people have also implemented something along these lines, using a > function to request interpolation (or using template files etc.), and > using various things (from dicts to namespaces) as the source for > names. I'm not asking for interpolation out of the current namespace or anything like that -- just a simple extension to the current set of formats for interpolating from a dict, that could be done right now without affecting anything. I'd be willing to supply a patch if it has some chance of being accepted. I agree that the more esoteric proposals are best left until later. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 00:50:43 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 00:50:07 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 17:41:11 +1300." <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> References: <200310230441.h9N4fBQ20402@oma.cosc.canterbury.ac.nz> Message-ID: <200310230450.h9N4ohh01673@12-236-54-216.client.attbi.com> > I'm not asking for interpolation out of the current namespace > or anything like that -- just a simple extension to the current > set of formats for interpolating from a dict, that could be > done right now without affecting anything. I'd be willing to > supply a patch if it has some chance of being accepted. > > I agree that the more esoteric proposals are best left until > later. But adding to % interpolation makes it less likely that a radically different (and better) approach will be implemented, because the status quo will be closer to "good enough" without being "right". --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Oct 23 01:11:01 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:11:11 2003 Subject: [Python-Dev] Re: Re: accumulator display syntax References: <200310230056.h9N0uON19236@oma.cosc.canterbury.ac.nz> <200310230248.h9N2meb01254@12-236-54-216.client.attbi.com> Message-ID: "Guido van Rossum" wrote in message news:200310230248.h9N2meb01254@12-236-54-216.client.attbi.com... > It still suffers from my main problem with reduce(), which is not its > verbosity (far from it) but that except for some special cases (mainly > sum and product) I have to stand on my head to understand what it > does. This is even the case for examples like > > reduce(lambda x, y: x + y.foo, seq) > > which is hardly the epitome of complexity.
Who here knows for sure it > shouldn't rather be > > reduce(lambda x, y: x.foo + y, seq) > > without going through an elaborate step-by-step execution? I do and Raymond Hettinger should. Doc bug 821701 addressed this confusion. I suggested the addition of "The first (left) argument is the accumulator; the second (right) is the update value from the sequence. The accumulator starts as the initializer, if given, or as seq[0]. " but don't know yet what Raymond actually did. For remembering, the arg order corresponds to left associativity: ...(((a op b) op c) op d) ... . For clarity, the updater should be written with real arg names: lambda sum, item: sum + item.foo Now sum.foo + item is pretty obviously wrong. I think it a mistake to make the two args of the update function look symmetric when they are not. Even if the same type, the first represents a cumulation of several values (and the last return value) while the second is just one (new) value. Terry J. Reedy From tjreedy at udel.edu Thu Oct 23 01:25:25 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:26:20 2003 Subject: [Python-Dev] Re: Re: buildin vs. shared modules References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com><200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com><7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: "Thomas Heller" wrote in message news:65ihlodo.fsf@python.net... > The whole delayload/__try/__except stuff may be unneeded in 2.4, because > it will most probably be compiled with MSVC7.1, installed via an msi > installer, and all systems where the msi actually could be installed > would already have a winsock (or winsock2) dll. At least that is my > impression on what I hear about systems older than (or including?) > win98SE these days. There are a *lot* of Win98 systems that are not officially 'SE', although a lot of SE stuff has been added thru Windows Update. 
They are both newer and more numerous, I believe, than some of the other OSes supported. I would hate for Python to cease working on them. (I have one, and my wife three or four.) So I would hope that a C7.1 build is tested on such before an irrevocable commitment is made. Terry J. Reedy From guido at python.org Thu Oct 23 01:43:40 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 01:43:24 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Thu, 23 Oct 2003 17:07:03 +1300." <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> Message-ID: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com> [Guido] > > I have to stand on my head to understand what it > > does. This is even the case for examples like > > > > reduce(lambda x, y: x + y.foo, seq) [Greg] > It occurs to me that, with generator expressions, > such cases could be rewritten as > > reduce(lambda x, y: x + y, (z.foo for z in seq)) > > i.e. any part of the computation that only depends on > the right argument can be factored out into the > generator. So I might have to take back some of what > I said earlier about generator comprehensions being > independent of reduce. > > But if I understand you correctly, what you're saying > is that the interesting cases are the ones where there > isn't a ready-made binary function that does what > you want, in which case you're going to have to spell > everything out explicitly anyway one way or another. (And then spelling it out so that it works with reduce() reduces clarity.) > In that case, the most you could gain from a reduce > syntax would be that it's an expression rather than > a sequence of statements. > > But the same could be said of list comprehensions -- > and *was* said quite loudly by many people in the early > days, if I recall correctly. What's the point, people > asked, when writing out a set of nested loops is just > about as easy? 
Some people still hate LC's for this reason. > Somehow we came to the conclusion that being able to > write a list comprehension as an expression was a > valuable thing to have, even if it wasn't significantly > shorter or clearer. What about reductions? Do we feel > differently? If so, why? IMO LC's *are* significantly clearer because the notation lets you focus on what goes into the list (e.g. the expression "x**2") and under what conditions (e.g. the condition "x%2 == 1") rather than how you get it there (i.e. the initializer "result = []" and the call "result.append(...)"). This is an incredibly common idiom in the use of loops; for experienced programmers the boilerplate disappears when they read the code, but for less experienced readers it takes more time to recognize the idiom. I think this is at least in part due to the fact that there are more details that can be written differently, e.g. the name of the result variable, and exactly at which point it is initialized. I think that for reductions the gains are less clear. The initializer for the result variable and the call that updates it are no longer boilerplate, because they vary for each use; plus the name of the result variable should be chosen carefully because it indicates what kind of result it is (e.g. a sum or product). So, leaving out the condition for now, the pattern or idiom is:

    <result> = <initializer>
    for <variable> in <sequence>:
        <result> = <expression>

(Where <expression> uses <result> and <variable>.) If we think of this as a template with parameters, there are five parameters! (A LC without a condition only has 3: <expression>, <variable> and <sequence>.) No matter how hard you try, a macro with 5 parameters will have a hard time conveying the meaning of each without being at least as verbose as the full template. We could reduce the number of template parameters to 4 by leaving <result> anonymous; we could then refer to it by e.g. "_" in <expression>, which is more concise and perhaps acceptable, but makes certain uses more strained (e.g. mean() below).
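Written out as an ordinary higher-order function rather than new syntax, the five-parameter template above boils down to something like the following sketch (essentially reduce() with a mandatory initializer; the name reduction and the examples are illustrative):

```python
def reduction(initializer, update, sequence):
    # <result> = <initializer>
    # for <variable> in <sequence>: <result> = <expression>
    result = initializer
    for item in sequence:
        result = update(result, item)
    return result

S = [1, 2, 3]
sum_sq = reduction(0, lambda _, x: _ + x**2, S)           # 1 + 4 + 9 == 14
horner = reduction(0, lambda _, c: _ * 2 + c, [6, 3, 4])  # 6*2**2 + 3*2 + 4 == 34
```

The `<variable>` parameter of the template disappears here because the update callable names it itself; that is exactly the "leave `<result>` anonymous" trade-off discussed above, applied one step further.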
Just for fun, let me try to propose a macro syntax:

    reduction(<initializer>, <expression>, <variable>, <sequence>)

(I think it's better to have <initializer> as the first parameter, but you can quibble.) For example:

    reduction(0, _+x**2, x, S)

Lavishly sprinkle syntactic sugar, and perhaps it can become this ('reduction' would have to be a reserved word):

    reduction(0, _+x**2 for x in S)

A few more examples using this notation:

    # product(S), if Raymond's product() builtin is accepted
    reduction(1, _*x for x in S)

    # mean of f(x); uses result tuple and needs result postprocessing
    total, n = reduction((0, 0), (_[0]+f(x), _[1]+1) for x in S)
    mean = total/n

    # horner(S, x): evaluate a polynomial over x: [6, 3, 4] => 6*x**2 + 3*x + 4
    reduction(0, _*x + c for c in S)

In each of these cases I have the same gut response as to writing these using reduce(): the notation is too "concentrated", I have to think so hard before I understand what it does that I wouldn't mind having it spread over three lines. Compare the above four examples to:

    sum = 0
    for x in S:
        sum += x**2

    product = 1
    for x in S:
        product *= x

    total, n = 0, 0
    for x in S:
        total += f(x)
        n += 1
    mean = total/n

    horner = 0
    for c in S:
        horner = horner*x + c

I find that these cause much less strain on the eyes. (BTW the horner example shows that insisting on augmented assignment would reduce the power.) Concluding, I think the reduce() pattern is doomed -- the template is too complex to capture in special syntax. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 01:48:17 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 01:47:44 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: Your message of "Wed, 22 Oct 2003 22:07:48 EDT." References: Message-ID: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> (This is drawing to a conclusion. Summary: Tim has convinced me.)
> > There are other places in Python where some rule is applied to "all > > free variables of a given piece of code" (the distinction between > > locals and non-locals in functions is made this way). But there are > > no other places where implicit local *copies* of all those free > > variables are taken. > > I didn't suggest to copy anything, just to capture the bindings in use at > the time a generator expression is evaluated. Sorry, I meant a pointer copy, not an object copy. That's a binding capture. > This is easy to explain, and trivial to explain for people familiar > with the default-argument trick. Phillip Eby already recommended not bothering with that; the default-argument rule is actually confusing for newbies (they think the defaults are evaluated at call time) so it's best not to bring this into the picture. > Whenever I've written a list-of-generators, or in the recent example > a generator pipeline, I have found it semantically necessary, > without exception so far, to capture the bindings of the variables > whose bindings wouldn't otherwise be invariant across the life of > the generator. If it turns out that this is always, or nearly > always, the case, across future examples too, then it would > just be goofy not to implement generator expressions that way > ("well, yes, the implementation does do a wrong thing in every > example we had, but what you're not seeing is that the explanation > would have been a line longer had the implementation done a useful > thing instead" ). > > > I'd need to find a unifying principle to warrant doing that beyond > > utility. > > No you don't -- you just think you do . OK, I got it now. I hope we can find another real-life example; but there were some other early toy examples that also looked quite convincing. I'll take a pass at updating the PEP.
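For the record, the semantics eventually specified in PEP 289 went the other way: only the outermost iterable of a generator expression is evaluated immediately, and other free variables remain late-bound. A minimal sketch (illustrative names) of the late-binding behavior Tim's list-of-generators examples are about:

```python
def make_gens():
    gens = []
    for i in range(3):
        gens.append(i * x for x in (1, 2))  # i is a free variable, looked up lazily
    return gens

# By the time the generators run, i is 2 in make_gens's scope,
# so all three generators produce the same values.
results = [list(g) for g in make_gens()]
```

This is exactly the case where capturing the binding of `i` at creation time (or using the default-argument trick) would have given three distinct generators.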
--Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Thu Oct 23 01:51:45 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Thu Oct 23 01:51:50 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: "Alex Martelli" wrote in message news:20031022161137.96353.qmail@web40513.mail.yahoo.com... > Inside a module M's body ("toplevel" in it, not nested inside > a def &c) I can write > x = 23 > and it means M.x = 23 (unconditionally). Once the module > object M is created, if I want to tweak that attribute > of M I have to write e.g. M.x = 42 after getting ahold of > some reference to M (say by an "import M", or say in a function > of M by sys.modules[__name__].x = 42, etc). I somehow overlooked that this would work inside modules also.

    >>> import __main__ as m # I know, not general, just for trial
    >>> m.c=3
    >>> c
    3
    >>> def e():
    ...     m.x='ha'
    ...
    >>> e()
    >>> x
    'ha'

So I really *don't* need global. Perhaps a new builtin

    def me():
        import sys
        return sys.modules[__name__]

or an addition to my template.py file. Terry J. Reedy From guido at python.org Thu Oct 23 02:42:12 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 02:41:29 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Wed, 22 Oct 2003 21:20:30 PDT."
Message-ID: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> I've checked in an update to Raymond's PEP 289 which (I hope) clarifies a lot of things, and settles the capturing of free variables. Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) I'm sure there will be plenty of misunderstandings in the discussion there. If these are due to lack of detail or clarity in the PEP, feel free to update the PEP. If there are questions that need us to go back to the drawing board or requiring BDFL pronouncement, take it back to python-dev. --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Oct 23 02:41:00 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 02:42:43 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: <4qy1qfs5.fsf@python.net> References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> <4qy1qfs5.fsf@python.net> Message-ID: Thomas Heller writes: > VC7 can convert VC6 workspace and project files into its own format, > but there is no way back. You cannot use VC7 files (they are called > solution instead of workspace) in VC6 anymore. MvL suggested to convert > the files once and then deprecate using the VC6 workspace. Indeed: Conversion works fairly well, but we (as python-devers) should agree on using a single compiler - otherwise, conflicting changes will occur. So I propose to actually move the VC6 project files elsewhere; anybody who wants to continue to use them would need to copy them back. I could implement that very quickly; I just need agreement that we should do so. We would also need agreement on whether to use VC7 (Studio .NET) or VC 7.1 (Studio .NET 2003); I propose to use the latter. 
> MvL again has the idea to create the msi (which is basically a database) > programmatically with Python - either via COM, a custom Python extension > or maybe ctypes. I haven't made much progress with that, though. Initially I plan to use the MSI COM interface, and I'm fairly certain that this can be done, but it also takes some effort. On the plus side, anybody could then do the packaging - you would only need PythonWin installed. That requirement could be dropped by using the C API to installer. To build necessary extension module, you would need to have the Installer SDK installed (which comes with the platform SDK); I haven't checked whether VC 7.1 ships with the necessary libraries (in which case there would be no additional prerequisites). Regards, Martin From bac at OCF.Berkeley.EDU Thu Oct 23 02:48:03 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 23 02:48:13 2003 Subject: [Python-Dev] setjmp/longjmp exception handling (was: More informative error messages) In-Reply-To: References: Message-ID: <3F9779A3.7000504@ocf.berkeley.edu> Tim Peters wrote: > An internal PyExc_AttributeError isn't the same as a user-visible > AttributeError, though -- a class instance isn't created unless and until > PyErr_NormalizeException() gets called because the exception needs to be > made user-visible. If the latter never happens, setting and clearing > exceptions internally is pretty cheap (a pointer to the global > PyExc_AttributeError object is stuffed into the thread state). OTOH, almost > every call to a C API function has to test+branch for an error-return value, > and I've often wondered whether a setjmp/longjmp-based hack might allow for > cleaner and more optimizable code (hand-rolled "real exception handling"). > For some odd reason (maybe because of all the code touch-ups I did to Python/ast.c in the AST branch), the idea of doing exception handling in C using setjmp/longjmp really appealed to me. 
So, being a programmer with an itch that needed to be scratched, I came up with a possible solution. Even if the idea would work (which I don't know if it will just because I am not sure how thread-safe it is nor if the code will work; this was a mental exercise that doesn't compile because of casting of jmp_buf and such), I doubt it will ever be incorporated into Python just because it would require so much change to the C code. But hey, who knows. The basic idea is to keep a stack of jmp_buf points. They are pushed on to the stack when a chunk of code wants to handle an exception. The basic code is in the function try_except(); have an 'if' that calls a function that pushes on to the stack a new jmp_buf and register it in the conditional check. When an exception is raised a function is called (makejmp()) that pops the stack and jumps to the jmp_buf that is popped. Continue until the last item on the stack is reached which should be PyErr_NormalizeException() (I think that is the function that exposes an exception to Python code). I have no clue how much performance benefit/loss there would be from this, but code would be cleaner since you wouldn't have to do constant ``if (fxn() == NULL) return NULL;`` checks for raised exceptions. 
But in case anyone cares, here is the *very* rough C code:

#include <setjmp.h>
#include <stdlib.h>
#include <string.h>  /* for memmove() */

/* Basically just a stack item */
typedef struct jmp_stack_item_struct {
    jmp_buf jmp_point;
    struct jmp_stack_item_struct *previous;
} jmp_stack_item;

/* Global stack of jmp points to exception handlers */
jmp_stack_item *jmp_stack;

void try_except(void)
{
    jmp_stack = NULL;
    /* try: */
    if (!setjmp(allocjmp())) {
        ;
    }
    /* except: */
    else {
        ;
    }
}

/* returning jmp_buf like this makes gcc unhappy since it is an array */
jmp_buf allocjmp(void)
{
    /* malloc jmp_buf and put on top of stack */
    /* return malloc'ed jmp_buf */
    jmp_stack_item *new_jmp = (jmp_stack_item *) malloc(sizeof(jmp_stack_item));
    if (!jmp_stack) {
        new_jmp->previous = NULL;
    }
    else {
        new_jmp->previous = jmp_stack;
    }
    jmp_stack = new_jmp;
    return new_jmp->jmp_point;
}

void raise(void)
{
    /* Exception set; now call... */
    makejmp();
}

void makejmp(void)
{
    jmp_stack_item *top_jmp = jmp_stack;
    jmp_buf jmp_to;
    if (!jmp_stack->previous)
        longjmp(jmp_stack->jmp_point, 1);
    else {
        memmove(jmp_to, top_jmp->jmp_point, sizeof(jmp_to));
        jmp_stack = top_jmp->previous;
        free(top_jmp);
        longjmp(jmp_to, 1);
    }
}

From guido at python.org Thu Oct 23 02:49:37 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 02:48:55 2003 Subject: [Python-Dev] Re: buildin vs. shared modules In-Reply-To: Your message of "23 Oct 2003 08:41:00 +0200." References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> <200310221442.h9MEgQN27219@12-236-54-216.client.attbi.com> <4qy1qfs5.fsf@python.net> Message-ID: <200310230649.h9N6nbS02025@12-236-54-216.client.attbi.com> > > VC7 can convert VC6 workspace and project files into its own format, > > but there is no way back. You cannot use VC7 files (they are called > > solution instead of workspace) in VC6 anymore.
MvL suggested to convert > > the files once and then deprecate using the VC6 workspace. > > Indeed: Conversion works fairly well, but we (as python-devers) should > agree on using a single compiler - otherwise, conflicting changes will > occur. So I propose to actually move the VC6 project files elsewhere; > anybody who wants to continue to use them would need to copy them back. > > I could implement that very quickly; I just need agreement that we > should do so. We would also need agreement on whether to use VC7 > (Studio .NET) or VC 7.1 (Studio .NET 2003); I propose to use the > latter. Right. Microsoft donated 10 copies of VC7.1 to various key Python developers (including me, Tim Peters and Jeremy Hylton). > > MvL again has the idea to create the msi (which is basically a database) > > programmatically with Python - either via COM, a custom Python extension > > or maybe ctypes. > > I haven't made much progress with that, though. Initially I plan to > use the MSI COM interface, and I'm fairly certain that this can be > done, but it also takes some effort. > > On the plus side, anybody could then do the packaging - you would only > need PythonWin installed. That requirement could be dropped by using > the C API to installer. To build necessary extension module, you would > need to have the Installer SDK installed (which comes with the > platform SDK); I haven't checked whether VC 7.1 ships with the > necessary libraries (in which case there would be no additional > prerequisites). --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Thu Oct 23 02:45:19 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 02:57:43 2003 Subject: [Python-Dev] Re: Re: buildin vs. 
shared modules In-Reply-To: References: <200310171840.h9HIesN06941@12-236-54-216.client.attbi.com> <200310211857.57783.aleaxit@yahoo.com> <200310211940.h9LJeuK24693@12-236-54-216.client.attbi.com> <7k2ywden.fsf@yahoo.co.uk> <65ihlodo.fsf@python.net> Message-ID: "Terry Reedy" writes: > So I would hope that a C7.1 build is tested on such before an > irrevocable commitment is made. That will happen only if there are volunteers to test it. Those volunteers would need to be very active while the transition occurs, i.e. build from CVS instead of just trying out installable packages (because initially, there would not be any installable packages). That said, I'm quite confident that a VC7.1-built-MSI-packaged application could be installed even on Win95 - you would have to install installer first, though (by means of the four-files packaging approach). Regards, Martin From mcherm at mcherm.com Thu Oct 23 03:29:48 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 23 03:29:49 2003 Subject: [Python-Dev] product() Message-ID: <1066894188.3f97836cbb106@mcherm.com> [Raymond, recently]: > The product() accumulator is the one destined to be a builtin. > > Though it is not nearly as common as sum(), it does enjoy > some popularity. Having it available will help dispense > with reduce(operator.mul, data, 1). > > Would there be any objections to my adding product() to > Py2.4? The patch was simple and it is ready to go unless > someone has some major issue with it. Just wanted to bring you a blast from the past: [http://mail.python.org/pipermail/python-dev/2003-April/034784.html] [Alex Martelli:] > I think I understand the worry that introducing 'sum' would be the start > of a slippery slope leading to requests for 'prod' (I can't think of other > bulk operations that would be at all popular -- perhaps bulk and/or, but > I think that's stretching it). But I think it's a misplaced worry in this > case. 
"Adding up a bunch of numbers" is just SO much more common > than "Multiplying them up" (indeed the latter's hardly idiomatic English, > while "adding up" sure is), that I believe normal users (as opposed to > advanced programmers with a keenness on generalization) wouldn't > have any problem at all with 'sum' being there and 'prod' missing... I have nothing to add... Alex said it much better than I could. -- Michael Chermside From barry at python.org Thu Oct 23 07:59:11 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 07:59:15 2003 Subject: [Python-Dev] product() In-Reply-To: <002401c39907$0176f5a0$e841fea9@oemcomputer> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <1066910350.11634.7.camel@anthem> On Wed, 2003-10-22 at 21:43, Raymond Hettinger wrote: > In the course of writing up Pep 289, it became clear that > the future has a number of accumulator functions in store. In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at resurrecting PEP 274, what if we made dict (and maybe tuple) accumulator functions too? Then if something like dict(genex) would work, how hard would it be to add some syntactic sugar for that in {genex}? Aren't we kind of close already? >>> from __future__ import generators >>> def a(): ... for x in 'hello world': ... yield x ... >>> dict([(c, c) for c in a()]) {' ': ' ', 'e': 'e', 'd': 'd', 'h': 'h', 'l': 'l', 'o': 'o', 'r': 'r', 'w': 'w'} Okay, I promise, I'll shut up now about PEP 274. pass-the-joe-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/628f09a2/attachment.bin From pinard at iro.umontreal.ca Thu Oct 23 08:31:53 2003 From: pinard at iro.umontreal.ca (=?iso-8859-1?Q?Fran=E7ois?= Pinard) Date: Thu Oct 23 08:32:08 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <20031023123153.GA20072@alcyon.progiciels-bpi.ca> [Guido van Rossum] > > > Wouldn't this be even better? > > > "create index ${table}_lid1_idx on $table($lid1)" % params "Better" because it uses `$' instead of `%'? It is really a matter of taste and aesthetics, more than being "better" on technical grounds. Technically, the multiplication of aspects and paradigms goes against some unencumbrance and simplicity, which made Python attractive to start with. We would lose something probably not worth the gain. > it seems that $foo is just much more common than > %foo as a substitution indicator, across various languages. Python has the right of being culturally distinct on some details. I see it as an advantage: when languages are too similar, some confusion arises between differences. The distinction actually helps. > Anyway, I think this is something that can wait until 3.0, and I'd > rather not have too many discussions here at once,
In-Reply-To: References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: <1066915360.11634.11.camel@anthem> On Thu, 2003-10-23 at 01:51, Terry Reedy wrote: > So I really *don't* need global. Perhaps a new builtin > > def me(): > import sys > return sys.modules[__name__] +1, or just "import __me__" I've often wanted a convenient way to get a hold of the current module object. I use something like def me(), but it's a bit ugly and magical looking. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/5aa0a928/attachment-0001.bin From skip at pobox.com Thu Oct 23 09:55:22 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 09:55:53 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> Message-ID: <16279.56778.309781.129469@montanaro.dyndns.org> Greg> I would *really* like to be able to write this as Greg> "create index %{table}_lid1_idx on %{table}(%{lid1})" % params Greg> which I find to be much easier on the eyes. What if lid1 is a float which you want to display with two digits past the decimal point? I think we've been around the block on this one a few times. While %{foo} might be a convenient shorthand for %(foo)s, I don't think it saves enough space (one character) or stands out that much more ("{...}" instead of "(...)s") to make the addition worthwhile. In addition, you'd have to retain the current construct in cases where something other than simple string interpolation was required, in which case you also have the problem of having two almost identical ways to do dictionary interpolation. 
Skip From skip at pobox.com Thu Oct 23 10:08:11 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:08:19 2003 Subject: [Python-Dev] Re: product() In-Reply-To: References: <002401c39907$0176f5a0$e841fea9@oemcomputer> Message-ID: <16279.57547.169388.138165@montanaro.dyndns.org> David> Maybe it would be useful to get some feeling for how much other David> functions get used in reduce? Looking at my own code collection I found five instances of reduce(), all used either a defined sum function or the equivalent lambda. There are probably many other contexts where I might have used reduce, but where it either didn't occur to me or didn't make the code easier to read or faster. I'd be happy if you deprecated reduce() today. ;-) Skip From skip at pobox.com Thu Oct 23 10:16:02 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:16:10 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> Message-ID: <16279.58018.40303.136992@montanaro.dyndns.org> >>> import __main__ as m # I know, not general, just for trial >>> m.c=3 Isn't (in 3.0) the notion of being able to modify another module's globals supposed to get restricted to help out (among other things) the compiler? If so, this use, even though it's not really modifying a global in another module, might not work forever. Skip From ark-mlist at att.net Thu Oct 23 10:18:48 2003 From: ark-mlist at att.net (Andrew Koenig) Date: Thu Oct 23 10:18:55 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <009301c39970$94247530$6402a8c0@arkdesktop> > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) One thought: If we eventually adopt the notation that {a, b, c} is a set, there is a potential ambiguity in expressions such as {x**2 for x in range(n)}. 
Which is it, a set comprehension or a set with one element that is a generator expression? It would have to be the former, of course, by analogy with [x**2 for x in range(n)], which means that if we introduce generator expressions, and we later introduce set literals, we will have to introduce set comprehensions at the same time. Either that or prohibit generator expressions as set-literal elements unless parenthesized -- i.e. {(x**2 for x in range(n))}. From barry at python.org Thu Oct 23 10:46:48 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 10:46:57 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> Message-ID: <1066920408.11634.89.camel@anthem> On Thu, 2003-10-23 at 00:16, Guido van Rossum wrote: > There have been many proposals in this area, even a PEP (PEP 215, > which I don't like that much, despite its use of $). And PEP 292, which I probably should update. I should mention that $string substitutions are optional in Mailman 2.1, but they will be the only way to do it in Mailman 3. I've played a lot with various implementations of this idea, and below is the one I've currently settled on. Not all of the semantics may be perfect for core Python (i.e. never throw a KeyError), but this is all doable in modern Python, and for user-exposed templates, gets a +1000 in my book. 
>>> s = dstring('${person} lives in $where and owes me $$${amount}')
>>> d = safedict(person='Guido', where='California', amount='1,000,000')
>>> print s % d
Guido lives in California and owes me $1,000,000
>>> d = safedict(person='Tim', amount=.13)
>>> print s % d
Tim lives in ${where} and owes me $0.13

-Barry

import re

# Search for $$, $identifier, or ${identifier}
dre = re.compile(r'(\${2})|\$([_a-z]\w*)|\${([_a-z]\w*)}', re.IGNORECASE)

EMPTYSTRING = ''

class dstring(unicode):
    def __new__(cls, ustr):
        ustr = ustr.replace('%', '%%')
        parts = dre.split(ustr)
        for i in range(1, len(parts), 4):
            if parts[i] is not None:
                parts[i] = '$'
            elif parts[i+1] is not None:
                parts[i+1] = '%(' + parts[i+1] + ')s'
            else:
                parts[i+2] = '%(' + parts[i+2] + ')s'
        return unicode.__new__(cls, EMPTYSTRING.join(filter(None, parts)))

class safedict(dict):
    """Dictionary which returns a default value for unknown keys."""
    def __getitem__(self, key):
        try:
            return super(safedict, self).__getitem__(key)
        except KeyError:
            return '${%s}' % key

-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/2684c83b/attachment.bin From barry at python.org Thu Oct 23 10:53:15 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 10:53:22 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <20031023123153.GA20072@alcyon.progiciels-bpi.ca> References: <200310230332.h9N3WIm20194@oma.cosc.canterbury.ac.nz> <200310230416.h9N4GJv01528@12-236-54-216.client.attbi.com> <20031023123153.GA20072@alcyon.progiciels-bpi.ca> Message-ID: <1066920795.11634.96.camel@anthem> On Thu, 2003-10-23 at 08:31, François Pinard wrote: > [Guido van Rossum] > > > > Wouldn't this be even better?
> > > > "create index ${table}_lid1_idx on $table($lid1)" % params > > "Better" because it uses `$' instead of `%'? It is really a matter of > taste and aesthetics, more than being "better" on technical grounds. > Technically, the multiplication of aspects and paradigms goes against > some unencumbrance and simplicity, which made Python attractive to > start with. We would lose something probably not worth the gain. Better because the trailing type specifier on %-strings is extremely error prone (#1 cause of bugs for Mailman translators is/was leaving off the trailing 's'). Better because the rules for $-strings are simple and easy to explain. Better because the enclosing braces are optional, and unnecessary in the common case, making for much more readable template strings. And yes, better because it uses $ instead of %; it just seems that more people grok that $foo is a placeholder. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/81a3c016/attachment.bin From guido at python.org Thu Oct 23 10:56:27 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 10:57:58 2003 Subject: [Python-Dev] product() In-Reply-To: Your message of "Thu, 23 Oct 2003 07:59:11 EDT." <1066910350.11634.7.camel@anthem> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <1066910350.11634.7.camel@anthem> Message-ID: <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> > In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at > resurrecting PEP 274, what if we made dict (and maybe tuple) > accumulator functions too? There's nothing magical about accumulator functions; they're just functions taking an iterable. We have tons of these today, and tuple() and dict() are among them.
Once the syntax works, dict((k,k) for k,k in "hello") will work without changes to dict. --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Oct 23 10:56:31 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 10:58:05 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: References: Message-ID: <16279.60447.29714.759275@montanaro.dyndns.org> fred> - add "Why is Python installed on my computer?" as a documentation fred> FAQ since this gets asked at the docs at python.org address a fred> lot And I thought only webmaster@python.org got asked that question all the time. Does it get asked at other addresses as well? I don't recall ever seeing it on python-list. Skip From python at rcn.com Thu Oct 23 10:58:02 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 10:58:54 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <002e01c39976$0f130040$e841fea9@oemcomputer> [Guido] > I've checked in an update to Raymond's PEP 289 which (I hope) > clarifies a lot of things, and settles the capturing of free > variables. Nice edits. I'm unclear on the meaning of the last line in detail #3, "(Loop variables may also use constructs like x[i] or x.a; this form may be deprecated.)" Does this mean that "(x.a for x in mylist)" will initially be valid but will someday break? If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed. > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) Will do. Raymond Hettinger From barry at python.org Thu Oct 23 11:02:16 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:02:23 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
In-Reply-To: <16279.56778.309781.129469@montanaro.dyndns.org> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> Message-ID: <1066921335.11634.103.camel@anthem> On Thu, 2003-10-23 at 09:55, Skip Montanaro wrote: > What if lid1 is a float which you want to display with two digits past the > decimal point? BTW, I should mention that IMO, $-strings are great for end-user editable string templates, such as (in Mailman) things like translatable strings or message footer templates. But I also think the existing %-strings are just fine for programmers. I would definitely be opposed to complicating $-strings with any of the specialized and fine-grained control you have with %-strings. KISS and you'll have a great 99% solution, as long as you accept that the two substitution formats are aimed at different audiences. Then again, see my last post. I'm not sure anything needs to be added to core Python to support useful $-strings. Or maybe it can be implemented as a library module (or part of a 'textutils' package). -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/b1b7b4bf/attachment.bin From guido at python.org Thu Oct 23 11:03:35 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:03:50 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: Your message of "Thu, 23 Oct 2003 10:18:48 EDT." <009301c39970$94247530$6402a8c0@arkdesktop> References: <009301c39970$94247530$6402a8c0@arkdesktop> Message-ID: <200310231503.h9NF3Zr02681@12-236-54-216.client.attbi.com> > If we eventually adopt the notation that {a, b, c} is a set, there is a > potential ambiguity in expressions such as {x**2 for x in range(n)}. 
Which > is it, a set comprehension or a set with one element that is a generator > expression? > > It would have to be the former, of course, by analogy with > [x**2 for x in range(n)], which means that if we introduce generator > expressions, and we later introduce set literals, we will have to introduce > set comprehensions at the same time. Either that or prohibit generator > expressions as set-literal elements unless parenthesized -- i.e. > {(x**2 for x in range(n))}. Don't worry. The current proposal *always* requires parentheses around generator expressions (but it may be the only argument to a function), so your example would be illegal. --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Thu Oct 23 11:04:41 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:04:48 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <009301c39970$94247530$6402a8c0@arkdesktop> References: <009301c39970$94247530$6402a8c0@arkdesktop> Message-ID: <1066921481.11634.106.camel@anthem> On Thu, 2003-10-23 at 10:18, Andrew Koenig wrote: > > Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) > > One thought: > > If we eventually adopt the notation that {a, b, c} is a set, there is a > potential ambiguity in expressions such as {x**2 for x in range(n)}. Which > is it, a set comprehension or a set with one element that is a generator > expression? > > It would have to be the former, of course, by analogy with > [x**2 for x in range(n)], which means that if we introduce generator > expressions, and we later introduce set literals, we will have to introduce > set comprehensions at the same time. Either that or prohibit generator > expressions as set-literal elements unless parenthesized -- i.e. > {(x**2 for x in range(n))}. Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension. 
okay-/now/-i'll-shut-up-about-them-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/31b49715/attachment-0001.bin From barry at python.org Thu Oct 23 11:05:57 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:06:19 2003 Subject: [Python-Dev] product() In-Reply-To: <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <1066910350.11634.7.camel@anthem> <200310231456.h9NEuRn02615@12-236-54-216.client.attbi.com> Message-ID: <1066921556.11634.108.camel@anthem> On Thu, 2003-10-23 at 10:56, Guido van Rossum wrote: > > In a crazy, I-haven't-yet-had-my-coffee-yet desperate attempt at > > resurrecting PEP 274, what if we made dict (and maybe tuple) > > accumulator functions too? > > There's nothing magical about accumulator functions; they're just > functions taking an iterable. We have tons of these today, and > tuple() and dict() are among them. Once the syntax works, > > dict((k,k) for k,k in "hello") > > will work without changes to dict. Cool! -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/cd65cd86/attachment.bin From fdrake at acm.org Thu Oct 23 11:09:09 2003 From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu Oct 23 11:09:22 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <002e01c39976$0f130040$e841fea9@oemcomputer> References: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> <002e01c39976$0f130040$e841fea9@oemcomputer> Message-ID: <16279.61205.953516.442124@grendel.zope.com> Raymond Hettinger writes: > Does this mean that "(x.a for x in mylist)" will initially be valid but > will someday break? If so, I can't imagine why. Or does it mean that > the induction variable can be in that form, "(x for x.a in mylist)". > Surely, this would never be allowed. The latter. There's bound to be some seriously evil stuff out there, just waiting to pop up... ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From ark-mlist at att.net Thu Oct 23 11:12:07 2003 From: ark-mlist at att.net (Andrew Koenig) Date: Thu Oct 23 11:12:13 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <1066921481.11634.106.camel@anthem> Message-ID: <00b401c39978$074ba8b0$6402a8c0@arkdesktop> > Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension. No, it's a set comprehension where the set elements are pairs. The dict comprehension would be {x: x**2 for x in range(n)} Or would that be a single-element dict whose key is x and value is a generator expression? :-) From fdrake at acm.org Thu Oct 23 11:18:36 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 23 11:18:52 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <1066921335.11634.103.camel@anthem> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> Message-ID: <16279.61772.978424.304106@grendel.zope.com> Barry Warsaw writes: > Then again, see my last post. I'm not sure anything needs to be added > to core Python to support useful $-strings.
Or maybe it can be > implemented as a library module (or part of a 'textutils' package). +1 on adding this as a module. I've managed to implement this a few times, and it would be nice to just import the same implementation from everywhere I needed it. One note: calling this "interpolation" (at least when describing it to end users) is probably a mistake; "substitution" makes more sense to people not ingrained in communities where it's called interpolation. It might be ok to call it interpolation for programmers, but... there's no need for two different names for it. ;-) -Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From skip at pobox.com Thu Oct 23 11:22:40 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 11:22:50 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <1066921335.11634.103.camel@anthem> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> Message-ID: <16279.62016.628120.971560@montanaro.dyndns.org> Barry> On Thu, 2003-10-23 at 09:55, Skip Montanaro wrote: >> What if lid1 is a float which you want to display with two digits >> past the decimal point? Barry> BTW, I should mention that IMO, $-strings are great for end-user Barry> editable string templates, such as (in Mailman) things like Barry> translatable strings or message footer templates. ... Barry> Then again, see my last post. I'm not sure anything needs to be Barry> added to core Python to support useful $-strings. Or maybe it Barry> can be implemented as a library module (or part of a 'textutils' Barry> package). +1. If it's not something programmers will use (most of the time, anyway) there's no need to build it into the language. If programmers like it, it's only another module to import. In addition, I'm fairly certain such a module could be made compatible with Python as far back as 1.5.2 without a lot of effort. 
You also have the freedom to make it much more flexible (use of templates and so forth) if it's in a separate module. Skip From guido at python.org Thu Oct 23 11:35:56 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:36:51 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Thu, 23 Oct 2003 10:58:02 EDT." <002e01c39976$0f130040$e841fea9@oemcomputer> References: <002e01c39976$0f130040$e841fea9@oemcomputer> Message-ID: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> > I'm unclear on the meaning of the last line in detail #3, "(Loop > variables may also use constructs like x[i] or x.a; this form may be > deprecated.)" > > Does this mean that "(x.a for x in mylist)" will initially be valid but > will someday break? No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, and consequently (because they all share the same syntax) this is also allowed in list comprehensions and generator expressions. All uses should be disallowed. > If so, I can't imagine why. Or does it mean that > the induction variable can be in that form, "(x for x.a in mylist)". > Surely, this would never be allowed. We can prevent it for generator expressions, but it's too late for list comprehensions and regular for loops -- we'll have to go deprecate it there. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 11:38:18 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 11:38:26 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: Your message of "Thu, 23 Oct 2003 10:22:40 CDT."
<16279.62016.628120.971560@montanaro.dyndns.org> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> <16279.62016.628120.971560@montanaro.dyndns.org> Message-ID: <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> I have too much on my plate (spent too much on generator expressions lately :-). I am bowing out of the variable substitution discussion after noting that putting it in a module would be a great start (like for sets). --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Thu Oct 23 11:48:15 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 11:48:26 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> References: <002e01c39976$0f130040$e841fea9@oemcomputer> <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> Message-ID: <16279.63551.557669.791100@montanaro.dyndns.org> Guido> No, I meant that "for x.a in mylist: ..." is valid but shouldn't Guido> be, Valid? I'll buy that, but it had never occurred to me. Useful? That's not immediately obvious:

>>> class Foo:
...     def __init__(self):
...         self.a = 42
...
>>> lst = [Foo() for i in range(4)]
>>> lst
[<__main__.Foo instance at 0x752760>, <__main__.Foo instance at 0x7529e0>,
 <__main__.Foo instance at 0x752df0>, <__main__.Foo instance at 0x752dc8>]
>>> [x for x.a in lst]
[Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.,
 Type help() for interactive help, or help(object) for help about object.]
Skip From ws-news at gmx.at Thu Oct 23 11:47:24 2003 From: ws-news at gmx.at (Werner Schiendl) Date: Thu Oct 23 11:51:35 2003 Subject: [Python-Dev] Re: PEP 289: Generator Expressions (second draft) References: <1066921481.11634.106.camel@anthem> <00b401c39978$074ba8b0$6402a8c0@arkdesktop> Message-ID: Hello, this is my first post to this list, but I have followed it passively for quite some time. I had a thought about the list-with-one-iterator vs. list-comprehension issue that has not appeared (at least to my eyes) yet. Why not take the same approach as is already used for tuples? Just as (5) is just the value 5 and (5,) is a 1-tuple containing the value 5, I thought it would be intuitive to have

[x**2 for x in range(n)]   # be a list comprehension like it currently is
[x**2 for x in range(n),]  # a list with 1 iterator in it

> No, it's a set comprehension where the set elements are pairs. The dict > comprehension would be > > {x: x**2 for x in range(n)} > > Or would that be a single-element dict whose key is x and value is a > generator expression? :-) in this case the same could be applied

{x: x**2 for x in range(n)}   # dict comprehension
{x: x**2 for x in range(n),}  # dict with 1 iterator

(but "x" is probably not a valid name, is it?) best regards Werner From Paul.Moore at atosorigin.com Thu Oct 23 11:54:19 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 23 11:55:06 2003 Subject: [Python-Dev] PEP 289: Generator Expressions Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0991B@UKDCX001.uk.int.atosorigin.com> From: Skip Montanaro [mailto:skip@pobox.com] > Guido> No, I meant that "for x.a in mylist: ..." is valid but shouldn't > Guido> be, > Valid? I'll buy that, but it had never occurred to me. Useful? That's not > immediately obvious: Well, I'll certainly give you "not obviously useful", but...

>>> class Dummy:
...     def __init__(self):
...         self.a = 12
...
>>> d = Dummy()
>>> d.a
12
>>> [9 for d.a in range(4)]
[9, 9, 9, 9]
>>> d.a
3

Paul From python at rcn.com Thu Oct 23 11:55:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 11:57:12 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310231535.h9NFZuc02814@12-236-54-216.client.attbi.com> Message-ID: <001101c3997e$094742e0$e841fea9@oemcomputer> [Guido] > No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, > and consequently (because they all share the same syntax) this is also > allowed in list comprehensions and generator expressions. All uses > should be disallowed. > > > If so, I can't imagine why. Or does it mean that > > the induction variable can be in that form, "(x for x.a in mylist)". > > Surely, this would never be allowed. > > We can prevent it for generator expressions, but it's too late for > list comprehensions and regular for loops -- we'll have to go > deprecate it there. Since the issue is not unique to generator expressions, I recommend leaving it out of the PEP and separately dealing with all for-constructs at one time. It's harder to win support for proposals that use the word "deprecate". Raymond Hettinger From barry at python.org Thu Oct 23 11:58:08 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 11:58:15 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
> > +1 on adding this as a module. Wasn't there talk of a textutils package around the time of textwrap.py? Maybe add that for Py2.4? > I've managed to implement this a few times, and it would be nice to > just import the same implementation from everywhere I needed it. > > One note: calling this "interpolation" (at least when describing it to > end users) is probably a mistake; "substitution" makes more sense to > people not ingrained in communities where it's called interpolation. > It might be ok to call it interpolation for programmers, > but... there's no need for two different names for it. ;-) Again +1 isn't strong enough. :) End users understand "substitution", they don't understand "interpolation". If started to use the former everywhere now. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/765c511d/attachment.bin From barry at python.org Thu Oct 23 12:01:04 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 23 12:01:55 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz> <16279.56778.309781.129469@montanaro.dyndns.org> <1066921335.11634.103.camel@anthem> <16279.62016.628120.971560@montanaro.dyndns.org> <200310231538.h9NFcIW02840@12-236-54-216.client.attbi.com> Message-ID: <1066924863.11634.159.camel@anthem> On Thu, 2003-10-23 at 11:38, Guido van Rossum wrote: > I have too much on my plate (spent too much on generator expressions > lately :-). > > I am bowing out of the variable substitution discussion after noting > that putting it in a module would be a great start (like for sets). 
I don't have time to do it, but once Someone figures out where to situate it, feel free to use my posted code, either verbatim or as a starting point. PSF donation, blah, blah, blah. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20031023/6c6e568d/attachment.bin From python at rcn.com Thu Oct 23 12:04:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 23 12:07:02 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <16279.63551.557669.791100@montanaro.dyndns.org> Message-ID: <001801c3997f$524bc6e0$e841fea9@oemcomputer> [Skip Montanaro] > Valid? I'll buy that, but it had never occurred to me. It had not occurred to me either. A moment's reflection on the implementation reveals that any lvalue will work, even a[:]. Rather than twist ourselves into knots trying to find ways to disallow it, I think it should be left in the realm of things that never occur to anyone and have never been a real problem. don't-ask-don't-tell-ly yours, Raymond From guido at python.org Thu Oct 23 13:04:16 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 13:04:24 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: Your message of "Thu, 23 Oct 2003 09:16:02 CDT." <16279.58018.40303.136992@montanaro.dyndns.org> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <16279.58018.40303.136992@montanaro.dyndns.org> Message-ID: <200310231704.h9NH4Gw03094@12-236-54-216.client.attbi.com> > >>> import __main__ as m # I know, not general, just for trial > >>> m.c=3 > > Isn't (in 3.0) the notion of being able to modify another module's globals > supposed to get restricted to help out (among other things) the compiler?
> If so, this use, even though it's not really modifying a global in another > module, might not work forever. That's one reason why I'd rather continue to use 'global' than some attribute assignment. To the compiler, module globals are more special than class variables etc. because they can shadow builtins. Therefore the compiler would like to know about *all* assignments to module globals. Similarly, assignment to locals in outer scopes needs to be known to the compiler because it must make sure that all locals referenced by inner scopes are implemented as cells. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 13:09:05 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 13:09:12 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 13:15:48 +1300." <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> Message-ID: <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> > Guido: > > My problem with the nested functions is that it is much harder to get > > a grasp of what the shared state is -- any local variable in the outer > > function *could* be part of the shared state, and the only way to tell > > for sure is by inspecting all the subfunctions. > > That would be solved if, instead of marking variables > in inner scopes that refer to outer scopes, it were > the other way round, and variables in the outer scope > were marked as being rebindable in inner scopes. [Greg]

> def f():
>     rebindable x
>     def inc_x_by(i):
>         x += i  # rebinds outer x
>     x = 39
>     inc_x_by(3)
>     return x

This would only apply to *assignment* from inner scopes, not to *use* from inner scopes, right? (Otherwise it would be seriously backwards incompatible.) I'm not sure I like it much, because it gives outer scopes (some) control over inner scopes.
One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) --Guido van Rossum (home page: http://www.python.org/~guido/) From jmarshal at mathworks.com Thu Oct 23 13:42:23 2003 From: jmarshal at mathworks.com (Joshua Marshall) Date: Thu Oct 23 13:42:29 2003 Subject: [Python-Dev] closure semantics Message-ID: <7224B63940F10F40A48AC423597ADE57012DC7BA@MESSAGE-AH.ad.mathworks.com> > [Jeremy] > > I'm not averse to introducing a new keyword, which would address both > > concerns. yield was introduced with apparently little problem, so it > > seems possible to add a keyword without causing too much disruption. > > > > If we decide we must stick with global, then it's very hard to address > > Alex's concern about global being a confusing word choice . [Guido] > OK, the tension is mounting. Which keyword do you have in > mind? And would you use the same keyword for module-globals > as for outer-scope variables? I'd like to suggest "outer v" for this. The behavior could be to scan outward for the first definition of v. If the only outer-scope variable is at module-level, then the behavior would be the same as "global v". Or if everyone is comfortable enough re-using the keyword "global", then I also like "global v in f". 
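None of the proposed spellings existed at the time; the only way to get the effect of Greg's inc_x_by example was to mutate a shared container instead of rebinding. A sketch of that workaround (not of any of the proposals above):

```python
def f():
    box = [39]          # one-element list stands in for a rebindable outer x

    def inc_x_by(i):
        box[0] += i     # mutates the container; no rebinding, so no declaration

    inc_x_by(3)
    return box[0]

print(f())  # 42
```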
From skip at pobox.com Thu Oct 23 13:51:16 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 13:51:30 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> Message-ID: <16280.5396.284178.989033@montanaro.dyndns.org> >> That would be solved if, instead of marking variables in inner scopes >> that refer to outer scopes, it were the other way round, and >> variables in the outer scope were marked as being rebindable in inner >> scopes. ... Guido> This would only apply to *assignment* from inner scopes, not to Guido> *use* from inner scopes, right? (Otherwise it would be seriously Guido> backwards incompatible.) Given that the global keyword or something like it is here to stay (being preferable over some attribute-style access) and that global variable writes need to be known to the compiler for future efficiency reasons, I think we need to consider modifications of the current global statement. The best thing I've seen so far (I forget who proposed it) is 'global' vars [ 'in' named_scope ] where named_scope can only be the name of a function which encloses the function containing the declaration. In Greg's example of inc_x_by nested inside f, he'd have declared "global x in f" in inc_x_by. The current global statement (without a scoping clause) would continue to refer to the outermost scope of the module. This should be compatible with existing usage. The only problem I see is whether the named_scope needs to be known at compile time or if it can be deferred until run time.
For example, should this import random def outer(a): x = a def inner(a): x = 42 def innermost(r): if r < 0.5: global x in inner else: global x in outer x = r print " inner, x @ start:", x innermost(random.random()) print " inner, x @ end:", x print "outer, x @ start:", x inner(a) print "outer, x @ end:", x outer(12.73) be valid? My thought is that it shouldn't. Skip From tim at zope.com Thu Oct 23 14:44:24 2003 From: tim at zope.com (Tim Peters) Date: Thu Oct 23 14:45:33 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <001801c3997f$524bc6e0$e841fea9@oemcomputer> Message-ID: FYI, some of the implementations of the backtracking conjoin() operator in test_generators.py make heavy use of for values[i] in gs[i](): style for-loops. That style is often useful when generating vectors representing combinatorial objects. I could live without it, but so far haven't needed to prove that . From fdrake at acm.org Thu Oct 23 14:51:58 2003 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Thu Oct 23 14:53:23 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: <16279.60447.29714.759275@montanaro.dyndns.org> References: <16279.60447.29714.759275@montanaro.dyndns.org> Message-ID: <16280.9038.338759.771647@grendel.zope.com> Skip Montanaro writes: > And I thought only webmaster@python.org got asked that question all the > time. Does it get asked at other addresses as well? I don't recall ever > seeing it on python-list. I wouldn't expect to see it on python-list. Aren't the people who ask generally people who *aren't* in the Python community? They're going to look for the easiest ways to ask, so that generally means googling for "Python" and using whatever contact address is on one of the first pages they find. The first two Google's showing me now are: http://www.python.org/ ( webmaster at python.org ) http://www.python.org/doc/ ( docs at python.org ) Wanna guess where the questions are going to go? 
-Fred -- Fred L. Drake, Jr. PythonLabs at Zope Corporation From martin at v.loewis.de Thu Oct 23 16:30:16 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 23 16:31:07 2003 Subject: [Python-Dev] setjmp/longjmp exception handling (was: More informative error messages) In-Reply-To: <3F9779A3.7000504@ocf.berkeley.edu> References: <3F9779A3.7000504@ocf.berkeley.edu> Message-ID: "Brett C." writes: > The basic idea is to keep a stack of jmp_buf points. This is an old implementation strategy for exceptions in C++; e.g. GNU g++ uses it with the -fsjlj-exceptions option. It is generally discouraged as it is *really* expensive: it requires a lot of memory per jmpbuf, and it requires that the memory is filled. In addition, for Python, there would be no simplification: each stack frame needs to perform "all" DECREFs. To convert this to exception handling, you would get very many nested try-catch blocks, as each allocation of some object would need to be followed with a try-catch block. So if you have 5 objects allocated in a function, you would need a nesting of 5 levels - i.e. up to column 40. Regards, Martin From pete at shinners.org Thu Oct 23 16:22:59 2003 From: pete at shinners.org (Pete Shinners) Date: Thu Oct 23 16:31:58 2003 Subject: [Python-Dev] random.choice IndexError on empty list Message-ID: (this should potentially go on sourceforge's bug tracker? alas i have no account right now) Another user and I were scratching our heads over why random.choice() was raising "IndexError: list index out of range". For a while we were thinking random.random() must have been returning >= 1.0. It turns out an empty list was being passed. I would suggest either a ValueError is raised, or a different exception message, perhaps more like the results of >>> [].pop() Traceback (most recent call last): File "", line 1, in ? IndexError: pop from empty list The patch is trivial, but I can provide it if there is an agreed-upon response. 
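Pete's report is easy to reproduce; the sketch below shows the bare IndexError and a hypothetical checked_choice wrapper (not a stdlib function, just an illustration of the ValueError suggestion). Later Python 3 releases took the other route suggested here, keeping IndexError but with the clearer message "Cannot choose from an empty sequence".

```python
import random

def checked_choice(seq):
    # Hypothetical wrapper illustrating the suggestion above: raise
    # an explicit ValueError rather than letting the empty sequence
    # surface as "IndexError: list index out of range".
    if not seq:
        raise ValueError("cannot choose from an empty sequence")
    return random.choice(seq)

try:
    random.choice([])  # choice() indexes the sequence, so this raises IndexError
except IndexError as exc:
    print("random.choice([]) raised", type(exc).__name__)

print(checked_choice(["a", "b", "c"]) in ("a", "b", "c"))  # True
```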
From jrw at pobox.com Thu Oct 23 17:40:33 2003 From: jrw at pobox.com (John Williams) Date: Thu Oct 23 17:40:43 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> Message-ID: <3F984AD1.5040306@pobox.com> Skip Montanaro wrote: > Given that the global keyword or something like it is here to stay (being > preferable over some attribute-style access) and that global variable writes > needs to be known to the compiler for future efficiency reasons, I think we > need to consider modifications of the current global statement. The best > thing I've seen so far (I forget who proposed it) is > > 'global' vars [ 'in' named_scope ] ... > This should be compatible with existing usage. The only problem I see is > whether the named_scope needs to be known at compile time or if it can be > deferred until run time. How about (to abuse a keyword that's gone unmolested for too long) global foo from def to declare that foo refers to a variable in a lexically enclosing function definition? This avoids the need to name a specific function (which IMHO is just a source of confusion over the semantics of strange cases) while still having some mnemonic value (foo "comes from" an enclosing function definition). 
jw From skip at pobox.com Thu Oct 23 17:46:33 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 17:46:43 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <3F984AD1.5040306@pobox.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> Message-ID: <16280.19513.571537.789185@montanaro.dyndns.org> John> How about (to abuse a keyword that's gone unmolested for too long) John> global foo from def John> to declare that foo refers a variable in a lexically enclosing John> function definition? This avoids to need to name a specific John> function (which IMHO is just a source of confusion over the John> semantics of strange cases) while still having some mnemonic value John> (foo "comes from" an enclosing function definition). How do you indicate the particular scope to which foo will be bound (there can be many lexically enclosing function definitions)? Using my example again: def outer(a): x = a def inner(a): x = 42 def innermost(r): global x from def # <--- your notation x = r print " inner, x @ start:", x innermost(random.random()) print " inner, x @ end:", x print "outer, x @ start:", x inner(a) print "outer, x @ end:", x how do you tell Python that x inside innermost is to be associated with the x in inner or the x in outer? 
Skip From zack at codesourcery.com Thu Oct 23 17:58:54 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Thu Oct 23 17:58:59 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16280.19513.571537.789185@montanaro.dyndns.org> (Skip Montanaro's message of "Thu, 23 Oct 2003 16:46:33 -0500") References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> Message-ID: <87k76vehup.fsf@egil.codesourcery.com> Skip Montanaro writes: > John> How about (to abuse a keyword that's gone unmolested for too long) > > John> global foo from def > > John> to declare that foo refers a variable in a lexically enclosing > John> function definition? This avoids to need to name a specific > John> function (which IMHO is just a source of confusion over the > John> semantics of strange cases) while still having some mnemonic value > John> (foo "comes from" an enclosing function definition). > > How do you indicate the particular scope to which foo will be bound (there > can be many lexically enclosing function definitions)? Using my example > again: > > def outer(a): > x = a > def inner(a): > x = 42 > def innermost(r): > global x from def # <--- your notation > x = r > print " inner, x @ start:", x > innermost(random.random()) > print " inner, x @ end:", x > print "outer, x @ start:", x > inner(a) > print "outer, x @ end:", x > > how do you tell Python that x inside innermost is to be associated with the > x in inner or the x in outer? Maybe "global foo from " ? Or, "from function_name global foo" is consistent with import, albeit somewhat weird. I would never use this feature; I avoid nested functions entirely. 
However, as long as we're talking about this stuff, I wish I could write "global foo" at module scope and have that mean "this variable is to be treated as global in all functions in this module". zw From guido at python.org Thu Oct 23 18:06:58 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:07:06 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 12:51:16 CDT." <16280.5396.284178.989033@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> Message-ID: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> [Skip] > Given that the global keyword or something like it is here to stay > (being preferable over some attribute-style access) (Actually I expect more pushback from Alex once he's back from his trip. He seems to feel strongly about this. :-) > and that global variable writes needs to be known to the compiler > for future efficiency reasons, I think we need to consider > modifications of the current global statement. The best thing I've > seen so far (I forget who proposed it) is > > 'global' vars [ 'in' named_scope ] > > where named_scope can only be the name of a function which encloses > the function containing the declaration. That was my first suggestion earlier this week. The main downside (except from propagating 'global' :-) is that if you rename the function defining the scope you have to fix all global statements referring to it. I saw a variant where the syntax was 'global' vars 'in' 'def' which solves that concern (though not particularly elegantly). > In Greg's example of inc_x_by nested inside f, he'd have declared: > > global x in f > > in inc_x_by. The current global statement (without a scoping > clause) would continue to refer to the outermost scope of the > module. > > This should be compatible with existing usage. 
The only problem I > see is whether the named_scope needs to be known at compile time or > if it can be deferred until run time. Definitely compile time. 'f' has to be a name of a lexically enclosing 'def'; it's not an expression. The compiler needs to know which scope it refers to so it can turn the correct variable into a cell. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 18:08:58 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:09:07 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 14:58:54 PDT." <87k76vehup.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> Message-ID: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> > However, as long as we're talking about this stuff, I wish I could > write "global foo" at module scope and have that mean "this variable > is to be treated as global in all functions in this module". This is similar to Greg Ewing's proposal to have 'rebindable x' at an outer function scope. My problem with it remains: It gives outer scopes (some) control over inner scopes. One of the guidelines is that a name defined in an inner scope should always shadow the same name in an outer scope, to allow evolution of the outer scope without affecting local details of inner scope. (IOW if an inner function defines a local variable 'x', the outer scope shouldn't be able to change that.) 
--Guido van Rossum (home page: http://www.python.org/~guido/) From zack at codesourcery.com Thu Oct 23 18:27:01 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Thu Oct 23 18:27:05 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> (Guido van Rossum's message of "Thu, 23 Oct 2003 15:08:58 -0700") References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> Message-ID: <87brs7egju.fsf@egil.codesourcery.com> Guido van Rossum writes: >> However, as long as we're talking about this stuff, I wish I could >> write "global foo" at module scope and have that mean "this variable >> is to be treated as global in all functions in this module". > > This is similar to Greg Ewing's proposable to have 'rebindable x' at > an outer function scope. My problem with it remains: > > It gives outer scopes (some) control over inner scopes. One of the > guidelines is that a name defined in an inner scope should always > shadow the same name in an outer scope, to allow evolution of the > outer scope without affecting local details of inner scope. (IOW if > an inner function defines a local variable 'x', the outer scope > shouldn't be able to change that.) Frankly, I wish Python required one to write explicit declarations for all variables in the program: var x, y, z # module scope class bar: classvar I, J, K # class variables var i, j, k # instance variables def foo(...): var a, b, c # function scope ... It's extra bondage and discipline, yeah, but it's that much more help comprehending the program six months later, and it also gets rid of the "how was this variable name supposed to be spelled again?" question. 
zw From jrw at pobox.com Thu Oct 23 18:31:48 2003 From: jrw at pobox.com (John Williams) Date: Thu Oct 23 18:32:08 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> Message-ID: <3F9856D4.3000404@pobox.com> Skip Montanaro wrote: > > John> global foo from def > > How do you indicate the particular scope to which foo will be bound (there > can be many lexically enclosing function definitions)? Using my example > again: > > def outer(a): > x = a > def inner(a): > x = 42 > def innermost(r): > global x from def # <--- your notation > x = r > print " inner, x @ start:", x > innermost(random.random()) > print " inner, x @ end:", x > print "outer, x @ start:", x > inner(a) > print "outer, x @ end:", x > > how do you tell Python that x inside innermost is to be associated with the > x in inner or the x in outer? I can think of two reasonable possibilities--either it refers to the innermost possible variable, or the compiler rejects this case outright. Either way the problem is easy to solve by renaming one of the variables. Sorry I wasn't clear--I really only meant to propose a new syntax for the already-proposed "global foo in def". For some reason I can't quite put my finger on, "in def" looks to me like it's referring to the function where the statement occurs, but "from def" looks like it refers to some other function. jw From raymond.hettinger at verizon.net Thu Oct 23 18:38:10 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Thu Oct 23 18:39:01 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete Message-ID: <001101c399b6$56d67a20$e841fea9@oemcomputer> Was there a reason for leaving this out of the API or should it be added? 
Is the right way to simulate a pop something like this: n = PyList_GET_SIZE(outbasket); if (n == 0) { PyErr_SetString(PyExc_IndexError, "Pop from an empty list."); return NULL; } result = PyList_GetItem(outbasket, n-1); if (result == NULL) return NULL; Py_INCREF(result); empty_list = PyList_New(0); if (empty_list == NULL) { Py_DECREF(result); return NULL; } err = PyList_SetSlice(outbasket, n-1, n, empty_list); Py_DECREF(empty_list); if (err == -1) { Py_DECREF(result); return NULL; } return result; /* Whew, that was a lot of code just to have a popped result */ Raymond Hettinger From guido at python.org Thu Oct 23 18:42:50 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 18:43:14 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete In-Reply-To: Your message of "Thu, 23 Oct 2003 18:38:10 EDT." <001101c399b6$56d67a20$e841fea9@oemcomputer> References: <001101c399b6$56d67a20$e841fea9@oemcomputer> Message-ID: <200310232242.h9NMgoG03818@12-236-54-216.client.attbi.com> > Was there a reason for leaving this out of the API It is much newer than that set of API functions, and I guess nobody thought about it. > or should it be added? Unclear -- how often does one need this? Can't you call it using one of the higher-level method-calling helpers? > Is the right way to simulate a pop something like this: No time to check, it should do the same as listpop(). 
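As a reference point for the C above, the semantics a PyList_Pop() would need to match — what Guido means by "the same as listpop()" — are just these (pop_like is an illustrative Python rendering, not an existing API):

```python
def pop_like(lst):
    # Reference semantics of list.pop() with no argument: remove and
    # return the last item, raising IndexError on an empty list.
    if not lst:
        raise IndexError("pop from empty list")
    result = lst[-1]
    del lst[-1:]  # the slice deletion the C snippet does via PyList_SetSlice
    return result

items = [1, 2, 3]
print(pop_like(items), items)  # 3 [1, 2]
```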
--Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at bluewin.ch Thu Oct 23 19:08:38 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 23 19:07:03 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> At 10:57 22.10.2003 -0700, Guido van Rossum wrote: > def tee(iterable): > "Return two independent iterators from a single iterable" > data = {} > cnt = 0 > def gen(next): > global* cnt > dpop = data.pop > for i in count(): > if i == cnt: > item = data[i] = next() > cnt += 1 > else: > item = dpop(i) > yield item > next = iter(iterable).next > return (gen(next), gen(next)) > >which is IMO more readable. it's a subtle piece of code. I wouldn't mind a more structured syntax with both the outer function declaring that is ok for some inner function to rebind some of its locals, and the inner function declaring that a local is coming from an outer scope: def tee(iterable): "Return two independent iterators from a single iterable" data = {} # cnt = 0 here would be ok share cnt = 0: # the assignment is opt, # inner functions in the suite can rebind cnt def gen(next): use cnt # OR outer cnt dpop = data.pop for i in count(): if i == cnt: item = data[i] = next() cnt += 1 else: item = dpop(i) yield item # cnt = 0 here would be ok next = iter(iterable).next return (gen(next), gen(next)) yes it's heavy and unpythonic, but it makes very clear that something special is going on with cnt. no time to add anything else to the thread. regards. From guido at python.org Thu Oct 23 19:22:44 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 19:22:54 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 01:08:38 +0200." 
<5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> Message-ID: <200310232322.h9NNMiA03864@12-236-54-216.client.attbi.com> > > def tee(iterable): > > "Return two independent iterators from a single iterable" > > data = {} > > cnt = 0 > > def gen(next): > > global* cnt > > dpop = data.pop > > for i in count(): > > if i == cnt: > > item = data[i] = next() > > cnt += 1 > > else: > > item = dpop(i) > > yield item > > next = iter(iterable).next > > return (gen(next), gen(next)) > > > >which is IMO more readable. > > it's a subtle piece of code. I wouldn't mind a more structured syntax with > both the outer function declaring that is ok for some inner function to > rebind some of its locals, and the inner function declaring that a local is > coming from an outer scope: > > def tee(iterable): > "Return two independent iterators from a single iterable" > data = {} > > # cnt = 0 here would be ok > > share cnt = 0: # the assignment is opt, > # inner functions in the suite can rebind cnt > def gen(next): > use cnt # OR outer cnt > dpop = data.pop > for i in count(): > if i == cnt: > item = data[i] = next() > cnt += 1 > else: > item = dpop(i) > yield item > > # cnt = 0 here would be ok > > next = iter(iterable).next > return (gen(next), gen(next)) > > yes it's heavy and unpythonic, but it makes very clear that something > special is going on with cnt. Might as well declare a class then. :-) > no time to add anything else to the thread. Ditto. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Thu Oct 23 19:48:55 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 19:49:18 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? 
In-Reply-To: <16279.56778.309781.129469@montanaro.dyndns.org> Message-ID: <200310232348.h9NNmti28349@oma.cosc.canterbury.ac.nz> Skip Montanaro : > I think we've been around the block on this one a few times. While %{foo} > might be a convenient shorthand for %(foo)s, I don't think it saves enough > space (one character) or stands out that much more ("{...}" instead of > "(...)s") to make the addition worthwhile. I disagree strongly -- I think it *does* stand out more clearly. The "s" on the end of "%(name)s" too easily gets mixed up with other alphanumeric stuff nearby. If it were just "%(name)" *without* the trailing "s" it wouldn't be nearly as bad, but unfortunately it can't be left off and remain backwards compatible. > What if lid1 is a float which you want to display with two digits > past the decimal point? Then I would use the existing construct -- I'm not suggesting that it be removed. > in which case you also have the problem of having two almost identical > ways to do dictionary interpolation. I don't see that as a big problem. To my mind, practicality beats purity here -- "%(name)s" is too awkward to be practical for routine use. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 23 19:56:59 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 19:57:18 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <00b401c39978$074ba8b0$6402a8c0@arkdesktop> Message-ID: <200310232356.h9NNuxP28383@oma.cosc.canterbury.ac.nz> Andrew Koenig : > The dict > comprehension would be > > {x: x**2 for x in range(n)} > > Or would that be a single-element dict whose key is x and value is a > generator expression? 
:-) According to the parentheses rule, no, because that would have to be {x: (x**2 for x in range(n))} (Parentheses)-(are)-(so)-(handy)-(ly), Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From tdelaney at avaya.com Thu Oct 23 20:07:06 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Thu Oct 23 20:07:15 2003 Subject: [Python-Dev] Re: closure semantics Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> > From: John Williams [mailto:jrw@pobox.com] > > I can think of two reasonable possibilities--either it refers to the > innermost possible variable, or the compiler rejects this > case outright. > Either way the problem is easy to solve by renaming one of > the variables. Going on the principle of least surprise, I have to say that I think explicitly naming the scope in which a variable is to be used is the best approach. My concern with the other proposal is that introducing code between scopes could silently change the semantics of a piece of code. I'll use the 'outer' proposal since it's the shortest and least confusing to me ... def func1(): x = 1 def func2(): def func3(): outer x x += 2 return func3 return func2() print func1() should print: 3 Now, if we change it to: def func1(): x = 1 def func2(): x = 2 def func3(): outer x x += 2 return func3 return func2() print func1() it would now print: 4 OTOH, specifying the scope prevents this type of error: def func1(): x = 1 def func2(): def func3(): global x in func1 x += 2 return func3 return func2() print func1() and def func1(): x = 1 def func2(): x = 2 def func3(): global x in func1 x += 2 return func3 return func2() print func1() should both print 3 'global x in func1' is also a *lot* easier to explain. 
I think these two points should weigh heavily in any decision. I think the need to rename the target scope is of lesser importance. Tim Delaney From guido at python.org Thu Oct 23 20:11:36 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 20:11:46 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 10:07:06 +1000." <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com> Message-ID: <200310240011.h9O0Bav03963@12-236-54-216.client.attbi.com> [Tim Delaney] > Going on the principle of least surprise, I have to say that I think > explicitly naming the scope in which a variable is to be used is the > best approach. [...] > 'global x in func1' is also a *lot* easier to explain. > > I think these two points should weigh heavily in any decision. I > think the need to rename the target scope is of lesser importance. I have to concur. EIBTI. --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Oct 23 20:26:34 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 23 20:26:53 2003 Subject: [Python-Dev] setjmp/longjmp exception handling In-Reply-To: References: <3F9779A3.7000504@ocf.berkeley.edu> Message-ID: <3F9871BA.3060502@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>The basic idea is to keep a stack of jmp_buf points. > > > This is an old implementation strategy for exceptions in C++; e.g. GNU > g++ uses it with the -fsjlj-exceptions option. It is generally discouraged > as it is *really* expensive: it requires a lot of memory per jmpbuf, > and it requires that the memory is filled. > Figures. Oh well. At least it was interesting to figure out. 
From skip at pobox.com Thu Oct 23 21:56:26 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 22:17:32 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <87k76vehup.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <3F984AD1.5040306@pobox.com> <16280.19513.571537.789185@montanaro.dyndns.org> <87k76vehup.fsf@egil.codesourcery.com> Message-ID: <16280.34506.469575.79716@montanaro.dyndns.org> Zack> Maybe "global foo from " ? Sounds just about like the "global foo in named_scope" (where "named_scope" means enclosing function) that I described earlier. I like "in" better than "from" because it tells you more clearly that you are messing with the variable in-place, not making a copy of it into the local scope. Zack> Or, "from function_name global foo" is consistent with import, Zack> albeit somewhat weird. That reads a bit weird to me. The nice thing about the other way is that "global foo" without any qualifiers means the same thing it does today. There's also no reason to use the from form as "global foo in function" doesn't imply that you will refer to foo as "function.foo". Zack> I would never use this feature; I avoid nested functions entirely. Zack> However, as long as we're talking about this stuff, I wish I could Zack> write "global foo" at module scope and have that mean "this Zack> variable is to be treated as global in all functions in this Zack> module". I've never actually used nested scopes either, nor have I ever felt the urge. Maybe it has something to do with not having done much recent programming in a language before Python which supported them. (Pascal does, but my last Pascal experience was nearly 20 years ago.) 
Skip From skip at pobox.com Thu Oct 23 22:02:59 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 23 22:17:51 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> Message-ID: <16280.34899.211589.786953@montanaro.dyndns.org> >> 'global' vars [ 'in' named_scope ] >> >> where named_scope can only be the name of a function which encloses >> the function containing the declaration. Guido> That was my first suggestion earlier this week. The main Guido> downside (except from propagating 'global' :-) is that if you Guido> rename the function defining the scope you have to fix all global Guido> statements referring to it. Well, the listed variables are "global" to the current local scope. I find the rename argument a bit specious. If I rename a function I have to change all the references to it today. This is just one more. Since "global" is a declarative statement, the compiler can tell you immediately that it can't find the old function name. Guido> I saw a variant where the syntax was Guido> 'global' vars 'in' 'def' Guido> which solves that concern (though not particularly elegantly). I don't see how that can work though. What does 'def' mean in this case? There can be multiple lexically enclosing functions, any of which have the same local variable x which you might want modify. >> This should be compatible with existing usage. The only problem I >> see is whether the named_scope needs to be known at compile time or >> if it can be deferred until run time. Guido> Definitely compile time. 'f' has to be a name of a lexically Guido> enclosing 'def'; it's not an expression. 
The compiler needs to Guido> know which scope it refers to so it can turn the correct variable Guido> into a cell. Okay, that was easily settled. ;-) Skip From greg at cosc.canterbury.ac.nz Thu Oct 23 23:19:56 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 23:20:05 2003 Subject: [Python-Dev] PEP 289: Generator Expressions (second draft) In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com> Message-ID: <200310240319.h9O3JuX29277@oma.cosc.canterbury.ac.nz> > I've checked in an update to Raymond's PEP 289 which (I hope) > clarifies a lot of things, and settles the capturing of free > variables. I had another early-morning idea about how to deal with the free variable issue, which could also be used when you have another form of closure (lambda, def) and you want to capture some of its free variables. Suppose there were a special form of assignment new x = expr If x is not used in any nested scope, this is the same as a regular assignment. But if it is, and consequently x is kept in a cell, instead of replacing the contents of the cell, this creates a *new* cell which replaces the previous one in the current scope. But any previously created closure will still be holding on to the old cell with its old value. If you do this in a loop, you will end up with a series of incarnations of the variable, each of which lives in its own little scope. Using this, Tim's pipeline example would become pipe = source for new p in predicates: new pipe = e for e in pipe if p(e) For generator expressions, Tim's idea of just always capturing the free variables is probably better, since it doesn't require recognising a subtle problem and then applying a furtherly-subtle solution. But it seemed like a stunningly brilliant idea at 3:27am this morning, so I thought I'd share it with you. 
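The cell-sharing problem Greg's hypothetical "new" assignment attacks can be seen with ordinary closures created in a loop: they all share one binding, and the usual workaround is to capture the current value explicitly, e.g. with a default argument:

```python
def make_adders_shared(ns):
    # Every lambda closes over the single loop variable 'n',
    # so after the loop they all see its final value.
    return [lambda x: x + n for n in ns]

def make_adders_captured(ns):
    # Capturing n's value at each iteration -- the effect
    # "for new n in ns" is after: each closure gets its own cell.
    return [lambda x, n=n: x + n for n in ns]

print([f(0) for f in make_adders_shared([1, 10, 100])])    # [100, 100, 100]
print([f(0) for f in make_adders_captured([1, 10, 100])])  # [1, 10, 100]
```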
:-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 23 23:26:34 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 23 23:26:43 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <7224B63940F10F40A48AC423597ADE57012DC7BA@MESSAGE-AH.ad.mathworks.com> Message-ID: <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> Joshua Marshall : > I'd like to suggest "outer v" for this. We've been assuming all along that the semantics of a plain "global" statement have to remain exactly as they are, but is that strictly necessary? How much hardship would it cause, really, if "global" were simply redefined to mean "the next scope out where it's bound"? It would only break something if "global" were used in a nested function *and* there were a variable with the same name in some intermediate scope. That sounds like a rather rare set of conditions to me. Not significantly more common than "yield" being used as a variable name, surely? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Oct 23 23:40:52 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 23:41:21 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Thu, 23 Oct 2003 21:02:59 CDT." 
<16280.34899.211589.786953@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310231709.h9NH95703122@12-236-54-216.client.attbi.com> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> <16280.34899.211589.786953@montanaro.dyndns.org> Message-ID: <200310240340.h9O3eqL04334@12-236-54-216.client.attbi.com> > Well, the listed variables are "global" to the current local scope. > I find the rename argument a bit specious. If I rename a function I > have to change all the references to it today. This is just one > more. Since "global" is a declarative statement, the compiler can > tell you immediately that it can't find the old function name. Right, I tend to agree. > Guido> I saw a variant where the syntax was > Guido> 'global' vars 'in' 'def' > Guido> which solves that concern (though not particularly elegantly). > > I don't see how that can work though. What does 'def' mean in this > case? There can be multiple lexically enclosing functions, any of > which have the same local variable x which you might want modify. Yeah, but usually that's not a problem. The compiler knows about all those x-es, and uses the innermost (nearest) one. This matches what it does when *referencing* a non-local variable, which doesn't need a global statement. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 23 23:44:51 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 23 23:45:36 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 16:26:34 +1300." <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> References: <200310240326.h9O3QYU29285@oma.cosc.canterbury.ac.nz> Message-ID: <200310240344.h9O3ipI04351@12-236-54-216.client.attbi.com> > We've been assuming all along that the semantics of a > plain "global" statement have to remain exactly as they > are, but is that strictly necessary? 
>
> How much hardship would it cause, really, if "global"
> were simply redefined to mean "the next scope out where
> it's bound"?
>
> It would only break something if "global" were used in
> a nested function *and* there were a variable with the
> same name in some intermediate scope. That sounds like
> a rather rare set of conditions to me. Not significantly
> more common than "yield" being used as a variable name,
> surely?

Reasonable assumption. We'd have to do a survey.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From tdelaney at avaya.com  Fri Oct 24 00:26:57 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 00:27:04 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6ACF1@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>
> > We've been assuming all along that the semantics of a
> > plain "global" statement have to remain exactly as they
> > are, but is that strictly necessary?
> >
> > How much hardship would it cause, really, if "global"
> > were simply redefined to mean "the next scope out where
> > it's bound"?
>
> Reasonable assumption. We'd have to do a survey.

It would break any unadorned 'global x' in a nested scope if the name
did not exist anywhere. I'm not saying this would be good form -
personally I think anyone who did this would deserve it - but it would
definitely break.

One option would be to have an "if the name doesn't exist, it is
created in module scope". But all this creates too many exceptions to
what would otherwise be a simple rule IMO:

    global <name> [in <scope>]

where <scope> defaults to the current module.
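The limitation driving this whole thread can be shown in a short sketch (hypothetical function names; in current Python the assignment simply makes the name local to the inner function, and Python later grew `nonlocal` via PEP 3104 for exactly this case):

```python
def f():
    cnt = 0
    def bump():
        # The assignment makes cnt local to bump(), so the read on the
        # right-hand side fails: rebinding an enclosing-scope name is
        # exactly what the proposed declarations would permit.
        cnt = cnt + 1
        return cnt
    try:
        return bump()
    except UnboundLocalError:
        return "UnboundLocalError"

result = f()
print(result)  # UnboundLocalError
```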
Tim Delaney

From tim_one at email.msn.com  Fri Oct 24 00:46:59 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Fri Oct 24 00:47:11 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com>
Message-ID: 

[Tim]
>> This is easy to explain, and trivial to explain for people familiar
>> with the default-argument trick.

[Guido]
> Phillip Eby already recommended not bothering with that; the
> default-argument rule is actually confusing for newbies (they think
> the defaults are evaluated at call time) so it's best not to bring
> this into the picture.

Of course it works equally well to pass regular (non-default)
arguments, it just makes a precise explanation a little longer to type
(because the arglist needs to be typed out in two places).

> ..
> OK, I got it now. I hope we can find another real-life example; but
> there were some other early toy examples that also looked quite
> convincing.

I expect we will find more, although I haven't had more time to think
about it (and today was devoted to puzzling over excessive rates of
ZODB conflict errors, where generator expressions didn't seem
immediately applicable ).

I do think it's related to non-reiterability. If generator expressions
were reiterable, then a case could be made for them capturing a
parameterized computation, reusable for different things by varying
the bindings of the free variables. Like, say, you wanted to plot the
squares of various functions at a set of points, and then:

    squares = (f(x)**2 for x in inputs)  # assuming reiterability here
    for f in math.sin, math.cos, math.tan:
        plot(squares)

But that doesn't make sense for a one-shot (not reiterable) generator,
and even if it were reiterable I can't think of a real example that
would want the bindings of free variables to change *during* a single
pass over the results. For that matter, if it were reiterable, the
"control by obscure side effect" style of the example is hard to like
anyway.
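Tim's one-shot point is easy to check. A sketch with `plot()` replaced by `list()`, since the plotting itself is beside the point:

```python
import math

inputs = [0.1, 0.2, 0.3]
f = math.sin
squares = (f(x) ** 2 for x in inputs)

first = list(squares)   # consumes the generator
second = list(squares)  # already exhausted, so nothing comes out
print(len(first), len(second))  # 3 0
```

Rebinding `f` after the first pass changes nothing, because the generator cannot be restarted; each function needs a freshly written generator expression.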
From greg at cosc.canterbury.ac.nz  Fri Oct 24 01:04:47 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri Oct 24 01:04:58 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6ACF1@au3010avexu1.global.avaya.com>
Message-ID: <200310240504.h9O54lQ29545@oma.cosc.canterbury.ac.nz>

> It would break any unadorned 'global x' in a nested scope if the name
> did not exist anywhere.
>
> One option would be to have an "if the name doesn't exist, it is
> created in module scope".

What would be wrong with that? It's what I had in mind.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com  Fri Oct 24 02:25:52 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 02:25:59 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
>
> > It would break any unadorned 'global x' in a nested scope
> > if the name did not exist anywhere.
> >
> > One option would be to have an "if the name doesn't exist, it is
> > created in module scope".
>
> What would be wrong with that? It's what I had in mind.

It's complex. Can you explain the complete semantics of 'outer' as
simply as:

    global <name> [in <scope>]

Binds and uses <name> in another scope. If 'in <scope>' is omitted
then the name is bound and used in the scope of the current module.

My understanding of 'outer' is (and I'm not sure about this):

    outer <name>

Binds and uses <name> in the innermost scope containing the current
scope that already has <name> bound. If <name> is not bound in any
containing scope then it is bound into the scope of the current module
if <name> is used or bound while executing in the current scope.

Or something like that.
Tim Delaney

From Paul.Moore at atosorigin.com  Fri Oct 24 04:02:15 2003
From: Paul.Moore at atosorigin.com (Moore, Paul)
Date: Fri Oct 24 04:03:04 2003
Subject: [Python-Dev] closure semantics
Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com>

From: Delaney, Timothy C (Timothy) [mailto:tdelaney@avaya.com]
> It would break any unadorned 'global x' in a nested scope
> if the name did not exist anywhere.
> I'm not saying this would be good form - personally I think
> anyone who did this would deserve it - but it would definitely break.
> One option would be to have an "if the name doesn't exist, it is
> created in module scope". But all this creates too many exceptions
> to what would otherwise be a simple rule IMO:
>
>     global <name> [in <scope>]
>
> where <scope> defaults to the current module.

This made me think. What should be the effect of

    def f():
        x = 12
        def g():
            global y in f
            y = 12
        g()
        print locals()

I suspect the answer is "it's illegal". But by extension from the
current behaviour of "global", it should create a local variable in f.

Paul

From tdelaney at avaya.com  Fri Oct 24 04:11:01 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Fri Oct 24 04:11:13 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com>

> From: Moore, Paul [mailto:Paul.Moore@atosorigin.com]
>
> > From: Delaney, Timothy C (Timothy) [mailto:tdelaney@avaya.com]
> >
> >     global <name> [in <scope>]
> >
> > where <scope> defaults to the current module.
>
> This made me think. What should be the effect of
>
>     def f():
>         x = 12
>         def g():
>             global y in f
>             y = 12
>         g()
>         print locals()
>
> I suspect the answer is "it's illegal". But by extension from the
> current behaviour of "global", it should create a local variable in f.

My understanding of (all) the proposals, and what I would expect, is
identical semantics to the current 'global', but with the affected
scope changed. So yes, the above should create a local name `y` in `f`.
The local name `y` would be allocated at compile time, just like any other local name. Likewise, the following should be illegal: def f(): x = 12 y = 1 def g(): global y in f y = 12 g() print locals() because the global statement occurs after a local binding of the name. Tim Delaney From theller at python.net Fri Oct 24 05:21:16 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 24 05:21:50 2003 Subject: [Python-Dev] cleanup order Message-ID: Is the cleanup order at Python shutdown documented somewhere? The only thing I found is the (old) essay http://www.python.org/doc/essays/cleanup.html Thomas From mwh at python.net Fri Oct 24 07:27:22 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 24 07:27:30 2003 Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete In-Reply-To: <001101c399b6$56d67a20$e841fea9@oemcomputer> (Raymond Hettinger's message of "Thu, 23 Oct 2003 18:38:10 -0400") References: <001101c399b6$56d67a20$e841fea9@oemcomputer> Message-ID: <2mu15ynaed.fsf@starship.python.net> "Raymond Hettinger" writes: > Was there a reason for leaving this out of the API or should it be > added? Is the right way to simulate a pop something like this: Well, there's always PyEval_CallMethod... Cheers, mwh -- [3] Modem speeds being what they are, large .avi files were generally downloaded to the shell server instead[4]. [4] Where they were usually found by the technical staff, and burned to CD. 
-- Carlfish, asr From ncoghlan at iinet.net.au Fri Oct 24 08:18:14 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Oct 24 08:18:16 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> References: <200310200444.h9K4ijH24154@oma.cosc.canterbury.ac.nz> <200310201815.h9KIFM821583@12-236-54-216.client.attbi.com> <3F953793.1000208@iinet.net.au> <200310211841.45711.aleaxit@yahoo.com> <3F967FFC.6040507@iinet.net.au> <200310221606.h9MG5wo27539@12-236-54-216.client.attbi.com> Message-ID: <3F991886.3090309@iinet.net.au> Guido van Rossum strung bits together to say: >>I had a similar thought about 5 minutes after turning my computer off last >>night. The alternative I came up with was: >> >> y = (from result = 0.0 do result += x**2 for x in values if x > 0) > > > I think you're aiming for the wrong thing here; I really see no reason > why you'd want to avoid writing this out as a real for loop if you > don't have an existing accumulator function (like sum()) to use. One interesting thing is that I later realised that iterator comprehensions combined with the sum function would actually cover 90% of the accumulation functions I would ever write. So Raymond turns out to be correct when he suggests that generator expressions may limit the need for reduce functions and accumulation loops. With the sum() built in around, they will cover a large number of the reduction operations encountered in real life. Previously, sum() was not available, and even if it had been the cost of generating the entire list to be summed may have been expensive (if the values to be summed are a function of the stored values, rather than a straight sum). So while I think a concise reduction syntax was worth aiming for, I'm also willing to admit that it seems to be basically impossible to manage without violating Python's maxim of "one obvious way to do it". 
The combination of generator expressions and the various builtins that operate on iterables (especially sum()) is a superior solution. Still, I learned a few interesting things I didn't know last week :) Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From skip at pobox.com Fri Oct 24 08:37:38 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 24 08:37:52 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: References: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> Message-ID: <16281.7442.783253.814142@montanaro.dyndns.org> Tim> squares = (f(x)**2 for x in inputs) # assuming reiterability here Tim> for f in math.sin, math.cos, math.tan: Tim> plot(squares) How much more expensive would this be than for f in math.sin, math.cos, math.tan: squares = (f(x)**2 for x in inputs) plot(squares) which would work without reiterability, right? The underlying generator function could still be created at compile-time and it (or its code object?) stored in the current function's constants. 'f' is simply an argument to it when the iterator is instantiated. Skip From skip at pobox.com Fri Oct 24 08:41:17 2003 From: skip at pobox.com (Skip Montanaro) Date: Fri Oct 24 08:41:27 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> Message-ID: <16281.7661.936250.901160@montanaro.dyndns.org> Tim> Likewise, the following should be illegal: Tim> def f(): Tim> x = 12 Tim> y = 1 Tim> def g(): Tim> global y in f Tim> y = 12 Tim> g() Tim> print locals() Tim> because the global statement occurs after a local binding of the Tim> name. You meant def f(): x = 12 y = 1 def g(): y = 12 global y in f g() print locals() right? 
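Nick's observation above — that `sum()` plus a generator expression covers most hand-written accumulation loops — can be sketched with illustrative values (`reduce` lives in `functools` in today's Python):

```python
from functools import reduce

values = [1.5, -2.0, 3.25, -0.5]

# The explicit accumulation loop...
loop_total = 0.0
for x in values:
    if x > 0:
        loop_total += x ** 2

# ...collapses to a generator expression fed to sum()...
gen_total = sum(x ** 2 for x in values if x > 0)

# ...which makes the reduce() spelling largely unnecessary.
reduce_total = reduce(lambda acc, x: acc + x ** 2,
                      (x for x in values if x > 0), 0.0)

print(loop_total == gen_total == reduce_total)  # True
```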
Skip From arigo at tunes.org Fri Oct 24 08:46:06 2003 From: arigo at tunes.org (Armin Rigo) Date: Fri Oct 24 08:49:57 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> Message-ID: <20031024124606.GB3853@vicky.ecs.soton.ac.uk> Hello Guido, On Wed, Oct 08, 2003 at 08:45:45PM -0700, Guido van Rossum wrote: > Py_INCREF(Py_True); > return Py_True; > > takes less time than > > return PyBool_FromLong(1); > > Maybe a pair of macros Py_return_True and Py_return_False would make > sense? Sorry if this was already suggested and hastily rejected, but why do we care at all about the reference counter of the few heavily-used immortal objects of CPython? I guess allowing their counter not to be carefully maintained ventures to the slippery slopes of bad code. Anyway, my two cents for a (very) slightly faster and shorter code would be to be allowed never to do Py_INCREF or Py_DECREF when we know that the object is Py_None, Py_False or Py_True. These three would have a dummy tp_dealloc that just resets the reference counter to some large value if it ever reaches zero. 
Armin From pedronis at bluewin.ch Fri Oct 24 09:53:57 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Fri Oct 24 09:51:40 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310232322.h9NNMiA03864@12-236-54-216.client.attbi.com> References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> Message-ID: <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch> At 16:22 23.10.2003 -0700, Guido van Rossum wrote: > > > def tee(iterable): > > > "Return two independent iterators from a single iterable" > > > data = {} > > > cnt = 0 > > > def gen(next): > > > global* cnt > > > dpop = data.pop > > > for i in count(): > > > if i == cnt: > > > item = data[i] = next() > > > cnt += 1 > > > else: > > > item = dpop(i) > > > yield item > > > next = iter(iterable).next > > > return (gen(next), gen(next)) > > > > > >which is IMO more readable. > > > > it's a subtle piece of code. I wouldn't mind a more structured syntax with > > both the outer function declaring that is ok for some inner function to > > rebind some of its locals, and the inner function declaring that a > local is > > coming from an outer scope: > > > > def tee(iterable): > > "Return two independent iterators from a single iterable" > > data = {} > > > > # cnt = 0 here would be ok > > > > share cnt = 0: # the assignment is opt, > > # inner functions in the suite can rebind cnt > > def gen(next): > > use cnt # OR outer cnt > > dpop = data.pop > > for i in count(): > > if i == cnt: > > item = data[i] = next() > > cnt += 1 > > else: > > item = dpop(i) > > yield item > > > > # cnt = 0 here would be ok > > > > next = iter(iterable).next > > return (gen(next), gen(next)) > > > > yes it's heavy and unpythonic, but it makes very clear that something > > special is going on with cnt. > >Might as well declare a class then. 
:-)

well, no, it's probably that I expect rebindable closed-over vars to
be introduced by some kind of structured construct instead of the
usual Python freeform. I think for this kind of situation I miss the
Lisp-y 'let'.

    def counter(startval):
        share cnt = startval:
            def inc(i):
                use cnt
                cnt += i
                return cnt
            def dec(i):
                use cnt
                cnt -= i
                return cnt
        return inc, dec

vs.

    def counter(startval):
        cnt = startval
        def inc(i):
            global cnt in counter
            cnt += i
            return cnt
        def dec(i):
            global cnt in counter
            cnt -= i
            return cnt
        return inc, dec

vs.

    def counter(startval):
        class Counter:
            def __init__(self, startval):
                self.cnt = startval
            def inc(self, i):
                self.cnt += i
                return self.cnt
            def dec(self, i):
                self.cnt -= i
                return self.cnt
        newcounter = Counter(startval)
        return newcounter.inc, newcounter.dec

vs.

    (defun counter (startval)
      (let ((cnt startval))
        (flet ((inc (i) (incf cnt i))
               (dec (i) (decf cnt i)))
          (values #'inc #'dec))))

From ncoghlan at iinet.net.au  Fri Oct 24 10:01:55 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Fri Oct 24 10:02:00 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
Message-ID: <3F9930D3.5060103@iinet.net.au>

Phillip J. Eby strung bits together to say:
> At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
>
>> This would allow the current delayed-evaluation semantics
>> to be kept as the default, while eliminating any need
>> for using the default-argument hack when you don't
>> want delayed evaluation.
>
> Does anybody actually have a use case for delayed evaluation? Why would
> you ever *want* it to be that way? (Apart from the BDFL's desire to
> have the behavior resemble function behavior.)
>
> And, if there's no use case for delayed evaluation, why make people jump
> through hoops to get the immediate binding?
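For comparison with Samuele's counter variants above: in today's Python the rebinding can already be avoided by closing over a mutable object instead of a bare name (a sketch, not one of the proposed syntaxes):

```python
def counter(startval):
    cell = [startval]  # shared mutable state; no closed-over name is rebound
    def inc(i):
        cell[0] += i
        return cell[0]
    def dec(i):
        cell[0] -= i
        return cell[0]
    return inc, dec

inc, dec = counter(10)
a, b = inc(5), dec(3)
print(a, b)  # 15 12
```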
The other thing to consider is that if generator expressions provide immediate evaluation, then anyone who wants delayed evaluation semantics still has the option of writing an actual generator function - at which point, it ceases to be an expression, and becomes a function. Which seems to fit with the way Python works at the moment: This displays '1': x = 0 y = x + 1 x = 1 print y This displays '2': x = 0 y = lambda: x + 1 x = 1 print y (I think someone already gave a similar example) Actually, the exact same no-argument-lambda trick used above would be enough to get you late binding of all of the elements in your generator expression. Being selective still requires writing a real generator function, though. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From python at rcn.com Fri Oct 24 10:06:11 2003 From: python at rcn.com (Raymond Hettinger) Date: Fri Oct 24 10:07:01 2003 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245 In-Reply-To: <20031024124606.GB3853@vicky.ecs.soton.ac.uk> Message-ID: <001001c39a37$fb221720$e841fea9@oemcomputer> [Guido van Rossum] > > Py_INCREF(Py_True); > > return Py_True; > > > > takes less time than > > > > return PyBool_FromLong(1); > > > > Maybe a pair of macros Py_return_True and Py_return_False would make > > sense? [Armin Rigo] > Sorry if this was already suggested and hastily rejected, but why do we > care > at all about the reference counter of the few heavily-used immortal > objects of > CPython? > > I guess allowing their counter not to be carefully maintained ventures to > the > slippery slopes of bad code. Anyway, my two cents for a (very) slightly > faster and shorter code would be to be allowed never to do Py_INCREF or > Py_DECREF when we know that the object is Py_None, Py_False or Py_True. 
> These three would have a dummy tp_dealloc that just resets the
> reference counter to some large value if it ever reaches zero.

Hmm, how about having the macros do the increments in debug builds but
skip them in production code. That would keep the quality controls in,
not break existing leak detection methods, and save the microseconds
for incrementing.

Raymond

From FBatista at uniFON.com.ar  Fri Oct 24 10:42:00 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Fri Oct 24 10:43:04 2003
Subject: [Python-Dev] prePEP: Money data type
Message-ID: 

Nick Coghlan wrote:

#- > I'm urged to have a Money data type, but I'll see if I can
#- get it through
#- > Decimal, improving/fixing/extending Decimal and saving
#- effort at the same
#- > time.
#-
#- And there is always the "class Money(Decimal):" option, as well.

Sure, I think that's the way to do it. But first I need to know what
problems Decimal has (if it isn't in the standard library, it surely
needs work).

Anyway, I'm burning my mind with the IBM specification of Decimal
Arithmetic and studying the class itself. Then I'll work out the test
cases, see what must be done, *do it* (if I can), and theeeeeeeeeeen
start thinking again about Money.

. Facundo

WARNING: The information contained in this message and any file
attached to it is for the exclusive use of the addressee and may
contain confidential or proprietary information, whose disclosure is
penalized by law. If you are not one of the named recipients, or the
person responsible for delivering this message to the named
recipients, you are not authorized to disclose, copy, distribute or
retain the information (or any part of it) contained in this message.
Please notify us by replying to the sender, delete the original
message and delete any copies (printed or stored on any magnetic
medium) that you may have made of it. All opinions contained in this
mail are the author's own and do not necessarily coincide with those
of Telefónica Comunicaciones Personales S.A. or any associated
company. Electronic messages may be altered, for which reason
Telefónica Comunicaciones Personales S.A. will not accept any
liability whatsoever resulting from this message. Thank you very much.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031024/ac5c896d/attachment.html

From gtalvola at nameconnector.com  Fri Oct 24 11:07:35 2003
From: gtalvola at nameconnector.com (Geoffrey Talvola)
Date: Fri Oct 24 11:07:42 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
Message-ID: <61957B071FF421419E567A28A45C7FE59AF738@mailbox.nameconnector.com>

Greg Ewing wrote:
> Guido:
>
>> Wouldn't this be even better?
>>
>>     "create index ${table}_lid1_idx on $table($lid1)" % params
>
> I wouldn't object to that. I'd have expected *you* to
> object to it, though, since it re-defines the meaning
> of "$" in an interpolated string. I was just trying
> to suggest something that would be backward-compatible.

$ is currently unused in Python AFAIK. So why not:

    "create index ${table}_lid1_idx on $table($lid1)" $ params

No backward compatibility problems at all.
- Geoff From ncoghlan at iinet.net.au Fri Oct 24 10:45:26 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Fri Oct 24 11:44:30 2003 Subject: [Python-Dev] accumulator display syntax In-Reply-To: <3F9930D3.5060103@iinet.net.au> References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com> <3F9930D3.5060103@iinet.net.au> Message-ID: <3F993B06.2080106@iinet.net.au> Nick Coghlan strung bits together to say: > This displays '2': > x = 0 > y = lambda: x + 1 > x = 1 > print y D'oh! That last line should be "print y()"! Regards, Nick. Still has to reinstall Python on new OS installation. . . -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From guido at python.org Fri Oct 24 11:45:17 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:46:09 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 09:02:15 BST." <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8802C0991D@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310241545.h9OFjHx05183@12-236-54-216.client.attbi.com> > This made me think. What should be the effect of > > def f(): > x = 12 > def g(): > global y in f > y = 12 > g() > print locals() > > I suspect the answer is "it's illegal". But by extension from the > current behaviour of "global", it should create a local variable in > f. I see no reason why it should be illegal; it should indeed create y in f. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 24 11:46:35 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:46:43 2003 Subject: [Python-Dev] closure semantics In-Reply-To: Your message of "Fri, 24 Oct 2003 18:11:01 +1000." 
<338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD5B@au3010avexu1.global.avaya.com> Message-ID: <200310241546.h9OFkZ905194@12-236-54-216.client.attbi.com> > Likewise, the following should be illegal: > > def f(): > x = 12 > y = 1 > def g(): > global y in f > y = 12 > g() > print locals() > > because the global statement occurs after a local binding of the name. Huh? The placement of a global statement is irrelevant -- it can occur anywhere in the scope. This should certainly work. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Fri Oct 24 11:50:48 2003 From: guido at python.org (Guido van Rossum) Date: Fri Oct 24 11:50:58 2003 Subject: [Python-Dev] cleanup order In-Reply-To: Your message of "Fri, 24 Oct 2003 11:21:16 +0200." References: Message-ID: <200310241550.h9OFonK05219@12-236-54-216.client.attbi.com> > Is the cleanup order at Python shutdown documented somewhere? Yes, in the source. :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From jmarshal at mathworks.com Fri Oct 24 11:56:21 2003 From: jmarshal at mathworks.com (Joshua Marshall) Date: Fri Oct 24 11:58:11 2003 Subject: [Python-Dev] closure semantics Message-ID: <7224B63940F10F40A48AC423597ADE57012DC7BB@MESSAGE-AH.ad.mathworks.com> [Timothy] > It would break any unadorned 'global x' in a nested scope if the > name did not exist anywhere. > > One option would be to have an "if the name doesn't exist, it it > created in module scope". [Greg Ewing] > What would be wrong with that? It's what I had in mind. I believe the <<- operator in the statistical language R behaves exactly like this. While not a compelling reason in itself to adopt this behavior, it's useful to consider constructs in other successful languages. 
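For reference, the current behavior under debate: `global` always targets the module scope, skipping any same-named binding in an intermediate function — the very behavior Greg proposes to redefine. A minimal sketch:

```python
x = "module"

def outer():
    x = "outer"          # intermediate binding with the same name
    def inner():
        global x         # today this always means the module-level x
        x = "rebound"
    inner()
    return x             # outer's own local is untouched

result = outer()
print(result, x)  # outer rebound
```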
From guido at python.org  Fri Oct 24 11:59:04 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 11:59:11 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245
In-Reply-To: Your message of "Fri, 24 Oct 2003 13:46:06 BST." <20031024124606.GB3853@vicky.ecs.soton.ac.uk>
References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <20031024124606.GB3853@vicky.ecs.soton.ac.uk>
Message-ID: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>

> > Py_INCREF(Py_True);
> > return Py_True;
> >
> > takes less time than
> >
> > return PyBool_FromLong(1);
> >
> > Maybe a pair of macros Py_return_True and Py_return_False would make
> > sense?
>
> Sorry if this was already suggested and hastily rejected, but why do
> we care at all about the reference counter of the few heavily-used
> immortal objects of CPython?

It was never discussed; I don't recall that it has ever occurred to me.

> I guess allowing their counter not to be carefully maintained
> ventures to the slippery slopes of bad code. Anyway, my two cents
> for a (very) slightly faster and shorter code would be to be allowed
> never to do Py_INCREF or Py_DECREF when we know that the object is
> Py_None, Py_False or Py_True. These three would have a dummy
> tp_dealloc that just resets the reference counter to some large
> value if it ever reaches zero.

I think there are debugging modes where this would upset some counters
that maintain a balance of the total number of references in the
world. I also don't think that the performance gain would be
measurable. Maybe the slight code size decrease would have some
benefits. I'm worried that there would be a negative effect in terms
of people copying the pattern for other objects.
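The reference counts under discussion can be inspected from Python itself; the exact numbers are implementation details that vary by version (recent CPython makes these objects truly immortal with a huge fixed count), so this sketch only checks that they are large:

```python
import sys

# None and True are referenced throughout the interpreter, so a stray
# INCREF/DECREF pair could never realistically push them to zero.
print(sys.getrefcount(None) > 100)  # True on CPython
print(sys.getrefcount(True) > 10)   # True on CPython
```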
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Oct 24 12:05:09 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:06:25 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 15:53:57 +0200." <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
Message-ID: <200310241605.h9OG59C05317@12-236-54-216.client.attbi.com>

> well, no, it's probably that I expect rebindable closed-over vars to
> be introduced by some kind of structured construct instead of the
> usual Python freeform.

Why does rebindability make a difference here?  Local vars are already
visible in inner scopes, and if they are mutable, they are already
being modified from inner scopes (just not rebound, but to most
programmers that's an annoying detail).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org Fri Oct 24 12:08:27 2003
From: arigo at tunes.org (Armin Rigo)
Date: Fri Oct 24 12:12:25 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com>
References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com>
Message-ID: <20031024160827.GA20721@vicky.ecs.soton.ac.uk>

Hello,

On Fri, Oct 17, 2003 at 07:46:31AM -0700, Guido van Rossum wrote:
> > If the pretty academic subject of graph isomorphisms is well-worn
> > enough to be sent to the trash, I'll submit a patch that just
> > removes all this code and instead use the existing
> > sys.recursionlimit counter to catch infinite recursions and throw
> > the usual RuntimeError.

Patch http://www.python.org/sf/825639.
Rationale
---------

Adding a list to itself is a nice first-time example of Python
features, but it is quite uncommon in practice.  It introduces a few
problems for the CPython interpreter, which must explicitly detect and
avoid infinite recursions not only to please the user (e.g. for a
nicer str() representation) but because infinite C-level recursions
crash it.  The naive definition of comparison between lists is
recursive, and thus suffers from this problem.

"Bisimulation" is one of the possible mathematically clean definitions
of what it means for two recursive structures to be equal; this is
what CPython currently implements.  However, I argue that this
behavior is unexpected (and undocumented), and it masks bugs in
erroneous user code: structures may be considered equal by error.
Triggering an explicit "infinite recursion" exception would have
clearly pointed out the problem.

The current implementation of equality is to return True if comparison
of two containers recursively reaches the same pair of containers.
This is arguably the same as if the following code:

    def f():
        return f()

returned None instead of looping indefinitely, because some dark magic
in CPython decides that returning None is a good idea (and returning
None is consistent: f() can return None if the nested f() call returns
None too.  Of course, returning anything else would be consistent too,
but then for the equality we decide to return True whereas returning
False would be consistent too, and would just make fewer structures be
considered equal).

Workarounds
-----------

Under the proposed patch, applications that relied on equality to
compare recursive structures will receive a RuntimeError: maximum
recursion depth exceeded in cmp.  This error does not come with a long
traceback, unlike the normal "maximum recursion depth exceeded" error,
unless user-defined (pure Python) comparison operators are involved in
the infinite loop.
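[A minimal illustration of the behavior change described above. This sketch assumes the patched behavior, which is what modern CPython versions exhibit, with the limit surfacing as RecursionError, a RuntimeError subclass:]

```python
# Two structurally identical but distinct cycles.
a = []; a.append(a)
b = []; b.append(b)

print(a == a)    # True: the identity shortcut needs no recursion

try:
    a == b       # bisimulation answered True here; without it, this recurses
except RecursionError:
    print("maximum recursion depth exceeded in comparison")
```

Comparing `a` with itself still succeeds because element comparison short-circuits on identity; only the cross-comparison of distinct cycles hits the recursion limit.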
It is easy to write the bisimulation algorithm in Python if one needs
it, but it is harder and quite unnatural to do the converse: work
around CPython's implementation of equality to turn off the
bisimulation behavior.  Three approaches can be taken to port
applications:

- structural equality can often be replaced by explicit structural
  tests.  This is what the patch does for *all* the tests in Lib/test
  that relied on recursive equality.  For example, if you want to
  check that an object is really a list that contains itself and
  nothing else, you can easily check that "isinstance(a, list) and
  len(a) == 1 and a[0] is a".  This is more precise than the
  now-deprecated variants "a==a[0]" or "b=[]; b.append(b); a==b"
  because the latter would also succeed if a is [c] and c is [a], for
  example.

- among the rare cases where we really want bisimulation, cyclic
  structures involving user-defined objects with a custom notion of
  equality are probably the most common case.  If so, then it is
  straightforward to add a cache to the __eq__ operator:

      def __eq__(self, other):
          if id(other) in self.assumedequal:
              return True
          try:
              self.assumedequal[id(other)] = True
              #...recursive comparisons...
          finally:
              del self.assumedequal[id(other)]

  This typically only needs to be done for one of the classes involved
  -- as long as all cycles you are interested in will involve an
  instance of this class.

- finally, to compare cyclic structures made only from built-in
  containers, an explicit "global" algorithm will do the trick.
Here is a non-recursive one for lists:

    def bisimilar_lists(a, b):
        def consider(a, b):
            key = id(a), id(b)
            if key not in bisim:
                bisim[key] = True
                pending.append((a, b))
        bisim = {}
        pending = []
        consider(a, b)
        for a, b in pending:          # a, b are the lists to compare
            if len(a) != len(b):      # different length
                return False
            for a1, b1 in zip(a, b):
                if type(a1) != type(b1):   # elements of different types
                    return False
                if isinstance(a1, list):
                    consider(a1, b1)       # add the two sub-lists to 'pending'
                elif a1 != b1:             # else compare non-lists directly
                    return False
        return True

This could easily be extended to provide support for dictionaries.
The complete equivalent of the current CPython implementation is
harder to achieve, but in the improbable case where the user really
needs it (as opposed to one of the above solutions), he could define
custom special methods, say __bisimilar__().  He would then extend the
above algorithm to call this method in preference to __eq__() when it
exists.  Alternatively, he could define a global dictionary mapping
types to bisimulation algorithms, with a registration mechanism for
new types.  (This is similar to copy.py and copy_reg.py.  It could be
added to the standard library.)

Patch info
----------

The proposed patch adds two functions to the C API:

int Py_EnterRecursiveCall(char *where)
    Marks a point where a recursive C-level call is about to be
    performed.  'where' should be a string " in xyz" to be
    concatenated to the RuntimeError message caused by the recursion
    depth limit.

void Py_LeaveRecursiveCall()
    Ends a Py_EnterRecursiveCall().  Must be called once for each
    *successful* invocation of Py_EnterRecursiveCall().
These functions are used to simplify the code of the following:

- eval_frame()
- PyObject_Compare()
- PyObject_RichCompare()
- instance_call()

The idea to make these two functions part of the public API is to have
a well-tested and PyOS_CheckStack()-issuing way to perform safe
recursive calls at the C level, both in the core and in extension
modules.  For example, cPickle.c has its own notion of recursion depth
limit, but it does not check the OS stack; instead, it should probably
use Py_EnterRecursiveCall() as well (which I did not do yet).

Note that Py_EnterRecursiveCall() does the same checks as eval_frame()
used to do, whereas Py_LeaveRecursiveCall() is actually a
single-instruction macro.

There is a performance degradation for the comparison of large
non-cyclic lists, which I measure to be about 6-7% slower with the
patch.  Possibly, extra work could be done to tune
Py_EnterRecursiveCall().

Another problem that Py_EnterRecursiveCall() could be enhanced to also
address is that a long, non-recursive comparison cannot currently be
interrupted by Ctrl-C.  For example:

>>> a = [5] * 1000
>>> b = [a] * 1000
>>> c = [b] * 1000
>>> c == c

-=- Armin

From guido at python.org Fri Oct 24 12:12:30 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:12:40 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: Your message of "Fri, 24 Oct 2003 07:37:38 CDT." <16281.7442.783253.814142@montanaro.dyndns.org>
References: <200310230548.h9N5mHs01795@12-236-54-216.client.attbi.com> <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: <200310241612.h9OGCUw05359@12-236-54-216.client.attbi.com>

> The underlying generator function could still be created at
> compile-time and it (or its code object?) stored in the current
> function's constants.

No, the code object would be stored in the constants; the function
object would be created each time around the loop.

Good thing it came from an example that Tim himself didn't like.
:-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Fri Oct 24 12:31:06 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 12:31:13 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: Your message of "Fri, 24 Oct 2003 17:08:27 BST." <20031024160827.GA20721@vicky.ecs.soton.ac.uk>
References: <20031017125429.GA25854@vicky.ecs.soton.ac.uk> <200310171446.h9HEkVs06278@12-236-54-216.client.attbi.com> <20031024160827.GA20721@vicky.ecs.soton.ac.uk>
Message-ID: <200310241631.h9OGV6W05482@12-236-54-216.client.attbi.com>

You've convinced me.  It should be noted in the NEWS file that it may
break some apps; I'm sure there are a bunch of clever folks out there
who liked the bisimulation approach enough to depend on it :-).

Anyone else not in favor, please speak up over the weekend so Armin
can check it in on Monday.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at bluewin.ch Fri Oct 24 12:46:42 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Fri Oct 24 12:44:17 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <200310241605.h9OG59C05317@12-236-54-216.client.attbi.com>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch>
Message-ID: <5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>

At 09:05 24.10.2003 -0700, Guido van Rossum wrote:
> > well, no, it's probably that I expect rebindable closed-over vars to
> > be introduced by some kind of structured construct instead of the
> > usual Python freeform.
>
>Why does rebindability make a difference here?  Local vars are already
>visible in inner scopes, and if they are mutable, they are already
>being modified from inner scopes (just not rebound, but to most
>programmers that's an annoying detail).
most Python programmers or most Python programmers using closures?

Well, it's a gut feeling, let's try to articulate it.  Because

a) parametrizing a closure with some read-only variable
b) possibly shared mutable state with indefinite extent

are very different things.  I think that people should resort to b)
instead of using classes sparingly and make it clear when they do so.

b) can feel like global variables with their problems, I think that's
why I would prefer a syntax that still points out: this is some state
and these are functions to manipulate it.  Classes are fine for that,
and knowing that it is common style/idiom in Lisp variants this is
also fine there:

(let ... introduces vars
  ... function defs)

I think it is also about expectations when reading some code.  Right
now, reading Python code I expect at most to encounter a), although b)
can be obtained using mutable objects, but also in that case IMHO an
explicit uniform idiom would be preferable, like some Ref object
inspired by ML references.

I can live with all solutions, although I'm still unconvinced apart
from the Scheme textbook argument (which was serious) that this
addition is really necessary.

regards.

From tjreedy at udel.edu Fri Oct 24 13:07:39 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 13:07:44 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com>
Message-ID:

"Delaney, Timothy C (Timothy)" wrote in message
> I think these two points [consistency and teachability]
>should weigh heavily in any decision.

Agree also

>I think the need to rename the target scope is of lesser importance.

If you mean the need to sync the inner global-in statement with an
outer function name change, that is less onerous than doing the same
for variable name changes (which might require changes to several
lines in the inner function).  Function name mismatches would, I
presume, be caught as compile-time syntax errors.
But what about name mismatches?  Global statements allow functions to
create 'new' variables in the module scope and not just 'existing'
ones.  What about for in-between scopes?

#start of fresh interpreter session
def f():
    global xf
    xf = 1
    def g():
        global xg
        xg = 2
        global xgf in f
        xgf = 3

does this compile and run?  or choke on third global at compile time?
or choke on third assignment at runtime?

Terry J. Reedy

From tjreedy at udel.edu Fri Oct 24 13:16:41 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 13:16:47 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AC32@au3010avexu1.global.avaya.com>
Message-ID:

"Terry Reedy" wrote in message news:bnbm8s$p6h$1@sea.gmane.org...
>
> But what about name mismatches?  Global statements allow functions to
> create 'new' variables in the module scope and not just 'existing'
> ones.  What about for in-between scopes?
>
> #start of fresh interpreter session
> def f():
>     global xf
>     xf = 1
>     def g():
>         global xg
>         xg = 2
>         global xgf in f
>         xgf = 3
>
> does this compile and run?  or choke on third global at compile time?
> or choke on third assignment at runtime?

Whoops.  Paul Moore asked the same question and Guido answered compile
and run.

TJR

From python at rcn.com Fri Oct 24 13:19:55 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 13:20:46 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245
In-Reply-To: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
Message-ID: <004e01c39a53$0b5838c0$e841fea9@oemcomputer>

> I also don't think that the performance gain would be measurable.

The more I think about it, the more I'm sure that it cumulatively will
never save as much time as it took to write this email.
Raymond

From guido at python.org Fri Oct 24 13:34:04 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 13:34:17 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objectstypeobject.c, 2.244, 2.245
In-Reply-To: Your message of "Fri, 24 Oct 2003 13:19:55 EDT." <004e01c39a53$0b5838c0$e841fea9@oemcomputer>
References: <004e01c39a53$0b5838c0$e841fea9@oemcomputer>
Message-ID: <200310241734.h9OHY4L05621@12-236-54-216.client.attbi.com>

> > I also don't think that the performance gain would be measurable.
>
> The more I think about it, the more I'm sure that it cumulatively will
> never save as much time as it took to write this email.

So stop thinking about it already! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org Fri Oct 24 13:31:02 2003
From: arigo at tunes.org (Armin Rigo)
Date: Fri Oct 24 13:34:54 2003
Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Objects typeobject.c, 2.244, 2.245
In-Reply-To: <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
References: <000001c38e10$77e31ea0$e841fea9@oemcomputer> <200310090345.h993jkF00618@12-236-54-216.client.attbi.com> <20031024124606.GB3853@vicky.ecs.soton.ac.uk> <200310241559.h9OFx4M05253@12-236-54-216.client.attbi.com>
Message-ID: <20031024173102.GA29094@vicky.ecs.soton.ac.uk>

Hello Guido,

On Fri, Oct 24, 2003 at 08:59:04AM -0700, Guido van Rossum wrote:
> > Sorry if this was already suggested and hastily rejected, but why do
> > we care at all about the reference counter of the few heavily-used
> > immortal objects of CPython?
>
> It was never discussed; I don't recall that it has ever occurred to
> me.

Just tried, and indeed I can't measure a difference.
Armin

From python at rcn.com Fri Oct 24 14:12:57 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 14:13:49 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <2mu15ynaed.fsf@starship.python.net>
Message-ID: <005401c39a5a$73aa9880$e841fea9@oemcomputer>

> > Was there a reason for leaving this out of the API or should it be
> > added?  Is the right way to simulate a pop something like this:
>
> Well, there's always PyEval_CallMethod...

I ended up using:

    PyObject_CallMethod(to->outbasket, "pop", NULL);

The bummer is that this call is effectively used in a loop and runs
once for every data element in an iterable.  Something like pop() has
such a tiny granularity that its runtime is overwhelmed by the lookup
time to call it this way.  For this reason, I think PyList_Pop()
warrants inclusion in the API much more than low granularity methods
like PyList_Reverse() or PyList_Sort().

Raymond Hettinger

From guido at python.org Fri Oct 24 14:24:52 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 14:26:47 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: Your message of "Fri, 24 Oct 2003 14:12:57 EDT." <005401c39a5a$73aa9880$e841fea9@oemcomputer>
References: <005401c39a5a$73aa9880$e841fea9@oemcomputer>
Message-ID: <200310241824.h9OIOr105765@12-236-54-216.client.attbi.com>

> I ended up using:
>
>     PyObject_CallMethod(to->outbasket, "pop", NULL);
>
> The bummer is that this call is effectively used in a loop and runs once
> for every data element in an iterable.  Something like pop() has such a
> tiny granularity that its runtime is overwhelmed by the lookup time to
> call it this way.  For this reason, I think PyList_Pop() warrants
> inclusion in the API much more than low granularity methods like
> PyList_Reverse() or PyList_Sort().
But it's easy to simulate a pop, writing the C equivalent of

    x = lst[len(lst)-1]
    del lst[len(lst)-1 : len(lst)]

IOW:

PyObject *
listpop(PyObject *lst)
{
    PyObject *x;
    int n;

    n = PyList_GET_SIZE(lst);
    if (n == 0)
        return NULL;
    x = PyList_GET_ITEM(lst, n-1);
    Py_INCREF(x);
    PyList_SetSlice(lst, n-1, n, NULL);
    return x;
}

I see no need to add this to the public API just yet (it would have to
be more flexible to allow lst.pop(n), do more arg checks, etc.).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From oren-py-d at hishome.net Fri Oct 24 14:26:29 2003
From: oren-py-d at hishome.net (Oren Tirosh)
Date: Fri Oct 24 14:27:08 2003
Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
Message-ID: <20031024182629.GA34310@hishome.net>

On Wed, Oct 22, 2003 at 09:11:37AM -0700, Alex Martelli wrote:
...
> Alternatively, assigning to an attribute of some particular
> object still feels a better approach to me -- no new kwd,
> no stretching of bad old ones, actually a chance to let bad
> old 'global' fade slowly into the sunset.  If there's any
> chance to salvage THAT approach -- if it only needs a good
> neat syntax to get that "particular object" -- I'll be glad
> to participate in brainstorming to find it.

How about using the word 'global' to get the current module object?
A precedent for this is None which is on its way to becoming a keyword
to get that "particular object".

A bit of parser magic would be required so global can still work
as a declaration for compatibility.

>>> global is sys.modules[__name__]
True
>>> global.__dict__ is globals()
True

Oren

From guido at python.org Fri Oct 24 14:30:49 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 14:31:01 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 18:46:42 +0200."
<5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com> <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch> <5.2.1.1.0.20031024005809.027db7c0@pop.bluewin.ch> <5.2.1.1.0.20031024153151.02834768@pop.bluewin.ch> <5.2.1.1.0.20031024181208.027a6958@pop.bluewin.ch>
Message-ID: <200310241830.h9OIUnq05803@12-236-54-216.client.attbi.com>

[Samuele]
> > > well, no, it's probably that I expect rebindable closed-over vars to
> > > be introduced by some kind of structured construct instead of the
> > > usual Python freeform.

[Guido]
> >Why does rebindability make a difference here?  Local vars are already
> >visible in inner scopes, and if they are mutable, they are already
> >being modified from inner scopes (just not rebound, but to most
> >programmers that's an annoying detail).

[Samuele]
> most Python programmers or most Python programmers using closures?

I meant both categories.

> Well, it's a gut feeling, let's try to articulate it.  Because
>
> a) parametrizing a closure with some read-only variable
> b) possibly shared mutable state with indefinite extent
>
> are very different things.  I think that people should resort to b) instead
> of using classes sparingly and make it clear when they do so.

Raymond's tree() example is an unfortunate one in this category.
(Unfortunately because it is obfuscated code for speed reasons and
because it appears in an examples section of official docs.)

> b) can feel like global variables with their problems, I think that's why I
> would prefer a syntax that still points out: this is some state and these are
> functions to manipulate it.  Classes are fine for that, and knowing that it
> is common style/idiom in Lisp variants this is also fine there:
>
> (let ... introduces vars
>   ... function defs)
>
> I think it is also about expectations when reading some code.
> Right now,
> reading Python code I expect at most to encounter a), although b) can be
> obtained using mutable objects, but also in that case IMHO an explicit
> uniform idiom would be preferable, like some Ref object inspired by ML
> references.
>
> I can live with all solutions, although I'm still unconvinced apart from the
> Scheme textbook argument (which was serious) that this addition is really
> necessary.

I don't think the Scheme textbook argument should weigh much, since
that's such a small audience.

My original approach has been to discourage (b) by not allowing
rebinding.  Maybe this should stay the way it is.

But the use of 'global x in f' might be enough to tip the reader off
-- not quite at the start of f, when x is defined, but at least at the
start of the inner function that declares x global in f.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From python at rcn.com Fri Oct 24 14:31:37 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 14:32:28 2003
Subject: [Python-Dev] Trashing recursive objects comparison?
In-Reply-To: <200310241631.h9OGV6W05482@12-236-54-216.client.attbi.com>
Message-ID: <005d01c39a5d$0f970f60$e841fea9@oemcomputer>

[Guido]
> You've convinced me.  It should be noted in the NEWS file that it may
> break some apps; I'm sure there are a bunch of clever folks out there
> who liked the bisimulation approach enough to depend on it :-).
>
> Anyone else not in favor, please speak up over the weekend so Armin
> can check it in on Monday.

Armin is working on speeding up the patch.  I recommend holding off
until we can measure the performance impact of a revised patch.  If it
only affects cyclic structures, it's no big deal.  But if it impacts
normal equality tests, that warrants a little more discussion.

Another thought is that it would be prudent to see how much breakage
can be expected.  For example, perhaps the patch can be tried on an
older python to see if Zope can deal with it.
Otherwise, the patch is elegant and simplifies the code quite a bit.

Also, Armin's well-written proposal ought to be preserved somewhere
(like Tim's listsort.txt file).

Raymond Hettinger

From oren-py-d at hishome.net Fri Oct 24 14:48:50 2003
From: oren-py-d at hishome.net (Oren Tirosh)
Date: Fri Oct 24 14:48:55 2003
Subject: [Python-Dev] Can we please have a better dict interpolation syntax?
In-Reply-To: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>
References: <200310230136.h9N1afs19446@oma.cosc.canterbury.ac.nz>
Message-ID: <20031024184850.GB34310@hishome.net>

On Thu, Oct 23, 2003 at 02:36:41PM +1300, Greg Ewing wrote:
> I have just had the experience of writing a bunch
> of expressions of the form
>
>   "create index %(table)s_lid1_idx on %(table)s(%(lid1)s)" % params
>
> and found myself getting quite confused by all the parentheses
> and "s" suffixes.  I would *really* like to be able to write
> this as
>
>   "create index %{table}_lid1_idx on %{table}(%{lid1})" % params
>
> which I find to be much easier on the eyes.

A while ago I proposed the following syntax for embedded expressions
in strings, parsed at compile-time:

  "create index \{table}_lid1_idx on \{table}(\{lid1})"

And the equivalent runtime parsed version:

  r"create index \{table}_lid1_idx on \{table}(\{lid1})".cook(params)

testing-the-water-to-see-if-it's-PEP-time-ly yours,

   Oren

From python at rcn.com Fri Oct 24 15:01:09 2003
From: python at rcn.com (Raymond Hettinger)
Date: Fri Oct 24 15:02:05 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <200310241824.h9OIOr105765@12-236-54-216.client.attbi.com>
Message-ID: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>

> -----Original Message-----

[Raymond]
> > The bummer is that this call is effectively used in a loop and runs once
> > for every data element in an iterable.  Something like pop() has such a
> > tiny granularity that its runtime is overwhelmed by the lookup time to
> > call it this way.
> > For this reason, I think PyList_Pop() warrants
> > inclusion in the API much more than low granularity methods like
> > PyList_Reverse() or PyList_Sort().

> But it's easy to simulate a pop, writing the C equivalent of
>
>     x = lst[len(lst)-1]
>     del lst[len(lst)-1 : len(lst)]
. . .
>     PyList_SetSlice(lst, n-1, n, NULL);

There's the new piece of information.  I didn't know that the final
argument could be NULL, and creating/destroying an empty list for the
arg was unpleasant.  I'll add that info to the API docs.

> I see no need to add this to the public API just yet (it would have to
> be more flexible to allow lst.pop(n), do more arg checks, etc.).

Yes.  See if more requestors come along.  Surely, I was not the first
to want to use PyLists as append/pop stacks for PyObjects.

Thanks,

Raymond

From guido at python.org Fri Oct 24 15:08:59 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 15:09:22 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: Your message of "Fri, 24 Oct 2003 15:01:09 EDT." <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
References: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
Message-ID: <200310241908.h9OJ8xO05942@12-236-54-216.client.attbi.com>

> Surely, I was not the first to want to use PyLists as append/pop stacks
> for PyObjects.

Yes, but most of them write Python code, not C code. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Fri Oct 24 15:18:49 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 15:18:56 2003
Subject: [Python-Dev] let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: <20031024182629.GA34310@hishome.net>
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <20031024182629.GA34310@hishome.net>
Message-ID: <200310242118.49335.aleaxit@yahoo.com>

On Friday 24 October 2003 08:26 pm, Oren Tirosh wrote:
...
> How about using the word 'global' to get the current module object?
> A precedent for this is None which is on its way to becoming a keyword
> to get that "particular object".
>
> A bit of parser magic would be required so global can still work
> as a declaration for compatibility.

Unfortunately I think Guido clarified in a previous post the amount of
parser magic needed for that is excessive -- no lookahead allowed.

If we managed to tweak the parser, we'd still have the issue of
keyword inappropriateness -- and further stretching if we also want to
use it to allow rebinding of outer-scope variables that AREN'T
"global" in any sense whatsoever.

So I'd much rather have 'scope' and no issue with parser magic...

Alex

From aahz at pythoncraft.com Fri Oct 24 16:00:57 2003
From: aahz at pythoncraft.com (Aahz)
Date: Fri Oct 24 16:01:01 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
Message-ID: <20031024200057.GA19184@panix.com>

On Wed, Oct 22, 2003, Guido van Rossum wrote:
>Raymond Hettinger:
>>
>> Did the discussion of a sort() expression get resolved?
>>
>> The last I remember was that the list.sorted() classmethod had won the
>> most support because it accepted the broadest range of inputs.
>>
>> I could live with that though I still prefer the more limited
>> (list-only) copysort() method.
>
> list.sorted() has won, but we are waiting for feedback from the
> person who didn't like having both sort() and sorted() as methods, to
> see if his objection still holds when one is a method and the other a
> factory function.

Actually, I was another person who expressed dislike for "sorted()"
causing confusion, but previous calls for feedback were restricted to
those who felt comfortable expressing opinions for non-English
speakers.
I'm still -1 on sorted() compared to copysort(), but because it's a
different context, I'm no longer actively opposed (which is why I
didn't bother speaking up earlier).  I still think that a purely
grammatical change in spelling is not appropriate to indicate meaning,
particularly when it's still the same part of speech (both verbs).  To
my mind, sorted() implies a Boolean predicate.
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From aleaxit at yahoo.com Fri Oct 24 16:17:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 16:17:48 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <87brs7egju.fsf@egil.codesourcery.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com>
Message-ID: <200310242217.42857.aleaxit@yahoo.com>

On Friday 24 October 2003 12:27 am, Zack Weinberg wrote:
...
> Frankly, I wish Python required one to write explicit declarations for
> all variables in the program:
>
> var x, y, z    # module scope
>
> class bar:
>     classvar I, J, K    # class variables

Seems like a great way to get uninitialized variables to me.  Might as
well mandate initialization, getting a hard-to-read

    classvar I=2.3, J=(2,3), K=23

or to force more readability one might say only one name per classvar
statement

    classvar I=2.3
    classvar J=(2,3)
    classvar K=23

But then what added value is that 'classvar' boilerplate dirtying
things up?  Might as well take it off and get

    I = 2.3
    J = (2, 3)
    K = 23

which is just what we have now.
> It's extra bondage and discipline, yeah, but it's that much more help
> comprehending the program six months later, and it also gets rid of

There is absolutely no help (not one minute later, not six months
later) "comprehending" the program just because some silly language
mandates redundancy, such as a noiseword 'classvar' in front of the
assignments.

> the "how was this variable name supposed to be spelled again?"
> question.

I disagree that the 'classvar' boilerplate would provide any help with
that question.  Just put the initializing assignment there and it's
only clearer for NOT being obscured by that 'classvar' thingy.
Document with docstrings or comments, not by changing the language.

A language which, I suspect, MIGHT let you do exactly what you want,
is Ruby.  I don't know for sure that you can tweak Ruby into giving
(at least) warnings for assignment to symbols outside of a certain
set, but I suspect you might; you _can_ change the language's
semantics pretty deeply.  Yet in most other ways it's close enough to
Python that the two are almost equivalent.

I do believe (and hope!) you stand very little chance of ever getting
into Python something as alien to its tradition and principles as
variable declarations, so, if they're important to you, considering
ruby might be a more productive option for you.

Alex

From aahz at pythoncraft.com Fri Oct 24 16:20:51 2003
From: aahz at pythoncraft.com (Aahz)
Date: Fri Oct 24 16:20:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
Message-ID: <20031024202051.GB19184@panix.com>

On Wed, Oct 22, 2003, Guido van Rossum wrote:
>
> I think that for reductions the gains are less clear.
> The initializer for the result variable and the call that updates it
> are no longer boilerplate, because they vary for each use; plus the
> name of the result variable should be chosen carefully because it
> indicates what kind of result it is (e.g. a sum or product). So,
> leaving out the condition for now, the pattern or idiom is:
>
>     <result> = <initializer>
>     for <variable> in <iterable>:
>         <result> = <expression>
>
> (Where <expression> uses <variable> and <result>.)

Actually, even that doesn't quite capture the expressiveness needed,
because <expression> needs in some cases to be a sequence of statements,
and there needs to be an opportunity for a finalizer to run after the
for loop (e.g. average()).
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From zack at codesourcery.com Fri Oct 24 16:39:39 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Fri Oct 24 16:39:46 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310242217.42857.aleaxit@yahoo.com> (Alex Martelli's message of "Fri, 24 Oct 2003 22:17:42 +0200")
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com>
Message-ID: <87llrabcac.fsf@egil.codesourcery.com>

Alex Martelli writes:

> On Friday 24 October 2003 12:27 am, Zack Weinberg wrote:
> ...
>> Frankly, I wish Python required one to write explicit declarations for
>> all variables in the program:
>>
>>         var x, y, z    # module scope
>>
>>         class bar:
>>            classvar I, J, K    # class variables
>
> Seems like a great way to get uninitialized variables to me.

No, they get a magic cookie value that triggers an exception on use.
Which, incidentally, disambiguates the present UnboundLocalError -
is that a typo, or is that failure to initialize the variable on this
code path? Consider, eg.

    def foo(x):
        s = 2
        if x:
            a = 1
        return a
...
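For what it's worth, the ambiguity Zack describes is easy to reproduce with his foo() as the language stands today (a quick sketch; the failing call and the `outcome` bookkeeping are added here purely for illustration):

```python
def foo(x):
    s = 2
    if x:
        a = 1
    return a            # 'a' is compiled as a local name

assert foo(True) == 1   # the branch binds 'a', so this works

# When the branch is skipped, 'a' is local but never bound,
# and the lookup fails at runtime rather than at compile time:
try:
    foo(False)
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"
assert outcome == "UnboundLocalError"
```

The traceback cannot tell you whether the author mistyped the name or simply missed a code path, which is exactly the ambiguity being discussed.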
> But then what added value is that 'classvar' boilerplate dirtying
> things up? Might as well take it off and get
>
>     I = 2.3
>     J = (2, 3)
>     K = 23
>
> which is just what we have now.
...
> There is absolutely no help (not one minute later, not six months later)
> "comprehending" the program just because some silly language mandates
> redundancy, such as a noiseword 'classvar' in front of the assignments.

Understand that I do almost all my programming in typed languages, where
that keyword isn't noise, it's a critical part of the declaration. I see
where you're coming from with regard to noisewords. There are plausible
alternatives, although they're all more complicated to implement and
explain, compared to

    var a, b = 2, c = foo()  # a throws UninitializedLocalError if used
                             # before set
    ...
    d                        # throws UnboundLocalError
    e = 1                    # ALSO throws UnboundLocalError

But in this domain, I am mostly content with the language as is. I
think there really *is* a language deficiency with regard to declaring
class versus instance variables.

    class foo:
        A = 1  # these are class variables
        B = 2
        C = 3

        def __init__(self):
            self.a = 4  # these are instance variables
            self.b = 5
            self.c = 6

I find this imperative syntax for declaring instance variables
profoundly unintuitive. Further, on my first exposure to Python, I
thought A, B, C were instance variables, although it wasn't hard to
understand why they aren't.

People like to rag on the popularity of __slots__ (for reasons which
are never clearly spelled out, but never mind) -- has anyone considered
that it's popular because it's a way of declaring the set of instance
variables, and there is no other way in the language?
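As a side note, the class foo snippet above can be probed directly; this sketch (reusing names from the example) shows why A, B, C read like instance variables even though they live on the class:

```python
class foo:
    A = 1               # lives in the class namespace

    def __init__(self):
        self.a = 4      # lives in each instance's namespace

x, y = foo(), foo()
assert x.A == 1         # a class attribute is reachable through instances
x.A = 99                # but assigning through an instance...
assert foo.A == 1       # ...leaves the class attribute untouched,
assert y.A == 1         # so other instances still see the old value,
assert x.A == 99        # and x now has a shadowing instance attribute
assert x.a == 4 and 'a' not in foo.__dict__
```

Lookup falls back from the instance namespace to the class namespace, which is what makes the two kinds of "variable" easy to confuse on first exposure.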
zw From aleaxit at yahoo.com Fri Oct 24 16:48:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 24 16:48:39 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <16280.5396.284178.989033@montanaro.dyndns.org> <200310232206.h9NM6wb03700@12-236-54-216.client.attbi.com> Message-ID: <200310242248.32981.aleaxit@yahoo.com> On Friday 24 October 2003 12:06 am, Guido van Rossum wrote: > [Skip] > > > Given that the global keyword or something like it is here to stay > > (being preferable over some attribute-style access) > > (Actually I expect more pushback from Alex once he's back from his > trip. He seems to feel strongly about this. :-) I do: I dislike "declarative statements" and I also dislike "global" as a spelling for anything that isn't actually global. But after a 3-day Bologna->Munich->Gothenburg->Stockholm->Amsterdam->Bologna whirl I'm just too bushed -- and have too many hundreds of msgs to go through (backwards as usual) -- to be very effective;-). With luck, I may be able to do better in the weekend...:-). > That was my first suggestion earlier this week. The main downside > (except from propagating 'global' :-) is that if you rename the > function defining the scope you have to fix all global statements > referring to it. I seem to have seen many others say that the "renaming the function" downside is not a serious problem, and I concur with them; you're just as likely to rename e.g. the variable (where you have to hunt down and change every assignment and access as well as the "declarative stmt", AND get no compiler support for errors) as the function (where you only need to fix the "declarative stmts" AND the compiler will tell you if you miss some) Alex From bac at OCF.Berkeley.EDU Fri Oct 24 16:59:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Oct 24 17:00:48 2003
Subject: [Python-Dev] cleanup order
In-Reply-To: References:
Message-ID: <3F9992CD.8030201@ocf.berkeley.edu>

Thomas Heller wrote:

> Is the cleanup order at Python shutdown documented somewhere?
>
> The only thing I found is the (old) essay
> http://www.python.org/doc/essays/cleanup.html

The summarized history of python-dev to the rescue (thanks to Google's
restricted domain searching and "python-dev Summary" as a keyword). =)

http://www.python.org/dev/summary/2003-04-01_2003-04-15.html
http://www.python.org/dev/summary/2003-09-16_2003-09-30.html

Just search in these docs for "shutdown" and "cleanup". Most of it is
over threads not being terminated before shutdown begins, but the basic
order and such is discussed and spelled out. And the April one has it
told as if it was being explained to a grade school class (one of my
more creative, quirky summaries if I do say so myself).

-Brett

From aleaxit at yahoo.com Fri Oct 24 17:01:16 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Fri Oct 24 17:01:22 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com>
Message-ID: <200310242301.16445.aleaxit@yahoo.com>

On Friday 24 October 2003 12:08 am, Guido van Rossum wrote:
> > However, as long as we're talking about this stuff, I wish I could
> > write "global foo" at module scope and have that mean "this variable
> > is to be treated as global in all functions in this module".
>
> This is similar to Greg Ewing's proposal to have 'rebindable x' at
> an outer function scope. My problem with it remains:
>
> It gives outer scopes (some) control over inner scopes.
One of the > guidelines is that a name defined in an inner scope should always > shadow the same name in an outer scope, to allow evolution of the > outer scope without affecting local details of inner scope. (IOW if > an inner function defines a local variable 'x', the outer scope > shouldn't be able to change that.) I must be missing something, because I don't understand the value of that guideline. I see outer and inner functions as tightly coupled anyway; it's not as if they could be developed independently -- not even lexically, surely not semantically. I do prefer to have the reminder "this is _assigning_ a NON-local variable" _closer_ to the assignment -- and I DO think it would be great if such rebinding HAD to be an assignment, not some kind of "side effect" from statements such as def, class, for, btw. (Incidentally, we'd get the latter for free if the nonlocal was "an attribute of some object" -- outer.x = 23 YES, "def outer.x():..." NO. But i'd still feel safer, even with a deuced 'declarative statement', if it could somehow be allowed to rebind nonlocals ONLY with an explicit assignment). So, anyway, the closer to the assignment the reminder, the better, so if it has to be a "declarative statement" I'd rather have it in the inner function than in the outer one. But for reasons very different from that guideline which I don't grasp... (probably just sleepiness and tiredness on my part...). Alex From pje at telecommunity.com Fri Oct 24 17:10:03 2003 From: pje at telecommunity.com (Phillip J. 
Eby)
Date: Fri Oct 24 17:12:05 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <87llrabcac.fsf@egil.codesourcery.com>
References: <200310242217.42857.aleaxit@yahoo.com> <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com>
Message-ID: <5.1.1.6.0.20031024170245.03260160@telecommunity.com>

At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
>class foo:
>    A = 1  # these are class variables
>    B = 2
>    C = 3
>
>    def __init__(self):
>        self.a = 4  # these are instance variables
>        self.b = 5
>        self.c = 6
>
>I find this imperative syntax for declaring instance variables
>profoundly unintuitive. Further, on my first exposure to Python, I
>thought A, B, C were instance variables, although it wasn't hard to
>understand why they aren't.

A, B, and C *are* instance variables. Why do you think they aren't?

>People like to rag on the popularity of __slots__ (for reasons which
>are never clearly spelled out, but never mind) -- has anyone
>considered that it's popular because it's a way of declaring the set
>of instance variables,

What good does declaring the set of instance variables *do*? This seems
to be more of a mental comfort thing than anything else. I've spent
most of my career in declaration-free languages, though, so I really
don't understand why people get so emotional about being able to
declare their variables.

> and there is no other way in the language?

Actually, there are a great many ways to implement such a thing.
One way might be something like:

    class RestrictedVars:
        vars = ()
        def __setattr__(self, attr, value):
            if attr not in self.vars:
                raise AttributeError("No such attribute", attr)
            self.__dict__[attr] = value

    class SomeClass(RestrictedVars):
        vars = 'a', 'b', 'c'

From zack at codesourcery.com Fri Oct 24 17:16:27 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Fri Oct 24 17:16:33 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <5.1.1.6.0.20031024170245.03260160@telecommunity.com> (Phillip J. Eby's message of "Fri, 24 Oct 2003 17:10:03 -0400")
References: <200310242217.42857.aleaxit@yahoo.com> <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <87ad7qbal0.fsf@egil.codesourcery.com>

"Phillip J. Eby" writes:

> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
>>class foo:
>>    A = 1  # these are class variables
>>    B = 2
>>    C = 3
>>
>>    def __init__(self):
>>        self.a = 4  # these are instance variables
>>        self.b = 5
>>        self.c = 6
>>
>>I find this imperative syntax for declaring instance variables
>>profoundly unintuitive. Further, on my first exposure to Python, I
>>thought A, B, C were instance variables, although it wasn't hard to
>>understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

You prove my point! I got it wrong! This is a confusing part of the
language!

> What good does declaring the set of instance variables *do*? This
> seems to be more of a mental comfort thing than anything else. I've
> spent most of my career in declaration-free languages, though, so I
> really don't understand why people get so emotional about being able
> to declare their variables.

Yeah, it's a mental comfort thing. Mental comfort is important.
Having the computer catch your fallible human mistakes is also
important.
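For comparison with the RestrictedVars sketch, a new-style class gets the same mistake-catching declaratively via __slots__ (Point is an invented example, not from the thread):

```python
class Point(object):
    __slots__ = ('x', 'y')    # the declared set of instance variables

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(2, 3)
assert (p.x, p.y) == (2, 3)

try:
    p.z = 5                   # mistyped attribute: caught at assignment
    caught = False
except AttributeError:
    caught = True
assert caught
```

Since slotted instances have no __dict__, any attribute outside the declared set fails at the point of assignment, which is precisely the "computer catches your mistakes" property under discussion.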
zw

From guido at python.org Fri Oct 24 17:32:22 2003
From: guido at python.org (Guido van Rossum)
Date: Fri Oct 24 17:33:09 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: Your message of "Fri, 24 Oct 2003 23:01:16 +0200." <200310242301.16445.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <87k76vehup.fsf@egil.codesourcery.com> <200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com> <200310242301.16445.aleaxit@yahoo.com>
Message-ID: <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>

[Guido]
> > It gives outer scopes (some) control over inner scopes. One of the
> > guidelines is that a name defined in an inner scope should always
> > shadow the same name in an outer scope, to allow evolution of the
> > outer scope without affecting local details of inner scope. (IOW if
> > an inner function defines a local variable 'x', the outer scope
> > shouldn't be able to change that.)

[Alex]
> I must be missing something, because I don't understand the value
> of that guideline. I see outer and inner functions as tightly coupled
> anyway; it's not as if they could be developed independently -- not
> even lexically, surely not semantically.

It's the same as the reason why name lookup (whether at compile time or
at run-time) always goes from inner scope to outer. While you and I see
nested functions as small amounts of closely-knit code, some people will
go overboard and write functions hundreds of lines long containing
dozens of inner functions, which may be categorized into several
functional groups. A decision to share a variable 'foo' between one
group of inner functions shouldn't mean that none of the other inner
functions can have a local variable 'foo'.

Anyway, I hope you'll have a look at my reasons for why the compiler
needs to know about rebinding variables in outer scopes from inside an
inner scope.
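As the language stands in 2.3, an inner function can read an outer local but cannot rebind it; the usual workaround for the rebinding being debated here is a mutable cell (a sketch, not proposed syntax):

```python
def counter():
    count = [0]            # a one-element list acts as a shared cell

    def increment():
        count[0] += 1      # mutates the cell; 'count' itself is never rebound
        return count[0]

    return increment

inc = counter()
assert inc() == 1
assert inc() == 2
assert counter()() == 1    # each call to counter() gets its own cell
```

The inner function only ever *reads* the name `count`, so no declaration is needed; all the mutation happens inside the object the name refers to.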
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Oct 24 17:53:56 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Oct 24 17:54:08 2003
Subject: [Python-Dev] 2nd draft of "How Py is Developed" essay
Message-ID: <3F999F74.7040706@ocf.berkeley.edu>

OK, so using the feedback from the first draft I made a few changes.
One is a paragraph on what to do if you want to add or change a file on
a patch item if you are not the original submitter. I also added a
two-sentence conclusion to the whole essay. Lastly, I changed the title
to better reflect how Python is ultimately developed. =)

As before, any comments and corrections are welcome. If you think this
sucker is done, please say so! If I get enough people saying they think
this is good enough to go out to the world I will post it to
python-announce and python-list and then add it to python.org/dev/ .
Then you can all hear me discuss it again at PyCon (assuming it gets
accepted). =)

----------------------------

Guido, Some Guys, and a Mailing List: How Python is Developed
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Introduction
============

Software does not make itself. Code does not spontaneously come from the
ether of the universe. Python_ is no exception to this rule. Since
Python made its public debut back in 1991, people beyond the BDFL
(Benevolent Dictator For Life, `Guido van Rossum`_) have contributed
time and energy to making Python what it is today: a powerful, simple
programming language available to all. But it has not been a random
process of people doing whatever they wanted to Python. Over the years a
process for developing Python has emerged within the group that heads
Python's growth and maintenance, `python-dev`_. This document is an
attempt to write this process down in hopes of lowering any barriers
that might prevent people from contributing to the development of
Python.

.. _Python: http://www.python.org/
..
_Guido van Rossum: http://www.python.org/~guido/
.. _python-dev: http://mail.python.org/mailman/listinfo/python-dev

Tools Used
==========

To help facilitate the development of Python, certain tools are used.
Beyond the obvious ones such as a text editor and email client, two
tools are very pervasive in the development process.

SourceForge_ is used by python-dev to keep track of feature requests,
reported bugs, and contributed patches. A detailed explanation on how to
use SourceForge is covered later in `General SourceForge Guidelines`_.

CVS_ is a networked file versioning system that stores all of the files
that make up Python. It allows the developers to have a single
repository for the files along with being able to keep track of any and
all changes to every file. The basic commands and uses can be found in
the `dev FAQ`_ along with a multitude of tutorials spread across the
web.

.. _SourceForge: http://sourceforge.net/projects/python/
.. _CVS: http://www.cvshome.org/
.. _dev FAQ: http://www.python.org/dev/devfaq.html

Communicating
=============

Python development is not just programming. It requires a great deal of
communication between people. This communication is not just between the
members of python-dev; communication within the greater Python community
also helps with development. Several mailing lists and newsgroups are
used to help organize all of these discussions.

In terms of Python development, the primary location for communication
is the `python-dev`_ mailing list. This is where the members of
python-dev hash out ideas and iron out issues. It is an open list;
anyone can subscribe to the mailing list. While the discussion can get
quite technical, it is not at all out of the reach of a novice and thus
should not discourage anyone from joining the list. Please realize,
though, this list is **only** for the discussion of the development of
Python; all other questions should be directed somewhere else, such as
`python-list`_.
When the greater Python community is involved in a discussion, it always
ends up on `python-list`_. This mailing list is a gateway to the
newsgroup `comp.lang.python`_. This is also a good place to go when you
have a question about Python that does not pertain to the actual
development of the language.

Using CVS_ allows the development team to know who made a change to a
file and when they made their change. But unless one wants to
continuously update their local checkout of the repository, the best way
to stay on top of changes to the repository is to subscribe to
`Python-checkins`_. This list sends out an email for each and every
change to a file in Python. This list can generate a large amount of
traffic since even changing a typo in some text will trigger an email to
be sent out. But if you wish to be kept abreast of all changes to Python
then this is a good way to do so.

The Patches_ mailing list sends out an email for all changes to patch
items on SourceForge_. This list, just like Python-checkins, can
generate a large amount of email traffic. It is in general useful to
people who wish to help out with the development of Python by knowing
about all new submitted patches as well as any new developments on
preexisting ones.

`Python-bugs-list`_ functions much like the Patches mailing list except
it is for bug items on SourceForge. If you find yourself wanting to help
to close and remove bugs in Python this is the right list to subscribe
to if you can handle the volume of email.

.. _python-list: http://mail.python.org/mailman/listinfo/python-list
.. _comp.lang.python: http://groups.google.com/groups?q=comp.lang.python
.. _Python-checkins: http://mail.python.org/mailman/listinfo/python-checkins
.. _Patches: http://mail.python.org/mailman/listinfo/patches
..
_Python-bugs-list: http://mail.python.org/mailman/listinfo/python-bugs-list

The Actual Development
======================

Developing Python is not all just conversations about neat new language
features (although those neat conversations do come up and there is a
process to it). Developing Python also involves maintaining it by
eliminating discovered bugs, adding and changing features, and various
other jobs that are not necessarily glamorous but are just as important
to the language as anything else.

General SourceForge Guidelines
------------------------------

Since a good amount of Python development involves using SourceForge_,
it is important to follow some guidelines when handling a tracker item
(bug, patch, etc.). Probably one of the most important things you can do
is make sure to set the various options in a new tracker item properly.
The submitter should make sure that the Data Type, Category, and Group
are all set to reasonable values. The remaining values (Assigned To,
Status, and Resolution) should in general be left to Python developers
to set. The exception to this rule is when you want to retract a patch;
then "close" the patch by setting Status to "closed" and Resolution to
whatever is appropriate.

Do a cursory check to make sure whatever you are submitting was not
previously submitted by someone else. Duplication just uses up valuable
time.

And **please** do not post feature requests, bug reports, or patches to
the python-dev mailing list. If you do you will be instructed to create
an appropriate SourceForge tracker item. When in doubt as to whether you
should bring something to python-dev's attention, you can always ask on
`comp.lang.python`_; Python developers actively participate there and
will move the conversation over if it is deemed reasonable.

Feature Requests
----------------

`Feature requests`_ are for features that you wish Python had but you
have no plans on actually implementing by writing a patch.
On occasion people do go through the feature requests (also called RFEs
on SourceForge) to see if there is anything there that they think should
be implemented and actually do the implementation. But in general do not
expect something put here to be implemented without some participation
on your part.

The best way to get something implemented is to campaign for it in the
greater Python community. `comp.lang.python`_ is the best place to
accomplish this. Post to the newsgroup with your idea and see if you can
either get support or convince someone to implement it. It might even
end up being added to `PEP 42`_ so that the idea does not get lost in
the noise as time passes.

.. _feature requests: http://sourceforge.net/tracker/?group_id=5470&atid=355470
.. _PEP 42: http://www.python.org/peps/pep-0042.html

Bug Reports
-----------

Think you found a bug? Then submit a `bug report`_ on SourceForge. Make
sure you clearly specify what version of Python you are using, what OS,
and under what conditions the bug was triggered. The more information
you can give the faster the bug can be fixed since time will not be
wasted requesting more information from you.

.. _bug report: http://sourceforge.net/tracker/?group_id=5470&atid=105470

Patches
-------

Create a patch_ tracker item on SourceForge for any code you think
should be applied to the Python CVS tree. For practically any change to
Python's functionality the documentation and testing suite will need to
be changed as well. Doing this in the first place speeds things up
considerably.

Please make sure your patch is against the CVS repository. If you don't
know how to use it (basics are covered in the `dev FAQ`_), then make
sure you specify what version of Python you made your patch against.

In terms of coding standards, `PEP 8`_ covers Python code while
`PEP 7`_ covers C code. Always try to maximize your code reuse; it makes
maintenance much easier. For C code make sure to limit yourself to ANSI
C code as much as possible.
If you must use non-ANSI C code then see if what you need is checked for
by looking in pyconfig.h. You can also look in Include/pyport.h for more
helpful C code. If what you need is still not there but it is in general
available, then add a check in configure.in for it (don't forget to run
autoreconf for the changes to take effect). And if that *still* doesn't
fit your needs then code up a solution yourself. The reason for all of
this is to limit the dependence on external code that might not be
available for all OSs that Python runs on.

Be aware of intellectual property when handling patches. Any code with
no copyright will fall under the copyright of the `Python Software
Foundation`_. If you have no qualms with that, wonderful; this is the
best solution for Python. But if you feel the need to include a
copyright then make sure that it is compatible with the copyright used
on Python (i.e., BSD-style). The best solution, though, is to sign the
copyright over to the Python Software Foundation.

.. _patch: http://sourceforge.net/tracker/?group_id=5470&atid=305470
.. _dev FAQ: http://www.python.org/dev/devfaq.html
.. _PEP 7: http://www.python.org/peps/pep-0007.html
.. _PEP 8: http://www.python.org/peps/pep-0008.html
.. _Python Software Foundation: http://www.python.org/psf/

Changing the Language
=====================

You understand how to file a patch. You think you have a great idea on
how Python should change. You are ready to write code for your change.
Great, but you need to realize that certain things must be done for a
change to be accepted. Changes fall into two categories: changes to the
standard library (referred to as the "stdlib") and changes to the
language proper.

Changes to the stdlib
---------------------

Changes to the stdlib can consist of adding functionality or changing
existing functionality.
Adding minor functionality (such as a new function or method) requires
convincing a member of python-dev that the addition of code caused by
implementing the feature is worth it. A big addition such as a module
tends to require more support than just a single member of python-dev.
As always, getting community support for your addition is a good idea.
With all additions, make sure to write up documentation for your new
functionality. Also make sure that proper tests are added to the testing
suite.

If you want to add a module, be prepared to be called upon for any bug
fixes or feature requests for that module. Getting a module added to the
stdlib makes you by default its maintainer. If you can't take that level
of responsibility and commitment and cannot get someone else to take it
on for you then your battle will be very difficult; when there is no
specific maintainer of code python-dev takes responsibility, and thus
your code must be useful to them or else they will reject the module.

Changing existing functionality can be difficult to do if it breaks
backwards-compatibility. If your code will break existing code, you must
provide a legitimate reason why making the code act in a non-compatible
way is better than the status quo. This requires python-dev as a whole
to agree to the change.

Changing the Language Proper
----------------------------

Changing Python the language is taken **very** seriously. Python is
often heralded for its simplicity and cleanliness. Any additions to the
language must continue this tradition and view. Thus any changes must go
through a long process.

First, you must write a PEP_ (Python Enhancement Proposal). This is
basically just a document that explains what you want, why you want it,
what could be bad about the change, and how you plan on implementing the
change. It is best to get feedback on PEPs on `comp.lang.python`_ and
from python-dev.
Once you feel the document is ready, you can request a PEP number and
have it added to the official list of PEPs in `PEP 0`_.

Once you have a PEP, you must then convince python-dev and the BDFL that
your change is worth it. Expect to be bombarded with questions and
counter-arguments. It can drag on for over a month, easy. If you are not
up for that level of discussion then do not bother with trying to get
your change in.

If you manage to convince a majority of python-dev and the BDFL (or most
of python-dev; that can lead to the BDFL changing his mind) then your
change can be applied. As with all new code, make sure you also have
appropriate documentation patches along with tests for the new
functionality.

.. _PEP: http://www.python.org/peps/pep-0001.html
.. _PEP 0: http://www.python.org/peps/pep-0000.html

Helping Out
===========

Many people say they wish they could help out with the development of
Python but feel they are not up to writing code. There are plenty of
things one can do, though, that do not require you to write code.
Regardless of your coding abilities, there is something for everyone to
help with.

For feature requests, adding a comment about what you think is helpful.
State whether or not you would like to see the feature. You can also
volunteer to write the code to implement the feature if you feel up to
it.

For bugs, stating whether or not you can reproduce the bug yourself can
be extremely helpful. If you can write a fix for the bug that is very
helpful as well; start a patch item and reference it in a comment in the
bug item.

For patches, apply the patch and run the testing suite. You can do a
code review on the patch to make sure that it is good, clean code. If
the patch adds a new feature, comment on whether you think it is worth
adding. If it changes functionality then comment on whether you think it
might break code; if it does, say whether you think it is worth the cost
of breaking existing code.
Help add to the patch if it is missing documentation patches or needed
regression tests.

A special mention about adding a file to a tracker item. Only official
developers and the creator of the tracker item can add a file. This
means that if you want to add a file and you are neither of the types of
people just mentioned you have to do an extra step or two. One thing you
can do is post the file you want added somewhere else online and
reference the URL in a comment. You can also create a new patch item if
you feel the change is thorough enough and cross-reference between both
patches in the comments. Be wary of this last option, though, since some
people might be offended; it might come off as if you think their code
is bad and yours is better. The best solution of all is to work with the
original poster if they are receptive to help. But if they do not
respond or are not friendly then go ahead with one of the other two
suggestions.

For language changes, make your voice heard. Comment about any PEPs on
`comp.lang.python`_ so that the general opinion of the community can be
assessed.

If there is nothing specific you find you want to work on but still feel
like contributing nonetheless, there are several things you can do. The
documentation can always use fleshing out. Adding more tests to the
testing suite is always useful. Contribute to discussions on python-dev
or `comp.lang.python`_. Just helping out in the community by spreading
the word about Python or helping someone with a question is helpful.

If you really want to get knee-deep in all of this, join python-dev.
Once you have been actively participating for a while and are generally
known on python-dev you can request to have checkin rights on the CVS
tree. It is a great way to learn how to work in a large, distributed
group along with how to write great code.
And if all else fails, give money; the `Python Software Foundation`_ is
a non-profit organization that accepts donations that are tax-deductible
in the United States. The funds are used for various things, from
lawyers for handling the intellectual property of Python to funding
PyCon_. But the PSF could do a lot more if they had the funds. One goal
is to have enough money to fund having Guido work on Python for a full
year full-time; this would bring about Python 3. Every dollar does help,
so please contribute if you can.

.. _PyCon: http://www.python.org/pycon/

Conclusion
==========

If you get any message from this document, it should be that *anyone*
can help with the development of Python. All help is greatly appreciated
and keeps the language the wonderful piece of software that it is.

From gward-work at python.net Fri Oct 24 18:07:03 2003
From: gward-work at python.net (Greg Ward)
Date: Fri Oct 24 18:07:10 2003
Subject: [Python-Dev] Kernel panic writing to /dev/dsp with cmpci driver
Message-ID: <20031024220703.GA2267@intelerad.com>

[cc'ing python-dev because there might be something funny in the
ossaudiodev module -- but some of you already know that!]

I've just upgraded to Linux 2.4.23-pre8 + RML's preemptible kernel
patch, and I have a pretty reproducible panic when writing to /dev/dsp.
Here's what lspci reports about the sound hardware:

  02:03.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)

I'm using the cmpci driver. Oddly, the panic only happens when using
Python 2.3's ossaudiodev module, which is a fairly thin wrapper around
the OSS API.
Here's a script that crashes my machine every time:

"""
#!/usr/bin/python2.3
import sys
import ossaudiodev

random = open("/dev/urandom", "r")
dsp = ossaudiodev.open("w")
while 1:
    sys.stdout.write("."); sys.stdout.flush()
    dsp.write(random.read(4096))
"""

(I'm quite sure that the panic has nothing to do with /dev/urandom, since I discovered it by playing Ogg Vorbis files, not by playing white noise.) The crash happens after about 10-12 dots have appeared, i.e. 10-12 4k blocks have been written.

Here's a C version of that script that does *not* crash my system:

"""
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

#define BUF_SIZE 4096

int main(int argc, char ** argv)
{
    int nbytes;
    char data[BUF_SIZE];
    int source, dsp;                /* input, output FDs */

    source = open("/dev/urandom", O_RDONLY);
    dsp = open("/dev/dsp", O_WRONLY);
    printf("source fd=%d, dsp fd=%d\n", source, dsp);
    while (1) {
        printf("."); fflush(stdout);
        nbytes = read(source, data, BUF_SIZE);
        write(dsp, data, nbytes);
    }
}
"""

Just wondering if anyone else has seen something like this in 2.4.23-pre8, either with or without the preemptible kernel patch. I'm going to try backing out that patch to see if the problem persists; if so, I'll report back here with more details on the panic. Oh yeah, this is a Red Hat 9 system -- the sound driver worked perfectly with Red Hat's 2.4.20-20.9 kernel (which, from the source RPM, appears to be 2.4.21-pre3 plus a bunch of Red Hat patches).

Greg

From gward at intelerad.com Fri Oct 24 18:58:41 2003
From: gward at intelerad.com (Greg Ward)
Date: Fri Oct 24 18:58:47 2003
Subject: [Python-Dev] Re: Kernel panic writing to /dev/dsp with cmpci driver
In-Reply-To: <20031024220703.GA2267@intelerad.com>
References: <20031024220703.GA2267@intelerad.com>
Message-ID: <20031024225841.GA1915@intelerad.com>

On 24 October 2003, I said:
> I'm going to try backing out that patch to see if the problem
> persists; if so, I'll report back here with more details on the panic.
OK, I tried it with a vanilla 2.4.23-pre8. The panic is still there, and now I can reproduce it with my C program. (However, I had to run it twice. I'm guessing that if I had run it twice under the preemptible kernel, it would have crashed then too.) So it looks like this is definitely a kernel bug, the Python ossaudiodev driver is not doing anything too perverse, and RML's preemptible kernel patch is not to blame. So here's the ksymoops output:

"""
ksymoops 2.4.9 on i686 2.4.23-pre8-gw2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.23-pre8-gw2/ (default)
     -m /boot/System.map-2.4.23-pre8-gw2 (specified)

c01124d3
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010013
eax: 72756f2e   ebx: 40017000   ecx: 00000000   edx: 72756f2e
esi: df52c4ac   edi: 00000003   ebp: dd9add58   esp: dd9add3c
ds: 0018   es: 0018   ss: 0018
Process crasher (pid: 1846, stackpage=dd9ad000)
Stack:
Call trace: [] [] [] [] [] [] []
Code: 8b 01 85 c6 75 19 8b 02 89 d3 89 c2 0f 18 00 39 f3 75 ea ff

>>EIP; c01124d3 <__wake_up+33/80>   <=====
>>esi; df52c4ac <_end+1f215128/204fecdc>
>>ebp; dd9add58 <_end+1d6969d4/204fecdc>
>>esp; dd9add3c <_end+1d6969b8/204fecdc>

Trace; c0108945
Trace; c0108ac4
Trace; c010b168
Trace; c01a38dc
Trace; c01a3bbe
Trace; c01360b3
Trace; c010740f

Code;  c01124d3 <__wake_up+33/80>
00000000 <_EIP>:
Code;  c01124d3 <__wake_up+33/80>   <=====
   0:   8b 01          mov    (%ecx),%eax   <=====
Code;  c01124d5 <__wake_up+35/80>
   2:   85 c6          test   %eax,%esi
Code;  c01124d7 <__wake_up+37/80>
   4:   75 19          jne    1f <_EIP+0x1f>
Code;  c01124d9 <__wake_up+39/80>
   6:   8b 02          mov    (%edx),%eax
Code;  c01124db <__wake_up+3b/80>
   8:   89 d3          mov    %edx,%ebx
Code;  c01124dd <__wake_up+3d/80>
   a:   89 c2          mov    %eax,%edx
Code;  c01124df <__wake_up+3f/80>
   c:   0f 18 00       prefetchnta (%eax)
Code;  c01124e2 <__wake_up+42/80>
   f:   39 f3          cmp    %esi,%ebx
Code;  c01124e4 <__wake_up+44/80>
  11:   75 ea          jne    fffffffd <_EIP+0xfffffffd>
Code;  c01124e6
<__wake_up+46/80>
  13:   ff 00          incl   (%eax)
"""

(Err, the "-gw2" version number is a red herring -- this really is an unpatched 2.4.23-pre8, I swear!)

Is that enough info for a real kernel hacker to track this down? I'm not very experienced with kernel panics, so I'm not sure if this is all you need. Let me know if I can provide more info.

Greg

From tjreedy at udel.edu Fri Oct 24 20:14:32 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Oct 24 20:14:40 2003
Subject: [Python-Dev] Re: Re: closure semantics
References: <200310242217.42857.aleaxit@yahoo.com><200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz><200310232208.h9NM8wp03735@12-236-54-216.client.attbi.com><87brs7egju.fsf@egil.codesourcery.com><200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: 

"Phillip J. Eby" wrote in message news:5.1.1.6.0.20031024170245.03260160@telecommunity.com...
> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
> >class foo:
> >    A = 1  # these are class variables
> >    B = 2
> >    C = 3
> >
> >    def __init__(self):
> >        self.a = 4  # these are instance variables
> >        self.b = 5
> >        self.c = 6
> >
> >I find this imperative syntax for declaring instance variables
> >profoundly unintuitive. Further, on my first exposure to Python, I
> >thought A, B, C were instance variables, although it wasn't hard to
> >understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

What? They are class attributes that live in the class dictionary, not the instance dictionary. They can be directly accessed as foo.A, etc., while foo.a, etc., don't work. While they *may* serve as default or backup same-for-all-instances values for when there is no instance-specific value of the same name, that's not the same thing, which is why they are defined differently. And a class attribute like number_of_instances would, conceptually, only be a class variable. Let's not confuse Zack further.
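[Terry's distinction shows up directly in the two dictionaries; a minimal runnable sketch -- the class and attribute names here are illustrative only, not from the thread:]

```python
class Foo(object):
    A = 1                # class attribute: stored in Foo.__dict__

    def __init__(self):
        self.a = 4       # instance attribute: stored in self.__dict__

f = Foo()
assert 'A' in Foo.__dict__ and 'A' not in f.__dict__
assert 'a' in f.__dict__ and 'a' not in Foo.__dict__
assert f.A == 1          # lookup falls back from instance to class
Foo.A = 2                # rebinding on the class is seen by all instances
assert f.A == 2
```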
Terry J. Reedy

From eppstein at ics.uci.edu Fri Oct 24 20:31:12 2003
From: eppstein at ics.uci.edu (David Eppstein)
Date: Fri Oct 24 20:31:20 2003
Subject: [Python-Dev] Re: closure semantics
References: <200310242217.42857.aleaxit@yahoo.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: 

In article , "Terry Reedy" wrote:

> > A, B, and C *are* instance variables. Why do you think they aren't?
>
> What? They are class attributes that live in the class dictionary,
> not the instance dictionary.

They are instance variables on the class object, which is an instance of type 'class'.

-- 
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science

From bac at OCF.Berkeley.EDU Fri Oct 24 21:16:41 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Oct 24 21:16:58 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: 
References: <200310242217.42857.aleaxit@yahoo.com> <87brs7egju.fsf@egil.codesourcery.com> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <3F99CEF9.5040304@ocf.berkeley.edu>

David Eppstein wrote:
> In article ,
> "Terry Reedy" wrote:
>
>>> A, B, and C *are* instance variables. Why do you think they aren't?
>>
>> What? They are class attributes that live in the class dictionary,
>> not the instance dictionary.
>
> They are instance variables on the class object, which is an instance of
> type 'class'.

I think the confusion that is brewing here is how Python masks class attributes when you do an assignment on an instance::

>>> class foo(object):
...     A = 42
...
[12213 refs]
>>> bar = foo()
[12218 refs]
>>> bar.A
42
[12220 refs]
>>> bar.A = 13
[12223 refs]
>>> foo.A
42
[12223 refs]
>>> bar.A
13

Python's resolution order checks the instance first and then the class (this is ignoring a data descriptor somewhere in this chain; for the details read Raymond's essay on descriptors @ http://users.rcn.com/python/download/Descriptor.htm#invoking-descriptors ).

-Brett

From tim_one at email.msn.com Fri Oct 24 23:25:48 2003
From: tim_one at email.msn.com (Tim Peters)
Date: Fri Oct 24 23:26:17 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: 

[Tim]
> squares = (f(x)**2 for x in inputs)  # assuming reiterability here
> ...
> for f in math.sin, math.cos, math.tan:
>     plot(squares)

[Skip Montanaro]
> How much more expensive

Stop right there. I must have been unclear. The only point of the example was semantic, not cost: even if generator expressions used closure semantics, the example *still* wouldn't work the way it appears to read, because generator expressions aren't reiterable. What the example would do under closure semantics:

1. Plot the square of math.sin(x), for each x in inputs.

then

2. Probably nothing more than that. The "squares" GE is exhausted after #1 completes, and no matter how often it's run again it's simply going to raise StopIteration at once each time it's tried. A reasonable plot() would probably do nothing when fed an exhausted iterable, but maybe it would raise an exception. That's up to plot(). What it *won't* do under any scheme here is go on to plot the squares of math.cos(x) and math.tan(x) over the inputs too.

The lack of reiterability (which is fine by me!) thus seems to make a plausible use for closure semantics hard to imagine. The example was one where closure semantics suck despite misleading appearance.
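[Tim's exhaustion point is easy to check in a few lines -- a minimal sketch, with arbitrary sample inputs:]

```python
import math

inputs = [0.0, 0.5, 1.0]
squares = (math.sin(x) ** 2 for x in inputs)

first_pass = list(squares)    # consumes the generator expression
second_pass = list(squares)   # already exhausted: StopIteration at once

assert len(first_pass) == 3
assert second_pass == []      # nothing left for a second plot() call
```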
Closures are very often used (in languages other than Python, and in Python too by people who haven't yet learned to write Python <0.9 wink>) to hold mutable state in outer scopes, for the use of functions in inner scopes, very much like an instance's data attributes hold mutable state for the use of methods defined in the instance's class. In those common cases, the power comes from being able to run functions (methods) more than once, or to reuse the mutable state among functions (methods). But generator expressions are always one-shot computations (you get to run a GE to completion no more than once). There may be some use for closure semantics in a collection of GEs that reference each other (similar to instance data being visible to multiple methods), but so far I've failed to dream up a plausible case of that nature either.

> would this be than
>
>     for f in math.sin, math.cos, math.tan:
>         squares = (f(x)**2 for x in inputs)
>         plot(squares)

Despite the similar appearance, that does something very different, plotting all 3 functions (not just math.sin), and regardless of whether closure or capture semantics are used. I expect the body of the loop in real life would be a one-liner, though:

    plot(f(x)**2 for x in inputs)

> which would work without reiterability, right?

Yup.

> The underlying generator function could still be created at compile-time
> and it (or its code object?) stored in the current function's constants.
> 'f' is simply an argument to it when the iterator is instantiated.

Guido expanded on that already. The code is compiled only once (at "compile time"), and there's a small runtime cost per outer-loop iteration to build a function object from the (pre-compiled) code object, and a possibly larger runtime cost per outer-loop iteration to start the GE.
Passing 'f' and 'inputs' may be part of either of those costs, depending on how it's implemented -- but giving the synthesized generator function some initialized locals is the least of the runtime costs.

From aleaxit at yahoo.com Sat Oct 25 03:21:40 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 03:21:48 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
References: <200310242217.42857.aleaxit@yahoo.com> <5.1.1.6.0.20031024170245.03260160@telecommunity.com>
Message-ID: <200310250921.40413.aleaxit@yahoo.com>

On Friday 24 October 2003 23:10, Phillip J. Eby wrote:
> At 01:39 PM 10/24/03 -0700, Zack Weinberg wrote:
> >class foo:
> >    A = 1  # these are class variables
> >    B = 2
> >    C = 3
...
> >thought A, B, C were instance variables, although it wasn't hard to
> >understand why they aren't.
>
> A, B, and C *are* instance variables. Why do you think they aren't?

They're _accessible AS_ instance attributes (self.B will be 2 in a method), but they have the same value in all instances and to _rebind_ them you need to do so on the class object (you can bind an instance variable with the same name to shadow each and any of them, of course).

> What good does declaring the set of instance variables *do*? This seems

It decreases productivity -- that's the empirical result of Prechelt's study and the feeling of people who have ample experience with both kinds of language (cfr Robert Martin's well-known blog for an authoritative one, but my own experience is quite similar). If you subscribe to the popular fallacy known as "lump of labour" -- there is a total fixed amount of work that needs to be done -- it would follow that diminishing productivity increases the number of jobs available. Any economist would be appalled, of course, but, what do THEY know?-)

> to be more of a mental comfort thing than anything else.
I've spent most > of my career in declaration-free languages, though, so I really don't > understand why people get so emotional about being able to declare their > variables. Most of MY work has been with mandatory-declaration languages, and my theory is that a "Stockholm Syndrome" is in effect (google for a few tens of thousands of explanations of that syndrome). > > and there is no other way in the language? > > Actually, there are a great many ways to implement such a thing. One way For instance variables, yes. Fewer for class variables (you need a custom metaclass). None for module variables (also misleadingly known as 'global' ones) nor for local variables. Alex From aleaxit at yahoo.com Sat Oct 25 03:43:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 03:43:40 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <87llrabcac.fsf@egil.codesourcery.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242217.42857.aleaxit@yahoo.com> <87llrabcac.fsf@egil.codesourcery.com> Message-ID: <200310250943.34682.aleaxit@yahoo.com> On Friday 24 October 2003 22:39, Zack Weinberg wrote: ... > > There is absolutely no help (not one minute later, not six months > > later) "comprehending" the program just because some silly language > > mandates redundancy, such as a noiseword 'classvar' in front of the > > assignments. > > Understand that I do almost all my programming in typed languages, > where that keyword isn't noise, it's a critical part of the declaration. I have a vast experience of typed languages, and, even there, the mandatory redundancy of declarations is just a cop-out. A _well-designed_ strictly typed language, such as Haskell or ML, lets the compiler infer all types, so you don't _have_ to provide declarations -- you can, much in the spirit as you can use assert in Python, but you need not. > I think there really *is* a language deficiency with regard to > declaring class versus instance variables. 
I don't: there is no declaration at all (save for the accursed 'global'), only _statements_. They DO things, and what they do is simple and obvious.

> I find this imperative syntax for declaring instance variables
> profoundly unintuitive. Further, on my first exposure to Python, I

That's because you keep thinking of "declaring". Don't. There is no such thing. There is documenting (docstrings, comments) and actions. Period. Entities must not be multiplied beyond need: we don't NEED enforced redundancy. We don't WANT it: if we did, we could choose among a huge host of languages imposing it in a myriad of ways -- but we've chosen Python exactly BECAUSE it has no such redundancy. When I write in some scope

    x = 1

I am saying: x is a name in this scope and it refers to value 1. I have said all that is needed by either the compiler, or a reader who knows the language, to understand everything perfectly. Forcing me to say AGAIN "and oh by the way x is REALLY a name in this scope, I wasn't kidding, honest" is abhorrent. If you really like that why stop at ONE forced useless redundancy? Why not force me to provide a THIRD redundant "I really REALLY truly mean it, please DO believe me!!!", or a fourth one, or...? *ONCE, AND ONLY ONCE*. A key principle of agile programming.

> thought A, B, C were instance variables, although it wasn't hard to
> understand why they aren't.

Reducing the productivity of all language users to (perhaps) help a few who hadn't yet understood one "not hard to understand" detail would be a disastrous trade-off.

> People like to rag on the popularity of __slots__ (for reasons which
> are never clearly spelled out, but never mind) -- has anyone
> considered that it's popular because it's a way of declaring the set
> of instance variables, and there is no other way in the language?

Yes, or more precisely, at least it looks that way, and it's efficient (saves some per-instance memory).
Much the same way as "type(x) is int" looks like a way to "declare a type" and so does isinstance(x, int) later on in one's study of the language (though no saving accrues there). But then, "Extraordinary Popular Delusions and the Madness of Crowds" IS quite deservedly a best-seller for the last 160+ years. Fortunately, Python need not pander to such madness and delusions, however popular:-). Alex From aleaxit at yahoo.com Sat Oct 25 04:07:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 04:07:51 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> Message-ID: <200310251007.36871.aleaxit@yahoo.com> On Friday 24 October 2003 23:32, Guido van Rossum wrote: ... > or at run-time) always goes from inner scope to outer. While you and > I see nested functions as small amounts of closely-knit code, some > people will go overboard and write functions of hundred lines long > containing dozens of inner functions, which may be categorized into This doesn't look like a legitimate use case to me; i.e., I see no need to distort the language if the benefit goes to such "way overboard" uses. I think they will have serious maintainability problems anyway. Fortunately, I don't think of placing the "indication to the compiler" as close to the assignment-to-outer-variable as a distortion;-) > Anyway, I hope you'll have a look at my reasons for why the compiler > needs to know about rebinding variables in outer scopes from inside > an inner scope. Sure! I do understand this. What I don't understand is why, syntactically, the reserved word that indicates this to the compiler should have to be a "statement that does nothing" -- the ONLY "declaration" in the language -- rather than e.g. an _operator_ which specifically flags such uses. 
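[The compile-time nature of the decision Guido describes is easy to demonstrate -- a minimal sketch with illustrative names; in the Python of this thread an inner function simply cannot rebind an outer local (the `nonlocal` statement that later addressed this came with PEP 3104):]

```python
def outer():
    x = 0

    def inner():
        # Reading x is fine: the compiler sees no assignment to x in
        # inner, so x resolves to the enclosing scope.
        return x + 1

    def rebinder():
        # An assignment ANYWHERE in the body makes x local to rebinder
        # for the whole function -- so reading it here raises
        # UnboundLocalError. This is why the compiler must be told,
        # somehow, when a rebinding is meant for an outer scope.
        x = x + 1
        return x

    ok = inner()
    try:
        rebinder()
        failed = False
    except UnboundLocalError:
        failed = True
    return ok, failed

assert outer() == (1, True)
```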
Assume for the sake of argument that we could make 'scope' a reserved word. Now, what are the tradeoffs of using a "declaration"

    scope x in outer

which makes all rebindings of x act in the scope of containing function outer (including 'def x():', 'class x:', 'import x', ...); versus an "operator" that must be used to indicate "which x" when specifically assigning it (no "side effect rebinding" via def &c allowed -- I think it helps the reader of code a LOT to require specific assignment!), e.g.

    scope(outer).x = 23

Don't think of scope as a built-in function, but as a keyword in either case (and we could surely have other syntax for the "scope operator", e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE where x is being assigned). So the compiler can catch on to the info just as effectively. The tradeoffs are:

-- we can keep thinking of Python as declaration-free and by gradually deprecating the global statement make it more so

-- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations

-- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused

-- no casual rebinding of nonlocals via def, class, import

-- once we solve the general problem of allowing non-bare-names as iteration variables in 'for', nonlocals benefit from that resolution automatically, since nonlocals are never assigned-to as bare-names

I see this as the pluses. The minus is, we need a new keyword; but I think we do, because stretching 'global' to mean something that ISN'T global in any sense is such a hack. Cutting both ways is the fact that this allows using the same name from more than one scope (since each use is explicitly qualified as coming from a specific scope).
That's irrelevant for small compact uses of nesting, but it may be seen as offering aid and succour to those wanting to "go overboard" as you detail earlier (bad); OTOH, if I ever need to maintain such "overboard" code written by others, and refactoring it is not feasible right now, it may be helpful. In any case, the confusion is reduced by having the explicit qualification on assignment. Similarly for _accesses_ rather than rebindings -- access to the barename will keep using the same rules as today, of course, but I think the same syntax that MUST be used to assign nonlocals should also be optionally usable to access them -- not important either way in small compact functions, but more regular and offering a way to make code non-ambiguous in large ones. I don't see having two ways to access a name -- barename x or qualified scope(foo).x -- as a problem, just like today from inside a method we may access a classvariable as "self.x" OR "self.__class__.x" indifferently -- the second form is needed for rebinding and may be chosen for clarity in some cases where the first simpler ("barer") one would suffice. Alex From aleaxit at yahoo.com Sat Oct 25 04:44:18 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 04:44:25 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Doc python-docs.txt, 1.2, 1.3 In-Reply-To: <16279.60447.29714.759275@montanaro.dyndns.org> References: <16279.60447.29714.759275@montanaro.dyndns.org> Message-ID: <200310251044.18365.aleaxit@yahoo.com> On Thursday 23 October 2003 16:56, Skip Montanaro wrote: > fred> - add "Why is Python installed on my computer?" as a > documentation fred> FAQ since this gets asked at the docs at python.org > address a fred> lot > > And I thought only webmaster@python.org got asked that question all the > time. Does it get asked at other addresses as well? I don't recall ever > seeing it on python-list. It's quite common on help@python.org too. 
People who ask it probably don't know enough to post to the ng/python-list, but look for the simplest way to ask. Having a FAQ to point them to will be helpful, anyway.

Alex

From aleaxit at yahoo.com Sat Oct 25 04:58:05 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 04:58:12 2003
Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_...
In-Reply-To: 
References: <20031022161137.96353.qmail@web40513.mail.yahoo.com>
Message-ID: <200310251058.05704.aleaxit@yahoo.com>

On Thursday 23 October 2003 07:51, Terry Reedy wrote:
...
> So I really *don't* need global. Perhaps a new builtin
>
> def me():
>     import sys
>     return sys.modules[__name__]

Or, we can make the _compiler_ aware of what is going on (and get just the same semantics as global) by accepting either a non-statement keyword (scope, as I suggested elsewhere) or a magicname for import, e.g.

    import __me__

as Barry suggested. Then __me__.x=23 can have just the same semantics as today "x=23" has if there is some "global x" somewhere around, and indeed it could be compiled into the same bytecode if __me__ was sufficiently special to the compiler. [[ If __me__ was assigned to other objects, subjected to setattr, etc, it would lose all special powers, and become restricted to whatever restrictions may apply now or in the future to "setting stuff in other modules". ]]

We'd get more clarity _for human readers_ by thus flagging every assignment-to-module-level-name *in the very spot it's happening* and avoiding the inappropriate term "global" -- to the compiler it's all the same, but humans are important, too.
Alex

From aleaxit at yahoo.com Sat Oct 25 05:32:04 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 05:32:10 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
Message-ID: <200310251132.04686.aleaxit@yahoo.com>

On Thursday 23 October 2003 07:43, Guido van Rossum wrote:
...
>     <accumulator> = <initial value>
>     for <variable> in <iterable>:
>         <accumulator> = <expression>
...
> Concluding, I think the reduce() pattern is doomed -- the template is
> too complex to capture in special syntax.

I concur, particularly because the assignment in the pattern sketched above is too limiting. You point out that forcing augmented assignment would lose power (e.g., Horner's polynomials need bare assignment), but the inability to use it would imply inefficiencies -- e.g.,

    flatlist = []
    for sublist in listoflists:
        flatlist += sublist

or flatlist.extend(sublist) is better than forcing a "flatlist = flatlist + sublist" as the loop body. Indeed, that's a spot where even 'sum' can be a performance trap; consider the following z.py:

    lol = [ [x] for x in range(1000) ]

    def flat1(lol=lol):
        return sum(lol, [])

    def flat2(lol=lol):
        result = []
        for sub in lol: result.extend(sub)
        return result

and the measurements:

    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
    100 loops, best of 3: 8.5e+03 usec per loop

    [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
    1000 loops, best of 3: 940 usec per loop

sum looks cooler, but it can be an order of magnitude slower than the humble loop of result.extend calls. We could fix this specific performance trap by specialcasing in sum those cases where the result has a += method -- hmmm... would a patch for this performance bug be accepted for 2.3.* ...?
(I understand and approve that we're keen on avoiding adding functionality in any 2.3.*, but fixed-functionality performance enhancements should be just as ok as fixes to functionality bugs, right?)

Anyway, there's a zillion other cases that sum doesn't cover (well, unless we extend dict to have a += synonym for update, which might be polymorphically useful:-), such as

    totaldict = {}
    for subdict in listofdicts:
        totaldict.update(subdict)

Indeed, given the number of "modify in place and return None" methods of both built-in and user-coded types, I think the variant of "accumulation pattern" which simply calls such a method on each item of an iterator is about as prevalent as the variant with assignment "result = ..." as the loop body.

Alex

From aleaxit at yahoo.com Sat Oct 25 05:35:01 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 05:35:46 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com>
References: <200310230337.h9N3bBa20209@oma.cosc.canterbury.ac.nz> <200310230425.h9N4Pnf01585@12-236-54-216.client.attbi.com>
Message-ID: <200310251135.01779.aleaxit@yahoo.com>

On Thursday 23 October 2003 06:25, Guido van Rossum wrote:
...
> it's because it feels very strongly like a directive to the compiler
> -- Python's compiler likes to stay out of the way and not need help.

*YES*!!! So, what about that 'declarative statement' g****l, hmm...?-)

Alex

From aleaxit at yahoo.com Sat Oct 25 06:32:55 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 06:33:00 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
Message-ID: <200310251232.55044.aleaxit@yahoo.com>

I guess it had been a while since I ran 'make test' on the 2.4 cvs... can't find this bug in the bugs db and I'd just like a quick sanity check (if the bug's already there or if I'm doing something weird) before I add it.
Linux Mandrake 9.1, gcc 3.2.2, Berkeley DB 4.1.25 installed in /usr/local/BerkeleyDB.4.1 -- "make test" runs fine all the way to test_bsddb and blocks there. Digging further shows it runs fine all the way to test_pop and blocks specifically in test_popitem. Digging yet further with print and printf shows that when trying to delete the first key ('e') it gets all the way to entering the call

    err = self->db->del(self->db, txn, key, 0);

in _DB_delete -- and never gets out of that call. Ctrl-C does nothing; I have to Ctrl-Z then kill %1 to get out. Previous deletes done in the course of the unit-test give no problems (e.g., test_clear also starts by deleting that 'e' key and just works fine). So, I'm nonplussed...

Alex

From aleaxit at yahoo.com Sat Oct 25 07:52:13 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 07:52:18 2003
Subject: [Python-Dev] tests expecting but not finding errors due to bug fixes
Message-ID: <200310251352.13266.aleaxit@yahoo.com>

Switching to the 2.3 maintenance branch (where test_bsddb runs just fine), I got "make test" failures on test_re.py. Turns out that the 2.3-branch test_re.py was apparently not updated when the RE recursion bug was fixed -- it still expects a couple of exceptions to be raised, and they don't get raised any more because the bugfix itself WAS backported. On general principles, in cases of this ilk, IS it all right to just backport the corrected unit-test (from the 2.4 to the 2.3 branch) and commit the fix, or should one be more circumspect about it...?

Alex

From aleaxit at yahoo.com Sat Oct 25 08:30:51 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:30:57 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
References: <5.1.0.14.0.20031022220801.02547e20@mail.telecommunity.com>
Message-ID: <200310251430.51930.aleaxit@yahoo.com>

On Thursday 23 October 2003 04:12 am, Phillip J.
Eby wrote:
> At 02:49 PM 10/23/03 +1300, Greg Ewing wrote:
> >This would allow the current delayed-evaluation semantics
> >to be kept as the default, while eliminating any need
> >for using the default-argument hack when you don't
> >want delayed evaluation.
>
> Does anybody actually have a use case for delayed evaluation? Why would
> you ever *want* it to be that way? (Apart from the BDFL's desire to have
> the behavior resemble function behavior.)

I have looked far and wide over my code, present and desired, and can only find one example that seems perhaps tangentially relevant -- and I don't think it's a _good_ example. Anyway, here it comes:

    def smooth(N, sequence):
        def fifo(N):
            window = []
            while 1:
                if len(window) < N:
                    yield None
                else:
                    yield window
                    window.pop(0)
                window.append(item)
        latest = iter(fifo(N)).next
        for item in sequence:
            window = latest()
            if window is None: continue
            yield sum(window) / N

as I said, I don't like it one bit; the non-transparent "argument passing" of item from the loop "down into" the generator is truly yecchy. There are MUCH better ways to do this, such as

    def fifo(N, sequence):
        it = iter(sequence)
        window = list(itertools.islice(it, N))
        while 1:
            yield window
            window.pop(0)
            window.append(it.next())

    def smooth(N, sequence):
        for window in fifo(N, sequence):
            yield sum(window) / N

It's not clear that this would generalize to generator expressions, anyway. But I could imagine it might, e.g. IF we had "closure semantics" rather than "snapshot-binding" somebody COULD be tempted to such murky cases of "surreptitious argument passing down into genexprs"... and we're better off without any such possibility, IMHO.

> And, if there's no use case for delayed evaluation, why make people jump
> through hoops to get the immediate binding?

I understand Guido's position that simplicity and regularity of the rules count (a LOT).
But in this case I think Tim's insistence on practicality should count
for more: the "bind everything at start time" semantics are NOT a weird
special case, and the "look everything up each time around the loop"
semantics don't seem to yield any non-weird use...

Alex

From aleaxit at yahoo.com  Sat Oct 25 08:39:12 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:39:17 2003
Subject: [Python-Dev] product()
In-Reply-To: <002401c39907$0176f5a0$e841fea9@oemcomputer>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
Message-ID: <200310251439.12449.aleaxit@yahoo.com>

On Thursday 23 October 2003 03:43 am, Raymond Hettinger wrote:
> In the course of writing up PEP 289, it became clear that
> the future has a number of accumulator functions in store.
> Each of these is useful with iterators of all stripes and
> each helps eliminate a reason for using reduce().
>
> Some like average() and stddev() will likely end up in a
> statistics module.  Others like nbiggest(), nsmallest(),
> anytrue(), alltrue(), and such may end up somewhere else.
>
> The product() accumulator is the one destined to be a builtin.
>
> Though it is not nearly as common as sum(), it does enjoy
> some popularity.  Having it available will help dispense
> with reduce(operator.mul, data, 1).
>
> Would there be any objections to my adding product() to
> Py2.4?  The patch was simple and it is ready to go unless
> someone has some major issue with it.

Michael has already quoted my April opinion on the subject.  I think
these "useful accumulator functions" should all be in some separate
module[s]: none of them is anywhere near popular enough to warrant being
a built-in, IMHO.
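For reference, the reduce() spelling that product() was meant to replace
still runs today -- with the caveat that reduce moved into functools in
Python 3 (and a math.prod builtin-module function did eventually arrive,
in Python 3.8).  A minimal sketch:

```python
import functools
import operator

def product(data):
    # Equivalent of the proposed builtin: multiply everything together,
    # starting from the multiplicative identity 1 (so product of an
    # empty iterable is 1, mirroring sum's default of 0).
    return functools.reduce(operator.mul, data, 1)
```

Usage: product([2, 3, 4]) gives 24, and product([]) gives 1.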
If any were, it might be "alltrue" and "anytrue" -- the
short-circuiting ones, returning the first false or true item found,
respectively, as in:

def alltrue(seq):
    for x in seq:
        if not x:
            return x
    else:
        return True

def anytrue(seq):
    for x in seq:
        if x:
            return x
    else:
        return False

these seem MUCH more generally useful than 'product' (but, I still
opine, not quite enough to warrant being built-ins).

Alex

From aleaxit at yahoo.com  Sat Oct 25 08:57:05 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 08:57:12 2003
Subject: [Python-Dev] fixing sum's performance bug
In-Reply-To: <200310251132.04686.aleaxit@yahoo.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
 <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com>
 <200310251132.04686.aleaxit@yahoo.com>
Message-ID: <200310251457.05711.aleaxit@yahoo.com>

On Saturday 25 October 2003 11:32 am, Alex Martelli wrote:
   ...
> Indeed, that's a spot where even 'sum' can be a performance trap;
> consider the following z.py:
>
> lol = [ [x] for x in range(1000) ]
>
> def flat1(lol=lol):
>     return sum(lol, [])
>
> def flat2(lol=lol):
>     result = []
>     for sub in lol: result.extend(sub)
>     return result
>
> and the measurements:
>
> [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat1()'
> 100 loops, best of 3: 8.5e+03 usec per loop
>
> [alex@lancelot pop]$ timeit.py -c -s'import z' 'z.flat2()'
> 1000 loops, best of 3: 940 usec per loop
>
> sum looks cooler, but it can be an order of magnitude slower
> than the humble loop of result.extend calls.  We could fix this
> specific performance trap by special-casing in sum those cases
> where the result has a += method -- hmmm... would a patch for
> this performance bug be accepted for 2.3.* ...?  (I understand and
> approve that we're keen on avoiding adding functionality in any
> 2.3.*, but fixed-functionality performance enhancements should
> be just as ok as fixes to functionality bugs, right?)
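The quoted trap can be shown in miniature: repeated list concatenation
copies the whole accumulated result each time (quadratic total work),
while extending in place is linear.  A sketch in modern Python -- the
function names here are illustrative, not from the thread:

```python
def flat_by_add(lol):
    # What sum(lol, []) effectively did before the fix: each +
    # builds a brand-new list, so total work grows quadratically
    # with the size of the output.
    result = []
    for sub in lol:
        result = result + sub
    return result

def flat_by_extend(lol):
    # The humble loop from the quote: amortized linear total work,
    # same observable result for lists.
    result = []
    for sub in lol:
        result.extend(sub)
    return result
```

Both produce the same flattened list; only the asymptotics differ, which
is why switching sum's internals to in-place addition closes the gap.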
Ah well -- it's the most trivial fix one can possibly think of, just
changing PyNumber_Add to PyNumber_InPlaceAdd -- so the semantics are
_guaranteed_ to be equal in all _sane_ cases, i.e. excepting only weird
user-coded types that have an __iadd__ with a weirdly different semantic
than __add__ -- and it DOES make sum's CPU time drop to 490 usec in the
above (making it roughly twice as fast as the loop, as it generally
tends to be in typical cases of summing lots of numbers).  So I went
ahead and committed the tiny change on both the 2.4 and 2.3 maintenance
branches (easy enough to revert if the "insane" cases must keep working
in the same [not sane:-)] way in 2.3.*)...

Alex

From aleaxit at yahoo.com  Sat Oct 25 09:08:15 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:08:20 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer>
 <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
Message-ID: <200310251508.15634.aleaxit@yahoo.com>

On Wednesday 22 October 2003 08:53 pm, Guido van Rossum wrote:
> > Did the discussion of a sort() expression get resolved?
> >
> > The last I remember was that the list.sorted() classmethod had won the
> > most support because it accepted the broadest range of inputs.
> >
> > I could live with that though I still prefer the more limited
> > (list-only) copysort() method.
>
> list.sorted() has won, but we are waiting for feedback from the
> person who didn't like having both sort() and sorted() as methods, to
> see if his objection still holds when one is a method and the other a
> factory function.

So, if I've followed correctly the lots of python-dev mail over the last
few days, that person (Aahz) is roughly +0 on list.sorted as classmethod
and thus we can go ahead.  Right?
Alex

From aleaxit at yahoo.com  Sat Oct 25 09:18:22 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:18:27 2003
Subject: [Python-Dev] listcomps vs. for loops
In-Reply-To: <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com>
References: <338366A6D2E2CA4C9DAEAE652E12A1DECFF219@au3010avexu1.global.avaya.com>
 <3F96C6D8.8040507@livinglogic.de>
 <200310221845.h9MIjlr27891@12-236-54-216.client.attbi.com>
Message-ID: <200310251518.22207.aleaxit@yahoo.com>

On Wednesday 22 October 2003 08:45 pm, Guido van Rossum wrote:
> > sum(len(line) for line in file if not line.startswith("#") while
> > line.strip())
> >
> > looks simpler than
> >
> > sum(itertools.takewhile(lambda l: l.strip(), len(line) for line in file
> > if not line.startswith("#")))
>
> I think both are much harder to read and understand than
>
> n = 0
> for line in file:
>     if not line.strip():
>         break
>     if not line.startswith("#"):
>         n += len(line)

Yes, but personally I would prefer yet another refactoring, something
like:

def noncomment_lines_until_white(file):
    for line in file:
        if not line.strip():
            break
        if not line.startswith('#'):
            yield line

n = sum(len(line) for line in noncomment_lines_until_white(file))

To me, the concept "get all non-comment lines until the first
all-whitespace one" gets nicely "factored out", this way, from the other
concept of "sum the lengths of all of these lines".  In Guido's version
I have to reconstruct these two concepts "bottom-up" from their entwined
expression in lower-level terms; in Walter's, I have to reconstruct them
by decomposing a very dense equivalent, still full of lower-level
constructs.  It seems to me that, by naming the
"noncomment_lines_until_white" generator, I make the separation of (and
cooperation between) the two concepts most clear.
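The refactoring above runs unchanged in modern Python (any iterable of
lines will do in place of a file); a quick check with illustrative
sample data:

```python
def noncomment_lines_until_white(file):
    # Yield non-comment lines until the first all-whitespace line.
    for line in file:
        if not line.strip():
            break
        if not line.startswith('#'):
            yield line

sample = ["abc\n", "# a comment\n", "de\n", "\n", "never reached\n"]
n = sum(len(line) for line in noncomment_lines_until_white(sample))
# counts len("abc\n") + len("de\n"): the comment is skipped,
# and everything after the blank line is never consumed
```

Here n comes out to 7, and the trailing "never reached" line is indeed
never pulled from the iterable.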
Clearly, people's tastes in named vs unnamed, and lower-level vs
higher-level expression of concepts, differ widely!-)

Alex

From barry at python.org  Sat Oct 25 09:25:47 2003
From: barry at python.org (Barry Warsaw)
Date: Sat Oct 25 09:25:57 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
In-Reply-To: <200310251232.55044.aleaxit@yahoo.com>
References: <200310251232.55044.aleaxit@yahoo.com>
Message-ID: <1067088346.10257.71.camel@anthem>

On Sat, 2003-10-25 at 06:32, Alex Martelli wrote:
> I guess it had been a while since I ran 'make test' on the 2.4 cvs... can't
> find this bug in the bugs db and I'd just like a quick sanity check (if the
> bug's already there or if I'm doing something weird) before I add it.

Jeremy and I have both seen similar hangs in 2.4cvs.

-Barry

From aleaxit at yahoo.com  Sat Oct 25 09:37:57 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:38:04 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com>
References: <200310220448.h9M4miP26369@12-236-54-216.client.attbi.com>
 <5.2.1.1.0.20031022191732.0280ce10@pop.bluewin.ch>
 <200310221757.h9MHvI327805@12-236-54-216.client.attbi.com>
Message-ID: <200310251537.57480.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:57 pm, Guido van Rossum wrote:
   ...
> > def accgen(n):
> >     def acc(i):
> >         global n in accgen
> >         n += i
> >         return n
> >     return acc
> >
> > particularly more compelling than:
> >
> > class accgen:
> >     def __init__(self, n):
> >         self.n = n
> >
> >     def __call__(self, i):
> >         self.n += i
> >         return self.n
>
> Some people have "fear of classes".  Some people think that a
> function's scope can be cheaper than an object (someone should time
> this).
I need to simulate the "rebinding name in outer scope" with some kind of
item or attribute, of course, but, given this, here comes.  Given this
b.py:

def accgen_attr(n):
    def acc(i):
        acc.n += i
        return acc.n
    acc.n = n
    return acc

def accgen_item(n):
    n = [n]
    def acc(i):
        n[0] += i
        return n[0]
    return acc

class accgen_clas(object):
    def __init__(self, n):
        self.n = n
    def __call__(self, i):
        self.n += i
        return self.n

def looper(accgen, N=1000):
    acc = accgen(100)
    x = map(acc, xrange(N))
    return x

I measure:

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_attr)'
1000 loops, best of 3: 1.86e+03 usec per loop

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_item)'
1000 loops, best of 3: 1.18e+03 usec per loop

[alex@lancelot ext]$ timeit.py -c -s'import b' 'b.looper(b.accgen_clas)'
100 loops, best of 3: 2.1e+03 usec per loop

So, yes, a function IS slightly faster anyway (accgen_attr vs
accgen_clas), AND simulating outer-scope rebinding with a list item is
somewhat faster than doing so with an attr (a class always uses an attr,
and most of its not-too-terrible performance handicap presumably comes
from that fact).

I just don't think such closures would typically be used in bottlenecks
SO tight that a 10%, or even a 40%, extra overhead is going to be
crucial.  So, I find it hard to get excited either way by this
performance issue.

Alex

From aleaxit at yahoo.com  Sat Oct 25 09:56:54 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 09:56:59 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <1066844059.3f96bf9b1240f@mcherm.com>
References: <1066844059.3f96bf9b1240f@mcherm.com>
Message-ID: <200310251556.54913.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:34 pm, Michael Chermside wrote:
> [Jeremy]
> > I'm not averse to introducing a new keyword, which would address both
> > concerns.  yield was introduced with apparently little problem, so it
> > seems possible to add a keyword without causing too much disruption.
> > If we decide we must stick with global, then it's very hard to address
> > Alex's concern about global being a confusing word choice.
>
> [Guido]
> > OK, the tension is mounting.  Which keyword do you have in mind?  And
> > would you use the same keyword for module-globals as for outer-scope
> > variables?
>
> Surely the most appropriate keyword is "scope", right?

That is my personal vote, yes.

> As in
>
>     scope a is global
>     scope b is nested
>     scope c is self
>     scope d is myDict
>
> Okay... maybe I'm getting too ambitious with the last couple...

If we have to have 'scope' as a statement, I'd slightly prefer it if it
HAD something useful to do, so I understand your ambition.  If somebody
thinks it's useful and important to be able to spell

    themodule.x = 23

as

    x    # e.g. at top of function
    ...many lines in-between obscuring the issue...
    x = 23

then it WOULD no doubt be consistent to be able to do similar things for
other "whatever.x = 23" assignments.

However, that "myDict" still leaves me dubious.  If we want to make it
easy to use attribute setting and access syntax in lieu of dictionary
indexing syntax (and it does look nicer often enough) then it seems to
me that we should rather make available a fast equivalent of a wrapper
such as

class ItemsAsAttrs(object):
    def __init__(self, d):
        object.__setattr__(self, 'd', d)
    def __getattr__(self, n):
        return self.d[n]
    def __setattr__(self, n, v):
        self.d[n] = v

then, "scope xx is ItemsAsAttrs(myDict)" would work if such things as
"scope xx is self" did.

Personally, I'd rather go in the other direction: make all assignments
except those to local variables into something _locally clear without
needing to look for possible declarations who knows where_, rather than
fall for the "convenience trap" of allowing assignment to bare names
(and presumably "side-effect rebindings" such as those in statements
def, class, for, import) to mean something different depending on
"declarative statements".
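The ItemsAsAttrs wrapper sketched above runs unchanged in modern Python;
a quick demonstration (the dictionary and names here are illustrative):

```python
class ItemsAsAttrs(object):
    def __init__(self, d):
        # Bypass our own __setattr__ so 'd' lands in the instance
        # dict instead of inside the wrapped dictionary.
        object.__setattr__(self, 'd', d)
    def __getattr__(self, n):
        # Only called when normal attribute lookup fails,
        # i.e. for everything except 'd' itself.
        return self.d[n]
    def __setattr__(self, n, v):
        self.d[n] = v

myDict = {'x': 1}
wrapper = ItemsAsAttrs(myDict)
wrapper.x = 23      # routed into myDict['x'] via __setattr__
```

After the assignment, both wrapper.x and myDict['x'] are 23: attribute
syntax on the wrapper is item syntax on the underlying dictionary.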
However, I do understand that it would at least be consistent to allow
such "insidious convenience" for many kinds of non-local names, as your
"ambitious proposals" imply/suggest.  If it IS deemed desirable to give
"x = 23" semantics that depend on the possible presence of a
"declarative statement" who-knows-where, it seems consistent to allow
that "convenience" for all kinds of semantics.

Perhaps an idea which I think Samuele suggested might be less insidious
than allowing the "declarative statement" to be just about anywhere
within the current function: make 'scope' a normal compound statement,
as in, e.g.:

    scope x in module, xx in foo, z, t, v in self:
        x = 23
        ...etc etc...

this way, at least, the semantics of "x = 23" depend only on those
declarative statements *it's nested inside of*; better than having to
look all over the function, before AND after the "x = 23" and inside
flow control statements too (!), for the "global x" (or whatever) that
MIGHT make "x = 23" mean something different.

Alex

From aleaxit at yahoo.com  Sat Oct 25 10:03:17 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 10:03:22 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <1b5501c398be$ff1832d0$891e140a@YODA>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <1b5501c398be$ff1832d0$891e140a@YODA>
Message-ID: <200310251603.17845.aleaxit@yahoo.com>

On Wednesday 22 October 2003 07:07 pm, Dave Brueck wrote:
   ...
> > like a global variable.  I also don't think I want global variable
> > assignments to look like attribute assignments.
>
> Go easy on me for piping up here, but aren't they attribute assignments or
> at least used as such?  After reading the other posts in this thread I

I entirely agree with this "user of Python" perspective, and I think
it's a pity it's been ignored in the following discussion.

> and any distinction would seem arbitrary or artificial (consider, for

Yes!
If the compiler needs to be aware of global assignments (which IS a good
idea) we can do so by either introducing a new "operator keyword", OR
something like Barry's suggestion of "import __me__" with __me__ as a
magic name recognized by the compiler (hey, if it can recognize
__future__ why not __me__?-).  But to the Python user, making things
look similar when their semantics and use ARE similar is a wonderful
idea.

> example, that it is not an uncommon practice to write a module instead of a
> class if the class would be a singleton).

Indeed, that IS the officially recommended practice (and Guido
emphasized that in rather adamant words after he had recovered from the
shock of seeing the Borg nonpattern presented at a Python-UK
session...:-).

Alex

From skip at pobox.com  Sat Oct 25 10:15:36 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sat Oct 25 10:16:07 2003
Subject: [Python-Dev] accumulator display syntax
In-Reply-To: 
References: <16281.7442.783253.814142@montanaro.dyndns.org>
Message-ID: <16282.34184.615261.250326@montanaro.dyndns.org>

    Tim> [Skip Montanaro]
    >> How much more expensive

    Tim> Stop right there.

Okay, but I couldn't resist. ;-)

    >> for f in math.sin, math.cos, math.tan:
    >>     squares = (f(x)**2 for x in inputs)
    >>     plot(squares)

    Tim> Despite the similar appearance, that does something very different,
    ...

    >> which would work without reiterability, right?

    Tim> Yup.

I shouldn't have mentioned performance.  The above was really the point
I was getting at.  The mention of performance was simply because I
couldn't understand why reiterability would be necessary in your
example.  I see you were just pointing out that someone not
understanding the underlying nature of the generator would assume your
example would work *and* save cycles because the definition of the
generator expression was hoisted out of the loop.
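The distinction Tim is drawing can be made concrete with the semantics
Python eventually shipped for generator expressions: the outermost
iterable is evaluated immediately, but free variables such as f are
looked up only when the generator is consumed.  A sketch (inputs and
function choices are illustrative):

```python
import math

inputs = [0.1, 0.2]
gens = []
for f in (math.sin, math.cos):
    # The genexpr captures 'inputs' now, but looks up 'f' lazily.
    gens.append(f(x) ** 2 for x in inputs)

# Consumed only after the loop has finished: both generators look up
# f at that point and find math.cos, the loop variable's final value.
late = [list(g) for g in gens]
```

Both entries of late are identical (cos(x)**2 for each input), which is
exactly why hoisting the genexpr out of a consuming loop does not do
what a casual reader might hope.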
Skip

From aleaxit at yahoo.com  Sat Oct 25 10:18:42 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 10:18:46 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <13803476.1066768024@[192.168.1.101]>
References: <000001c3984b$052cd820$e841fea9@oemcomputer>
 <13803476.1066768024@[192.168.1.101]>
Message-ID: <200310251618.42221.aleaxit@yahoo.com>

On Wednesday 22 October 2003 05:27 am, David Eppstein wrote:
   ...
> Currently, I am using expressions like
>
> pos2d = dict([(s, (positions[s][0]+dx*positions[s][2],
>                    positions[s][1]+dy*positions[s][2]))
>               for s in positions])

I _must_ be getting old -- it would never occur to me to write something
as dense and incomprehensible (and no, removing the "dict([" would not
make it much clearer).  Something like:

pos2d = {}
for s, (x, y, delta) in positions.iteritems():
    pos2d[s] = x+dx*delta, y+dy*delta

seems just SO much clearer and more transparent to me.

Alex

From neal at metaslash.com  Sat Oct 25 10:29:32 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Sat Oct 25 10:29:40 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310251603.17845.aleaxit@yahoo.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <1b5501c398be$ff1832d0$891e140a@YODA>
 <200310251603.17845.aleaxit@yahoo.com>
Message-ID: <20031025142932.GZ5842@epoch.metaslash.com>

On Sat, Oct 25, 2003 at 04:03:17PM +0200, Alex Martelli wrote:
>
> Yes!  If the compiler needs to be aware of global assignments (which IS
> a good idea) we can do so by either introducing a new "operator keyword"

One thing that I've always wondered about: why can't one do

    def reset_foo():
        global foo = []    # declare as global and do assignment

As Alex pointed out in another mail (I'm paraphrasing liberally):
redundancy is bad.  By having to declare foo as global, there's a
guaranteed redundancy of the variable name when foo is also assigned.

I don't know if this solution would make Alex dislike global less.
But it changes global to look more like a statement, rather than a
declaration.

Neal

From aleaxit at yahoo.com  Sat Oct 25 11:05:04 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 11:06:30 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031025142932.GZ5842@epoch.metaslash.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
 <200310251603.17845.aleaxit@yahoo.com>
 <20031025142932.GZ5842@epoch.metaslash.com>
Message-ID: <200310251705.04439.aleaxit@yahoo.com>

On Saturday 25 October 2003 04:29 pm, Neal Norwitz wrote:
> On Sat, Oct 25, 2003 at 04:03:17PM +0200, Alex Martelli wrote:
> > Yes!  If the compiler needs to be aware of global assignments (which IS
> > a good idea) we can do so by either introducing a new "operator keyword"
>
> One thing that I've always wondered about, why can't one do:
>
>     def reset_foo():
>         global foo = []    # declare as global and do assignment
>
> As Alex pointed out in another mail (I'm paraphrasing liberally):
> redundancy is bad.  By having to declare foo as global, there's
> a guaranteed redundancy of the variable when foo is also assigned.
>
> I don't know if this solution would make Alex dislike global less.
> But it changes global to look more like a statement, rather than
> a declaration.

Indeed, you can see 'global', in this case, as a kind of "operator
keyword", modifying the scope of foo in an assignment statement.

I really have two separate peeves against global (not necessarily in
order of importance, actually):

-- it's the wrong keyword, doesn't really _mean_ "global"

-- it's a "declarative statement", the only one in Python (ecch)
   (leading to weird uncertainty about where it can be placed)

-- "side-effect" assignment to globals, such as in def, class &c
   statements, is quite tricky and error-prone, not useful

Well, OK, _three_ peeves... usual Spanish Inquisition issue...:-)

Your proposal is quite satisfactory wrt solving the second issue, from
my viewpoint.
It would still create a unique-in-Python construct, but not (IMHO) a
problematic one.  As you point out, it _would_ be more concise than
having to separately [a] say foo is global and then [b] assign
something.  It would solve any uncertainty regarding placement of
'global', and syntactically impede using global variables in "rebinding
as side-effect" cases such as def &c, so the third issue disappears.
The first issue, of course, is untouched:-).  It can't be touched
without choosing a different keyword, anyway.  So, with 2 resolutions
out of 3, I do like your idea.

However, I don't think we can get there from here.  Guido has explained
that the parser must be able to understand a statement that starts with
'global' without look-ahead; I don't know if it can keep accepting, for
bw compat and with a warning, the old

    global xx

while also accepting the new and improved

    global xx = 23

But perhaps it's not quite as hard as the "global.xx = 23" would be.  I
find Python's parser too murky & mysterious to feel sure.

Other side issues: if you rebind a module-level xx in half a dozen
places in your function f, right now you only need ONE "global xx"
somewhere in f (just about anywhere); with your proposal, you'd need to
flag "global xx = 23" at each of the several assignments to that xx.
Now, _that suits me just fine_: indeed, I LOVE the fact that a bare
"xx = 23" is KNOWN to set a local, and you don't have to look all over
the place for declarative statements that might affect its semantics
(hmmm, perhaps a 4th peeve vs global, but I see it as part and parcel of
peeve #2:-).  But globals-lovers might complain that it makes using
globals a TAD less convenient.  (Personally, I would not mind THAT at
all, either: if as a result people use 10% fewer globals and replace
them with arguments or classes etc, I think that will enhance their
programs anyway;-).

So -- +1, even though we may need a different keyword to solve [a] the
problem of getting there from here AND [b] my peeve #1 ...:-).
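For the record, the merged "global foo = []" form was never adopted (it
is still a syntax error), while the outer-function-scope rebinding
running through the other branch of this thread eventually got its own
keyword, nonlocal, in Python 3.0 (PEP 3104).  A sketch of both spellings
as they stand today:

```python
count = 0

def bump():
    global count     # declaration first...
    count += 1       # ...then, separately, the assignment

def accgen(n):
    def acc(i):
        nonlocal n   # PEP 3104: rebind n in accgen's scope
        n += i
        return n
    return acc

bump()
acc = accgen(100)
first, second = acc(1), acc(2)
```

The two-line global dance is exactly the redundancy Neal's proposal
targeted; nonlocal simply reused the same declarative shape for
enclosing-function scopes.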
Alex

From pf_moore at yahoo.co.uk  Sat Oct 25 11:49:30 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sat Oct 25 11:49:31 2003
Subject: [Python-Dev] Re: closure semantics
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
 <200310242301.16445.aleaxit@yahoo.com>
 <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
 <200310251007.36871.aleaxit@yahoo.com>
Message-ID: 

Alex Martelli writes:

> Assume for the sake of argument that we could make 'scope' a reserved
> word.  Now, what are the tradeoffs of using a "declaration"
>     scope x in outer
> which makes all rebindings of x act in the scope of containing function
> outer (including 'def x():', 'class x:', 'import x', ...); versus an
> "operator" that must be used to indicate "which x" when specifically
> assigning it (no "side effect rebinding" via def &c allowed -- I think it
> helps the reader of code a LOT to require specific assignment!), e.g.
>     scope(outer).x = 23
>
> Don't think of scope as a built-in function, but as a keyword in either
> case (and we could surely have other syntax for the "scope operator",
> e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE
> where x is being assigned).  So the compiler can catch on to the info
> just as effectively.

I'm skimming this, so I apologise if I've missed something obvious.
However, one significant issue with your notation

    scope(outer).x = 23

is that, although scope(outer) *looks like* a function call, it isn't --
precisely because scope is a keyword.  I think that, if you're using a
keyword, you need something syntactically distinct.  Now maybe you can
make something like (x in f scope) work as an expression (I've
deliberately used "f" not "outer" to highlight the fact that it may not
always look as "nice" as your example), but I'm not sure it's as
intuitive as you imply.

But then again, I've no problem with "global x in f".
Paul -- This signature intentionally left blank From eppstein at ics.uci.edu Sat Oct 25 12:03:01 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Sat Oct 25 12:03:04 2003 Subject: [Python-Dev] Re: accumulator display syntax References: <000001c3984b$052cd820$e841fea9@oemcomputer> <13803476.1066768024@[192.168.1.101]> <200310251618.42221.aleaxit@yahoo.com> Message-ID: In article <200310251618.42221.aleaxit@yahoo.com>, Alex Martelli wrote: > On Wednesday 22 October 2003 05:27 am, David Eppstein wrote: > ... > > Currently, I am using expressions like > > > > pos2d = > > dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s > > ][2])) > > for s in positions]) > > I _must_ be getting old -- it would never occur to me to write something > as dense and incomprehensible (and no, removing the "dict([" would not > make it much clearer). Something like: > > pos2d = {} > for s, (x, y, delta) in positions.iteritems(): > pos2d[s] = x+dx*delta, y+dy*delta > > seems just SO much clearer and more transparent to me. I like the comprehension syntax so much that I push it harder than I guess I should. If I'm building a dictionary by performing some transformation on the items of another dictionary, I prefer to write it in a way that avoids sequencing the items one by one; I don't think of that sequencing as an inherent part of the loop. Put another way, I prefer declarative to imperative when possible. Let's try to spread it out a little and use intermediate variable names: pos2d = dict([(s, (x + dx*z, y + dy*z)) for s,(x,y,z) in positions.items()]) Better? -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From guido at python.org Sat Oct 25 12:40:25 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 12:40:55 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: Your message of "Sat, 25 Oct 2003 10:07:36 +0200." 
 <200310251007.36871.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
 <200310242301.16445.aleaxit@yahoo.com>
 <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
 <200310251007.36871.aleaxit@yahoo.com>
Message-ID: <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com>

> > or at run-time) always goes from inner scope to outer.  While you and
> > I see nested functions as small amounts of closely-knit code, some
> > people will go overboard and write functions of hundred lines long
> > containing dozens of inner functions, which may be categorized into
>
> This doesn't look like a legitimate use case to me; i.e., I see no need
> to distort the language if the benefit goes to such "way overboard" uses.
> I think they will have serious maintainability problems anyway.

One person here (maybe David Eppstein) brought up that they used this
approach for coding up extensive algorithms that are functional in
nature but have a lot of state referenced *during* the computation.
Whoever it was didn't like using classes because the internal state
would persist past the lifetime of the calculation.

When I visited Google I met one person who was advocating the same
coding style -- he was adamant that if he revealed any internal details
of his algorithm then the users of his library would start using them,
and he wouldn't be able to change the details in another revision.

AFAICT these were both very experienced Python developers who had
thought about the issue and chosen to write large nested functions.  So
I don't think you can dismiss this so easily.

> Fortunately, I don't think of placing the "indication to the
> compiler" as close to the assignment-to-outer-variable as a
> distortion;-)
>
> > Anyway, I hope you'll have a look at my reasons for why the compiler
> > needs to know about rebinding variables in outer scopes from inside
> > an inner scope.
>
> Sure!  I do understand this.
> What I don't understand is why,
> syntactically, the reserved word that indicates this to the compiler
> should have to be a "statement that does nothing" -- the ONLY
> "declaration" in the language -- rather than e.g. an _operator_
> which specifically flags such uses.

Maybe because I haven't seen such an operator proposed that I liked. :)

And in its normal usage, I don't find 'global x' offensive; that it can
be abused and sometimes misunderstood doesn't matter to me, that's the
case for sooooo many language constructs...

> Assume for the sake of argument that we could make 'scope' a reserved
> word.  Now, what are the tradeoffs of using a "declaration"
>     scope x in outer
> which makes all rebindings of x act in the scope of containing function
> outer (including 'def x():', 'class x:', 'import x', ...); versus an
> "operator" that must be used to indicate "which x" when specifically
> assigning it (no "side effect rebinding" via def &c allowed -- I think it
> helps the reader of code a LOT to require specific assignment!), e.g.
>     scope(outer).x = 23
>
> Don't think of scope as a built-in function, but as a keyword in either
> case (and we could surely have other syntax for the "scope operator",
> e.g. "(x in outer scope) = 23" or whatever, as long as it's RIGHT THERE
> where x is being assigned).  So the compiler can catch on to the info
> just as effectively.

What bugs me tremendously about this is that it isn't symmetric with
usage: you can *use* the x from the outer scope without using all that
verbiage, but you must *assign* to it with a special construct.  This
would be particularly confusing if x is used on the right-hand side of
the assignment, e.g.:

    scope(outer).x = x.lower()

> The tradeoffs are:
> -- we can keep thinking of Python as declaration-free and by gradually
>    deprecating the global statement make it more so

Somehow I don't see "declaration-free" as an absolute goal, where 100%
is better than 99%.
> -- the reader of code KNOWS what's being assigned to without having
>    to scroll up "hundreds of lines" looking for possible declarations

Yeah, but you can still *use* a variable that was set "hundreds of
lines" before, so it's not a full solution (and will never be --
allowing *use* of nonlocals is clearly a much-wanted and very useful
feature).

> -- assignment to nonlocals is made less casually convenient by just the
>    right amount to ensure it won't be overused

If we don't add "global x in f" or some equivalent, you can't assign to
nonlocals except for module globals, where I don't see a problem.

> -- no casual rebinding of nonlocals via def, class, import

I don't think that's a real issue.

> -- once we solve the general problem of allowing non-bare-names as
>    iteration variables in 'for', nonlocals benefit from that
>    resolution automatically, since nonlocals are never
>    assigned-to as bare-names

This is obscure -- most readers here didn't even know you could do that,
and all except Tim (whom I cut a certain amount of slack because he's
from Wisconsin) said they considered it bad style.  So again the
argument is weak.

> I see this as the pluses.  The minus is, we need a new keyword; but I
> think we do, because stretching 'global' to mean something that ISN'T
> global in any sense is such a hack.

Well, if for some reason the entire Python community suddenly leaned on
me to allow assignment to non-locals, with a syntactic construct to be
used in every assignment to a non-local, I would much favor the C++
style of ::.

> Cutting both ways is the fact that this allows using the same name from
> more than one scope (since each use is explicitly qualified as coming
> from a specific scope).  That's irrelevant for small compact uses of
> nesting, but it may be seen as offering aid and succour to those wanting
> to "go overboard" as you detail earlier (bad);

There is no need for this even among those folks; a simple renaming
allows access to all variables they need.
(My earlier argument wasn't about this, it was about accidental shadowing when there was *no* need to share.) > OTOH, if I ever need to maintain such "overboard" code written by > others, and refactoring it is not feasible right now, it may be > helpful. In any case, the confusion is reduced by having the > explicit qualification on assignment. Similarly for _accesses_ > rather than rebindings -- access to the barename will keep using the > same rules as today, of course, but I think the same syntax that > MUST be used to assign nonlocals should also be optionally usable to > access them -- not important either way in small compact functions, > but more regular and offering a way to make code non-ambiguous in > large ones. I don't see having two ways to access a name -- > barename x or qualified scope(foo).x -- as a problem, just like > today from inside a method we may access a classvariable as "self.x" > OR "self.__class__.x" indifferently -- the second form is needed for > rebinding and may be chosen for clarity in some cases where the > first simpler ("barer") one would suffice. Actually, self.__class__.x is probably a mistake, usually one should name the class explicitly. But I don't see that as the same, because the name isn't bare in either case. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Sat Oct 25 13:15:29 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 13:15:33 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310251508.15634.aleaxit@yahoo.com> References: <005501c398ca$a07a6f20$e841fea9@oemcomputer> <200310221853.h9MIrL327955@12-236-54-216.client.attbi.com> <200310251508.15634.aleaxit@yahoo.com> Message-ID: <20031025171529.GA18617@panix.com> On Sat, Oct 25, 2003, Alex Martelli wrote: > > So, if I've followed correctly the lots of python-dev mail over the last > few days, that person (Aahz) is roughly +0 on list.sorted as classmethod > and thus we can go ahead. 
Right? I'm not the person who objected on non-English speaking grounds, and I'm -0 because I don't like using grammatical tense as the differentiator; as I said, I'd expect sorted() to be a predicate. If we're doing this (and it seems we are), I still prefer copysort() for clarity. But I'm not objecting to sorted(). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From eppstein at ics.uci.edu Sat Oct 25 14:05:38 2003 From: eppstein at ics.uci.edu (David Eppstein) Date: Sat Oct 25 14:05:43 2003 Subject: [Python-Dev] Re: closure semantics References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> <200310251007.36871.aleaxit@yahoo.com> <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> Message-ID: In article <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com>, Guido van Rossum wrote: > > > or at run-time) always goes from inner scope to outer. While you and > > > I see nested functions as small amounts of closely-knit code, some > > > people will go overboard and write functions of hundred lines long > > > containing dozens of inner functions, which may be categorized into > > > > This doesn't look like a legitimate use case to me; i.e., I see no need > > to distort the language if the benefit goes to such "way overboard" uses. > > I think they will have serious maintainability problems anyway. > > One person here brought up (maybe David Eppstein) that they used this > approach for coding up extensive algorithms that are functional in > nature but have a lot of state referenced *during* the computation. > Whoever it was didn't like using classes because the internal state > would persist past the lifetime of the calculation. Yes, that was me. You recommended refactoring the stateful part of the algorithm as an object despite its lack of persistence. 
It worked and my code is much improved thereby. Specifically, I recognized that one of the outer level functions of my code was appending to a sequence of strings, so I turned that function into the next() method of an iterator object, and the other nested functions became other methods of the same object. I'm not sure how much of the improvement was due to using an object-oriented architecture and how much was due to the effort of refactoring in general, but you convinced me that using an object to represent shared state explicitly rather than doing it implicitly by nested function scoping can be a good idea. -- David Eppstein http://www.ics.uci.edu/~eppstein/ Univ. of California, Irvine, School of Information & Computer Science From pedronis at bluewin.ch Sat Oct 25 14:40:36 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sat Oct 25 14:38:11 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310242301.16445.aleaxit@yahoo.com> <200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com> <200310251007.36871.aleaxit@yahoo.com> Message-ID: <5.2.1.1.0.20031025194109.0284ac80@pop.bluewin.ch> At 09:40 25.10.2003 -0700, Guido van Rossum wrote: > > > or at run-time) always goes from inner scope to outer. While you and > > > I see nested functions as small amounts of closely-knit code, some > > > people will go overboard and write functions of hundred lines long > > > containing dozens of inner functions, which may be categorized into > > > > This doesn't look like a legitimate use case to me; i.e., I see no need > > to distort the language if the benefit goes to such "way overboard" uses. > > I think they will have serious maintainability problems anyway. 
> One person here brought up (maybe David Eppstein) that they used this
> approach for coding up extensive algorithms that are functional in
> nature but have a lot of state referenced *during* the computation.
> Whoever it was didn't like using classes because the internal state
> would persist past the lifetime of the calculation.

[seen David Eppstein's post, discarded obsolete comment]

> When I visited Google I met one person who was advocating the same
> coding style -- he was adamant that if he revealed any internal
> details of his algorithm then the users of his library would start
> using them, and he wouldn't be able to change the details in another
> revision.

I must be missing the details; it seems someone is unhappy with the
encapsulation support in Python, wanting it backward using closures.
Yes, closures can be used to get strong encapsulation.  If Python
wanted again to support directly some form of sandboxed execution,
then better support for encapsulation would very likely play a role.
But as I said I must be missing something: if the point is stronger
encapsulation, I would add it to the OO part of the language.  The
schizophrenic split -- use objects, but if you want encapsulation use
closures -- seems odd.

Aside: I have the maybe mistaken impression that having fully complete
functional programming support in Python was not the point, but that
the addition of generators has increased the interest in more
functional programming support.

> AFAICT these were both very experienced Python developers who had
> thought about the issue and chosen to write large nested functions.

They seem to want to import idioms that before 2.1 were not even
imaginable, and maybe I'm wrong, but idioms that come from somewhere
else.  Personally, e.g., I would like multi-method support in Python,
and I know where they come from.  Every experienced Python developer
probably knows some other language, and misses or would like something
from there.
Sometimes I have the impression that seeing the additions
incrementally, and already knowing the language well, makes it hard to
consider the learning curve for someone encountering the language for
the first time.  I think that evaluating whether an addition really
enhances expressivity, or makes the language more uniform, vs the
ref-man growth is very important.  IMHO generators were a clear win;
generator expressions seem an add/subtract thing, because list
comprehension explanation becomes just list(gen expr).

regards.

From skip at pobox.com  Sat Oct 25 12:09:55 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sat Oct 25 15:51:39 2003
Subject: [Python-Dev] Re: closure semantics
In-Reply-To: <200310251007.36871.aleaxit@yahoo.com>
References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz>
	<200310242301.16445.aleaxit@yahoo.com>
	<200310242132.h9OLWMv06320@12-236-54-216.client.attbi.com>
	<200310251007.36871.aleaxit@yahoo.com>
Message-ID: <16282.41043.939103.536103@montanaro.dyndns.org>

    [Alex]
    Assume for the sake of argument that we could make 'scope' a
    reserved word.  Now, what are the tradeoffs of using a "declaration"

        scope x in outer

    which makes all rebindings of x act in the scope of containing
    function outer (including 'def x():', 'class x:', 'import x', ...);
    versus an "operator" that must be used to indicate "which x" when
    specifically assigning it (no "side effect rebinding" via def &c
    allowed -- I think it helps the reader of code a LOT to require
    specific assignment!), e.g.

        scope(outer).x = 23

I don't see how either of your scope statements is really any better
than "global".  If I say

    global x in outer

I am declaring to the compiler that x is global to the current
function, and in particular I want you to bind x to the x which is
local to the function outer.  Maybe "global" isn't perfect, but it
seems to suit the situation fairly well and avoids a new keyword to
boot.
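None of these spellings exists, of course; with the language as it stands, the usual workaround for rebinding a name in an enclosing function is a mutable container (a sketch, with illustrative names):

```python
def counter():
    count = [0]          # a one-element list: mutating it needs no rebinding

    def increment():
        count[0] += 1    # mutates the shared list; 'count' itself is never rebound
        return count[0]

    return increment

inc = counter()
inc()
inc()
print(inc())  # -> 3
```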
With the "scope(outer).x = 23" notation you are mixing apples and oranges (declaration and execution). It looks like an executable statement but it's really a declaration to the compiler. Guido has already explained why the binding has to occur at compile time. The tradeoffs are: -- we can keep thinking of Python as declaration-free and by gradually deprecating the global statement make it more so How do you propose to subsume the current global statement's functionality? -- the reader of code KNOWS what's being assigned to without having to scroll up "hundreds of lines" looking for possible declarations As he would with an extension of the current global statement. I presume you mean for your scope pseudo function to be used at the "point of attack", so there would likely be less separation between the declaration and the assignment. Of course, using your argument about redundancy against you, would I have to use scope(outer).x = ... each time I wanted to change the value of x? What if I rename outer? -- assignment to nonlocals is made less casually convenient by just the right amount to ensure it won't be overused I don't see this as a big problem now. In my own code I rarely use global, and never use nested functions. I suspect that's true for most people. Skip From guido at python.org Sat Oct 25 16:20:45 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 16:20:55 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: Your message of "Sat, 25 Oct 2003 11:32:04 +0200." <200310251132.04686.aleaxit@yahoo.com> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310230543.h9N5heh01776@12-236-54-216.client.attbi.com> <200310251132.04686.aleaxit@yahoo.com> Message-ID: <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com> > sum looks cooler, but it can be an order of magnitude slower > than the humble loop of result.extend calls. 
> We could fix this
> specific performance trap by specialcasing in sum those cases
> where the result has a += method -- hmmm... would a patch for
> this performance bug be accepted for 2.3.* ...?  (I understand and
> approve that we're keen on avoiding adding functionality in any
> 2.3.*, but fixed-functionality performance enhancements should
> be just as ok as fixes to functionality bugs, right?)

No way.  There's nothing that guarantees that a+=b has the same
semantics as a+b, and in fact for lists it doesn't.

I wouldn't even want this for 2.4.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 17:14:32 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 17:14:41 2003
Subject: [Python-Dev] product()
In-Reply-To: Your message of "Sat, 25 Oct 2003 14:39:12 +0200."
	<200310251439.12449.aleaxit@yahoo.com>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310251439.12449.aleaxit@yahoo.com>
Message-ID: <200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>

> it might be "alltrue" and "anytrue" -- the short-circuiting ones,
> returning the first true or false item found respectively, as in:
>
> def alltrue(seq):
>     for x in seq:
>         if not x: return x
>     else:
>         return True
>
> def anytrue(seq):
>     for x in seq:
>         if x: return x
>     else:
>         return False
>
> these seem MUCH more generally useful than 'product' (but,
> I still opine, not quite enough to warrant being built-ins).

These are close to what ABC does with quantifiers.  There, you can
write

    IF EACH x IN sequence HAS x > 0: ...

ABC has the additional quirk that if there's an ELSE branch, you can
use x in it (as a "counter-example").

In Python, you could write this as

    if alltrue(x > 0 for x in sequence): ...

but the current design doesn't expose x to the else branch.
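The quoted pair runs as-is; a transcription with a couple of probes (the for/else clauses always fire here, since neither loop contains a break):

```python
def alltrue(seq):
    # return the first false item, or True if every item is true
    for x in seq:
        if not x: return x
    else:
        return True

def anytrue(seq):
    # return the first true item, or False if no item is true
    for x in seq:
        if x: return x
    else:
        return False

print(alltrue([1, 2, 0, 3]))    # -> 0, the first false item
print(anytrue([0, '', 'hit']))  # -> hit
```

These foreshadow the all() and any() builtins that eventually arrived in Python 2.5, with the difference that the builtins return strictly True/False rather than the witness item.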
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 17:18:42 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 17:18:57 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: Your message of "Sat, 25 Oct 2003 05:47:11 PDT."
References:
Message-ID: <200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com>

> Modified Files:
>     Tag: release23-maint
>     bltinmodule.c
> Log Message:
> changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but
> fixes performance bug with sum(lotsoflists, []).

I think this ought to be reverted, both in 2.3 and 2.4.  Consider this
code:

    empty = []
    for i in range(10):
        print sum([[x] for x in range(i)], empty)

The output used to be:

    []
    [0]
    [0, 1]
    [0, 1, 2]
    [0, 1, 2, 3]
    [0, 1, 2, 3, 4]
    [0, 1, 2, 3, 4, 5]
    [0, 1, 2, 3, 4, 5, 6]
    [0, 1, 2, 3, 4, 5, 6, 7]
    [0, 1, 2, 3, 4, 5, 6, 7, 8]

But now it is:

    []
    [0]
    [0, 0, 1]
    [0, 0, 1, 0, 1, 2]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7]
    [0, 0, 1, 0, 1, 2, 0, 1, 2, 3, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 8]

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 18:10:56 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:11:09 2003
Subject: [Python-Dev] test_bsddb blocks while testing popitem (?)
In-Reply-To: Your message of "Sat, 25 Oct 2003 09:25:47 EDT."
<1067088346.10257.71.camel@anthem>
References: <200310251232.55044.aleaxit@yahoo.com>
	<1067088346.10257.71.camel@anthem>
Message-ID: <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com>

> On Sat, 2003-10-25 at 06:32, Alex Martelli wrote:
> > I guess it had been a while since I ran 'make test' on the 2.4
> > cvs...  can't find this bug in the bugs db and I'd just like a
> > quick sanity check (if the bug's already there or if I'm doing
> > something weird) before I add it.
>
> Jeremy and I have both seen similar hangs in 2.4cvs.
>
> -Barry

Ditto for me on RH9.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sat Oct 25 18:20:13 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:21:21 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Sat, 25 Oct 2003 10:29:32 EDT."
	<20031025142932.GZ5842@epoch.metaslash.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
	<1b5501c398be$ff1832d0$891e140a@YODA>
	<200310251603.17845.aleaxit@yahoo.com>
	<20031025142932.GZ5842@epoch.metaslash.com>
Message-ID: <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com>

> One thing that I've always wondered about, why can't one do:
>
> def reset_foo():
>     global foo = []   # declare as global and do assignment

Nothing deep -- it just never occurred to me.  I was mimicking ABC's
"SHARE foo", which doesn't have this because its syntax for assignment
is the more verbose "PUT value IN variable".

I don't think it'll entice Alex though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz  Sat Oct 25 18:22:38 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sat Oct 25 18:22:45 2003
Subject: [Python-Dev] closure semantics
In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com>
Message-ID: <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz>

> It's complex.
> Can you explain the complete semantics of 'outer' as simply as:
>
>     global <name> [in <scope>]
>
>     Binds and uses <name> in another scope.  If 'in <scope>' is omitted
>     then the name is bound and used in the scope of the current module.

    global <name>

        Assignments to <name> rebind it in the next outer scope where
        it is already bound, or in the module scope if there is no
        existing binding.

Seems about the same length as yours.

> <description of the case where the name is bound between the
> current scope and the scope where the programmer was expecting the
> name to be bound>

Such comments belong in warning messages about the change issued
during the transitional phase, not in the language definition.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+

From guido at python.org  Sat Oct 25 18:25:22 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 18:25:39 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Sat, 25 Oct 2003 13:15:29 EDT."
	<20031025171529.GA18617@panix.com>
References: <005501c398ca$a07a6f20$e841fea9@oemcomputer>
	<200310221853.h9MIrL327955@12-236-54-216.client.attbi.com>
	<200310251508.15634.aleaxit@yahoo.com>
	<20031025171529.GA18617@panix.com>
Message-ID: <200310252225.h9PMPMT07897@12-236-54-216.client.attbi.com>

[Alex]
> > So, if I've followed correctly the lots of python-dev mail over the last
> > few days, that person (Aahz) is roughly +0 on list.sorted as classmethod
> > and thus we can go ahead.  Right?

[Aahz]
> I'm not the person who objected on non-English speaking grounds, and I'm
> -0 because I don't like using grammatical tense as the differentiator;
> as I said, I'd expect sorted() to be a predicate.  If we're doing this
> (and it seems we are), I still prefer copysort() for clarity.  But I'm
> not objecting to sorted().

Predicates start with 'is'.
For example, s.lower() converts s to lowercase; s.islower() asks if s is lowercase. I'm -1 on list.copysort() as a constructor/factory. Since whoever didn't like sorted() before hasn't piped up now, I think we should go ahead and implement the list.sorted() constructor. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Sat Oct 25 18:38:32 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 18:38:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com> References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com> <20031025142932.GZ5842@epoch.metaslash.com> <200310252220.h9PMKD507863@12-236-54-216.client.attbi.com> Message-ID: <200310260038.32881.aleaxit@yahoo.com> On Sunday 26 October 2003 12:20 am, Guido van Rossum wrote: > > One thing that I've always wondered about, why can't one do: > > > > def reset_foo(): > > global foo = [] # declare as global and do assignment > > Nothing deep -- it just never occurred to me. I was mimicking ABC's > "SHARE foo", which doesn't have this because its syntax for assignment > is the more verbose "PUT value IN variable". > > I don't think it'll entice Alex though. :-) Ah, you haven't seen my answer to it? I think it meets most of my objections -- all but the distaste for the keyword 'global' itself -- and I could definitely live with this more happily than with any other use of 'global'. Please see my direct response to Neal for more details. Alex From guido at python.org Sat Oct 25 18:48:12 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 18:48:26 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Sat, 25 Oct 2003 17:05:04 +0200." 
<200310251705.04439.aleaxit@yahoo.com>
References: <200310221602.h9MG2h427527@12-236-54-216.client.attbi.com>
	<1b5501c398be$ff1832d0$891e140a@YODA>
	<200310251603.17845.aleaxit@yahoo.com>
	<20031025142932.GZ5842@epoch.metaslash.com>
	<200310251705.04439.aleaxit@yahoo.com>
Message-ID: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com>

[Neal]
> > def reset_foo():
> >     global foo = []   # declare as global and do assignment

[Alex]
> Indeed, you can see 'global', in this case, as a kind of "operator
> keyword", modifying the scope of foo in an assignment statement.
>
> I really have two separate peeves against global (not necessarily
> in order of importance, actually):
>
> -- it's the wrong keyword, doesn't really _mean_ "global"

I haven't heard anyone else in this thread agree with you on that one.
I certainly don't think it's of earth-shattering ugliness.

> -- it's a "declarative statement", the only one in Python (ecch)
>    (leading to weird uncertainty about where it can be placed)

I'd be happy to entertain proposals for reasonable restrictions on
where 'global' can be placed.  (Other placements would have to be
deprecated at first.)

> -- "side-effect" assignment to globals, such as in def, class &c
>    statements, is quite tricky and error-prone, not useful

Agreed; nobody uses these, but again this can be fixed if we want to
(again we'd have to start deprecating existing use first).  Note that
this is also currently allowed and probably shouldn't:

    def f():
        global x
        for x in ...:
            ...

> Well, OK, _three_ peeves... usual Spanish Inquisition issue...:-)
>
> Your proposal is quite satisfactory wrt solving the second issue,
> from my viewpoint.  It would still create a unique-in-Python
> construct, but not (IMHO) a problematic one.

Well, *every* construct is "unique in Python", isn't it?  Because
Python has only one of each construct, in line with the TOOWTDI zen.
Or do you mean "not seen in other languages"?  I'd disagree -- lots of
languages have something similar, e.g.
"int x = 5;" in C or "var x = 5" in JavaScript. IMO, "global x = 5" is sufficiently similar that it will require no time to learn. > As you point out, > it _would_ be more concise than having to separately [a] say > foo is global then [b] assign something. It would solve any > uncertainty regarding placement of 'global', and syntactically > impede using global variables in "rebinding as side-effect" cases > such as def &c, so the third issue disappears. > > The first issue, of course, is untouched:-). It can't be touched > without choosing a different keyword, anyway. > > So, with 2 resolutions out of 3, I do like your idea. I don't think that Neal's proposal solves #3, unless 'global x = ...' becomes the *only* way. Also, I presume that the following: def f(): global x = 21 x *= 2 print x should continue to be value, and all three lines should reference the same variable. But #3 is moot IMO, it can be solved without mucking with global at all, by simply making the parser reject 'class X', 'def X', 'import X' and 'for X' when there's also a 'global X' in effect. Piece of cake. > However, I don't think we can get there from here. Guido has > explained that the parser must be able to understand a statement > that starts with 'global' without look-ahead; I don't know if it can > keep accepting, for bw compat and with a warning, the old > global xx > while also accepting the new and improved > global xx = 23 There is absolutely no problem recognizing this. > But perhaps it's not quite as hard as the "global.xx = 23" would > be. I find Python's parser too murky & mysterious to feel sure. If you can understand what code can be recognized by a pure recursive descent parser with one token lookahead and no backtracking, you can understand what Python's parser can handle. 
> Other side issues: if you rebind a module-level xx in half a > dozen places in your function f, right now you only need ONE > "global xx" somewhere in f (just about anywhere); with your > proposal, you'd need to flag "global xx = 23" at each of the > several assignments to that xx. Now, _that suits me just > fine_: indeed, I LOVE the fact that a bare "xx = 23" is KNOWN > to set a local, and you don't have to look all over the place for > declarative statements that might affect its semantics You may love this for assignments, but for *using* variables there is already no such comfort. Whether "print xx" prints a local or global variable depends on whether there's an assignment to xx anywhere in the same scope. So I don't think that is a very strong argument. > (hmmm, > perhaps a 4th peeve vs global, but I see it as part and parcel > of peeve #2:-). But globals-lovers might complain that it makes > using globals a TAD less convenient. (Personally, I would not > mind THAT at all, either: if as a result people use 10% fewer > globals and replace them with arguments or classes etc, I think > that will enhance their programs anyway;-). > > > So -- +1, even though we may need a different keyword to > solve [a] the problem of getting there from here AND [b] my > peeve #1 ...:-). --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Sat Oct 25 18:50:50 2003 From: python at rcn.com (Raymond Hettinger) Date: Sat Oct 25 18:51:43 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310252225.h9PMPMT07897@12-236-54-216.client.attbi.com> Message-ID: <003f01c39b4a$70db20c0$e841fea9@oemcomputer> > Since whoever didn't like sorted() before hasn't piped up now, I think > we should go ahead and implement the list.sorted() constructor. Okay, I'll modify the patch to be a classmethod called sorted() and will assign to Alex for second review. 
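The behaviour being signed up for here can be sketched as a plain function: copy the input into a new list, sort that copy in place, and hand it back, leaving the input untouched ('copysort' is Aahz's suggested name; essentially this behaviour later shipped in Python 2.4 as the builtin sorted()):

```python
def copysort(iterable):
    # build a fresh list, sort it in place, return it;
    # the caller's object is never modified
    result = list(iterable)
    result.sort()
    return result

data = (3, 1, 2)
print(copysort(data))  # -> [1, 2, 3]; 'data' is still (3, 1, 2)
```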
Raymond From greg at cosc.canterbury.ac.nz Sat Oct 25 18:51:50 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 18:51:58 2003 Subject: [Python-Dev] Re: Re: closure semantics In-Reply-To: Message-ID: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> > But what about name mismatches? Global statements allows functions to > create 'new' variables in the module scope and not just 'existing' > ones. What about for in-between scopes? It's probably a misfeature of the global statement that it allows that, but if we're going to re-use it in the form of a "global x in scope" statement, we should keep the behaviour the same for nested scopes in the interests of consistency. Maybe this is an argument for introducing an "outer" statement, which requires an existing binding (determined by existence of an assignment at compile time) even for the module scope, and deprecating "global" altogether. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From skip at pobox.com Sat Oct 25 19:03:17 2003 From: skip at pobox.com (Skip Montanaro) Date: Sat Oct 25 19:03:34 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6AD4A@au3010avexu1.global.avaya.com> <200310252222.h9PMMc505078@oma.cosc.canterbury.ac.nz> Message-ID: <16283.309.987234.133955@montanaro.dyndns.org> Greg> global Greg> Assignments to rebind it in the next outer scope where it Greg> is already bound, or in the module scope if there is no existing Greg> binding. Greg> Seems about the same length as yours. Is that compatible with current use? I think the current semantics are that global always binds name to an object with that name at module scope. 
I thought the point of this discussion was to allow the programmer to
specify the precise scope of the object to which the variable would be
bound, in the face of possibly multiple occurrences of the name.
Using the existing syntax you have to pick one rather arbitrarily,
either the module scope or the first place you find.  (Again, I have
never used nested functions, so this is more of a pedantic argument
than anything for me.  Still, it seems if you're going to change
things you should make it so any instance of an outer variable can be
specified.)

Skip

From aleaxit at yahoo.com  Sat Oct 25 19:04:18 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 19:04:24 2003
Subject: [Python-Dev] product()
In-Reply-To: <200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310251439.12449.aleaxit@yahoo.com>
	<200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
Message-ID: <200310260104.18806.aleaxit@yahoo.com>

On Saturday 25 October 2003 11:14 pm, Guido van Rossum wrote:
> > it might be "alltrue" and "anytrue" -- the short-circuiting ones,
> > returning the first true or false item found respectively, as in:
> >
> > def alltrue(seq):
> >     for x in seq:
> >         if not x: return x
> >     else:
> >         return True
> >
> > def anytrue(seq):
> >     for x in seq:
> >         if x: return x
> >     else:
> >         return False
> >
> > these seem MUCH more generally useful than 'product' (but,
> > I still opine, not quite enough to warrant being built-ins).
>
> These are close to what ABC does with quantifiers.  There, you can
> write
>
>     IF EACH x IN sequence HAS x > 0: ...
>
> ABC has the additional quirk that if there's an ELSE branch, you can
> use x in it (as a "counter-example").
>
> In Python, you could write this as
>
>     if alltrue(x > 0 for x in sequence): ...
>
> but the current design doesn't expose x to the else branch.
Right -- it would return the condition being tested, x>0, when
non-true, so just a False; there is no natural way for it to get the
underlying object on which it's testing it.

This is somewhat the same problem as Peter Norvig's original Top(10)
accumulator example: if you just pass to it the iterator of the
comparison keys, it can't collect the 10 items with the highest
comparison keys.

Maybe alltrue(sequence, pred=lambda x: x>0) might be better (pred
would default to None, meaning to test the items in the first
argument, the iterator, for true/false directly):

    def alltrue(seq, pred=None):
        if pred is None:
            def pred(x): return x
            def wrap(x): return x
        else:
            class wrap(object):
                def __init__(self, x): self.counterexample = x
                def __nonzero__(self): return False
        for x in seq:
            if not pred(x): return wrap(x)
        else:
            return True

or something like that (I do think we need the wrap class, so that
alltrue can return an object that evaluates to false but still allows
the underlying "counter-example" to be retrieved if needed).  Use, of
course, would have to be something like:

    allpositives = alltrue(sequence, pred=lambda x: x>0)
    if allpositives:
        print "wow, all positives!"
    else:
        print "nope, some nonpositives, e.g.", allpositives.counterexample

Unfortunately, this usage is pushing at TWO not-strengths of Python:
no neat way to pass an unnamed predicate (lambda ain't really all that
neat...) AND no assignment-as-expression.  So, I don't think it would
really catch on all that much.

Alex

From just at letterror.com  Sat Oct 25 19:09:12 2003
From: just at letterror.com (Just van Rossum)
Date: Sat Oct 25 19:09:20 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com>
Message-ID:

It seems no one liked (or remembered) an idea I proposed last
February, but I'm going to repost it anyway: How about adding a
"rebinding" operator, for example spelled ":=":

    a := 2

It would mean: bind the value 2 to the nearest scope that defines 'a'.
Original post:

    http://mail.python.org/pipermail/python-dev/2003-February/032764.html

A better summary by someone else who liked it:

    http://groups.google.com/groups?selm=mailman.1048248875.10571.python-list%40python.org

Advantages: no declarative statement (I don't like global much to begin
with, but much less for scope declarations other than what it means
now).  It's a nice addition to the current scoping rule: an assignment
IS a scope declaration.

Possible disadvantage: you can only rebind to the nearest scope that
defines the name.  If there's a farther scope that also defines that
name you can't reach that.  But that's nicely symmetrical with how
_reading_ values from nested scopes works today, shadowing is nothing
new.

Ideally, augmented assignments would also become "rebinding".  However,
this may have compatibility problems.

Just

From aleaxit at yahoo.com  Sat Oct 25 19:11:51 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sat Oct 25 19:11:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
	<200310251132.04686.aleaxit@yahoo.com>
	<200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
Message-ID: <200310260111.51509.aleaxit@yahoo.com>

On Saturday 25 October 2003 10:20 pm, Guido van Rossum wrote:
> > sum looks cooler, but it can be an order of magnitude slower
> > than the humble loop of result.extend calls.  We could fix this
> > specific performance trap by specialcasing in sum those cases
> > where the result has a += method -- hmmm... would a patch for
> > this performance bug be accepted for 2.3.* ...?  (I understand and
> > approve that we're keen on avoiding adding functionality in any
> > 2.3.*, but fixed-functionality performance enhancements should
> > be just as ok as fixes to functionality bugs, right?)
>
> No way.
There's nothing that guarantees that a+=b has the same semantics as
a+b, and in fact for lists it doesn't.

You mean because += is more permissive (accepts any sequence RHS while
+ insists the RHS be specifically a list)?  I don't see how this would
make it bad to use += instead of + -- if we let the user sum up a mix
of (e.g.) strings and tuples, why are we hurting him?

And it seemed to me that cases in which the current semantics of
"a = a + b" would work correctly, while the potentially-faster
"a += b" wouldn't, could be classified as "weird" and ignored in
favour of not letting "sum" be an orders-of-magnitude performance trap
for such cases (see my performance measurements in other posts of mine
to this thread).

Still, you're the boss.  Sorry -- I'll immediately revert the commits
I had made and be less eager in the future.

> I wouldn't even want this for 2.4.

Aye aye, cap'n.  I'll revert the 2.4 commits too, then.  Sorry.

Alex

From guido at python.org  Sat Oct 25 19:16:35 2003
From: guido at python.org (Guido van Rossum)
Date: Sat Oct 25 19:16:49 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Sun, 26 Oct 2003 01:11:51 +0200."
	<200310260111.51509.aleaxit@yahoo.com>
References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz>
	<200310251132.04686.aleaxit@yahoo.com>
	<200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com>
	<200310260111.51509.aleaxit@yahoo.com>
Message-ID: <200310252316.h9PNGZc08136@12-236-54-216.client.attbi.com>

> > No way.  There's nothing that guarantees that a+=b has the same
> > semantics as a+b, and in fact for lists it doesn't.
>
> You mean because += is more permissive (accepts any sequence
> RHS while + insists the RHS be specifically a list)?  I don't see how
> this would make it bad to use += instead of + -- if we let the user
> sum up a mix of (e.g.) strings and tuples, why are we hurting him?

We specifically decided that sum() wasn't allowed for strings, because
it's a quadratic algorithm.
Other sequences are just as bad, we just didn't expect that to be a common case. Also see my not-so-far-fetched example of a semantic change. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Sat Oct 25 19:21:20 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 19:23:09 2003 Subject: [Python-Dev] Can we please have a better dict interpolation syntax? In-Reply-To: <20031024184850.GB34310@hishome.net> Message-ID: <200310252321.h9PNLKC05250@oma.cosc.canterbury.ac.nz> > "create index \{table}_lid1_idx on \{table}(\{lid1})" That looks horrible. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aahz at pythoncraft.com Sat Oct 25 19:23:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 19:23:29 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <16279.58018.40303.136992@montanaro.dyndns.org> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <16279.58018.40303.136992@montanaro.dyndns.org> Message-ID: <20031025232326.GA23772@panix.com> On Thu, Oct 23, 2003, Skip Montanaro wrote: > > >>> import __main__ as m # I know, not general, just for trial > >>> m.c=3 > > Isn't (in 3.0) the notion of being able to modify another module's globals > supposed to get restricted to help out (among other things) the compiler? > If so, this use, even though it's not really modifying a global in another > module, might not work forever. That use had better continue working. What won't work is m.len = my_len() and even there, there's still some debate about ways to structure permitting it for the use of debuggers. 
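The quoted `m.c = 3` trick is plain attribute assignment on a module object. A self-contained sketch, using a synthetic module so it does not depend on actually running under `__main__`:

```python
import types

m = types.ModuleType('demo_mod')   # stands in for 'import __main__ as m'
m.c = 3                            # binds the name c in that module's namespace
# the module's globals dict reflects the new binding
assert m.__dict__['c'] == 3
```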
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From aahz at pythoncraft.com Sat Oct 25 19:25:15 2003 From: aahz at pythoncraft.com (Aahz) Date: Sat Oct 25 19:25:18 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <200310251058.05704.aleaxit@yahoo.com> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <200310251058.05704.aleaxit@yahoo.com> Message-ID: <20031025232515.GB23772@panix.com> On Sat, Oct 25, 2003, Alex Martelli wrote: > > Or, we can make the _compiler_ aware of what is going on (and get just the > same semantics as global) by accepting either a non-statement keyword > (scope, as I suggested elsewhere) or a magicname for import, e.g. > import __me__ as Barry suggested. Then __me__.x=23 can have just the > same semantics as today "x=23" has if there is some "global x" somewhere > around, and indeed it could be compiled into the same bytecode if __me__ > was sufficiently special to the compiler. We've already got ``import __main__``; what does __me__ gain? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From aleaxit at yahoo.com Sat Oct 25 19:31:47 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 19:31:53 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16282.41043.939103.536103@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> <16282.41043.939103.536103@montanaro.dyndns.org> Message-ID: <200310260131.47500.aleaxit@yahoo.com> On Saturday 25 October 2003 06:09 pm, Skip Montanaro wrote: ... > I don't see this as a big problem now. In my own code I rarely use global, > and never use nested functions. I suspect that's true for most people. 
No doubt it's true that most people only care about their own code, and
don't have much to do with teaching and advising others, mentoring them,
maintaining and enhancing code originally written by others, etc.  So,
since my professional activity typically encompasses these weird
activities, not of interest to most people, and that gives me a
different viewpoint from that of most people, I guess it's silly of me
to share it.  Sorry if my past well-meant eagerness caused problems;
it's obviously more sensible for people who never use nested functions
to help shape their syntax and semantics, than for those who DO use
them, after all -- and similarly, for people who only care about their
own code to help determine if 'global' is, or isn't, a cause of problems
out there in the wide world of Python newbies and users far from
python-dev.  Alex  From aleaxit at yahoo.com Sat Oct 25 19:45:23 2003
From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Oct 25 19:45:39
2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python
bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To:
<200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> References:
<200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> Message-ID:
<200310260145.23094.aleaxit@yahoo.com> On Saturday 25 October 2003 11:18
pm, Guido van Rossum wrote: > > Modified Files: > > Tag: release23-maint
> > bltinmodule.c > > Log Message: > > changed builtin_sum to use
PyNumber_InPlaceAdd -- unchanged semantics but > > fixes performance bug
with sum(lotsoflists, []). > > I think this ought to be reverted, both
in 2.3 and 2.4.  Consider this > code: I have reverted it; it's
obviously true that, by causing side effects on the 2nd argument, the
fix as I had committed it could change semantics.  I apologize for not
thinking of this (and adding the missing unit-tests to catch this, see
next paragraph).
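The semantic difference, and the side effect on the caller's start value, are easy to reproduce in pure Python. In this sketch, `sum_iadd` is a hypothetical stand-in for the reverted C patch, not the actual code:

```python
a = [1, 2]
alias = a
a = a + [3]                # builds a new list and rebinds a
assert alias == [1, 2]     # the alias still sees the old list

b = [1, 2]
alias = b
b += [3]                   # mutates the existing list in place
assert alias == [1, 2, 3]

c = [1]
c += (2,)                  # += accepts any iterable RHS...
try:
    c = c + (2,)           # ...but + rejects a non-list RHS
except TypeError:
    pass

def sum_iadd(seq, start):
    # hypothetical sketch mirroring the reverted optimization
    total = start
    for item in seq:
        total += item      # in-place add mutates start itself
    return total

start = []
result = sum_iadd([[1], [2]], start)
assert start == [1, 2]     # the caller's start list was changed
```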
If it was up to me I would still pursue the possibility of using PyNumber_InPlaceAdd, for example by only doing so if the second argument could first be successfully copy.copy'ed into the result and falling back to PyNumber_Add otherwise. The alternative of leaving sum as a performance trap for the unwary -- an "attractive nuisance" in legal terms -- would appear to me to be such a bad situation, as to warrant such effort (including adding unit-tests to ensure sum does not alter its second argument, works correctly with a non-copyable 2nd argument, etc). However, it's YOUR decision, and you have already made it clear in another mail that your objections to remedying this performance bug are such that no possible solution will satisfy them. If a type gives different results for "a = a + b" vs "a += b", there is no way sum can find this out; and while, were it my decision, I would not care to support such weird cases at such a huge performance price, it's not my decision. Similarly for types which let you do "a = copy.copy(b)" but do NOT return a valid copy of b, or return b itself even though it's mutable, and so on weirdly. I'm just very sad that I didn't think of this performance-trap back when the specs of sum were first being defined. Oh well:-(. Can I at least add a warning about this performance trap to the docs at http://www.python.org/doc/current/lib/built-in-funcs.html ? Alex From greg at cosc.canterbury.ac.nz Sat Oct 25 21:10:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 21:10:41 2003 Subject: [Python-Dev] closure semantics In-Reply-To: <16283.309.987234.133955@montanaro.dyndns.org> Message-ID: <200310260110.h9Q1ALH05576@oma.cosc.canterbury.ac.nz> > Is that compatible with current use? I think the current semantics are that > global always binds name to an object with that name at module scope. No, it's not quite compatible, but I don't think it's likely to break anything much in practice. 
> I thought the point of this discussion was to allow the programmer >
to specify the precise scope of the object to which the variable > would
be bound, in the face of possibly multiple occurrences of the > name.
In general the point seems to be simply about finding *some* way to bind
intermediate variables.  Some suggestions have included a way to
explicitly identify the scope, but that seems like an unnecessary
complication to me.  Greg Ewing, Computer Science Dept,
+--------------------------------------+ University of Canterbury, | A
citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned
subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz
+--------------------------------------+ From greg at
cosc.canterbury.ac.nz Sat Oct 25 21:15:01 2003 From: greg at
cosc.canterbury.ac.nz (greg@cosc.canterbury.ac.nz) Date: Sat Oct 25
21:15:09 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort
option In-Reply-To: <20031025171529.GA18617@panix.com> Message-ID:
<200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> > If we're doing
this > (and it seems we are), I still prefer copysort() for clarity.
"copysort" sounds like the name of some weird sorting algorithm to me.
I'd prefer "sortedcopy" (although I suppose that could be read as a
predicate, too -- "is x a sorted copy of y?")  Greg Ewing, Computer
Science Dept, +--------------------------------------+ University of
Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand
| wholly-owned subsidiary of USA Inc.
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Sat Oct 25 21:20:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Sat Oct 25 21:20:41 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Message-ID: <200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> > How about adding a "rebinding" operator, for example spelled ":=": > > a := 2 I expect Guido would object to that on the grounds that it's conferring arbitrary semantics on a symbol. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From guido at python.org Sat Oct 25 23:26:03 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 23:26:13 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: Your message of "Sun, 26 Oct 2003 01:45:23 +0200." <200310260145.23094.aleaxit@yahoo.com> References: <200310252118.h9PLIgE07777@12-236-54-216.client.attbi.com> <200310260145.23094.aleaxit@yahoo.com> Message-ID: <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> > > > changed builtin_sum to use PyNumber_InPlaceAdd -- unchanged semantics but > > > fixes performance bug with sum(lotsoflists, []). > > > > I think this ought to be reverted, both in 2.3 and 2.4. Consider this > > code: > > I have reverted it; it's obviously true that, by causing side effects on the > 2nd argument, the fix as I had commited it could change semantics. I > apologize for not thinking of this (and adding the missing unit-tests to > catch this, see next paragraph). 
> > If it was up to me I would still pursue the possibility of using > PyNumber_InPlaceAdd, for example by only doing so if the second > argument could first be successfully copy.copy'ed into the result and falling > back to PyNumber_Add otherwise. The alternative of leaving sum as a > performance trap for the unwary -- an "attractive nuisance" in legal terms -- > would appear to me to be such a bad situation, as to warrant such > effort (including adding unit-tests to ensure sum does not alter its second > argument, works correctly with a non-copyable 2nd argument, etc). > > However, it's YOUR decision, and you have already made it clear in > another mail that your objections to remedying this performance bug are > such that no possible solution will satisfy them. If a type gives different > results for "a = a + b" vs "a += b", there is no way sum can find this out; > and while, were it my decision, I would not care to support such weird > cases at such a huge performance price, it's not my decision. Similarly > for types which let you do "a = copy.copy(b)" but do NOT return a valid > copy of b, or return b itself even though it's mutable, and so on weirdly. > > I'm just very sad that I didn't think of this performance-trap back when > the specs of sum were first being defined. Oh well:-(. Oh, but we all *did* think of it. For strings. :-) > Can I at least add > a warning about this performance trap to the docs at > http://www.python.org/doc/current/lib/built-in-funcs.html ? Definitely. You know, I don't even think that I would consider using sum() if I wanted to concatenate a bunch of lists. Let's use sum() for numbers. Big deal. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Oct 25 23:29:53 2003 From: guido at python.org (Guido van Rossum) Date: Sat Oct 25 23:30:09 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Sun, 26 Oct 2003 14:20:33 +1300." 
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> References:
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz> Message-ID:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> > > How about
adding a "rebinding" operator, for example spelled ":=": > > > > a := 2
> > I expect Guido would object to that on the grounds that > it's
conferring arbitrary semantics on a symbol.  Hardly arbitrary (I have
fond memories of several languages that used :=).  But what is one to
make of a function that uses both a := 2 and a = 2 ???  --Guido van
Rossum (home page: http://www.python.org/~guido/) From guido at
python.org Sat Oct 25 23:36:20 2003 From: guido at python.org (Guido van
Rossum) Date: Sat Oct 25 23:36:34 2003 Subject: [Python-Dev] product()
In-Reply-To: Your message of "Sun, 26 Oct 2003 01:04:18 +0200."
<200310260104.18806.aleaxit@yahoo.com> References:
<002401c39907$0176f5a0$e841fea9@oemcomputer>
<200310251439.12449.aleaxit@yahoo.com>
<200310252114.h9PLEWU07712@12-236-54-216.client.attbi.com>
<200310260104.18806.aleaxit@yahoo.com> Message-ID:
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> > > These are
close to what ABC does with quantifiers.  There, you can > > write > > >
> IF EACH x IN sequence HAS x > 0: ... > > > > ABC has the additional
quirk that if there's an ELSE branch, you can > > use x in it (as a
"counter-example"). > > > > In Python, you could write this as > > > >
if alltrue(x > 0 for x in sequence): ... > > > > but the current design
doesn't expose x to the else branch. > Right -- it would return the
condition being tested, x>0, when non-true, > so just a False; there is
no natural way for it to get the underlying > object on which it's
testing it.  This is somewhat the same problem as > Peter Norvig's
original Top(10) accumulator example: if you just pass to > it the
iterator of the comparison keys, it can't collect the 10 items with >
the highest comparison keys.
>
> Maybe
>     alltrue(sequence, pred=lambda x: x>0)
> might be better (pred would default to None meaning to test the items
> in the first argument, the iterator, for true/false directly):
>
> def alltrue(seq, pred=None):
>     if pred is None:
>         def pred(x): return x
>         def wrap(x): return x
>     else:
>         class wrap(object):
>             def __init__(self, x): self.counterexample = x
>             def __nonzero__(self): return False
>     for x in seq:
>         if not pred(x): return wrap(x)
>     else:
>         return True
>
> or something like that (I do think we need the wrap class, so that
> alltrue can return an object that evaluates to false but still allows
> the underlying "counter-example" to be retrieved if needed).
>
> Use, of course, would have to be something like:
>
> allpositives = alltrue(sequence, pred=lambda x: x>0)
> if allpositives: print "wow, all positives!"
> else: print "nope, some nonpositives, e.g.", allpositives.counterexample
>
> Unfortunately, this usage is pushing at TWO not-strengths of Python:
> no neat way to pass an unnamed predicate (lambda ain't really all
> that neat...) AND no assignment-as-expression.  So, I don't think it
> would really catch on all that much.

Yeah.  An explicit for loop sounds much better in cases where we want
to know which x failed the test.  Let alltrue() be as simple as
originally proposed.

Do we need allfalse() and anytrue() and anyfalse() too?  These can all
easily be gotten by judicious use of 'not'.  I think ABC has EACH,
SOME and NO (why not all four?  who knows).
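The "judicious use of 'not'" is mechanical. A sketch using the thread's names (plain boolean variants of the quantifiers; none of these were built-ins):

```python
def alltrue(seq):
    # True if every item is true (short-circuits on the first false one)
    for x in seq:
        if not x:
            return False
    return True

def anytrue(seq):
    # True if at least one item is true (short-circuits on the first)
    for x in seq:
        if x:
            return True
    return False

# the other two quantifiers follow by judicious use of 'not'
def allfalse(seq):
    return alltrue(not x for x in seq)

def anyfalse(seq):
    return anytrue(not x for x in seq)
```

These return plain booleans, which is simpler than the return-the-item variants discussed upthread.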
--Guido van Rossum (home page: http://www.python.org/~guido/) From
aleaxit at yahoo.com Sun Oct 26 04:01:30 2003 From: aleaxit at yahoo.com
(Alex Martelli) Date: Sun Oct 26 04:01:38 2003 Subject: [Python-Dev]
product() In-Reply-To:
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> References:
<002401c39907$0176f5a0$e841fea9@oemcomputer>
<200310260104.18806.aleaxit@yahoo.com>
<200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> Message-ID:
<200310261001.31072.aleaxit@yahoo.com> On Sunday 26 October 2003 04:36,
Guido van Rossum wrote: ... > > Unfortunately, this usage is pushing at
TWO not-strengths of Python: > > no neat way to pass an unnamed
predicate (lambda ain't really all > > that neat...) AND no
assignment-as-expression.  So, I don't think it > > would really catch
on all that much. > > Yeah.  An explicit for loop sounds much better in
cases where we want > to know which x failed the test.  Let alltrue() be
as simple as > originally proposed.  Yeah, makes sense. > Do we need
allfalse() and anytrue() and anyfalse() too?  These can all > easily be
gotten by judicious use of 'not'.  I think ABC has EACH, > SOME and NO
(why not all four? who knows).  If we were discussing language or
built-ins I would argue for "only one obvious way to do it", but I don't
think this is all that important once we are discussing standard-library
functions (which IS the case here, right?).  Still, I'm not sure I see
the benefits of overlapping functionality in this specific case.  Alex
From greg at cosc.canterbury.ac.nz Sun Oct 26 04:13:29 2003 From: greg
at cosc.canterbury.ac.nz (Greg Ewing) Date: Sun Oct 26 04:14:13 2003
Subject: [Python-Dev] replacing 'global' In-Reply-To:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID:
<200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> > Hardly arbitrary
(I have fond memories of several languages that used :=).  But all the
ones I know of use it for ordinary assignment.
We'd be having two kinds of assignment, and there's no prior art to
suggest which should be = and which :=.  That's the "arbitrary" part.

The only language I can remember seeing which had two kinds of
assignment was Simula, which had := for value assignment and :- for
reference assignment (or was it the other way around? :-)  I always
thought that was kind of weird.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From just at letterror.com  Sun Oct 26 04:53:18 2003
From: just at letterror.com (Just van Rossum)
Date: Sun Oct 26 04:53:25 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com>
Message-ID:

Guido van Rossum wrote:

> Hardly arbitrary (I have fond memories of several languages that used
> :=).

I think augmented assignment should (ideally) also be rebinding, and :=
kind of looks like an augmented assignment, so I don't think it's all
that bad.  I'd be used to it in a snap.

But: let's not get carried away with this particular spelling, the main
question is: "is it a good idea to have a rebinding assignment
operator?" (regardless of how that operator is spelled).  Needless to
say, I think it is.

> But what is one to make of a function that uses both
>
>     a := 2
>
> and
>
>     a = 2
>
> ???

Simple, "a = 2" means 'a' is local to that function, so "a := 2" will
rebind in the same scope.  So the following example will raise
UnboundLocalError:

    def foo():
        a := 3
        a = 2

And this will just work (but is kind of pointless):

    def foo():
        a = 2
        a := 3

And this would be a substitute for the global statement:

    a = 2
    def foo():
        a := 3

(Alex noted in private mail that one disadvantage of this idea is that
it makes using globals perhaps TOO easy...)
Just From aleaxit at yahoo.com Sun Oct 26 05:09:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:10:03 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> Message-ID: <200310261109.56801.aleaxit@yahoo.com> On Sunday 26 October 2003 04:26, Guido van Rossum wrote: ... > > I'm just very sad that I didn't think of this performance-trap back > > when the specs of sum were first being defined. Oh well:-(. > > Oh, but we all *did* think of it. For strings. :-) Yeah, and your decision to forbid them (while my first prototype tried forwarding to ''.join) simplified sum's implementation a lot. Unfortunately we cannot easily distinguish numbers from sequences in the general case, when user-coded classes are in play; so we can't easily forbid sequences and allow numbers. Exactly the same underlying reason as a bug I just opened on SF: if x is an instance of a class X having __mul__ but not __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError (with a message that shows the problem -- x is being taken as a sequence). When X is intended as a number class, this asymmetry between multiplication and (e.g.) addition violates the principle of least surprise. > > Can I at least add > > a warning about this performance trap to the docs at > > http://www.python.org/doc/current/lib/built-in-funcs.html ? > > Definitely. > > You know, I don't even think that I would consider using sum() if I > wanted to concatenate a bunch of lists. Let's use sum() for numbers. > Big deal. Currently the docs say that sum is mostly meant for numbers. By making that observation into a stronger warning, we can at least be somewhat helpful to those few Python users who read the manuals;-). 
If sum just couldn't be used for a list of lists it would indeed not be a big problem. The problem is that it can, it's just (unexpectedly for the naive user) dog-slow, just like a loop of += on a list of strings. And people WILL and DO consider and try and practice _any_ use for a language or library feature. The problem of the += loop on strings is essentially solved by psyco, which has tricks to catch that and make it almost as fast as ''.join; but psyco can't get into a built-in function such as sum, and thus can't help us with the performance trap there. As you've indicated that for 2.4 the risk of semantics changes to sum in weird cases _can_ be at least considered (you're still opposed but open to perhaps being convinced) I hope to get something for that (with a copy.copy of the "accumulator" and in-place addition if that succeeds, falling back to plain addition otherwise) and all the unit tests needed to show it makes sense. An aside...: One common subtheme of this and other recent threads here and on c.l.py is that, as we think of "accumulator functions" to consume iterators, we should not ignore the mutating methods (typically returning None) that are NOT appropriate for list comprehensions just as they weren't for map and friends. A substantial minority of intermediate Python users, knowing or feeling that loops coded in Python aren't as fast as those that happen inside C-coded funcs such as sum, those in itertools, etc, is NOT enthusiastic about coding e.g. "for x in stuff: tot += x". Most often their performance focus is of course inappropriate, but it's hard to uproot it. 
So, in a typical example, we might have:

L = [ [x] for x in xrange(1000) ]

def aloop(L=L):
    tot = []
    for x in L:
        tot += x
    return tot

def asum(L=L):
    return sum(L, [])

def amap(L=L):
    tot = []
    map(tot.extend, L)
    return tot

With the now-regressed fix, this gave:

[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.aloop()'
1000 loops, best of 3: 640 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.asum()'
1000 loops, best of 3: 480 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import d' 'd.amap()'
1000 loops, best of 3: 790 usec per loop

so sum could be touted as "the only obvious solution" -- shortest,
neatest, fastest... IF it were indeed fast!-)  Unfortunately, with the
sum change regressed, d.asum times to 8.4e+03 usec per loop, so it
clearly cannot be considered any more:-).

So, there might be space for an accumulator function patterned on map
but [a] which stops on the shortest sequence like zip and [b] does NOT
build a list of results, meant to be called a bit like map is in the
'amap' example above.  itertools is a great little collection of
producers and manipulators of iterators, but the "accumulator
functions" might provide the "one obvious way" to _consume_ iterators
for common cases; and accumulating by calling an accumulator-object's
mutator method, such as tot.extend above, on all items of an iterator,
clearly is pretty common.
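Such a consumer is tiny to prototype in pure Python. Here `consume` is a hypothetical name chosen for illustration; with several iterables it stops at the shortest, like zip:

```python
def consume(func, *iterables):
    # call func on each item (or item-tuple), discarding results;
    # like map, but builds no result list and stops at the shortest iterable
    if len(iterables) == 1:
        for item in iterables[0]:
            func(item)
    else:
        for items in zip(*iterables):
            func(*items)

L = [[x] for x in range(5)]
tot = []
consume(tot.extend, L)   # the 'amap' pattern, without the useless result list
```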
Alex  From aleaxit at yahoo.com Sun Oct 26 05:20:16 2003 From: aleaxit
at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:20:23 2003 Subject:
[Python-Dev] replacing 'global' In-Reply-To:
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> References:
<200310260120.h9Q1KXw05599@oma.cosc.canterbury.ac.nz>
<200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID:
<200310261120.16246.aleaxit@yahoo.com> On Sunday 26 October 2003 04:29,
Guido van Rossum wrote: > > > How about adding a "rebinding" operator,
for example spelled ":=": > > > > > > a := 2 > > > > I expect Guido
would object to that on the grounds that > > it's conferring arbitrary
semantics on a symbol. > > Hardly arbitrary (I have fond memories of
several languages that used :=).  Now, operator :=) MIGHT indeed be
worth considering -- "rebinding assignment with a smile"!  Yes, of
course := IS a very popular way to denote assignment. > But what is one
to make of a function that uses both > > a := 2 > > and > > a = 2 What
would astonish me least: the presence of a normal binding would ensure
a is local.  I would prefer, therefore, if the compiler AT LEAST warned
about the presence of := at the same scope, and probably I'd be even
happier if the compiler flagged it as an outright error.  I just can't
think of good use cases for wanting both at the same scope on the same
name.  I can think of a dubious one: a style where = would be used as
"initializing declaration" for a name at function start, and all
further re-bindings of the name systematically always used := -- I can
think of people who might prefer that style, but it might be best for
Python to avoid style variance by forbidding it (since it obviously
can't be _mandated_, thanks be:-).  By forbidding compresence of = and
:= on the same name at the same scope, := becomes an unmistakable yet
unobtrusive symbol saying "this assignment here is to a NON-local
name", and thus amply satisfies my long-debated unease wrt "global".
Alex From aleaxit at yahoo.com Sun Oct 26 05:25:46 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:26:02 2003 Subject: [Python-Dev] Re: let's not stretch a keyword's use unreasonably, _please_... In-Reply-To: <20031025232515.GB23772@panix.com> References: <20031022161137.96353.qmail@web40513.mail.yahoo.com> <200310251058.05704.aleaxit@yahoo.com> <20031025232515.GB23772@panix.com> Message-ID: <200310261125.46048.aleaxit@yahoo.com> On Sunday 26 October 2003 01:25, Aahz wrote: > On Sat, Oct 25, 2003, Alex Martelli wrote: > > Or, we can make the _compiler_ aware of what is going on (and get just > > the same semantics as global) by accepting either a non-statement > > keyword (scope, as I suggested elsewhere) or a magicname for import, > > e.g. import __me__ as Barry suggested. Then __me__.x=23 can have just > > the same semantics as today "x=23" has if there is some "global x" > > somewhere around, and indeed it could be compiled into the same > > bytecode if __me__ was sufficiently special to the compiler. > > We've already got ``import __main__``; what does __me__ gain? import __main__ works only if the current module is being imported with the name __main__. Most modules will be using a different name most of the time (i.e. except when they're being used as main scripts, e.g. to run tests on them). Similarly, even if I know a module is named foo.py and am willing to hardcode that into the module's source, import foo as __me__ doesn't always work (submodules of packages, modules being run as main scripts for testing). Furthermore, the compiler cannot do anything special on most imports. __me__ would be designed as special (just like __future__ is) and allow the compiler to recognize the situation and do all it wants or needs, thus obviating the need for "declarative statements". 
Alex From aleaxit at yahoo.com Sun Oct 26 05:34:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:35:03 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261134.56982.aleaxit@yahoo.com> On Sunday 26 October 2003 01:09, Just van Rossum wrote: > It seems no one liked (or remembered) an idea I proposed last February, > but I'm going to repost it anyway: > > How about adding a "rebinding" operator, for example spelled ":=": > > a := 2 > > It would mean: bind the value 2 to the nearest scope that defines 'a'. In the light of the current discussion, this looks beautiful. At least if compresence of := and other bindings (= , class, def, for, import, ...) for the same name at the same scope is flagged as an error. I would also suggest for simplicity that := be only allowed in the simplest form of assignment: to a single bare name -- no packing, unpacking, chaining, nor can the LHS be an indexing, slicing, nor dotted name. > Advantages: no declarative statement (I don't like global much to begin > with, but much less for scope declarations other than what it means > now). It's a nice addition to the current scoping rule: an assignment IS > a scope declaration. Yes. Neat. := becomes an unobtrusive but unmistakable indication "I'm binding this name in NON-local scope" and -- if defined with the restrictions I suggest -- meets all of my issues wrt 'global'. > Possible disadvantage: you can only rebind to the nearest scope that > defines the name. If there's a farther scope that also defines that name > you can't reach that. But that's nicely symmetrical with how _reading_ > values from nested scopes works today, shadowing is nothing new. I agree. Reaching other scopes but the "closest" outer one is not a use case of any overriding importance, IMHO. > Ideally, augmented assignments would also become "rebinding". However, > this may have compatibility problems. Unfortunately yes.
It might have been better to define them that way in the first place, but changing them now is dubious. Besides, we could not load them with the restrictions I think should be put on := to make it simplest, sharpest, and most useful. Alex From aleaxit at yahoo.com Sun Oct 26 05:37:40 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:37:44 2003 Subject: [Python-Dev] Re: Re: closure semantics In-Reply-To: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> References: <200310252251.h9PMpon05099@oma.cosc.canterbury.ac.nz> Message-ID: <200310261137.40143.aleaxit@yahoo.com> On Sunday 26 October 2003 00:51, Greg Ewing wrote: > > But what about name mismatches? Global statements allows functions to > > create 'new' variables in the module scope and not just 'existing' > > ones. What about for in-between scopes? > > It's probably a misfeature of the global statement that it allows > that, but if we're going to re-use it in the form of a "global x in > scope" statement, we should keep the behaviour the same for nested > scopes in the interests of consistency. > > Maybe this is an argument for introducing an "outer" statement, > which requires an existing binding (determined by existence of > an assignment at compile time) even for the module scope, and > deprecating "global" altogether. I think Just's proposal of := meets all of these issues, too: it doesn't have to, and won't, propagate global's misfeature of allowing creation of new variables in nonlocal scope, and "requires an existing binding" (and allows deprecating global altogether, with a warning in 2.4 etc) in the most natural manner. Alex From aleaxit at yahoo.com Sun Oct 26 05:42:05 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:42:11 2003 Subject: [Python-Dev] test_bsddb blocks while testing popitem (?) 
In-Reply-To: <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com> References: <200310251232.55044.aleaxit@yahoo.com> <1067088346.10257.71.camel@anthem> <200310252210.h9PMAuT07833@12-236-54-216.client.attbi.com> Message-ID: <200310261142.05394.aleaxit@yahoo.com> On Sunday 26 October 2003 00:10, Guido van Rossum wrote: > > On Sat, 2003-10-25 at 06:32, Alex Martelli wrote: > > > I guess it had been a while since I ran 'make test' on the 2.4 > > > cvs... can't find this bug in the bugs db and I'd just like a > > > quick sanity check (if the bug's already there or if I'm doing > > > something weird) before I add it. > > > > Jeremy and I have both seen similar hangs in 2.4cvs. > > > > -Barry > > Ditto for me on RH9. So does anybody have a better idea of what's going on...? I can't see what's different in 2.4cvs vs 2.3cvs bsddb module that makes the former repeatably hang in test_popitem while the latter breezes thru all tests...!-( And neither can diff, neither for bsddbmodule.c nor for test_bsddb.py ... Alex From skip at pobox.com Sun Oct 26 05:42:07 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 05:42:23 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com> Message-ID: <16283.42239.376018.900892@montanaro.dyndns.org> Just> How about adding a "rebinding" operator, for example spelled ":=": Just> a := 2 Just> It would mean: bind the value 2 to the nearest scope that defines Just> 'a'. I see a couple problems: * Would you be required to use := at each assignment or just the first? All the toy examples we pass around are very simple, but it seems that the name would get assigned to more than once, so the programmer might need to remember the same discipline all the time. It seems that use of x := 2 and x = 4 should be disallowed in the same function so that the compiler can flag such mistakes. * This seems like a statement which mixes declaration and execution. 
Everyone seems to abhor the global statement. Perhaps its main saving grace is that it doesn't pretend to mix execution and declaration. I think to narrow the scope of possible alternatives it would be helpful to know if what we're looking for is a way to allow the programmer only bind in the nearest enclosing scope or if she should be able to bind to an arbitrary enclosing scope. The various ideas seem to be falling into those two categories. Guido, do you have a preference or a pronouncement on that idea? Knowing that would eliminate one category of solutions. Skip From aleaxit at yahoo.com Sun Oct 26 05:46:53 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:46:58 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251640.h9PGePZ07536@12-236-54-216.client.attbi.com> Message-ID: <200310261146.53044.aleaxit@yahoo.com> On Saturday 25 October 2003 20:05, David Eppstein wrote: ... > > One person here brought up (maybe David Eppstein) that they used this > > approach for coding up extensive algorithms that are functional in > > nature but have a lot of state referenced *during* the computation. ... > refactoring in general, but you convinced me that using an object to > represent shared state explicitly rather than doing it implicitly by > nested function scoping can be a good idea. Great testimony, David -- thanks!!! So, maybe, rather than going out of our way to facilitate coding very large and complicated closures, it might be better to keep focusing on _simple_, small closures as the intended, designed-for use case, and convince users of complicated closures that refactoring, as David has done, into OO terms, can indeed be preferable. 
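[Editor's note: the refactoring David Eppstein's message describes can be sketched as follows. This is a made-up toy example, not his actual code: the same shared mutable state kept first implicitly in a closure (via the mutable-container workaround needed because nested scopes are read-only for rebinding) and then explicitly on an object.]

```python
# Shared state in a closure via a mutable cell, vs. the same state
# held explicitly as object attributes.

def make_averager():
    state = {"total": 0.0, "count": 0}   # mutable cell workaround
    def add(x):
        state["total"] += x              # mutation, not rebinding
        state["count"] += 1
        return state["total"] / state["count"]
    return add

class Averager:
    """Same algorithm; the state is explicit and easy to inspect."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x
        self.count += 1
        return self.total / self.count

avg = make_averager()
avg(1.0)
print(avg(3.0))      # -> 2.0
obj = Averager()
obj.add(1.0)
print(obj.add(3.0))  # -> 2.0
```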
Alex From aleaxit at yahoo.com Sun Oct 26 05:54:55 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 05:55:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: References: <000001c3984b$052cd820$e841fea9@oemcomputer> <200310251618.42221.aleaxit@yahoo.com> Message-ID: <200310261154.55202.aleaxit@yahoo.com> On Saturday 25 October 2003 18:03, David Eppstein wrote: ... > > > pos2d = > > > dict([(s,(positions[s][0]+dx*positions[s][2],positions[s][1]+dy*positions[s][2])) > > > for s in positions]) ... > > pos2d = {} > > for s, (x, y, delta) in positions.iteritems(): > > pos2d[s] = x+dx*delta, y+dy*delta > > > > seems just SO much clearer and more transparent to me. ... > I like the comprehension syntax so much that I push it harder than I > guess I should. If I'm building a dictionary by performing some > transformation on the items of another dictionary, I prefer to write it > in a way that avoids sequencing the items one by one; I don't think of > that sequencing as an inherent part of the loop. > > Put another way, I prefer declarative to imperative when possible. Hmmm, I see. List comprehensions are in fact fully imperative (in Python), but they may be "thought of" in quasi-declarative terms; I do see the allure of that. Thanks for clarifying! We DO have to keep in mind this source of attractiveness in comprehensions over simple loops, I think. > Let's try to spread it out a little and use intermediate variable names: > pos2d = dict([(s, (x + dx*z, y + dy*z)) > for s,(x,y,z) in positions.items()]) > > Better? Yes, it does seem better to me. And with generator expressions, dropping those slightly intrusive [ ... ] would be another little helpful step. Once you can write: pos2d = dict( (s, (x+dx*z, y+dy*z)) for s,(x,y,z) in positions.items() ) I don't think the further slight added value in clarity in being able to write a "dict comprehension" directly, e.g.
pos2d = { s: (x+dx*z, y+dy*z) for s,(x,y,z) in positions.items() } would be enough to warrant the addition to Python's syntax. Alex From aleaxit at yahoo.com Sun Oct 26 06:01:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:01:36 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> Message-ID: <200310261201.30234.aleaxit@yahoo.com> On Saturday 25 October 2003 17:49, Paul Moore wrote: ... > However, one significant issue with your notation scope(outer).x = 23 > is that, although scope(outer) *looks like* a function call, it isn't > - precisely because scope is a keyword. > > I think that, if you're using a keyword, you need something > syntactically distinct. Now maybe you can make something like Existing operator keywords, such as, e.g., 'not', get away without it. One can use parentheses, write not(x), or not (preferable style); and what's the problem if "not(x)" CAN indeed look like a function call while in fact it's not? It really makes no deep difference here that 'not' is a keyword and not a built-in function (it does matter when it's used with other syntax, of course, such as "x is not y" or "x not in y" or "not x" and so on -- but then, were 'scope' to be introduced, it, too, like other operator keywords, might admit of slightly different syntax uses). Similarly, that 'scope' is a keyword known to the compiler is not deeply important to the user coding scope(f) -- it might as well be a built-in, from the user's viewpoint. It's important to the compiler, it becomes important if the user erroneously tries to rebind "scope = 23", but those cases don't give problems.
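[Editor's note: the dict-building variants compared in the accumulator-display message above can be run side by side. `dx`, `dy` and the `positions` mapping below are made-up sample data (name -> (x, y, delta)); the two spellings are the ones debated in the thread.]

```python
# Explicit loop vs. list-comprehension-fed-to-dict(), built from the
# same sample data; both produce the identical mapping.

dx, dy = 1.0, 2.0
positions = {"a": (0.0, 0.0, 1.0), "b": (1.0, 1.0, 2.0)}

# The explicit loop preferred by one side of the thread:
pos2d_loop = {}
for s, (x, y, delta) in positions.items():
    pos2d_loop[s] = (x + dx * delta, y + dy * delta)

# The comprehension style preferred by the other:
pos2d_comp = dict([(s, (x + dx * z, y + dy * z))
                   for s, (x, y, z) in positions.items()])

print(pos2d_loop == pos2d_comp)  # -> True
```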
Alex From skip at pobox.com Sun Oct 26 05:51:53 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 06:08:01 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <200310260131.47500.aleaxit@yahoo.com> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310251007.36871.aleaxit@yahoo.com> <16282.41043.939103.536103@montanaro.dyndns.org> <200310260131.47500.aleaxit@yahoo.com> Message-ID: <16283.42825.957517.595315@montanaro.dyndns.org> Alex> Sorry if my past well-meant eagerness caused problems; it's Alex> obviously more sensible for people who never use nested functions Alex> to help shape their syntax and semantics, than for those who DO Alex> use them, after all -- and similarly, for people who only care Alex> about their own code to help determine if 'global' is, or isn't, a Alex> cause of problems out there in the wide world of Python newbies Alex> and users far from python-dev. Pardon me? Just because I don't use a particular feature of the language doesn't mean I have no interest in how the language evolves. I don't believe I ever disrespected your ideas or opinions. Why are you disrespecting mine? Hell, why are you disrespecting me? I would be more than happy if nested scopes weren't in the language. Their absence would also make your teaching, advising, mentoring, maintenance and enhancing simpler. I haven't proposed that they be removed, though that would be a rather clean way to solve this problem. Alex, if a qualification for discussing improvements to Python is that one use every aspect of the language, please pronounce. I'll be happy to butt out of your turf.
Skip From just at letterror.com Sun Oct 26 06:14:58 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:14:59 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > I see a couple problems: > > * Would you be required to use := at each assignment or just the > first? Just the first; "a = 2" still means "a is local to this scope". > All the toy examples we pass around are very simple, but it > seems that the name would get assigned to more than once, so the > programmer might need to remember the same discipline all the time. > It seems that use of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I don't see it as a mistake. := would mean: "bind to whichever scope the name is defined in", and that includes the current scope. I disagree with Alex when he says := should mean "I'm binding this name in NON-local scope". > * This seems like a statement which mixes declaration and execution. How is that different from "regular" assignment? It mixes declaration and execution in the same way. Just From just at letterror.com Sun Oct 26 06:19:26 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:19:25 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Message-ID: Just van Rossum wrote: > > * Would you be required to use := at each assignment or just the > > first? > > Just the first; "a = 2" still means "a is local to this scope". ^^^^^^^^^^^^^^ Whoops, I meant *at each assignment*, obviously. Just From skip at pobox.com Sun Oct 26 06:21:41 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 06:21:53 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: <16283.44613.425664.463009@montanaro.dyndns.org> >> * Would you be required to use := at each assignment or just the >> first?
Just> Just the first; "a = 2" still means "a is local to this scope". That seems like a very subtle error waiting to happen... >> All the toy examples we pass around are very simple, but it seems >> that the name would get assigned to more than once, so the programmer >> might need to remember the same discipline all the time. It seems >> that use of x := 2 and x = 4 should be disallowed in the same >> function so that the compiler can flag such mistakes. Just> I don't see it as a mistake. := would mean: "bind to whichever Just> scope the name is defined in", and that includes the current Just> scope. I disagree with Alex when he says := should mean "I'm Just> binding this name in NON-local scope". Yeah, but if you come back to the code in six months and the nested function is 48 lines long and assigns to x using a variety of ":=" and "=" assignments, it seems to me like it will be hard to tell if there's a problem. >> * This seems like a statement which mixes declaration and execution. Just> How is that different from "regular" assignment? It mixes Just> declaration and execution in the same way. Not in the way of saying, "this is global and here's its value". Skip From just at letterror.com Sun Oct 26 06:35:09 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 06:35:12 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.44613.425664.463009@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > >> * Would you be required to use := at each assignment or just > >> the first? > > Just> Just the first; "a = 2" still means "a is local to this scope". > > That seems like a very subtle error waiting to happen... Since I said the wrong thing, I'm not sure how to respond to this... Do you still feel the same way with my corrected reply? > >> * This seems like a statement which mixes declaration and > >> execution. > > Just> How is that different from "regular" assignment? It mixes > Just> declaration and execution in the same way. 
> > Not in the way of saying, "this is global and here's its value". In a way := is the opposite of "this is local and here's its value". It says: "this is defined _somewhere_ and here's its new value". Just From aleaxit at yahoo.com Sun Oct 26 06:32:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:36:35 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> References: <200310260913.h9Q9DTS06656@oma.cosc.canterbury.ac.nz> Message-ID: <200310261232.41678.aleaxit@yahoo.com> On Sunday 26 October 2003 10:13, Greg Ewing wrote: > > Hardly arbitrary (I have fond memories of several languages that used > > :=). > > But all the ones I know of use it for ordinary assignment. > We'd be having two kinds of assignment, and there's no > prior art to suggest which should be = and > which :=. That's the "arbitrary" part. > > The only language I can remember seeing which had two > kinds of assignment was Simula, which had := for value > assignment and :- for reference assignment (or was it > the other way around? :-) I always thought that was > kind of weird. VB6 had LET x = y for value assignment and SET x = y for reference assignment. Yes, very confusing particularly because the LET keyword could be dropped. Fortunately we're not proposing anything like that;-). Icon had := for irreversible and <- for reversible assignment. (also :=: and <-> for exchanges and different comparisons for == and === so maybe it HAD gone a bit overboard:-). I do recall an obscure language where = was always augmented assignment equivalent to a = a <op> b. But in particular the : operator meant to evaluate two exprs and take the RH one, like comma in C, so a := b did turn out to mean the same as a = b BUT fail if a couldn't first be evaluated, which (sort of randomly) is sort of close to Just's proposal. Unfortunately I don't remember the language's name:-(.
Googling a bit does show other languages distinguishing global from local variable assignments. E.g, in MUF, http://www.muq.org/~cynbe/muq/muf1_24.html , --> (arrow with TWO hyphens) assigns globally, -> (arrow with ONE hyphen) assigns locally. It appears that this approach is slightly less popular than the 'qualification' one I suggested (e.g. in Javascript you can assign window.x to assign the global x; in Beanshell, super.x to assign to x from enclosing scope) which in turn is less popular than declarations. Another not very popular idea is distinguishing locals and globals by name rules, as in Ruby $glob vs loc or KVirc Glob (upper initial) vs loc (lower initial). Alex From aleaxit at yahoo.com Sun Oct 26 06:35:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 06:37:19 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.42239.376018.900892@montanaro.dyndns.org> References: <200310252248.h9PMmCE07958@12-236-54-216.client.attbi.com> <16283.42239.376018.900892@montanaro.dyndns.org> Message-ID: <200310261235.41107.aleaxit@yahoo.com> On Sunday 26 October 2003 11:42, Skip Montanaro wrote: ... > might need to remember the same discipline all the time. It seems that > use of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I entirely agree with you. There is no good use case that I can see for this mixture, and prohibiting it helps the compiler help the programmer. > * This seems like a statement which mixes declaration and execution. That's actually the PLAIN assignment statement, which mixes assigning a value with telling the compiler "this name is local" (other binding statements such as def, class etc also do that). 
Alex From skip at pobox.com Sun Oct 26 07:04:58 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 07:05:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.44613.425664.463009@montanaro.dyndns.org> Message-ID: <16283.47210.64438.619480@montanaro.dyndns.org> >>>>> "Just" == Just van Rossum writes: Just> Skip Montanaro wrote: >> >> * Would you be required to use := at each assignment or just >> >> the first? >> Just> Just the first; "a = 2" still means "a is local to this scope". >> >> That seems like a very subtle error waiting to happen... Just> Since I said the wrong thing, I'm not sure how to respond to Just> this... Do you still feel the same way with my corrected reply? Nope. Skip From fincher.8 at osu.edu Sun Oct 26 08:20:59 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Sun Oct 26 07:22:34 2003 Subject: [Python-Dev] product() In-Reply-To: <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260104.18806.aleaxit@yahoo.com> <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> Message-ID: <200310260820.59266.fincher.8@osu.edu> On Saturday 25 October 2003 11:36 pm, Guido van Rossum wrote: > Do we need allfalse() and anytrue() and anyfalse() too? These can all > easily be gotten by judicious use of 'not'. I think ABC has EACH, > SOME and NO (why not all four? who knows). There was a recent thread here ("Efficient predicates for the standard library") in which the names "any" and "all" were discussed rather than "anytrue" and "alltrue." Those are at least their common names in the functional programming languages I know, and it easily sidesteps the confusion that might be caused by having an "anytrue" without an "anyfalse" or an "alltrue" without an "allfalse." 
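[Editor's note: the semantics of the predicates under discussion can be sketched in a few lines of pure Python. `any()` and `all()` were in fact later added as builtins, in Python 2.5; the underscored names below just avoid shadowing them.]

```python
# Pure-Python sketch of the "any"/"all" predicates; combined with
# "not" on the predicate, they also cover anyfalse/allfalse.

def all_(iterable):
    for element in iterable:
        if not element:
            return False
    return True          # vacuously true for an empty iterable

def any_(iterable):
    for element in iterable:
        if element:
            return True
    return False

values = [1, 2, 3]
print(all_(x > 0 for x in values))      # alltrue  -> True
print(any_(x > 2 for x in values))      # anytrue  -> True
print(any_(not x > 0 for x in values))  # anyfalse -> False
print(all_(not x > 2 for x in values))  # allfalse -> False
```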
Jeremy From just at letterror.com Sun Oct 26 07:37:02 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 07:37:12 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.47210.64438.619480@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Nope. Ok :). Yet I think I'm starting to agree with you and Alex that := should mean "this name is NON-local". A couple more things: - I think augmented assignments CAN be made "rebinding" without breaking code, since currently a += 1 fails if a is neither local nor global. - Would := be allowed in statements like "self.a := 2"? It makes no sense, but since "(a, b) := (2, 3)" IS meaningful, what about "(a, b, self.c) = (1, 2, 3)"? Just From skip at manatee.mojam.com Sun Oct 26 08:00:50 2003 From: skip at manatee.mojam.com (Skip Montanaro) Date: Sun Oct 26 08:01:05 2003 Subject: [Python-Dev] Weekly Python Bug/Patch Summary Message-ID: <200310261300.h9QD0oB3015515@manatee.mojam.com> Bug/Patch Summary ----------------- 547 open / 4276 total bugs (+42) 205 open / 2432 total patches (+7) New Bugs -------- email/Generator.py: Incorrect header output (2003-10-20) http://python.org/sf/826756 Proto 2 pickle vs dict subclass (2003-10-20) http://python.org/sf/826897 wrong error message of islice indexing (2003-10-20) http://python.org/sf/827190 List comprehensions leaking control variable name deprecated (2003-10-20) http://python.org/sf/827209 Bug in dbm - long strings in keys and values (2003-10-21) http://python.org/sf/827760 object.h misdocuments PyDict_SetItemString (2003-10-21) http://python.org/sf/827856 ctime is not creation time (2003-10-21) http://python.org/sf/827902 Registry key CurrentVersion not set (2003-10-21) http://python.org/sf/827963 Idle fails on loading .idlerc if Home path changes. 
(2003-10-22) http://python.org/sf/828049 sdist generates bad MANIFEST on Windows (2003-10-22) http://python.org/sf/828450 bdist_rpm failure when no setup.py (2003-10-23) http://python.org/sf/828743 setattr(obj, BADNAME, value) does not raises exception (2003-10-24) http://python.org/sf/829458 os.makedirs() cannot handle "." (2003-10-24) http://python.org/sf/829532 __mul__ taken as __rmul__ for mul-by-int only (2003-10-25) http://python.org/sf/830261 python-mode.el: py-b-of-def-or-class looks inside strings (2003-10-25) http://python.org/sf/830347 Config parser don't raise DuplicateSectionError when reading (2003-10-26) http://python.org/sf/830449 New Patches ----------- absolute paths cause problems for MSVC (2003-10-21) http://python.org/sf/827386 SimpleHTTPServer directory-indexing "bug" fix (2003-10-21) http://python.org/sf/827559 Allow set swig include dirs in setup.py (2003-10-22) http://python.org/sf/828336 ref. manual talks of 'sequence' instead of 'iterable' (2003-10-23) http://python.org/sf/829073 Fixes smtplib starttls HELO errors (2003-10-24) http://python.org/sf/829951 itertoolsmodule.c: islice error messages (827190) (2003-10-25) http://python.org/sf/830070 python-mode.el: (py-point 'bod) doesn't quite work (2003-10-25) http://python.org/sf/830341 Closed Bugs ----------- asyncore unhandled write event (2002-03-10) http://python.org/sf/528295 missing important curses calls (2003-01-10) http://python.org/sf/665572 Problems with non-greedy match groups (2003-03-01) http://python.org/sf/695688 ncurses/curses on solaris (2003-03-10) http://python.org/sf/700780 sigwinch crashes python with curses (2003-06-14) http://python.org/sf/754455 asyncore with non-default map problems (2003-06-20) http://python.org/sf/758241 HTMLParser chokes on my.yahoo.com output (2003-06-26) http://python.org/sf/761452 minidom.py -- TypeError: object doesn't support slice assig (2003-07-25) http://python.org/sf/777884 xmlrpclib's functions dumps() and loads() not documented. 
(2003-09-19) http://python.org/sf/809174 Support for non-string data in ConfigParser unclear/buggy (2003-09-22) http://python.org/sf/810843 test_tempfile fails on windows if space in install path (2003-09-23) http://python.org/sf/811082 a Py_DECREF() too much (2003-09-25) http://python.org/sf/812353 new.function raises TypeError for some strange reason... (2003-09-28) http://python.org/sf/814266 webbrowser.open hangs under certain conditions (2003-10-02) http://python.org/sf/816810 Need "new style note" (2003-10-04) http://python.org/sf/817742 Ref Man Index: Symbols -- Latex leak (2003-10-08) http://python.org/sf/820344 tarfile exception on large .tar files (2003-10-13) http://python.org/sf/822668 urllib2 digest auth is broken (2003-10-14) http://python.org/sf/823328 code.InteractiveConsole interprets escape chars incorrectly (2003-10-17) http://python.org/sf/825676 reference to Built-In Types section in file() documentation (2003-10-17) http://python.org/sf/825810 Class Problem with repr and getattr on PY2.3.2 (2003-10-18) http://python.org/sf/826013 Closed Patches -------------- Mutable PyCObject (2001-11-02) http://python.org/sf/477441 (?(id/name)yes|no) re implementation (2002-06-23) http://python.org/sf/572936 Fixing recursive problem in SRE (2003-06-19) http://python.org/sf/757624 small fix for setup.py (2003-07-15) http://python.org/sf/772077 make test_fcntl 64bit clean (2003-09-13) http://python.org/sf/805626 NetBSD py_curses.h fix (2003-09-15) http://python.org/sf/806800 Fix many doc typos (2003-09-22) http://python.org/sf/810751 normalize whitespace (2003-09-25) http://python.org/sf/812378 Fix test_tempfile: space in Win32 install path bug #811082 (2003-09-26) http://python.org/sf/813200 _sre stack overflow on FreeBSD/amd64 and /sparc64 (2003-09-26) http://python.org/sf/813391 deprecated modules (2003-09-29) http://python.org/sf/814560 fix import problem(unittest.py) (2003-10-07) http://python.org/sf/819077 Updated .spec file. 
(2003-10-14) http://python.org/sf/823259 Add additional isxxx functions to string object. (2003-10-16) http://python.org/sf/825313 From pedronis at bluewin.ch Sun Oct 26 08:37:01 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Sun Oct 26 08:34:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <200310260329.h9Q3Trv08377@12-236-54-216.client.attbi.com> Message-ID: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> At 10:53 26.10.2003 +0100, Just van Rossum wrote: >But: let's not get carried away with this particular spelling, the main >question is: "is it a good idea to have a rebinding assignment >operator?" (regardless of how that operator is spelled). Needless to >say, I think it is. would you mind trying to express why? everybody is spending a lot of mental energy trying to figure out a sensible way to achieve this but only Guido has made explicit some 3rd party reasons to want it. I would like to read more rationales about why we need it so badly. Thanks.
From aleaxit at yahoo.com Sun Oct 26 09:16:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 09:18:27 2003 Subject: [Python-Dev] Re: closure semantics In-Reply-To: <16283.42825.957517.595315@montanaro.dyndns.org> References: <200310230015.h9N0Fm119158@oma.cosc.canterbury.ac.nz> <200310260131.47500.aleaxit@yahoo.com> <16283.42825.957517.595315@montanaro.dyndns.org> Message-ID: <200310261511.29081.aleaxit@yahoo.com> On Sunday 26 October 2003 11:51, Skip Montanaro wrote: ... > Just because I don't use a particular feature of the language doesn't > mean I have no interest in how the language evolves. I don't believe I Absolutely true. Any feature added to the language brings some weight to all, even those who will not use it (perhaps not much to those who will not use it AND only care about their own code, but I do believe that most should also care about _others'_ code, even if they don't realize that -- reusing others' code from the net, &c, are still possibilities). > ever disrespected your ideas or opinions. Why are you disrespecting > mine? Hell, why are you disrespecting me? I had no intention of expressing any disrespect to you. If I miscommunicated in this regard, I owe you an apology. As for opinions based on only caring about one's own code, I am, however, fully entitled to meta-opine that such opinions are too narrowly based, and that not considering the coding behavior of others is near-sighted. > I would be more than happy if nested scopes weren't in the language. > Their absence would also make your teaching, advising, mentoring, > maintenance and enhancing simpler. I haven't proposed that they be > removed, though that would be rather clean way to solve this problem. Of course such a proposal would have to wait for 3.0 (i.e.
who knows when) given backwards incompatibility. Personally, I think that would just bring back all the "foo=foo, bar=bar" default-argument abuse that we used to have before nested scopes appeared, and therefore would not make any of my activities substantially simpler nor more productive (even discounting the large work of porting code across such a jump in semantics -- I think that could be eased by tools systematically _introducing_ default-argument abuse, but the semantics of such 'snapshotting' is still far enough from today's nested arguments to require plenty of manual inspection and changing). > Alex, if a qualification for discussing improvements to Python is that > one use every aspect of the language, please pronounce. I'll be happy to > butt out of your turf. You got the wrong guy: I don't get to pronounce, and this ain't my turf. I only get to plead, cajole, whine, argue, entreaty, advocate, propose, appeal, supplicate, contend, suggest, insist, agree, and disagree, just like everybody else. Alex From ncoghlan at iinet.net.au Sun Oct 26 09:39:29 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sun Oct 26 09:39:36 2003 Subject: [Python-Dev] product() In-Reply-To: <200310260820.59266.fincher.8@osu.edu> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260104.18806.aleaxit@yahoo.com> <200310260336.h9Q3aKg08424@12-236-54-216.client.attbi.com> <200310260820.59266.fincher.8@osu.edu> Message-ID: <3F9BDCA1.5040101@iinet.net.au> Jeremy Fincher strung bits together to say: > On Saturday 25 October 2003 11:36 pm, Guido van Rossum wrote: > >>Do we need allfalse() and anytrue() and anyfalse() too? These can all >>easily be gotten by judicious use of 'not'. I think ABC has EACH, >>SOME and NO (why not all four? who knows). > > There was a recent thread here ("Efficient predicates for the standard > library") in which the names "any" and "all" were discussed rather than > "anytrue" and "alltrue." 
Those are at least their common names in the > functional programming languages I know, and it easily sidesteps the > confusion that might be caused by having an "anytrue" without an "anyfalse" > or an "alltrue" without an "allfalse." >>> if all(pred(x) for x in values): pass # alltrue >>> if any(pred(x) for x in values): pass # anytrue >>> if any(not pred(x) for x in values): pass # anyfalse >>> if all(not pred(x) for x in values): pass # allfalse The names from the earlier thread do read nicely. . . Alternately, getting away with just one function: >>> if all(pred(x) for x in values): pass # alltrue >>> if not all(not pred(x) for x in values): pass # anytrue >>> if not all(pred(x) for x in values): pass # anyfalse >>> if all(not pred(x) for x in values): pass # allfalse I don't know about anyone else, but the double negative required to express "any" in terms of "all" hurts my brain. . . Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From aleaxit at yahoo.com Sun Oct 26 10:23:54 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 10:24:02 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261623.54136.aleaxit@yahoo.com> On Sunday 26 October 2003 13:37, Just van Rossum wrote: > Skip Montanaro wrote: > > Nope. > > Ok :). Yet I think I'm starting to agree with you and Alex that := > should mean "this name is NON-local". The more I think about it, the more I like it in its _simplest_ form. > A couple more things: > > - I think augmented assignments CAN be made "rebinding" without breaking > code, since currently a += 1 fails if a is neither local nor global. You are right about the breaking code, but I would still slightly prefer to eschew this just for simplicity -- see also below. > - Would := be allowed in statements like "self.a := 2"? 
It makes no > sense, but since "(a, b) := (2, 3)" IS meaningful, what about > "(a, b, self.c) = (1, 2, 3)"? I would not allow := in any but the SIMPLEST case: simple assignment to a bare name, no unpacking (I earlier said "no packing" but that's silly and I misspoke there -- "a := 3, 4, 5" WOULD of course be fine), no chaining, no := when the LHS is an indexing, slicing, attribute access. Keeping := Franciscan in its simplicity would make it easiest to implement, easiest to explain, AND avoid all sorts of confusing cases where the distinction between := and = would otherwise be confusingly nonexistent. It would also make it most effective because it always means the same thing -- "assignment to (already-existing) nonlocal". This is much the spirit in which I'd forego the idea of making += etc access nonlocals too, though I guess I'm only -0 on that; it seems simplest and most effective to have the one concept "rebinding a nonlocal name" correspond in a strict 1-1 way to the one notation := . Simplicity and effectiveness feel very Pythonic to me. I think rebinding nonlocals should be rare enough that the fact of having to write e.g. "a := a+1" rather than "a += 1" is a very minor problem. The important use case of += & friends, "xop[flap].quip(glop).nip[zap] += 1", gets no special benefit from += being deemed "rebinding" -- the rebinding concept applies usefully to bare names, and for a bare name writing name := name RHS is no big deal wrt name = RHS If name's a huge list, name.extend(anotherlist) is a fine substitute for name += anotherlist if you want to keep name nonlocal AND get some efficiency gain. Other containers with similar issues should also always supply a more readable synonym to __iadd__ for such uses, e.g. sets do, supplying union_update. So, keeping += &c just like today seems acceptable and preferable. Alex From pje at telecommunity.com Sun Oct 26 10:41:43 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Sun Oct 26 10:41:30 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310252316.h9PNGZc08136@12-236-54-216.client.attbi.com> References: <200310230407.h9N473O20346@oma.cosc.canterbury.ac.nz> <200310251132.04686.aleaxit@yahoo.com> <200310252020.h9PKKjL07657@12-236-54-216.client.attbi.com> <200310260111.51509.aleaxit@yahoo.com> Message-ID: <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> At 04:16 PM 10/25/03 -0700, Guido van Rossum wrote: > > > No way. There's nothing that guarantees that a+=b has the same > > > semantics as a+b, and in fact for lists it doesn't. > > > > You mean because += is more permissive (accepts any sequence > > RHS while + insists the RHS be specifically a list)? I don't see how > > this would make it bad to use += instead of + -- if we let the user > > sum up a mix of (e.g.) strings and tuples, why are we hurting him? > >We specifically decided that sum() wasn't allowed for strings, because >it's a quadratic algorithm. Other sequences are just as bad, we just >didn't expect that to be a common case. > >Also see my not-so-far-fetched example of a semantic change. Maybe I'm confused, but when Alex first proposed this change, I mentally assumed that he meant he would change it so that the *first* addition would use + (in order to ensure getting a "fresh" object) and then subsequent additions would use +=. If this were the approach taken, it seems to me that there could not be any semantic change or side-effects for types that have compatible meaning for + and += (i.e. += is an in-place version of +). Maybe I'm missing something here? 
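The asymmetry between + and += for lists that Guido alludes to is easy to demonstrate; a minimal sketch in modern Python syntax (the variable names are invented for illustration):

```python
alist = [1, 2]

# "+" insists the right-hand side is also a list...
try:
    alist = alist + (3,)
except TypeError:
    pass  # can only concatenate list (not "tuple") to list

# ...while "+=" (list.__iadd__) accepts any iterable, like extend()
alist += (3,)
print(alist)  # [1, 2, 3]
```

So switching sum()'s internals from + to += would silently widen what sum() accepts, which is exactly the semantic change under debate here.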
From aahz at pythoncraft.com Sun Oct 26 10:46:26 2003 From: aahz at pythoncraft.com (Aahz) Date: Sun Oct 26 10:46:29 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310261623.54136.aleaxit@yahoo.com> References: <200310261623.54136.aleaxit@yahoo.com> Message-ID: <20031026154626.GA18564@panix.com> On Sun, Oct 26, 2003, Alex Martelli wrote: > > Keeping := Franciscan in its simplicity would make it easiest to > implement, easiest to explain, AND avoid all sort of confusing cases > where the distinction between := and = would otherwise be confusingly > nonexistent. It would also make it most effective because it always > means the same thing -- "assignment to (already-existing) nonlocal". > This is much the spirit in which I'd forego the idea of making += etc > access nonlocals too, though I guess I'm only -0 on that; it seems > simplest and most effective to have the one concept "rebinding a > nonlocal name" correspond in strict 1-1 way to the one notation := . > Simplicity and effectiveness feel very Pythonic to me. Sounds good to me. Question: what does this do? def f(): def g(x): z := x g(3) print z return g g = f() print z g('foo') print z That is, in the absence of a pre-existing binding, where does the binding for := go? I think it should be equivalent to global, going to the module scope. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From just at letterror.com Sun Oct 26 10:54:30 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 10:54:36 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> Message-ID: Samuele Pedroni wrote: > >But: let's not get carried away with this particular spelling, the > >main question is: "is it a good idea to have a rebinding assignment > >operator?" (regardless of how that operator is spelled). Needless to > >say, I think it is. 
> would you mind trying to express why? everybody is spending a lot of > mental energy trying to figure out a sensible way to achieve this > but only Guido has made explicit some 3rd party reasons to want it. I > would like to read more rationales about why we need it so badly. My question above is misleading with respect to my personal feelings about the issue. It should have been: """*If* we decide we need to be able to assign to names in outer scopes, would it be a good idea to add a rebinding operator?""" I actually don't care much whether the capability is added or not, but *if* we add it, I'd much rather see a rebinding operator than an extension to the global statement or a new declarative statement. Just From just at letterror.com Sun Oct 26 10:58:13 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 10:58:17 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031026154626.GA18564@panix.com> Message-ID: Aahz wrote: > Sounds good to me. Question: what does this do? > > def f(): > def g(x): > z := x > g(3) > print z > return g > g = f() > print z > g('foo') > print z > > That is, in the absence of a pre-existing binding, where does the > binding for := go? I think it should be equivalent to global, going > to the module scope. I think it should raise NameError or UnboundLocalError or a new NameError subclass. "In the face of ambiguity, etc." Just From pje at telecommunity.com Sun Oct 26 11:06:10 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Sun Oct 26 11:05:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <5.2.1.1.0.20031026140652.027b3f78@pop.bluewin.ch> Message-ID: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com> At 04:54 PM 10/26/03 +0100, Just van Rossum wrote: >My question above is misleading with respect to my personal feelings >about the issue.
It should have been: > >"""*If* we decide we need to be able to assign to names in outer scopes, >would it be a good idea to add a rebinding operator?""" > >I actually don't care much whether the capability is added or not, but >*if* we add it, I'd much rather see a rebinding operator than an >extension to the global statement or a new declarative statement. If we have a rebinding operator, I'd rather it be something considerably more visible than the presence or absence of a ':' on an assignment statement. So far, all the examples have been downright scary in the invisibility of what's happening. Mostly, I can imagine some poor sap trying to debug a program that uses := and is missing one somewhere or has one where it's not intended -- and hoping that poor sap won't be me. :) I've mostly stayed out of this discussion, but so far something like the scope(function).variable proposals, with perhaps a special case for scope(global) or scope(globals) seems to me like the way to go. It seems very Pythonic, in that it is explicit and calls attention to the fact that something special is going on, in a way that ':=' does not. And 'scope' can be looked up in a manual more easily than ':=' can. Last, but not least, ':=' looks enough like normal assignment in other languages, that somebody just plain might not notice that they *need* to look it up. From aleaxit at yahoo.com Sun Oct 26 11:18:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:19:01 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> References: <200310260111.51509.aleaxit@yahoo.com> <5.1.0.14.0.20031026103718.03f76e70@mail.telecommunity.com> Message-ID: <200310261718.48377.aleaxit@yahoo.com> On Sunday 26 October 2003 04:41 pm, Phillip J. Eby wrote: > At 04:16 PM 10/25/03 -0700, Guido van Rossum wrote: > > > > No way.
There's nothing that guarantees that a+=b has the same > > > > semantics as a+b, and in fact for lists it doesn't. ... > assumed that he meant he would change it so that the *first* addition would > use + (in order to ensure getting a "fresh" object) and then subsequent > additions would use +=. A better architecture than the initial copy.copy I was now thinking of -- thanks. But it doesn't solve Guido's objection as above shown. > If this were the approach taken, it seems to me that there could not be any > semantic change or side-effects for types that have compatible meaning for > + and += (i.e. += is an in-place version of +). > > Maybe I'm missing something here? Only the fact that "there's nothing that guarantees" this, as Guido says. alist = alist + x only succeeds if x is also a list, while alist += x succeeds also for tuples and other sequences, for example. Personally, I don't think this would be a problem, but it's not my decision. Alex From skip at pobox.com Sun Oct 26 11:19:32 2003 From: skip at pobox.com (Skip Montanaro) Date: Sun Oct 26 11:19:48 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: <16283.47210.64438.619480@montanaro.dyndns.org> Message-ID: <16283.62484.474269.27181@montanaro.dyndns.org> Just> - Would := be allowed in statements like "self.a := 2"? It makes Just> no sense, but since "(a, b) := (2, 3)" IS meaningful, what about Just> "(a, b, self.c) = (1, 2, 3)"? Ummm... This doesn't seem to be strengthening your argument. ;-) Skip From aleaxit at yahoo.com Sun Oct 26 11:20:16 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:20:42 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310261720.16457.aleaxit@yahoo.com> On Sunday 26 October 2003 04:58 pm, Just van Rossum wrote: > Aahz wrote: > > Sounds good to me. Question: what does this do?
> > def f(): > > def g(x): > > z := x > > g(3) > > print z > > return g > > g = f() > > print z > > g('foo') > > print z > > > > That is, in the absence of a pre-existing binding, where does the > > binding for := go? I think it should be equivalent to global, going > > to the module scope. > > I think it should raise NameError or UnboundLocalError or a new > NameError subclass. "In the face of ambiguity, etc." Absolute agreement here. I think a new subclass of NameError would be best. The simpler and more limited the functionality of :=, the more effective I think it will be. Alex From pf_moore at yahoo.co.uk Sun Oct 26 11:21:26 2003 From: pf_moore at yahoo.co.uk (Paul Moore) Date: Sun Oct 26 11:21:23 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> <200310261109.56801.aleaxit@yahoo.com> Message-ID: Alex Martelli writes: > Unfortunately, with the sum change regressed, d.asum times to > 8.4e+03 usec per loop, so it clearly cannot be considered any > more:-). So, there might be space for an accumulator function > patterned on map but [a] which stops on the shortest sequence > like zip and [b] does NOT build a list of results, meant to be called > a bit like map is in the 'amap' example above. itertools is a great > little collection of producers and manipulators of iterators, but the > "accumulator functions" might provide the "one obvious way" to > _consume_ iterators for common cases; and accumulating by > calling an accumulator-object's mutator method, such as > tot.extend above, on all items of an iterator, clearly is pretty common. I *think* I see what you're getting at here, but I'm struggling to follow in the absence of concrete use cases.
As we're talking about library functions, I'd suggest that your suggested "accumulator functions" start their life as an external module - maybe even in Python, although I take your point about the speed advantages of C. With a bit of "real life" use, migration into the standard library might be more of an obvious step. Paul. -- This signature intentionally left blank From aleaxit at yahoo.com Sun Oct 26 11:23:20 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:23:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031026154626.GA18564@panix.com> References: <200310261623.54136.aleaxit@yahoo.com> <20031026154626.GA18564@panix.com> Message-ID: <200310261723.20026.aleaxit@yahoo.com> On Sunday 26 October 2003 04:46 pm, Aahz wrote: > On Sun, Oct 26, 2003, Alex Martelli wrote: ... > > nonexistent. It would also make it most effective because it always > > means the same thing -- "assignment to (already-existing) nonlocal". ... > Sounds good to me. Question: what does this do? > > def f(): > def g(x): > z := x ... > That is, in the absence of a pre-existing binding, where does the > binding for := go? I think it should be equivalent to global, going to > the module scope. I think it should raise some subclass of NameError, because it's not an assignment to an _already-existing_ nonlocal, as per my text quoted above. It does not seem to me that "nested functions able to rebind module-level names" has compelling use cases, so I would prefer the simplicity of forbidding this usage.
Alex From aleaxit at yahoo.com Sun Oct 26 11:24:56 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 11:24:59 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <16283.62484.474269.27181@montanaro.dyndns.org> References: <16283.47210.64438.619480@montanaro.dyndns.org> <16283.62484.474269.27181@montanaro.dyndns.org> Message-ID: <200310261724.56194.aleaxit@yahoo.com> On Sunday 26 October 2003 05:19 pm, Skip Montanaro wrote: > Just> - Would := be allowed in statements like "self.a := 2"? It makes > Just> no sense, but since "(a, b) := (2, 3)" IS meaningful, what > about Just> "(a, b, self.c) = (1, 2, 3)"? > > Ummm... This doesn't seem to be strengthening your argument. ;-) Indeed, I think the argument is stronger -- and := is more useful -- if a, b := 2, 3 and self.a := 2 and all such non-elementary variations of assignment are NOT allowed with := . Alex From just at letterror.com Sun Oct 26 11:25:23 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 11:25:50 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com> Message-ID: Phillip J. Eby wrote: > If we have a rebinding operator, I'd rather it be something > considerably more visible than the presence or absence of a ':' on an > assignment statement. I don't know, but somehow I don't have a problem spotting augmented assignments, so I don't think := will be as hard to miss as you suggest. > So far, all the examples have been downright scary in the > invisibility of what's happening. Mostly, I can imagine some poor > sap trying to debug a program that uses := and is missing one > somewhere or has one where it's not intended -- and hoping that poor > sap won't be me. :) How is that different from a '-=' that should have been a plain '='? Also, if := is disallowed to rebind in the _same_ scope, this problem would be spotted by the compiler. 
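None of the rebinding spellings debated in this thread existed in the Python of the time; the working idiom for rebinding an enclosing function's name was to mutate a container instead, so that no declaration or new operator was needed. A minimal sketch of that workaround, in modern syntax:

```python
def make_counter():
    count = [0]          # a one-element list stands in for a rebindable cell
    def bump():
        count[0] += 1    # mutation, not rebinding: no declaration required
        return count[0]
    return bump

bump = make_counter()
print(bump(), bump(), bump())  # 1 2 3
```

The closure never rebinds `count` itself, only the object it refers to, which is exactly the distinction the := proposals were trying to make unnecessary.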
> I've mostly stayed out of this discussion, but so far something like > the scope(function).variable proposals, with perhaps a special case > for scope(global) or scope(globals) seems to me like the way to go. > It seems very Pythonic, in that it is explicit and calls attention to > the fact that something special is going on, in a way that ':=' does > not. The reverse argument can be made, too: := calls attention to the fact that something is happening right there, whereas a declaration may be many lines away. > And 'scope' can be looked up in a manual more easily than ':=' > can. Last, but not least, ':=' looks enough like normal assignment > in other languages, that somebody just plain might not notice that > they *need* to look it up. That's a good point. Just From arigo at tunes.org Sun Oct 26 11:23:25 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun Oct 26 11:27:20 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: <200310261109.56801.aleaxit@yahoo.com> References: <200310260145.23094.aleaxit@yahoo.com> <200310260326.h9Q3Q3708345@12-236-54-216.client.attbi.com> <200310261109.56801.aleaxit@yahoo.com> Message-ID: <20031026162325.GA4113@vicky.ecs.soton.ac.uk> Hello, On Sun, Oct 26, 2003 at 11:09:56AM +0100, Alex Martelli wrote: > > Oh, but we all *did* think of it. For strings. :-) > ... When X is intended as a number class, this > asymmetry between multiplication and (e.g.) addition violates > the principle of least surprise. I must admit I was a bit surprised when I first tested sum(), without first reading its doc, because I thought I knew what it should do. I expected it to be a fast equivalent to: def sum(seq, start=0): for item in seq: start = start + item return start or: reduce(operator.add, seq, start) I immediately tried it with strings and lists. I immediately thought about lists because of their use of "+" for concatenation.
So it seems that neither strings nor lists are properly supported, nor tuples by the way, and my opinion on this is that it strongly contradicts the principle of least surprise. I would not object to an implementation of sum() that special-cases lists, tuples and strings for efficiency. (by which I mean I can contribute a patch) > language or library feature. The problem of the += loop on strings is > essentially solved by psyco, which has tricks to catch that and make > it almost as fast as ''.join; but psyco can't get into a built-in function > such as sum, and thus can't help us with the performance trap there. Supporting sum() in Psyco is no big issue, and it could help the same way as it does for str.__add__. (It is not explicitly supported yet, but it could be added.) Still I believe that getting the complexity right in CPython is important, when it can be done. Armin From fincher.8 at osu.edu Sun Oct 26 13:10:27 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Sun Oct 26 12:12:15 2003 Subject: [Python-Dev] product() In-Reply-To: <3F9BDCA1.5040101@iinet.net.au> References: <002401c39907$0176f5a0$e841fea9@oemcomputer> <200310260820.59266.fincher.8@osu.edu> <3F9BDCA1.5040101@iinet.net.au> Message-ID: <200310261310.27950.fincher.8@osu.edu> On Sunday 26 October 2003 09:39 am, Nick Coghlan wrote: > >>> if all(pred(x) for x in values): pass # alltrue > >>> if any(pred(x) for x in values): pass # anytrue Yeah, that does read nicely, which is why I think it's pretty common in FPLs.
> >>> if any(not pred(x) for x in values): pass # anyfalse I've always expressed this as: if not all(pred(x) for x in values): pass > >>> if all(not pred(x) for x in values): pass # allfalse And this as: if not any(pred(x) for x in values): pass It's slightly more efficient (only one negation), and it seems to better maintain the pseudocode-like aspect that we so much adore in Python :) Jeremy From arigo at tunes.org Sun Oct 26 12:11:48 2003 From: arigo at tunes.org (Armin Rigo) Date: Sun Oct 26 12:15:49 2003 Subject: [Python-Dev] PyPy: sprint and news Message-ID: <20031026171148.GC16738@vicky.ecs.soton.ac.uk> PyPy Sprint announcement & news from the project ================================================ We are coming close to a first experimental release of PyPy, a more flexible Python implementation written in Python. The sprint to make this happen will take place in Amsterdam, a city known to be reachable by cheap flights :-) This is 1) the announcement for the sprint; 2) news about the current state of PyPy; 3) some words about a proposal we recently submitted to the European Union. Amsterdam Sprint Details ------------------------ The Sprint will take place from the 14th of December to the 21st of December at the "Vrije Universiteit in Amsterdam", 14th-21st Dec 2003, thanks to Etienne Posthumus, who helps us to organize the event. The main goal will be a complete C translation of PyPy, probably still using a hacked Pyrex-variant as an intermediate layer and using CPython's runtime. We also plan to work on some fun frontends to PyPy like one based on pygame or a web browser to visualize interactions between interpreter and objectspace. If you want to participate in the sprint, please subscribe here http://codespeak.net/mailman/listinfo/pypy-sprint and list yourself on this wiki page http://codespeak.net/moin/pypy/moin.cgi/AmsterdamSprint where you will also find more information as the sprint date approaches.
If you are just interested but don't know whether you will come, then only subscribe to the mailing list. State of the PyPy project -------------------------- PyPy works pretty well but still on top of CPython. The double interpretation penalty makes it - as expected - incredibly slow :-) In the Berlin sprint we have thus started to work on the "translation" part, i.e. how this code should be translated into C. We can now translate simple functions to C-like code including some type annotations. For convenience, we are reusing a modified subset of Pyrex to generate the low-level C code. Thanks to Seo (who joined the project from South Korea) we also have a Lisp backend to fuel the endless c.l.py threads about python versus lisp :-) The goal of the next sprint is to complete this work so that we can translate the complete PyPy source into a huge Pyrex module, and then a big CPython extension module. True, the result is not independent from CPython, as it runs reusing its runtime environment. But it's probably an interesting enough state to make a public release from. The translation is done by generating a control flow of functions by means of abstract interpretation. IOW, we run the PyPy interpreter with a custom object space ("flowobjspace") which generates a control flow graph (including the elementary operations) which is then used to generate low-level code for backends. We also have preliminary type inference on the graphs, which can be used by the Pyrex generator to emit some C type declarations. Writing transformations and analysis of these graphs and displaying them with GraphViz's 'dot' is great fun! We certainly have a greater need than ever for graphical interactive tools to see, understand and debug all these graph manipulations and run tests of them. Currently it is a bit difficult to write a test that checks that a transformed graph "looks right"!
What we expect from the Amsterdam sprint is thus: - a big not-too-slow "cpypy.so" extension module for CPython, where at least integer arithmetic is done efficiently - interactive tools to display and debug and test PyPy, visualizing control flow, call-graphs and state models. - improving and rewriting our testing tools to give us more control over the testing process, and to allow more interactive testing sessions. Other interesting News ---------------------- Before mid October, we also had a quite different Sprint. It was an approximately 10-day effort towards submitting a proposal to the EU. If it is accepted we will have resources to fund some developers working full- or part-time on the project. However, our "sprint driven development" will continue to play the central role for development of PyPy. There are especially two technical sections of the proposal which you might find interesting to read: "Scientific and technological objectives": http://codespeak.net/pypy/index.cgi?doc/funding/B1.0 "Detailed implementation plan" http://codespeak.net/pypy/index.cgi?doc/funding/B6.0 Maybe you want to read the whole proposal for other reasons, too, like making an EU project of your own or competing with us. Actually, with our sprints there is usually a lot of room for cooperation :-) Anyway, here is the PDF-url: http://codespeak.net/svn/pypy/trunk/doc/funding/proposal/part_b.pdf Everybody who thinks that he/she could help on the project is invited to join!
Btw, the latest discussions about our sprint goals usually take place on the pypy-dev list: http://codespeak.net/mailman/listinfo/pypy-dev have fun, Armin & Holger From just at letterror.com Sun Oct 26 12:31:22 2003 From: just at letterror.com (Just van Rossum) Date: Sun Oct 26 12:31:22 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310261623.54136.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > > - I think augmented assignments CAN be made "rebinding" without > > breaking code, since currently a += 1 fails if a is neither local > > nor global. > > You are right about the breaking code, but I would still slightly > prefer to eschew this just for simplicity -- see also below. [ ... ] > I think rebinding nonlocals should be rare enough that the fact of > having to write e.g. "a := a+1" rather than "a += 1" is a very minor > problem. [ ... ] Minor, sure, but I think it's an unnecessary restriction, just like many people think Python's current inability to assign to outer scopes is unnecessary. If we have a rebinding operator, it'll be very surprising if augmented assignment ISN'T rebinding. It's just such a natural fit. Just From python at rcn.com Sun Oct 26 12:34:55 2003 From: python at rcn.com (Raymond Hettinger) Date: Sun Oct 26 12:37:25 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <200310261718.48377.aleaxit@yahoo.com> Message-ID: <000001c39be7$90fc2460$d4b8958d@oemcomputer> > > If this were the approach taken, it seems to me that there could not be > any > > semantic change or side-effects for types that have compatible meaning > for > > + and += (i.e. += is an in-place version of +). > > > > Maybe I'm missing something here? > > Only the fact that "there's nothing that guarantees" this, as Guido says. > alist = alist + x only succeeds if x is also a list, while alist += x > succeeds > also for tuples and other sequences, for example. > > Personally, I don't think this would be a problem, but it's not my > decision.
In the context of sum(), I think it would be nice to allow iterables to be added together: sum(['abc', range(3), ('do', 're', 'me')], []) This fits in well with the current thinking that the prohibition of adding sequences of unlike types be imposed only on operators and not on functions or methods. For instance, in sets.py, a|b requires both a and b to be Sets; however, a.union(b) allows b to be any iterable. That matches the distinction between list.__iadd__() and list.extend() where the former requires a list argument and the latter does not. Raymond Hettinger From aleaxit at yahoo.com Sun Oct 26 12:48:30 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 12:48:35 2003 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2 In-Reply-To: References: <200310261109.56801.aleaxit@yahoo.com> Message-ID: <200310261848.30366.aleaxit@yahoo.com> On Sunday 26 October 2003 05:21 pm, Paul Moore wrote: ... > I *think* I see what you're getting at here, but I'm struggling to > follow in the absence of concrete use cases. As we're talking about Assuming the simplest definition, equivalent to: def loop(bound_method, it): for item in it: bound_method(item) typical simple use cases might be, e.g.: Merge a stream of dictionaries into one dict: merged_dict = {} loop(merged_dict.update, stream_of_dictionaries) rather than: merged_dict = {} for d in stream_of_dictionaries: merged_dict.update(d) Add a bunch of sequences into one list: all_the_seqs = [] loop(all_the_seqs.extend, bunch_of_seqs) rather than: all_the_seqs = [] for s in bunch_of_seqs: all_the_seqs.extend(s) Add two copies of each of a bunch of sequences ditto: all_the_seqs = [] loop(all_the_seqs.extend, s+s for s in bunch_of_seqs) ditto but only for sequences which have 23 somewhere in them: seqs_with_23 = [] loop(seqs_with_23.extend, s for s in bunch_of_seqs if 23 in s) and so on. There are no doubt possibly more elegant ways, e.g.
def init_and_loop(initvalue, unboundmethod, it, *its): for items in itertools.izip(it, *its): unboundmethod(initvalue, *items) return initvalue which would allow, e.g., merged_dict = init_and_loop({}, dict.update, stream_of_dictionaries) or other variants yet, but the use cases are roughly the same. The gain of such tiny "accumulator functions" (consuming one or more iterators by just passing their items to some mutating-method and ignoring the latter's results) is essentially conceptual -- it's not a matter of saving a couple of lines at the point of use, nor of saving some "bananoseconds" if the accumulator functions are implemented in C, when compared to the coded-out loops. Rather, such functions would allow "more declarative style" presentation (of underlying functionality that remains imperative): expressions feel more "declarative", stylistically, to some, while spelling a few steps out feels more "imperative". We've had this preference explained recently on this group, and others over in c.l.py are breaking out the champagne at the news of list.sorted for essentially the same motivation. > library functions, I'd suggest that your suggested "accumulator > functions" start their life as an external module - maybe even in > Python, although I take your point about the speed advantages of C. Absolutely. It's not _my_ suggestion to have more accumulator functions -- it came up repeatedly on the threads started by Peter Norvig's original proposal about accumulation, and Guido mentioned them in the 'product' thread I believe (where we also discussed 'any', 'all' etc, if I recall correctly). I don't think anybody's ever thought of making these built-ins.
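The `loop` helper Alex sketches above is easy to make concrete; here is a self-contained version in modern Python, with sample data invented for illustration:

```python
def loop(bound_method, it):
    # consume an iterable by feeding each item to a mutating bound method,
    # ignoring the method's return values
    for item in it:
        bound_method(item)

# merge a stream of dictionaries into one dict
merged_dict = {}
loop(merged_dict.update, [{'a': 1}, {'b': 2}])
print(merged_dict)  # {'a': 1, 'b': 2}

# flatten a bunch of sequences into one list
all_the_seqs = []
loop(all_the_seqs.extend, [[1, 2], (3, 4)])
print(all_the_seqs)  # [1, 2, 3, 4]
```

Note how the second call sidesteps the +/+= debate entirely: `extend` happily accepts a tuple, where `sum(..., [])` built on `+` would not.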
But if that external module[s] (one or more) is/are not part of the Python 2.4 library, if 2.4 does not come with a selection of accumulation functions [not necessarily including that 'loop' &c above mentioned, though I think something like that might help], I don't think we can have the "accumulation functionality" -- we only have great ways to make and express iterators but not many great ways to _consume_ them (and most particularly if sum, one of the few "good iterator consumers" we have, is practically unusable for iterators whose items are lists..). > With a bit of "real life" use, migration into the standard library > might be more of an obvious step. You could have said the same of itertools before 2.3, but I think it was a great decision to accept them into the standard library instead; 2.3 would be substantially poorer without them. With an even richer complement of iterator tools in itertools, and the new "generator expressions" to give us even more great ways to make iterators, I think a module of "iterator consumers", also known as accumulation functions, would be a good idea. Look at Peter Norvig's original ideas for some suggestions, for example. Which reminds me of an issue with Top(10), but, this is a long enough post, so I think I should write a separate one about that. Alex From aleaxit at yahoo.com Sun Oct 26 12:51:04 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Sun Oct 26 12:51:10 2003 Subject: [Python-Dev] Re: accumulator display syntax In-Reply-To: <000001c39be7$90fc2460$d4b8958d@oemcomputer> References: <000001c39be7$90fc2460$d4b8958d@oemcomputer> Message-ID: <200310261851.04820.aleaxit@yahoo.com> On Sunday 26 October 2003 06:34 pm, Raymond Hettinger wrote: ... > b to be Sets; however, a.union(b) allows b to be any iterable. The > matches the distinction between list.__iadd__() and list.extend() where > the former requires a list argument and the latter does not. What distinction...? 
>>> x=range(3)
>>> x.__iadd__('foo')
[0, 1, 2, 'f', 'o', 'o']
>>> x
[0, 1, 2, 'f', 'o', 'o']
>>>

did you mean list.__add__()...?  list.__iadd__ IS just as permissive
as list.extend, it seems to me.

Alex

From python at rcn.com  Sun Oct 26 12:56:40 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 26 12:57:32 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: <200310261851.04820.aleaxit@yahoo.com>
Message-ID: <000301c39bea$8230e1c0$d4b8958d@oemcomputer>

> did you mean list.__add__()...?  list.__iadd__ IS just as permissive
> as list.extend, it seems to me.

Hmm, I did mean __iadd__() but misremembered what it did.

Raymond

From zack at codesourcery.com  Sun Oct 26 13:14:37 2003
From: zack at codesourcery.com (Zack Weinberg)
Date: Sun Oct 26 13:14:42 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
Message-ID: <87k76rhnn6.fsf@egil.codesourcery.com>

I like Just's := concept except for the similarity to =, and I worry
that the presence of := in the language will flip people into "Pascal
mode" -- thinking that = is the equality operator.  I also think that
the notation is somewhat unnatural -- "globalness" is a property of the
_variable_, not the operator.  So I'd like to suggest instead

    :var = value        # var in module scope
    :scope:var = value  # var in named enclosing scope

An advantage of this notation is that it can be used anywhere, not just
in an assignment.  This has primary value for people reading the code --
if you have a fairly large class method that uses a module variable (not
by assigning it) somewhere in the middle, writing it :var means the
reader knows to go look for the assignment way up top.

This should obviously be optional, to preserve backward compatibility.
zw

From aleaxit at yahoo.com  Sun Oct 26 13:20:52 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 26 13:20:59 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <20031026162325.GA4113@vicky.ecs.soton.ac.uk>
References: <200310261109.56801.aleaxit@yahoo.com> <20031026162325.GA4113@vicky.ecs.soton.ac.uk>
Message-ID: <200310261920.52477.aleaxit@yahoo.com>

On Sunday 26 October 2003 05:23 pm, Armin Rigo wrote:
   ...
> I must admit I was a bit surprised when I first tested sum(), without first
> reading its doc because I thought I knew what it should do.  I expected it
> to be a fast equivalent to:
>
>     def sum(seq, start=0):
>         for item in seq:
>             start = start + item
>         return start

It IS equivalent to that -- plus an explicit typetest to raise if start
is an instance of str or unicode.  I had originally tried forwarding to
''.join for strings, but Guido preferred to forbid them, and it still
doesn't look like a problem to me.  Alas, "fast" is debatable:-).

>     reduce(operator.add, seq, start)

sum doesn't reproduce reduce's quirk of using the first item of seq if
start is not given.  So, the def version is closer.

> I immediately tried it with strings and lists.  I immediately thought about
> lists because of their use of "+" for concatenation.
>
> So it seems that neither strings nor lists are properly supported, neither

Strings are explicitly disallowed, so that should take care of the
surprise factor for that specific case.  As for lists, the semantics are
right, the speed is not (could be made way faster with modest effort).
Same for other mutable sequences.  As for tuples and other immutable
sequences, they ARE treated exactly like your 'def' above (roughly like
your reduce) would treat them -- not very fast, but if all you know
about something is that it's an immutable sequence, there's nothing more
you can do.
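For the record, the semantics described here can be checked directly
(and modern Python still behaves the same way: lists and tuples go
through the generic + path, while a str start value trips the explicit
typetest):

```python
# sum() concatenates lists and tuples via the generic "+" path
# (correct semantics, even if not fast for long sequences):
assert sum([[0, 1], [2]], []) == [0, 1, 2]
assert sum([(0, 1), (2,)], ()) == (0, 1, 2)

# ...but an explicit typetest rejects a str start value,
# pointing the user at ''.join instead:
try:
    sum(['ab', 'cd'], '')
    assert False, "expected sum() to refuse a str start value"
except TypeError:
    pass
```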
The use case of making a huge tuple from many smaller ones seems weird
enough that I don't see specialcasing tuples specifically as
particularly worthwhile (other immutable sequences that aren't exactly
tuples would still suffer, after all).

> tuples by the way, and my opinion on this is that it strongly contradicts
> the principle of least surprise.

For mutable sequences, I agree.  For immutable ones, I don't see the
performance trap as being a practical problem for tuples (and weirder
things) -- it WOULD be for strings, but as we specifically disallow them
with a message reminding the user of ''.join, in practice the problem
seems minor.  Maybe I'm coming to this with a too-pragmatical
stance...?

> I would not object to an implementation of sum() that special-cases lists,
> tuples and strings for efficiency.  (by which I mean I can contribute a
> patch)

I think all mutable sequences (that aren't especially weird in their +
vs += behavior) might be handled correctly, without specialcasing,
roughly as follows (Phillip Eby's idea):

    def sum(seq, start=0):
        it = iter(seq)
        try:
            result = start + it.next()
        except StopIteration:
            return start
        for item in it:
            result += item
        return result

my original idea was perhaps a bit goofier, something like:

    def sum(seq, start=0):
        try:
            start = copy.copy(start)
        except TypeError:
            for item in seq:
                start = start + item
        else:
            for item in seq:
                start += item
        return start

Admittedly the latter version may accept a few more cases, e.g. both
versions would accept:

    sum([ range(3), 'foo' ], [])

because [] is copyable, []+range(3) is fine, and list.__iadd__ is more
permissive than list.__add__; however, the first version would fail on:

    sum([ 'foo', range(3) ], [])

because []+'foo' fails, while the second version would be fine because
[] is _still_ copyable and __iadd__ is still permissive:-).  So,
perhaps, implementing by Phillip's idea would still not reduce the
surprise factor enough.  Hmmm...

> > language or library feature.
> > The problem of the += loop on strings is
> > essentially solved by psyco, which has tricks to catch that and make
> > it almost as fast as ''.join; but psyco can't get into a built-in
> > function such as sum, and thus can't help us with the performance trap
> > there.
>
> Supporting sum() in Psyco is no big issue, and it could help the same way as
> it does for str.__add__.  (It is not explicitly supported yet, but it
> could be added.)  Still I believe that getting the complexity right in
> CPython is important, when it can be done.

Absolutely -- we can't rely on psyco for everything, particularly not
for getting the big-O right as opposed to speeding things up by constant
multipliers (in part, for example, because psyco doesn't work on Macs,
which are going to be a HUGE part of Python's installed base...).
However, I would be happy to "leave it to psyco" for a sum of a large
sequence of tuples or other immutable sequences...:-).  I just don't
think that people in practice _will_ fall into that performance-trap...

Alex

From aleaxit at yahoo.com  Sun Oct 26 13:35:35 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Sun Oct 26 13:35:57 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To:
References:
Message-ID: <200310261935.35235.aleaxit@yahoo.com>

On Sunday 26 October 2003 05:25 pm, Just van Rossum wrote:
> Phillip J. Eby wrote:
> > If we have a rebinding operator, I'd rather it be something
> > considerably more visible than the presence or absence of a ':' on an
> > assignment statement.
>
> I don't know, but somehow I don't have a problem spotting augmented
> assignments, so I don't think := will be as hard to miss as you suggest.

I agree -- := isn't any less "visible" than, say, -= .

> > So far, all the examples have been downright scary in the
> > invisibility of what's happening.
> > Mostly, I can imagine some poor
> > sap trying to debug a program that uses := and is missing one
> > somewhere or has one where it's not intended -- and hoping that poor
> > sap won't be me.  :)
>
> How is that different from a '-=' that should have been a plain '='?
> Also, if := is disallowed to rebind in the _same_ scope, this problem
> would be spotted by the compiler.

Not always (the = that should have been a := won't be, for example), but
pretty often (more often than the errant -= will be;-).  Worst case it's
not any worse than the dreaded "typo in variable name" whereby somebody
assigns to, e.g., "accounts__receivable" where they meant to assign to
"accounts_receivable" -- people who are new to Python are terrified of
that possibility, and demand declarations to take care of it, but
long-time practitioners know it's not all that huge a danger.

> > I've mostly stayed out of this discussion, but so far something like
> > the scope(function).variable proposals, with perhaps a special case
> > for scope(global) or scope(globals) seems to me like the way to go.
> > It seems very Pythonic, in that it is explicit and calls attention to
> > the fact that something special is going on, in a way that ':=' does
> > not.
>
> The reverse argument can be made, too: := calls attention to the fact
> that something is happening right there, whereas a declaration may be
> many lines away.

Right (that's part of why I do not like declarations!-), but the
proposal Phillip is referring to would have "scope(foo).x = 23" ``right
here'' just as "x := 23" would.  Actually, speaking as the original
author of the 'scope' proposal, I think I now prefer your := when taken
in the simplest, most effective form -- took me a while to convince
myself of that, but it grew on me.

> > And 'scope' can be looked up in a manual more easily than ':='
> > can.
> > Last, but not least, ':=' looks enough like normal assignment
> > in other languages, that somebody just plain might not notice that
> > they *need* to look it up.
>
> That's a good point.

Well, if they're looking at a function that ONLY has := in isolation and
no occurrence of = -- and their grasp of Python is so scarce that they
don't realize = is Python's normal assignment.  Doesn't seem like a
particularly scary combination of circumstances to me, to be honest.

Alex

From guido at python.org  Sun Oct 26 13:36:46 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 13:36:56 2003
Subject: [Python-Dev] Re: accumulator display syntax
In-Reply-To: Your message of "Sun, 26 Oct 2003 12:56:40 EST." <000301c39bea$8230e1c0$d4b8958d@oemcomputer>
References: <000301c39bea$8230e1c0$d4b8958d@oemcomputer>
Message-ID: <200310261836.h9QIakK25425@12-236-54-216.client.attbi.com>

> > did you mean list.__add__()...?  list.__iadd__ IS just as permissive
> > as list.extend, it seems to me.
>
> Hmm, I did mean __iadd__() but misremembered what it did.

You're forgiven, at some point in the past they *were* different.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From arigo at tunes.org  Sun Oct 26 14:37:55 2003
From: arigo at tunes.org (Armin Rigo)
Date: Sun Oct 26 14:42:43 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <200310261920.52477.aleaxit@yahoo.com>
References: <200310261109.56801.aleaxit@yahoo.com> <20031026162325.GA4113@vicky.ecs.soton.ac.uk> <200310261920.52477.aleaxit@yahoo.com>
Message-ID: <20031026193755.GA27194@vicky.ecs.soton.ac.uk>

Hello Alex,

On Sun, Oct 26, 2003 at 07:20:52PM +0100, Alex Martelli wrote:
> >     def sum(seq, start=0):
> >         for item in seq:
> >             start = start + item
> >         return start
>
> It IS equivalent to that -- plus an explicit typetest to raise if start is an
> instance of str or unicode.
Yes, it is what I'm saying: it is what we expect it to be, but there is
an exception for no real reason apart from "don't do it like this,
buddy, there is a faster version out there".  I tend to regard this kind
of exceptions as very bad, because if you write a generic algorithm
using sum(), even if you don't really see why someone would think about
using your algorithm with strings one day, chances are that someone
will.  Raising a Warning instead of an exception would have had the same
result without the generality problem.

> >     reduce(operator.add, seq, start)
>
> sum doesn't reproduce reduce's quirk of using the first item of seq if start
> is not given.  So, the def version is closer.

I was thinking about:

    def sum(seq, start=0):
        return reduce(operator.add, seq, start)

which is the same as the previous one.

> Admittedly the latter version may accept a few more cases, e.g.
> both versions would accept:
>     sum([ range(3), 'foo' ], [])
> because [] is copyable, []+range(3) is fine, and list.__iadd__ is
> more permissive than list.__add__; however, the first version
> would fail on:
>     sum([ 'foo', range(3) ], [])
> because []+'foo' fails, while the second version would be fine
> because [] is _still_ copyable and __iadd__ is still permissive:-).

These cases all show that we have a surprise problem (although probably
not a big one).  The user will expect sum() to have a clean definition,
and because the += one doesn't work, it must be +.  In my opinion, sum()
should be strictly equivalent to the naive + version and try to optimize
common cases under the hood.  Admittedly, this is not obvious, because
of precisely all these strange mixed type cases which could be
user-defined classes with __add__ or __radd__ operators...
I'm sure someone will design a class

    class x:
        def __add__(self, other):
            return other

so that x() can be used as a trivial starting point for sum() -- and
then sum(["abc", "def"], x()) works :-)

Armin

From gward at python.net  Sun Oct 26 14:55:15 2003
From: gward at python.net (Greg Ward)
Date: Sun Oct 26 14:55:18 2003
Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem()
Message-ID: <20031026195515.GA30335@cthulhu.gerg.ca>

I just noticed a subtle inconsistency in the error messages when trying
to assign to a tuple:

>>> (1,)[0] = "foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> (1,)['foo'] = "foo"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment

Note the "doesn't" vs "does not".  It's easily tracked down to
PyObject_SetItem() and PySequence_SetItem() (in Objects/abstract.c).

Is this deliberate, or a simple oversight?  I'm inclined to assume the
latter, and change "doesn't" to "does not" on the grounds that error
messages are formal writing, and I was taught not to use contractions in
formal writing.  Any objections?

Greg
--
Greg Ward                                       http://www.gerg.ca/
Eschew obfuscation!

From pf_moore at yahoo.co.uk  Sun Oct 26 15:27:21 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Sun Oct 26 15:27:15 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
References: <200310261109.56801.aleaxit@yahoo.com> <200310261848.30366.aleaxit@yahoo.com>
Message-ID:

Alex Martelli writes:

> On Sunday 26 October 2003 05:21 pm, Paul Moore wrote:
> ...
>> I *think* I see what you're getting at here, but I'm struggling to
>> follow in the absence of concrete use cases.  As we're talking about
>
> Assuming the simplest definition, equivalent to:
>
>     def loop(bound_method, it):
>         for item in it:
>             bound_method(item)
>
> typical simple use cases might be, e.g.:
[...]
> and so on.
None of which are, to me, particularly convincing. Then again, while I like a "declarative style" in some cases, I've got nothing against the sort of "idiom-based" style in which short code patterns just "mean something" as a whole, and aren't viewed as being comprised of their individual parts (much like the standard C idiom for walking a linked list). > The gain of such tiny "accumulator functions" (consuming one or > more iterators by just passing their items to some mutating-method > and ignoring the latter's results) are essentially conceptual -- it's > not a matter of saving a couple of lines at the point of use, nor of > saving some "bananoseconds" if the accumulator functions are > implemented in C, when compared to the coded-out loops. > > Rather, such functions would allow "more declarative style" > presentation (of underlying functionality that remains imperative): > expressions feel more "declarative", stylistically, to some, while > spelling a few steps out feels more "imperative". We've had this > preference explained recently on this group, and others over in > c.l.py are breaking over the champagne at the news of list.sorted > for essentially the same motivation. OK. I'd bow out here, as I don't feel the need to push the declarative style that extra step. Let others champion the style. > Absolutely. It's not _my_ suggestion to have more accumulator > functions -- it came up repeatedly on the threads started by Peter > Norvig original proposal about accumulation, and Guido mentioned > them in the 'product' thread I believe (where we also discussed > 'any', 'all' etc, if I recall correctly). I'm sorry - I'd got the impression that you were arguing the case. In which case, I'd have to say that I'm not at all clear who it is who's proposing anything here, or what specifically the proposals are. I suspect the original intention is getting lost in generalities, and it's time for those original posters to speak up and clarify exactly what they want. 
Maybe a PEP is in order, to get back to the core of the proposal. >> With a bit of "real life" use, migration into the standard library >> might be more of an obvious step. > > You could have said the same of itertools before 2.3, but I think > it was a great decision to accept them into the standard library > instead; 2.3 would be substantially poorer without them. Agreed. I was very conscious of itertools when I made that statement. But my gut feel is that in this case, there has been so much discussion that the key concept has been obscured. A PEP, or some prior art, would recapture that. > With an even richer complement of iterator tools in itertools, and > the new "generator expressions" to give us even more great ways to > make iterators, I think a module of "iterator consumers", also known > as accumulation functions, would be a good idea. Look at Peter > Norvig's original ideas for some suggestions, for example. In principle, I don't have a problem with that. Let's get concrete, though, and see either a PEP or some code. Otherwise, the discussion isn't really going anywhere. And on that note, I really ought to bow out :-) Paul. -- This signature intentionally left blank From guido at python.org Sun Oct 26 15:50:18 2003 From: guido at python.org (Guido van Rossum) Date: Sun Oct 26 15:50:41 2003 Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: Your message of "Sun, 26 Oct 2003 14:55:15 EST." <20031026195515.GA30335@cthulhu.gerg.ca> References: <20031026195515.GA30335@cthulhu.gerg.ca> Message-ID: <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> > Note the "doesn't" vs "does not". It's easily tracked down to > PyObject_SetItem() and PySequence_SetItem() (in Objects/abstract.c). > > Is this deliberate, or a simple oversight? 
> I'm inclined to assume the latter, and change "doesn't" to "does not"
> on the grounds that error messages are formal writing, and I was taught
> not to use contractions in formal writing.

Luckily I wasn't taught formal writing :-), and I don't see why it can't
be doesn't.  I'd say that if you want Python's error messages to be
formal writing, you'd have to change a lot more than just the one... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Sun Oct 26 16:16:31 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 16:16:43 2003
Subject: [Python-Dev] cloning iterators again
Message-ID: <200310262116.h9QLGVf25583@12-236-54-216.client.attbi.com>

The following is just so beautiful, I have to share it.

I've been thinking about various implementations of Andrew Koenig's idea
of "copyable iterator wrappers", which support a generalization of
Raymond Hettinger's tee().  This needs some kind of queue-ish data
structure, but all queues that I could think of required too much
administration for the additional requirements.  I finally realized
(this may come as no surprise to Andrew :-) that the best implementation
is a singly-linked list.  Yes, a use case for a linked list in Python!
This nicely takes care of the GC issue when one of the iterators is
discarded before being exhausted.  I hope Raymond can implement this in
C.

class Link(object):

    """Singly-linked list of (state, value) pairs.

    The slots are manipulated directly by Wrapper.next() below.  The
    state slot can have three values, which determine the meaning of
    the value slot:

    state = 0  => value is not yet determined (set to None)
    state = 1  => value is value to be returned by next()
    state = -1 => value is exception to be raised by next()

    The next slot points to the next Link in the chain; it is None at
    the end of the chain (state <= 0).
""" __slots__ = ["state", "value", "next"] def __init__(self): self.state = 0 self.value = None self.next = None class Wrapper(object): """Copyable wrapper around an iterator. Any number of Wrappers around the same iterator share the same chain of Links. The Wrapper that is most behind references the oldest Link, and as it moves forward the oldest Link instances are automatically discarded. The newest Link has its value set to None and its state set to 0. When a Wrapper needs to get the value out of this Link, it calls next() on the underlying iterator and stores it in the Link, setting its state to 1, for the benefit of other Wrappers that are behind it. If the underlying iterator's next() raises an exception, the Link's state is set to -1 and its value to the exception instance instead. When the oldest Wrapper is garbage-collected before it finishes the chain, the Links that it owns are also garbage-collected, up to the next Link still owned by a live Wrapper. """ __slots__ = ["it", "link"] def __init__(self, it, link=None): """Constructor. The link argument is used by __copy__ below.""" self.it = it if link is None: link = Link() self.link = link def __copy__(self): """Copy the iterator. This returns a new iterator that will return the same series of results as the original. 
""" return Wrapper(self.it, self.link) def __iter__(self): """All iterators should support __iter__() returning self.""" return self def next(self): """Get the next value of the iterator, or raise StopIteration.""" link = self.link if link is None: raise StopIteration state, value, next = link.state, link.value, link.next if state == 0: # At the head of the chain: move underlying iterator try: value = self.it.next() except StopIteration, exc: value = exc state = -1 else: state = 1 link.state = state link.value = value if state < 0: self.link = None raise value assert state > 0 if next is None: next = Link() link.next = next self.link = next return value def tee(it): """Replacement for Raymond's tee(); see examples in itertools docs.""" if not hasattr(it, "__copy__"): it = Wrapper(it) return (it, it.__copy__()) def test(): """A simple demonstration of the Wrapper class.""" import random def gen(): for i in range(10): yield i it = gen() a, b = tee(it) b, c = tee(b) c, d = tee(c) iterators = [a, b, c, d] while iterators != [None, None, None, None]: i = random.randrange(4) it = iterators[i] if it is None: next = "----" else: try: next = it.next() except StopIteration: next = "****" iterators[i] = None print "%4d%s%4s%s" % (i, " ."*i, next, " ."*(3-i)) if __name__ == "__main__": test() --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Sun Oct 26 16:42:08 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun Oct 26 16:42:30 2003 Subject: [Python-Dev] Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> References: <20031026195515.GA30335@cthulhu.gerg.ca> <200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> Message-ID: <3F9C3FB0.8050206@v.loewis.de> Guido van Rossum wrote: > Luckily I wasn't taught formal writing :-), and I don't see why it > can't be doesn't. 
> I'd say that if you want Python's error messages to
> be formal writing, you'd have to change a lot more than just the
> one... :-)

OTOH, I would always yield to native speakers in such issues.  To me
myself, it does not matter much, but if native speakers feel happier one
way or the other, I'd like to help them feel happy :-)

Regards,
Martin

From pje at telecommunity.com  Sun Oct 26 17:11:58 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Oct 26 17:11:15 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To:
References: <5.1.0.14.0.20031026105653.03e64ec0@mail.telecommunity.com>
Message-ID: <5.1.0.14.0.20031026170948.03613950@mail.telecommunity.com>

At 05:25 PM 10/26/03 +0100, Just van Rossum wrote:
>Phillip J. Eby wrote:
>
> > If we have a rebinding operator, I'd rather it be something
> > considerably more visible than the presence or absence of a ':' on an
> > assignment statement.
>
>I don't know, but somehow I don't have a problem spotting augmented
>assignments, so I don't think := will be as hard to miss as you suggest.
>
> > So far, all the examples have been downright scary in the
> > invisibility of what's happening.  Mostly, I can imagine some poor
> > sap trying to debug a program that uses := and is missing one
> > somewhere or has one where it's not intended -- and hoping that poor
> > sap won't be me.  :)
>
>How is that different from a '-=' that should have been a plain '='?
>Also, if := is disallowed to rebind in the _same_ scope, this problem
>would be spotted by the compiler.

But some languages use := to mean simple assignment.  So, '=' and ':='
don't appear *semantically* distinct.  Whereas, I'm not aware of a
language that uses '-=' differently.

> > I've mostly stayed out of this discussion, but so far something like
> > the scope(function).variable proposals, with perhaps a special case
> > for scope(global) or scope(globals) seems to me like the way to go.
> > It seems very Pythonic, in that it is explicit and calls attention to
> > the fact that something special is going on, in a way that ':=' does
> > not.
>
>The reverse argument can be made, too: := calls attention to the fact
>that something is happening right there, whereas a declaration may be
>many lines away.

I guess I wasn't clear.  I meant, using 'scope(function).variable =
whatever' *every* time you assign to the outer scope variable, and not
having any "declarations", ever.

From python at rcn.com  Sun Oct 26 18:17:35 2003
From: python at rcn.com (Raymond Hettinger)
Date: Sun Oct 26 18:18:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310262116.h9QLGVf25583@12-236-54-216.client.attbi.com>
Message-ID: <000f01c39c17$575e8920$d4b8958d@oemcomputer>

> The following is just so beautiful, I have to share it.

I have to say, it is a thing of beauty.

> I've been thinking about various implementations of Andrew Koenig's
> idea of "copyable iterator wrappers", which support a generalization
> of Raymond Hettinger's tee().

I've re-read some of the old email on the subject but didn't see what
this buys us that we don't already get with the current tee().

When I wrote tee(), I had considered implementing it as a multi-way
tee(it, n=2) so you could write a,b,c,d=tee(myiterable, 4).  Then, I
wracked my brain for use cases and found nothing that warranted:

* the additional memory consumption (the current implementation consumes
only one pointer per element and it stores them in contiguous memory);

* the additional memory management utilization (the underlying list.pop
and list.append have already been optimized to avoid excessive
malloc/free calls);

* or the impact on cache performance (using contiguous memory means that
consecutive pops are in the L1 cache at least 90% of the time and using
only one word per entry means that a long series of pops is less likely
to blow everything else out of the cache).
With only two iterators, I can imagine use cases where the two iterators
track each other fairly closely.  But with multiple iterators, one
iterator typically lags far behind (meaning that list(it) is the best
solution) or they track within a fixed number of elements of each other
(meaning that windowing is the best solution).

The itertools example section shows the pure python code for windowing.
AFAICT, that windowing code is unbeatable in terms of speed and memory
consumption (nearly all the time is spent forming the result tuple).

> class Link(object):
>     """Singly-linked list of (state, value) pairs.
. . .
>     __slots__ = ["state", "value", "next"]

One way to implement this is with a type which adds PyHEAD to the space
consumption for the three fields.  An alternate approach is to use PyMem
directly and request space for four fields (including a refcount field).

>         if state < 0:
>             self.link = None
>             raise value

Is it kosher to re-raise the exception long after something else may
have handled it and the execution context has long since disappeared?

> def test():
>     """A simple demonstration of the Wrapper class."""
>     import random
>     def gen():
>         for i in range(10):
>             yield i
>     it = gen()
>     a, b = tee(it)
>     b, c = tee(b)
>     c, d = tee(c)

This is very nice.  The current tee() increases memory consumption and
workload when nested like this.

Raymond Hettinger

From skip at pobox.com  Sun Oct 26 12:11:51 2003
From: skip at pobox.com (Skip Montanaro)
Date: Sun Oct 26 18:58:48 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310261723.20026.aleaxit@yahoo.com>
References: <200310261623.54136.aleaxit@yahoo.com> <20031026154626.GA18564@panix.com> <200310261723.20026.aleaxit@yahoo.com>
Message-ID: <16284.87.71562.652543@montanaro.dyndns.org>

> Sounds good to me.  Question: what does this do?
>
>     def f():
>         def g(x):
>             z := x
...
> That is, in the absence of a pre-existing binding, where does the
> binding for := go?
> I think it should be equivalent to global, going to
> the module scope.

This is one place I think an extension of the global statement has a
definite advantage:

    def f():
        def g():
            global z in f
            z = x

Skip

From guido at python.org  Sun Oct 26 19:31:35 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 19:31:51 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Sun, 26 Oct 2003 18:17:35 EST." <000f01c39c17$575e8920$d4b8958d@oemcomputer>
References: <000f01c39c17$575e8920$d4b8958d@oemcomputer>
Message-ID: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>

> > I've been thinking about various implementations of Andrew Koenig's
> > idea of "copyable iterator wrappers", which support a generalization
> > of Raymond Hettinger's tee().
>
> I've re-read some of the old email on the subject but didn't see what
> this buys us that we don't already get with the current tee().

Performance-wise I don't know; we'd have to profile it I guess. :-(

With the current tee(), I was thinking that if the two iterators stay
close, you end up moving the in basket to the out basket rather
frequently, and the overhead of that might beat the simplicity of the
linked lists.  Also, *if* you need a lot of clones, using multiple tee()
calls ends up creating several queues, again causing more overhead.
(These queues end up together containing all the items from the oldest
to the newest iterator.)

I also note that the current tee() doesn't let you use __copy__ easily
(it would be quite messy I think).  The linked-list version supports
__copy__ trivially.  This may be important if we execute (as I think we
should) on the idea of making selected iterators __copy__-able
(especially all the standard container iterators and xrange).

> When I wrote tee(), I had considered implementing it as a multi-way
> tee(it, n=2) so you could write a,b,c,d=tee(myiterable, 4).
Then, I > wracked my brain for use cases and found nothing that warranted: > > * the additional memory consumption (the current implementation consumes > only one pointer per element and it stores them in contiguous memory); > > * the additional memory management utilization (the underlying list.pop > and list.append have already been optimized to avoid excessive > malloc/free calls); > > * or the impact on cache performance (using contiguous memory means that > consecutive pops are in the L1 cache at least 90% of the time and using > only one word per entry means that a long series of pops is less likely > to blow everything else out of the cache). > > With only two iterators, I can imagine use cases where the two iterators > track each other fairly closely. But with multiple iterators, one > iterator typically lags far behind (meaning that list(it) is the best > solution) or they track within a fixed number of elements of each other > (meaning that windowing is the best solution). Maybe Andrew has some use cases? After all he implemented this once for C++. BTW in private mail he reminded me that (a) he'd already suggested using a linked list to me before, and (b) his version had several values per link node, which might address some of your concerns above. > The itertools example section shows the pure python code for windowing. > AFAICT, that windowing code is unbeatable in terms of speed and memory > consumption (nearly all the time is spent forming the result tuple). > > > > > class Link(object): > > """Singly-linked list of (state, value) pairs. > . . . > > __slots__ = ["state", "value", "next"] > > One way to implement this is with a type which adds PyHEAD to the space > consumption for the three fields. An alternate approach is to use PyMem > directly and request space for four fields (including a refcount field). Or you could use Andrew's suggestion. 
>
> > if state < 0:
> >     self.link = None
> >     raise value
>
> Is it kosher to re-raise the exception long after something else may
> have handled it and the execution context has long since disappeared?

This isn't a re-raise; it's a raise of the exception object, which doesn't depend on the context and can be raised as often as you want to. I agree that it might be worth it to do a bare raise (== re-raise) *if* the exception was in fact caught in the current next() invocation, to preserve the stack trace. Or we could change the meaning of value and store the sys.exc_info() triple in it -- but this would probably keep too many stack frames and local variables alive for too long.

> > def test():
> >     """A simple demonstration of the Wrapper class."""
> >     import random
> >     def gen():
> >         for i in range(10):
> >             yield i
> >     it = gen()
> >     a, b = tee(it)
> >     b, c = tee(b)
> >     c, d = tee(c)
>
> This is very nice. The current tee() increases memory consumption and
> workload when nested like this.

The question is, how often does one need this? Have you seen real use cases for tee() that aren't better served with list()?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Sun Oct 26 22:22:26 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 22:22:45 2003
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c, 2.292.10.1, 2.292.10.2
In-Reply-To: <200310261109.56801.aleaxit@yahoo.com>
Message-ID: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> Exactly the same underlying reason as a bug I just opened on
> SF: if x is an instance of a class X having __mul__ but not
> __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError

Seems to me the bug there is not giving X an __rmul__ method and yet expecting y*x to work at all. The fact that it happens to work in some cases is an accident that should not be relied upon.
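The asymmetry Alex reports is easy to show in miniature. The class below is hypothetical, and the snippet shows the intended behaviour -- the one modern Pythons implement, where the int and the float left operand fail identically; under the 2.3-era accident being discussed, 3 * x would have succeeded.

```python
# X defines __mul__ but not __rmul__, so x * 3 works while 3 * x and
# 3.0 * x must both fail with TypeError (no reflected method to call).

class X(object):
    def __mul__(self, other):
        return ("mul", other)

x = X()
assert x * 3 == ("mul", 3)       # calls x.__mul__(3)

for lhs in (3, 3.0):
    try:
        lhs * x
    except TypeError:
        pass                     # no __rmul__: reflected multiply fails
    else:
        raise AssertionError("%r * x unexpectedly succeeded" % lhs)
```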
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com Sun Oct 26 22:25:19 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:26:50 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF30@au3010avexu1.global.avaya.com>

> From: Skip Montanaro [mailto:skip@pobox.com]
>
> You meant
>
>     def f():
>         x = 12
>         y = 1
>         def g():
>             y = 12
>             global y in f
>         g()
>         print locals()
>
> right?

Er - yes ... :)

Tim Delaney

From tdelaney at avaya.com Sun Oct 26 22:27:52 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:28:00 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF32@au3010avexu1.global.avaya.com>

> From: Guido van Rossum [mailto:guido@python.org]
>
> > Likewise, the following should be illegal:
> >
> >     def f():
> >         x = 12
> >         y = 1
> >         def g():
> >             global y in f
> >             y = 12
> >         g()
> >         print locals()
> >
> > because the global statement occurs after a local binding
> > of the name.
>
> Huh? The placement of a global statement is irrelevant -- it can
> occur anywhere in the scope. This should certainly work.

As Skip pointed out, I got:

    y = 12
    global y in f

reversed. And I was thinking of PyChecker warning about this.
I should not have been thinking about these things while trying to set a release candidate build going so I could head home on a Friday evening :(

Tim Delaney

From greg at cosc.canterbury.ac.nz Sun Oct 26 22:28:08 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 22:28:18 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310261134.56982.aleaxit@yahoo.com>
Message-ID: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> > Ideally, augmented assignments would also become "rebinding". However,
> > this may have compatibility problems.
>
> Unfortunately yes. It might have been better to define them that way in
> the first place, but changing them now is dubious.

I'm not so sure. You need an existing binding before an augmented assignment will work, so I don't think there can be any correct existing usages that would be broken by this.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From tdelaney at avaya.com Sun Oct 26 22:31:12 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Sun Oct 26 22:31:23 2003
Subject: [Python-Dev] closure semantics
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6AF34@au3010avexu1.global.avaya.com>

> From: Greg Ewing [mailto:greg@cosc.canterbury.ac.nz]
>
> > Is that compatible with current use? I think the current
> > semantics are that global always binds name to an object
> > with that name at module scope.
>
> No, it's not quite compatible, but I don't think
> it's likely to break anything much in practice.

I'm almost 100% sure that it will. People tend to use the same short variable names for things, and nested functions had *better* be related ...

We could not use an unadorned 'global' for such a change in semantics. It would require a new keyword.
Tim Delaney

From guido at python.org Sun Oct 26 22:58:19 2003
From: guido at python.org (Guido van Rossum)
Date: Sun Oct 26 22:58:35 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 16:28:08 +1300."
	<200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
Message-ID: <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>

[attribution lost]
> > > Ideally, augmented assignments would also become "rebinding". However,
> > > this may have compatibility problems.

[Alex]
> > Unfortunately yes. It might have been better to define them that way in
> > the first place, but changing them now is dubious.

[Greg]
> I'm not so sure. You need an existing binding before an
> augmented assignment will work, so I don't think there can
> be any correct existing usages that would be broken by this.

Indeed. If x is neither local nor declared global, x+=... is always an error, even if an x at an intermediate level exists, so THAT shouldn't be used as an argument against this.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Sun Oct 26 23:28:09 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Sun Oct 26 23:28:18 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <87k76rhnn6.fsf@egil.codesourcery.com>
Message-ID: <200310270428.h9R4S9u14063@oma.cosc.canterbury.ac.nz>

> I like Just's := concept except for the similarity to =, and I worry
> that the presence of := in the language will flip people into "Pascal
> mode" -- thinking that = is the equality operator. I also think that
> the notation is somewhat unnatural -- "globalness" is a property of
> the _variable_, not the operator. So I'd like to suggest instead
>
>     :var = value        # var in module scope
>     :scope:var = value  # var in named enclosing scope

Yeek, that makes it look like Logo!
What about simply

    outer x = value

In this, 'outer' would be an annotation applicable to any bare name in an lvalue position, so you could say

    (x, outer y, self.z) = stuff

if you wanted, or even

    def outer f():
        ...

    class outer C:
        ...

although probably I wouldn't mind much if those were disallowed.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From pete at shinners.org Mon Oct 27 01:02:47 2003
From: pete at shinners.org (Pete Shinners)
Date: Mon Oct 27 00:07:25 2003
Subject: [Python-Dev] VisualC6 Available
Message-ID: 

The place I used to work at had several retail copies of MS Visual Studio 6.0. Since the company is no longer in business, I have one available to offer to anyone developing python. If they don't go to someone useful here they will likely just end up in the dumpster. I figure if anyone is stuck using a potentially 'shady' licensed version this could be a good chance to get all legit.

The full product is "Microsoft Visual Studio 6.0 Enterprise Edition, English". This will come with the original CD's, Case, CD Key, and Certificate of Authenticity.

If this sounds like it will help, drop me an email and I'll figure out how to get it to you. I'm especially interested in helping someone actively developing python.

From python at rcn.com Mon Oct 27 00:12:33 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 00:13:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>
Message-ID: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>

> > I've re-read some of the old email on the subject but didn't see what
> > this buys us that we don't already get with the current tee().
>
> Performance-wise I don't know; we'd have to profile it I guess.
:-(

My question was more directed toward non-performance issues. Do we really have *any* need for more than two iterators running concurrently? After all, it's already difficult to come up with good use cases for two that are not dominated by list() or window().

> With the current tee(), I was thinking that if the two iterators stay
> close, you end up moving the in basket to the out basket rather
> frequently, and the overhead of that might beat the simplicity of the
> linked lists.

With current tee(), runtime is dominated by calls to Append and Pop (reverse is super-fast and moves each element only once). Those calls are more expensive than a link jump; however append() and pop() are optimized to avoid calls to the memory manager while every link would need steps for alloc/initialization/reference/dealloc. Cache effects are also important because the current tee() uses much less memory and the two memory blocks are contiguous.

> Also, *if* you need a lot of clones, using multiple
> tee() calls ends up creating several queues, again causing more
> overhead. (These queues end up together containing all the items from
> the oldest to the newest iterator.)

*If* we want to support multiple clones, there is an alternate implementation of the current tee that only costs one extra word per iteration. That would be in there already. I really *wanted* a multi-way tee but couldn't find a single use case that warranted it.

> I also note that the current tee() doesn't let you use __copy__ easily
> (it would be quite messy I think).

To __copy__ is to tee. Both make two iterators from one. They are different names for the same thing. Right now, they don't seem comparable because the current tee is only a two way split and you think of copy as being a multi-way split for no extra cost.

> Maybe Andrew has some use cases?

I hope so. I can't think of anything that isn't dominated by list(), window(), or the current tee().
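The append/pop/reverse pattern Raymond describes is the classic two-stack queue; a minimal pure-Python sketch (the TwoStackQueue name is invented for illustration, this is not the actual implementation):

```python
# push() lands on an "in" stack; when the "out" stack runs dry, the
# "in" stack is reversed and swapped over, so each element is moved
# exactly once and both blocks of memory stay contiguous.

class TwoStackQueue(object):
    def __init__(self):
        self.inbasket = []
        self.outbasket = []

    def push(self, item):
        self.inbasket.append(item)

    def pop(self):
        if not self.outbasket:
            self.inbasket.reverse()     # the single O(n) move
            self.inbasket, self.outbasket = self.outbasket, self.inbasket
        return self.outbasket.pop()     # amortized O(1) per element

q = TwoStackQueue()
for i in range(5):
    q.push(i)
assert [q.pop() for _ in range(5)] == [0, 1, 2, 3, 4]
```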
And, if needed, the current tee() can easily be made multi-way. It doubles the unit memory cost from one word to two but that's nothing compared to the link method (two words for PyHead, another 3 (Linux) or 4 (Windows) words for GC, and another 3 for the data fields). > The question is, how often does one need this? Have you seen real use > cases for tee() that aren't better served with list()? I'm sure they exist, but they are very few. I was hoping that a simple, fast, memory efficient two-way tee() would have satisfied all the requests, but this thing appears to be taking on a life of its own with folks thinking they need multiple concurrent iterators created by a magic method (what for?). Raymond From aleaxit at yahoo.com Mon Oct 27 02:51:02 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 02:51:12 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> Message-ID: <200310270851.02495.aleaxit@yahoo.com> On Monday 27 October 2003 06:12, Raymond Hettinger wrote: ... > My question was more directed toward non-performance issues. Do we > really have *any* need for more than two iterators running concurrently? I admit I have no use cases for that. It was probably a case of over-eager generalization on my part. I understand and appreciate all of your other explanations on performance, except one: > > I also note that the current tee() doesn't let you use __copy__ easily > > (it would be quite messy I think). > > To __copy__ is to tee. Both make two iterators from one. > They are different names for the same thing. > Right now, they don't seem comparable because the current tee is only a > two way split and you think of copy as being a multi-way split for no > extra cost. I don't understand this. __copy__ is a special method that a type may or may not expose. 
If it does, copy.copy(x) on an instance x of that type makes and returns one (shallow) copy of x. I just got a PEP number (323) for Copyable Iterators as recently discussed, and hope to commit the PEP within today. But, basically, the idea is trivially simple: iterators which really have a tiny amount of state, such as those on sequences and dicts, will expose __copy__ and implement it by just duplicating said tiny amount (one pointer to a container and an index).

But I don't understand how it would be quite messy to take advantage of this in tee(), either: simply, tee() would start with the equivalent of

    it = iter(it)
    try:
        return it, copy.copy(it)
    except TypeError:
        pass

and proceed just like now if this shortcut hasn't worked -- that's all.

Alex

From aleaxit at yahoo.com Mon Oct 27 03:06:44 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 03:06:52 2003
Subject: [Python-Dev] the "3*x works w/o __rmul__" bug
In-Reply-To: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>
References: <200310270322.h9R3MQT13907@oma.cosc.canterbury.ac.nz>
Message-ID: <200310270906.44209.aleaxit@yahoo.com>

On Monday 27 October 2003 04:22, Greg Ewing wrote:
> Alex Martelli :
> > Exactly the same underlying reason as a bug I just opened on
> > SF: if x is an instance of a class X having __mul__ but not
> > __rmul__, 3*x works (just like x*3) but 3.0*x raises TypeError
>
> Seems to me the bug there is not giving X an __rmul__
> method and yet expecting y*x to work at all. The fact
> that it happens to work in some cases is an accident
> that should not be relied upon.

No, the bug is that it works in some cases where it should fail (and, secondarily, that -- where it does fail -- it gives a weird error message). In other words, the bug (in Python) is that "accident".
Nobody's asking for 3.0*x to work where x is a user-coded type without an __rmul__; rather, the point is that 3*x should fail too, and ideally they'd have the same clear error message as 3+x gives when the type has no __radd__.

Doesn't seem trivial to fix (though I hope I'm missing something obvious) and doesn't affect perfect user-programs, but I do think it should be fixed because it's sure extremely mysterious and could send a developer on wild goose chases when met in the course of development.

Alex

From ncoghlan at iinet.net.au Mon Oct 27 03:31:32 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 27 03:31:41 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <16284.87.71562.652543@montanaro.dyndns.org>
References: <200310261623.54136.aleaxit@yahoo.com>
	<20031026154626.GA18564@panix.com>
	<200310261723.20026.aleaxit@yahoo.com>
	<16284.87.71562.652543@montanaro.dyndns.org>
Message-ID: <3F9CD7E4.5070609@iinet.net.au>

Skip Montanaro strung bits together to say:

> This is one place I think an extension of the global statement has a
> definite advantage:
>
>     def f():
>         def g():
>             global z in f
>             z = x

Alternately (using Just's 'rebinding non-local' syntax):

    def f():
        z = None
        def g():
            z := x

Cheers,
Nick.

-- 
Nick Coghlan               | Brisbane, Australia
ICQ#: 68854767             | ncoghlan@email.com
Mobile: 0409 573 268       | http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."
From ncoghlan at iinet.net.au Mon Oct 27 04:06:35 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Mon Oct 27 04:06:48 2003
Subject: [Python-Dev] product()
In-Reply-To: <200310261310.27950.fincher.8@osu.edu>
References: <002401c39907$0176f5a0$e841fea9@oemcomputer>
	<200310260820.59266.fincher.8@osu.edu>
	<3F9BDCA1.5040101@iinet.net.au>
	<200310261310.27950.fincher.8@osu.edu>
Message-ID: <3F9CE01B.7070905@iinet.net.au>

Jeremy Fincher strung bits together to say:

> On Sunday 26 October 2003 09:39 am, Nick Coghlan wrote:
> >> >>> if any(not pred(x) for x in values): pass # anyfalse
> if not all(pred(x) for x in values): pass
> >> >>> if all(not pred(x) for x in values): pass # allfalse
> if not any(pred(x) for x in values): pass
> It's slightly more efficient (only one negation), and it seems to maintain
> better the pseudocode-like aspect that we so much adore in Python :)

I originally wrote them out the way you suggest, but then changed them after I added the comment that indicated what each example represented (as the less efficient versions more literally match the comments).

Anyway, I suspect those used to the idiom would use the forms you suggest. There might be some variation due to the multiple ways of writing the expressions (using any/all), but I doubt that would be worse than the confusion created by the double negative needed to express either any or all in terms of the other.

Cheers,
Nick.

-- 
Nick Coghlan               | Brisbane, Australia
ICQ#: 68854767             | ncoghlan@email.com
Mobile: 0409 573 268       | http://www.talkinboutstuff.net
"Let go your prejudices,
lest they limit your thoughts and actions."
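The two spellings being compared are just De Morgan's laws; with the any()/all() functions being proposed here (they later became builtins in Python 2.5) the equivalence is easy to check:

```python
# "some element fails pred" == "not all elements satisfy pred", and
# "all elements fail pred"  == "no element satisfies pred".

values = [0, 1, 2, 3, 4]
pred = lambda x: x > 0

assert any(not pred(x) for x in values) == (not all(pred(x) for x in values))
assert all(not pred(x) for x in values) == (not any(pred(x) for x in values))
```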
From aleaxit at yahoo.com Mon Oct 27 04:33:56 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 04:34:08 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
Message-ID: <200310271033.56569.aleaxit@yahoo.com>

On Monday 27 October 2003 04:58 am, Guido van Rossum wrote:
> [attribution lost]
> > > > Ideally, augmented assignments would also become "rebinding".
> > > > However, this may have compatibility problems.
>
> [Alex]
> > > Unfortunately yes. It might have been better to define them that way
> > > in the first place, but changing them now is dubious.
>
> [Greg]
> > I'm not so sure. You need an existing binding before an
> > augmented assignment will work, so I don't think there can
> > be any correct existing usages that would be broken by this.
>
> Indeed. If x is neither local nor declared global, x+=... is always
> an error, even if an x at an intermediate level exists, so THAT
> shouldn't be used as an argument against this.

Actually, if the compiler were able to diagnose that, it would be wonderful -- but I don't think it can, because it can make no assumptions regarding what might be defined in global scope (or at least it definitely can't make any such assumptions now). So, yes, any sensible program that works today would keep working. I dunno about NON-sensible programs such as:

    def outer():
        x = 23
        def inner():
            exec 'x = 45'
            x+=1
        # etc etc

but then I guess the presence of 'exec' might be defined to change semantics of += and/or disallow := or whatever else, just as today it turns off local-variable optimizations.
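Guido's point is easy to demonstrate: an augmented assignment to a name that is neither local nor global is simply an error today, which is why the accumulator idiom needs a mutable cell (Python much later grew 'nonlocal' for exactly this). A small sketch, with invented function names:

```python
# "n += i" makes n local to increment(), and it is then read before
# being bound -- hence the error, regardless of the enclosing n.

def accumulator(n=0):
    def increment(i):
        n += i              # UnboundLocalError when called
        return n
    return increment

try:
    accumulator()(1)
except UnboundLocalError:
    pass
else:
    raise AssertionError("expected UnboundLocalError")

# The traditional workaround: keep the state in a mutable cell that the
# closure only ever *reads* (so no rebinding of a free name occurs).

def accumulator2(n=0):
    cell = [n]
    def increment(i):
        cell[0] += i        # mutates the list in place
        return cell[0]
    return increment

acc = accumulator2()
assert acc(1) == 1
assert acc(41) == 42
```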
My slight preference for leaving += and friends alone is that a function using them to rebind nonlocals would be hard to read, that since the change only applies when the LHS is a bare name the important use cases for augmented assignment don't apply any way, that it's a bit subtle to explain that

    foo.bar += baz

( += on a dotted name) implies a plain assignment (setattr) on foo.bar while

    foo_bar += baz

( += on bare name) might imply a := assignment (rebinding a nonlocal) IF there are no "foo_bar = baz" elsewhere in the same function BUT it would imply a plain assignment if there ARE other plain assignments to the same name in the same function, ...

IOW it seems to me that we're getting into substantial amounts of subtlety in explaining (and thus maybe in implementing) a functionality change that's not terribly useful anyway and may damage rather than improve readability when it's used.

Taking the typical P. Graham accumulator example, say: with += rebinding, we can code this:

    def accumulator(n=0):
        def increment(i):
            n += i
            return n
        return increment

but without it, we would code:

    def accumulator(n=0):
        def increment(i):
            n := n + i
            return n
        return increment

and it doesn't seem to me that the two extra keystrokes are to be considered a substantial price to pay. Admittedly in such a tiny example readability is just as good either way, as it's obvious which n we're talking about (there being just one, and extremely nearby wrt the point of use of either += or := ).

Suppose we wanted to have the accumulator "saturate" -- if the last value it returned was > m it must restart accumulating from zero. Now, without augmented assignment:

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n := i
            else:
                n := n + i
            return n
        return increment

we have a pleasing symmetry and no risk of errors -- if we mistakenly use an = instead of := in either branch the compiler will be able to let us know immediately.
(Actually I'd be quite tempted to code the if branch as "n := 0 + i" to underscore the symmetry, but maybe I'm just weird:-).

If we do rely on augmented assignment being "rebinding":

    def accumulator_saturating(n=0, m=100):
        def increment(i):
            if n > m:
                n = i
            else:
                n += i
            return n
        return increment

the error becomes a runtime rather than compile-time one, and does take a (small but non-zero) time to discover it.

The += 's subtle new semantics (rebinds either a local or nonlocal, depending on how other assignments elsewhere in the function are coded) do make it slightly harder to understand and explain, compared to my favourite approach, which is: := is the ONLY way to rebind a nonlocal name (and only ever does that, only with a bare name on LHS, etc, etc) which can't be beaten in terms of how simple it is to understand and explain. The compiler could then diagnose an error when it sees := and += used on the same barename in the same function (and perhaps give a clear error message suggesting non-augmented := usage in lieu of the augmented assignment).

Can somebody please show a compelling use case for some "nonlocal += expr" over "nonlocal := nonlocal + expr", sufficient to override all the "simplicity" arguments above? I guess there must be some, since popular feeling appears to be in favour of having augmented-assignment as "rebinding", but I can't see them.

Alex

From greg at electricrain.com Mon Oct 27 04:40:45 2003
From: greg at electricrain.com (Gregory P. Smith)
Date: Mon Oct 27 04:41:01 2003
Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason
In-Reply-To: <200310270930.28811.aleaxit@yahoo.com>
References: <200310251232.55044.aleaxit@yahoo.com>
	<20031027075422.GK3929@zot.electricrain.com>
	<200310270930.28811.aleaxit@yahoo.com>
Message-ID: <20031027094045.GL3929@zot.electricrain.com>

> > It is unfortunately entirely possible that various berkeleydb libraries
Since the BerkeleyDB db->del() call isn't returning it is > > presumably stuck in a lock waiting for who knows what. > > Right. But the SAME berkeley db library is being used for my build of > both Python 2.4 alpha 0, and 2.3 maintenance branch, both from cvs, > and I can't see any difference in what they're doing with bsddb -- so > clearly I must be missing something because it's hanging on EVERY > attempt to run the unittest w/2.4, but never w/2.3. The big difference i see between 2.3cvs and 2.4cvs that could "explain" it is that Lib/bsddb/__init__.py has been updated to use a private (in memory, single process only) DBEnv with locking and thread support enabled. That explains why db->del() would be doing locking. But not why it would deadlock. This is also easily reproducable here. No special platform or berkeleydb version should be required. Looking closer I suspect what is happening is that Lib/bsddb/__init__.py implementation is not threadsafe. It wants to maintain the current iterator location using a DBCursor object. However, having an active DBCursor holds a lock in the database. DictMixin's popitem() is effectively: k, v = self.iteritems().next() del self[k] return (k, v) The iteritems() call creates an internal DBCursor object for the iterator. The next() call on the iterator (DBCursor) looks up the value for k. The following delete attempts to delete the record without using the DBCursor; thus the deadlock. If we implement our own popitem() for the bsddb dictionary object (_DBWithCursor) to perform the delete using the cursor this deadlock in the unit tests would go away. That won't stop users from intermixing iteration over a database with modifications to the database; causing their own deadlocks (very unexpected in single threaded code). 
Proposed fix: It should be possible for the bsddb object to maintain internal state of its own about what key it is on and close any internal DB cursor on all non-cursor database accesses, leaving the iteration functions to detect this and reopen and reposition the cursor. Since the basic bsddb interface doesn't allow databases with duplicate keys it shouldn't be too difficult. It's not efficient but a user who cares about efficient use of berkeleydb should use the real DB/DBEnv interface directly.

How do python dictionaries deal with modifications to the dictionary intermixed with iteration?

Greg

From aleaxit at yahoo.com Mon Oct 27 05:25:16 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 05:25:24 2003
Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason
In-Reply-To: <20031027094045.GL3929@zot.electricrain.com>
References: <200310251232.55044.aleaxit@yahoo.com>
	<200310270930.28811.aleaxit@yahoo.com>
	<20031027094045.GL3929@zot.electricrain.com>
Message-ID: <200310271125.16879.aleaxit@yahoo.com>

On Monday 27 October 2003 10:40 am, Gregory P. Smith wrote:
...
> The big difference i see between 2.3cvs and 2.4cvs that could "explain"
> it is that Lib/bsddb/__init__.py has been updated to use a private
> (in memory, single process only) DBEnv with locking and thread support
> enabled. That explains why db->del() would be doing locking. But not
> why it would deadlock.

*AH*! I wasn't looking in the right place, silly me. Good job!!!

Yes, now that you've pointed it out, the change from 2.3's

    d = db.DB()

to 2.4's

    e = _openDBEnv()
    d = db.DB(e)

must be the culprit. I still don't quite see how the lock ends up being "held", but, don't mind me -- the intricacy of mixins and wrappings and generators and delegations in those modules is making my head spin anyway, so it's definitely not surprising that I can't quite see what's going on.

> How do python dictionaries deal with modifications to the dictionary
> intermixed with iteration?
In general, Python doesn't deal well with modifications to any iterable in the course of a loop using an iterator on that iterable. The one kind of "modification during the loop" that does work is:

    for k in somedict:
        somedict[k] = ...whatever...

i.e. one can change the values corresponding to keys, but not change the set of keys in any way -- any changes to the set of keys can cause unending loops or other such misbehavior (not deadlocks nor crashes, though...).

However, on a real Python dict,

    k, v = thedict.iteritems().next()

doesn't constitute "a loop" -- the iterator object returned by the iteritems call is dropped since there are no outstanding references to it right after this statement. So, following up with

    del thedict[k]

is quite all right -- the dictionary isn't being "looped on" at that time.

Given that in bsddb's case that iteritems() first [and only] next() boils down to a self.first() which in turn does a self.dbc.first() I _still_ don't see exactly what's holding the lock.
But the simplest fix would appear to be in __delitem__, i.e., if we have a cursor we should delete through it:

    def __delitem__(self, key):
        self._checkOpen()
        if self.dbc is not None:
            self.dbc.set(key)
            self.dbc.delete()
        else:
            del self.db[key]

...but this doesn't in fact remove the deadlock on the unit-test for popitem, which just confirms I don't really grasp what's going on, yet!-)

Alex

From just at letterror.com Mon Oct 27 05:53:47 2003
From: just at letterror.com (Just van Rossum)
Date: Mon Oct 27 05:53:47 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271033.56569.aleaxit@yahoo.com>
Message-ID: 

Alex Martelli wrote:

> My slight preference for leaving += and friends alone is that
> a function using them to rebind nonlocals would be hard to
> read, that since the change only applies when the LHS is a
> bare name the important use cases for augmented assignment
> don't apply any way, that it's a bit subtle to explain that
>
>     foo.bar += baz
>
> ( += on a dotted name) implies a plain assignment (setattr)
> on foo.bar while
>
>     foo_bar += baz
>
> ( += on bare name) might imply a := assignment (rebinding
> a nonlocal) IF there are no "foo_bar = baz" elsewhere in the
> same function BUT it would imply a plain assignment if there
> ARE other plain assignments to the same name in the same
> function, ...

To an extent you're only making it _more_ difficult by saying "x := ... rebinds to a non-local name" instead of "x := rebinds to x in whichever scope x is defined (which may be the local scope)". With the latter definition, there's less to explain regarding "x += ..." as a rebinding operation.

I find that _if_ we were to add a rebinding operator, it would be extremely silly not to allow augmented assignments to be rebinding, perhaps even patronizing: "yes you can assign to outer scopes, but no you can't use augmented assignments for that since we think it makes it too difficult for you."
We should either _not_ allow assignments to outer scopes at all, _or_
allow it and make it as powerful as practically possible.  I don't
think allowing it with non-obvious (arbitrary) limitations is a good
idea.  For example, the more I think about it, the more I am _against_
disallowing "a, b := b, a".

That said, someone made a point here that rebinding is a behavior of a
variable, not the assignment operation: that's a very good one indeed,
and does make me less certain of whether adding := would be such a good
idea after all.

Just

From python at rcn.com Mon Oct 27 06:24:57 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 06:25:53 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310270031.h9R0VZp25738@12-236-54-216.client.attbi.com>
Message-ID: <001201c39c7c$f4582140$81b0958d@oemcomputer>

> I also note that the current tee() doesn't let you use __copy__ easily
> (it would be quite messy I think).  The linked-list version supports
> __copy__ trivially.  This may be important if we execute (as I think
> we should) on the idea of making selected iterators __copy__-able
> (especially all the standard container iterators and xrange).

The current tee() was written to support only a two way split, but it
can easily be cast as a multi-way splitter with no problem.

The only real difference in the ideas presented so far is whether the
underlying queue should be implemented as a singly linked list or as a
double stack.  As a proof-of-concept, here is GvR's code re-cast with
the queue changed to a double stack implementation.  The interface is
completely unchanged.  The memory consumed is double that of the
current tee() but much less than the linked list version.  The speed is
half that of the current tee() and roughly comparable to or slightly
better than the linked list version.
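[The underlying idea, a FIFO queue built from two Python lists used as stacks, can be sketched on its own. The TwoStackQueue name is hypothetical and independent of the Wrapper code in Raymond's attached program; it only illustrates the queue discipline he describes.]

```python
class TwoStackQueue:
    """FIFO queue built from two lists used as stacks.

    append() and pop() at the end of a list are cheap, so enqueue
    pushes onto an "in" stack and dequeue pops from an "out" stack;
    when the out stack empties, the in stack is reversed into it in
    one bulk move, so each element moves at most once per direction.
    """

    def __init__(self):
        self._in = []
        self._out = []

    def enqueue(self, value):
        self._in.append(value)

    def dequeue(self):
        if not self._out:
            # Bulk transfer from the in stack to the out stack.
            self._in.reverse()
            self._out, self._in = self._in, self._out
        return self._out.pop()  # raises IndexError when empty

q = TwoStackQueue()
for i in range(5):
    q.enqueue(i)
assert [q.dequeue() for _ in range(5)] == [0, 1, 2, 3, 4]
```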
Raymond Hettinger

------------------------------------------------------------------------

""" Guido's demo program re-cast with a different underlying data structure

Replaces the linked list based queue with a two stack based queue.

Advantages:  The double stack method consumes only two pointers per
data element while the linked list method consumes space for a link
object (8 to 10 words).  The double stack method uses contiguous memory
while the link objects are more fragmented.  The stack method uses
append() and pop() which are optimized to minimize memory management
calls.  For the link method, every link costs a malloc and free.

Todo:  Handle Wrappers that are GC'd before termination.
       Add support for holding an exception.
"""

class TeeMaster(object):
    """Holder for information common to wrapped iterators"""

    def __init__(self, it):
        self.inbasket = []
        self.inrefcnt = []
        self.outbasket = []
        self.outrefcnt = []
        self.numseen = 0
        self.it = it
        self.numsplits = 0

class Wrapper(object):
    """Copyable wrapper around an iterator.

    Any number of Wrappers around the same iterator share the TeeMaster
    object.  The Wrapper that is most behind will drop the refcnt to
    zero, which causes the reference to be popped off of the queue.

    The newest Wrapper gets a brand new TeeMaster object.  Later
    wrappers share an existing TeeMaster object.  Since they may have
    arrived late in the game, they need to know how many objects have
    already been seen by the wrapper.  When they call next(), they ask
    for the next numseen.

    If a Wrapper is garbage-collected before it finishes, the refcount
    floor needs to be raised.  That has not yet been implemented.
    """

    __slots__ = ["master", "numseen"]

    def __init__(self, it, master=None):
        """Constructor.  The master argument is used by __copy__ below."""
        if master is None:
            master = TeeMaster(it)
        self.master = master
        self.numseen = master.numseen
        self.master.numsplits += 1

    def __copy__(self):
        """Copy the iterator.

        This returns a new iterator that will return the same series of
        results as the original.
        """
        return Wrapper(None, self.master)

    def __iter__(self):
        """All iterators should support __iter__() returning self."""
        return self

    def next(self):
        """Get the next value of the iterator, or raise StopIteration."""
        master = self.master
        inbasket, inrefcnt = master.inbasket, master.inrefcnt
        if master.numseen == self.numseen:
            # This is the lead dog so get a value through the iterator
            value = master.it.next()
            master.numseen += 1
            # Save it for the other dogs
            inbasket.append(value)
            inrefcnt.append(master.numsplits-1)
            self.numseen += 1
            return value
        # Not a lead dog -- the view never changes :-(
        location = len(inbasket) - (master.numseen - self.numseen)
        if location >= 0:
            # Our food is in the inbasket
            value = inbasket[location]
            inrefcnt[location] -= 1
            rc = inrefcnt[location]
        else:
            # Our food is in the outbasket
            location = -(location + 1)
            value = master.outbasket[location]
            master.outrefcnt[location] -= 1
            rc = master.outrefcnt[location]
        # Purge doggie bowl when no food is left
        if rc == 0:
            if len(master.outbasket) == 0:
                master.outbasket, master.inbasket = master.inbasket, master.outbasket
                master.outrefcnt, master.inrefcnt = master.inrefcnt, master.outrefcnt
                master.outbasket.reverse()
                master.outrefcnt.reverse()
            master.outbasket.pop()
            master.outrefcnt.pop()
        self.numseen += 1
        return value

def tee(it):
    """Replacement for Raymond's tee(); see examples in itertools docs."""
    if not hasattr(it, "__copy__"):
        it = Wrapper(it)
    return (it, it.__copy__())

def test():
    """A simple demonstration of the Wrapper class."""
    import random
    def gen():
        for i in range(10):
            yield i
    it = gen()
    a, b = tee(it)
    b, c = tee(b)
    c, d = tee(c)
    iterators = [a, b, c, d]
    while iterators != [None, None, None, None]:
        i = random.randrange(4)
        it = iterators[i]
        if it is None:
            next = "----"
        else:
            try:
                next = it.next()
            except StopIteration:
                next = "****"
                iterators[i] = None
        print "%4d%s%4s%s" % (i, " ."*i, next, " ."*(3-i))

if __name__ == "__main__":
    test()

From mwh at python.net Mon Oct 27 07:45:31 2003
From: mwh at python.net (Michael Hudson)
Date: Mon Oct 27 07:45:40 2003
Subject: [Python-Dev] PyList API missing PyList_Pop() and PyList_Delete
In-Reply-To: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer> (Raymond Hettinger's message of "Fri, 24 Oct 2003 15:01:09 -0400")
References: <006201c39a61$2f8ed1a0$e841fea9@oemcomputer>
Message-ID: <2m3cden91w.fsf@starship.python.net>

"Raymond Hettinger" writes:

>> PyList_SetSlice(lst, n-1, n, NULL);
>
> There's the new piece of information.  I didn't know that the final
> argument could be NULL and creating/destroying an empty list for the
> arg was unpleasant.  I'll add that info to the API docs.

"del thing" is punned into "set thing NULL" at a pretty low level, and
fairly consistently (hope that made sense...).

Cheers,
mwh

-- 
MAN:  How can I tell that the past isn't a fiction designed to account
for the discrepancy between my immediate physical sensations and my
state of mind?
   -- The Hitch-Hikers Guide to the Galaxy, Episode 12

From guido at python.org Mon Oct 27 09:49:39 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 09:51:30 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 00:12:33 EST." <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
Message-ID: <200310271449.h9REnd026601@12-236-54-216.client.attbi.com>

> My question was more directed toward non-performance issues.  Do we
> really have *any* need for more than two iterators running concurrently?
> After all, it's already difficult to come-up with good use cases for two
> that are not dominated by list() or window().
>
> > With the current tee(), I was thinking that if the two iterators stay
> > close, you end up moving the in basket to the out basket rather
> > frequently, and the overhead of that might beat the simplicity of the
> > linked lists.
> With current tee(), runtime is dominated by calls to Append and Pop
> (reverse is super-fast and moves each element only once).  Those
> calls are more expensive than a link jump; however append() and pop()
> are optimized to avoid calls to the memory manager while every link
> would need steps for alloc/initialization/reference/dealloc.  Cache
> effects are also important because the current tee() uses much less
> memory and the two memory blocks are contiguous.
>
> > Also, *if* you need a lot of clones, using multiple
> > tee() calls ends up creating several queues, again causing more
> > overhead.  (These queues end up together containing all the items from
> > the oldest to the newest iterator.)
>
> *If* we want to support multiple clones, there is an alternate
> implementation of the current tee that only costs one extra word per
> iteration.  That would be in there already.  I really *wanted* a
> multi-way tee but couldn't find a single use case that warranted it.

All points well taken.

> > I also note that the current tee() doesn't let you use __copy__ easily
> > (it would be quite messy I think).
>
> To __copy__ is to tee.  Both make two iterators from one.
> They are different names for the same thing.
> Right now, they don't seem comparable because the current tee is only a
> two way split and you think of copy as being a multi-way split for no
> extra cost.

Here I respectfully differ.  When you tee, you have to stop using the
underlying iterator, and replace it with one of the tee'ed copies.
When you __copy__, you can continue to use the original.  The
difference matters if you're tee'ing an iterator owned by another piece
of code.

> > Maybe Andrew has some use cases?
>
> I hope so.  I can't think of anything that isn't dominated by list(),
> window(), or the current tee().
>
> And, if needed, the current tee() can easily be made multi-way.
It > doubles the unit memory cost from one word to two but that's nothing > compared to the link method (two words for PyHead, another 3 (Linux) or > 4 (Windows) words for GC, and another 3 for the data fields). As you said in your first msg, you could do it with much less overhead if the link cell wasn't made a PyObject. Also, Andrew's suggestion of using a link cell containing an array of values could be explored. But I'll happily back off until we find a use case that needs more than a 2-way tee *and* we find it's a performance bottleneck for your approach. We may never find that. > > The question is, how often does one need this? Have you seen real use > > cases for tee() that aren't better served with list()? > > I'm sure they exist, but they are very few. I was hoping that a simple, > fast, memory efficient two-way tee() would have satisfied all the > requests, but this thing appears to be taking on a life of its own with > folks thinking they need multiple concurrent iterators created by a > magic method (what for?). Well, there was a separate thread about __copy__'ing iterators. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 27 09:53:24 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 09:53:29 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Mon, 27 Oct 2003 08:51:02 +0100." <200310270851.02495.aleaxit@yahoo.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> Message-ID: <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> > But I don't understand how it would be quite messy to take advantage > of this in tee(), either: simply, tee() would start with the equivalent of > it = iter(it) > try: return it, copy.copy(it) > except TypeError:pass > and proceed just like now if this shortcut hasn't worked -- that's all. 
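[Alex's shortcut at the end of the quoted passage sketches easily in today's Python. The tee2 name is hypothetical; note that in current CPython, list and string iterators happen to support copy.copy through their pickle support, while generators raise TypeError, which exercises both paths:]

```python
import copy
from itertools import tee

def tee2(iterable):
    """Return two independent iterators, preferring a cheap copy."""
    it = iter(iterable)
    try:
        return it, copy.copy(it)  # works when the iterator is copyable
    except TypeError:
        return tee(it, 2)         # otherwise fall back to buffering

a, b = tee2([1, 2, 3])            # list iterators are copyable today
assert list(a) == list(b) == [1, 2, 3]

g = (x * x for x in range(3))     # generators are not: tee() path
a, b = tee2(g)
assert list(a) == list(b) == [0, 1, 4]
```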
Right, that's what the tee() at the end of my code did, except it
checked for __copy__ explicitly, since I assume that only iterators
whose author has thought about copyability should be assumed copyable;
this means that the default copy strategy for class instances (classic
and new-style) is suspect.

tee is more and less powerful than copy; it is more powerful because it
works for any iterator, but less so because you can't continue using
the underlying iterator (any calls to its next() method will be lost
for both tee'ed copies).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Mon Oct 27 10:09:03 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 10:09:13 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <20031027103540.GA27782@vicky.ecs.soton.ac.uk>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <20031027103540.GA27782@vicky.ecs.soton.ac.uk>
Message-ID: <200310271609.03819.aleaxit@yahoo.com>

On Monday 27 October 2003 11:35 am, Armin Rigo wrote:
> Hello Alex,
>
> On Mon, Oct 27, 2003 at 08:51:02AM +0100, Alex Martelli wrote:
> > I just got a PEP number (323) for Copyable Iterators as recently
> > discussed,
>
> .. where?

I was assigned the PEP number in email today and just now committed the
PEP (and the update of PEP 0 to list it) to CVS.

> > and hope to commit the PEP within today.  But, basically, the idea is
> > trivially simple: iterators which really have a tiny amount of state,
> > such as those on sequences and dicts, will expose __copy__ and implement
> > it by just duplicating said tiny amount (one pointer to a container and
> > an index).
>
> I needed this for sequence iterators and generators in a recent project.
> Duplicating a user-defined running generator seems funny, but it works
> quite well.
I use this to make a snapshot of the program state and restore > it later, and the program makes heavy use of parallel-running generators > stored in lists. > > http://codespeak.net/svn/user/arigo/misc/statesaver.c Cool! Why don't you try copy.copy on types you don't automatically recognize and know how to deal with, BTW? That might make this cool piece of code general enough that Guido might perhaps allow generator-produced iterators to grow it as their __copy__ method... Alex From guido at python.org Mon Oct 27 10:11:16 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 10:11:23 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Mon, 27 Oct 2003 10:33:56 +0100." <200310271033.56569.aleaxit@yahoo.com> References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com> <200310271033.56569.aleaxit@yahoo.com> Message-ID: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> > My slight preference for leaving += and friends alone is that > a function using them to rebind nonlocals would be hard to > read, that since the change only applies when the LHS is a > bare name the important use cases for augmented assignment > don't apply any way, that it's a bit subtle to explain that > foo.bar += baz > ( += on a dotted name) implies a plain assignment (setattr) > on foo.bar while > foo_bar += baz > ( += on bare name) might imply a := assignment (rebinding > a nonlocal) IF there are no "foo_bar = baz" elsewhere in the > same function BUT it would imply a plain assignment if there > ARE other plain assignments to the same name in the same > function, ... I think you're making this sound more complicated than it is. 
I don't think you'll ever *have* to explain this anyway, as long as :=
and += use the same rules to find their target (I'd even accept
rejecting the case where the target is a global for which the compiler
can't find a single assignment, breaking an utterly minuscule amount of
bad code, if any).

I'm *not* saying that I like := (so far I still like 'global x in f'
better) but I think that either way of allowing rebinding nonlocals
will also have to allow rebinding them through += and friends.

I think the main weakness (for me) of := and other approaches that try
to force you to say you're rebinding a nonlocal each time you do it is
beginning to show: there are already well-established rules for
deciding whether a bare name is local or not, and those rules have
always worked "at a distance".  The main reason for disallowing
rebinding nonlocals in the past has been that one of those rules was
"if there's a bare-name assignment to it it must be local (unless
there's also a global statement for it)" (and I couldn't find a
satisfactory way to add a nonlocal declarative statement and I didn't
think it was a huge miss -- actually I still think it's not a *huge*
miss).

> IOW it seems to me that we're getting into substantial amounts
> of subtlety in explaining (and thus maybe in implementing) a
> functionality change that's not terribly useful anyway and may
> damage rather than improve readability when it's used.
>
> Taking the typical P. Graham accumulator example, say:
> with += rebinding, we can code this:
>
>     def accumulator(n=0):
>         def increment(i):
>             n += i
>             return n
>         return increment
>
> but without it, we would code:
>
>     def accumulator(n=0):
>         def increment(i):
>             n := n + i
>             return n
>         return increment
>
> and it doesn't seem to me that the two extra keystrokes are to
> be considered a substantial price to pay.

That's the argument that has always been used against += by people who
don't like it.
The counterargument is that (a) the savings in typing isn't always that
small, and (b) += *expresses the programmer's thought better*.
Personally I expect that as soon as nonlocal rebinding is supported in
any way, people would be hugely surprised if += and friends were not.

> Admittedly in such a tiny example readability is just as good either
> way, as it's obvious which n we're talking about (there being just
> one, and extremely nearby wrt the point of use of either += or := ).
>
> Suppose we wanted to have the accumulator "saturate" -- if the last
> value it returned was > m it must restart accumulating from zero.
> Now, without augmented assignment:
>
>     def accumulator_saturating(n=0, m=100):
>         def increment(i):
>             if n > m:
>                 n := i
>             else:
>                 n := n + i
>             return n
>         return increment
>
> we have a pleasing symmetry and no risk of errors -- if we mistakenly
> use an = instead of := in either branch the compiler will be able to
> let us know immediately.  (Actually I'd be quite tempted to code the
> if branch as "n := 0 + i" to underscore the symmetry, but maybe I'm
> just weird:-).
>
> If we do rely on augmented assignment being "rebinding":
>
>     def accumulator_saturating(n=0, m=100):
>         def increment(i):
>             if n > m:
>                 n = i
>             else:
>                 n += i
>             return n
>         return increment
>
> the error becomes a runtime rather than compile-time one,
> and does take a (small but non-zero) time to discover it.

Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
declaration.  With 'nonlocal n, m' in increment() (or however it's
spelled :-) the intent is clear.
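[For the record, the declaration Guido sketches here is what later shipped as Python 3's nonlocal statement (PEP 3104); Graham's accumulator then reads:]

```python
def accumulator(n=0):
    def increment(i):
        nonlocal n  # declare once; plain and augmented assignment both rebind
        n += i
        return n
    return increment

acc = accumulator()
assert acc(3) == 3
assert acc(4) == 7
```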
> The > += 's subtle new semantics (rebinds either a local or nonlocal, > depending on how other assignments elsewhere in the > function are coded) do make it slightly harder to understand > and explain, compared to my favourite approach, which is: > := is the ONLY way to rebind a nonlocal name > (and only ever does that, only with a bare name on LHS, > etc, etc) > which can't be beaten in terms of how simple it is to understand > and explain. The compiler could then diagnose an error when it > sees := and += used on the same barename in the same > function (and perhaps give a clear error message suggesting > non-augmented := usage in lieu of the augmented assignment). > > > Can somebody please show a compelling use case for some > "nonlocal += expr" over "nonlocal := nonlocal + expr" , sufficient > to override all the "simplicity" arguments above? I guess there > must be some, since popular feeling appears to be in favour of > having augmented-assignment as "rebinding", but I can't see them. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Mon Oct 27 10:24:01 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 10:24:08 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> Message-ID: <200310271624.01265.aleaxit@yahoo.com> On Monday 27 October 2003 03:53 pm, Guido van Rossum wrote: > > But I don't understand how it would be quite messy to take advantage > > of this in tee(), either: simply, tee() would start with the equivalent > > of it = iter(it) > > try: return it, copy.copy(it) > > except TypeError:pass > > and proceed just like now if this shortcut hasn't worked -- that's all. 
> Right, that's what the tee() at the end of my code did, except it
> checked for __copy__ explicitly, since I assume that only iterators
> whose author has thought about copyability should be assumed copyable;
> this means that the default copy strategy for class instances (classic
> and new-style) is suspect.

I see!  So you want to be more prudent here than an ordinary copy would
be, and also disallow alternatives to __copy__ such as __getinitargs__
or __getstate__/__setstate__ ...?  Could you give an example of an
iterator class, which is "accidentally" copyable, but "shouldn't" be
for purposes of tee only?  I have a hard time thinking of any (hmmm,
perhaps a file object that's not "held" directly as an attribute, but
indirectly in some devious way...?).  Maybe I need to revise the PEP
323, which I just committed (to nondist/peps as usual) accordingly?

> tee is more and less powerful than copy; it is more powerful because
> it works for any iterator, but less so because you can't continue
> using the underlying iterator (any calls to its next() method will be
> lost for both tee'ed copies).

Yes, it IS worth pointing out that the idiom for using tee must always
be

    a, b = tee(c)

and c is not to be used afterwards -- or equivalently

    a, b = tee(a)

when, as common, there are no other references to a (even indirectly
e.g. via somebody holding on to a ref to a.next).  Hmmm, I wonder if
that should go in my PEP, though, since it's more about tee than about
copy...?

Alex

From guido at python.org Mon Oct 27 10:34:59 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 10:37:34 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 16:24:01 +0100."
<200310271624.01265.aleaxit@yahoo.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <200310271453.h9RErOT26617@12-236-54-216.client.attbi.com> <200310271624.01265.aleaxit@yahoo.com> Message-ID: <200310271535.h9RFYxm26796@12-236-54-216.client.attbi.com> > I see! So you want to be more prudent here than an ordinary copy > would be, and also disallow alternatives to __copy__ such as > __getinitargs__ or __getstate__/__setstate__ ...? Could you give > an example of an iterator class, which is "accidentally" copyable, but > "shouldn't" be for purposes of tee only? We discussed this before: if the state representing the iterator's position is a mutable object, copy.copy() will not copy this mutable object, so the two would share their state (or, more likely, part of their state). The example would be a tree iterator using a stack, represented as a list. > Yes, it IS worth pointing out that the idiom for using tee must > always be > a, b = tee(c) > and c is not to be used afterwards -- or equivalently > a, b = tee(a) > when, as common, there are no other references to a (even > indirectly e.g. via somebody holding on to a ref to a.next). Hmmm, > I wonder if that should go in my PEP, though, since it's more about tee > than about copy...? I think Raymond should add this to the tee() docs in big bold print. --Guido van Rossum (home page: http://www.python.org/~guido/) From amk at amk.ca Mon Oct 27 11:02:21 2003 From: amk at amk.ca (amk@amk.ca) Date: Mon Oct 27 11:05:46 2003 Subject: [Python-Dev] htmllib vs. HTMLParser Message-ID: <20031027160221.GA29155@rogue.amk.ca> Over in the Web SIG, it was noted that the HTML parser in htmllib has handlers for HTML 2.0 elements, and it should really support HTML 4.01, the current version. I'm looking into doing this. We actually have two HTML parsers: htmllib.py and the more recent HTMLParser.py. 
The initial check-in comment for 2001/05/18 for HTMLParser.py reads:

    A much improved HTML parser -- a replacement for sgmllib.  The API
    is derived from but not quite compatible with that of sgmllib, so
    it's a new file.  I suppose it needs documentation, and htmllib
    needs to be changed to use this instead of sgmllib, and sgmllib
    needs to be declared obsolete.  But that can all be done later.

sgmllib only handles those bits of SGML needed for HTML, and anyone
doing serious SGML work is going to have to use a real SGML parser, so
deprecating sgmllib is reasonable.  HTMLParser needs no changes for
HTML 4.01; only htmllib needs to get a bunch more handler methods.
Should I try to do this for 2.4?  (I can't find an explanation of how
the API differs between the two modules, but I can figure it out by
inspecting the code, and will try to keep the htmllib module
backward-compatible.)

--amk

From FBatista at uniFON.com.ar Mon Oct 27 11:10:47 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Mon Oct 27 11:11:59 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID:

To my (nice) surprise, all the testCases of Decimal.py ran OK.  These
tests were all about the specification (the ugly side, :) and not about
using the class.

For instance, you can do::

    x = Decimal(3) / 5

and it gets done all right (according to the test cases of Mike
Cowlishaw).  But you can't do::

    x = 5 / Decimal(3)

So, here is a tentative list of ToDo for myself:

1. Clean up unused code, reorder methods (all publics together, etc).
2. Put some repeated code inside functions.
3. Write a pre-PEP.
4. Write testCases for the functionality specified by the pre-PEP.
5. Write the code to comply with the testCases.
6. Write the PEP.
7. Submit everything.

Some questions:

- Is there some of this work (especially the third item) already done
  or started?
- Should I submit partial work or everything as a whole?
- Are modifications to the sandbox modules considered patches?  Should
  I send them through the SourceForge interface?
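[The behavior Facundo is after did land: in the decimal module that grew out of this work (PEP 327), mixed int/Decimal arithmetic works in both operand orders. A quick check in today's Python:]

```python
from decimal import Decimal

# Both operand orders now work; ints are converted to Decimal exactly.
assert Decimal(3) / 5 == Decimal("0.6")
assert 5 / Decimal(3) == Decimal(5) / Decimal(3)
```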
As always, suggestions and similars are welcomed (and very
appreciated).  Thank you.

Facundo Batista
Gestión de Red
fbatista@unifon.com.ar
(54 11) 5130-4643
Cel: 15 5132 0132
From neal at metaslash.com Mon Oct 27 11:12:02 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Mon Oct 27 11:12:11 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com> <200310271033.56569.aleaxit@yahoo.com> <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
Message-ID: <20031027161202.GF5842@epoch.metaslash.com>

On Mon, Oct 27, 2003 at 07:11:16AM -0800, Guido van Rossum wrote:
>
> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
> declaration.  With 'nonlocal n, m' in increment() (or however it's
> spelled :-) the intent is clear.

I dislike := very much.  I think it will confuse newbies and thus be
abused.  While I dislike the global declaration, I don't feel strongly
about changing or removing it.

The best alternative I've seen that addresses nested scope and the
global declaration is to borrow :: from C++:

    foo = DEFAULT_VALUES
    counter = 0

    def reset_foo():
        ::foo = DEFAULT_VALUES

    def inc_counter():
        ::counter += 1

    def outer():
        counter = 5
        def inner():
            ::counter += outer::counter  # increment global from outer
            outer::counter += 2          # increment outer counter

The reasons why I like this approach:

* each variable reference can be explicit when necessary
* no separate declaration
* concise, no wording issues like global
* similarity between global and nested scopes (ie, ::foo is global,
  scope::foo is some outer scope); both the global and nested issues
  are handled at once
* doesn't prevent augmented assignment
* it reads well to me and the semantics are pretty clear (although
  that's highly subjective)

Neal
Is to borrow :: from C++: foo = DEFAULT_VALUES counter = 0 def reset_foo(): ::foo = DEFAULT_VALUES def inc_counter(): ::counter += 1 def outer(): counter = 5 def inner(): ::counter += outer::counter # increment global from outer outer::counter += 2 # increment outer counter The reasons why I like this approach: * each variable reference can be explicit when necessary * no separate declaration * concise, no wording issues like global * similarity between global and nested scopes (ie, ::foo is global, scope::foo is some outer scope) both the global and nested issues are handled at once * doesn't prevent augmented assignment * it reads well to me and the semantics are pretty clear (although that's highly subjective) Neal From aleaxit at yahoo.com Mon Oct 27 11:20:10 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 11:20:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz> <200310271033.56569.aleaxit@yahoo.com> <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> Message-ID: <200310271720.10313.aleaxit@yahoo.com> On Monday 27 October 2003 04:11 pm, Guido van Rossum wrote: ... > I don't think you'll ever *have* to explain this anyway, as long as := > and += use the same rules to find their target (I'd even accept Actually, I'd like to make a := ... an error when there's an a = ... in the same function, so it can't be exactly the same rules for a += ... in my opinion. > I'm *not* saying that I like := (so far I still like 'global x in f' Ah well. 
> I think the main weakness (for me) of := and other approaches that try > to force you to say you're rebinding a nonlocal each time you do it is > beginning to show: there are already well-established rules for > deciding whether a bare name is local or not, and those rules have There are, but they represent a wart (according to AMK's python-warts page, http://www.amk.ca/python/writing/warts.html , and I agree with him on this, although NOT with his suggested fix of having the compiler "automatically adding a global when needed" -- I don't like too-clever compilers that make subtle inferences behind my back, and I think that the fact that Python's compiler doesn't is a strength, not a weakness). The "well-established rules" also cause one of the "10 Python pitfalls" listed at http://zephyrfalcon.org/labs/python_pitfalls.html . My personal experience teaching/consulting/mentoring confirms this, although I, personally, don't remember having been bitten by this (but then, I recall only 2 of those 10 pitfalls as giving trouble to me personally, as opposed to people I taught/advised/etc: mutable default arguments, and "loops of x=x+y" performance traps for sequences). It seemed to me that introducing := (or other approaches that require explicit denotation of "I'm binding a nonlocal here") was a chance to FIX the warts/pitfalls of those "already well-established rules". Albeit with a heavy heart, I would consider even a Rubyesque stropping of nonlocals (Ruby uses $foo to mean foo is nonlocal, others here have suggested :foo, whatever, it's not the sugar that matters most to me here) preferable to using "declarative statements" for the purpose. Oh well. > always worked "at a distance". 
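[One of the two pitfalls Alex names, mutable default arguments, is quickly demonstrated; a generic sketch, not code from the thread:]

```python
def append_to(item, acc=[]):    # the [] is evaluated once, at def time
    acc.append(item)
    return acc

assert append_to(1) == [1]
assert append_to(2) == [1, 2]   # the same default list again: the classic surprise

# The usual fix: default to None and create a fresh list per call.
def append_to_fixed(item, acc=None):
    if acc is None:
        acc = []
    acc.append(item)
    return acc

assert append_to_fixed(1) == [1]
assert append_to_fixed(2) == [2]
```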
The main reason for disallowing > rebinding nonlocals in the past has been that one of those rules was > "if there's a bare-name assignment to it it must be local (unless > there's also a global statement for it)" (and I couldn't find a > satisfactory way to add a nonlocal declarative statement and I didn't > think it was a huge miss -- actually I still think it's not a *huge* > miss). Agreed, not huge, just probably marginally worth doing. Should it make "declarative statements" more popular and widely used than today's bare "global", I don't even know if it would be worth it. I don't like declarative statements. I don't understand why you like them here, when, in your message of Thursday 23 October 2003 06:25:49 on "accumulator display syntax", you condemned a proposal "because it feels very strongly like a directive to the compiler". "A directive to the compiler" is exactly how "global" and other proposed declarative-statements feel to me: statements that don't DO things (like all other statements do), but strictly and only are "like a directive to the compiler". > > and it doesn't seem to me that the two extra keystrokes are to > > be considered a substantial price to pay. > > That's the argument that has always been used against += by people who > don't like it. The counterargument is that (a) the savings in typing > isn't always that small, and (b) += *expresses the programmer's The saving in typing is not always small _when on the left of the augmented assignment operator you have something much more complicated than just a bare name_. 
For example,

    counter[current_row + current_column * delta] += current_value

Without += this statement would be too long, and it would be hard to
check that the LHS and RHS match exactly -- in practice one would end
up breaking it in two,

    current_index = current_row + current_column * delta
    counter[current_index] = counter[current_index] + current_value

which IS still substantially more cumbersome than the previous version
using += .  But this counterargument does not apply to uses of += on
bare names: the saving is strictly limited to the length of the bare
name, which should be reasonably small.

> thought better*.  Personally I expect that as soon as nonlocal
> rebinding is supported in any way, people would be hugely surprised if
> += and friends were not.

We could try an opinion poll, but it's probably worth it only if this
measure of "expected surprise" was the key point for your decision; if
you're going to prefer declarative statements anyway, there's no point
going through the aggravation.

> > If we do rely on augmented assignment being "rebinding":
> >
> >     def accumulator_saturating(n=0, m=100):
> >         def increment(i):
> >             if n > m:
> >                 n = i
> >             else:
> >                 n += i
> >             return n
> >         return increment
> >
> > the error becomes a runtime rather than compile-time one,
> > and does take a (small but non-zero) time to discover it.
>
> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
> declaration.  With 'nonlocal n, m' in increment() (or however it's
> spelled :-) the intent is clear.

I disagree that the example is "an argument for declarations": on the
contrary, it's an argument for := without "rebinding +=".  The
erroneous example just reposted gives a runtime error anyway (I don't
know why I wrote it would give a compile-time error -- just like a bare
"def f(): x+=1" doesn't give a compile-time error today, so,
presumably, wouldn't this reposted example).
If "n := n + i" WAS used in lieu of the augmented assignment, THEN --
and only then -- could we give the preferable compile-time error, for
forbidden mixing of "n = ..." and "n := ..." in different spots in the
same function.


Alex

From guido at python.org  Mon Oct 27 11:51:16 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 11:51:34 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:12:02 EST."
	<20031027161202.GF5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
Message-ID: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>

The only problem with using :: is a syntactic ambiguity:

    a[x::y]

already means something (an extended slice with start=x, no stop, and
step=y).

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 27 11:52:53 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 11:53:18 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:02:21 EST."
	<20031027160221.GA29155@rogue.amk.ca>
References: <20031027160221.GA29155@rogue.amk.ca>
Message-ID: <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>

> Over in the Web SIG, it was noted that the HTML parser in htmllib has
> handlers for HTML 2.0 elements, and it should really support HTML 4.01, the
> current version.  I'm looking into doing this.
>
> We actually have two HTML parsers: htmllib.py and the more recent
> HTMLParser.py.  The initial check-in comment for 2001/05/18 for
> HTMLParser.py reads:
>
>     A much improved HTML parser -- a replacement for sgmllib.  The API is
>     derived from but not quite compatible with that of sgmllib, so it's a
>     new file.
>     I suppose it needs documentation, and htmllib needs to be
>     changed to use this instead of sgmllib, and sgmllib needs to be
>     declared obsolete.  But that can all be done later.
>
> sgmllib only handles those bits of SGML needed for HTML, and anyone doing
> serious SGML work is going to have to use a real SGML parser, so deprecating
> sgmllib is reasonable.  HTMLParser needs no changes for HTML 4.01; only
> htmllib needs to get a bunch more handler methods.
>
> Should I try to do this for 2.4?

I'm unclear on what you plan to do -- repeal sgmllib and rewrite
htmllib to use HTMLParser internally for a backwards compatible
interface?

> (I can't find an explanation of how the API differs between the two modules
> but can figure it out by inspecting the code, and will try to keep the
> htmllib module backward-compatible.)

That would be required for a few releases, yes.  I'm okay with
deprecating sgmllib faster than htmllib.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From neal at metaslash.com  Mon Oct 27 12:08:50 2003
From: neal at metaslash.com (Neal Norwitz)
Date: Mon Oct 27 12:09:00 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
	<200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
Message-ID: <20031027170850.GG5842@epoch.metaslash.com>

On Mon, Oct 27, 2003 at 08:51:16AM -0800, Guido van Rossum wrote:
> The only problem with using :: is a syntactic ambiguity:
>
>     a[x::y]
>
> already means something (an extended slice with start=x, no stop, and
> step=y).

I'm not wedded to the :: digraph, I prefer the concept.  :: was nice
because it re-used a similar concept from C++.  No other digraph jumps
out at me.
Some other possibilities (I don't care for any of these):

    Global          Nested
    ------          ------
    :>variable      scope:>variable
    *>variable      scope*>variable
    ->variable      scope->variable
    ?>variable      scope?>variable
    &>variable      scope&>variable

Or perhaps variations using <.

Neal

From aleaxit at yahoo.com  Mon Oct 27 12:20:14 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Mon Oct 27 12:20:50 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031027170850.GG5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com>
	<20031027170850.GG5842@epoch.metaslash.com>
Message-ID: <200310271820.15001.aleaxit@yahoo.com>

On Monday 27 October 2003 06:08 pm, Neal Norwitz wrote:
> On Mon, Oct 27, 2003 at 08:51:16AM -0800, Guido van Rossum wrote:
> > The only problem with using :: is a syntactic ambiguity:
> >
> >     a[x::y]
> >
> > already means something (an extended slice with start=x, no stop, and
> > step=y).
>
> I'm not wedded to the :: digraph, I prefer the concept.  :: was nice
> because it re-used a similar concept from C++.  No other digraph jumps

Does it have to be a digraph?  We could use one of the ASCII chars
Python doesn't use.  For example, $ would give us exactly the same way
as Ruby to strop global variables (though, differently from Ruby, we'd
only _have_ to strop them on rebinding -- more-common "read" accesses
would stay clean) -- $variable meaning 'global'.  And scope$variable
meaning 'outer'.

OTOH, if we used @ instead, it would read better the other way 'round
-- variable@scope DOES look like a pretty natural way to indicate
"said variable at said scope" -- though it doesn't read quite as well
_without_ a scope.
Alex

From pedronis at bluewin.ch  Mon Oct 27 12:23:28 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Mon Oct 27 12:21:28 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
Message-ID: <5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>

At 07:11 27.10.2003 -0800, Guido van Rossum wrote:
>I'm *not* saying that I like := (so far I still like 'global x in f'
>better)

if I understand 'global x in f' will introduce a local x in f even if
there is none, for symmetry with global.  Maybe this has already been
answered (this thread is getting too long, and is this change scheduled
for 2.4 or 3.0?) but

x = 'global'

def f():
    def init():
        global x in f
        x = 'in f'
    init()
    print x

f()

will this print 'global' or 'in f' ?  I can argument both ways which is
not a good thing.

Thanks.

From skip at pobox.com  Mon Oct 27 12:23:40 2003
From: skip at pobox.com (Skip Montanaro)
Date: Mon Oct 27 12:23:50 2003
Subject: [Python-Dev] Let's table the discussion of replacing 'global'
Message-ID: <16285.21660.432100.124214@montanaro.dyndns.org>

[ on changing Python's global variable access mechanisms ]

I'm going to make a suggestion.  Let's shelve this topic for the time
being and simply summarize the issues in an informational PEP aimed at
Py3k.  We don't even know (at least I don't) if we want an implicit
search for outer scope variables or an explicit specification of which
scope such variables should be defined in.  If, for some reason, nested
scopes make a quick exit in Py3k, this would all be moot anyway.  It's
not clear nested scopes really offer anything to Python other than
muddled semantics and a more complex virtual machine implementation.
Skip

From guido at python.org  Mon Oct 27 12:28:33 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 12:28:41 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 18:23:28 +0100."
	<5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<5.2.1.1.0.20031027181727.027d5b00@pop.bluewin.ch>
Message-ID: <200310271728.h9RHSXl27167@12-236-54-216.client.attbi.com>

> if I understand 'global x in f' will introduce a local x in f even if there
> is none, for symmetry with global.  Maybe this has already been answered
> (this thread is getting too long, and is this change scheduled for 2.4 or
> 3.0?) but
>
> x = 'global'
>
> def f():
>     def init():
>         global x in f
>         x = 'in f'
>     init()
>     print x
>
> f()
>
> will this print 'global' or 'in f' ?  I can argument both ways which is not
> a good thing.

The compiler does a full analysis so it will know that init() refers
to a cell for x in f's locals, and hence it will print 'in f'.  For
the purposes of deciding which variables live where, the presence of
'global x in f' inside an inner function (whether or not there's a
matching assignment) is equivalent to the presence of an assignment to
x in f's body.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Mon Oct 27 12:29:49 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 12:30:33 2003
Subject: [Python-Dev] Let's table the discussion of replacing 'global'
In-Reply-To: Your message of "Mon, 27 Oct 2003 11:23:40 CST."
	<16285.21660.432100.124214@montanaro.dyndns.org>
References: <16285.21660.432100.124214@montanaro.dyndns.org>
Message-ID: <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com>

> I'm going to make a suggestion.
Let's shelve this topic for the time being > and simply summarize the issues in an informational PEP aimed at > Py3k. Great idea. I'm getting tired of it too; Alex and I don't seem to be getting an inch closer to each other. > We don't even know (at least I don't) if we want an implicit search > for outer scope variables or an explicit specification of which > scope such variables should be defined in. If, for some reason, > nested scopes make a quick exit in Py3k, this would all be moot > anyway. Sorry to disappoint you, but nested scopes aren't going away. --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Mon Oct 27 12:34:48 2003 From: aahz at pythoncraft.com (Aahz) Date: Mon Oct 27 12:34:51 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: References: Message-ID: <20031027173448.GA17544@panix.com> On Mon, Oct 27, 2003, Batista, Facundo wrote: > > Some questions: > > - Is there some of this work (specially the third item) already done or > started? > - Should I submit partial work or everything as a whole? > - Modifications to the sandbox modules, are considered patches? Should I > send them through SourceForge interface? The first thing you should do is talk with Eric Price (eprice@tjhsst.edu), author of the code. You don't need to use SF for now; CVS should be fine, but you should find out whether Eric would like to approve changes first. There's no reason you can't start with a pre-PEP now; I'd focus on interface (i.e. the question of what ``Decimal(5)/3`` and ``5/Decimal(3)`` should do -- my personal take at this point is that both ought to fail). -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." 
--Bill Harlan

From walter at livinglogic.de  Mon Oct 27 12:42:40 2003
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Mon Oct 27 12:43:03 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <20031027161202.GF5842@epoch.metaslash.com>
References: <200310270328.h9R3S8S13912@oma.cosc.canterbury.ac.nz>
	<200310270358.h9R3wJE25871@12-236-54-216.client.attbi.com>
	<200310271033.56569.aleaxit@yahoo.com>
	<200310271511.h9RFBG926665@12-236-54-216.client.attbi.com>
	<20031027161202.GF5842@epoch.metaslash.com>
Message-ID: <3F9D5910.9050001@livinglogic.de>

Neal Norwitz wrote:
> On Mon, Oct 27, 2003 at 07:11:16AM -0800, Guido van Rossum wrote:
>
>> Hah.  Another argument *against* rebinding by :=, and *for* a nonlocal
>> declaration.  With 'nonlocal n, m' in increment() (or however it's
>> spelled :-) the intent is clear.
>
> I dislike := very much.  I think it will confuse newbies and thus be
> abused.  While I dislike the global declaration, I don't feel strongly
> about changing or removing it.

I think ':=' is too close to '='.  The default assignment should be
much easier to type than the special case.  Otherwise I'd have to think
about which one I'd like to use every time I type an assignment.

Bye,
   Walter Dörwald

From guido at python.org  Mon Oct 27 13:00:09 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 13:03:17 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Mon, 27 Oct 2003 06:24:57 EST."
	<001201c39c7c$f4582140$81b0958d@oemcomputer>
References: <001201c39c7c$f4582140$81b0958d@oemcomputer>
Message-ID: <200310271800.h9RI09E27291@12-236-54-216.client.attbi.com>

> As a proof-of-concept, here is GvR's code re-cast with the queue changed
> to a double stack implementation.  The interface is completely
> unchanged.  The memory consumed is double that of the current tee() but
> much less than the linked list version.  The speed is half that of the
The speed is half that of the > current tee() and roughly comparable to or slightly better than the > linked list version. Actually, if I up the range() in the gen() function to range(10000) and drop the print statement, the Python version of your code runs about 20% slower than mine. But this says nothing about the relative speed of C implementations. --Guido van Rossum (home page: http://www.python.org/~guido/) From just at letterror.com Mon Oct 27 13:00:55 2003 From: just at letterror.com (Just van Rossum) Date: Mon Oct 27 13:03:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271651.h9RGpGe27042@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum wrote: > The only problem with using :: is a syntactic ambiguity: > > a[x::y] > > already means something (an extended slice with start=x, no stop, and > step=y). On the other hand: a[x y] doesn't mean anything, so I don't see an immediate problem. I like Neal's proposal, including the "::" digraph. Just From aleaxit at yahoo.com Mon Oct 27 13:40:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 13:41:43 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: References: Message-ID: <200310271940.31530.aleaxit@yahoo.com> On Monday 27 October 2003 07:00 pm, Just van Rossum wrote: > Guido van Rossum wrote: > > The only problem with using :: is a syntactic ambiguity: > > > > a[x::y] > > > > already means something (an extended slice with start=x, no stop, and > > step=y). > > On the other hand: > > a[x y] > > doesn't mean anything, so I don't see an immediate problem. Sorry, just, but I really don't understand the "don't see immediate problem". As I understand the proposal: y = 23 biglist = range(999) def f(): y = 45 # sets a local ::y = 67 # sets the global print biglist[::y] should this print the 67-th item of biglist, or the first 45 ones? 
a[x::y] is similarly made ambiguous (slice from x step y, or index at y in scope x?), at least for human readers if not for the compiler -- to have the same expression mean either thing depending on whether x names an outer function, a local variable, or neither, or both, for example, would seem very confusing to me. > I like Neal's proposal, including the "::" digraph. I just don't see how :: can be used nonconfusingly due to the 'clash' with "slicing with explicit step and without explicit stop" (ambiguity with slices with implicit 0 start for prefix use, a la ::y -- ambiguity with slices with explicit start for infix use, a la x::y). A digraph, single character, or other operator that could be used (and look nice) in lieu of :: either prefix or infix -- aka "stropping by any other name", even though the syntax sugar may look different from Ruby's use of prefix $ to strop globals -- would be fine. But I don't think :: can be it. Alex From FBatista at uniFON.com.ar Mon Oct 27 13:46:54 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Mon Oct 27 13:48:03 2003 Subject: [Python-Dev] Decimal.py in sandbox Message-ID: Aahz wrote: #- The first thing you should do is talk with Eric Price #- (eprice@tjhsst.edu), author of the code. You don't need to #- use SF for #- now; CVS should be fine, but you should find out whether #- Eric would like #- to approve changes first. OK, I'll mail him. #- There's no reason you can't start with a pre-PEP now; I'd focus on #- interface (i.e. the question of what ``Decimal(5)/3`` and #- ``5/Decimal(3)`` should do -- my personal take at this point is that #- both ought to fail). Well, there's wide discussion about this when I posted the pre-PEP of Money. 
The reasoning of the majority is that when two operands are of
different type, the less general must be converted to the more general
one:

>>> myint = 5
>>> myfloat = 3.0
>>> mywhat = myint + myfloat
>>> type(mywhat)
<type 'float'>

With this in mind, the behaviour would be:

>>> myDecimal = Decimal(5)
>>> myfloat = 3.0
>>> mywhat = myDecimal + myfloat
>>> isinstance(mywhat, float)
True

and

>>> myDecimal = Decimal(5)
>>> myint = 3
>>> mywhat = myint + myDecimal
>>> isinstance(mywhat, Decimal)
True

but I really don't know if the first behaviour should be extended to
the latter two.  Anyway, I'll post the pre-PEP and we all should see, :)

Thanks.

.    Facundo

From amk at amk.ca  Mon Oct 27 13:54:52 2003
From: amk at amk.ca (amk@amk.ca)
Date: Mon Oct 27 13:55:05 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>
References: <20031027160221.GA29155@rogue.amk.ca>
	<200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com>
Message-ID: <20031027185452.GA29897@rogue.amk.ca>

On Mon, Oct 27, 2003 at 08:52:53AM -0800, Guido van Rossum wrote:
> I'm unclear on what you plan to do -- repeal sgmllib and rewrite
> htmllib to use HTMLParser internally for a backwards compatible
> interface?

Correct; that's what your initial checkin message for HTMLParser.py
suggests doing, and if I'm touching htmllib.py to add the HTML 4.01
stuff, I may as well make the other change, too.

> I'm okay with deprecating sgmllib faster than htmllib.

sgmllib gets deprecated; htmllib never gets deprecated.  HTMLParser is
a barebones HTML parser that provides no default handlers (handle_head,
handle_title, etc.), and htmllib extends it, adding default handlers
for the various things in HTML 4.01.

--amk

From guido at python.org  Mon Oct 27 14:08:48 2003
From: guido at python.org (Guido van Rossum)
Date: Mon Oct 27 14:09:00 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Mon, 27 Oct 2003 13:54:52 EST."
<20031027185452.GA29897@rogue.amk.ca> References: <20031027160221.GA29155@rogue.amk.ca> <200310271652.h9RGqrN27054@12-236-54-216.client.attbi.com> <20031027185452.GA29897@rogue.amk.ca> Message-ID: <200310271908.h9RJ8mX27413@12-236-54-216.client.attbi.com> > On Mon, Oct 27, 2003 at 08:52:53AM -0800, Guido van Rossum wrote: > > I'm unclear on what you plan to do -- repeal sgmllib an rewrite > > htmllib to use HTMLParser internally for a backwards compatible > > interface? > > Correct; that's what your initial checkin message for HTMLParser.py suggests > doing, and if I'm touching htmllib.py to add the HTML 4.01 stuff, I may as > well make the other change, too. > > > I'm okay with deprecating sgmllib faster than htmllib. > > sgmllib gets deprecated; htmllib never gets deprecated. HTMLParser is a > barebones HTML parser that provides no default handlers (handle_head, > handle_title, etc.), and htmllib extends it, adding default handlers for the > various things in HTML 4.01. OK, got it. Sounds good to me! --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Mon Oct 27 14:26:20 2003 From: mwh at python.net (Michael Hudson) Date: Mon Oct 27 14:26:25 2003 Subject: [Python-Dev] tests expecting but not finding errors due to bug fixes In-Reply-To: <200310251352.13266.aleaxit@yahoo.com> (Alex Martelli's message of "Sat, 25 Oct 2003 13:52:13 +0200") References: <200310251352.13266.aleaxit@yahoo.com> Message-ID: <2mr80ylbxf.fsf@starship.python.net> Alex Martelli writes: > Switching to the 2.3 maintenance branch (where test_bsdddb runs just fine), > I got "make test" failures on test_re.py. Turns out that the 2.3-branch > test_re.py was apparently not updated when the RE recursion bug was > fixed -- it still expects a couple of exceptions to be raised and they don't > get raised any more because the bugfix itself WAS backported. 
> > On general principles, in cases of this ilk, IS it all right to just backport > the corrected unit-test (from the 2.4 to the 2.3 branch) and commit the > fix, or should one be more circumspect about it...? I'd say go for it. It sounds like just a partially missed backport (and someone checking things in without running make test, tsk). Cheers, mwh -- Roll on a game of competetive offence-taking. -- Dan Sheppard, ucam.chat From just at letterror.com Mon Oct 27 14:28:24 2003 From: just at letterror.com (Just van Rossum) Date: Mon Oct 27 14:28:24 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271940.31530.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > Sorry, just, but I really don't understand the "don't see immediate > problem". [ ... ] > print biglist[::y] Well, that's the part I didn't see yet, so there :) Just From tjreedy at udel.edu Mon Oct 27 14:47:41 2003 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Oct 27 14:47:07 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() References: <20031026195515.GA30335@cthulhu.gerg.ca><200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> <3F9C3FB0.8050206@v.loewis.de> Message-ID: "Martin v. Löwis" wrote in message news:3F9C3FB0.8050206@v.loewis.de... > Guido van Rossum wrote: > > Luckily I wasn't taught formal writing :-), and I don't see why it > > can't be doesn't. I'd say that if you want Python's error messages to > > be formal writing, you'd have to change a lot more than just the > > one... :-) > > OTOH, I would always yield to native speakers in such issues. To me > myself, it does not matter much, but if native speakers feel happier > one way or the other, I'd like to help them feel happy :-) To add a native-speaker datapoint: I am old enough to remember being taught the same as Greg. (However, American stylistic conventions have tended to get looser since then.) 
I also remember going through manuscripts to get rid of contractions prior to submission for publication. Given the overloading of apostrophe both in English and Python, I think 'does not' looks slightly better than "doesn't" (which saves only one character and forces a change in quote marks!). So does consistency versus accidental variation ;-) Terry J. Reedy From guido at python.org Mon Oct 27 14:54:02 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 14:54:14 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: Your message of "Mon, 27 Oct 2003 14:47:41 EST." References: <20031026195515.GA30335@cthulhu.gerg.ca><200310262050.h9QKoI825552@12-236-54-216.client.attbi.com> <3F9C3FB0.8050206@v.loewis.de> Message-ID: <200310271954.h9RJs2i27571@12-236-54-216.client.attbi.com> > Given the overloading of apostrophe both in English and Python, I > think 'does not' looks slightly better than "doesn't" (which saves > only one character and forces a change in quote marks!). So does > consistency versus accidental variation ;-) So who's going to change all the other occurrences of "doesn't" and other contractions in error messages? --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Mon Oct 27 14:59:00 2003 From: skip at pobox.com (Skip Montanaro) Date: Mon Oct 27 14:59:11 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <1067283859.8566.633.camel@localhost.localdomain> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> <1067283859.8566.633.camel@localhost.localdomain> Message-ID: <16285.30980.869021.894625@montanaro.dyndns.org> Jeremy> I haven't had time to participate in this thread -- too much Jeremy> real work for the last several days -- but I'd be happy to write Jeremy> a PEP that summarizes the issues. Thank you. 
I was trying to figure out where I was going to find the time. Feel free to ask me for inputs or an outline (or if you continue in your too busy ways I'll try to whip something up). Skip From tim.one at comcast.net Mon Oct 27 15:00:26 2003 From: tim.one at comcast.net (Tim Peters) Date: Mon Oct 27 15:00:31 2003 Subject: [Python-Dev] Re: Inconsistent error messages in Py{Object, Sequence}_SetItem() In-Reply-To: <200310271954.h9RJs2i27571@12-236-54-216.client.attbi.com> Message-ID: [Guido] > So who's going to change all the other occurrences of "doesn't" and > other contractions in error messages? I hope nobody -- it's about as silly a crusade as trying to find a way to make "$" mean "non-local" <0.5 wink>. and-that-wouldn't-read-better-as-"it-is-about-as-silly"-ly y'rs - tim From jeremy at alum.mit.edu Mon Oct 27 14:44:21 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Mon Oct 27 15:33:09 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> Message-ID: <1067283859.8566.633.camel@localhost.localdomain> On Mon, 2003-10-27 at 12:29, Guido van Rossum wrote: > > I'm going to make a suggestion. Let's shelve this topic for the time being > > and simply summarize the issues in an informational PEP aimed at > > Py3k. > > Great idea. I'm getting tired of it too; Alex and I don't seem to be > getting an inch closer to each other. > > > We don't even know (at least I don't) if we want an implicit search > > for outer scope variables or an explicit specification of which > > scope such variables should be defined in. If, for some reason, > > nested scopes make a quick exit in Py3k, this would all be moot > > anyway. > > Sorry to disappoint you, but nested scopes aren't going away. 
I haven't had time to participate in this thread -- too much real work
for the last several days -- but I'd be happy to write a PEP that
summarizes the issues.

Jeremy

From python at rcn.com  Mon Oct 27 16:13:26 2003
From: python at rcn.com (Raymond Hettinger)
Date: Mon Oct 27 16:14:25 2003
Subject: [Python-Dev] PEP 289: Generator Expressions
In-Reply-To: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com>
Message-ID: <004701c39ccf$29ef0740$81b0958d@oemcomputer>

[GvR]
> Raymond, please take this to c.l.py for feedback!  Wear asbestos. :-)
>
> I'm sure there will be plenty of misunderstandings in the discussion
> there.  If these are due to lack of detail or clarity in the PEP, feel
> free to update the PEP.  If there are questions that need us to go
> back to the drawing board or requiring BDFL pronouncement, take it
> back to python-dev.

The asbestos wasn't needed :-)  Overall the PEP is being well received.
The discussion has been uncontentious and light (around 50-55 posts).

Several people initially thought that lambda should be part of the
syntax, but other respondents quickly laid that to rest.  Many posters
were succinctly positive: "+1" or "great idea".

One skeptical response came from someone who didn't like list
comprehensions either.  Alex quickly pointed out that they have been
"wildly successful" for advanced users and newbies alike.  One poster
counter-suggested a weird regex style syntax for embedding Perl
expressions.  The newsgroup was very kind and no one called him wacko :-)

There was occasional discussion about the parentheses requirement but
that was quickly settled also.  One idea that had some merit was to not
require the outer parentheses for a single expression on the rhs of an
assignment:

    g = (x**2 for x in range(10))   # maybe the outer parens are not needed

The discussion is winding down and there are no unresolved questions.

Raymond Hettinger

From pje at telecommunity.com  Mon Oct 27 16:20:26 2003
From: pje at telecommunity.com (Phillip J.
Eby)
Date: Mon Oct 27 16:21:48 2003
Subject: [Python-Dev] PEP 289: Generator Expressions
In-Reply-To: <004701c39ccf$29ef0740$81b0958d@oemcomputer>
References: <200310230642.h9N6gCN01979@12-236-54-216.client.attbi.com>
Message-ID: <5.1.1.6.0.20031027161728.01f6c680@telecommunity.com>

At 04:13 PM 10/27/03 -0500, Raymond Hettinger wrote:
>There was occasional discussion about the parentheses requirement but
>that was quickly settled also.  One idea that had some merit was to not
>require the outer parentheses for a single expression on the rhs of an
>assignment:
>
>    g = (x**2 for x in range(10))   # maybe the outer parens are not
>needed

FWIW, I think the parentheses add clarity over e.g.

    g = x**2 for x in range(10)

As this latter formulation looks to me like g will equal 81 after the
statement is executed.

From sholden at holdenweb.com  Mon Oct 27 16:25:56 2003
From: sholden at holdenweb.com (Steve Holden)
Date: Mon Oct 27 16:31:01 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <87k76rhnn6.fsf@egil.codesourcery.com>
Message-ID: 

> -----Original Message-----
> From: python-dev-bounces+sholden=holdenweb.com@python.org
> [mailto:python-dev-bounces+sholden=holdenweb.com@python.org]On Behalf Of
> Zack Weinberg
> Sent: Sunday, October 26, 2003 1:15 PM
> To: python-dev
> Subject: [Python-Dev] Alternate notation for global variable
> assignments
>
>
> I like Just's := concept except for the similarity to =, and I worry
> that the presence of := in the language will flip people into "Pascal
> mode" -- thinking that = is the equality operator.  I also think that
> the notation is somewhat unnatural -- "globalness" is a property of
> the _variable_, not the operator.  So I'd like to suggest instead
>
>     :var = value          # var in module scope
>     :scope:var = value    # var in named enclosing scope
>
> An advantage of this notation is that it can be used anywhere, not
> just in an assignment.
This has primary value for people reading the > code -- if you have a fairly large class method that uses a module > variable (not by assigning it) somewhere in the middle, writing it > :var means the reader knows to go look for the assignment way up top. > This should obviously be optional, to preserve backward compatibility. > However, its use in such expressions as sublist = lst[:var] would lead to substantial ambiguities, right? regards -- Steve Holden +1 703 278 8281 http://www.holdenweb.com/ Improve the Internet http://vancouver-webpages.com/CacheNow/ Python Web Programming http://pydish.holdenweb.com/pwp/ Interview with GvR August 14, 2003 http://www.onlamp.com/python/ From greg at electricrain.com Mon Oct 27 16:56:48 2003 From: greg at electricrain.com (Gregory P. Smith) Date: Mon Oct 27 16:56:55 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <200310271125.16879.aleaxit@yahoo.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310270930.28811.aleaxit@yahoo.com> <20031027094045.GL3929@zot.electricrain.com> <200310271125.16879.aleaxit@yahoo.com> Message-ID: <20031027215648.GM3929@zot.electricrain.com> On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote: > I still don't quite see how the lock ends up being "held", but, don't mind > me -- the intricacy of mixins and wrappings and generators and delegations > in those modules is making my head spin anyway, so it's definitely not > surprising that I can't quite see what's going on. BerkeleyDB internally always grabs a read lock (i believe at the page level; i don't think BerkeleyDB does record locking) for any database read when opened with DB_THREAD | DB_INIT_LOCK flags. I believe the problem is that a DBCursor object holds this lock as long as it is open/exists. Other reads can go on happily, but writes must wait for the read lock to be released before they can proceed.
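[A toy modern-Python model of the behaviour Greg describes -- not bsddb or BerkeleyDB itself, just a sketch in which a "cursor" holds a page lock for as long as it is open and a writer must wait for it; all names here are invented for illustration:]

```python
import threading

# Toy stand-in for the page-level read lock a DBCursor keeps while open.
# (Illustrative only -- the real locking lives inside BerkeleyDB.)
page_lock = threading.Lock()

def open_cursor():
    page_lock.acquire()        # the cursor grabs the lock on open...

def close_cursor():
    page_lock.release()        # ...and releases it only when closed

def try_write(timeout=0.1):
    # A writer must wait for the lock; time out instead of deadlocking.
    got = page_lock.acquire(timeout=timeout)
    if got:
        page_lock.release()
    return got

open_cursor()
blocked = not try_write()      # True: the open cursor blocks the write
close_cursor()
ok = try_write()               # True: the write proceeds once the cursor is closed
```

The same single-threaded self-deadlock discussed below for popitem falls out of this model: if the "writer" simply blocked instead of timing out, the thread holding its own cursor would wait forever.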
> > How do python dictionaries deal with modifications to the dictionary > > intermixed with iteration? > > In general, Python doesn't deal well with modifications to any > iterable in the course of a loop using an iterator on that iterable. > > The one kind of "modification during the loop" that does work is: > > for k in somedict: > somedict[k] = ...whatever... > > i.e. one can change the values corresponding to keys, but not > change the set of keys in any way -- any changes to the set of > keys can cause unending loops or other such misbehavior (not > deadlocks nor crashes, though...). > > However, on a real Python dict, > k, v = thedict.iteritems().next() > doesn't constitute "a loop" -- the iterator object returned by > the iteritems call is dropped since there are no outstanding > references to it right after this statement. So, following up > with > del thedict[k] > is quite all right -- the dictionary isn't being "looped on" at > that time. What about the behaviour of multiple iterators for the same dict being used at once (either interleaved or by multiple threads; it shouldn't matter)? I expect that works fine in python. This is something the _DBWithCursor iteration interface does not currently support due to its use of a single DBCursor internally. _DBWithCursor is currently written such that the cursor is never closed once created. This leaves tons of potential for deadlock even in single threaded apps. Reworking _DBWithCursor into a _DBThatUsesCursorsSafely such that each iterator creates its own cursor in an internal pool and other non cursor methods that would write to the db destroy all cursors after saving their current() position so that the iterators can reopen+reposition them is a solution. > Given that in bsddb's case that iteritems() first [and only] > next() boils down to a self.first() which in turn does a > self.dbc.first() I _still_ don't see exactly what's holding the > lock. 
lock.
But the simplest fix would appear to be in __delitem__, > i.e., if we have a cursor we should delete through it: > > def __delitem__(self, key): > self._checkOpen() > if self.dbc is not None: > self.dbc.set(key) > self.dbc.delete() > else: > del self.db[key] > > ...but this doesn't in fact remove the deadlock on the > unit-test for popitem, which just confirms I don't really > grasp what's going on, yet!-) hmm. i would've expected your __delitem__ to work. Regardless, using the debugger I can stop the deadlock from occurring if i do "self.dbc.close(); self.dbc = None" just before popitem's "del self[k]" Greg From barry at python.org Mon Oct 27 17:07:16 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 17:07:22 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <20031027215648.GM3929@zot.electricrain.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310270930.28811.aleaxit@yahoo.com> <20031027094045.GL3929@zot.electricrain.com> <200310271125.16879.aleaxit@yahoo.com> <20031027215648.GM3929@zot.electricrain.com> Message-ID: <1067292435.1785.91.camel@anthem> On Mon, 2003-10-27 at 16:56, Gregory P. Smith wrote: > BerkeleyDB internally always grabs a read lock (i believe at the page > level; i don't think BerkeleyDB does record locking) Correct, at least for btree tables. -Barry From python at rcn.com Mon Oct 27 17:45:09 2003 From: python at rcn.com (Raymond Hettinger) Date: Mon Oct 27 17:46:07 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Message-ID: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> Excellent PEP! Consider adding your bookmarking example. I found it to be a compelling use case. Also note that there are many variations of the bookmarking theme (undo utilities, macro recording, parser lookahead functions, backtracking, etc). 
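[The bookmarking/lookahead variations Raymond lists can be sketched today with itertools.tee, which buffers and replays a stream -- this is only an approximation of PEP 323's proposed __copy__ protocol, not the protocol itself:]

```python
from itertools import tee

def words():
    # Any one-shot iterator: a generator of words.
    for w in ("copy", "an", "iterator"):
        yield w

it, bookmark = tee(words())   # "bookmark" remembers this position
first = next(it)              # advance the working iterator...
rest = list(it)               # ...and exhaust it
replay = list(bookmark)       # ...then replay everything from the bookmark
```

Note that tee exhibits exactly the "heavy copy" performance trap mentioned above: the replay buffer grows with the distance between the two iterators, so past some point a plain list() is cheaper.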
Under drawbacks and issues there are a couple of thoughts: * Not all iterators will be copyable. Knowing which is which creates a bit of a usability issue (i.e. the question of whether a particular iterator is copyable will come up every time) and a substitution issue (i.e. code which depends on copyability precludes substitution of other iterators that don't have copyability). * In addition to knowing whether a given iterator is copyable, a user should also know whether the copy is lightweight (just an index or some such) or heavy (storing all of the data for future use). They should know whether it is active (intercepting every call to iter()) or inert. * For heavy copies, there is a performance trap when the stored data stream gets too long. At some point, just using list() would be better. Consider adding a section with pure Python sample implementations for listiter.__copy__, dictiter.__copy__, etc. Also, I have a question about the semantic specification of what a copy is supposed to do. Does it guarantee that the same data stream will be reproduced? For instance, would a generator of random words expect its copy to generate the same word sequence? Or, would a copy of a dictionary iterator change its output if the underlying dictionary got updated (i.e. should the dict be frozen to changes when a copy exists or should it mutate). Raymond Hettinger From zack at codesourcery.com Mon Oct 27 17:55:04 2003 From: zack at codesourcery.com (Zack Weinberg) Date: Mon Oct 27 17:59:39 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: (Steve Holden's message of "Mon, 27 Oct 2003 16:25:56 -0500") References: Message-ID: <87llr69tpz.fsf@codesourcery.com> "Steve Holden" writes: >> >> :var = value # var in module scope >> :scope:var = value # var in named enclosing scope >> >> An advantage of this notation is that it can be used anywhere, not >> just in an assignment.
This has primary value for people reading the >> code -- if you have a fairly large class method that uses a module >> variable (not by assigning it) somewhere in the middle, writing it >> :var means the reader knows to go look for the assignment way up top. >> This should obviously be optional, to preserve backward compatibility. >> > However, its use in such expressions as > > sublist = lst[:var] > > would lead to substantial ambiguities, right? I suppose it would. Unfortunately, there's no other punctuation mark that can really be used for the purpose -- I think both $ and @ (suggested elsewhere in response to a similar proposal) have too many countervailing connotations. Witness e.g. the suggestion last week that $ become magic in string % dict notation. Py-in-the-sky suggestion: make the slice separator character be ; instead of :. (Half serious.) Somewhat warty suggestion: take lst[:var] to be a slice, but lst[(:var)] to be a global variable reference. And lst[:(:var)] to be a slice on a global, etc. etc. Better ideas solicited. zw From aleaxit at yahoo.com Mon Oct 27 18:07:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:07:40 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <87llr69tpz.fsf@codesourcery.com> References: <87llr69tpz.fsf@codesourcery.com> Message-ID: <200310280007.33899.aleaxit@yahoo.com> On Monday 27 October 2003 11:55 pm, Zack Weinberg wrote: ... > Somewhat warty suggestion: take lst[:var] to be a slice, but > lst[(:var)] to be a global variable reference. And lst[:(:var)] to be > a slice on a global, etc. etc. That would work -- and with the :: (rather than single :) stropping which Guido seems to prefer, too. 
As long as ::name or scope::name are always (parenthesized) when not doing so would be ambiguous (same general rules as, say, for tuples), which in their case would seem to be "within brackets only", I think :: stropping would work fine -- and perhaps avoid some possible single-: ambiguity in dictionary display such as d = { a:b:c } which would require further parenthesization -- with :: stropping, d = { a::b:c } and d = { a:b::c } are unambiguous, although parentheses would no doubt be advisable anyway to help human readers. Alex From guido at python.org Mon Oct 27 18:10:05 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 18:11:11 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Mon, 27 Oct 2003 16:13:26 EST." <004701c39ccf$29ef0740$81b0958d@oemcomputer> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> Message-ID: <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> > Overall the pep is being well received. The discussion has been > uncontentious and light (around 50-55 posts). Great! > There was occasional discussion about the parentheses requirement but > that was quickly settled also. One idea that had some merit was to not > require the outer parentheses for a single expression on the rhs of an > assignment: > > g = (x**2 for x in range(10)) # maybe the outer parens are not needed I really think they should be required. The 'for' keyword feels like it has a lower "priority" than the assignment operator. 
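[For the record, the rule that ended up in PEP 289 matches Guido's preference here: a generator expression must be parenthesized on the right-hand side of an assignment, though the parentheses may be dropped when it is the sole argument of a call -- a quick illustration:]

```python
g = (x**2 for x in range(10))       # parentheses required in an assignment
squares = list(g)

# As the sole argument to a call, the extra parentheses may be omitted:
total = sum(x**2 for x in range(10))
```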
--Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Mon Oct 27 18:13:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:13:43 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310272310.h9RNA5Y27764@12-236-54-216.client.attbi.com> Message-ID: <200310280013.36524.aleaxit@yahoo.com> On Tuesday 28 October 2003 12:10 am, Guido van Rossum wrote: > > Overall the pep is being well received. The discussion has been > > uncontentious and light (around 50-55 posts). > > Great! > > > There was occasional discussion about the parentheses requirement but > > that was quickly settled also. One idea that had some merit was to not > > require the outer parentheses for a single expression on the rhs of an > > assignment: > > > > g = (x**2 for x in range(10)) # maybe the outer parens are not > > needed > > I really think they should be required. The 'for' keyword feels like > it has a lower "priority" than the assignment operator. I entirely agree with Guido: the assignment looks _much_ better to me WITH the parentheses around the RHS. Alex From anthony at interlink.com.au Mon Oct 27 18:16:09 2003 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon Oct 27 18:19:28 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310280007.33899.aleaxit@yahoo.com> Message-ID: <200310272316.h9RNG9FB011873@localhost.localdomain> >>> Alex Martelli wrote > On Monday 27 October 2003 11:55 pm, Zack Weinberg wrote: > ... > > Somewhat warty suggestion: take lst[:var] to be a slice, but > > lst[(:var)] to be a global variable reference. And lst[:(:var)] to be > > a slice on a global, etc. etc. > > That would work -- and with the :: (rather than single :) stropping > which Guido seems to prefer, too. 
As long as ::name or > scope::name are always (parenthesized) when not doing so > would be ambiguous (same general rules as, say, for tuples), > which in their case would seem to be "within brackets only", > I think :: stropping would work fine -- and perhaps avoid some > possible single-: ambiguity in dictionary display such as Can I just say, as someone who's only been lightly following this thread, that the above :(: type stuff a) looks incredibly ugly b) gives absolutely no clue as to what it might mean c) looks incredibly ugly. There's already prior usage of the : in python for dictionaries, for slices, but nothing at all like this. I'd really hope we don't end up with something this awful looking in the stdlib. Speaking purely for myself, of course (On the other hand, making the operator :( might be a subtle way of pre-deprecating it) Anthony -- Anthony Baxter It's never too late to have a happy childhood. From aleaxit at yahoo.com Mon Oct 27 18:19:29 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Oct 27 18:19:39 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <004701c39ccf$29ef0740$81b0958d@oemcomputer> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> Message-ID: <200310280019.29859.aleaxit@yahoo.com> On Monday 27 October 2003 10:13 pm, Raymond Hettinger wrote: ... > Several people initially thought that lambda should be part of the yield was repeatedly mentioned, and I don't recall lambda being, so maybe this is a typo. > syntax, but other respondants quickly laid that to rest. Yes, consensus clearly converged on the proposed syntax (the mention of "generators" in the construct's name was the part that I think prompted the desire for 'yield' -- had they been called "iterator expressions" I suspect nobody would have missed 'yield' even transiently:-). > One poster counter-suggested a weird regex style syntax for embedding > Perl expressions. 
The newsgroup was very kind and no one called him > wacko :-) ...though I did say "if you want Perl, you know where to find it"...:-) > The discussion is winding down and there are no unresolved questions. Yes, fair summary. The one persistent (but low-as-a-whisper) grumbling is by one A.M., who keeps mumbling "they're _iterator_ expressions, the fact that they use generators is an implementation detail, grmbl grmbl":-). But then, he IS one of those pesky must-always-have-SOME-whine types. Alex From barry at python.org Mon Oct 27 18:39:03 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 18:39:08 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310272316.h9RNG9FB011873@localhost.localdomain> References: <200310272316.h9RNG9FB011873@localhost.localdomain> Message-ID: <1067297942.1066.24.camel@anthem> On Mon, 2003-10-27 at 18:16, Anthony Baxter wrote: > Can I just say, as someone who's only been lightly following this > thread, Me too. > that the above :(: type stuff > > a) looks incredibly ugly > b) gives absolutely no clue as to what it might mean > c) looks incredibly ugly. > > There's already prior usage of the : in python for dictionaries, for > slices, but nothing at all like this. I'd really hope we don't end up > with something this awful looking in the stdlib. It's not just you. 
-Barry From nas-python at python.ca Mon Oct 27 18:45:10 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Mon Oct 27 18:43:48 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310280019.29859.aleaxit@yahoo.com> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310280019.29859.aleaxit@yahoo.com> Message-ID: <20031027234510.GA22587@mems-exchange.org> On Tue, Oct 28, 2003 at 12:19:29AM +0100, Alex Martelli wrote: > The one persistent (but low-as-a-whisper) grumbling is by one > A.M., who keeps mumbling "they're _iterator_ expressions, the fact > that they use generators is an implementation detail, grmbl > grmbl":-). I'm inclined to agree with him. Was there some reason why the term iterator expressions was rejected? Neil From tdelaney at avaya.com Mon Oct 27 19:00:43 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Mon Oct 27 19:00:50 2003 Subject: [Python-Dev] Alternate notation for global variable assignments Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> > From: Zack Weinberg [mailto:zack@codesourcery.com] > >> > > However, its use in such expressions as > > > > sublist = lst[:var] > > > > would lead to substantial ambiguities, right? > > I suppose it would. Unfortunately, there's no other punctuation mark > that can really be used for the purpose -- I think both $ and @ > (suggested elsewhere in response to a similar proposal) have > too many countervailing connotations. Witness e.g. the suggestion > last week that $ become magic in string % dict notation. First of all, I'm strongly *against* the idea of :var. However, I think a syntax that would work with no ambiguities, and not look too bad, would be: .var e.g. sublist = lst[.var] I would also be strongly against this suggestion - it simply deals with the problems I see with the current suggestion. It has its own problems, including (but not limited to) not being very obvious. 
Tim Delaney From greg at cosc.canterbury.ac.nz Mon Oct 27 19:04:24 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 19:04:42 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310270906.44209.aleaxit@yahoo.com> Message-ID: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> Alex Martelli : > Nobody's asking for 3.0*x to work where x is a user-coded type > without an __rmul__; rather, the point is that 3*x should fail too, > and ideally they'd have the same clear error message as 3+x > gives when the type has no __radd__. Okay, that makes sense. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From barry at python.org Mon Oct 27 19:11:52 2003 From: barry at python.org (Barry Warsaw) Date: Mon Oct 27 19:12:01 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> Message-ID: <1067299912.1066.35.camel@anthem> On Mon, 2003-10-27 at 19:00, Delaney, Timothy C (Timothy) wrote: > First of all, I'm strongly *against* the idea of :var. > > However, I think a syntax that would work with no ambiguities, and not look too bad, would be: > > .var > > e.g. > > sublist = lst[.var] > > I would also be strongly against this suggestion - it simply deals with the problems I see with the current suggestion. It has its own problems, including (but not limited to) not being very obvious. What I really want is access to a namespace, and then all the normal Python attribute access notations just work. They're one honking great idea after all. This was behind the "import __me__" suggestion for access to module globals. 
Why can't we have something similar for nested functions? -Barry From guido at python.org Mon Oct 27 19:23:52 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:24:08 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: Your message of "Tue, 28 Oct 2003 11:00:43 +1100." <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> Message-ID: <200310280023.h9S0Nqk27854@12-236-54-216.client.attbi.com> > However, I think a syntax that would work with no ambiguities, and > not look too bad, would be: > > .var > > e.g. > > sublist = lst[.var] No; I want to reserve .var for the "with" statement (a la VB). --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Mon Oct 27 19:50:30 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:51:51 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: Your message of "Mon, 27 Oct 2003 15:45:10 PST." <20031027234510.GA22587@mems-exchange.org> References: <004701c39ccf$29ef0740$81b0958d@oemcomputer> <200310280019.29859.aleaxit@yahoo.com> <20031027234510.GA22587@mems-exchange.org> Message-ID: <200310280050.h9S0oUR27881@12-236-54-216.client.attbi.com> > > The one persistent (but low-as-a-whisper) grumbling is by one > > A.M., who keeps mumbling "they're _iterator_ expressions, the fact > > that they use generators is an implementation detail, grmbl > > grmbl":-). > > I'm inclined to agree with him. Was there some reason why the term > iterator expressions was rejected? After seeing "iterator expressions" I came up with "generator expressions" and decided I liked that better.
Around the same time Tim Peters wrote a post where he proposed "generator expressions" independently: http://mail.python.org/pipermail/python-dev/2003-October/039186.html Trying to rationalize my own gut preference, I think I like "generator expressions" better than "iterator expressions" because there are so many other expressions that yield iterators (e.g. iter(x) comes to mind :-). Just like generator functions are one specific cool way of creating an iterator, generator expressions are another specific cool way, and as a bonus, they're related in terms of implementation (and that certainly reflects on corners of the semantics, so I don't think we should try to hide this as an implementation detail). --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Mon Oct 27 19:53:32 2003 From: janssen at parc.com (Bill Janssen) Date: Mon Oct 27 19:53:57 2003 Subject: [Python-Dev] htmllib vs. HTMLParser In-Reply-To: Your message of "Mon, 27 Oct 2003 08:02:21 PST." <20031027160221.GA29155@rogue.amk.ca> Message-ID: <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> Glad to see you volunteering! But IMO simply adding some handler methods won't really do it. You also need to introduce some knowledge about the semantics of the syntax. For example, a new "block"-level element should close all "in-line" elements that are currently open. Etc. It would also be handy to have a version of the parser that takes an HTML page and returns a parse tree, rather than the halfway solution we currently have, forcing the user to design and write a lot of code to get anything done. Bill From guido at python.org Mon Oct 27 19:54:21 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 19:55:04 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Your message of "Mon, 27 Oct 2003 17:45:09 EST." 
<005a01c39cdb$fa18b540$81b0958d@oemcomputer> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> Message-ID: <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> > Also, I have a question about the semantic specification of what a copy > is supposed to do. Does it guarantee that the same data stream will be > reproduced? For instance, would a generator of random words expect its > copy to generate the same word sequence. Or, would a copy of a > dictionary iterator change its output if the underlying dictionary got > updated (i.e. should the dict be frozen to changes when a copy exists or > should it mutate). Every attempt should be made for the two copies to return exactly the same stream of values. This is the pure tee() semantics. --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at cosc.canterbury.ac.nz Mon Oct 27 20:48:21 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 20:48:33 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271033.56569.aleaxit@yahoo.com> Message-ID: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> Alex Martelli : > My slight preference for leaving += and friends alone is that > a function using them to rebind nonlocals would be hard to > read Using my "outer" suggestion, augmented assignments to nonlocals would be written outer x += 1 which would make the intention pretty clear, I think. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:02:10 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:02:20 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310271511.h9RFBG926665@12-236-54-216.client.attbi.com> Message-ID: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> Guido: > there are already well-established rules for deciding whether a bare > name is local or not, and those rules have always worked "at a > distance". If we adopt a method of nonlocal assignment that allows the deprecation of "global", then we have a chance to change this, if we think that such "at-a-distance" rules are undesirable in general. Do we think that? Einstein-certainly-seemed-to-ly, Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:10:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:10:22 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <20031027161202.GF5842@epoch.metaslash.com> Message-ID: <200310280210.h9S2AB921796@oma.cosc.canterbury.ac.nz> > The best alternative I've seen that addresses nested scope and the > global declaration. Is to borrow :: from C++: -1000! I hate it whenever an otherwise sensible language borrows this ugly piece of syntax. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Mon Oct 27 21:20:19 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Mon Oct 27 21:20:27 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <3F9D5910.9050001@livinglogic.de> Message-ID: <200310280220.h9S2KJC21818@oma.cosc.canterbury.ac.nz> Walter Dörwald: > I think ':=' is too close to '='. The default assignment should be much > easier to type than the special case. Well, typing "outer x = value" would require 6 more keystrokes than "x = value". Would that be difficult enough for you? :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Mon Oct 27 21:41:48 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Oct 27 21:42:13 2003 Subject: [Python-Dev] Let's table the discussion of replacing 'global' In-Reply-To: <1067283859.8566.633.camel@localhost.localdomain> References: <16285.21660.432100.124214@montanaro.dyndns.org> <200310271729.h9RHTnl27186@12-236-54-216.client.attbi.com> <1067283859.8566.633.camel@localhost.localdomain> Message-ID: <3F9DD76C.8010801@ocf.berkeley.edu> Jeremy Hylton wrote: > On Mon, 2003-10-27 at 12:29, Guido van Rossum wrote: > >>>I'm going to make a suggestion. Let's shelve this topic for the time being >>>and simply summarize the issues in an informational PEP aimed at >>>Py3k. >> >>Great idea. I'm getting tired of it too; Alex and I don't seem to be >>getting an inch closer to each other. >> >> >>>We don't even know (at least I don't) if we want an implicit search >>>for outer scope variables or an explicit specification of which >>>scope such variables should be defined in.
If, for some reason, >>>nested scopes make a quick exit in Py3k, this would all be moot >>>anyway. >> >>Sorry to disappoint you, but nested scopes aren't going away. > > > I haven't had time to participate in this thread -- too much real work > for the last several days -- but I'd be happy to write a PEP that > summarizes the issues. > Woohoo! PEPs for generator expressions, copying iterators, and now 'global' "stuff"! This will make summarizing the 700-odd emails I have for the next summary (at this point; the thing grows an average of 50 emails a day as of late) a *heck* of a lot easier. Thanks Jeremy, Raymond, and Alex. -Brett From guido at python.org Mon Oct 27 21:55:57 2003 From: guido at python.org (Guido van Rossum) Date: Mon Oct 27 21:56:06 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Tue, 28 Oct 2003 15:02:10 +1300." <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> Message-ID: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> > If we adopt a method of nonlocal assignment that allows the > deprecation of "global", then we have a chance to change this, > if we think that such "at-a-distance" rules are undesirable > in general. > > Do we think that? Alex certainly seems to be arguing this, but I think it's a lost cause. Even Alex will have to accept the long-distance effect of def f(): x = 42 . . (hundreds of lines of unrelated code) . print x And at some point in the future Python *will* grow (optional) type declarations for all sorts of things (arguments, local variables, instance variables) and those will certainly have effect at a distance. 
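[As it turned out, this debate eventually produced PEP 3104 and Python 3's nonlocal statement, which keeps the same declare-it-once, effect-at-a-distance style as global -- a brief illustration:]

```python
def make_counter():
    count = 0
    def bump():
        nonlocal count   # rebind the enclosing function's variable,
        count += 1       # in the same declaration style as 'global'
        return count
    return bump

counter = make_counter()
counter()
second = counter()       # count is now 2
```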
--Guido van Rossum (home page: http://www.python.org/~guido/) From bob at redivi.com Mon Oct 27 22:14:06 2003 From: bob at redivi.com (Bob Ippolito) Date: Mon Oct 27 22:14:35 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate Message-ID: > [Gustavo Niemeyer wrote] > > > You can do reverse with [::-1] now. > > [Holger Krekel] > > sure, but it's a bit unintuitive and i mentioned not only reverse :-) > > > > Actually i think that 'reverse', 'sort' and 'extend' algorithms > > could nicely be put into the new itertools module. > > > > There it's obvious that they wouldn't mutate objects. And these > > algorithms > > (especially extend and reverse) would be very efficient as iterators > > because > > they wouldn't create temporary lists/tuples. > > To be considered as a possible itertool, an ideal candidate should: > > * work well in combination with other itertools > * be a fundamental building block > * accept all iterables as inputs > * return only an iterator as an output > * run lazily so as not to force the inputs to run to completion > unless externally requested by list() or some such. > * consume constant memory (this rule was bent for itertools.cycle(), > but should be followed as much as possible). > * run finitely if some of the inputs are finite (itertools.repeat(), > count() and cycle() are the only intentionally infinite tools) > > There is no chance for isort(). Once you've sorted the whole list, > there is no advantage to returning an iterator instead of a list. > > The problem with ireverse() is that it only works with objects that > support __getitem__() and len(). That pretty much precludes > generators, user defined class based iterators, and the outputs > from other itertools. So, while it may make a great builtin (which > is what PEP-322 is going to propose), it doesn't fit in with other > itertools. How about making islice be more lenient about inputs? 
For example x[::-1] should be expressible by islice(x, None, None, -1) when the input implements __len__ and __getitem__ -- but it's not. [::-1] *does* create a temporary list, because Python doesn't have "views" of lists/tuples. islice should also let you go backwards in general, islice(x, len(x)-1, None, -2) should work. -bob From python at rcn.com Tue Oct 28 00:11:08 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 00:12:08 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: Message-ID: <008f01c39d11$e5f13160$81b0958d@oemcomputer> [Bob Ippolito] > How about making islice be more lenient about inputs? For example > x[::-1] should be expressible by islice(x, None, None, -1) when the > input implements __len__ and __getitem__ -- but it's not. [::-1] > *does* create a temporary list, because Python doesn't have "views" of > lists/tuples. islice should also let you go backwards in general, > islice(x, len(x)-1, None, -2) should work. Sorry, this idea was examined and rejected long ago. The itertools principles involved are:

- avoiding calls that cause the whole stream to be realized,
- avoiding situations that require much of the data to be stored in memory,
- an itertool should work well with other tools and handle all kinds of iterators as inputs.

islice(it, None, None, -1) is a disaster when presented with an infinite iterator and a near disaster with a long-running iterator. Handling negative steps entails saving data in memory. The issue of reverse iteration is being dealt with outside the scope of itertools. See the soon to be revised PEP 322 on reverse iteration.
It will give you the "views" that you seek :-) Raymond Hettinger From python at rcn.com Tue Oct 28 01:40:57 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 01:41:55 2003 Subject: [Python-Dev] PEP 289: Generator Expressions In-Reply-To: <200310280050.h9S0oUR27881@12-236-54-216.client.attbi.com> Message-ID: <00a701c39d1e$71af4f00$81b0958d@oemcomputer> > After seeing "iterator expressions" I came up with "generator > expressions" and decided I liked that better. Around the same time > Tim Peters wrote a post where he proposed "generator expressions" > independently: > > http://mail.python.org/pipermail/python-dev/2003-October/039186.html > > Trying to rationalize my own gut preference, I think I like "generator > expressions" better than "iterator expressions" because there are so > many other expressions that yield iterators (e.g. iter(x) comes to > mind :-). Just like generator functions are one specific cool way of > creating an iterator, generator expressions are another specific cool > way, and as a bonus, they're related in terms of implementation (and > that certainly reflects on corners of the semantics, so I don't think > we should try to hide this as an implementation detail). I'm convinced. Raymond From aleaxit at yahoo.com Tue Oct 28 03:56:34 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 03:56:44 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> Message-ID: <200310280956.34183.aleaxit@yahoo.com> On Tuesday 28 October 2003 03:55 am, Guido van Rossum wrote: > > If we adopt a method of nonlocal assignment that allows the > > deprecation of "global", then we have a chance to change this, > > if we think that such "at-a-distance" rules are undesirable > > in general. > > > > Do we think that?
> > Alex certainly seems to be arguing this, but I think it's a lost cause. I must have some Don Quixote in my blood. Ah, can anybody point me to the nearest windmill, please...?-) Seriously, I realize by now that I stand no chance of affecting your decision in this matter. Nevertheless, and that attitude may indeed be quixotical, I still have (just barely) enough energy not to let your explanation of your likely coming decision stand as if it was ok with me, or as if I had no good response to your arguments. If it's a lost cause, I think it's because I'm not being very good at marshaling the arguments for it, not because those arguments are weak. So, basically for the record, here goes, once more... > Even Alex will have to accept the long-distance effect of > > def f(): > x = 42 > . > . (hundreds of lines of unrelated code) > . > print x I have absolutely no problem with that -- except that it's bad style, but the language cannot, in general, force good style. The language can and should ALLOW good style, but enforcing it is not always possible. In (old) C, there was often no alternative to putting a declaration far away from the code that used the variable, because declarations had to come at block start. Sometimes you could enclose declaration and use in a nested sub-block, but not always. C++ and modern C have removed this wart by letting declarations come at any point before the variable's used, and _encouraging_ (stylistically -- no enforcement) the declaration to come always together with the initialization. That's about all a language can be expected to do in this regard: not forbid "action at a distance" (that would be too confining), but _allow_ and _encourage_ most programs to avoid it. 
Python is and always has been just as good or even better: there being no separate declaration, you _always_ have the equivalent of it "at the first initialization" (as C++ and modern C encourage but can't enforce), and it's perfectly natural in most cases to keep that close to the region in a function where the name is of interest, if that region comprises only a subset of the function's body. But this, to some extent, is a red herring. "Reading" (accessing) the value referred to by a name looks the name up by rules I mostly _like_, even though it is quite possible that the name was set "far away". As AMK suggests in his "Python warts" essay, people don't often get in trouble with that because _most_ global (module-level, and even more built-in) names are NOT re-bound dynamically. So, when I see, e.g., print len(phonebook) it's most often fine that phonebook is global, just as it's fine that len is built-in (it may be argued that we have "too many" built-in names, and similarly that having "too many" global names is not a good thing, but having SOME such names is just fine, and indeed inevitable -- perhaps Python may remedy the "too many built-ins" in 3.0, and any programmer can refactor his own code to alleviate the "too many globals" -- no deep problem here, in either case). Re-binding names is different. It's far rarer than accessing them, of course. And while all uses of "print x" mean (semantics equivalent to) "look x up in the locals, then if not found there in outer scopes, then if not found there in the globals, then if not found there in the builtins" -- a single, reasonably simple and uniform rule, independent from any "purely declarative statement", which just determines where the value will come from -- the situation for "x=42" is currently different. 
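The read-versus-rebind asymmetry described here can be shown concretely. A sketch in runnable form (phonebook and the function names are invented for illustration):

```python
phonebook = {"alice": "555-0100"}

def lookup():
    # Reading a name searches local, enclosing, global, builtin scopes:
    return len(phonebook)       # finds global 'phonebook', builtin 'len'

def rebind():
    phonebook = {}              # binding defaults to *local*; the global
                                # dict is untouched

def rebind_global():
    global phonebook            # the at-a-distance declaration under debate
    phonebook = {}              # now rebinds the module-level name

assert lookup() == 1
rebind()
assert phonebook == {"alice": "555-0100"}   # unchanged
rebind_global()
assert phonebook == {}                      # rebound
```

The single uniform lookup rule covers all three reads, while the meaning of the two textually identical assignments `phonebook = {}` depends entirely on the presence of the `global` statement.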
It's a rarer situation than just accessing x; it's _more_ important to know where x will be bound, because that will affect its future lifetime -- which we don't particularly care about when we're just accessing it, but is more important when we're setting it; _and_ (alas!) it's affected by a _possible_, purely-declarative, instruction-to-the-compiler "global" statement SOMEwhere. "Normally", "x=42" binds or rebinds x locally. That's the common case, as rebinding nonlocals is rare. It's therefore a little trap that some (a small % of) the time we are instead rebinding a nonlocal _with no nearby reminder of the fact_. No "nearby reminder" is really needed for the _more common_ case of _accessing_ a name -- partly because "where is this being accessed from" is often less crucial (while it IS crucial when _binding_ the name), partly because it's totally common and expected that the "just access" may be doing lookup in other namespaces (indeed, when I write len(x), it's the rare case where len HAS been rebound that may be a trap!-). > And at some point in the future Python *will* grow (optional) type > declarations for all sorts of things (arguments, local variables, > instance variables) and those will certainly have effect at a > distance. Can we focus on the locals? Argument passing, and setting attributes of objects with e.g. "x.y = z" notation, are already subject to rather different rules than setting bare names, e.g. "x.y = z" might perfectly well be calling a property setter x.setY(z) or x.__setattr__('y', z), so I don't think refining those potentially-subtle rules will be a problem, nor that the situation is parallel to "global". However, optional type declarations for local variables might surely be (both problems and parallel:-), depending on roughly what you have in mind for that. 
E.g., are you thinking, syntax sugar apart, of some new statement "constrain_type" which might go something like...:

    def f():
        constrain_type(int) x, y, z, t
        x = 23     # ok
        y = 2.3    # ??? a
        z = "23"   # ??? b
        t = "foo"  # raise subclass of (TypeError ?)

If so, what semantics do you have in mind for cases a and b? I can imagine either an implicit int() call around the RHS (which is why I guess the assignment to t would fail, though I don't know whether it would fail with a type or value error), or an implicit isinstance check, in which case a and b would also fail (and then no doubt with a type error). I may be weird, but -- offhand, and not having had time to reflect on this in depth -- it seems to me that having assignment to bare names 'fail' in some circumstances, while revolutionary in Python, would not be particularly troublesome in the "action at a distance" sense. After all the constrain_type would have the specific purpose of forbidding some assignments that would otherwise succeed, would be used specifically for that, and making "wrong" assignment fail immediately and noisily would be exactly what it's for. I may not think it a GOOD idea to introduce it (for local variables), but if I argued against it it would not be on the lines of "one can't tell by just looking at y=2.3 whether it succeeds or fails". If the concept is to make y=2.3 implicitly do y=int(2.3) I would be much more worried. THEN, with no clear indication to the contrary, we'd have "y=2.3" leave y with a value of 2.3, or 2, or maybe something else for sufficiently weird values of X in a "constrain_type(X) y" -- the semantics of a CORRECT program would suddenly grow subtle dependencies on "nonlocal" ``declarations''.
So, if THAT is your intention -- and indeed that would be far closer to the way "global" works: it doesn't FORBID assignments, rather it changes their semantics -- then I admit the parallel is indeed strict, and I would be worried on basically the same grounds as I'm grumbling about 'global' and its planned extensions. Yes, I realize this seems to be arguing _against_ adaptation -- surely if we had "constrain_type(X) y", and set "y = z", we might like an implicit "y = adapt(z, X)" to be the assignment's semantics? My answer (again, this is a first-blush reaction, haven't thought deeply about the issues) is that adaptation is good, but implicit rather than explicit is ungood, and I'm not sure the good is stronger than the ungood here; AND, adaptation is not typecasting: e.g y=adapt("23", int) should NOT succeed. So, while I might be more intrigued than horrified by such novel suggestions, I would surely see the risks in them -- and some of the risks I'd see WOULD be about "lack of local indication of nonobvious semantics shift". Just like with 'global', yes. Alex From aleaxit at yahoo.com Tue Oct 28 04:22:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:22:40 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> References: <200310280148.h9S1mLg21692@oma.cosc.canterbury.ac.nz> Message-ID: <200310281022.31722.aleaxit@yahoo.com> On Tuesday 28 October 2003 02:48 am, Greg Ewing wrote: > Alex Martelli : > > My slight preference for leaving += and friends alone is that > > a function using them to rebind nonlocals would be hard to > > read > > Using my "outer" suggestion, augmented assignments to > nonlocals would be written > > outer x += 1 > > which would make the intention pretty clear, I think. Absolutely clear, and wonderful. Pity that any alternative to 'global' has been declared "a lost cause" by Guido. 
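The restricted rebinding-only statement Greg and Alex sketch as "outer" is essentially what later arrived in Python 3 as the nonlocal statement (PEP 3104); a sketch of "outer count += 1" in that eventual spelling:

```python
def make_counter():
    count = 0
    def tick():
        nonlocal count   # the role 'outer' would play: rebind an existing
        count += 1       # enclosing binding, never create a new one
        return count
    return tick

tick = make_counter()
assert tick() == 1
assert tick() == 2
```

As in the restriction argued for here, `nonlocal` refuses to create names: using it where no enclosing binding of the name exists is a compile-time SyntaxError.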
I'd still like to forbid "side effect rebinding" via statements such as class, def, import, for, i.e., no outer def f(): ... and the like; i.e., the 'outer' statement should be 'outer' expr_stmt (in Grammar/Grammar terms) with the further constraint that the expr_stmt must be an assignment (augmented or not); and the outer statement should not be a 'small_stmt', so as to avoid the ambiguity of outer x=1; y=2 (is this binding a local or nonlocal name 'y'?). Alex From aleaxit at yahoo.com Tue Oct 28 04:27:06 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:27:12 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> Message-ID: <200310281027.06138.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:54 am, Guido van Rossum wrote: > > Also, I have a question about the semantic specification of what a copy > > is supposed to do. Does it guarantee that the same data stream will be > > reproduced? For instance, would a generator of random words expect its > > copy to generate the same word sequence. Or, would a copy of a > > dictionary iterator change its output if the underlying dictionary got > > updated (i.e. should the dict be frozen to changes when a copy exists or > > should it mutate). > > Every attempt should be made for the two copies to return exactly the > same stream of values. This is the pure tee() semantics. Yes, but iterators that run on underlying containers don't guarantee, in general, what happens if the container is mutated while the iteration is going on -- arbitrary items may end up being skipped, repeated, etc. So, "every attempt" is, I feel, too strong here. deepcopy exists for those cases where one is ready to pay a hefty price for guarantees of "decoupling", after all. 
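The pure tee() semantics under discussion later shipped as itertools.tee (new in Python 2.4). A minimal sketch of both the guarantee and the container-mutation caveat Alex raises:

```python
import itertools

# Two copies of one iterator yield exactly the same stream of values.
a, b = itertools.tee(iter([1, 2, 3]))
assert list(a) == [1, 2, 3]
assert list(b) == [1, 2, 3]

# But no copy decouples an iterator from its underlying container:
# changing a dict's key set mid-iteration remains undefined behaviour
# (modern dicts detect it and raise rather than silently misbehave).
d = {"x": 1}
it = iter(d)
d["y"] = 2            # mutate the key set while 'it' is live
try:
    next(it)
    mutated_ok = True
except RuntimeError:
    mutated_ok = False
assert not mutated_ok
```

Only deepcopy-style semantics could decouple the two streams from the container itself, at the hefty price mentioned above.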
Alex From aleaxit at yahoo.com Tue Oct 28 04:31:38 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:31:43 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <1067299912.1066.35.camel@anthem> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> Message-ID: <200310281031.38234.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:11 am, Barry Warsaw wrote: ... > What I really want is access to a namespace, and then all the normal > Python attribute access notations just work. They're one honking great > idea after all. Yes, all in all this does remain my preference, too. I'd take stropping (or "keyword stropping" a la Greg's "outer x") rather than declarative stuff, but just getting a namespace (in ways the compiler could recognize, i.e. by magicnames such as __me__) and then using __me__.x=23 would require no new syntax and be maximally obvious. Sigh. > This was behind the "import __me__" suggestion for access to module > globals. Why can't we have something similar for nested functions? And why can't we have "import __me__" too? Ah well! Alex From aleaxit at yahoo.com Tue Oct 28 04:37:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 04:37:50 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> Message-ID: <200310281037.44424.aleaxit@yahoo.com> On Tuesday 28 October 2003 01:04 am, Greg Ewing wrote: > Alex Martelli : > > Nobody's asking for 3.0*x to work where x is a user-coded type > > without an __rmul__; rather, the point is that 3*x should fail too, > > and ideally they'd have the same clear error message as 3+x > > gives when the type has no __radd__. > > Okay, that makes sense. So how do you think we should go about it? I can't see a way right now (at least not for 2.3, i.e. 
without breaking some programs). A user COULD have coded a class that's meant to represent a sequence AND relied on the (undocumented, I think) feature that giving the class a __mul__ automatically makes instances of that class multipliable by integers on EITHER side, after all. We can't (sensibly), I think, distinguish that from the case where the user has coded a class that's meant to represent a number and gets astonished that __mul__, undocumentedly, makes instances of that class multipliable by integers on either side. So perhaps for 2.3 we should just apologetically note the anomaly in the docs, and for 2.4 forbid the former case, i.e., require both __mul__ AND __rmul__ to exist if one wants to code sequence classes that can be multiplied by integers on either side...? Any opinions, anybody...? Alex From aleaxit at yahoo.com Tue Oct 28 05:12:21 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 05:12:33 2003 Subject: [Python-Dev] Re: test_bsddb blocks testing popitem - reason In-Reply-To: <20031027215648.GM3929@zot.electricrain.com> References: <200310251232.55044.aleaxit@yahoo.com> <200310271125.16879.aleaxit@yahoo.com> <20031027215648.GM3929@zot.electricrain.com> Message-ID: <200310281112.21162.aleaxit@yahoo.com> On Monday 27 October 2003 10:56 pm, Gregory P. Smith wrote: > On Mon, Oct 27, 2003 at 11:25:16AM +0100, Alex Martelli wrote: > > I still don't quite see how the lock ends up being "held", but, don't > > mind me -- the intricacy of mixins and wrappings and generators and > > delegations in those modules is making my head spin anyway, so it's > > definitely not surprising that I can't quite see what's going on. > > BerkeleyDB internally always grabs a read lock (i believe at the page > level; i don't think BerkeleyDB does record locking) for any database read > when opened with DB_THREAD | DB_INIT_LOCK flags. I believe the problem > is that a DBCursor object holds this lock as long as it is open/exists.
> Other reads can go on happily, but writes must to wait for the read lock > to be released before they can proceed. Aha, much clearer, thanks. > What about the behaviour of multiple iterators for the same dict being > used at once (either interleaved or by multiple threads; it shouldn't > matter)? I expect that works fine in python. If the dict is not being modified, or if the only modifications on it are assigning different values for already-existing keys, multiple iterators on the same unchanging dict do work fine in one or more threads. But note that iterators only "read" the dict, don't change it. If any change to the set of keys in the dict happens, all bets are off. > This is something the _DBWithCursor iteration interface does not currently > support due to its use of a single DBCursor internally. > > _DBWithCursor is currently written such that the cursor is never closed > once created. This leaves tons of potential for deadlock even in single > threaded apps. Reworking _DBWithCursor into a _DBThatUsesCursorsSafely > such that each iterator creates its own cursor in an internal pool > and other non cursor methods that would write to the db destroy all > cursors after saving their current() position so that the iterators can > reopen+reposition them is a solution. Woof. I think I understand what you're saying. However, writing to a dict (in the sense of changing the sets of keys) while one is iterating on the dict is NOT supported in Python -- basically "undefined behavior" (which does NOT include possibilities of crashes and deadlocks, though). So, maybe, we could get away with something a bit less rich here? > > Given that in bsddb's case that iteritems() first [and only] > > next() boils down to a self.first() which in turn does a > > self.dbc.first() I _still_ don't see exactly what's holding the > > lock. 
But the simplest fix would appear to be in __delitem__, > > i.e., if we have a cursor we should delete through it:

> > def __delitem__(self, key):
> >     self._checkOpen()
> >     if self.dbc is not None:
> >         self.dbc.set(key)
> >         self.dbc.delete()
> >     else:
> >         del self.db[key]

> > ...but this doesn't in fact remove the deadlock on the > > unit-test for popitem, which just confirms I don't really > > grasp what's going on, yet!-) > > hmm. i would've expected your __delitem__ to work. Regardless, using the Ah! I'll check again -- maybe I did something silly -- but what happens now is that the __delitem__ DOES work, the key does get deleted according to print and printf's I've sprinkled here and there, BUT then right after the key is deleted everything deadlocks anyway (in test_popitem). > debugger I can stop the deadlock from occurring if i do "self.dbc.close(); > self.dbc = None" just before popitem's "del self[k]" So, maybe I _should_ just fix popitem that way and see if all tests pass? I dunno -- it feels a bit like fixing the symptoms and leaving some deep underlying problems intact... Any other opinions? I don't have any strong feelings one way or the other, except that I really think unit-tests SHOULD pass... and indeed that changes should not be committed UNLESS unit-tests pass... Alex From python at rcn.com Tue Oct 28 05:29:21 2003 From: python at rcn.com (Raymond Hettinger) Date: Tue Oct 28 05:30:16 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> Message-ID: <000b01c39d3e$59af4de0$c807a044@oemcomputer> Okay, this is the last chance to come up with a name other than sorted(). Here are some alternatives:

    inlinesort()  # immediately clear how it is different from sort()
    sortedcopy()  # clear that it makes a copy and does a sort
    newsorted()   # appropriate for a class method constructor

I especially like the last one and all of them provide a distinction from list.sort().
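For the record, the spelling that eventually shipped (as a builtin in Python 2.4) was plain sorted(). Whichever name wins, the semantics under discussion are copy-then-sort; a sketch, using sortedcopy as the hypothetical name:

```python
def sortedcopy(iterable, *args, **kwds):
    # Hypothetical spelling of the proposal: sort a fresh list,
    # leave the input untouched.
    result = list(iterable)
    result.sort(*args, **kwds)
    return result

data = [3, 1, 2]
assert sortedcopy(data) == [1, 2, 3]
assert data == [3, 1, 2]          # original order preserved
```

Note that accepting any iterable (not just lists) falls out of the `list(iterable)` copy for free, which is part of why a function rather than a list method won out.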
Raymond Hettinger From ncoghlan at iinet.net.au Tue Oct 28 06:19:31 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 28 06:19:37 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <200310280956.34183.aleaxit@yahoo.com> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> Message-ID: <3F9E50C3.4040908@iinet.net.au> The current situation: Rebinding a variable at module scope:

>>> def f():
        global x
        x = 1

>>> f()
>>> print x
1

If I try to write "global x.y" inside the function, Idle spits the dummy (and rightly so). I can rebind x.y quite happily, since I am only referencing x, and the lookup rules find the scope I need. I don't see any reason for 'global in ' or syntactic sugar for nonlocal binding (i.e. ":=" ) to accept anything that the current global does not. Similarly, consider the following from Idle:

>>> def f():
        x += 1

>>> x = 1
>>> f()
Traceback (most recent call last):
  File "", line 1, in -toplevel-
    f()
  File "", line 2, in f
    x += 1
UnboundLocalError: local variable 'x' referenced before assignment

Augmented assignment does not currently automatically invoke a "global" definition now, so why should that change no matter the outcome of this discussion? Guido's suggestion of "nonlocal" for a variant of global that searches intervening namespaces first seems nice - the term "non-local variable" certainly strikes me as the most frequently used way of referring to variables from containing scopes in this thread.

>>> def f():
        def g():
            nonlocal x
            x = 1
        g()
        print x

>>> f()
1

If 'nonlocal' was allowed only to _rebind_ variables, rather than create them at some other scope (probably advisable since 'nonlocal' merely says, 'somewhere other than here', which means there is no obvious suggestion for where to create the new variable - I could argue for either "module scope" or "nearest enclosing scope").
Defining it this way also allows catching problems at compile time instead of runtime (YMMV on whether that's a good thing or not) At this point, Just's "rebinding variable from outer scope only" assignment operator "x := 1" might seem like nice syntactic sugar for "nonlocal x; x =1" (it wouldn't require a new keyword, either) Is there really any need to allow anything more then replicating the search order for variable _reference_? Code which nests sufficient scopes that a simple 'inside-out' search is not sufficient would just seem sorely in need of a redesign to me. . . Regards, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From ncoghlan at iinet.net.au Tue Oct 28 06:33:40 2003 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Tue Oct 28 06:33:45 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: <3F9E50C3.4040908@iinet.net.au> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au> Message-ID: <3F9E5414.8020500@iinet.net.au> Nick Coghlan strung bits together to say: ::snip:: Saw the rather sensible suggestion to shelve this discussion only _after_ making my previous post. Ah well. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia ICQ#: 68854767 | ncoghlan@email.com Mobile: 0409 573 268 | http://www.talkinboutstuff.net "Let go your prejudices, lest they limit your thoughts and actions." From mickey at tm.informatik.uni-frankfurt.de Tue Oct 28 06:30:08 2003 From: mickey at tm.informatik.uni-frankfurt.de (Michael Lauer) Date: Tue Oct 28 06:34:05 2003 Subject: [Python-Dev] Re: 2.3.3 plans Message-ID: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> Anthony wrote: >I'm currently thinking of doing 2.3.3 in about 3 months time. 
My focus >on 2.3.3 will be on fixing the various build glitches that we have on >various platforms - I'd like to see 2.3.3 build on as many boxes as >possible, "out of the box". Does this also include cross compiling? I'm the maintainer of a python-for-arm-linux distribution (http://opie.net.wox.org/python) which is created using the OpenZaurus build infrastructure (http://openzaurus.org). To get Python cross compiled for arm-linux, I did a few (pretty rough) patches which I attached to this message. It would be useful for cross compiling, if (conceptually) the first two could be integrated into Python 2.3.3. Best Regards, Mickey. -- :M: -------------------------------------------------------------------------- Dipl.-Inf. Michael 'Mickey' Lauer mickey@tm.informatik.uni-frankfurt.de Raum 10b - ++49 69 798 28358 Fachbereich Informatik und Biologie -------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: python-2.3.2-crosscompile.patch Type: text/x-diff Size: 3409 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-2.3.2-crosscompile.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: python-modules-oz1.patch Type: text/x-diff Size: 2026 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-modules-oz1.bin -------------- next part -------------- A non-text attachment was scrubbed... Name: python-crosscompile-hotshot.patch Type: text/x-diff Size: 354 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-crosscompile-hotshot.bin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: python-crosscompile-socket.patch Type: text/x-diff Size: 286 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20031028/78f07782/python-crosscompile-socket.bin From ws-news at gmx.at Tue Oct 28 06:59:49 2003 From: ws-news at gmx.at (Werner Schiendl) Date: Tue Oct 28 07:00:14 2003 Subject: [Python-Dev] Re: copysort patch, was RE: inline sort option References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <000b01c39d3e$59af4de0$c807a044@oemcomputer> Message-ID: Hi, thought you might be interested in the opinions of someone not (yet) working full-day with Python and whose mother tongue is *not* English. "Raymond Hettinger" schrieb > Okay, this is the last chance to come up with a name other than > sorted(). > The method makes a copy, sorts that, and returns it, right? I think the copy is not fully clear from this name. I'd give it +0 > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() I'm rather -1 on it. Inline might be confused with inplace, and even when not it's not clear from the name that a copy is made. > sortedcopy() # clear that it makes a copy and does a sort My favourite (if the behaviour is how I believe it, that is *only* the copy is sorted) It's really obvious what is done. +1 > newsorted() # appropriate for a class method constructor I first read this news-orted, and had to step back. Also "new" is not actually the same as "copy" to me (maybe because of my C++ background). Say -0 hth Werner From amk at amk.ca Tue Oct 28 07:53:50 2003 From: amk at amk.ca (amk@amk.ca) Date: Tue Oct 28 07:53:55 2003 Subject: [Python-Dev] htmllib vs.
HTMLParser In-Reply-To: <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> References: <20031027160221.GA29155@rogue.amk.ca> <03Oct27.165334pst."58611"@synergy1.parc.xerox.com> Message-ID: <20031028125350.GC1095@rogue.amk.ca> On Mon, Oct 27, 2003 at 04:53:32PM -0800, Bill Janssen wrote: > But IMO simply adding some handler methods won't really do it. You > also need to introduce some knowledge about the semantics of the > syntax. For example, a new "block"-level element should close all > "in-line" elements that are currently open. Etc. Perhaps, but it might be a mug's game. I was on the Lynx developer list for a while, and bad HTML requires many, many hacks to be processed sensibly. Given that XHTML use is slowly rising, that work may not be necessary, but I'll keep it in mind. --amk From pje at telecommunity.com Tue Oct 28 08:46:08 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 08:46:16 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: <200310281031.38234.aleaxit@yahoo.com> References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> Message-ID: <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> At 10:31 AM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 01:11 am, Barry Warsaw wrote: > ... > > What I really want is access to a namespace, and then all the normal > > Python attribute access notations just work. They're one honking great > > idea after all. > >Yes, all in all this does remain my preference, too. I'd take stropping (or >"keyword stropping" a la Greg's "outer x") rather than declarative stuff, >but just getting a namespace (in ways the compiler could recognize, >i.e. by magicnames such as __me__) and then using __me__.x=23 >would require no new syntax and be maximally obvious. Sigh. 
Why not just:

    import whatevermynameis

    whatevermynameis.foo = bar

This would be even *more* maximally obvious, as you wouldn't need to know what '__me__' means. :) And how often do you write a module without knowing what its name is, or change the name after you've written it? Plus, thanks to the time machine, it already works. :) Heck, now that I've thought of it, I'm almost tempted to go change all my existing uses of global to this instead... From pje at telecommunity.com Tue Oct 28 08:57:54 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 08:57:47 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310280956.34183.aleaxit@yahoo.com> References: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> Message-ID: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: >AND, adaptation is not typecasting: >e.g y=adapt("23", int) should NOT succeed. Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and str doesn't have __conform__. But why would you intend that they not have them in future? And, why do you consider adaptation *not* to be typecasting? I always think of it as "give me X, rendered as a Y", which certainly sounds like a description of typecasting to me. From pje at telecommunity.com Tue Oct 28 09:01:29 2003 From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Oct 28 09:00:41 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <000b01c39d3e$59af4de0$c807a044@oemcomputer> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> Message-ID: <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> At 05:29 AM 10/28/03 -0500, Raymond Hettinger wrote: > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor +1 on sortedcopy(), especially if it's usable as a method, e.g. myList.sortedcopy(). (Note that that doesn't exclude it also being spelled as 'list.sortedcopy(myList)'.) From niemeyer at conectiva.com Tue Oct 28 08:59:38 2003 From: niemeyer at conectiva.com (Gustavo Niemeyer) Date: Tue Oct 28 09:01:10 2003 Subject: list.sort, was Re: [Python-Dev] decorate-sort-undecorate In-Reply-To: <008f01c39d11$e5f13160$81b0958d@oemcomputer> References: <008f01c39d11$e5f13160$81b0958d@oemcomputer> Message-ID: <20031028135938.GA22878@ibook.distro.conectiva> Hi Raymond! If that has been discussed long ago as you mention, please, just tell me so and I won't try to recreate the same discussion. :-) [...] > islice(it, None, None, -1) is a disaster when presented with an infinite > iterator and a near disaster with a long running iterator. I don't agree with that approach. I think islice() would be more useful if it was based on best effort to try to reduce memory usage. With a negative index, it will be necessary to iterate trough all steps, but it can do better than list(iter)[-1], for example. Knowing that you have a negative index of -1 means that you may cache just a single entry, instead of the whole list. > Handling negative steps entails saving data in memory. Indeed. 
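Gustavo's "cache just a single entry" idea is only a few lines of code. A minimal sketch in today's spelling (ilast is a hypothetical helper name, not anything actually in itertools):

```python
def ilast(iterable):
    """Return the last item of an iterable, holding only one item in memory.

    Unlike list(iterable)[-1] this never builds the whole list, though it
    still loops forever if handed an infinite iterator.
    """
    sentinel = object()
    last = sentinel
    for last in iterable:
        pass
    if last is sentinel:
        raise IndexError("ilast() of empty iterable")
    return last

# Equivalent to list(range(10**6))[-1], but with O(1) extra memory.
assert ilast(iter(range(1000000))) == 999999
```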
But if I *want* to use a negative index over an iterator, it would be nice if some smart guy did the work for "me" instead of having to do that by hand, or even worse, having to use a list() which will store *everything* in memory. As a real world example, have a look at rrule's __getitem__() method (more info on https://moin.conectiva.com.br/DateUtil):

    def __getitem__(self, item):
        if self._cache_complete:
            return self._cache[item]
        elif isinstance(item, slice):
            if item.step and item.step < 0:
                return list(iter(self))[item]
            else:
                return list(itertools.islice(self,
                                             item.start or 0,
                                             item.stop or sys.maxint,
                                             item.step or 1))
        elif item >= 0:
            gen = iter(self)
            try:
                for i in range(item+1):
                    res = gen.next()
            except StopIteration:
                raise IndexError
            return res
        else:
            return list(iter(self))[item]

Having negative indexes is *very* useful in that context, and I'd like so much to turn it into simply

    return list(itertools.islice(self, item.start, item.stop, item.step))

Now, have a look at the count() method, which is useful as well (it is the same as a __len__() method, but introducing __len__() kills the iterator performance).

    def count(self):
        if self._len is None:
            for x in self:
                pass
        return self._len

It is very useful as well, and having something like ilen() would be nice, even though it must iterate over the whole sequence. This would never end up in an infinite loop in that context, and even if it did, I wouldn't care about it. Not introducing it for being afraid of an infinite loop would be the same as removing the 'while' construct to avoid "while 1: pass". -- Gustavo Niemeyer http://niemeyer.net From guido at python.org Tue Oct 28 10:15:02 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:18:09 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: Your message of "Tue, 28 Oct 2003 10:27:06 +0100."
<200310281027.06138.aleaxit@yahoo.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310280054.h9S0sLv27908@12-236-54-216.client.attbi.com> <200310281027.06138.aleaxit@yahoo.com> Message-ID: <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> > > > Also, I have a question about the semantic specification of what a copy > > > is supposed to do. Does it guarantee that the same data stream will be > > > reproduced? For instance, would a generator of random words expect its > > > copy to generate the same word sequence. Or, would a copy of a > > > dictionary iterator change its output if the underlying dictionary got > > > updated (i.e. should the dict be frozen to changes when a copy exists or > > > should it mutate). > > > > Every attempt should be made for the two copies to return exactly the > > same stream of values. This is the pure tee() semantics. > > Yes, but iterators that run on underlying containers don't guarantee, > in general, what happens if the container is mutated while the iteration > is going on -- arbitrary items may end up being skipped, repeated, etc. > So, "every attempt" is, I feel, too strong here. Maybe. I agree that for list and dict iterators, if the list is mutated, this warrantee shall be void. But I strongly believe that cloning a random iterator should cause two identical streams of numbers, not two different random streams. If you want two random streams you should create two independent iterators. Most random number generators have a sufficiently small amount of state that making a copy isn't a big deal. If it is hooked up to an external source (e.g. /dev/random) then I'd say you'd have to treat it as a file, and introduce explicit buffering. > deepcopy exists for those cases where one is ready to pay a hefty > price for guarantees of "decoupling", after all. But I don't propose that iterators support __deepcopy__. The use case is very different. 
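The "pure tee() semantics" Guido describes -- both copies returning exactly the same stream of values -- is what the itertools.tee that later appeared in Python 2.4 provides. A quick sketch in modern spelling:

```python
import itertools
import random

def noisy_stream():
    # Seeded only so this demo is repeatable; any generator would do.
    rng = random.Random(42)
    while True:
        yield rng.random()

a, b = itertools.tee(noisy_stream())
# Both branches see the identical stream of values, in the same order,
# even though the underlying generator is advanced only once; tee buffers
# whatever one branch has consumed ahead of the other.
assert [next(a) for _ in range(5)] == [next(b) for _ in range(5)]
```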
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:18:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:19:00 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Tue, 28 Oct 2003 05:29:21 EST." <000b01c39d3e$59af4de0$c807a044@oemcomputer> References: <000b01c39d3e$59af4de0$c807a044@oemcomputer> Message-ID: <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> > Okay, this is the last chance to come-up with a name other than > sorted(). > > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor > > > I especially like the last one and all of them provide a distinction > from list.sort(). While we're voting, I still like list.sorted() best, so please keep that one in the list of possibilities. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:16:37 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:26:03 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 10:37:44 +0100." <200310281037.44424.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> Message-ID: <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> > So perhaps for 2.3 we should just apologetically note the anomaly > in the docs, and for 2.4 forbid the former case, i.e., require both > __mul__ AND __rmul__ to exist if one wants to code sequence > classes that can be multiplied by integers on either side...? > > Any opinions, anybody...? What's wrong with the status quo? So 3*x is undefined, and it happens to return x*3. Is that so bad? 
--Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:27:45 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:28:03 2003 Subject: [Python-Dev] replacing 'global' In-Reply-To: Your message of "Tue, 28 Oct 2003 21:19:31 +1000." <3F9E50C3.4040908@iinet.net.au> References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au> Message-ID: <200310281527.h9SFRjw29046@12-236-54-216.client.attbi.com> > Augmented assignment does not currently automatically invoke a > "global" definition now, so why should that change no matter the > outcome of this discussion? Because of the fair user expectation that if you can write "x = x + 1" you should also be able to write "x += 1". > Is there really any need to allow anything more then replicating the > search order for variable _reference_? Code which nests sufficient > scopes that a simple 'inside-out' search is not sufficient would > just seem sorely in need of a redesign to me. . . I just realized one thing that explains why I prefer explicitly designating the scope (as in 'global x in f') over something like 'nonlocal'. It matches what the current global statement does, and it makes it crystal clear that you *can* declare a variable in a specific scope and assign to it without requiring there to be a binding for that variable in the scope itself. EIBTI when comparing these two. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:29:53 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:30:05 2003 Subject: [Python-Dev] Re: 2.3.3 plans In-Reply-To: Your message of "Tue, 28 Oct 2003 12:30:08 +0100." 
<1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> References: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> Message-ID: <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> > Does this also include cross compiling? I'm the maintainer of a > python-for-arm-linux distribution (http://opie.net.wox.org/python) > which is created using the OpenZaurus build infrastructure > (http://openzaurus.org). I think this is a worthy cause to try and support. (I love my Zaurus.) > To get Python cross compiled for arm-linux, I did a few (pretty > rough) patches which I attached to this message. It would be useful > for cross compiling, if (conceptually) the first two could be > integrated into Python 2.3.3. I hope someone here can work with you on getting the patches in acceptable shape. You should start by uploading them to the patch manager in SourceForge. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:33:34 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:34:01 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Tue, 28 Oct 2003 12:40:42 GMT." <20031028124042.GA22513@vicky.ecs.soton.ac.uk> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <200310270851.02495.aleaxit@yahoo.com> <20031027103540.GA27782@vicky.ecs.soton.ac.uk> <200310271609.03819.aleaxit@yahoo.com> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> Message-ID: <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> I've lost context for the following thread. What is this about? I can answer one technical question regardless, but I have no idea what I'm promoting here. :-) > Hello Alex, > > On Mon, Oct 27, 2003 at 04:09:03PM +0100, Alex Martelli wrote: > > Cool! Why don't you try copy.copy on types you don't automatically > > recognize and know how to deal with, BTW? 
That might make this > > cool piece of code general enough that Guido might perhaps allow > > generator-produced iterators to grow it as their __copy__ method... > > I will try. Note that only __deepcopy__ makes sense, as far as I can tell, > because there is too much state that really needs to be copied and not shared > in a generator (in particular, the sequence iterators from 'for' loops). > > I'm not sure about how deep-copying should be defined for built-in > types. Should a built-in __deepcopy__ method try to import and call > copy.deepcopy() on the sub-items? This doesn't seem to be right. Almost -- you have to pass the memo argument that your __deepcopy__ received as the second argument to the recursive deepcopy() calls, to avoid looping on cycles. > A bientot, > > Armin. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Oct 28 10:36:23 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 10:36:29 2003 Subject: [Python-Dev] Alternate notation for global variable assignments In-Reply-To: Your message of "Tue, 28 Oct 2003 08:46:08 EST." <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> Message-ID: <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com> > Why not just: > > import whatevermynameis > > whatevermynameis.foo = bar > > This would be even *more* maximally obvious, as you wouldn't need to > know what '__me__' means. :) And how often do you write a module > without knowing what its name is, or change the name after you've > written it? Plus, thanks to the time machine, it already works. :) Doesn't work when your module may either be called __main__ or rumpelstiltkin. 
It would then become

    if __name__ == "__main__":
        import __main__ as me
    else:
        import rumpelstiltkin as me

which loses the "aha!" effect of a cool solution. It also IMO requires too much explanation to the unsuspecting reader who doesn't understand right away *why* rumpelstiltkin imports itself. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 28 10:39:08 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 10:39:14 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> Message-ID: <200310281639.08240.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:16 pm, Guido van Rossum wrote: > > So perhaps for 2.3 we should just apologetically note the anomaly > > in the docs, and for 2.4 forbid the former case, i.e., require both > > __mul__ AND __rmul__ to exist if one wants to code sequence > > classes that can be multiplied by integers on either side...? > > > > Any opinions, anybody...? > > What's wrong with the status quo? So 3*x is undefined, and it happens > to return x*3. Is that so bad? Where is it specified that 3*x "is undefined" when x's type exposes __mul__ but not __rmul__ ? Sorry, I don't understand the viewpoint you seem to imply here. If x's type exposed no __add__ but "it just so happened" that x+23 always returned 12345 -- while every other addition, as expected, failed -- wouldn't you agree that the lack of a normal and reasonably expected exception is bad? I think that if Python returns "arbitrary" results, rather than raising an exception, for operations that "should" raise an exception, that is surely very bad -- it makes it that much harder for programmers to debug the programs they're developing.
If there's some doubt about the words I've put in quotes -- that treating x*y just like y*x only for certain values of type(y) isn't arbitrary or shouldn't raise -- then we can of course discuss this, but isn't the general idea correct? Now, the docs currently say, about sequences under http://www.python.org/doc/current/ref/sequence-types.html :

"""
sequence types should implement ... multiplication (meaning repetition) by
defining the methods __mul__(), __rmul__() and __imul__() described below;
they should not define __coerce__() or other numerical operators.
"""

So, a sequence-emulating type that implements __mul__ but not __rmul__ appears to violate that "should". The description of __mul__ and __rmul__ referred to seems to be that at http://www.python.org/doc/current/ref/numeric-types.html . It says that methods corresponding to operations not supported by a particular kind of number should be left undefined (as opposed to the behavior of _attempts at those operations_ being undefined), so if I had a hypothetical number type X such that, for x instance of X and an integer k, x*k should be supported but k*x shouldn't, isn't this a recommendation to not write __rmul__ in X ...? Besides, this weird anomaly is typical of newstyle classes only. Consider:

>>> class X:
...     def __mul__(self, other): return 23
...
>>> x=X()
>>> x*7
23
>>> 7*x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for *: 'int' and 'instance'
>>>

ALL wonderful, just as expected, hunky-dory. But now, having read that newstyle classes are better, I want to make X newstyle -- can't see any indication in the docs that I shouldn't -- and...:

>>> class X(object):
...     def __mul__(self, other): return 23
...
>>> x=X()
>>> x*7
23
>>> 7*x
23
>>>

*eep*! Yes, it DOES seem to be that this is QUITE bad indeed.
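For contrast, a class that really means to support multiplication on both sides spells it out, as the reference manual's "should" asks. A minimal sketch with a new-style class (Seq is a made-up name for illustration):

```python
class Seq(object):
    """A tiny sequence-like class supporting repetition on either side."""
    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, n):
        return Seq(self.items * n)

    # Defining __rmul__ explicitly makes 3 * s documented behavior,
    # rather than an accident of the type machinery.
    __rmul__ = __mul__

s = Seq(["a", "b"])
assert (s * 3).items == ["a", "b", "a", "b", "a", "b"]
assert (3 * s).items == (s * 3).items
```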
Alex From mcherm at mcherm.com Tue Oct 28 10:44:38 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 28 10:44:39 2003 Subject: [Python-Dev] product() Message-ID: <1067355878.3f9e8ee6bf177@mcherm.com> Nick Coghlan writes: > >>> if all(pred(x) for x in values): pass # alltrue > >>> if any(pred(x) for x in values): pass # anytrue > >>> if any(not pred(x) for x in values): pass # anyfalse > >>> if all(not pred(x) for x in values): pass # allfalse > > The names from the earlier thread do read nicely. . . +1 Very nicely indeed. -- Michael Chermside From aleaxit at yahoo.com Tue Oct 28 11:03:58 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:04:04 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> Message-ID: <200310281703.58169.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:33 pm, Guido van Rossum wrote: > I've lost context for the following thread. What is this about? I > can answer one technical question regardless, but I have no idea what > I'm promoting here. :-) Armin Rigo posted an URL to a C extension module he has recently developed, and used. That module is able to copy running instances of a generator. Currently, it does so by just "knowing", and type by type dealing with, several types whose instances could be values referred to by the generator's frame. I was suggesting extending this so that, when values of other types are met, instead of automatically failing (or, as I believe I recall Armin's extension does today, copying the reference rather than the value), the copy process would...: > > > Cool! Why don't you try copy.copy on types you don't automatically > > > recognize and know how to deal with, BTW? 
That might make this and Armin said that __copy__ seems weak to him but __deepcopy__ might not be: > > I will try. Note that only __deepcopy__ makes sense, as far as I can > > tell, because there is too much state that really needs to be copied and > > not shared in a generator (in particular, the sequence iterators from > > 'for' loops). So, now he went on to ask about __deepcopy__ and you answered: > > I'm not sure about how deep-copying should be defined for built-in > > types. Should a built-in __deepcopy__ method try to import and call > > copy.deepcopy() on the sub-items? This doesn't seem to be right. > > Almost -- you have to pass the memo argument that your __deepcopy__ > received as the second argument to the recursive deepcopy() calls, to > avoid looping on cycles. Now, if Armin's code can only provide __deepcopy__ and not __copy__, then it's probably of little use wrt the __copy__ functionality I talk about in PEP 323 (which I still must revise to take into account your feedback and Raymond's -- plan to get to that as soon as I've cleared my mbox) -- the memory and time expenditure is likely to be too high for that. It's still going to be a cool hack, well worth "publishing" as such, and probably able to be "user-installed" as the way deepcopy deals with generators even though generators themselves may not sprout a __deepcopy__ method themselves (fortunately, copy.copy does a lot of "ad-hoc protocol adaptation" -- it's occasionally a bit rambling or hard to follow, but often allows plugging in "third party copiers" for types which their authors hadn't imagined would be copied or deep copied, so that other innocent client code which just calls copy.copy(x) will work... essentially how "real" adaptation would work, except that registering a third-party protocol adapter would be easier:-). 
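The "third-party copiers" hook Alex mentions lives in the copy_reg module (renamed copyreg in Python 3): registering a reduce function there teaches copy.copy, copy.deepcopy and pickle about a type without touching the class itself. A minimal sketch, with Token as a stand-in for a third-party class its author never imagined being copied:

```python
import copy
import copyreg  # spelled copy_reg in the Python 2.x line of this era

class Token:
    """Stand-in for a third-party type with no copy support of its own."""
    def __init__(self, name):
        self.name = name

def reduce_token(t):
    # A reduce function: (callable, args) that rebuilds an equivalent object.
    return (Token, (t.name,))

# Registration goes into copyreg.dispatch_table, which both the copy
# module and pickle consult before falling back on __reduce_ex__.
copyreg.pickle(Token, reduce_token)

t = Token("x")
t2 = copy.copy(t)
assert t2 is not t and t2.name == "x"
```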
Alex From aleaxit at yahoo.com Tue Oct 28 11:23:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:23:39 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> References: <000b01c39d3e$59af4de0$c807a044@oemcomputer> <200310281518.h9SFIj129025@12-236-54-216.client.attbi.com> Message-ID: <200310281723.31940.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:18 pm, Guido van Rossum wrote: > > Okay, this is the last chance to come-up with a name other than > > sorted(). > > > > Here are some alternatives: > > > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > > > > > I especially like the last one and all of them provide a distinction > > from list.sort(). > > While we're voting, I still like list.sorted() best, so please keep > that one in the list of possibilities. I also like list.sorted() -- but list.newsorted() is IMHO even a LITTLE bit better, making it even clearer that it's giving a NEW list. Just a _little_ preference, mind you. "sortedcopy" appears to me a BIT less clear (what "copy", if the arg isn't a list...?), "inlinesort" worst. BTW, I think I should point out one POSSIBLE problem with classmethods -- since unfortunately they CAN be called on an instance, and will ignore that instance, this may confuse an unsuspecting user. I was arguing on c.l.py that this _wasn't_ confusing because I never saw anybody made that mistake with dict.fromkeys ... and the response was "what's that?"... i.e., people aren't making mistakes with it because they have no knowledge of it. list.newsorted or however it's going to be called is probably going to be more popular than existing dict.fromkeys, so the issue may be more likely to arise there. 
Although I think the issue can safely be ignored, I also think I should point it out anyway, even just to get concurrence on this -- it IS possible that the classmethod idea is winning "by accident" just because nobody had thought of the issue, after all, and that would be bad (and I say so even if I was the original proposer of the list.sorted classmethod and still think it should be adopted -- don't want it to get in "on the sly" because a possible problem was overlooked, as opposed to, considered and decided to be not an issue). OK, so here's the problem, exemplified with dict.fromkeys:

    d = {1:2, 3:4, 5:6}
    dd = d.fromkeys([3, 5])

it's not immediately obvious that the value of d matters not a whit -- that this is NOT going to return a subset of d "taken from the keys" 3 and 5, i.e. {3:4, 5:6}, but, rather, {3:None, 5:None} -- and the functionality a naive user might attribute to that call d.fromkeys([3, 5]) should in fact be coded quite differently, e.g.:

    dd = dict([ (k,v) for k, v in d.iteritems() if k in [3,5] ])

or perhaps, if most keys are going to be copied:

    dd = d.copy()
    for k in d:
        if k not in [3, 5]:
            del dd[k]

The situation with list.sorted might be somewhat similar, although in fact I think that it's harder to construct a case of fully sympathizable-with confusion like the above. Still:

    L = range(7)
    LL = L.sorted()

this raises an exception (presumably about L.sorted needing "at least 1 arguments, got 0" -- that's what d.fromkeys() would do today), so the issue ain't as bad -- it will just take the user a while to understand WHY, but at least there shouldn't be a running program with strange results, which makes for harder debugging. Or:

    L = range(7)
    LL = L.sorted(('fee', 'fie', 'foo'))

I'm not sure what the coder might expect here, but again it seems possible that he expected the value of L to matter in some way to the resulting value of LL.
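The dict.fromkeys surprise described above is easy to verify at the interpreter (spelled here with today's dict comprehension rather than the 2.3-era list-comprehension-in-dict() idiom):

```python
d = {1: 2, 3: 4, 5: 6}

# The instance's value is irrelevant: fromkeys is a constructor in disguise,
# so calling it on d gives exactly what calling it on the type gives.
assert d.fromkeys([3, 5]) == {3: None, 5: None}
assert d.fromkeys([3, 5]) == dict.fromkeys([3, 5])

# The subset a naive reader might have expected must be spelled explicitly:
subset = {k: v for k, v in d.items() if k in (3, 5)}
assert subset == {3: 4, 5: 6}
```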
Perhaps this points to an issue with classmethods in general, due in part to the fact that they're still rather little used in Python -- callers of instance.method() may mostly expect that the result has something to do with the instance's value, rather than being the same as type(instance).method() -- little we can do about it at this point except instruction, I think. Alex From aleaxit at yahoo.com Tue Oct 28 11:35:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:37:30 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310281027.06138.aleaxit@yahoo.com> <200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com> Message-ID: <200310281735.41103.aleaxit@yahoo.com> On Tuesday 28 October 2003 04:15 pm, Guido van Rossum wrote: ... > > > Every attempt should be made for the two copies to return exactly the > > > same stream of values. This is the pure tee() semantics. > > > > Yes, but iterators that run on underlying containers don't guarantee, > > in general, what happens if the container is mutated while the iteration > > is going on -- arbitrary items may end up being skipped, repeated, etc. > > So, "every attempt" is, I feel, too strong here. > > Maybe. > > I agree that for list and dict iterators, if the list is mutated, this > warrantee shall be void. OK, noticed -- and I'll clarify the PEP about this, thanks. > But I strongly believe that cloning a random iterator should cause two > identical streams of numbers, not two different random streams. If > you want two random streams you should create two independent > iterators. Most random number generators have a sufficiently small > amount of state that making a copy isn't a big deal. If it is hooked > up to an external source (e.g. 
/dev/random) then I'd say you'd have to > treat it as a file, and introduce explicit buffering. I really truly REALLY like this. I _was_ after all the one who lobbied Tim to add getstate and setstate to random.py, back in the not-so- long-ago time when I was a total Python newbie -- exactly because, being a NOT-newbie consumer of pseudo-random streams, I loathed and detested the pseudo-random generators that didn't allow me to reproduce experiments in this way. So, I entirely agree that if pseudo-random numbers are being consumed through a "pseudo-random iterator" the copy should indeed step through just the same numbers. Again, this will get in the PEP -- *thanks*! Btw, random.py doesn't seem to supply pseudo-random iterators -- easy to make e.g. with iter(random.random, None) [assuming you want a nonterminating one], but that wouldn't be copyable. Should something be done about that...? And as for NON-pseudo random numbers, such as those supplied by /dev/random and other external sources, yes, of course, they should NOT be copy'able -- best to let tee() work on them by making its buffer, or else wrap them in a buffer-to-file way if one needs to "snapshot" the sequence then re-"generate" a lot of it later for reproducibility purposes. I.e., absolute agreement. > > deepcopy exists for those cases where one is ready to pay a hefty > > price for guarantees of "decoupling", after all. > > But I don't propose that iterators support __deepcopy__. The use case > is very different. Yes, the use case of __deepcopy__ is indeed quite different (and to be honest it doesn't appear in my actual experience -- I can "imagine" some as well as the next man, but they'd be made out of whole cloth:-). But I was under the impression that you wanted them in PEP 323 too? Maybe I misunderstood your words. Should I take them out of PEP 323? In that case somebody else can later PEP that if they want, and I can basically wash my hands of them -- what do you think? 
Alex From aleaxit at yahoo.com Tue Oct 28 11:42:39 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:42:46 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com> Message-ID: <200310281742.39349.aleaxit@yahoo.com> On Tuesday 28 October 2003 03:01 pm, Phillip J. Eby wrote: > At 05:29 AM 10/28/03 -0500, Raymond Hettinger wrote: > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > +1 on sortedcopy(), especially if it's usable as a method, e.g. > myList.sortedcopy(). (Note that that doesn't exclude it also being spelled > as 'list.sortedcopy(myList)'.) Please explain how it might work when the argument to list.sortedcopy is *NOT* an instance of type list, but rather a completely general sequence, as a classmethod will of course allow us to have. Maybe I'm missing some recent Python enhancements, but I thought that, if a method is fully usable as an instancemethod, then when called on the type it's an unbound method and will ONLY support being called with an instance of the type as the 1st arg. Hmmm... maybe one COULD make a custom descriptor that does support both usages... and maybe it IS worth making the .sorted (or whatever name) entry a case of exactly such a subtle custom descriptor... Alex From FBatista at uniFON.com.ar Tue Oct 28 11:52:48 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Tue Oct 28 11:53:43 2003 Subject: [Python-Dev] Decimal.py in sandbox Message-ID: Aahz wrote: #- The first thing you should do is talk with Eric Price #- (eprice@tjhsst.edu), author of the code. 
You don't need to #- use SF for #- now; CVS should be fine, but you should find out whether #- Eric would like #- to approve changes first. Eric Price wrote: #- Not really-- since school started, I haven't had much time #- to spare. #- I'll probably look over the changes at some time, but I #- wouldn't want to #- keep them waiting. So, to who may I send the changes? Should I send the whole staff at the end of the work, or keep feeding small changes? Should I send by email the diff results? Thanks for the answers. . Facundo -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20031028/6f6af4d2/attachment.html From aleaxit at yahoo.com Tue Oct 28 11:55:44 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 11:56:11 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> References: <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> Message-ID: <200310281755.44307.aleaxit@yahoo.com> On Tuesday 28 October 2003 02:57 pm, Phillip J. Eby wrote: > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > >AND, adaptation is not typecasting: > >e.g y=adapt("23", int) should NOT succeed. > > Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and > str doesn't have __conform__. But why would you intend that they not have > them in future? I'd be delighted to have the int type sprout __adapt__ and the str type sprout __conform__ -- but neither should accept this case, see below. > And, why do you consider adaptation *not* to be typecasting? I always > think of it as "give me X, rendered as a Y", which certainly sounds like a > description of typecasting to me. typecasting (in Python) makes a NEW object whose value is somehow "built" (possibly in a very loose sense) from the supplied argument[s], but need not have any more than a somewhat tangential relation with them. adaptation returns "the same object" passed as the argument, or a wrapper to it that makes it comply with the protocol. To give a specific example: x = file("foo.txt") now (assuming this succeeds) x is a readonly object which is an instance of file. The argument string "foo.txt" has "indicated", quite indirectly, how to construct the file object, but there's really no true connection between the value of the argument string and what will happen as that object x is read. 
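[Editor's sketch: Alex's distinction -- typecasting *builds a new object* from the argument, adaptation *wraps the same value* so it conforms to a protocol -- can be illustrated with a toy `adapt` in modern Python. This is not the PEP 246 machinery; `io.StringIO`/`io.IOBase` stand in for the era's `cStringIO` and "file protocol".]

```python
import io

def adapt(obj, protocol):
    # Toy adaptation: return the object itself if it already conforms,
    # else wrap the SAME value so it satisfies the protocol -- contrast
    # with typecasting, where file("foo.txt") would open a disk file
    # merely *indicated* by the string.
    if isinstance(obj, protocol):
        return obj
    if protocol is io.IOBase and isinstance(obj, str):
        # the string's own characters become the "file" contents
        return io.StringIO(obj)
    raise TypeError("cannot adapt %r" % (obj,))

x = adapt("foo.txt", io.IOBase)
print(x.read(3))   # foo -- the value itself, not data from disk
x.seek(0)
print(x.read(2))   # fo
```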
Thinking of what should happen upon:

    x = adapt("foo.txt", file)

what I envision is DEFINITELY the equivalent of:

    x = cStringIO.StringIO("foo.txt")

i.e., the value (aka object) "foo.txt", wrapped appropriately so as to
conform to the (readonly) "file protocol" (I can call x.read(3) and get
"foo", then x.seek(0) then x.read(2) and get "fo", etc).

Hmmm, that PEP definitely needs updating (including mentions of
PyProtocol as well as of this issue...)...!  I've been rather remiss
about it so far -- sorry.


Alex

From python at rcn.com  Tue Oct 28 12:09:55 2003
From: python at rcn.com (Raymond Hettinger)
Date: Tue Oct 28 12:14:04 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <200310281735.41103.aleaxit@yahoo.com>
Message-ID: <003401c39d76$4f36d080$1a3ac797@oemcomputer>

[Alex]
> Btw, random.py doesn't seem to supply pseudo-random iterators --
> easy to make e.g. with iter(random.random, None) [assuming you
> want a nonterminating one],

Probably a bit faster with:

    starmap(random.random, repeat(()))

> but that wouldn't be copyable.  Should
> something be done about that...?

No.

1) The use case is not typical.  Other than random.random() and
time.ctime(), it is rare to see functions of zero arguments that
usefully return an infinite sequence of distinct values.

2) If you need a copy, run it through tee().


Raymond

From tjreedy at udel.edu  Tue Oct 28 12:16:47 2003
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue Oct 28 12:15:46 2003
Subject: [Python-Dev] Re: copysort patch, was RE: inline sort option
References: <000b01c39d3e$59af4de0$c807a044@oemcomputer>
	<200310281518.h9SFIj129025@12-236-54-216.client.attbi.com>
Message-ID: 

"Guido van Rossum" wrote in message
news:200310281518.h9SFIj129025@12-236-54-216.client.attbi.com...
> > Here are some alternatives: > > > > inlinesort() # immediately clear how it is different from sort() > > sortedcopy() # clear that it makes a copy and does a sort > > newsorted() # appropriate for a class method constructor > > > > > > I especially like the last one and all of them provide a distinction > > from list.sort(). > > While we're voting, I still like list.sorted() best, so please keep > that one in the list of possibilities. After thinking about it some more, I also prefer .sorted to suggested alternatives. I read it as follows: list(iterable) means 'make a list from iterable (preserving item order)' list.sorted(iterable) means 'make a sorted list from iterable' While I generally like verbs for method names, the adjective form works here as modifing the noun/verb 'list' and the invoked construction process. 'Inline' strikes me as confusing. 'Copy' and 'new' strike me as redundant noise since, in the new 2.2+ regime, 'list' as a typal verb *means* 'make a new list'. Terry J. Reedy Terry J. Reedy From tanzer at swing.co.at Tue Oct 28 12:23:29 2003 From: tanzer at swing.co.at (Christian Tanzer) Date: Tue Oct 28 12:24:01 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: Your message of "Tue, 28 Oct 2003 17:23:31 +0100." <200310281723.31940.aleaxit@yahoo.com> Message-ID: Alex Martelli wrote: > On Tuesday 28 October 2003 04:18 pm, Guido van Rossum wrote: > > > Okay, this is the last chance to come-up with a name other than > > > sorted(). > > > > > > Here are some alternatives: > > > > > > inlinesort() # immediately clear how it is different from sort() > > > sortedcopy() # clear that it makes a copy and does a sort > > > newsorted() # appropriate for a class method constructor > > > > > > > > > I especially like the last one and all of them provide a distinction > > > from list.sort(). > > > > While we're voting, I still like list.sorted() best, so please keep > > that one in the list of possibilities. 
> > I also like list.sorted() -- but list.newsorted() is IMHO even a LITTLE > bit better, making it even clearer that it's giving a NEW list. Just > a _little_ preference, mind you. "sortedcopy" appears to me a BIT > less clear (what "copy", if the arg isn't a list...?), "inlinesort" worst. IMO, sorted is the clearest, all other proposals carry excess baggage making them less clear. > Perhaps this points to an issue with classmethods in > general, due in part to the fact that they're still rather > little used in Python -- callers of instance.method() > may mostly expect that the result has something to > do with the instance's value, rather than being the > same as type(instance).method() -- little we can do > about it at this point except instruction, I think. Or put the method into the metaclass. I'm using both classmethods and methods defined by metaclasses and didn't get any complaints about classmethods yet. -- Christian Tanzer http://www.c-tanzer.at/ From jjl at pobox.com Tue Oct 28 12:25:54 2003 From: jjl at pobox.com (John J Lee) Date: Tue Oct 28 12:27:22 2003 Subject: [Python-Dev] Re: [Web-SIG] Threading and client-side support In-Reply-To: <20031028124646.GB1095@rogue.amk.ca> References: <20031027150709.GA29045@rogue.amk.ca> <20031028124646.GB1095@rogue.amk.ca> Message-ID: [background for python-dev-ers: In the process of making my client-side cookie module a suitable candidate for inclusion in the standard library, I'm trying to make it thread-safe] On Tue, 28 Oct 2003 amk@amk.ca wrote: > On Tue, Oct 28, 2003 at 10:35:33AM +0000, John J Lee wrote: > > Thanks. So, in particular, httplib, urllib and urllib2 are thread-safe? > > No idea; reading the code would be needed to figure that out. That might not be helpful if the person reading it (me) has zero threading experience ;-) I certainly plan to gain that experience, but surely *somebody* already knows whether they're thread-safe? 
I presume they are, broadly, since a couple of violations of thread
safety are commented in urllib2 and urllib.  Right?


John

From guido at python.org  Tue Oct 28 12:42:16 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 12:42:24 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:42:39 +0100."
	<200310281742.39349.aleaxit@yahoo.com>
References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz>
	<5.1.0.14.0.20031028090013.03a9b200@mail.telecommunity.com>
	<200310281742.39349.aleaxit@yahoo.com>
Message-ID: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com>

> Hmmm... maybe one COULD make a custom descriptor that does support
> both usages... and maybe it IS worth making the .sorted (or whatever name)
> entry a case of exactly such a subtle custom descriptor...

Thanks for the idea, I can use this as a perverted example in my talk
at Stanford tomorrow.  Here it is:

import new

def curry(f, x, cls=None):
    return new.instancemethod(f, x)

class MagicDescriptor(object):
    def __init__(self, classmeth, instmeth):
        self.classmeth = classmeth
        self.instmeth = instmeth
    def __get__(self, obj, cls):
        if obj is None:
            return curry(self.classmeth, cls)
        else:
            return curry(self.instmeth, obj)

class MagicList(list):
    def _classcopy(cls, lst):
        obj = cls(lst)
        obj.sort()
        return obj
    def _instcopy(self):
        obj = self.__class__(self)
        obj.sort()
        return obj
    sorted = MagicDescriptor(_classcopy, _instcopy)

class SubClass(MagicList):
    def __str__(self):
        return "SubClass(%s)" % str(list(self))

unsorted = (1, 10, 2)
print MagicList.sorted(unsorted)
print MagicList(unsorted).sorted()
print SubClass.sorted(unsorted)
print SubClass(unsorted).sorted()

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 12:51:59 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 12:52:07 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1
pep-0000.txt, 1.254, 1.255
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:35:41 +0100."
	<200310281735.41103.aleaxit@yahoo.com>
References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
	<200310281027.06138.aleaxit@yahoo.com>
	<200310281515.h9SFF2c28991@12-236-54-216.client.attbi.com>
	<200310281735.41103.aleaxit@yahoo.com>
Message-ID: <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com>

> Yes, the use case of __deepcopy__ is indeed quite different (and
> to be honest it doesn't appear in my actual experience -- I can "imagine"
> some as well as the next man, but they'd be made out of whole cloth:-).
> But I was under the impression that you wanted them in PEP 323 too?
> Maybe I misunderstood your words.  Should I take them out of PEP 323?
> In that case somebody else can later PEP that if they want, and I can
> basically wash my hands of them -- what do you think?

I think it would be better if PEP 323 only did __copy__, so you can
remove all traces of __deepcopy__.  I don't recall what I said, maybe
I wasn't clear.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 13:00:14 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:00:46 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:03:58 +0100."
	<200310281703.58169.aleaxit@yahoo.com>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
	<20031028124042.GA22513@vicky.ecs.soton.ac.uk>
	<200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com>
	<200310281703.58169.aleaxit@yahoo.com>
Message-ID: <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>

> Armin Rigo posted an URL to a C extension module he has recently
> developed, and used.  That module is able to copy running instances
> of a generator.  Currently, it does so by just "knowing", and type by
> type dealing with, several types whose instances could be values
> referred to by the generator's frame.
I was suggesting extending this > so that, when values of other types are met, instead of automatically > failing (or, as I believe I recall Armin's extension does today, copying > the reference rather than the value), the copy process would...: I haven't seen Armin's code, but I don't believe that the type alone gives enough information about whether they should be copied. Consider a generator that uses a dict as a cache or memo for values it computes. Multiple instances of the generator share the dict, but for efficiency the generator references it in a local variable. This dict should not be copied when copying the generator's stack frame. But consider another generator that uses a dict to hold some of its state. This dict should be copied. > > > > Cool! Why don't you try copy.copy on types you don't automatically > > > > recognize and know how to deal with, BTW? That might make this > > and Armin said that __copy__ seems weak to him but __deepcopy__ > might not be: > > > > I will try. Note that only __deepcopy__ makes sense, as far as I can > > > tell, because there is too much state that really needs to be copied and > > > not shared in a generator (in particular, the sequence iterators from > > > 'for' loops). > > > So, now he went on to ask about __deepcopy__ and you answered: > > > > I'm not sure about how deep-copying should be defined for built-in > > > types. Should a built-in __deepcopy__ method try to import and call > > > copy.deepcopy() on the sub-items? This doesn't seem to be right. > > > > Almost -- you have to pass the memo argument that your __deepcopy__ > > received as the second argument to the recursive deepcopy() calls, to > > avoid looping on cycles. 
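[Editor's sketch: Guido's advice -- pass the received memo dict to every recursive deepcopy() call -- is exactly what prevents looping on cycles. An illustrative class, not taken from the thread:]

```python
import copy

class Node:
    def __init__(self, value, other=None):
        self.value = value
        self.other = other
    def __deepcopy__(self, memo):
        # Register the half-built copy in memo BEFORE recursing, and pass
        # memo down: when a cycle leads back to self, deepcopy() finds the
        # already-made copy in memo instead of recursing forever.
        dup = Node(self.value)
        memo[id(self)] = dup
        dup.other = copy.deepcopy(self.other, memo)
        return dup

a = Node(1)
b = Node(2, other=a)
a.other = b                      # a <-> b cycle
a2 = copy.deepcopy(a)
print(a2.other.other is a2)      # True: the cycle is reproduced, not looped on
```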
> > > Now, if Armin's code can only provide __deepcopy__ and not __copy__, > then it's probably of little use wrt the __copy__ functionality I > talk about in PEP 323 (which I still must revise to take into > account your feedback and Raymond's -- plan to get to that as soon > as I've cleared my mbox) -- the memory and time expenditure is > likely to be too high for that. Right. > It's still going to be a cool hack, well worth "publishing" as such, As a third-party module? Sure. > and probably able to be "user-installed" as the way deepcopy deals > with generators even though generators themselves may not sprout a > __deepcopy__ method themselves (fortunately, copy.copy does a lot of > "ad-hoc protocol adaptation" -- it's occasionally a bit rambling or > hard to follow, but often allows plugging in "third party copiers" > for types which their authors hadn't imagined would be copied or > deep copied, so that other innocent client code which just calls > copy.copy(x) will work... essentially how "real" adaptation would > work, except that registering a third-party protocol adapter would > be easier:-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Oct 28 13:10:10 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 13:10:13 2003 Subject: [Python-Dev] Re: [Web-SIG] Threading and client-side support In-Reply-To: References: <20031027150709.GA29045@rogue.amk.ca> <20031028124646.GB1095@rogue.amk.ca> Message-ID: <20031028181009.GA20129@panix.com> On Tue, Oct 28, 2003, John J Lee wrote: > On Tue, 28 Oct 2003 amk@amk.ca wrote: >> On Tue, Oct 28, 2003 at 10:35:33AM +0000, John J Lee wrote: >>> >>> Thanks. So, in particular, httplib, urllib and urllib2 are thread-safe? >> >> No idea; reading the code would be needed to figure that out. 
> That might not be helpful if the person reading it (me) has zero
> threading experience ;-)
>
> I certainly plan to gain that experience, but surely *somebody*
> already knows whether they're thread-safe?  I presume they are,
> broadly, since a couple of violations of thread safety are commented
> in urllib2 and urllib.  Right?

Generally speaking, any code that does not rely on global objects is
thread-safe in Python.  For more information, let's take this to
python-list.
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan

From mcherm at mcherm.com  Tue Oct 28 13:10:26 2003
From: mcherm at mcherm.com (Michael Chermside)
Date: Tue Oct 28 13:10:43 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
Message-ID: <1067364626.3f9eb11204e4f@mcherm.com>

Alex Martelli writes:
> BTW, I think I should point out one POSSIBLE problem with
> classmethods -- since unfortunately they CAN be called on an
> instance, and will ignore that instance, this may confuse an
> unsuspecting user.

Alex, that's a good point, and one we should be careful of.  However,
(as you said) I suspect that the unsuspecting users will always call
it with zero arguments.  So long as that call always fails (preferably
with a useful error message) I think we should be OK.

So what if we make the error message maximally useful?  Something
like this:

_privateObj= Object()
def sorted(iteratorToSort=_privateObj):
    if iteratorToSort == _privateObj:
        raise TypeError('sorted is a classmethod of list ' +
                        'taking an iterator argument')
    else:
        <... normal body here ...>

The only thing I've done here was to make the text of the message
more helpful (I've even left the type of the exception as TypeError
even though that might not be the most useful thing).  Okay...
there's one other change...
if you pass 2 or more arguments, then
it will complain that it expected "at least 0 arguments", but try
it once with 0 arguments and you'll immediately understand.

-- Michael Chermside

From aleaxit at yahoo.com  Tue Oct 28 13:24:54 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 13:25:05 2003
Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <003401c39d76$4f36d080$1a3ac797@oemcomputer>
References: <003401c39d76$4f36d080$1a3ac797@oemcomputer>
Message-ID: <200310281924.54600.aleaxit@yahoo.com>

On Tuesday 28 October 2003 06:09 pm, Raymond Hettinger wrote:
> [Alex]
>
> > Btw, random.py doesn't seem to supply pseudo-random iterators --
> > easy to make e.g. with iter(random.random, None) [assuming you
> > want a nonterminating one],
>
> Probably a bit faster with:
>
> starmap(random.random, repeat(()))

Yep, saving the useless compare does help:

[alex@lancelot bo]$ timeit.py -c -s'import random' -s'import itertools as it' \
> -s'xx=it.starmap(random.random, it.repeat(()))' 'xx.next()'
1000000 loops, best of 3: 1.37 usec per loop
[alex@lancelot bo]$ timeit.py -c -s'import random' -s'import itertools as it' -s'xx=iter(random.random, None)' 'xx.next()'
1000000 loops, best of 3: 1.62 usec per loop

Renewed compliments for itertools' speed!-)

> > but that wouldn't be copyable.  Should
> > something be done about that...?
>
> No.
>
> 1) The use case is not typical.  Other than random.random() and
> time.ctime(), it is rare to see functions of zero arguments that
> usefully return an infinite sequence of distinct values.

Sorry, I must have been unclear -- I had no "zero arguments"
limitation in mind.  Rather, I thought of something like:

    it = random.iterator(random.Random().sample, range(8), 3)

now each call to it.next() would return a random sample of 3 numbers
from range(8) w/o repetition.  I.e., the usual (callable, *args) idiom
(as in Tkinter widgets' .after method, etc, etc).
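[Editor's sketch: the iterator-on-a-callable Alex describes might look like the class below, in modern Python. `random.iterator` is his hypothetical name, not a real API; note the comment on why this sketch's `__copy__` is only part of what he asks for.]

```python
import copy
import random

class call_iterator(object):
    """Endless iterator yielding func(*args) at each step; __copy__ is O(1)."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args
    def __iter__(self):
        return self
    def __next__(self):
        return self.func(*self.args)
    def __copy__(self):
        # NOTE: a truly independent copy would also need to snapshot the
        # state behind a bound method (Alex's im_self/__getstate__ point);
        # this sketch shares that underlying state between the copies.
        return call_iterator(self.func, *self.args)

it = call_iterator(random.Random(42).sample, range(8), 3)
sample = next(it)        # a 3-number sample from range(8), w/o repetition
clone = copy.copy(it)    # O(1), no tee()-style O(N) buffering
```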
What I'm saying is that there is no reason type(it) shouldn't support a __copy__ method -- as long as the underlying callable sports an im_self which also exposes a __getstate__ method, at least. Come to think of this, there may be other use cases for this general approach than "random iterators". Do you think that an iterator on a callable *and args for it* would live well in itertools? That module IS, after all, your baby... > 2) If you need a copy, run it through tee(). That's exactly what I plan to do -- but I would NOT want tee() to consume O(N) memory [where N is how far out of step the two iterators may get] in those cases where the iterator argument DOES have a __copy__ method that can presumably produce a usable copy with O(1) memory expenditure. Thus, I'd like itertools.tee to start by checking if its argument iterator "is properly copyable". Guido has pointed out that it would not be safe to just try copy.copy(it), because that MIGHT produce a copy that does not satisfy "iterator copying" semantics requirements. As an example, he has repeatedly mentioned "an iterator on a tree which keeps ``a stack of indices''". 
Here, I think, is an indication of the kind of thing he fears (code
untested beyond running it on that one example):

import copy

class TreeIter(object):
    def __init__(self, tree):
        self.tree = [tree]
        self.indx = [-1]
    def __iter__(self):
        return self
    def next(self):
        if not self.indx:
            raise StopIteration
        self.indx[-1] += 1
        try:
            result = self.tree[-1][self.indx[-1]]
        except IndexError:
            self.tree.pop()
            self.indx.pop()
            if not self.indx:
                raise StopIteration
            return self.next()
        if type(result) is not list:
            return result
        self.tree.append(result)
        self.indx.append(-1)
        return self.next()

x = [ [1,2,3], [4, 5, [6, 7, 8], 9], 10, 11, [12] ]

print 'all items, one:',
for i in TreeIter(x):
    print i,
print

print 'all items, two:',
it = TreeIter(x)
for i in it:
    print i,
    if i==6: cop = copy.copy(it)
print

print '>=6 items, one:',
for i in cop:
    print i,
print

print '>=6 items, two:',
it = TreeIter(x)
for i in it:
    if i==6:
        cop = copy.deepcopy(it)
        for i in cop: print i,
print

Output is:

[alex@lancelot bo]$ python treit.py
all items, one: 1 2 3 4 5 6 7 8 9 10 11 12
all items, two: 1 2 3 4 5 6 7 8 9 10 11 12
>=6 items, one:
>=6 items, two: 7 8 9 10 11 12

i.e., the "iterator copy" returned by copy.copy does NOT satisfy
requirements!  (I've added the last tidbit to show that the one
returned by copy.deepcopy WOULD satisfy them, but, it's clearly WAY
too memory-costly to consider, far worse than tee()!!!!).

So, "safely copying an iterator" means ensuring the iterator's author
HAS thought specifically about allowing a copy -- in which case, we
can (well, we _must_:-) trust that they have implemented things
correctly.  Just using copy.copy(it) MIGHT fall afoul of a default
shallow copy not being sufficient.  Perhaps we can get by with
checking for, and using if found, a __copy__ method only.  Is there a
specific need to support __setstate__ etc, here?  I hope not -- still,
these, too, are things I must make sure are mentioned in the PEP.
Alex From guido at python.org Tue Oct 28 13:28:32 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 13:28:42 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 16:39:08 +0100." <200310281639.08240.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281037.44424.aleaxit@yahoo.com> <200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com> <200310281639.08240.aleaxit@yahoo.com> Message-ID: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> > On Tuesday 28 October 2003 04:16 pm, Guido van Rossum wrote: > > > So perhaps for 2.3 we should just apologetically note the anomaly > > > in the docs, and for 2.4 forbid the former case, i.e., require both > > > __mul__ AND __rmul__ to exist if one wants to code sequence > > > classes that can be multiplied by integers on either side...? > > > > > > Any opinions, anybody...? > > > > What's wrong with the status quo? So 3*x is undefined, and it happens > > to return x*3. Is that so bad? > > Where is it specified that 3*x "is undefined" when x's type exposes > __mul__ but not __rmul__ ? Sorry, I don't understand the viewpoint > you seem to imply here. If x's type exposed no __add__ but "it just > so happened" that x+23 always returned 12345 -- while every other > addition, as expected, failed -- would you doubt the lack of a normal > and reasonably expected exception is bad? > > I think that if Python returns "arbitrary" results, rather than raising an > exception, for operations that "should" raise an exception, that is > surely very bad -- it makes it that much harder for programmers to > debug the programs they're developing. If there's some doubt about > the words I've put in hyphens -- that treating x*y just like y*x only for > certain values of type(y) isn't arbitrary or shouldn't raise -- then we > can of course discuss this, but isn't the general idea correct? 
>
> Now, the docs currently say, about sequences under
> http://www.python.org/doc/current/ref/sequence-types.html :
> """
> sequence types should implement ... multiplication (meaning repetition) by
> defining the methods __mul__(), __rmul__() and __imul__() described below;
> they should not define __coerce__() or other numerical operators.
> """
> So, a sequence-emulating type that implements __mul__ but not __rmul__
> appears to violate that "should".
>
> The description of __mul__ and __rmul__ referred to seems to be
> that at http://www.python.org/doc/current/ref/numeric-types.html .
>
> It says that methods corresponding to operations not supported by
> a particular kind of number should be left undefined (as opposed
> to the behavior of _attempts at those operations_ being undefined),
> so if I had a hypothetical number type X such that, for x instance
> of X and an integer k, x*k should be supported but k*x shouldn't,
> isn't this a recommendation to not write __rmul__ in X ...?
>
>
> Besides, this weird anomaly is typical of newstyle classes only.
> Consider:
>
> >>> class X:
> ...     def __mul__(self, other): return 23
> ...
> >>> x=X()
> >>> x*7
> 23
> >>> 7*x
> Traceback (most recent call last):
>   File "", line 1, in ?
> TypeError: unsupported operand type(s) for *: 'int' and 'instance'
> >>>
>
> ALL wonderful, just as expected, hunky-dory.  But now, having
> read that newstyle classes are better, I want to make X newstyle --
> can't see any indication in the docs that I shouldn't -- and...:
>
> >>> class X(object):
> ...     def __mul__(self, other): return 23
> ...
> >>> x=X()
> >>> x*7
> 23
> >>> 7*x
> 23
> >>>
>
> *eep*!  Yes, it DOES seem to be that this is QUITE bad indeed.
>
>
> Alex

You're making a mountain of a molehill here, Alex.  I know that in
group theory there are non-Abelian groups (for which AB != BA), but
I've never encountered one myself in programming; more typical such
non-commutative operations are modeled as __add__ rather than as
__mul__.
Anyway, the real issue AFAICT is not that people depend on __rmul__'s
absence to raise a TypeError, but that people learn by example and
find __rmul__ isn't necessary by experimenting with integers.

The reason why it works at all for integers without __rmul__ is
complicated; it has to do with very tricky issues in trying to
implement multiplication of a sequence with an integer.  That code has
gone through a number of iterations, and every time someone eventually
found a bug in it, so I'd rather leave the __rmul__ blemish than
uproot it again.  If you can come up with a fix that doesn't break
sequence repetition I'd be open to accepting it (for 2.4 only, in 2.3
there may be too much code depending on the bug) but only after
serious review -- and not by me, because I'm not all that familiar
with all the subtleties of that code any more. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com  Tue Oct 28 13:30:46 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 13:31:50 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: <1067364626.3f9eb11204e4f@mcherm.com>
References: <1067364626.3f9eb11204e4f@mcherm.com>
Message-ID: <200310281930.46518.aleaxit@yahoo.com>

On Tuesday 28 October 2003 07:10 pm, Michael Chermside wrote:
> Alex Martelli writes:
> > BTW, I think I should point out one POSSIBLE problem with
> > classmethods -- since unfortunately they CAN be called on an
> > instance, and will ignore that instance, this may confuse an
> > unsuspecting user.
>
> Alex, that's a good point, and one we should be careful of.

Thanks, that's why I brought the issue up.

> However, (as you said) I suspect that the unsuspecting users
> will always call it with zero arguments.  So long as that call
> always fails (preferably with a useful error message) I think
> we should be OK.
>
> So what if we make the error message maximally useful?
Something

*VERY good idea*

> like this:
>
> _privateObj= Object()
> def sorted(iteratorToSort=_privateObj):
>     if iteratorToSort == _privateObj:
>         raise TypeError('sorted is a classmethod of list ' +
>                         'taking an iterator argument')
>     else:
>         <... normal body here ...>
>
> The only thing I've done here was to make the text of the message
> more helpful (I've even left the type of the exception as TypeError
> even though that might not be the most useful thing).  Okay...
> there's one other change... if you pass 2 or more arguments, then
> it will complain that it expected "at least 0 arguments", but try
> it once with 0 arguments and you'll immediately understand.

Could we perhaps deal with the latter issue by adding a *args to
sorted's signature, and changing the condition on the 'if' to:

    if iteratorToSort is _privateObj or args:
        raise TypeError # etc etc

?  Maybe w/"a single iterator argument" in the error message's text?
{alternatively, if we don't care about keyword args, just having the
*args in the signature and checking "if len(args) != 1: ..." might be
OK}


Alex

From guido at python.org  Tue Oct 28 13:34:09 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:34:16 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: Your message of "Tue, 28 Oct 2003 17:41:58 GMT."
	<20031028174158.GA19133@vicky.ecs.soton.ac.uk>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer>
	<200310270851.02495.aleaxit@yahoo.com>
	<20031027103540.GA27782@vicky.ecs.soton.ac.uk>
	<200310271609.03819.aleaxit@yahoo.com>
	<20031028124042.GA22513@vicky.ecs.soton.ac.uk>
	<200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com>
	<20031028174158.GA19133@vicky.ecs.soton.ac.uk>
Message-ID: <200310281834.h9SIY9j29592@12-236-54-216.client.attbi.com>

> Ok on this point, the question was whether (the error-checking obfuscated
> equivalent of)
>
>     PyObject *m = PyImport_ImportModule("copy");
>     PyObject_CallMethod(m, "deepcopy", x, memo);
>
> should be done inside a built-in __deepcopy__ implementation.  It looks like
> it will make a hell of a lot of quite slow calls to PyImport_ImportModule()
> for structures like lists of generators, which is the kind of structure you
> are interested in when you deepcopy generators.

Yeah, you should ideally be able to cache the results of the import,
except then your code wouldn't work when there are multiple
interpreters.  Maybe using

    PyObject *modules = PySys_GetObject("modules");
    PyObject *m = PyDict_Lookup(modules, "copy");

would be faster?  PySys_GetObject() doesn't waste much time. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org  Tue Oct 28 13:37:56 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 13:38:03 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: Your message of "Tue, 28 Oct 2003 10:28:32 PST."
	<200310281828.h9SISW529541@12-236-54-216.client.attbi.com>
References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz>
	<200310281037.44424.aleaxit@yahoo.com>
	<200310281516.h9SFGbf29003@12-236-54-216.client.attbi.com>
	<200310281639.08240.aleaxit@yahoo.com>
	<200310281828.h9SISW529541@12-236-54-216.client.attbi.com>
Message-ID: <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com>
I know that in > group theory there are non-Abelian groups (for which AB != BA), but > I've never encountered one myself in programming; more typical such > non-commutative operations are modeled as __add__ rather than as > __mul__. I need to give myself a small slap on the forehead head, because of course non-square matrix multiplication is an excellent example where AB != BA. However even there, Ax == xA when x is a singleton, and the issue only arises for integers, so I still don't think there are use cases. --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Tue Oct 28 13:43:57 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 13:44:06 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281639.08240.aleaxit@yahoo.com> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> Message-ID: <200310281943.57278.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:28 pm, Guido van Rossum wrote: ... > You're making a mountain of a molehill here, Alex. I know that in You have a point: when one is in love (and I still _am_ madly in love with Python!-), it's hard to admit of imperfections in the loved one:-). Even a tiny defect...:-). > group theory there are non-Abelian groups (for which AB != BA), but > I've never encountered one myself in programming; more typical such > non-commutative operations are modeled as __add__ rather than as > __mul__. I don't remember ever coding a __mul__ that I WANTED to be non-commutative, right. > Anyway, the real issue AFAICT is not that people depend on __rmul__'s > absence to raise a TypeError, but that people learn by example and > find __rmul__ isn't necessary by experimenting with integers. 
Or more seriously: they write what LOOK like perfectly adequate unit tests, but the numbers they try in "number * x" happen to be ints; so the unit tests pass -- but their code is broken because they forgot the __rmul__ and the unittests-with-ints didn't catch that. > The reason why it works at all for integers without __rmul__ is > complicated; it has to do with very tricky issues in trying to > implement multiplication of a sequence with an integer. That code has Yes, I think I understand some of that -- I included the analysis of the bug in my bugreport on SF. > gone through a number of iterations, and every time someone eventually > found a bug in it, so I'd rather leave the __rmul__ blemish than > uproot it again. If you can come up with a fix that doesn't break > sequence repetition I'd be open to accepting it (for 2.4 only, in 2.3 > there may be too much code depending on the bug) but only after > serious review -- and not by me, because I'm not all that familiar > with all the subtleties of that code any more. :-( I do have one weird idea that might help here (for 2.4), but I'd better post that separately because I suspect it's going to fuel its own long discussion thread. As things are, without an ability to distinguish a sequence from a number securely, and IF sequences must work w/o __rmul__ (but they didn't in classic classes...? and the docs don't indicate that...?) then I'm stumped. Who'd be the right people to review such proposed changes, btw? Alex From mcherm at mcherm.com Tue Oct 28 13:47:27 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Tue Oct 28 13:47:28 2003 Subject: [Python-Dev] replacing 'global' Message-ID: <1067366846.3f9eb9bf1ee8e@mcherm.com> Alex lists this flaw: > -- it's the wrong keyword, doesn't really _mean_ "global" Guido says: > I haven't heard anyone else in this thread agree with you on that > one. I certainly don't think it's of earth-shattering ugliness. Well, I agree.
But I also agree with your point that it's certainly not earth-shattering... just a little confusing to newbies, who expect "global" to mean "global", not "module-wide". Not worth changing the language, but if you were to re-invent Python from the ground up, I'd consider it. Greg Ewing writes: > We'd be having two kinds of assignment, and there's no > prior art to suggest which should be = and > which :=. That's the "arbitrary" part. No one will ever confuse these, because no one will learn about := until long after = is well understood. The one spelled "=" will be "the normal one" and ":=" will be "the funny one". Just mentions: > (Alex noted in private mail that one disadvantage of this idea is that > it makes using globals perhaps TOO easy...) Indeed, that would be my concern. At least the word "global" has strong negative associations (mostly undeserved in this case since it really means "module-level" not "global" ;-). Skip writes: > It seems that use > of > x := 2 > and > x = 4 > should be disallowed in the same function so that the compiler can > flag such mistakes. I agree. When writing a function, we ALLOW name shadowing because we want the author of the function to be able to use local variables without having to know anything about the outer scope(s). But if the author of the function ALREADY KNOWS that there's an outer variable named "x" (MUST know it since she is modifying that outer variable), then there's no excuse for the poor choice of names... the local variable should be renamed to avoid the conflict. The "global" statement as it currently exists enforces this... if one assignment in a scope is "global", then ALL will be. I maintain that the use of := vs = should be the same... all or none! Despite Just's original preference for thinking of it as "find someplace and rebind", I would always wind up thinking of this as the "bind in some outer scope" operator. ----- Anyhow, that's as far as I got in reading the discussion so far. Whew!
What a lot of traffic! -- Michael Chermside From pje at telecommunity.com Tue Oct 28 13:47:06 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Oct 28 13:48:44 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310281755.44307.aleaxit@yahoo.com> References: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> At 05:55 PM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 02:57 pm, Phillip J. Eby wrote: > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > > >AND, adaptation is not typecasting: > > >e.g y=adapt("23", int) should NOT succeed. > > > > Obviously, it wouldn't succeed today, since int doesn't have __adapt__ and > > str doesn't have __conform__. But why would you intend that they not have > > them in future? > >I'd be delighted to have the int type sprout __adapt__ and the str type >sprout __conform__ -- but neither should accept this case, see below. You didn't actually give any example of why 'adapt("23",int)' shouldn't return 23, just why adapt("foo",file) shouldn't return a file. Currently, Python objects may possess an __int__ method for conversion to integers, and a __str__ method for conversion to string. So, it seems to me that for such objects, 'adapt(x,int)' should be equivalent to x.__int__() and 'adapt(x,str)' should be equivalent to x.__str__(). So, there is already a defined protocol within Python for conversion to specific types, with well-defined meaning. One might argue that since it's already possible to call the special method or invoke the type constructor, that it's not necessary for there to be an adapt() synonym for them. 
However, it's also possible to get an object's attribute or call an arbitrary function by exec'ing a dynamically constructed string instead of using getattr() or having functions as first class objects. So, I don't see any problem with "convert to integer" being 'int(x)' and yet still being able to spell it 'adapt(x,int)' in the circumstance where 'int' is actually a variable or parameter, just as one may use 'getattr(x,y)' when the attribute to be gotten is a variable. > > And, why do you consider adaptation *not* to be typecasting? I always > > think of it as "give me X, rendered as a Y", which certainly sounds like a > > description of typecasting to me. > >typecasting (in Python) makes a NEW object whose value is somehow >"built" (possibly in a very loose sense) from the supplied argument[s], >but need not have any more than a somewhat tangential relation with >them. adaptation returns "the same object" passed as the argument, >or a wrapper to it that makes it comply with the protocol. I don't understand the dividing line here. Perhaps that's because Python doesn't really *have* an existing notion of typecasting as such, there are just constructors (e.g. int) and conversion methods (e.g. __int__). However, conversion methods and even constructors of immutable types are allowed to be idempotent. 'int(x) is x' can be true, for example. So, how is that different? >To give a specific example: > >x = file("foo.txt") > >now (assuming this succeeds) x is a readonly object which is an >instance of file. The argument string "foo.txt" has "indicated", quite >indirectly, how to construct the file object, but there's really no true >connection between the value of the argument string and what >will happen as that object x is read. 
> >Thinking of what should happen upon: > >x = adapt("foo.txt", file) > >what I envision is DEFINITELY the equivalent of: > >x = cStringIO.StringIO("foo.txt") > >i.e., the value (aka object) "foo.txt", wrapped appropriately so as >to conform to the (readonly) "file protocol" (I can call x.read(3) >and get "foo", then x.seek(0) then x.read(2) and get "fo", etc). I don't see how any of this impacts the question of whether adapt(x,int) == int(x). Certainly, I agree with you that adapt("foo",file) should not equal file("foo"), but I don't understand what one of these things has to do with the other. From aleaxit at yahoo.com Tue Oct 28 13:49:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 13:50:00 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> Message-ID: <200310281949.50592.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:37 pm, Guido van Rossum wrote: > > You're making a mountain of a molehill here, Alex. I know that in > > group theory there are non-Abelian groups (for which AB != BA), but > > I've never encountered one myself in programming; more typical such > > non-commutative operations are modeled as __add__ rather than as > > __mul__. > > I need to give myself a small slap on the forehead, because of > course non-square matrix multiplication is an excellent example where > AB != BA. However even there, Ax == xA when x is a singleton, and the > issue only arises for integers, so I still don't think there are use > cases. There may be no "perfectly correct code" that will ever notice 3*x weirdly works. But would that make it acceptable to return 42, rather than raise IndexError, when a list of length exactly 33 is indexed by index 666?
That, too, might "have no practical use cases" for perfectly correct code. But programmers make mistakes, and one of Python's strengths is that it does NOT (crash, hang, or) return weird wrong results when they do -- most often it raises appropriate exceptions, which make it easy to diagnose and fix one's mistakes. Thus, it troubles me that we can't do it here. I know it's hard to fix (I've stared at that code for QUITE a while...). But "deducing" from that difficulty that the error's not worth fixing seems like a classic case of "sour grapes":-). Alex From guido at python.org Tue Oct 28 14:08:04 2003 From: guido at python.org (Guido van Rossum) Date: Tue Oct 28 14:08:12 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: Your message of "Tue, 28 Oct 2003 19:49:50 +0100." <200310281949.50592.aleaxit@yahoo.com> References: <200310280004.h9S04OO21421@oma.cosc.canterbury.ac.nz> <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> <200310281837.h9SIbuV29622@12-236-54-216.client.attbi.com> <200310281949.50592.aleaxit@yahoo.com> Message-ID: <200310281908.h9SJ84S29755@12-236-54-216.client.attbi.com> > I know it's hard to fix (I've stared at that code for QUITE a > while...). But "deducing" from that difficulty that the error's not > worth fixing seems like a classic case of "sour grapes":-). I dunno. As language warts go I find this one minuscule, and the effort you spend on rhetoric to convince me a waste of breath. My position is: I understand that it's a wart, I just don't think I know of a good solution, and I can live with the status quo just fine.
--Guido van Rossum (home page: http://www.python.org/~guido/) From raymond.hettinger at verizon.net Tue Oct 28 14:14:09 2003 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Tue Oct 28 14:15:49 2003 Subject: [Python-Dev] PEP 322: Generator Expressions (implementation team) Message-ID: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> Guido has accepted the generator expressions pep, so it's time for me to form an implementation team. Any volunteers are welcome to email me directly. Alex, Brett, Neal, Jeremy? Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031028/f67c5990/attachment.html From jacobs at penguin.theopalgroup.com Tue Oct 28 14:18:16 2003 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Oct 28 14:18:20 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: Message-ID: On Tue, 28 Oct 2003, Batista, Facundo wrote: > Aahz wrote: > #- The first thing you should do is talk with Eric Price > #- (eprice@tjhsst.edu), author of the code. You don't need to > #- use SF for > #- now; CVS should be fine, but you should find out whether > #- Eric would like > #- to approve changes first. > > Eric Price wrote: > #- Not really-- since school started, I haven't had much time > #- to spare. > #- I'll probably look over the changes at some time, but I > #- wouldn't want to > #- keep them waiting. > > So, to whom may I send the changes? > > Should I send the whole stuff at the end of the work, or keep feeding small > changes? > > Should I send by email the diff results? I'll be happy to review your changes, so long as the changesets are kept fairly focused. We can then feed them through one of the regular committers. Just e-mail them to me directly in unified format (-u) with a simple explanation of what is being accomplished.
Thanks, -Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From fperez at colorado.edu Tue Oct 28 14:24:25 2003 From: fperez at colorado.edu (Fernando Perez) Date: Tue Oct 28 14:24:30 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug Message-ID: <3F9EC269.6080108@colorado.edu> Hi all, I just wanted to add a small comment on this discussion, which I'd been following via the newsgroup mirror. Python is picking up a lot of steam in the scientific computing community, and in science it is quite common to encounter non-commutative multiplication. Just to remind Guido from his old math days :), even for square matrices, AB!=BA in most cases. The Matrix class supplied with Numpy is one example of a widely used library which implements '*' as a non-commutative multiplication operator. From what I've read, I realize that this is quite a subtle and difficult bug to treat. I just wanted to add a data point for you folks to consider. Please don't dismiss non-commutative multiplication as too much of an obscure corner case, it is a daily occurrence for a growing number of python users (scientists). Thanks, Fernando. From martin at v.loewis.de Tue Oct 28 15:37:54 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Tue Oct 28 15:38:04 2003 Subject: [Python-Dev] Re: 2.3.3 plans In-Reply-To: <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> References: <1067340608.24137.11.camel@gandalf.tm.informatik.uni-frankfurt.de> <200310281529.h9SFTrY29061@12-236-54-216.client.attbi.com> Message-ID: Guido van Rossum writes: > I hope someone here can work with you on getting the patches in > acceptable shape. You should start by uploading them to the patch > manager in SourceForge. Correct. In addition, the patches should *first* be integrated into the CVS head, and then backported to 2.3. 
There is the possibility that cross-compilation support breaks native compilation procedures, which would not be acceptable for a point release. Regards, Martin From aleaxit at yahoo.com Tue Oct 28 15:55:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 15:58:12 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> References: <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> Message-ID: <200310282155.41407.aleaxit@yahoo.com> On Tuesday 28 October 2003 07:47 pm, Phillip J. Eby wrote: ... > You didn't actually give any example of why 'adapt("23",int)' shouldn't > return 23, just why adapt("foo",file) shouldn't return a file. Which is sufficient to show that, IN GENERAL, adapt(x, sometype) should not just be the equivalent of sometype(x), as you seemed (and, below, still seem) to argue. Now, if you want to give positive reasons, specific, compelling use-cases, to show why for SOME combinations of type(x) and sometype that general rule should be violated, go ahead, but the burden of proof is on you. If you do want to try and justify such specific-cases exceptions, remember: "adapt(x, foo)" is specified as returning "x or a wrapper around x", and clearly a new object of type foo with no actual connection to x is neither of those. That's a "formal" reasoning from the PEP's actual current text. But perhaps informal reasoning may prove more convincing -- let's try. adaptation is *NOT* conversion -- it's not the creation of a new object that will thereafter live a life separate from the original one. This part is not relevant when the objects are immutable, but it's quite relevant to your GENERAL idea of, e.g.: > 'adapt(x,str)' should be equivalent to x.__str__(). Say that x is mutable. 
Then, "adapting x to the string protocol", if supported, should give a wrapper object, supporting all string-object methods, in a way that any call to such methods relies on the current-at-call-time value of x. But, still on that general idea of yours that I quote above, there is worse, MUCH worse. Consider: an object's type often supports a __str__ that, as per its specs in the docs, is "the ``informal'' string representation of an object ... convenient or concise representation may be used instead". The docs make it AMPLY clear that the purpose of __str__ is STRICTLY for the object's type to give a (convenient, concise, possibly quite incomplete and inaccurate) HUMAN-READABLE representation of the object. To assert that this is in any way equivalent to a claim, on the object type's part, that its instances can "adapt themselves to the string protocol", beggars belief. It borders, I think, on the absurd, to maintain that, for example, "<open file 'foo.txt', mode 'r' at 0x...>" *IS* my open file object "adapted to" the string protocol. It's clearly a mere human readable representation, a vague phantom of the object itself. It should be obvious that, just as "adapting a string to the (R/O) file protocol" means wrapping it in cStringIO.StringIO, so the reverse adaptation, "adapting a file to the string protocol", should utilize a wrapper object that presents the file's data with all string object methods, for example via mmap. > So, there is already a defined protocol within Python for conversion to > specific types, with well-defined meaning. One might argue that since it's Conversion is one thing, adaptation is a different thing. Creating a new object "somehow related" to an existing one -- i.e., conversion -- is a very different thing from "wrapping" an existing object to support a different protocol -- adaptation.
Consider another typical case:

>>> import array
>>> x = array.array('c', 'ciao')
>>> L = list(x)
>>> x.extend(array.array('c', 'foop'))
>>> x
array('c', 'ciaofoop')
>>> L
['c', 'i', 'a', 'o']
>>>

See the point? CONVERSION, aka construction, aka typecasting, i.e. list(x), has created a new object, based on what WERE the contents of x at the time of conversion, but INDEPENDENT from it henceforwards. Adaptation should NOT work that way: adapt(x, list) would, rather, return a wrapper, providing listlike methods (some, like pop or remove, would delegate to x's own methods -- others, like sort, would require more work) and _eventually performing actual operations on x_, NOT on a separate thing that once, a long time ago, was constructed by copying it. Thus, I see foo(x) and adapt(x, foo) -- even in cases where foo is a type -- as GENERALLY very different. If you have SPECIFIC use cases in mind where it would be clever to make the two operations coincide, you still haven't made them; I only heard vague generalities about how adapt(x, y) "should" work without ANY real support for them. If the code that requests adaptation is happy, as a fall-back, to have (e.g.) "<open file 'foo.txt', mode 'r' at 0x...>" as the "ersatz adaptation" of a file instance to str, for example, it can always do the fall-back itself, e.g.

    try:
        z = adapt(x, y)
    except TypeError:
        try:
            z = y(x)
        except (TypeError, ValueError):
            # whatever other desperation measures it wants to try

To have adapt itself imply such measures would be a disaster, and make adaptation basically unusable in all cases where one might have (e.g.) "y is str". > I don't understand the dividing line here. Perhaps that's because Python > doesn't really *have* an existing notion of typecasting as such, there are > just constructors (e.g. int) and conversion methods (e.g. > __int__).
Yeah, that's much like C++, except C++ is more general in terms of conversion methods -- not only can a constructor for type X accept a Y argument (or const Y&, equivalently), but type Y can also always choose to provide an "operator X()" to typecast its instances to the other type [I think I recall that if BOTH types try to cooperate in such ways you end up with an ambiguity error, though:-)]. That's in contrast to the specific few 'conversion methods' that Python supports only for a small set of numeric types as the destination of the conversion. Either the single-argument constructor or the operator may get used when you typecast (static_cast<X>(y) where y is an instance of Y). There isn't all that much difference between C++'s approach and Python's here, except for C++'s greater generality and the fact that in Python you always use notation X(y) to indicate the typecasting request. ("typecast" is not a C++ term any more than it's Python's -- I think it's only used in some obscure languages such as CIAO, tools like Flex/Harpoon, Mathworks, etc -- so, sorry if my use was obscure). One important difference: in C++, you get to define whether a one-argument constructor gets to be evaluated "implicitly", when an object of type X is required and one of type Y is supplied instead, or not. If the constructor is declared explicit, then it ONLY gets called for EXPLICIT typecasts such as X(y). In Python, we think EIBNI, and therefore typecasts are explicit. We do NOT "adapt" a float f to int when an int is required, as in somelist[f]: we raise a TypeError -- if you want COERCION, aka CONVERSION, to an int, with possible loss of information etc, you EXPLICITLY code somelist[int(f)]. Your proposal that adaptation be, when possible, implemented by conversion, goes against the grain of that good habit and principle.
Adaptation in general is not conversion -- when you know you want, or at least can possibly tolerate as a fallback, total conversion, ASK for it, explicitly -- perhaps as a fallback if adaptation fails, as above. Having "adapt(x, y)" just basically duplicate some possible cases of y(x) would be a serious diminution of adaptation's potential and usefulness. > However, conversion methods and even constructors of immutable > types are allowed to be idempotent. 'int(x) is x' can be true, for > example. So, how is that different? it's part of the PEP that, if isinstance(x, y), then normally x is adapt(x, y) [[ with a specific exception for "non substitutable subclasses" whose usecases I do not know -- anyway, such subclasses would need to be _specifically_ "exempted" from the general rule, e.g. by providing an __adapt__ that raises as needed ]]. So, calling y(x) will be wrong EXCEPT when type y is immutable AND it's EXACTLY the case that "type(x) is y", NOT a subclass, otherwise:

>>> class xx(int): pass
...
>>> w = xx(23)
>>> type(w)
<class '__main__.xx'>
>>> type(int(w))
<type 'int'>
>>>

... the 'is' constraint is lost, despite the fact that xx IS quite obviously "substitutable" and has requested NO exception to the rule, AT ALL. Again: adaptation is not conversion -- and this is NOT about the admitted defects in the PEP, because this case is VERY specifically spelled out there. Implementing adapt(x, y) as y(x) may perhaps be of some practical use in some cases, but I am still waiting for you to show any such use case of practical compelling interest. I hope I have _amply_ shown that the implementation strategy is absolutely out of the question as a general one, so it matters up to a point if some very specific subcases are well served by that strategy, anyway. The key issue is, such cases, if any, will need to be very specifically identified and justified one by one.
Alex From aleaxit at yahoo.com Tue Oct 28 16:26:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:26:52 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <3F9EC269.6080108@colorado.edu> References: <3F9EC269.6080108@colorado.edu> Message-ID: <200310282226.33748.aleaxit@yahoo.com> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote: > Hi all, > > I just wanted to add a small comment on this discussion, which I'd been > following via the newsgroup mirror. Thanks for your comments! I didn't even know we HAD an ng mirror... > Python is picking up a lot of steam in the scientific computing community, > and in science it is quite common to encounter non-commutative > multiplication. Just to remind Guido from his old math days :), even for > square matrices, AB!=BA in most cases. The Matrix class supplied with Yes, of course, you're right. However, the most specific problem is: do you know of ANY use cases where A*x and x*A should give different results, or the former should succeed and the latter should fail, *when x is an integer*? If you can find any use case for that, even in an obscure branch of maths, then clearly the urgency of fixing this bug goes WAY up. Otherwise -- if having the problem specifically for an integer x ONLY should not affect anything -- the bug is basically only going to show up in software that's under development and not yet completed, or else not fully correct. I _still_ want to fix it, but... the urgency of doing so is going to be different, as I'm sure you'll understand! Thanks again for your help -- we DO need to hear from users!!! Alex From pje at telecommunity.com Tue Oct 28 16:36:42 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Tue Oct 28 16:38:45 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <200310282155.41407.aleaxit@yahoo.com> References: <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> <5.1.0.14.0.20031028085337.02f57ec0@mail.telecommunity.com> <5.1.1.6.0.20031028121110.0315e0d0@telecommunity.com> Message-ID: <5.1.1.6.0.20031028155739.022ab800@telecommunity.com> At 09:55 PM 10/28/03 +0100, Alex Martelli wrote: >On Tuesday 28 October 2003 07:47 pm, Phillip J. Eby wrote: > ... > > You didn't actually give any example of why 'adapt("23",int)' shouldn't > > return 23, just why adapt("foo",file) shouldn't return a file. > >Which is sufficient to show that, IN GENERAL, adapt(x, sometype) >should not just be the equivalent of sometype(x), as you seemed (and, >below, still seem) to argue. I'm not arguing that, nor have I ever intended to. I merely questioned your appearing to argue that adapt(x,sometype) should NEVER equal sometype(x). >It borders, I think, on the absurd, to maintain that, for example, >"" *IS* my open file >object "adapted to" the string protocol. It's clearly a mere human >readable representation, a vague phantom of the object itself. It >should be obvious that, just as "adapting a string to the (R/O) file >protocol" means wrapping it in cStringIO.StringIO, so the reverse >adaptation, "adapting a file to the string protocol", should utilize >a wrapper object that presents the file's data with all string object >methods, for example via mmap. Great, so now you know what you'd like file.__conform__(str) to do. This has nothing to do with what I was asking about. You said, in the post I originally replied to: "y=adapt("23", int) should NOT succeed." And I said, "why not?" This is not the same as me saying that adapt(x,y) for all y should equal y(x). Such an idea is patently absurd. I might, however, argue that adapt(x,int) should equal int(x) for any x whose __conform__ returns None. 
Or more precisely, that int.__adapt__(x) should return int(x). And that is why I'm asking why you appear to disagree. However, you keep talking about *other* values of y and x than 'int' and "23", so I'm no closer to understanding your original statement than before. >Adaptation should NOT work that way: adapt(x, list) would, rather, >return a wrapper, providing listlike methods (some, like pop or remove, >would delegate to x's own methods -- others, like sort, would require >more work) and _eventually performing actual operations on x_, NOT >on a separate thing that once, a long time ago, was constructed by >copying it. For protocols whose contract includes immutability (such as 'int') this distinction is irrelevant, since a snapshot is required. Or are you saying that adaptation cannot be used to adapt a mutable object to a protocol that includes immutability? >Thus, I see foo(x) and adapt(x, foo) -- even in cases where foo is a >type -- as GENERALLY very different. If you have SPECIFIC use cases >in mind where it would be clever to make the two operations coincide, >you still haven't made them; I only heard vague generalities about how >adapt(x, y) "should" work without ANY real support for them. It's you who has proposed how they work, and I who asked a question about your statement. >In Python, we think EIBNI, and therefore typecasts are explicit. >We do NOT "adapt" a float f to int when an int is required, as >in somelist[f]: we raise a TypeError -- if you want COERCION, >aka CONVERSION, to an int, with possible loss of information >etc, you EXPLICITLY code somelist[int(f)]. Your proposal that >adaptation be, when possible, implemented by conversion, goes I'm not aware that I made such a proposal. I asked why you thought that adapt('23',int) should *not* return 23. >[lots more snipped] We seem to be having two different conversations. I haven't proposed *anything*, only asked questions. 
Meanwhile, you keep debating my supposed proposal, and not answering my questions! Specifically, you still have not answered my question: Why do you think that 'adapt("23",int)' should not return 23? That is all I am asking, and trying to understand. It is a question, not a proposal for anything, of any kind. Now, it is possible I misunderstood your original statement, and you were not in fact proposing that it should not. If so, then that clarification would be helpful. All the rest of this about why adapt(x,y) may have nothing to do with y(x) isn't meaningful to me. The fact that 2+2==4 and 2*2 == 4 doesn't mean that multiplication is the same as addition! So why would adapt(x,y) and y(x) being equal for some values of x and y mean that adaptation is conversion? You seem to be arguing, however, that that's what I'm saying. Further, you seem to me to be saying that "Because addition is not multiplication, adding 2 and 2 should not equal 4. That's what multiplication is for, so you should always multiply 2 and 2 to get 4, never add them." And that seems so wrong to me, that I have to ask, "Why would you say a thing like that?" Then, you answer me by saying, "But addition is not multiplication, so why are you proposing that adding two numbers should always produce the same result as multiplying them?" When in fact I have not proposed any such thing, nor would I!
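The __conform__/__adapt__ machinery this exchange keeps referring to (PEP 246) can be sketched minimally as follows. This is an illustrative sketch only, not the PEP's reference implementation, and the IntLike and Celsius classes are invented examples of the two hooks:

```python
def adapt(obj, protocol):
    """Minimal sketch of PEP 246 adaptation (illustrative only)."""
    # 1. Trivial case: the object already satisfies the protocol.
    if isinstance(protocol, type) and isinstance(obj, protocol):
        return obj
    # 2. Ask the object (via its type) to conform to the protocol.
    conform = getattr(type(obj), '__conform__', None)
    if conform is not None:
        result = conform(obj, protocol)
        if result is not None:
            return result
    # 3. Ask the protocol itself to adapt the object.
    adapt_hook = getattr(protocol, '__adapt__', None)
    if adapt_hook is not None:
        result = adapt_hook(obj)
        if result is not None:
            return result
    raise TypeError("can't adapt %r to %r" % (obj, protocol))


class IntLike:
    """Invented protocol whose __adapt__ hook chooses conversion."""
    @staticmethod
    def __adapt__(obj):
        if isinstance(obj, str):
            return int(obj)
        return None


class Celsius:
    """Invented class whose instances conform to float on request."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __conform__(self, protocol):
        if protocol is float:
            return float(self.degrees)
        return None
```

Under this sketch, adapt("23", IntLike) returns 23 because the invented protocol's __adapt__ hook explicitly chose conversion, while adapt("23", int) raises TypeError because the built-in int has no such hook -- which is precisely the design question being argued in this thread.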
From aleaxit at yahoo.com Tue Oct 28 16:42:12 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:42:17 2003 Subject: [Python-Dev] RE: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255 In-Reply-To: <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com> References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer> <200310281735.41103.aleaxit@yahoo.com> <200310281752.h9SHpxr29419@12-236-54-216.client.attbi.com> Message-ID: <200310282242.12398.aleaxit@yahoo.com> On Tuesday 28 October 2003 06:51 pm, Guido van Rossum wrote: > > Yes, the use case of __deepcopy__ is indeed quite different (and > > to be honest it doesn't appear in my actual experience -- I can "imagine" > > some as well as the next man, but they'd be made out of whole cloth:-). > > But I was under the impression that you wanted them in PEP 323 too? > > Maybe I misunderstood your words. Should I take them out of PEP 323? > > In that case somebody else can later PEP that if they want, and I can > > basically wash my hands of them -- what do you think? > > I think it would be better if PEP 323 only did __copy__, so you can > remove all traces of __deepcopy__. I don't recall what I said, maybe > I wasn't clear. Aye aye cap'n -- that suits me just fine, actually:-). Alex From aleaxit at yahoo.com Tue Oct 28 16:46:50 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 16:46:58 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option In-Reply-To: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <200310281742.39349.aleaxit@yahoo.com> <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> Message-ID: <200310282246.50113.aleaxit@yahoo.com> On Tuesday 28 October 2003 06:42 pm, Guido van Rossum wrote: > > Hmmm... maybe one COULD make a custom descriptor that does support > > both usages...
and maybe it IS worth making the .sorted (or whatever
> > name) entry a case of exactly such a subtle custom descriptor...
>
> Thanks for the idea, I can use this as a perverted example in my talk
> at Stanford tomorrow. Here it is:

Heh, cool!

> import new
>
> def curry(f, x, cls=None):
>     return new.instancemethod(f, x)

Hmmm, what's the role of the ", cls=None" argument here...? I.e, couldn't just

    curry = new.instancemethod

be equivalent?

Alex

From pedronis at bluewin.ch Tue Oct 28 16:55:34 2003
From: pedronis at bluewin.ch (Samuele Pedroni)
Date: Tue Oct 28 16:53:00 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310281527.h9SFRjw29046@12-236-54-216.client.attbi.com>
References: <200310280202.h9S22Ad21701@oma.cosc.canterbury.ac.nz> <200310280255.h9S2tvW28198@12-236-54-216.client.attbi.com> <200310280956.34183.aleaxit@yahoo.com> <3F9E50C3.4040908@iinet.net.au>
Message-ID: <5.2.1.1.0.20031028224849.02876cc0@pop.bluewin.ch>

At 07:27 28.10.2003 -0800, Guido van Rossum wrote:
> It matches what the current global statement does, and it
> makes it crystal clear that you *can* declare a variable in a specific
> scope and assign to it without requiring there to be a binding for
> that variable in the scope itself.

EIBTI when comparing these two. Looking at:

x = 'global'

def f():
    def init():
        global x in f
        x = 'in f'
    def g():
        print x
    init()
    g()

I don't really know whether to call explicit or implicit the fact that x in g is not the global one.
And contrast with

x = 'global'

def f():
    x = 0
    def init():
        global x
        x = 'in f'
    def g():
        print x
    init()
    g()

or consider

x = 'global'

def f():
    global x
    def init():
        global x in f
        x = 'in f'
    def g():
        print x
    init()
    g()

From fperez at colorado.edu Tue Oct 28 16:57:38 2003
From: fperez at colorado.edu (Fernando Perez)
Date: Tue Oct 28 16:57:42 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: <200310282226.33748.aleaxit@yahoo.com>
References: <3F9EC269.6080108@colorado.edu> <200310282226.33748.aleaxit@yahoo.com>
Message-ID: <3F9EE652.3030703@colorado.edu>

Alex Martelli wrote:
> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote:
>> I just wanted to add a small comment on this discussion, which I'd been
>> following via the newsgroup mirror.
>
> Thanks for your comments! I didn't even know we HAD an ng mirror...

Via gmane news, it works quite well in fact. I typically follow python-dev there, and only subscribe occasionally if I need to say something.

> Yes, of course, you're right. However, the most specific problem is: do
> you know of ANY use cases where A*x and x*A should give different results,
> or the former should succeed and the latter should fail, *when x is an
> integer*?
>
> If you can find any use case for that, even in an obscure branch of maths,
> then clearly the urgency of fixing this bug goes WAY up.

Well, I'm not a mathematician myself, but I did ask two friends and neither of them could think of such a case quickly (they're applied people, though, we need to ask someone doing abstract algebra :) But I think I can see a 'semi-reasonable' usage case. Bear with me for a moment, please. Suppose A is a member of a class representing a non-linear operator which acts on functions f, such that in particular:

    A(x*f) != x*A(f)

for x an integer.
Now, if for some reason I decide to implement the 'application of A', which in the above I represented with (), with '*', the bug you mention does surface, because then: A*x*f != x*A*f Or does it? The left-right order of associativity plays a role here also, and I don't know exactly how python treats these. Granted, this example is somewhat contrived. Here, using __call__ for application would be more sensible, and the associativity rules may still hide the bug. Using '*' for application is not totally absurd, because if you are using finite matrix representations of your operators and functions, then in fact operator-function application _does_ become multiplication in the finite-dimensional vector space. But it _suggests_ a possibility for the bug to surface. At the same time, it also shows that 'really reasonable' uses will probably not easily expose this one. The one idea which I think matters, though, is the following: since in python we can't define new operators, in specific problem domains the existing ones (such as '*') may end up being reused in very unconventional ways. So while I can't think now of a non-commutative integer*THING multiplication, I don't see why someone might not build a THING where '*' isn't really what we think of as 'multiplication', and then the bug matters. In the end, I'd argue that it would be _nice_ to have it fixed, but I understand that with finite developer resources available, this one may have to take a back seat until someone can show a truly compelling case. Perhaps I'm just not imaginative enough to see one quickly :) > Thanks again for your help -- we DO need to hear from users!!! No problem, thanks for being receptive :) Best, f From nas-python at python.ca Tue Oct 28 17:09:54 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 28 17:08:27 2003 Subject: [Python-Dev] Deprecate the buffer object? 
Message-ID: <20031028220953.GA25984@mems-exchange.org> I happened to be looking at the buffer API today and I came across this posting from Guido: http://mail.python.org/pipermail/python-dev/2000-October/009974.html Over the years there has been a lot of discussion about the buffer API and the buffer object. The general consensus seems to be that the buffer API is not ideal but nonetheless useful. The buffer object, OTOH, is considered fundamentally broken and should be removed. Does anyone object to deprecating the 'buffer' builtin? Eventually we could remove the buffer object completely. Neil From aleaxit at yahoo.com Tue Oct 28 17:23:18 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Tue Oct 28 17:23:25 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031028220953.GA25984@mems-exchange.org> References: <20031028220953.GA25984@mems-exchange.org> Message-ID: <200310282323.18041.aleaxit@yahoo.com> On Tuesday 28 October 2003 11:09 pm, Neil Schemenauer wrote: > I happened to be looking at the buffer API today and I came across > this posting from Guido: > > http://mail.python.org/pipermail/python-dev/2000-October/009974.html > > Over the years there has been a lot of discussion about the buffer > API and the buffer object. The general consensus seems to be that > the buffer API is not ideal but nonetheless useful. The buffer > object, OTOH, is considered fundamentally broken and should be > removed. > > Does anyone object to deprecating the 'buffer' builtin? Eventually > we could remove the buffer object completely. Is that about RW buffers specifically? Because I _have_ used R/O buffers in production code -- when I had a huge string already in memory, and needed various largish substrings of it at different but overlapping times, without paying the overhead to copy them as slicing would have done. 
Having 'buffer' as a built-in was quite minor though -- considering the number of times I have used it, importing some module to get at it would have been perfectly acceptable, perhaps preferable. If the buffer interface stays but the function completely disappears, I guess it won't be too hard for me to recreate it in a tiny extension module, but it's not quite clear to me why I should need to. R/W buffers I've never used in production, though. I do recall once (at the very beginning of my Python usage) using an array's buffer_info method as a Q&D way to do some interfacing to C, but that was before ctypes, which I think is what I'd use now.

Alex

From nas-python at python.ca Tue Oct 28 17:30:14 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Tue Oct 28 17:28:49 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031028220953.GA25984@mems-exchange.org>
References: <20031028220953.GA25984@mems-exchange.org>
Message-ID: <20031028223014.GA26245@mems-exchange.org>

Looks like I was a little quick sending out that message. I found more recent postings from Tim and Guido:

http://mail.python.org/pipermail/python-dev/2002-July/026408.html
http://mail.python.org/pipermail/python-dev/2002-July/026413.html

Slippery little beast, that buffer object. :-) I'm going to go ahead and add deprecation warnings.

Neil

From guido at python.org Tue Oct 28 17:28:35 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 17:29:01 2003
Subject: copysort patch, was RE: [Python-Dev] inline sort option
In-Reply-To: Your message of "Tue, 28 Oct 2003 22:46:50 +0100."
<200310282246.50113.aleaxit@yahoo.com>
References: <200310260115.h9Q1F0905596@oma.cosc.canterbury.ac.nz> <200310281742.39349.aleaxit@yahoo.com> <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> <200310282246.50113.aleaxit@yahoo.com>
Message-ID: <200310282228.h9SMSZf30381@12-236-54-216.client.attbi.com>

> > import new
> >
> > def curry(f, x, cls=None):
> >     return new.instancemethod(f, x)
>
> Hmmm, what's the role of the ", cls=None" argument here...?

Oops, remnant of a dead code branch.

> I.e, couldn't just
>
>     curry = new.instancemethod
>
> be equivalent?

Right. I had bigger plans but decided to can them. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From aleaxit at yahoo.com Tue Oct 28 17:29:19 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Tue Oct 28 17:29:40 2003
Subject: [Python-Dev] Re: [Python-checkins] python/nondist/peps pep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
In-Reply-To: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
References: <005a01c39cdb$fa18b540$81b0958d@oemcomputer>
Message-ID: <200310282329.19689.aleaxit@yahoo.com>

On Monday 27 October 2003 11:45 pm, Raymond Hettinger wrote:
> Excellent PEP!
>
> Consider adding your bookmarking example. I found it to be a compelling
> use case. Also note that there are many variations of the bookmarking
> theme (undo utilities, macro recording, parser lookahead functions,
> backtracking, etc).

I will -- thanks!

> Under drawbacks and issues there are a couple of thoughts:
>
> * Not all iterators will be copyable. Knowing which is which creates a
> bit of a usability issue (i.e. the question of whether a particular
> iterator is copyable will come up every time) and a substitution issue
> (i.e. code which depends on copyability precludes substitution of other
> iterators that don't have copyability).

Yes, I'll have to mention that (that the royal road for user code to access "iterator copying" functionality is via tee() when feasible).
> * In addition to knowing whether a given iterator is copyable, a user
> should also know whether the copy is lightweight (just an index or some
> such) or heavy (storing all of the data for future use). They should
> know whether it is active (intercepting every call to iter()) or inert.

Heavy copies should be left to 'tee' more often than not.

> * For heavy copies, there is a performance trap when the stored data
> stream gets too long. At some point, just using list() would be better.

Or saving to disk, beyond a further threshold.

> Consider adding a section with pure python sample implementations for
> listiter.__copy__, dictiter.__copy__, etc.

OK, but some of it's gonna be very-pseudo code (how do you mimic dictiter's real behaviour in pure Python...?).

> Also, I have a question about the semantic specification of what a copy
> is supposed to do. Does it guarantee that the same data stream will be
> reproduced? For instance, would a generator of random words expect its
> copy to generate the same word sequence. Or, would a copy of a
> dictionary iterator change its output if the underlying dictionary got
> updated (i.e. should the dict be frozen to changes when a copy exists or
> should it mutate).

I'll have to clarify this as per the followup discussion on this thread -- pseudorandom iterators (I'll give an example) should be copyable and ensure the same stream from original and copy; real-random iterators (e.g. from /dev/random) not; iterators on e.g. lists and dicts should not freeze the underlying container when copied any more than they do when first generated (in general, if you mutate a dict or list you're iterating on, Python doesn't guarantee "sensible" behavior...).
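[A pure-Python sketch of such a copyable list iterator -- hypothetical, for illustration only; the real listiterator is implemented in C, and this is the kind of sample the PEP section would carry. A lightweight copy here is just the sequence reference plus an index.]

```python
import copy

class CopyableListIterator(object):
    """Hypothetical copyable iterator over a sequence."""

    def __init__(self, seq, index=0):
        self._seq = seq
        self._index = index

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._seq):
            raise StopIteration
        value = self._seq[self._index]
        self._index += 1
        return value

    next = __next__  # 2.x spelling of the same method

    def __copy__(self):
        # Lightweight copy: share the sequence, duplicate the index.
        return CopyableListIterator(self._seq, self._index)

it = CopyableListIterator(['a', 'b', 'c'])
next(it)                   # advance past 'a'
it2 = copy.copy(it)        # bookmark: uses __copy__
assert list(it) == ['b', 'c']
assert list(it2) == ['b', 'c']   # the copy resumes from the same spot
```

Note that the copy does not freeze `self._seq`: mutating the underlying list affects both iterators, matching the semantics argued for above.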
Thanks, Alex From mike at nospam.com Tue Oct 28 17:40:58 2003 From: mike at nospam.com (Mike Rovner) Date: Tue Oct 28 17:41:20 2003 Subject: [Python-Dev] Re: Re: the "3*x works w/o __rmul__" bug References: <3F9EC269.6080108@colorado.edu><200310282226.33748.aleaxit@yahoo.com> <3F9EE652.3030703@colorado.edu> Message-ID: Fernando Perez wrote: > Alex Martelli wrote: >> On Tuesday 28 October 2003 08:24 pm, Fernando Perez wrote: >>> I just wanted to add a small comment on this discussion, which I'd >>> been following via the newsgroup mirror. >> Thanks for your comments! I didn't even know we HAD an ng mirror... > Via gmane news, it works quite well in fact. I typically follow > python-dev > there, and only subscribe occasionally if I need to say something. Just FYI gmane provides two-way access via nntp. This message is a confirmation. :) Mike From aahz at pythoncraft.com Tue Oct 28 17:43:04 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 17:43:08 2003 Subject: [Python-Dev] PEP 322: Generator Expressions (implementation team) In-Reply-To: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> References: <001601c39d87$aaa08c20$f7b42c81@oemcomputer> Message-ID: <20031028224303.GA1740@panix.com> On Tue, Oct 28, 2003, Raymond Hettinger wrote: > > Guido has accepted the generator expressions pep, so it's time for me to > form an implementation team. Um. PEP 322 is generator expressions? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From nas-python at python.ca Tue Oct 28 17:55:20 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Tue Oct 28 17:53:53 2003 Subject: [Python-Dev] Deprecate the buffer object? 
In-Reply-To: <200310282323.18041.aleaxit@yahoo.com> References: <20031028220953.GA25984@mems-exchange.org> <200310282323.18041.aleaxit@yahoo.com> Message-ID: <20031028225520.GB26245@mems-exchange.org> On Tue, Oct 28, 2003 at 11:23:18PM +0100, Alex Martelli wrote: > Is that about RW buffers specifically? No. > Because I _have_ used R/O buffers in production code -- when I had > a huge string already in memory, and needed various largish > substrings of it at different but overlapping times, without > paying the overhead to copy them as slicing would have done. That's a useful thing to be able to do and the buffer object does it in a safe way. I guess that's part of the reason why the buffer object has managed to survive as long as it has. Neil From tdelaney at avaya.com Tue Oct 28 18:03:34 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Tue Oct 28 18:03:42 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B228@au3010avexu1.global.avaya.com> > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > >AND, adaptation is not typecasting: > >e.g y=adapt("23", int) should NOT succeed. > > And, why do you consider adaptation *not* to be typecasting? > I always > think of it as "give me X, rendered as a Y", which certainly > sounds like a > description of typecasting to me. Because (IMO anyway) adaption is *not* "give me X, rendered as Y". Adaption is "here is an X, can it be used as a Y?". They are two distinct concepts, although obviously there are crossover points. A string cannot be used as an int, although an int can be created from the string representation of an int. Adaption should not involve any change to the underlying data - mutating operations on the adapted object should (attempt to) mutate the original object (assuming the adapted object and original object are not one and the same). 
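[The view-not-copy behaviour described here can be sketched with a hypothetical adapter that presents a list as a stack -- invented names, purely illustrative: mutations on the adapted object land on the original.]

```python
class ListAsStack(object):
    """Hypothetical adapter: a list used as a stack, without copying it."""

    def __init__(self, lst):
        self._lst = lst           # keep a reference, not a copy

    def push(self, item):
        self._lst.append(item)    # mutation goes to the original list

    def pop(self):
        return self._lst.pop()

data = [1, 2]
stack = ListAsStack(data)
stack.push(3)
assert data == [1, 2, 3]     # the underlying object changed...
assert stack.pop() == 3
assert data == [1, 2]        # ...and changed back
```

Contrast int("23"), which leaves "23" untouched and manufactures a new object: that is conversion, not adaptation, on this view.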
Tim Delaney

From pf_moore at yahoo.co.uk Tue Oct 28 18:05:30 2003
From: pf_moore at yahoo.co.uk (Paul Moore)
Date: Tue Oct 28 18:08:18 2003
Subject: [Python-Dev] Re: Deprecate the buffer object?
References: <20031028220953.GA25984@mems-exchange.org> <20031028223014.GA26245@mems-exchange.org>
Message-ID: <8yn5dkud.fsf@yahoo.co.uk>

Neil Schemenauer writes:

> Looks like I was a little quick sending out that message. I found
> more recent postings from Tim and Guido:
>
> http://mail.python.org/pipermail/python-dev/2002-July/026408.html
> http://mail.python.org/pipermail/python-dev/2002-July/026413.html
>
> Slippery little beast, that buffer object. :-) I'm going to go
> ahead and add deprecation warnings.

I used it once in combination with ctypes as buffer(a-ctypes-object) to get at the raw memory which ctypes objects expose via the buffer API. But it was pretty obscure, and I would happily have used an external module. Like this:

>>> import ctypes
>>> n = ctypes.c_int(12)
>>> buffer(n)
>>> str(buffer(n))
'\x0c\x00\x00\x00'

Basically, the only serious use case is getting the bytes out of objects which support the buffer API but which *don't* offer a "get the bytes out" interface. I've just realised that I could, however, also do this via the array module:

>>> from array import array
>>> a = array('c')
>>> a.fromstring(n)  # Hey - fromstring means "from buffer API"!
>>> a.tostring()
'\x0c\x00\x00\x00'

There's an extra copy in there. Disaster :-) Nope, I don't think there's a good use case after all...
Paul
--
This signature intentionally left blank

From tdelaney at avaya.com Tue Oct 28 18:17:58 2003
From: tdelaney at avaya.com (Delaney, Timothy C (Timothy))
Date: Tue Oct 28 18:18:06 2003
Subject: [Python-Dev] RE: [Python-checkins]python/nondist/pepspep-0323.txt, NONE, 1.1 pep-0000.txt, 1.254, 1.255
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B231@au3010avexu1.global.avaya.com>

> From: Alex Martelli [mailto:aleaxit@yahoo.com]
>
> Come to think of this, there may be other use cases for this
> general approach than "random iterators". Do you think that
> an iterator on a callable *and args for it* would live well in
> itertools? That module IS, after all, your baby...

Hmm - I like the idea of this.

import itertools

d10 = itertools.icall(random.randint, (1, 10,))

for i in range(10):
    print d10.next()

Tim Delaney

From greg at cosc.canterbury.ac.nz Tue Oct 28 18:28:39 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 18:29:03 2003
Subject: [Python-Dev] replacing 'global'
In-Reply-To: <200310281022.31722.aleaxit@yahoo.com>
Message-ID: <200310282328.h9SNSd400064@oma.cosc.canterbury.ac.nz>

Alex Martelli :

> i.e., the 'outer' statement should be
> 'outer' expr_stmt

The way I was thinking, "outer" wouldn't be a statement at all, but a modifier applied to an identifier in a binding position. So, e.g.

    x, outer y, z = 1, 2, 3

would be legal, meaning that x and z are local and y isn't, and

    outer x = 1; y = 2

would mean y is local and x isn't. To make both x and y non-local you would have to write

    outer x = 1; outer y = 2

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From pje at telecommunity.com Tue Oct 28 18:33:19 2003
From: pje at telecommunity.com (Phillip J.
Eby) Date: Tue Oct 28 18:35:24 2003 Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global') In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B228@au3010avexu1.global .avaya.com> Message-ID: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com> At 10:03 AM 10/29/03 +1100, Delaney, Timothy C (Timothy) wrote: > > From: Phillip J. Eby [mailto:pje@telecommunity.com] > > > > > At 09:56 AM 10/28/03 +0100, Alex Martelli wrote: > > >AND, adaptation is not typecasting: > > >e.g y=adapt("23", int) should NOT succeed. > > > > And, why do you consider adaptation *not* to be typecasting? > > I always > > think of it as "give me X, rendered as a Y", which certainly > > sounds like a > > description of typecasting to me. > >Because (IMO anyway) adaption is *not* "give me X, rendered as Y". >Adaption is "here is an X, can it be used as a Y?". > >They are two distinct concepts, although obviously there are crossover >points. Yes, just like 2+2==4 and 2*2==4. >A string cannot be used as an int, although an int can be created from the >string representation of an int. I'd often like to "use a string as an integer", or use some arbitrary object as an integer. Of course, there's a perfectly valid way to express this now (i.e. 'int()'), and I think that's fine and in my code I will personally prefer to use int() to mean I want an int, because that's clearer. But, if for some reason I have code that is referencing some protocol as a *parameter*, say 'p', and I have no way to know in advance that p==int, then the most sensible thing to do is 'adapt(x,p)', rather than 'p(x)'. (Assuming 'p' is expected to be a protocol, rather than a conversion function.) Now, given that 'p' *might* be 'int' in some cases, it seems reasonable to me that adapt("23",p) should return 23 in such a case. Since 23 satisfies the desired contract (int) on behalf of "23", this seems to be a correct adaptation. 
For a protocol p that has immutability as part of its contract, adapt(x,p) is well within its rights to return an object that is a "copy" of x in some sense. The immutability requirement means that the "adapted" value can never change, so really it's a *requirement* that the "adaptation" be a snapshot. >Adaption should not involve any change to the underlying data - mutating >operations on the adapted object should (attempt to) mutate the original >object (assuming the adapted object and original object are not one and >the same). I agree 100% -- for a protocol whose contract doesn't require immutability, the way 'int' does. I think now that I understand, however, why you and Alex think I'm saying something different than I've been saying. To both of you, "typecasting" means "convert to a different type" at an *implementation* level (as it is in other languages), and I mean at a *logical* level. Thus, to me, "I would like to use X as a Y" includes whatever contracts Y supplies *as applied to X*. Not, "give me an instance of Y that's a copy of X". It just so happens, however, that for a protocol whose contract includes immutability, these two concepts overlap, just as multiplication and addition overlap for the case of 2+2==2*2. So, IMO, for immutable types such as tuple, str, int, and float, I believe that it's reasonable for adapt(x,p)==p(x) iff x is not an instance of p already, and does not have a __conform__ method that overrides this interpretation. That such a default interpretation is redundant with p(x), I also agree. However, for code that uses protocols dynamically, that redundancy would eliminate the need to make a dummy protocol (e.g. 'IInteger') to use in place of 'int'. OTOH, if Guido decides that Python's eventual interface objects shouldn't be types, then there will be an IInteger anyway, and the point becomes moot. 
Anyway, I can only understand Alex's objection to such adaptation if he is saying that there is no such thing as adapting to an immutable protocol! In that case, there could never exist such a thing as IInteger, because you could never adapt anything to it that wasn't already an IInteger. Somehow, this seems wrong to me.

From aahz at pythoncraft.com Tue Oct 28 18:37:45 2003
From: aahz at pythoncraft.com (Aahz)
Date: Tue Oct 28 18:37:49 2003
Subject: [Python-Dev] Decimal.py in sandbox
In-Reply-To:
References:
Message-ID: <20031028233745.GA19657@panix.com>

On Mon, Oct 27, 2003, Batista, Facundo wrote:
>
> The reasoning of the majority is that when two operands are of different type,
> the less general must be converted to the more general one:
>
> >>> myDecimal = Decimal(5)
> >>> myfloat = 3.0
> >>> mywhat = myDecimal + myfloat
> >>> isinstance(mywhat, float)
> True

Absolutely not. No way, no how, no time. -1000

The problem is that Decimal is capable of greater precision, accuracy, and range than float. You could reasonably argue that the result should be a Decimal, but that has problems with numbers like 1.1 that already are inexactly represented in Python. My opinion is that conversion between float and Decimal should always be explicit (and my recollection is that Tim Peters agrees).

> >>> myDecimal = Decimal(5)
> >>> myint = 3
> >>> mywhat = myint + myDecimal
> >>> isinstance(mywhat, Decimal)
> True

This is acceptable (because you can't lose anything), but I'm overall leaning toward always requiring explicit conversion. The one thing I dislike in Cowlishaw's algorithms is that integers are always zero-extended. IOW, 1e3 is always 1000. But a standard is a standard; if we want Python's Decimal results to be interoperable with other languages, we have to do that.

-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan From aahz at pythoncraft.com Tue Oct 28 18:39:23 2003 From: aahz at pythoncraft.com (Aahz) Date: Tue Oct 28 18:39:26 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: References: Message-ID: <20031028233923.GB19657@panix.com> On Tue, Oct 28, 2003, Batista, Facundo wrote: > > So, to who may I send the changes? > > Should I send the whole staff at the end of the work, or keep feeding small > changes? > > Should I send by email the diff results? Are you comfortable with CVS? Would you like to check your changes in directly? (Since this is sandbox, it doesn't require the usual rigorous approval process for patches.) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It is easier to optimize correct code than to correct optimized code." --Bill Harlan From greg at cosc.canterbury.ac.nz Tue Oct 28 19:34:11 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 28 19:34:25 2003 Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug In-Reply-To: <200310281828.h9SISW529541@12-236-54-216.client.attbi.com> Message-ID: <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz> Guido van Rossum : > The reason why it works at all for integers without __rmul__ is > complicated; it has to do with very tricky issues in trying to > implement multiplication of a sequence with an integer. I thought the plan was to get rid of all the special case code in the interpreter for multiplying sequences and push it all down into methods of the objects concerned, i.e. all sequences, including the built-in ones, would implement the C equivalent of both __mul__ and __rmul__ if they wanted to support multiplication on both sides. Is there some reason why that wouldn't work? Or is it just that nobody has had time to fix all the built-in sequences to work this way? 
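[In Python terms, the scheme would amount to every sequence carrying both slots itself; a sketch with a minimal stand-in sequence type (not one of the built-ins):]

```python
class Seq(object):
    """Minimal sequence type handling repetition on both sides itself."""

    def __init__(self, items):
        self.items = list(items)

    def __mul__(self, n):
        # seq * n: repetition, only defined for integer counts.
        if not isinstance(n, int):
            return NotImplemented
        return Seq(self.items * n)

    __rmul__ = __mul__   # n * seq delegates to the same repetition logic

s = Seq(['a', 'b'])
assert (s * 2).items == ['a', 'b', 'a', 'b']
assert (3 * s).items == ['a', 'b'] * 3   # int.__mul__ returns
                                         # NotImplemented, __rmul__ runs
```

With no interpreter special-casing, 3 * s works purely because int.__mul__ declines and Seq.__rmul__ is consulted -- which is why a user type's __rmul__ would then reliably be called.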
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From guido at python.org Tue Oct 28 19:37:38 2003
From: guido at python.org (Guido van Rossum)
Date: Tue Oct 28 19:37:53 2003
Subject: [Python-Dev] Re: the "3*x works w/o __rmul__" bug
In-Reply-To: Your message of "Wed, 29 Oct 2003 13:34:11 +1300." <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz>
References: <200310290034.h9T0YBo00246@oma.cosc.canterbury.ac.nz>
Message-ID: <200310290037.h9T0bcq30713@12-236-54-216.client.attbi.com>

> I thought the plan was to get rid of all the special case code in the
> interpreter for multiplying sequences and push it all down into
> methods of the objects concerned, i.e. all sequences, including the
> built-in ones, would implement the C equivalent of both __mul__ and
> __rmul__ if they wanted to support multiplication on both sides.
>
> Is there some reason why that wouldn't work? Or is it just that
> nobody has had time to fix all the built-in sequences to work
> this way?

It would be a lot of work, and I expect that for 3rd party extension types (and possibly for 3rd party Python classes) it wouldn't be quite compatible. I want it to work this way in Python 3.0, but I don't know if it's worth reworking all that tedious detail in the 2.x series. (Understanding that 3.0 is a few years away still.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

From greg at cosc.canterbury.ac.nz Tue Oct 28 20:37:45 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 20:37:58 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com>
Message-ID: <200310290137.h9T1bjO00488@oma.cosc.canterbury.ac.nz>

"Phillip J.
Eby" : > For a protocol p that has immutability as part of its contract, > adapt(x,p) is well within its rights to return an object that is a > "copy" of x in some sense. I don't think that's right -- this should only apply if the original object x is immutable. Otherwise, changes to x should be reflected in the view of it provided by p -- even if p itself provides no operations for mutation. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jacobs at penguin.theopalgroup.com Tue Oct 28 20:41:04 2003 From: jacobs at penguin.theopalgroup.com (Kevin Jacobs) Date: Tue Oct 28 20:41:08 2003 Subject: [Python-Dev] Decimal.py in sandbox In-Reply-To: <20031028233923.GB19657@panix.com> Message-ID: On Tue, 28 Oct 2003, Aahz wrote: > On Tue, Oct 28, 2003, Batista, Facundo wrote: > > > > So, to who may I send the changes? > > > > Should I send the whole staff at the end of the work, or keep feeding small > > changes? > > > > Should I send by email the diff results? > > Are you comfortable with CVS? Would you like to check your changes in > directly? (Since this is sandbox, it doesn't require the usual rigorous > approval process for patches.) I'd be happier with at least one round of review before committing to CVS. The code is fairly complex and an extra set of eyes will help keep things focused. I've also volunteered to be that extra set of eyes, and plan a quick turn-around on any patches sent to me. However, I don't have CVS write permission either, even to the sandbox. 
-Kevin -- -- Kevin Jacobs The OPAL Group - Enterprise Systems Architect Voice: (440) 871-6725 x 19 E-mail: jacobs@theopalgroup.com Fax: (440) 871-6722 WWW: http://www.theopalgroup.com/ From greg at cosc.canterbury.ac.nz Tue Oct 28 20:41:54 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Tue Oct 28 20:42:07 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031028220953.GA25984@mems-exchange.org> Message-ID: <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> Neil Schemenauer : > The buffer object, OTOH, is considered fundamentally broken and should > be removed. There's no doubt that the current implementation of it is unacceptably dangerous, but I haven't yet seen an argument that convinces me that it couldn't be fixed if desired. I don't think the *idea* of a buffer object is fundamentally flawed, and it seems potentially useful (although I must admit that I haven't found a need for it myself yet). Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From jo at jan.csie.ntu.edu.tw Tue Oct 28 20:42:47 2003 From: jo at jan.csie.ntu.edu.tw (Chih-Chung Chang) Date: Tue Oct 28 20:43:30 2003 Subject: copysort patch, was RE: [Python-Dev] inline sort option Message-ID: <20031029014247.GA28906@jan.csie.ntu.edu.tw> Hi, Raymond Hettinger wrote: > Okay, this is the last chance to come-up with a name other than > sorted(). > > Here are some alternatives: > > inlinesort() # immediately clear how it is different from sort() > sortedcopy() # clear that it makes a copy and does a sort > newsorted() # appropriate for a class method constructor > > > I especially like the last one and all of them provide a distinction > from list.sort(). > How about adding a builtin function sort() which returns the sorted version of the input list? 
    L.sort()    # sort in-place
    sort(L)     # return sorted copy

Regards,
Chih-Chung Chang

From bac at OCF.Berkeley.EDU Tue Oct 28 21:01:38 2003
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Tue Oct 28 21:01:44 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
Message-ID: <3F9F1F82.2090209@ocf.berkeley.edu>

Today I got the wheels turning on my master's thesis by getting an adviser. Now I just need a topic. =) The big goal is to do something involving Python for a thesis to be finished by fall of next year (about October) so as to have it done, hopefully published (getting into LL4 would be cool), and ready to be used for doctoral applications come January 2005.

So, anyone have any ideas? The best one that I can think of is optional type-checking. I am fairly open to ideas, though, in almost any area involving language design.

There is no deadline to this, so if an idea strikes you a while from now, still let me know. I suspect I won't settle on an idea any sooner than December, and that is only if the idea just smacks me in the face and says, "DO THIS!" Otherwise it might be a while, since I don't want to take up a topic that won't interest me or is not helpful in some way.

-Brett

From pje at telecommunity.com Tue Oct 28 21:31:53 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue Oct 28 21:31:06 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <200310290137.h9T1bjO00488@oma.cosc.canterbury.ac.nz>
References: <5.1.1.6.0.20031028180926.01f40040@telecommunity.com>
Message-ID: <5.1.0.14.0.20031028210911.03e4cd80@mail.telecommunity.com>

At 02:37 PM 10/29/03 +1300, Greg Ewing wrote:
>"Phillip J. Eby" :
>
> > For a protocol p that has immutability as part of its contract,
> > adapt(x,p) is well within its rights to return an object that is a
> > "copy" of x in some sense.
>
>I don't think that's right -- this should only apply if
>the original object x is immutable.
>Otherwise, changes to
>x should be reflected in the view of it provided by p --
>even if p itself provides no operations for mutation.

There's a difference between an interface which provides no methods for mutation, and an interface that *requires* immutability. Part of the concept of an 'int' or 'tuple' is that it is a *value* and therefore unchanging. Thus, one might say that IInteger or ITuple conceptually derive from IValueObject.

However, that doesn't mean we can't say that adapt([1,2,3],tuple) should fail, and I'm certainly open to the possibility of such an interpretation, if it's decreed that supporting 'tuple' means guaranteeing that the adaptee doesn't change state, not merely the adapted form.

It seems there are three levels of "immutable" one may have in an interface/protocol:

1. No mutator methods, but no requirements regarding stability of state
2. Immutability is required of the adapted form (snapshot)
3. Immutability is required of the adaptee

I have made plenty of use of cases 1 and 2, but never 3. I'm having a hard time thinking of a use case for it, so that's probably why it hasn't occurred to me before now. Looking at this list, I now understand at least one of Alex's points better: he (and I think you) are assuming that an immutable target protocol means case 3. That has been baffling the heck out of me, because I have not yet encountered a use case for 3. On the other hand, it's possible that I *have* seen use case 3 and mistaken it for use case 2, simply because all the types I wrote adapters for were immutable.

Given all this, I think I'm okay with saying that adapting from a mutable object to an immutable interface (e.g. list->tuple) is an improper use of adaptation. Presumably this also means StringIO->str adaptation would be invalid as well. But int<->str and other such immutable-to-immutable conversions seem well within the purview of adaptation.
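[Editor's note: case 2 in Phillip's list -- an immutable snapshot of a mutable adaptee -- can be sketched with a toy PEP 246-style registry. adapt() is not a builtin and none of this machinery exists in the stdlib; every name below is illustrative only.]

```python
# Toy adaptation registry in the spirit of PEP 246; all names here are
# illustrative assumptions, not an existing API.
_adapters = {}

def register_adapter(from_type, protocol, factory):
    """Declare how to adapt instances of from_type to protocol."""
    _adapters[(from_type, protocol)] = factory

def adapt(obj, protocol):
    if isinstance(obj, protocol):
        return obj  # already conforms, no adapter needed
    factory = _adapters.get((type(obj), protocol))
    if factory is None:
        raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))
    return factory(obj)

# "Snapshot" adaptation (case 2): the adapted form is immutable and
# hashable, while the original list stays mutable.
register_adapter(list, tuple, tuple)

original = [1, 2, 3]
snapshot = adapt(original, tuple)
original.append(4)            # the adaptee can still change...
assert snapshot == (1, 2, 3)  # ...but the snapshot does not follow it
```

Whether that last property is a feature (a safe frozen copy) or a bug (the adapted form silently diverging from the adaptee) is exactly the disagreement in this thread.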
From greg at cosc.canterbury.ac.nz Tue Oct 28 21:56:53 2003
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue Oct 28 21:57:12 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <5.1.0.14.0.20031028210911.03e4cd80@mail.telecommunity.com>
Message-ID: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>

"Phillip J. Eby" :

> Given all this, I think I'm okay with saying that adapting from a mutable
> object to an immutable interface (e.g. list->tuple) is an improper use of
> adaptation.

Expecting such an adaptation to somehow make the underlying list unchangeable by any means would be unreasonable, I think. I can't see any way of enforcing that other than by making a copy, which goes against the spirit of adaptation.

There still might be uses for it, though, without any unchangeability guarantee, such as passing it to something that requires a tuple and not just a sequence, but not wanting the overhead of making a copy.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From ncoghlan at iinet.net.au Tue Oct 28 23:09:22 2003
From: ncoghlan at iinet.net.au (Nick Coghlan)
Date: Tue Oct 28 23:09:30 2003
Subject: [Python-Dev] Alternate notation for global variable assignments
In-Reply-To: <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com>
References: <1067299912.1066.35.camel@anthem> <338366A6D2E2CA4C9DAEAE652E12A1DED6B084@au3010avexu1.global.avaya.com> <1067299912.1066.35.camel@anthem> <5.1.0.14.0.20031028084229.01e66800@mail.telecommunity.com> <200310281536.h9SFaNr29119@12-236-54-216.client.attbi.com>
Message-ID: <3F9F3D72.5080308@iinet.net.au>

Guido van Rossum strung bits together to say:
> which loses the "aha!" effect of a cool solution.
> It also IMO
> requires too much explanation to the unsuspecting reader who doesn't
> understand right away *why* rumpelstiltskin imports itself.

I believe someone else also suggested that if rumpelstiltskin should be imported as:

    import fairytales.rumpelstiltskin

then a bare import is going to have trouble, even inside the module.

Cheers,
Nick.

--
Nick Coghlan          | Brisbane, Australia
ICQ#: 68854767        | ncoghlan@email.com
Mobile: 0409 573 268  | http://www.talkinboutstuff.net
"Let go your prejudices, lest they limit your thoughts and actions."

From python at rcn.com Wed Oct 29 01:36:35 2003
From: python at rcn.com (Raymond Hettinger)
Date: Wed Oct 29 01:37:33 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031028225520.GB26245@mems-exchange.org>
Message-ID: <000501c39de7$0019c180$3403a044@oemcomputer>

> That's a useful thing to be able to do and the buffer object does it
> in a safe way. I guess that's part of the reason why the buffer
> object has managed to survive as long as it has.

At least the builtin buffer function should go away. Even if someone had a use for it, it would not make up for all the time lost by all the other people trying to figure out what it was good for.

Raymond Hettinger

From martin at v.loewis.de Wed Oct 29 02:17:58 2003
From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: Wed Oct 29 02:18:25 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: 

"Brett C." writes:

> So, anyone have any ideas? The best one that I can think of is
> optional type-checking. I am fairly open to ideas, though, in almost
> any area involving language design.

Did you explicitly mean language *design*? Because there might be areas of research relevant to language implementation, in terms of efficiency, portability, etc.
Here are some suggestions:

- memory management: attempt to replace reference counting by "true" garbage collection
- threading: attempt to provide free threading efficiently
- typing: attempt to provide run-time or static type inference, and see whether this could be used to implement some byte codes more efficiently (although there is probably overlap with the specializing compilers)
- floating point: provide IEEE 754 semantics in a portable yet efficient way
- persistency: provide a mechanism to save the interpreter state to disk, with the possibility to restart it later (similar to Smalltalk images)

On language design, I don't have that many suggestions, as I think the language itself should evolve slowly if at all:

- deterministic finalization: provide a way to get objects destroyed implicitly at certain points in control flow; a use case would be thread-safety/critical regions
- attributes: provide syntax to put arbitrary annotations on functions, classes, and class members, similar to .NET attributes. Use that facility to implement static and class methods, synchronized methods, final methods, web methods, transactional methods, etc. (yes, there is a proposal, but nobody knows whether it meets all requirements - nobody knows what the requirements are)
- interfaces (this may go along with optional static typing)

Regards,
Martin

From janssen at parc.com Wed Oct 29 02:26:15 2003
From: janssen at parc.com (Bill Janssen)
Date: Wed Oct 29 02:26:43 2003
Subject: [Python-Dev] htmllib vs. HTMLParser
In-Reply-To: Your message of "Tue, 28 Oct 2003 04:53:50 PST." <20031028125350.GC1095@rogue.amk.ca>
Message-ID: <03Oct28.232619pst."58611"@synergy1.parc.xerox.com>

> Perhaps, but it might be a mug's game. I was on the Lynx developer list for
> a while, and bad HTML requires many, many hacks to be processed sensibly.

Yes, I know what you mean.
I would personally be happy to simply reject bad HTML (return None from the parser), and force the user to do what he currently has to do to handle it.

Bill

From aleaxit at yahoo.com Wed Oct 29 02:45:36 2003
From: aleaxit at yahoo.com (Alex Martelli)
Date: Wed Oct 29 02:46:46 2003
Subject: Adaptation and typecasting (was Re: [Python-Dev] replacing 'global')
In-Reply-To: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>
References: <200310290256.h9T2uqu00728@oma.cosc.canterbury.ac.nz>
Message-ID: <200310290845.36472.aleaxit@yahoo.com>

On Wednesday 29 October 2003 03:56, Greg Ewing wrote:
> "Phillip J. Eby" :
> > Given all this, I think I'm okay with saying that adapting from a
> > mutable object to an immutable interface (e.g. list->tuple) is an
> > improper use of adaptation.
>
> Expecting such an adaptation to somehow make the underlying
> list unchangeable by any means would be unreasonable, I
> think. I can't see any way of enforcing that other than by
> making a copy, which goes against the spirit of adaptation.
>
> There still might be uses for it, though, without any
> unchangeability guarantee, such as passing it to something
> that requires a tuple and not just a sequence, but not
> wanting the overhead of making a copy.

There are uses for both permanent (via copy) and temporary freezing. For example: checking if a list is an element of a set will need only temporary freezing -- just enough to let the list supply a hash value. Adding the list to the set will need a frozen copy. Right now, the sets.py code tries for both kinds of adaptation via special methods -- __as_immutable__ and __as_temporarily_immutable__ -- but that's just the usual ad hoc approach.
If we had adaptation I'd want both of these to go via protocol adaptation, just because that will allow adaptation strategies to be supplied by protocol, object type AND third parties -- practicality beats purity, i.e., even though you are puristically right that adaptation normally shouldn't copy, I find this one a compelling, very practical use case.

Adaptation altering the object itself, as in "setting a flag in the list to make it permanently reject any further changes", WOULD on the other hand be a very bad thing -- one could never safely try adaptation any longer if one had to fear such permanent effects on the object being adapted.

Alex

From troels at thule.no Wed Oct 29 02:57:19 2003
From: troels at thule.no (Troels Walsted Hansen)
Date: Wed Oct 29 02:58:08 2003
Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <000501c39de7$0019c180$3403a044@oemcomputer>
References: <000501c39de7$0019c180$3403a044@oemcomputer>
Message-ID: <3F9F72DF.9080101@thule.no>

Raymond Hettinger wrote:
> At least the builtin buffer function should go away.
> Even if someone had a use for it, it would not make up for all the time
> lost by all the other people trying to figure out what it was good for.

I trust you will preserve the functionality though? I have used the buffer() function to achieve great leaps in performance in applications which send data from a string buffer to a socket. Slicing kills performance in this scenario once buffer sizes get beyond a few 100 kB.

Below is an example from an asyncore.dispatcher subclass. This code sends chunks with maximum size, without ever slicing the buffer.
    def handle_write(self):
        if self.buffer_offset:
            sent = self.send(buffer(self.buffer, self.buffer_offset))
        else:
            sent = self.send(self.buffer)
        self.buffer_offset += sent
        if self.buffer_offset == len(self.buffer):
            del self.buffer

Troels

From pyth at devel.trillke.net Wed Oct 29 02:59:18 2003
From: pyth at devel.trillke.net (Holger Krekel)
Date: Wed Oct 29 02:59:54 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>; from bac@OCF.Berkeley.EDU on Tue, Oct 28, 2003 at 06:01:38PM -0800
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <20031029085918.Y14453@prim.han.de>

Hi Brett,

Brett C. wrote:
> Today I got the wheels turning on my masters thesis by getting an
> adviser. Now I just need a topic. =) The big goal is to do something
> involving Python for a thesis to be finished by fall of next year (about
> October) so as to have it done, hopefully published (getting into LL4
> would be cool), and ready to be used for doctoral applications come
> January 2005.
>
> So, anyone have any ideas? The best one that I can think of is optional
> type-checking. I am fairly open to ideas, though, in almost any area
> involving language design.

Maybe you have heard of PyPy, a reimplementation of Python in Python. We are employing quite some innovative approaches to language design and implementation, and there are certainly a lot of open research areas. See our OSCON 2003 paper

    http://codespeak.net/pypy/index.cgi?doc/oscon2003-paper.html

or two interesting chapters out of our European Union proposal:

    http://codespeak.net/pypy/index.cgi?doc/funding/B1.0
    http://codespeak.net/pypy/index.cgi?doc/funding/B6.0

You are welcome to discuss stuff on e.g. the IRC channel #pypy on freenode or on the mailing list

    http://codespeak.net/mailman/listinfo/pypy-dev

in order to find out if you'd like to join us and/or do an interesting thesis.
have fun,

    holger

From Boris.Boutillier at arteris.net Wed Oct 29 03:30:10 2003
From: Boris.Boutillier at arteris.net (Boris Boutillier)
Date: Wed Oct 29 03:30:16 2003
Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ?
Message-ID: <3F9F7A92.1050800@arteris.net>

Hi all,

I've posted this question to the main python list, but got no answers, and I didn't see the issue raised on Python-dev (but I subscribed only two weeks ago). It concerns problems with the Py_TPFLAGS_HEAPTYPE flag and the new 'hackcheck' in Python 2.3.

I'm writing a C extension module for Python 2.3. I need to declare a new class, MyClass. For this class I want two things:

1) redefine the setattr function on objects of this class (ie setting a new tp_setattro)
2) the Python user should be able to change attributes on MyClass (the class itself).

Now I have a conflict on the Py_TPFLAGS_HEAPTYPE flag with the new Python 2.3. If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have a problem with the new hackcheck (Objects/typeobject.c:3631), as I am a HEAPTYPE but I also redefine tp_setattro. If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on my class because of a check in type_setattro (Objects/typeobject.c:2047).

The only solution I've got without modifying the Python source is to create a specific metaclass for MyClass and write its tp_setattro. But I don't like the idea of making a copy-paste of the type_setattro source code just to remove a check; this is not great for future compatibility with Python (at each revision of Python I have to check whether type_setattro has changed and copy-paste the changes).

In fact I'm really wondering what the real meaning of this flag is, but I think there is some history behind it. If you think this is not the right place for this question, just ignore it, and sorry for the disturbance.
Boris

From FBatista at uniFON.com.ar Wed Oct 29 04:03:21 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Oct 29 04:04:39 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID: 

Aahz wrote:

#- > >>> myDecimal = Decimal(5)
#- > >>> myfloat = 3.0
#- > >>> mywhat = myDecimal + myfloat
#- > >>> isinstance(mywhat, float)
#- > True
#-
#- Absolutely not. No way, no how, no time. -1000 :)
#- are inexactly represented in Python. My opinion is that conversion
#- between float and Decimal should always be explicit (and my
#- recollection
#- is that Tim Peters agrees).

I'm not decided on any option. I just want (it would be nice) the group to settle one way or the other. There's some controversy about this. Anyway, I'll spell out the options in the pre-PEP, and we all will take a side, :)

. Facundo

From FBatista at uniFON.com.ar Wed Oct 29 04:17:50 2003
From: FBatista at uniFON.com.ar (Batista, Facundo)
Date: Wed Oct 29 04:18:47 2003
Subject: [Python-Dev] Decimal.py in sandbox
Message-ID: 

Aahz wrote:

#- Are you comfortable with CVS? Would you like to check
#- your changes in
#- directly? (Since this is sandbox, it doesn't require the
#- usual rigorous
#- approval process for patches.)

Kevin Jacobs wrote:

#- I'd be happier with at least one round of review before
#- committing to CVS.
#- The code is fairly complex and an extra set of eyes will help keep
#- things focused. I've also volunteered to be that extra set
#- of eyes, and
#- plan a quick turn-around on any patches sent to me.

I'm not comfortable with CVS. I think I'll use the extra pair of eyes of Kevin (thanks), and start learning CVS while keeping the universe secure, :)

Thank you all.

. Facundo
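[Editor's note: the behaviour the "explicit conversion" camp argues for can be sketched with the decimal module as it eventually shipped; at the time of this thread the class still lived in the sandbox.]

```python
from decimal import Decimal

d = Decimal(5)
f = 3.0

# Implicit mixing of Decimal and float is rejected outright...
mixed_ok = True
try:
    d + f
except TypeError:
    mixed_ok = False

# ...so the conversion must be spelled out, e.g. via an exact string form.
result = d + Decimal(str(f))

print(mixed_ok, result)  # False 8.0
```

This keeps the inexactness of binary floats from silently leaking into decimal arithmetic: the programmer decides, in writing, how the float becomes a Decimal.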
From arigo at tunes.org Wed Oct 29 05:47:36 2003
From: arigo at tunes.org (Armin Rigo)
Date: Wed Oct 29 05:51:31 2003
Subject: [Python-Dev] RE: cloning iterators again
In-Reply-To: <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>
References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> <200310281703.58169.aleaxit@yahoo.com> <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com>
Message-ID: <20031029104736.GA20194@vicky.ecs.soton.ac.uk>

Hello Guido,

On Tue, Oct 28, 2003 at 10:00:14AM -0800, Guido van Rossum wrote:
> I haven't seen Armin's code, but I don't believe that the type alone
> gives enough information about whether they should be copied.

This is a quite deep problem, actually. I admit I have never used copy.py because in all cases I needed more control about what should be copied or not. This generator-copier module that we are talking about is no exception: its existence is not only due to the fact that it can copy generators, but also that I needed precise control over what I copied and what I shared. Putting this information in __getstate__ or __copy__ methods of instances or in copy_reg only goes so far, because sometimes you want to do different things with the same instances in the same program -- e.g. you may want at some point only a copy of a small number of objects (e.g. to be able to rollback a small transaction), and at some other point a more complete copy of the state of the same program.

Nevertheless, I can surely make a C module that registers in copy_reg a deep copier for generators.

A bientot,

Armin.
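[Editor's note: Armin's last suggestion -- registering a copier in copy_reg so that the copy machinery picks it up -- looks like this for an ordinary class, shown here with the Python 3 spelling copyreg; generators themselves would still need his C support.]

```python
import copy
import copyreg  # spelled copy_reg in the Python 2.3 era discussed above

class Box:
    """A stand-in for some type the copy machinery doesn't know how to copy."""
    def __init__(self, value):
        self.value = value

def _reduce_box(box):
    # Return (callable, args): how to reconstruct an equivalent object.
    return Box, (box.value,)

copyreg.pickle(Box, _reduce_box)

a = Box([1, 2, 3])
b = copy.deepcopy(a)  # finds _reduce_box via copyreg's dispatch table

assert b is not a
assert b.value == a.value and b.value is not a.value  # args deep-copied
```

As Guido notes below, though, a type-keyed registry like this cannot express "copy these objects, share those" within one program, which is exactly the control Armin says he needs.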
From guido at python.org Wed Oct 29 10:58:53 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 10:59:00 2003 Subject: [Python-Dev] RE: cloning iterators again In-Reply-To: Your message of "Wed, 29 Oct 2003 10:47:36 GMT." <20031029104736.GA20194@vicky.ecs.soton.ac.uk> References: <002f01c39c48$edfa7b60$d4b8958d@oemcomputer> <20031028124042.GA22513@vicky.ecs.soton.ac.uk> <200310281533.h9SFXZi29097@12-236-54-216.client.attbi.com> <200310281703.58169.aleaxit@yahoo.com> <200310281800.h9SI0Fr29445@12-236-54-216.client.attbi.com> <20031029104736.GA20194@vicky.ecs.soton.ac.uk> Message-ID: <200310291558.h9TFwr031960@12-236-54-216.client.attbi.com> > This is a quite deep problem, actually. I admit I have never used > copy.py because in all cases I needed more control about what should > be copied or not. This generator-copier module that we are talking > about is no exception: its existence is not only due to the fact > that it can copy generators, but also that I needed precise control > over what I copied and what I shared. Putting this information in > __getstate__ or __copy__ methods of instances or in copy_reg only > goes so far, because sometimes you want to do different things with > the same instances in the same program -- e.g. you may want at some > point only a copy of a small number of objects (e.g. to be able to > rollback a small transaction), and at some other point a more > complete copy of the state of the same program. > > Nevertheless, I can surely make a C module that registers in > copy_reg a deep copier for generators. I'm not sure that there would be a general use for this... 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From nas-python at python.ca Wed Oct 29 11:35:40 2003
From: nas-python at python.ca (Neil Schemenauer)
Date: Wed Oct 29 11:34:18 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
References: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <20031029163540.GA28700@mems-exchange.org>

Hi Brett,

Some ideas:

* Finish off the AST compiler. Make it possible to manipulate ASTs from Python and allow them to be fed to the compiler to generate code. This is one half of macros for Python. The other half is harder.

* Build a refactoring code editor that works using the AST.

* Implement an object system that supports multiple dispatch. You can look at Dylan and Goo for ideas.

* Optimize access to global variables and builtins. See PEP 267 for some ideas. If we can disallow inter-module shadowing of names the job becomes easier. Measure the performance difference.

* Look at making the GC mark-and-sweep. You will need to provide it explicit roots. Is it worth doing? Mark-and-sweep would require changes to extension modules since they don't expose roots to the interpreter.

* More radically, look at Chicken¹ and its GC. Henry Baker's "Cheney on the M.T.A."² is very clever, IMHO, and could be used instead of Python's reference counting. Build a limited Python interpreter based on this idea and evaluate it.

1. http://www.call-with-current-continuation.org/chicken.html
2. http://citeseer.nj.nec.com/baker94cons.html

From allison at sumeru.stanford.EDU Wed Oct 29 12:21:56 2003
From: allison at sumeru.stanford.EDU (Dennis Allison)
Date: Wed Oct 29 12:22:10 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: 

How about re-engineering the interpreter to make it more MP friendly? (This is probably a bigger task than a master's thesis.)
The current interpreter serializes on the global interpreter lock (GIL) and blocks everything. Is there another approach which would allow processing to continue? Guido said once that there was an attempt to change the granularity of the locking, but that it quickly became overly complex and unstable. Perhaps some of Maurice Herlihy's ideas may be adapted to the problem.

Moreover, it may not be necessary that the interpreter state be consistent and deterministic all the time as long as it eventually produces the same answer as a deterministic equivalent. There may be interpreter organizations which move forward optimistically, ignoring potential locking problems and then (if necessary) recovering, and these may have better performance than the more conservative ones. Or they may not. Some kind of performance tests and evaluations would need to be part of any such study.

On Tue, 28 Oct 2003, Brett C. wrote:

> Today I got the wheels turning on my master's thesis by getting an
> adviser. Now I just need a topic. =) The big goal is to do something
> involving Python for a thesis to be finished by fall of next year (about
> October) so as to have it done, hopefully published (getting into LL4
> would be cool), and ready to be used for doctoral applications come
> January 2005.
>
> So, anyone have any ideas? The best one that I can think of is optional
> type-checking. I am fairly open to ideas, though, in almost any area
> involving language design.

From fperez at colorado.edu Wed Oct 29 12:23:25 2003
From: fperez at colorado.edu (Fernando Perez)
Date: Wed Oct 29 12:23:28 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <20031029163540.GA28700@mems-exchange.org>
References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org>
Message-ID: <3F9FF78D.3060605@colorado.edu>

Hi Brett,

I don't know how interested you are in scientific computing.
But Pat Miller from Lawrence Livermore Lab (http://www.llnl.gov/CASC/people/pmiller/) presented at SciPy'03 some very interesting stuff for on-the-fly compilation of Python code into C for numerical work. None of this has been publicly released yet, but if that kind of thing sounds interesting to you, you might want to contact him.

Just an idea.

Best,

f

From pje at telecommunity.com Wed Oct 29 13:25:52 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 29 13:27:53 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu>
Message-ID: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>

At 06:01 PM 10/28/03 -0800, Brett C. wrote:
>Today I got the wheels turning on my master's thesis by getting an
>adviser. Now I just need a topic. =) The big goal is to do something
>involving Python for a thesis to be finished by fall of next year (about
>October) so as to have it done, hopefully published (getting into LL4
>would be cool), and ready to be used for doctoral applications come
>January 2005.
>
>So, anyone have any ideas? The best one that I can think of is optional
>type-checking. I am fairly open to ideas, though, in almost any area
>involving language design.

Throwing another Python-specific implementation issue into the ring... how about performance of Python function calls? Specifically, the current Python interpreter has a high overhead for argument passing and frame setup that dominates the performance of simple functions. One strategy I've been thinking about for a little while is replacing the per-frame variable-size stacks (e.g. argument and block stacks) with per-thread stacks.
In principle, this would allow a few things to happen:

* Fixed-size "miniframe" workspace objects allocated on the C stack (with lazy creation of heap-allocated "real" frame objects when needed for an exception or a sys._getframe() call)

* Direct use of positional arguments on the stack as the "locals" of the next function called, without creating (and then unpacking) an argument tuple, in the case where there are no */** arguments provided by the caller.

This would be a pretty sizeable change to Python's internals (especially the core interpreter's handling of "call" operations), but could possibly produce double-digit percentage speedups for function calls in tight loops. (I base this hypothesis on the speed difference between a function call and resuming a generator, and the general observation that the runtime of certain classes of Python programs is almost directly proportional to the number of function calls occurring.)

From mwh at python.net Wed Oct 29 13:33:41 2003
From: mwh at python.net (Michael Hudson)
Date: Wed Oct 29 13:33:44 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> (Phillip J. Eby's message of "Wed, 29 Oct 2003 13:25:52 -0500")
References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>
Message-ID: <2mu15rki62.fsf@starship.python.net>

"Phillip J. Eby" writes:

> * Direct use of positional arguments on the stack as the "locals" of
> the next function called, without creating (and then unpacking) an
> argument tuple, in the case where there are no */** arguments
> provided by the caller.

Already done, unless I misunderstand your idea. Well, the arguments might still get copied into the new frame's locals area but I'm pretty sure no tuple is involved.

Cheers,
mwh

--
That being done, all you have to do next is call free() slightly less often than malloc().
You may want to examine the Solaris system libraries for a particularly ambitious implementation of this technique.
    -- Eric O'Dell, comp.lang.dylan (& x-posts)

From pje at telecommunity.com Wed Oct 29 13:48:00 2003
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Oct 29 13:50:00 2003
Subject: [Python-Dev] Looking for master thesis ideas involving Python
In-Reply-To: <2mu15rki62.fsf@starship.python.net>
References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com>
Message-ID: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com>

At 06:33 PM 10/29/03 +0000, Michael Hudson wrote:
>"Phillip J. Eby" writes:
>
> > * Direct use of positional arguments on the stack as the "locals" of
> > the next function called, without creating (and then unpacking) an
> > argument tuple, in the case where there are no */** arguments
> > provided by the caller.
>
>Already done, unless I misunderstand your idea. Well, the arguments
>might still get copied into the new frame's locals area but I'm pretty
>sure no tuple is involved.

Hm. I thought that particular optimization could only take place when the function lacks default arguments. But maybe I've misread that part. If it's true in all cases, then argument tuple creation isn't where the overhead is coming from.

Anyway... it wouldn't be a good thesis idea if the answer were as obvious as my speculations, would it? ;)

From wtrenker at shaw.ca Wed Oct 29 07:13:35 2003
From: wtrenker at shaw.ca (William Trenker)
Date: Wed Oct 29 14:16:46 2003
Subject: [Python-Dev] Weeding thru the PEPs
Message-ID: <20031029121335.00087ef6.wtrenker@shaw.ca>

Hello Python gurus!

I've been learning a lot about Python by following you folks here. Lots of headscratching on my part, but slowly the elegance and utility of Python is sinking in. I've been going thru the PEPs on the Python site.
Since I don't live and breathe with the PEPs like you do, I'm having a bit of a problem seeing the forest for the trees. Specifically, those PEPs which are most active or current are not 'popping off the page' in the PEP index. Is there a view of the PEP index available that is sorted by the date each PEP was last edited? I've looked at the listing of the PEP's in CVS, sorted by Age. That's pretty close but the CVS listing doesn't show the title or status of each PEP. Just wondering, Bill From jeremy at alum.mit.edu Wed Oct 29 16:23:57 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 29 16:28:19 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> Message-ID: <1067462637.24165.7.camel@localhost.localdomain> On Mon, 2003-10-20 at 14:08, Guido van Rossum wrote: > > What I remember you saying was that it was an unfortunate but necessary > > consequence so that it would work the same as > > > > L = [] > > for x in R: > > L.append(x) > > print x > > > > You didn't want to have different semantics for two such similar > > constructs ("there's only one way"). You also didn't want to push a > > stack frame for listcomps. > > Then I guess I *have* changed my mind. I guess I didn't think of the > renaming solution way back when. Not to make a big deal out of it, but I just checked on the first report of this problem that I remember. David Beazley reported this problem on python-dev a couple of years ago and suggested the renaming solution. http://mail.python.org/pipermail/python-dev/2001-May/015089.html I'm sure we talked about the problem, but since I was talking I probably said something about a nested scopes solution <0.3 wink>. 
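[A quick, runnable illustration of the behaviour under discussion: on any Python where the renaming/own-scope fix for comprehensions is in place (it eventually landed for list comprehensions in Python 3.0), the second assertion below holds, while a plain for loop still binds its variable in the enclosing scope.]

```python
# A for loop binds its loop variable in the enclosing scope...
x = "outer"
result = []
for x in range(3):
    result.append(x)
assert x == 2  # ...so the variable leaks out of the loop.

# A list comprehension with its own (effectively renamed) variable
# leaves the enclosing binding untouched.
y = "outer"
squares = [y * y for y in range(3)]
assert squares == [0, 1, 4]
assert y == "outer"  # no leak once the fix is in place
```

In 2.x, before the fix, the comprehension leaked its variable exactly like the for loop does.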
In that thread, Tim did some effective channeling and said the day you approved a solution based on lambda was the day you'd kill us all. Jeremy From guido at python.org Wed Oct 29 16:38:57 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 16:39:28 2003 Subject: [Python-Dev] listcomps vs. for loops In-Reply-To: Your message of "Wed, 29 Oct 2003 16:23:57 EST." <1067462637.24165.7.camel@localhost.localdomain> References: <20031020173134.GA29040@panix.com> <200310201748.h9KHm0E21507@12-236-54-216.client.attbi.com> <20031020175230.GA7307@panix.com> <200310201808.h9KI88Q21557@12-236-54-216.client.attbi.com> <1067462637.24165.7.camel@localhost.localdomain> Message-ID: <200310292139.h9TLcwX32493@12-236-54-216.client.attbi.com> > Tim did some effective channeling and said the day you approved > a solution based on lambda was the day you'd kill us all. Aargh! You're on to my evil plan! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From tdelaney at avaya.com Wed Oct 29 17:41:43 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Wed Oct 29 17:41:51 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> > From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] > > How about re-engineering the interpreter to make it more MP friendly? > (This is probably a bigger task than a Masters thesis.) The current > interpreter serializes on the global interpreter lock (GIL) and blocks > everything. To me this would probably be the most interesting thing to tackle - especially since it has been tried before with partial success but overall failure. At the very least that gives a body of work which you can refer to both as a starting point for your work, and to show how your approach differs from and improves on existing work.
It would also be of tremendous value to Python IMO if it could be done without negatively impacting performance on single-processor machines. Whether it is too large for a Masters thesis I don't know. Does a Masters thesis require *success* in the stated goal? I've been thinking about doing my own Masters in the not-too-distant future if I can find the time ... Tim Delaney From nas-python at python.ca Wed Oct 29 17:44:55 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 17:43:27 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> References: <20031028220953.GA25984@mems-exchange.org> <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> Message-ID: <20031029224455.GA30572@mems-exchange.org> On Wed, Oct 29, 2003 at 02:41:54PM +1300, Greg Ewing wrote: > There's no doubt that the current implementation of it is > unacceptably dangerous, but I haven't yet seen an argument > that convinces me that it couldn't be fixed if desired. Okay. Perhaps I am missing something but would fixing it be as simple as adding another field to the tp_as_buffer struct?

/* references returned by the buffer functions are valid while
 * the object remains alive */
#define PyBuffer_FLAG_SAFE 1

Then in stringobject.c (and elsewhere as appropriate):

static PyBufferProcs buffer_as_buffer = {
    (getreadbufferproc)buffer_getreadbuf,
    (getwritebufferproc)buffer_getwritebuf,
    (getsegcountproc)buffer_getsegcount,
    (getcharbufferproc)buffer_getcharbuf,
    PyBuffer_FLAG_SAFE,
};

Then change bufferobject so that it can only be created from objects that set PyBuffer_FLAG_SAFE.
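[For readers following this thread in a later Python: the memoryview object that eventually shipped (2.7/3.x) behaves the way a "safe" buffer provider has to behave: it holds a reference to its base object, and CPython refuses operations on the base that would invalidate the exported pointer. A sketch of that behaviour, offered only as an analogy; this is not the 2003 API being proposed above.]

```python
data = bytearray(b"hello")
view = memoryview(data)            # the view keeps `data` alive
assert bytes(view[:5]) == b"hello"

# Resizing the base while a buffer is exported would leave the
# exported pointer dangling, so CPython refuses it outright.
try:
    data.extend(b" world")
except BufferError:
    safely_refused = True
else:
    safely_refused = False
assert safely_refused

view.release()                     # after release, resizing is allowed again
data.extend(b" world")
assert data == bytearray(b"hello world")
```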
Neil From allison at sumeru.stanford.EDU Wed Oct 29 17:55:08 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Wed Oct 29 17:56:19 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> Message-ID: Measuring the size of a project is difficult. This one would require (I think) some significant out-of-the-box thinking. There are a number of resources which could be brought to bear in addition to Herlihy's work on synchronization, for example, Kourosh Gharachorloo's work on programming for the Stanford Dash MP where he toyed with the issues involved in building synchronization-independent (that is, lock-independent) programs. On Thu, 30 Oct 2003, Delaney, Timothy C (Timothy) wrote: > > From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] > > > > How about re-engineering the interpreter to make it more MP friendly? > > (This is probably a bigger task than a Masters thesis.) The current > > interpreter serializes on the global interpreter lock (GIL) and blocks > > everything. > > To me this would probably be the most interesting thing to tackle - especially since it has been tried before with partial success but overall failure. At the very least that gives a body of work which you can refer to both as a starting point for your work, and to show how your approach differs from and improves on existing work. > > It would also be of tremendous value to Python IMO if it could be done without negatively impacting performance on single-processor machines. > > Whether it is too large for a Masters thesis I don't know. Does a Masters thesis require *success* in the stated goal? I've been thinking about doing my own Masters in the not-too-distant future if I can find the time ...
> > Tim Delaney > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From guido at python.org Wed Oct 29 18:11:49 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 18:11:57 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: Your message of "Wed, 29 Oct 2003 14:44:55 PST." <20031029224455.GA30572@mems-exchange.org> References: <20031028220953.GA25984@mems-exchange.org> <200310290141.h9T1fsw00503@oma.cosc.canterbury.ac.nz> <20031029224455.GA30572@mems-exchange.org> Message-ID: <200310292311.h9TNBn932586@12-236-54-216.client.attbi.com>

> Okay. Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1
>
> Then in stringobject.c (and elsewhere as appropriate):
>
> static PyBufferProcs buffer_as_buffer = {
>     (getreadbufferproc)buffer_getreadbuf,
>     (getwritebufferproc)buffer_getwritebuf,
>     (getsegcountproc)buffer_getsegcount,
>     (getcharbufferproc)buffer_getcharbuf,
>     PyBuffer_FLAG_SAFE,
> };
>
> Then change bufferobject so that it can only be created from objects
> that set PyBuffer_FLAG_SAFE.

I don't know if this is enough, but if it is, I'd recommend adding the flag bit to tp_flags rather than extending the buffer structure (since you'd need to allocate an extra bit for tp_flags anyway to indicate the longer buffer struct). --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Wed Oct 29 18:21:50 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Wed Oct 29 18:21:33 2003 Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <20031029224455.GA30572@mems-exchange.org> Message-ID: <087001c39e73$70333e60$0500a8c0@eden> Neil Schemenauer

> Okay. Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1
>
> Then in stringobject.c (and elsewhere as appropriate):
>
> static PyBufferProcs buffer_as_buffer = {
>     (getreadbufferproc)buffer_getreadbuf,
>     (getwritebufferproc)buffer_getwritebuf,
>     (getsegcountproc)buffer_getsegcount,
>     (getcharbufferproc)buffer_getcharbuf,
>     PyBuffer_FLAG_SAFE,
> };
>
> Then change bufferobject so that it can only be created from objects
> that set PyBuffer_FLAG_SAFE.

As the essence of the solution, I think that sounds good! I think that the following should also be done:

* Update the docs for the buffer functions to indicate that these are *short term* pointers, that are not guaranteed once *any* Python code is called.

* Add new public buffer functions with "LongTerm" in the name (and docs that the buffer is valid as long as the object). These check the flag as you propose.

* Buffer object uses new LongTerm buffer functions.

It points out that the buffer object itself is less at fault than the interface. I'm trying to short-circuit bugs in external extension modules that use the buffer functions without realizing the subtle assumptions made. Mark. From nas-python at python.ca Wed Oct 29 18:56:01 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 18:54:31 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <087001c39e73$70333e60$0500a8c0@eden> References: <20031029224455.GA30572@mems-exchange.org> <087001c39e73$70333e60$0500a8c0@eden> Message-ID: <20031029235600.GA30853@mems-exchange.org> On Thu, Oct 30, 2003 at 10:21:50AM +1100, Mark Hammond wrote: > As the essence of the solution, I think that sounds good! Thanks for the feedback.
It seems you are one of the few who are familiar with this interface.

> I think that the following should also be done:
>
> * Update the docs for the buffer functions to indicate that these are *short term* pointers, that are not guaranteed once *any* Python code is called.
>
> * Add new public buffer functions with "LongTerm" in the name (and docs that the buffer is valid as long as the object). These check the flag as you propose.
>
> * Buffer object uses new LongTerm buffer functions.

Seems easy enough. I'll make a patch. Neil From bac at OCF.Berkeley.EDU Wed Oct 29 20:05:03 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:05:13 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: <3F9F1F82.2090209@ocf.berkeley.edu> Message-ID: <3FA063BF.6050207@ocf.berkeley.edu> Martin v. Löwis wrote: > "Brett C." writes: > > >>So, anyone have any ideas? The best one that I can think of is >>optional type-checking. I am fairly open to ideas, though, in almost >>any area involving language design. > > > Did you explicitly mean language *design*? Design/implementation. Basically something involving how a language either works or is created. > Because there might be > areas of research relevant to language implementation, in terms of > efficiency, portability, etc. > > Here are some suggestions: > - memory management: attempt to replace reference counting by > "true" garbage collection Maybe. Kind of happy with the way things work now, though. =) > - threading: attempt to provide free threading efficiently Wow, that would be a challenge, to say the least. Might be too much for just a masters thesis.
> - typing: attempt to provide run-time or static type inference, > and see whether this could be used to implement some byte codes > more efficiently (although there is probably overlap with the > specializing compilers) I was actually thinking of type-inference since I am planning on learning (or at least starting to learn) Standard ML next month. > - floating point: provide IEEE-794 (or some such) in a portable > yet efficient way You mean like how we have longs? So code up in C our own way of storing 794 independent of the CPU? > - persistency: provide a mechanism to save the interpreter state > to disk, with the possibility to restart it later (similar to > Smalltalk images) > Hmm. Interesting. Could be the start of continuations. > On language design, I don't have that many suggestions, as I think the > language itself should evolve slowly if at all: > - deterministic finalization: provide a way to get objects destroyed > implicitly at certain points in control flow; a use case would be > thread-safety/critical regions I think I get what you mean by this, but I am not totally sure since I can't come up with a use beyond threads killing themselves properly when the whole program is shutting down. > - attributes: provide syntax to put arbitrary annotations to > functions, classes, and class members, similar to .NET > attributes. Use that facility to implement static and class methods, > synchronized methods, final methods, web methods, transactional > methods, etc (yes, there is a proposal, but nobody knows whether it > meets all requirements - nobody knows what the requirements are) Have no clue what this is since I don't know C#. Almost sounds like Michael's def func() [] proposal at the method level. Or just a lot of descriptors. =) Time to do some Googling. > - interfaces (this may go along with optional static typing) > Yeah, but that is Alex's baby. Thanks for the suggestions, Martin.
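[The .NET-style "attributes" Martin describes map fairly directly onto what descriptors, and from Python 2.4 on decorator syntax, provide. A toy sketch of the "synchronized method" case; the names here (synchronized, Counter) are illustrative only, not any proposed API.]

```python
import functools
import threading

def synchronized(func):
    # Serialize calls on a per-function lock, and record the
    # "attribute" (in the .NET sense) on the function object itself.
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)

    wrapper.synchronized = True  # the annotation itself
    return wrapper

class Counter:
    def __init__(self):
        self.value = 0

    @synchronized
    def bump(self):
        self.value += 1  # not atomic; the lock makes it safe

counter = Counter()
threads = [threading.Thread(target=lambda: [counter.bump() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter.value == 4000
assert Counter.bump.synchronized
```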
-Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:15:23 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:16:17 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031029085918.Y14453@prim.han.de> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029085918.Y14453@prim.han.de> Message-ID: <3FA0662B.8070805@ocf.berkeley.edu> Holger Krekel wrote: > Hi Brett, > > Brett C. wrote: > >>Today I got the wheels turning on my masters thesis by getting an >>adviser. Now I just need a topic. =) The big goal is to do something >>involving Python for a thesis to be finished by fall of next year (about >>October) so as to have it done, hopefully published (getting into LL4 >>would be cool), and ready to be used for doctoral applications come >>January 2005. >> >>So, anyone have any ideas? The best one that I can think of is optional >>type-checking. I am fairly open to ideas, though, in almost any area >>involving language design. > > > Maybe you have heard of PyPy, a reimplementation of Python in Python. > We are employing quite some innovative approaches to language design > and implementation and there are certainly a lot of open research > areas. See our OSCON 2003 paper > > http://codespeak.net/pypy/index.cgi?doc/oscon2003-paper.html > Read a while back. I keep an eye on PyPy from a distance by reading the stuff you guys put out. > or two interesting chapters out of our European Union proposal > > http://codespeak.net/pypy/index.cgi?doc/funding/B1.0 > http://codespeak.net/pypy/index.cgi?doc/funding/B6.0 > I will have a read. > You are welcome to discuss stuff on e.g. the IRC channel #pypy > on freenode Nuts. Guess I can't keep my use of IRC down to PyCon discussions. =) > or on the mailing list > > http://codespeak.net/mailman/listinfo/pypy-dev > > in order to find out, if you'd like to join us and/or do some > interesting thesis. > Will do. Thanks, Holger. 
-Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:28:13 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:28:22 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031029163540.GA28700@mems-exchange.org> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <3FA0692D.30209@ocf.berkeley.edu> Neil Schemenauer wrote: > Hi Brett, > > Some ideas: > > * Finish off the AST compiler. Make it possible to manipulate > ASTs from Python and allow them to be fed to the compiler to > generate code. This is one half of macros for Python. The > other half is harder. > I actually wanted to originally do that, but there is no real research involved; it's just coding at this point, right? > * Build a refactoring code editor that works using the AST. > Would probably require the AST to be done. > * Implement an object system that supports multiple dispatch. > You can look at Dylan and Goo for ideas. > Huh, cool. Just looked at Dylan quickly. > * Optimize access to global variables and builtins. See PEP 267 for > some ideas. If we can disallow inter-module shadowing of names > the job becomes easier. Measure the performance difference. > ... and watch my head explode from reading the latest threads. =) Maybe, though. > * Look at making the GC mark-and-sweep. You will need to provide > it explicit roots. Is it worth doing? Mark-and-sweep would > require changes to extension modules since they don't expose > roots to the interpreter. > I don't know if it is worth it, although having so far two people suggest changing the GC to something else is interesting. > * More radically, look at Chicken [1] and its GC. Henry Baker's > "Cheney on the M.T.A" [2] is very clever, IMHO, and could be used > instead of Python's reference counting. Build a limited Python > interpreter based on this idea and evaluate it. > > > 1. http://www.call-with-current-continuation.org/chicken.html > 2.
http://citeseer.nj.nec.com/baker94cons.html I will have a read. Thanks, Neil. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:38:05 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 20:38:11 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: Message-ID: <3FA06B7D.2050607@ocf.berkeley.edu> Dennis Allison wrote: > How about re-engineering the interpreter to make it more MP friendly? > (This is probably a bigger task than a Masters thesis.) The current > interpreter serializes on the global interpreter lock (GIL) and blocks > everything. Is there another approach which would allow processing to > continue? Guido said once that there was an attempt to change the > granularity of the locking, but that it quickly became overly complex and > unstable. Perhaps some of Maurice Herlihy's ideas may be adapted to the > problem. Moreover, it may not be necessary that the interpreter state be > consistent and deterministic all the time as long as it eventually > produces the same answer as a deterministic equivalent. There may be > interpreter organizations which move forward optimistically, ignoring > potential locking problems and then (if necessary) recovering, and these > may have better performance than the more conservative ones. Or they may > not. Some kind of performance tests and evaluations would need to be > part of any such study. > As you said, Dennis, this might be too big for a masters thesis. But it definitely would be nice to have solved. I will definitely think about it. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 20:47:49 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Oct 29 20:55:49 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> Message-ID: <3FA06DC5.70407@ocf.berkeley.edu> Delaney, Timothy C (Timothy) wrote: >> From: Dennis Allison [mailto:allison@sumeru.stanford.EDU] >> >> How about re-engineering the interpreter to make it more MP >> friendly? (This is probably a bigger task than a Masters thesis.) >> The current interpreter serializes on the global interpreter lock >> (GIL) and blocks everything. > > > To me this would probably be the most interesting thing to tackle - > especially since it has been tried before with partial success but > overall failure. At the very least that gives a body of work which > you can refer to both as a starting point for your work, and to show > how your approach differs from and improves on existing work. > > It would also be of tremendous value to Python IMO if it could be > done without negatively impacting performance on single-processor > machines. > > Whether it is too large for a Masters thesis I don't know. Does a > Masters thesis require *success* in the stated goal? I've been > thinking about doing my own Masters in the not-too-distant future if > I can find the time ... > Success as in what you set out to do was actually beneficial? No, just as long as something is learned. Successful as actually finishing the darn thing? Yes. Basically a masters thesis needs to require some research, such as looking at other implementations, and some original thought if possible. 
The problem with a masters thesis, though, is that I have a fixed timeframe (want this done in about a year's time for doctoral school applications) and I don't get to spend a large portion of my time on it (I still have to take normal classes during this time, although I can finagle my schedule to minimize my work load). I will still consider this, though. -Brett From bac at OCF.Berkeley.EDU Wed Oct 29 21:01:56 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Oct 29 21:02:00 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3F9F1F82.2090209@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> Message-ID: <3FA07114.4050009@ocf.berkeley.edu> Just a quick "thank you!" to everyone who has emailed me, personally or publicly, with ideas. There have been a ton of great suggestions and I am going to seriously consider all of them. And this thanks stands indefinitely for any and all future emails on this subject. And please keep sending ideas! Even if I don't pick up on a certain idea maybe someone else will be inspired and decide to run with it or at least start a discussion on possible future improvements (there is always my doctoral thesis in a few years =). I can't believe I just said more discussion on this list was good that I know will most likely take on a life of their own. I guess I really do want to lose my 20/20 vision. =) I also think this thread is a testament to this community in general and this list specifically on how we help others when we can and in the nicest way possible. I have to admit I say with great pride that I am a part of this wonderful community. -Brett From greg at cosc.canterbury.ac.nz Wed Oct 29 21:30:18 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Wed Oct 29 21:31:28 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <087001c39e73$70333e60$0500a8c0@eden> Message-ID: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Neil Schemenauer: > Okay.
> Perhaps I am missing something but would fixing it be as
> simple as adding another field to the tp_as_buffer struct?
>
> /* references returned by the buffer functions are valid while
>  * the object remains alive */
> #define PyBuffer_FLAG_SAFE 1

That's completely different from what I had in mind, which was: (1) Keep a reference to the base object in the buffer object, and (2) Use the buffer API to fetch a fresh pointer from the base object each time it's needed. Is there some reason that still wouldn't be safe enough?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg@cosc.canterbury.ac.nz         +--------------------------------------+

From allison at sumeru.stanford.EDU Wed Oct 29 22:21:39 2003 From: allison at sumeru.stanford.EDU (Dennis Allison) Date: Wed Oct 29 22:24:01 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA07114.4050009@ocf.berkeley.edu> Message-ID: Brett -- You might put together a list of all the ideas (maybe even a ranked list) and post it as a unit to the list for archival purposes. Thanks. On Wed, 29 Oct 2003, Brett C. wrote: > > > Just a quick "thank you!" to everyone who has emailed me, personally or > publicly, with ideas. There have been a ton of great suggestions and I > am going to seriously consider all of them. And this thanks stands > indefinitely for any and all future emails on this subject. > > And please keep sending ideas! Even if I don't pick up on a certain > idea maybe someone else will be inspired and decide to run with it or at > least start a discussion on possible future improvements (there is > always my doctoral thesis in a few years =). I can't believe I just > said more discussion on this list was good that I know will most likely > take on a life of their own. I guess I really do want to lose my 20/20
=) > > I also think this thread is a testament to this community in general and > this list specifically on how we help others when we can and in the > nicest way possible. I have to admit I say with great pride that I am a > part of this wonderful community. > > -Brett > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/allison%40sumeru.stanford.edu > From guido at python.org Wed Oct 29 22:39:33 2003 From: guido at python.org (Guido van Rossum) Date: Wed Oct 29 22:39:54 2003 Subject: [Python-Dev] Needed: contractor to answer crypto questions Message-ID: <200310300339.h9U3dYP00412@12-236-54-216.client.attbi.com> I was approached by a legal firm with the questions below about Python's crypto capabilities, from the POV of a legal review of exporting software that embeds Python. I don't have time to research the answers myself (I'm no crypto expert). If you think you can answer the questions, please send me a price quote and I'll forward it to them. They'd like the answers ASAP. --Guido van Rossum (home page: http://www.python.org/~guido/) ------- Forwarded Message > > Hello Guido, [...] > > I understand Python is open source, but when open source code is > integrated in a commercial product, the owner of the commercial product > must include the open source code in their product analysis for U.S. > export classification purposes. Although as open source, Python falls > under an export control exception, this exception is lost once the code is > offered in a commercial product. > > I would appreciate your help in obtaining some additional technical > information in order to complete my export classification analysis. [...] > > 1. We have been advised the following encryption content is in Python. > We are looking for additional information regarding the encryption > content: > a. 
> The Rotor module, which implements a very ancient encryption algorithm based on the German Enigma. Please tell us the symmetric key length of the encryption contained within this module. Please also advise the asymmetric key exchange algorithm length.
> b. The wrapper module for Open SSL. Again, please tell us the symmetric key length of the encryption content contained within this module. Please also advise the asymmetric key exchange algorithm length.
> c. The following questions apply to both the Rotor module and the wrapper module:
> i. can the encryption function be directly accessed, or modified, by the end user?
> ii. Do either of these encryption components contain an "Open Cryptographic Interface" (an interface that is not fixed and permits a third party to insert encryption functionality)?
>
> The following chart is an example of the type of information I need to submit to the U.S. government. Would you be able to provide similar information regarding the encryption component(s) included within Python?
>
> EXAMPLE:
>
> Algorithm Source Key-min Key-max Modes
> RC2 OpenSSL 40 128 CBC, ECB, CFB, OFB
> ARC4 OpenSSL 40 128 N/A (stream encryption)
> DES OpenSSL 40 56 CBC, ECB, CFB, OFB
> DESX OpenSSL 168 168 CBC
> 3DES-2Key OpenSSL 112 112 CBC, ECB, CFB, OFB
> 3DES OpenSSL 168 168 CBC, ECB, CFB, OFB
> Blowfish OpenSSL 128 CBC, ECB, CFB, OFB
> Diffie-Hellman OpenSSL 192* 16384* Key-exchange, authentication
>
> DSA OpenSSL Digital Signature
> MD5 OpenSSL Integrity
> SHA-1 OpenSSL Integrity
> * No explicit limit, these appear to be the practical range of values.
[...]
------- End of Forwarded Message From nas-python at python.ca Wed Oct 29 23:15:08 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Wed Oct 29 23:13:40 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0692D.30209@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3FA0692D.30209@ocf.berkeley.edu> Message-ID: <20031030041508.GA31371@mems-exchange.org> On Wed, Oct 29, 2003 at 05:28:13PM -0800, Brett C. wrote: > Neil Schemenauer wrote: > > * Finish off the AST compiler. > > I actually wanted to originally do that, but there is no real research > involved; it's just coding at this point, right? Right. It's a prerequisite to doing real research. See Jeremy's web log. If you don't want to finish the AST compiler you could just use the Python implementation. It would be slow but good enough for testing ideas. > Huh, cool. Just looked at Dylan quickly. The reference manual is good reading: http://www.gwydiondylan.org/drm/drm_1.htm Some of the parts I like are the builtin classes (numbers and sealing especially) and the collection protocols. The module and library system is also interesting (although overkill for many programs). > > * Look at making the GC mark-and-sweep. > > I don't know if it is worth it, although having so far two people > suggest changing the GC to something else is interesting. Implementing yet another M&S GC is not research, IMHO. What _would_ be interesting is comparing the performance of reference counting and a mark and sweep collector. CPU, cache and memory speeds have changed quite dramatically. Also, comparing how easily the runtime can be integrated with the rest of the world (e.g. C libraries) would also be valuable. That said, I'm not sure it's worth it either. I find the Chicken GC more interesting and would look into that further if I had the time.
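[The qualitative difference Neil is pointing at, prompt reclamation under CPython's reference counting versus deferred reclamation of cycles by the tracing collector, can be observed directly from Python; a small sketch using the gc and weakref modules:]

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # isolate refcounting from the cycle collector

# Acyclic garbage dies the instant its refcount hits zero.
n = Node()
r = weakref.ref(n)
del n
assert r() is None

# A reference cycle keeps itself alive under pure refcounting...
a, b = Node(), Node()
a.other, b.other = b, a
ra = weakref.ref(a)
del a, b
assert ra() is not None

# ...until a tracing (mark-style) collection runs.
gc.collect()
assert ra() is None
gc.enable()
```

This behaviour is CPython-specific; a runtime with a pure tracing collector would reclaim both cases only at collection time.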
Neil From jeremy at alum.mit.edu Wed Oct 29 23:24:10 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Wed Oct 29 23:26:47 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> Message-ID: <1067487850.24165.53.camel@localhost.localdomain> On Wed, 2003-10-29 at 13:48, Phillip J. Eby wrote: > At 06:33 PM 10/29/03 +0000, Michael Hudson wrote: > >"Phillip J. Eby" writes: > > > > > * Direct use of positional arguments on the stack as the "locals" of > > > the next function called, without creating (and then unpacking) an > > > argument tuple, in the case where there are no */** arguments > > > provided by the caller. > > > >Already done, unless I misunderstand your idea. Well, the arguments > >might still get copied into the new frame's locals area but I'm pretty > >sure no tuple is involved. > > Hm. I thought that particular optimization only could take place when the > function lacks default arguments. But maybe I've misread that part. If > it's true in all cases, then argument tuple creation isn't where the > overhead is coming from. There is an optimization that depends on having no default arguments (or keyword arguments or free variables). It copies the arguments directly from the caller's frame into the callee's frame without creating an argument tuple. It's interesting to avoid the copy from caller to callee, but I don't think it's a big cost relative to everything else we're doing to set up a frame for calling. (I expect the number of arguments is usually small.) You would need some way to encode what variables are loaded from the caller stack and what variables are loaded from the current frame. Either a different opcode or some kind of flag in the current LOAD/STORE argument. 
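[The relative costs being discussed here can be reproduced from pure Python with timeit: a plain call builds a fresh frame, while resuming a generator reuses its existing one. Absolute numbers vary by machine and version, so treat this only as a way to observe the gap, not as a benchmark.]

```python
import timeit

setup = """
def f(a, b, c):
    return a

def gen():
    while True:
        yield None

g = gen()
next(g)
"""

call_time = timeit.timeit("f(1, 2, 3)", setup=setup, number=100000)
resume_time = timeit.timeit("next(g)", setup=setup, number=100000)

# Both measurements are wall-clock seconds; resuming the generator
# skips most of the per-call frame construction work.
print("call:  ", call_time)
print("resume:", resume_time)
assert call_time > 0 and resume_time > 0
```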
One other possibility for optimization is this XXX comment in fast_function():

    /* XXX Perhaps we should create a specialized PyFrame_New() that doesn't
       take locals, but does take builtins without sanity checking them. */
    f = PyFrame_New(tstate, co, globals, NULL);

PyFrame_New() does a fair amount of work that is unnecessary in the common case. Jeremy From python at rcn.com Thu Oct 30 00:19:38 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 30 00:21:28 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310281742.h9SHgGt29384@12-236-54-216.client.attbi.com> Message-ID: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> [Guido's code]

> unsorted = (1, 10, 2)
> print MagicList.sorted(unsorted)
> print MagicList(unsorted).sorted()
> print SubClass.sorted(unsorted)
> print SubClass(unsorted).sorted()

Notwithstanding the "perverted" implementation, Alex's idea is absolutely wonderful and addresses a core usability issue with classmethods. If only in the C API, I would like to see just such a universalmethod alternative to classmethod. That would allow different behaviors to be assigned depending on how the method is called.
Both list.sort() and dict.fromkeys() would benefit from it:

class MagicDict(dict):

    def _class_fromkeys(cls, lst, value=True):
        "Make a new dict using keys from list and the given value"
        obj = cls()
        for elem in lst:
            obj[elem] = value
        return obj

    def _inst_fromkeys(self, lst, value=True):
        "Update an existing dict using keys from list and the given value"
        for elem in lst:
            self[elem] = value
        return self

    newfromkeys = MagicDescriptor(_class_fromkeys, _inst_fromkeys)

print MagicDict.newfromkeys('abc')
print MagicDict(a=1, d=2).newfromkeys('abc')

An alternative implementation is to require only one underlying function and to have it differentiate the cases based on obj and cls:

class MagicDict(dict):

    def newfromkeys(obj, cls, lst, value=True):
        if obj is None:
            obj = cls()
        for elem in lst:
            obj[elem] = value
        return obj

    newfromkeys = universalmethod(newfromkeys)

Raymond Hettinger From bac at OCF.Berkeley.EDU Thu Oct 30 00:30:56 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 00:31:01 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: References: Message-ID: <3FA0A210.10605@ocf.berkeley.edu> Dennis Allison wrote: > Brett -- > > You might put together a list of all the ideas (maybe even a ranked list) > and post it as a unit to the list for archival purposes. Thanks. > Way ahead of you, Dennis. I have already started to come up with a reST doc for writing up all of these suggestions. It just might be a little while before I get it up since I will need to do some preliminary research on each idea to gauge how much work each will be. -Brett From guido at python.org Thu Oct 30 00:31:01 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 00:32:00 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 00:19:38 EST."
<001501c39ea5$6a47f900$45ba2c81@oemcomputer> References: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> Message-ID: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> > Notwithstanding the "perverted" implementation, Alex's idea is > absolutely wonderful and addresses a core usability issue with > classmethods. I'm not so sure. I think the main issue is that Python users aren't used to static methods; C++ and Java users should be familiar with them and I don't think they cause much trouble there. > If only in the C API, I would like to see just such a universalmethod > alternative to classmethod. That would allow different behaviors to be > assigned depending on how the method is called. > > Both list.sort() and dict.fromkeys() would benefit from it:
>
> class MagicDict(dict):
>
>     def _class_fromkeys(cls, lst, value=True):
>         "Make a new dict using keys from list and the given value"
>         obj = cls()
>         for elem in lst:
>             obj[elem] = value
>         return obj
>
>     def _inst_fromkeys(self, lst, value=True):
>         "Update an existing dict using keys from list and the given value"
>         for elem in lst:
>             self[elem] = value
>         return self
>
>     newfromkeys = MagicDescriptor(_class_fromkeys, _inst_fromkeys)
>
> print MagicDict.newfromkeys('abc')
> print MagicDict(a=1, d=2).newfromkeys('abc')

But your _inst_fromkeys mutates self! That completely defeats the purpose (especially since it also returns self) and I am as much against this (approx. -1000 :-) as I am against sort() returning self. To me this pretty much proves that this is a bad idea; such a schizo method will confuse users more than a class method that ignores the instance. And if you made an honest mistake, and meant to ignore the instance, it still proves that this is too confusing to do! :-) --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Oct 30 00:39:41 2003 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Oct 30 00:39:48 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <20031030041508.GA31371@mems-exchange.org> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3FA0692D.30209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> Message-ID: <3FA0A41D.2070601@ocf.berkeley.edu> Neil Schemenauer wrote: > On Wed, Oct 29, 2003 at 05:28:13PM -0800, Brett C. wrote: > >>Neil Schemenauer wrote: >> >>> * Finish off the AST compiler. >> >>I actually wanted to originally do that, but there is no real research >>involved; it's just coding at this point, right? > > > Right. It's a prerequisite to doing real research. See Jeremy's web > log. If you don't want to finish the AST compiler you could just > use the Python implementation. It would be slow but good enough for > testing ideas. > Yeah, I read that. Too bad I can't finish the AST branch *and* do something with it. > >>Huh, cool. Just looked at Dylan quickly. > > > The reference manual is good reading: > > http://www.gwydiondylan.org/drm/drm_1.htm > > Some of the parts I like are the builtin classes (numbers and > sealing especially) and the collection protocols. The module and > library system is also interesting (although overkill for many > programs). > So many languages to learn! Happen to have a book recommendation? > >>> * Look at making the GC mark-and-sweep. >> >>I don't know if it is worth it, although having so far two people >>suggest changing the GC to something else is interesting. > > > Implementing yet another M&S GC is not research, IMHO. What > _would_ be interesting is comparing the performance of reference > counting and a mark and sweep collector. CPU, cache and memory > speeds have changed quite dramatically. Also, comparing how easily > the runtime can be integrated with the rest of the world (e.g. C > libraries) would also be valuable. > That is a possibility.
Depends if anyone else has done a comparison lately. Seems like this may have been done to death, though. > That said, I'm not sure it's worth it either. I find the Chicken GC > more interesting and would look into that further if I had the time. > I just like the name. =) That and the title of that paper, "Cheney on the M.T.A", cause the humorist in me to want to look at this further, so I will definitely be reading that paper. -Brett From python at rcn.com Thu Oct 30 00:49:53 2003 From: python at rcn.com (Raymond Hettinger) Date: Thu Oct 30 00:50:50 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> Message-ID: <000501c39ea9$a414e400$8cb6958d@oemcomputer> [GvR] > But your _inst_fromkeys mutates self! That issue wasn't intrinsic to the proposal. The implementation should have read:

class MagicDict(dict):

    def newfromkeys(obj, cls, lst, value=True):
        "Returns a new MagicDict with the keys in lst set to value"
        if obj is not None:
            cls = obj.__class__
        newobj = cls()
        for elem in lst:
            newobj[elem] = value
        return newobj

    newfromkeys = universalmethod(newfromkeys)

Universal methods give the method a way to handle the two cases separately. This provides both the capability to make an instance from scratch and to copy one off an existing instance. Your example was especially compelling:

a = [3,2,1]
print a.sorted()
print list.sorted(a)

Raymond Hettinger From martin at v.loewis.de Thu Oct 30 02:44:42 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Oct 30 02:44:52 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA063BF.6050207@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <3FA063BF.6050207@ocf.berkeley.edu> Message-ID: <3FA0C16A.8030203@v.loewis.de> Brett C.
wrote: >> - floating point: provide IEEE-794 (or some such) in a portable >> yet efficient way > > > You mean like how we have longs? So code up in C our own way of storing > 794 independent of the CPU? Not longs, but floats. And you would not attempt to store it independent of the CPU, but instead, you would make as much use of the CPU as possible, and only implement things in C that the CPU gets wrong. The portion of emulation would vary from CPU to CPU. As a starting point, you might look at the Java strictfp mode (which was only added after the initial Java release). Java 1.0 was where Python is today: expose whatever the platform provides. In Java, they have the much stronger desire to provide bit-for-bit reproducibility on all systems, so they added strictfp as a trade-off of performance vs. write-once-run-anywhere. >> - deterministic finalization: provide a way to get objects destroyed >> implicitly at certain points in control flow; a use case would be >> thread-safety/critical regions > > > I think you get what you mean by this, but I am not totally sure since I > can't come up with a use beyond threads killing themselves properly when > the whole program is shutting down. Not at all. In Python, you currently do

def bump_counter(self):
    self.mutex.acquire()
    try:
        self.counter = self.counter+1
        more_actions()
    finally:
        self.mutex.release()

In C++, you do

void bump_counter(){
    MutexAcquisition acquire(this);
    this->counter+=1;
    more_actions();
}

I.e. you can acquire the mutex at the beginning (as a local object), and it gets destroyed automatically at the end of the function. So they have the "resource acquisition is construction, resource release is destruction" design pattern. This is very powerful and convenient, and works almost in CPython, but not in Python - as there is no guarantee when objects get destroyed. > Have no clue what this is since I don't know C#. Almost sounds like > Michael's def func() [] proposal at the method level.
Or just a lot of descriptors. =) Yes, the func()[] proposal might do most of it. However, I'm uncertain whether it puts in place all pieces of the puzzle - one would actually have to try to use that stuff to see whether it really works sufficiently. You would have to set goals first (what is it supposed to do), and then investigate whether these things can actually be done with it. As I said: static, class, synchronized, final methods might all be candidates; perhaps along with some of the .NET features, like security evidence check (caller must have permission to write files in order to call this method), webmethod (method is automagically exposed as a SOAP/XML-RPC method), etc. Regards, Martin From martin at v.loewis.de Thu Oct 30 02:51:19 2003 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Oct 30 02:51:42 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA06DC5.70407@ocf.berkeley.edu> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> <3FA06DC5.70407@ocf.berkeley.edu> Message-ID: <3FA0C2F7.40409@v.loewis.de> Brett C. wrote: >> Whether it is too large for a Masters thesis I don't know. Does a >> Masters thesis require *success* in the stated goal? I've been >> thinking about doing my own Masters in the not-too-distant future if >> I can find the time ... >> > > > Success as in what you set out to do was actually beneficial? No, just > as long as something is learned. Successful as actually finishing the > darn thing? Yes. He actually meant "success in the stated goal". I.e. if you go out to implement free threading, would it be considered a failure of the Master's project if you come back and say: "I did not actually do that"? My answer is "it depends": If you did not do that, and, for example, explain why it *can't* be done, then this is a good thesis, provided you give qualified scientific rationale for why it can't be done.
If you say you did not do it, but it could be done in this and that way if you had 50 person years available, then this could be a good thesis as well, provided the strategy you outline, and the rationale for computing the 50 person years, is convincing. If you just say, "Oops, I did not finish it because it is too much work", then this would be a bad thesis. Regards, Martin From s.keim at laposte.net Thu Oct 30 03:18:37 2003 From: s.keim at laposte.net (s.keim) Date: Thu Oct 30 03:19:38 2003 Subject: [Python-Dev] Buffer object API Message-ID:

> Greg Ewing:
> That's completely different from what I had in mind, which was:
>
> (1) Keep a reference to the base object in the buffer object, and
>
> (2) Use the buffer API to fetch a fresh pointer from the
> base object each time it's needed.
>
> Is there some reason that still wouldn't be safe enough?

I don't know if this can help, but I once created an object with this behaviour; you can get it at: http://s.keim.free.fr/mem/ (see the memslice module) From my experience, this solves all the problems caused by the buffer object. Sébastien Keim From aleaxit at yahoo.com Thu Oct 30 03:59:33 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 03:59:39 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <000501c39ea9$a414e400$8cb6958d@oemcomputer> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> Message-ID: <200310300959.33587.aleaxit@yahoo.com> On Thursday 30 October 2003 06:49 am, Raymond Hettinger wrote: ... > Universal methods give the method a way to handle the two > cases separately. This provides both the capability to make > an instance from scratch or to copy it off an existing instance. >
> Your example was especially compelling:
>
> a = [3,2,1]
> print a.sorted()
> print list.sorted(a)

Actually, yes, it IS compelling indeed. Funny -- I was originally just brainstorming / musing out loud, never thought about this as a "real thing".
But now that it's matured a bit, I do feel sure -- from past experience with what puzzles Python newbies depending on their previous programming knowledge or lack thereof -- that if we had this it would *seriously* reduce the number of puzzlements we have to solve on c.l.py or help@python.org. Which IS strange, in a way, because I do not know of any existing language that has exactly this kind of construct -- a factory callable that you can call on either a type or an instance with good effect. Yet despite it not being previously familiar it DOES "feel natural". Of course, the 3 lines above would also work if sorted was an ordinary instancemethod, but that's just because a is a list instance; if we had some other sequence, say a generator expression, print list.sorted(x*x for x in a) would be just as sweet, and _that_ is the compelling detail IMHO. Trying to think of precedents: Numeric and gmpy have quite a few things like that, except they're (by necessity of the age of gmpy and Numeric) ordinary module functions AND instance methods. E.g.:

>>> gmpy.sqrt(33)
mpz(5)
>>> gmpy.mpz(33).sqrt()
mpz(5)
>>> gmpy.fsqrt(33)
mpf('5.74456264653802865985e0')
>>> gmpy.mpf(33).sqrt()
mpf('5.74456264653802865985e0')

as a module function, sqrt is the truncating integer square root, which is also a method of mpz instances (mpz being the integer type in gmpy). mpf (the float type in gmpy) has a sqrt method too, which is nontruncating -- that is also a module function, but, as such, it needs to be called fsqrt (sigh). I sure _would_ like to expose the functionality as mpf.sqrt(x) and mpz.sqrt(x) [would of course be more readable with other typenames than those 'mpf' and 'mpz', but that's another issue, strictly a design mistake of mine].
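The truncating vs. non-truncating split Alex describes for gmpy.sqrt/gmpy.fsqrt has a direct standard-library analogue today (an editorial aside in modern Python; math.isqrt was added in Python 3.8):

```python
import math

# gmpy.sqrt / mpz.sqrt truncate to the integer square root;
# gmpy.fsqrt / mpf.sqrt return the floating-point value.
print(math.isqrt(33))  # 5
print(math.sqrt(33))   # roughly 5.7446
```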
Alex From aleaxit at yahoo.com Thu Oct 30 04:05:31 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 04:06:03 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0A41D.2070601@ocf.berkeley.edu> References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> <3FA0A41D.2070601@ocf.berkeley.edu> Message-ID: <200310301005.31669.aleaxit@yahoo.com> On Thursday 30 October 2003 06:39 am, Brett C. wrote: ... > >>Huh, cool. Just looked at Dylan quickly. ... > So many languages to learn! Happen to have a book recommendation? Besides the reference manual, which is also available as a book, there's a good book called "Dylan Programming", Addison-Wesley, Feinberg et al. There's a firm somewhat misleadingly called something like "functional programming" (misleadingly because Dylan's not a FP language...) which focuses on Dylan and used to have both books (reference and Feinberg) in stock and available for decently discounted prices, too. Alex From aleaxit at yahoo.com Thu Oct 30 04:24:09 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 04:24:16 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> References: <001501c39ea5$6a47f900$45ba2c81@oemcomputer> <200310300531.h9U5V1F00615@12-236-54-216.client.attbi.com> Message-ID: <200310301024.09591.aleaxit@yahoo.com> On Thursday 30 October 2003 06:31 am, Guido van Rossum wrote: > > Notwithstanding the "perverted" implementation, Alex's idea is > > absolutely wonderful and addresses a core usability issue with > > classmethods. > > I'm not so sure. I think the main issue is that Python users aren't > used to static methods; C++ and Java users should be familiar with > them and I don't think they cause much trouble there. "Yes, but". The ability to call something on either the class OR the instance IS a Python hallmark... 
with the constraint that when you call it on the class you need to provide an instance as the first arg (assuming the "something" is a normal instance method, which is surely the most frequent case). You could see universalmethods as being special only in that they WEAKEN this constraint -- they let the 1st arg be EITHER an instance OR something from which a new instance can be naturally constructed. Use cases: in gmpy: if I had universal methods, current instancemethods mpf.sqrt and mpz.sqrt (multiprecision floatingpoint and integer/truncating square roots respectively) could also be called quite naturally as mpf.sqrt(33) and mpz.sqrt(33) respectively. Right now you have to use, instead, mpf(33).sqrt() or mpz(33).sqrt(), which is QUITE a bit more costly because the instance whose sqrt you're taking gets built then immediately discarded (and building mpf -- and to a lesser extent mpz -- instances is a bit costly); OR you can call module functions gmpy.sqrt(33), truncating sqrt, or gmpy.fsqrt(33), nontruncating (returning a multiprecision float). Just one example -- gmpy's chock full of things like this, which universalmethod would let me organize a bit better. in Numeric: lots of module-functions take an arbitrary iterable, build an array instance from it if needed, and operate on an array instance to return a result. This sort-of-works basically because Numeric has "one main type" and thus the issue of "which type are we talking about" doesn't arise (gmpy has 3 types, although mpz takes the lion's share, making things much iffier). But still, Numeric newbies (if they come from OO languages rather than Fortran) DO try calling e.g. x.transpose() for some array x rather than the correct Numeric.transpose(x) -- and in fact array.transpose, called on the class, would ALSO be a perfectly natural approach.
universalmethod would allow array instances to expose such useful functionality as instance methods AND also allow applying direct operations -- without costly construction of intermediate instances to be thrown away at once -- via "array.transpose" and the like. It's not really about new revolutionary functionality: it's just a neater way to "site" existing functionality. This isn't surprising if you look at universalmethod as just a relaxation of the normal constraint "when you call someclass.somemethod(x, ... -- x must be an instance of someclass" into "x must be an instance of someclass OR -- if the somemethod supports it -- something from which such an instance could be constructed in one obvious way". Then, basically, the call is semantically equivalent to someclass(x).somemethod(... BUT the implementation has a chance to AVOID costly construction of an instance for the sole purpose of calling somemethod on it and then throwing away the intermediate instance at once. No revolution, but, I think, a nice little addition to our armoury. Alex From mwh at python.net Thu Oct 30 06:10:20 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:10:44 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> (Phillip J. Eby's message of "Wed, 29 Oct 2003 13:48:00 -0500") References: <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029131118.030c1770@telecommunity.com> <5.1.1.6.0.20031029133413.020105e0@telecommunity.com> Message-ID: <2mptgfj80z.fsf@starship.python.net> "Phillip J. Eby" writes: > At 06:33 PM 10/29/03 +0000, Michael Hudson wrote: >>"Phillip J. Eby" writes: >> >> > * Direct use of positional arguments on the stack as the "locals" of >> > the next function called, without creating (and then unpacking) an >> > argument tuple, in the case where there are no */** arguments >> > provided by the caller. 
>> >>Already done, unless I misunderstand your idea. Well, the arguments >>might still get copied into the new frame's locals area but I'm pretty >>sure no tuple is involved. > > Hm. I thought that particular optimization only could take place when > the function lacks default arguments. But maybe I've misread that > part. If it's true in all cases, then argument tuple creation isn't > where the overhead is coming from. I hadn't realized/had forgotten that this optimization depended on the lack of default arguments. Instinct would say that it shouldn't be *too* hard to extend to that case (hardly a thesis topic, at any rate :-). Cheers, mwh -- The only problem with Microsoft is they just have no taste. -- Steve Jobs, (From _Triumph of the Nerds_ PBS special) and quoted by Aahz Maruch on comp.lang.python From mwh at python.net Thu Oct 30 06:16:51 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:16:54 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <200310301005.31669.aleaxit@yahoo.com> (Alex Martelli's message of "Thu, 30 Oct 2003 10:05:31 +0100") References: <3F9F1F82.2090209@ocf.berkeley.edu> <20031030041508.GA31371@mems-exchange.org> <3FA0A41D.2070601@ocf.berkeley.edu> <200310301005.31669.aleaxit@yahoo.com> Message-ID: <2mllr3j7q4.fsf@starship.python.net> Alex Martelli writes: > On Thursday 30 October 2003 06:39 am, Brett C. wrote: > ... >> >>Huh, cool. Just looked at Dylan quickly. > ... >> So many languages to learn! Happen to have a book recommendation? > > Besides the reference manual, which is also available as a book, there's > a good book called "Dylan Programming", Addison-Wesley, Feinberg et al. > > There's a firm somewhat misleadingly called something like "functional > programming" (misleadingly because Dylan's not a FP language...) which > focuses on Dylan and used to have both books (reference and Feinberg) > in stock and available for decently discounted prices, too. 
It was called "Functional Objects" -- and still is (I thought it was defunct). http://www.functionalobject.com Cheers, mwh -- This is an off-the-top-of-the-head-and-not-quite-sober suggestion, so is probably technically laughable. I'll see how embarassed I feel tomorrow morning. -- Patrick Gosling, ucam.comp.misc From mwh at python.net Thu Oct 30 06:18:35 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 06:18:39 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0A210.10605@ocf.berkeley.edu> (Brett C.'s message of "Wed, 29 Oct 2003 21:30:56 -0800") References: <3FA0A210.10605@ocf.berkeley.edu> Message-ID: <2mhe1rj7n8.fsf@starship.python.net> "Brett C." writes: > Dennis Allison wrote: > >> Brett -- >> You might put together a list of all the ideas (maybe even a ranked >> list) >> and post it as a unit to the list for archival purposes. Thanks. >> > > Way ahead of you, Dennis. I have already started to come up with a > reST doc for writing up all of these suggestions. It just might be a > little while before I get it up since I will need to do some > preliminary research on each idea to measure the amount of work they > will be. Could go on the Python Wiki? I take it from your posting of last week that you've thought about other ways of implementing exception handling? I guess a non-reference count based GC is a prerequisite for that... Cheers, mwh -- >> REVIEW OF THE YEAR, 2000 << It was shit. Give us another one. 
-- NTK Now, 2000-12-29, http://www.ntk.net/ From pedronis at bluewin.ch Thu Oct 30 07:43:04 2003 From: pedronis at bluewin.ch (Samuele Pedroni) Date: Thu Oct 30 07:41:23 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0692D.30209@ocf.berkeley.edu> References: <20031029163540.GA28700@mems-exchange.org> <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <5.2.1.1.0.20031030132333.028a56b0@pop.bluewin.ch> At 17:28 29.10.2003 -0800, Brett C. wrote: >> * Implement an object system that supports multiple dispatch. >> You can look at Dylan and Goo for ideas. > >Huh, cool. Just looked at Dylan quickly. some bits on this: implementing one is probably not too hard, apart from optimization, but possible/relevant directions are also then:

- integration with the preexisting Python semantics

- reflection. All of CLOS, Dylan, and Goo come with a rather low-level flavor of reflection; in contrast Python has a rather natural one. Once you have mmd, what kind of idioms using reflection can you think of, and how to best offer/package reflection for the language user?

- multi methods cover some ground also covered by interfaces and adaptation:
  *) a generic function/multi method is also an interface
  *) some of the things you can achieve with adaptation can be done with multi methods

Once you have multimethods, do you still need adaptation in some cases, or could one obtain the functionality otherwise, or do you need dispatch on interfaces (not just classes)? How would interfaces then look, and the dispatch on them?
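The multiple-dispatch idea Samuele sketches can be made concrete in a few lines. This is a hypothetical exact-match registry in modern Python syntax, not the CLOS/Dylan/Goo machinery; a serious implementation would at least walk each argument's MRO instead of requiring exact type matches:

```python
class multimethod:
    """Dispatch a call on the concrete types of all its arguments."""
    def __init__(self):
        self.registry = {}

    def register(self, *types):
        def decorator(func):
            self.registry[types] = func
            return func
        return decorator

    def __call__(self, *args):
        # Exact-match lookup keyed by the tuple of argument types.
        func = self.registry.get(tuple(type(a) for a in args))
        if func is None:
            raise TypeError("no applicable method")
        return func(*args)

collide = multimethod()

@collide.register(int, int)
def _(a, b):
    return "int/int"

@collide.register(int, str)
def _(a, b):
    return "int/str"

print(collide(1, 2))    # int/int
print(collide(1, "x"))  # int/str
```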
(Cecil type system and predicate dispatch would be things to look at, for example) Samuele From mcherm at mcherm.com Thu Oct 30 08:01:18 2003 From: mcherm at mcherm.com (Michael Chermside) Date: Thu Oct 30 08:01:22 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <1067518878.3fa10b9e91afb@mcherm.com> Raymond writes: > If only in the C API, I would like to see just such a universalmethod > alternative to classmethod. That would allow different behaviors to be > assigned depending on how the method is called. And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES of Python is that all calls are "the same". Calling a function works a particular way. Calling a callable object does the same, although the arguments are passed to __call__. Calling a classmethod does the same thing. Calling a bound method does the same thing except that the first argument is curried. Here in the Python community, we think it is a good thing that one must explicitly name "self" as an argument to methods, and that any function CAN be made a method of objects. Now you're proposing a special situation, where what appears to be a single attribute of a class object is actually TWO functions... two functions that have the same name but subtly different behavior. Right now, I presume that if I have:

a = A()                  # a is an instance of A
x = a.aMethod('abc')     # this is line 1
y = A.aMethod(a, 'abc')  # this is line 2

that line 1 and line 2 do the same thing. This is a direct consequence of the fact that methods in Python are just functions with an instance as the first argument. But your "universalmethod" would break this. It might be worth breaking it, if the result is some *very* readable code in a whole variety of situations. And it's certainly okay for Guido to manually create a class which behaves this way via black magic (and for most users, descriptor tricks are black magic).
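Michael's line-1/line-2 equivalence is easy to check directly (class A and aMethod are his hypothetical names; in modern Python the A.aMethod form is just a plain function, so the equivalence still holds):

```python
class A:
    def aMethod(self, s):
        return s.upper()

a = A()
x = a.aMethod('abc')     # "line 1": bound-method call
y = A.aMethod(a, 'abc')  # "line 2": plain function call, self passed explicitly
print(x, y)  # ABC ABC
assert x == y
```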
But to make it a regular and supported idiom, I'd want to see much better evidence that it's worthwhile, because there's an important principle at risk here, and I wouldn't want to trade away the ability to explain "methods" in two sentences:

    A 'method' is just a function whose first argument is 'self'. The
    method is an attribute of the class object, and when it is called
    using "a.method(args)", the instance 'a' is passed as 'self'.

for a cute way of making double use of a few factory functions. -- Michael Chermside From Boris.Boutillier at arteris.net Thu Oct 30 08:06:23 2003 From: Boris.Boutillier at arteris.net (Boris Boutillier) Date: Thu Oct 30 08:06:31 2003 Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ? In-Reply-To: <3F9F7A92.1050800@arteris.net> References: <3F9F7A92.1050800@arteris.net> Message-ID: <3FA10CCF.5020004@arteris.net> No answers on this? I posted the question two times on c.l.py and got no answers; help would be appreciated. Boris Boris Boutillier wrote: > Hi all, > > I've posted this question to the main python list, but got no answers, > and I didn't see the issue arise on Python-dev (but I subscribed only > two weeks ago). > It concerns problems with the Py_TPFLAGS_HEAPTYPE and the new > 'hackcheck' in python 2.3. > > I'm writing a C-extension module for python 2.3. > I need to declare a new class, MyClass. > For this class I want two things : > 1) redefine the setattr function on objects of this class > (ie setting a new tp_setattro) > 2) I want that the python user can change attributes on MyClass (the > class itself). > > Now I have a conflict on the Py_TPFLAGS_HEAPTYPE with new Python 2.3. > If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have a problem with the > new hackcheck (Object/typeobject.c:3631), as I am a HEAPTYPE but I also > redefine tp_setattro. > If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on > my class because of a check in type_setattro (Object/typeobject.c:2047).
> The only solution I've got without modifying python source is to > create a specific Metaclass for Myclass, and write the tp_setattr. > But I don't like the idea of making a copy-paste of the type_setattr > source code, just to remove a check, this is not great for future > compatibility with python (at each revision of Python I have to check > if type_setattr has not changed to copy-paste the changes). > In fact I'm really wondering what's the real meaning of this flag, > but I think there is some history behind it. > > If you think this is not the right place for this question, just > ignore it, and sorry for the disturbance. > > Boris > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/boris.boutillier%40arteris.net > From mwh at python.net Thu Oct 30 08:11:45 2003 From: mwh at python.net (Michael Hudson) Date: Thu Oct 30 08:11:48 2003 Subject: [Python-Dev] Py_TPFLAGS_HEAPTYPE, what's its real meaning ? In-Reply-To: <3FA10CCF.5020004@arteris.net> (Boris Boutillier's message of "Thu, 30 Oct 2003 14:06:23 +0100") References: <3F9F7A92.1050800@arteris.net> <3FA10CCF.5020004@arteris.net> Message-ID: <2mn0bi98fi.fsf@starship.python.net> Boris Boutillier writes: > No answers on this? I posted the question two times on c.l.py and got > no answers; help would be appreciated. I answered, on comp.lang.python. I didn't say anything especially useful, though. > Boris Boutillier wrote: > >> Hi all, >> >> I've posted this question to the main python list, but got no >> answers, and I didn't see the issue arise on Python-dev (but I >> subscribed only two weeks ago). >> It concerns problems with the Py_TPFLAGS_HEAPTYPE and the new >> 'hackcheck' in python 2.3. >> >> I'm writing a C-extension module for python 2.3. >> I need to declare a new class, MyClass.
>> For this class I want two things : >> 1) redefine the setattr function on objects of this class >> (ie setting a new tp_setattro) >> 2) I want that the python user can change attributes on MyClass (the >> class itself). >> >> Now I have a conflict on the Py_TPFLAGS_HEAPTYPE with new Python 2.3. >> If I have Py_TPFLAGS_HEAPTYPE set on MyClass, I'll have problem with the >> new hackcheck (Object/typeobject.c:3631), as I am a HEAPTYPE but I also >> redefine tp_setattro. >> If I don't have Py_TPFLAGS_HEAPTYPE, the user can't set new attributes on >> my class because of a check in type_setattro (Object/typeobject.c:2047). >> >> The only solution I've got without modifying python source is to >> create a specific Metaclass for Myclass, and write the tp_setattr. I think this is the appropriate solution: your type object is *not* a heap type (i.e. is not allocated on the heap) and you want to influence what happens when you set an attribute on it. Cheers, mwh -- I'd certainly be shocked to discover a consensus. ;-) -- Aahz, comp.lang.python From aleaxit at yahoo.com Thu Oct 30 08:54:48 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 08:54:55 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <1067518878.3fa10b9e91afb@mcherm.com> References: <1067518878.3fa10b9e91afb@mcherm.com> Message-ID: <200310301454.48290.aleaxit@yahoo.com> On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: > Raymond writes: > > If only in the C API, I would like to see just such a universalmethod > > alternative to classmethod. That would allow different behaviors to be > > assigned depending on how the method is called. > > And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES > of Python is that all calls are "the same". Calling a function works A new descriptortype wouldn't change the ``all the same'' idea at the level at which descriptortypes such as staticmethod and classmethod haven't changed it. 
> a particular way. Calling a callable object does the same, although > the arguments are passed to __call__. Calling a classmethod does the > same thing. Calling a bound method does the same thing except that the > first argument is curried. Here in the Python community, we think it > is a good thing that one must explicitly name "self" as an argument to > methods, and that any function CAN be made a method of objects. Nothing of this would change. Just consider calling a method on the class or on an instance for various descriptortypes:

for staticmethod:
    aclass.foo()        # ignores the exact classobject
    aninst.foo()        # ignores the instanceobject & its exact class

for classmethod:
    aclass.bar()        # passes the exact classobject
    aninst.bar()        # ignores the instanceobject, passes its __class__

for functions (which are also descriptors):
    aclass.baz(aninst)  # must explicitly pass an instance of aclass
    aninst.baz()        # passes the instance

so, we do have plenty of variety today, right? Consider the last couple in particular (it's after all the most common one): you have the specific constraint that aninst MUST be an instance of aclass. So what we're proposing is JUST a descriptortype that will relax the latter constraint: aninst.wee() passes the instance (just like the latter couple), aclass.wee(beep) does NOT constrain beep to be an instance of a class but is more relaxed, allowing the code of 'wee' to determine what it needs, what it has received, etc -- just like in about ALL cases of Python calls *except* "aclass.baz(aninst)" which is an exceptional case in which Python itself does (enforced) typechecking for you. So what's so bad about optionally being able to do w/o that typechecking? 
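[Editorial note: the three bindings Alex lists can be checked with a small sketch. The names Widget/foo/bar/baz are illustrative, and decorator spelling from later Pythons is used; note that the unbound-call typecheck he refers to for aclass.baz(aninst) applies to Python 2.x and was later dropped in Python 3.]

```python
class Widget(object):
    @staticmethod
    def foo():              # ignores both the class and any instance
        return "foo"

    @classmethod
    def bar(cls):           # always receives the class object
        return ("bar", cls)

    def baz(self):          # plain function: receives the instance
        return ("baz", self)

w = Widget()
assert Widget.foo() == "foo" and w.foo() == "foo"
assert Widget.bar() == ("bar", Widget)
assert w.bar() == ("bar", Widget)     # instance ignored, its __class__ passed
assert w.baz() == ("baz", w)
assert Widget.baz(w) == ("baz", w)    # caller supplies the instance explicitly
```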
I've mentioned use cases already -- besides list.sorted -- such as gmpy's sqrt and fsqrt which would more naturally be modeled as just such methods, rather than instancemethods (named sqrt) of types mpz and mpf resp., also available with different names as module-functions (to bypass the typechecking and do typecasting instead). More generally, the idea is that aclas.wee(beep) is just about equivalent to aclas(beep).wee() but may sometimes be implemented more optimally (avoiding the avoidable construction of a temporary instance of aclas from beep, when that is a costly part), and it's better situated in class aclas than as some module function "aclas_wee" (stropping by the typename or some other trick to avoid naming conflict if 'wee' methods of several types are forced into a single namespace because Python doesn't let them be attributes of their respective types in these cases). I don't see any revolution in Python's call system -- just a little extra possibility that will sometimes allow a better and more natural (or better optimizable) placement of functionality that's now not quite comfortably located as either instancemethod, classmethod or module-level function. > Now you're proposing a special situation, where what appears to be > a single attribute of a class object is actually TWO functions... > two functions that have the same name but subtly different behavior. Nah -- not any more than e.g. a property "is actually THREE functions". A property may HOLD three functions and call the appropriate one in appropriate cases, x.y=23 vs print x.y vs del x.y. In general, a descriptor "has" one or more callables it holds (a function "has" AND "is"). > Right now, I presume that if I have:
>     a = A()                  # a is an instance of A
>     x = a.aMethod('abc')     # this is line 1
>     y = A.aMethod(a, 'abc')  # this is line 2
> that line 1 and line 2 do the same thing. 
> This is a direct consequence > of the fact that methods in Python are just functions with an instance > as the first argument. But your "universalmethod" would break this. Actually, unless you show me the source of A, I cannot be absolutely sure that your presumption is right, even today. A simple example:

class A(object):
    def aMethod(*allofthem):
        return allofthem
    aMethod = staticmethod(aMethod)

Now, the behavior of lines 1 and 2 is actually quite different -- x is a singleton tuple with a string, y a pair whose first item is an instance of A and the second item a string. Sure, your presumption is reasonable and a reasonable programmer will try to make sure it's valid, but Python already gives the programmer plenty of tools with which to make your presumption invalid. The _design intention_ of universalmethod would be to still satisfy your presumption, PLUS allow calls to A.aMethod(bbb, 'abc') for any "acceptable" object bbb, not necessarily an instance of A, to do something like A(bbb).aMethod('abc') although possibly in a more optimized way (not necessarily constructing a temporary instance of A, if that is costly and can be easily avoided). Of course it can ALSO be used unreasonably, but so can lots of existing descriptors, too. > It might be worth breaking it, if the result is some *very* readable Can't break what's already broken :-). > to make it a regular and supported idiom, I'd want to see much better > evidence that it's worthwhile, because there's an important principle > at risk here, and I wouldn't want to trade away the ability to explain > "methods" in two sentences: > A 'method' is just a function whose first argument is 'self'. > The method is an attribute of the class object, and when it is > called using "a.method(args)", the instance 'a' is passed as > 'self'. > for a cute way of making double use of a few factory functions. 
I don't think of it as "cute", but rather more appropriate than currently available solutions in some such cases (already exemplified). And those sentences are already false if by 'method' you also want to include staticmethod and classmethod. If you intend 'method' in a stricter sense that excludes staticmethod and classmethod, why, just have your stricter sense ALSO exclude universalmethod and, voila, you can STILL "explain methods in two sentences". Thus, there is no "important principle at risk" whatsoever. Alex From Paul.Moore at atosorigin.com Thu Oct 30 09:27:09 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 30 09:27:55 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: >> And that's exactly why I would be wary of it. One of the GREAT SIMPLICITIES >> of Python is that all calls are "the same". Calling a function works > > A new descriptortype wouldn't change the ``all the same'' idea at > the level at which descriptortypes such as staticmethod and classmethod > haven't changed it. Excuse me, did I miss something? Guido's code was entirely user-level Python, so is available for anyone who wants to use it right now, surely? And if you want it in a C extension, I guess you code a C version for your own use. Why bother arguing over whether it's "right" or "wrong"? Paul. 
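[Editorial note: the descriptor being debated can be sketched in a few lines of pure Python. This is an illustrative reconstruction, not the actual code under discussion; the name universalmethod and the MyList example follow the thread's terminology.]

```python
import functools

class universalmethod(object):
    """Sketch of the proposed descriptor: accessed on an instance it binds
    like a normal method; accessed on the class it returns the bare
    function, with no type check on the first argument."""
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, cls=None):
        if obj is None:
            return self.func                    # class access: any object OK
        return functools.partial(self.func, obj)  # instance access: bind it

class MyList(list):
    @universalmethod
    def sorted(seq):
        return sorted(list(seq))    # accepts any iterable, not just MyList

assert MyList([3, 1, 2]).sorted() == [1, 2, 3]
assert MyList.sorted((3, 1, 2)) == [1, 2, 3]   # tuple accepted: no typecheck
```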
From aleaxit at yahoo.com Thu Oct 30 09:51:36 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 09:51:41 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> References: <16E1010E4581B049ABC51D4975CEDB8803060D0E@UKDCX001.uk.int.atosorigin.com> Message-ID: <200310301551.36290.aleaxit@yahoo.com> On Thursday 30 October 2003 03:27 pm, Moore, Paul wrote: > From: Alex Martelli [mailto:aleaxit@yahoo.com] > > > On Thursday 30 October 2003 02:01 pm, Michael Chermside wrote: > >> And that's exactly why I would be wary of it. One of the GREAT > >> SIMPLICITIES of Python is that all calls are "the same". Calling a > >> function works > > > > A new descriptortype wouldn't change the ``all the same'' idea at > > the level at which descriptortypes such as staticmethod and classmethod > > haven't changed it. > > Excuse me, did I miss something? Guido's code was entirely user-level > Python, so is available for anyone who wants to use it right now, surely? Yes, exactly like staticmethod was available before it became a builtin (e.g., see p.176, "Python Cookbook"). > And if you want it in a C extension, I guess you code a C version for your > own use. > > Why bother arguing over whether it's "right" or "wrong"? Raymond and I would like to use it as the descriptor for the new list.sorted. If the code gets in Python anyway, then it should ideally be somehow exposed for general use if it's right -- but not if it's wrong. Moreover, if it's wrong "by enough", it might be better to NOT have it get in at all, and keep the possibility well under wraps -- if this behavior is used by what will likely become a reasonably popular method of a reasonably popular built-in type, list, people may well be encouraged to design some aspects of their classes similarly. If such a design is considered a disaster, then encouraging and popularizing it in this way might not be wise. 
If, on the other hand, the design IS of enough general use, then there are no such qualms -- indeed, documenting the use and design-assumptions of the new descriptor in the Python docs would then be a good idea. So, it appears to me that the discussion on the pro's and con's of such a descriptor type is well warranted on this list. Alex From Paul.Moore at atosorigin.com Thu Oct 30 10:12:40 2003 From: Paul.Moore at atosorigin.com (Moore, Paul) Date: Thu Oct 30 10:13:31 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option Message-ID: <16E1010E4581B049ABC51D4975CEDB8802C0996F@UKDCX001.uk.int.atosorigin.com> From: Alex Martelli [mailto:aleaxit@yahoo.com] > Raymond and I would like to use it as the descriptor for > the new list.sorted. > If the code gets in Python anyway, then it should ideally > be somehow exposed for general use if it's right -- but not > if it's wrong. OK. I follow now. The only contribution I will make is to say that if list.sorted uses it, I think it should be available to the user. I don't like the flavour of "good enough for us, but not for you" that keeping this descriptor purely internal seems to have. On list.sorted, I have no opinion. Paul. From nas-python at python.ca Thu Oct 30 10:21:01 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Thu Oct 30 10:19:32 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> References: <087001c39e73$70333e60$0500a8c0@eden> <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Message-ID: <20031030152101.GA32750@mems-exchange.org> On Thu, Oct 30, 2003 at 03:30:18PM +1300, Greg Ewing wrote: > That's completely different from what I had in mind, which was: > > (1) Keep a reference to the base object in the buffer object, and It already does this. > (2) Use the buffer API to fetch a fresh pointer from the > base object each time it's needed. > > Is there some reason that still wouldn't be safe enough? 
I don't see any problem with that. It's probably a better solution since it doesn't require a new flag and it lets you create buffers that reference objects like arrays. Neil From guido at python.org Thu Oct 30 10:34:45 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 10:34:52 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 00:49:53 EST." <000501c39ea9$a414e400$8cb6958d@oemcomputer> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> Message-ID: <200310301534.h9UFYjh01347@12-236-54-216.client.attbi.com> > Your example was especially compelling: > > a = [3,2,1] > print a.sorted() > print list.sorted(a) Well, I'd like to withdraw it. Having all three of a.sort(), a.sorted() and list.sorted(a) available brings back all the confusion of a.sort() vs. a.sorted(). What's in CVS is just fine. --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Thu Oct 30 10:43:04 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 10:43:12 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 09:59:33 +0100." <200310300959.33587.aleaxit@yahoo.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> Message-ID: <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> > > a = [3,2,1] > > print a.sorted() > > print list.sorted(a) > > Actually, yes, it IS compelling indeed. Funny -- I was originally just > brainstorming / musing out loud, never thought about this as a "real > thing". But now that it's matured a bit, I do feel sure [...] If you feel about it that way, I recommend that you let it mature a bit more. If you really like this so much, please realize that you can do this for *any* instance method. The identity C.foo(C()) == C().foo() holds for all "regular" methods. (Since 2.2 it also holds for extension types.) 
If we were to do this, we'd be back at square two, which we rejected: list instances having both a sort() and a sorted() method (square one being no sorted() method at all :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From aleaxit at yahoo.com Thu Oct 30 11:20:37 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Thu Oct 30 11:20:54 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> Message-ID: <200310301720.37743.aleaxit@yahoo.com> On Thursday 30 October 2003 04:43 pm, Guido van Rossum wrote: > > > a = [3,2,1] > > > print a.sorted() > > > print list.sorted(a) > > > > Actually, yes, it IS compelling indeed. Funny -- I was originally just > > brainstorming / musing out loud, never thought about this as a "real > > thing". But now that it's matured a bit, I do feel sure [...] > > If you feel about it that way, I recommend that you let it mature a > bit more. > > If you really like this so much, please realize that you can do this > for *any* instance method. The identity > > C.foo(C()) == C().foo() > > holds for all "regular" methods. (Since 2.2 it also holds for Yes, having a be an instance of list in the above doesn't show 'sorted' as being any different than a perfectly regular instance method -- it WAS in this sense a bad example (I thought I'd mentioned that later on in the very same post?). The identity I want is rather like:

    C.foo(x) == C(x).foo()

for an x that's not necessarily an instance of C, just something that has a natural way to become one. When C is list, any iterable x, for example. In other words, being able to call C.foo(x) _without_ the typechecking constraint that x is an instance of C, as one would have for a normal C.foo unbound-method. > extension types.) 
> If we were to do this, we'd be back at square two, which we rejected: list > instances having both a sort() and a sorted() method (square one being no > sorted() method at all :-). Yes, the names are an issue again -- but having e.g. x=L1.sorted(L2) completely ignore the value of L1 feels a bit strange to me too (as does x=D1.fromkeys(L3) now that I've started thinking about it -- I've never seen any Python user, newbie or otherwise, have actual problems with this, but somebody on c.l.py claimed that's just because "nobody" knows about fromkeys -- so, I dunno...). Darn -- it WOULD be better in some cases if one could ONLY call a method on the class, NOT on an instance when the call would in any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, the problem is that you can also call it on a dict instance, and THAT gets confusing. Similarly, calling list.sorted(iterable) is wonderful, but calling it on a list instance that gets ignored, L1.sorted(iterable), could perhaps be confusing. Yeah, the C++(staticmethod)/Smalltalk(classmethod) idea of "call it on the instance" sounds cool in the abstract, but when it comes down to cases I'm not so sure any more -- what might be use cases where it's preferable, rather than confusing, to be able to call aninst.somestaticmethod(x,y) "just as if" it was a normal method? Would it be such an imposition to "have to know" that a method is static and call type(aninst).somestaticmethod(x,y) instead, say...? Oh well, I guess it's too late to change the semantics of the existing descriptors, even if one agrees with my newfound doubts. But the funniest thing is that I suspect the _new_ descriptor type would be the _least_ confusing of them, because calling such a method on class or instance would have semantics much closer to ordinary methods, just slightly less typeconstrained. Oh well! 
Alex From guido at python.org Thu Oct 30 12:16:33 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 12:19:34 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 17:20:37 +0100." <200310301720.37743.aleaxit@yahoo.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> Message-ID: <200310301716.h9UHGXB01596@12-236-54-216.client.attbi.com> > Darn -- it WOULD be better in some cases if one could ONLY call > a method on the class, NOT on an instance when the call would in > any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, > the problem is that you can also call it on a dict instance, and THAT > gets confusing. Similarly, calling list.sorted(iterable) is wonderful, > but calling it on a list instance that gets ignored, L1.sorted(iterable), > could perhaps be confusing. Let's focus on making this an issue that one learns without much pain. Given that the most common mistake would be to write a.sorted(), and that's a TypeError because of the missing argument, perhaps we could make the error message clearer? Perhaps we could use a variant of classmethod whose __get__ would raise the error, rather than waiting until the call -- it could do the equivalent of the following:

class PickyClassmethod(classmethod):
    def __get__(self, obj, cls):
        if obj is not None:
            raise TypeError, "class method should be called on class only!"
        else:
            return classmethod.__get__(self, None, cls)

I don't want to make this behavior the default behavior, because I can see use cases for calling a class method on an instance too, knowing that it is a class method; otherwise one would have to write the ugly x.__class__.foobar(). 
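[Editorial note: Guido's sketch can be exercised as follows, translated to later raise/decorator syntax purely for illustration; Table and fromkeys_demo are made-up names, not part of the proposal.]

```python
class PickyClassmethod(classmethod):
    """Like classmethod, but refuses access through an instance."""
    def __get__(self, obj, cls=None):
        if obj is not None:
            raise TypeError("class method should be called on class only!")
        return super().__get__(None, cls)

class Table:
    @PickyClassmethod
    def fromkeys_demo(cls, keys):      # hypothetical constructor-style method
        return (cls, list(keys))

assert Table.fromkeys_demo("ab") == (Table, ["a", "b"])  # via the class: fine
try:
    Table().fromkeys_demo("ab")        # via an instance: refused eagerly
except TypeError as e:
    message = str(e)
assert "class only" in message
```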
--Guido van Rossum (home page: http://www.python.org/~guido/) From nas-python at python.ca Thu Oct 30 12:19:39 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Thu Oct 30 12:20:40 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031030152101.GA32750@mems-exchange.org> References: <087001c39e73$70333e60$0500a8c0@eden> <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> <20031030152101.GA32750@mems-exchange.org> Message-ID: <20031030171939.GA374@mems-exchange.org> On Thu, Oct 30, 2003 at 07:21:01AM -0800, Neil Schemenauer wrote: > I don't see any problem with that. Okay, small problem. The hash function for the buffer object is brain damaged, in more ways than one actually:

>>> import array
>>> a = array.array('c')
>>> b = buffer(a)
>>> hash(b)
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 5311)]
buffer_hash (self=0x40262d00) at Objects/bufferobject.c:241
241             x = *p << 7;
(gdb) l
236                     return -1;
237             }
238
239             len = self->b_size;
240             p = (unsigned char *) self->b_ptr;
241             x = *p << 7;
242             while (--len >= 0)
243                     x = (1000003*x) ^ *p++;
244             x ^= self->b_size;
245             if (x == -1)
(gdb) p len
$1 = 0
(gdb) p *p
Cannot access memory at address 0x0

The buffer object has 'b_readonly' and 'b_hash' fields. If readonly is true then the object is considered hashable and once computed the hash is stored in the 'b_hash' field. The problem is that the buffer API doesn't provide a way to determine 'readonly'. The absence of getwritebuf() is not the same thing as being read only. The buffer() builtin always sets the 'readonly' flag! I don't think the buffer hash method can depend on the data being pointed to. There is nothing in the buffer interface that tells you if the data is immutable. The hash method could return the id of the buffer object but I'm not sure how useful that would be. Neil From pje at telecommunity.com Thu Oct 30 12:37:24 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Oct 30 12:37:50 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.2.1.1.0.20031030132333.028a56b0@pop.bluewin.ch> References: <3FA0692D.30209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> <3F9F1F82.2090209@ocf.berkeley.edu> <20031029163540.GA28700@mems-exchange.org> Message-ID: <5.1.1.6.0.20031030123139.02443660@telecommunity.com> At 01:43 PM 10/30/03 +0100, Samuele Pedroni wrote: >- multi methods cover some ground also covered by interfaces and adaptation: > *) a generic function/multi method is also an interface > *) some of the things you can achieve with adaptation can be done with > multi methods >Once you have multimethods do you still need adaptation in some cases, or >could one obtain the functionality otherwise, or do you need dispatch on >interfaces (not just classes)? What would interfaces then look like, and the >dispatch on them? With a sufficiently powerful predicate dispatch system, you could do away with adaptation entirely, since you can simulate interfaces by implementing a generic function that indicates whether a type supports the interface, and then defining a predicate type that calls the generic function. That is, I define a predicate type IFoo such that ob is of type IFoo if 'implementsIFoo(ob)'. Then, for any type that implements the interface, I define a multimethod saying that implementsIFoo() is true for objects of that type. Then, I can declare multimethod implementations for the IFoo predicate type. What I'm curious about is: is there any way to do it *without* predicate types? Could you have an "open ended union" type, that you can declare other types to be of, without having to inherit from a base type? From tanzer at swing.co.at Thu Oct 30 12:39:48 2003 From: tanzer at swing.co.at (Christian Tanzer) Date: Thu Oct 30 12:42:44 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 17:20:37 +0100." 
<200310301720.37743.aleaxit@yahoo.com> Message-ID: > Darn -- it WOULD be better in some cases if one could ONLY call > a method on the class, NOT on an instance when the call would in > any case ignore the instance. Calling dict.fromkeys(L3) is wonderful, > the problem is that you can also call it on a dict instance, and THAT > gets confusing. Similarly, calling list.sorted(iterable) is wonderful, > but calling it on a list instance that gets ignored, L1.sorted(iterable), > could perhaps be confusing. Then why don't you use a custom descriptor which raises an exception when an instance is passed in? Like:

    def __get__(self, obj, cls):
        if obj is None:
            return new.instancemethod(self.classmeth, cls)
        else:
            raise TypeError, \
                "Calling %s on instance %s ignores instance" % \
                (self.classmeth, obj)

-- Christian Tanzer http://www.c-tanzer.at/ From pje at telecommunity.com Thu Oct 30 12:51:12 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 30 12:51:51 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301716.h9UHGXB01596@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> Message-ID: <5.1.1.6.0.20031030124652.03062630@telecommunity.com> At 09:16 AM 10/30/03 -0800, Guido van Rossum wrote:

>class PickyClassmethod(classmethod):
>    def __get__(self, obj, cls):
>        if obj is not None:
>            raise TypeError, "class method should be called on class only!"
>        else:
>            return classmethod.__get__(self, None, cls)
>
>I don't want to make this behavior the default behavior, because I
>can see use cases for calling a class method on an instance too,
>knowing that it is a class method; otherwise one would have to write
>the ugly x.__class__.foobar(). 
If there were a 'classonlymethod()' built-in, I'd probably use it, as I use classmethods a fair bit (mostly for specialized constructors), but I don't recall ever desiring to call one via an instance. Do you have an example of the use cases you see? Hm. What if your PickyClassmethod were a built-in called 'constructor' or 'factorymethod'? Probably too confining a name, if there are other uses for class-only methods, I suppose. From guido at python.org Thu Oct 30 13:00:59 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 13:01:10 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 12:51:12 EST." <5.1.1.6.0.20031030124652.03062630@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> Message-ID: <200310301801.h9UI10v01751@12-236-54-216.client.attbi.com> > If there were a 'classonlymethod()' built-in, I'd probably use it, as I use > classmethods a fair bit (mostly for specialized constructors), but I don't > recall ever desiring to call one via an instance. Do you have an example > of the use cases you see? Not exactly, but I notice that e.g. UserList uses self.__class__ a lot; I think that's the kind of area where it might show up. > Hm. What if your PickyClassmethod were a built-in called 'constructor' or > 'factorymethod'? Probably too confining a name, if there are other uses > for class-only methods, I suppose. I'm not convinced that we have a problem (beyond Alex lying awake at night, that is :-). --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Oct 30 13:09:58 2003 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Oct 30 13:10:03 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301801.h9UI10v01751@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> Message-ID: <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> At 10:00 AM 10/30/03 -0800, Guido van Rossum wrote: > > Hm. What if your PickyClassmethod were a built-in called 'constructor' or > > 'factorymethod'? Probably too confining a name, if there are other uses > > for class-only methods, I suppose. > >I'm not convinced that we have a problem (beyond Alex lying awake at >night, that it :-). I thought you were proposing to use it for list.sorted, in order to provide a better error message when used with an instance. If such a descriptor were implemented, I was suggesting that it would be useful as a form of documentation (i.e. that a method isn't intended to be called on instances of the class), and thus it would be nice for it to be exposed for folks like me who'd take advantage of it. (Especially if PEP 318 is being implemented.) From guido at python.org Thu Oct 30 13:19:24 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 13:19:31 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: Your message of "Thu, 30 Oct 2003 13:09:58 EST." 
<5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> Message-ID: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> > >I'm not convinced that we have a problem (beyond Alex lying awake at > >night, that it :-). > > I thought you were proposing to use it for list.sorted, in order to provide > a better error message when used with an instance. If such a descriptor > were implemented, I was suggesting that it would be useful as a form of > documentation (i.e. that a method isn't intended to be called on instances > of the class), and thus it would be nice for it to be exposed for folks > like me who'd take advantage of it. (Especially if PEP 318 is being > implemented.) I mostly just proposed it to placate Alex; I think he's overly worried in this case. PEP 318 seems a ways off. --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Oct 30 13:59:28 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Oct 30 13:59:35 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> Message-ID: <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> At 10:19 AM 10/30/03 -0800, Guido van Rossum wrote: > PEP 318 seems a ways off. Because of lack of consensus on syntax, or is it controversial in some other way? 
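[Editorial note: PEP 318 was later accepted for Python 2.4 as the @-decorator syntax, which turned the rebinding idiom used throughout this thread into a declaration above the def:]

```python
# 2.3-era idiom: define the function, then rebind it through the factory
class A:
    def make(cls):
        return cls()
    make = classmethod(make)

# PEP 318 syntax, as accepted for Python 2.4:
class B:
    @classmethod
    def make(cls):
        return cls()

assert isinstance(A.make(), A)
assert isinstance(B.make(), B)
```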
From Bram at moolenaar.net Thu Oct 30 14:08:19 2003 From: Bram at moolenaar.net (Bram Moolenaar) Date: Thu Oct 30 14:09:52 2003 Subject: [Python-Dev] Speeding up regular expression compilation Message-ID: <200310301908.h9UJ8JhL007882@moolenaar.net> In the python-dev archives I find remarks about the old pre module being much faster at compiling regular expressions than the new sre module. My own experiences are that pre is about twenty times as fast. Since my application uses a lot of simple patterns which are matched on short strings (file names actually), the pattern compilation time is taking half the CPU cycles of my program. The faster execution of sre apparently doesn't compensate for the slower compile time. Is the plan to implement the sre module in C getting closer to being done? Is there a trick to make compiling patterns go faster? I'm already falling back to the pre module with Python 2.2 and older. With Python 2.3 this generates a warning message, thus I don't do it there. I considered copying the 2.2 version of pre.py into my application, but this will stop working as soon as the support for pre is dropped (the compiled C code won't be there). Thus it would be only a temporary fix. I don't care about the Unicode support. -- LAUNCELOT: Isn't there a St. Aaaaarrrrrrggghhh's in Cornwall? ARTHUR: No, that's Saint Ives. "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD /// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\ /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\ \\\ Project leader for A-A-P -- http://www.A-A-P.org /// \\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html /// From amk at amk.ca Thu Oct 30 14:27:18 2003 From: amk at amk.ca (amk@amk.ca) Date: Thu Oct 30 14:27:33 2003 Subject: [Python-Dev] HTML parsing: anyone use formatter? Message-ID: <20031030192718.GA13220@rogue.amk.ca> [Crossposted to python-dev, web-sig, and xml-sig. Followups to web-sig@python.org, please.] 
I'm working on bringing htmllib.py up to HTML 4.01 by adding handlers for all the missing elements. I've currently been adding just empty methods to the HTMLParser class, but the existing methods actually help render the HTML by calling methods on a Formatter object. For example, the definitions for the H1 element look like this:

    def start_h1(self, attrs):
        self.formatter.end_paragraph(1)
        self.formatter.push_font(('h1', 0, 1, 0))

    def end_h1(self):
        self.formatter.end_paragraph(1)
        self.formatter.pop_font()

Question: should I continue supporting this in new methods? This can only go so far; a tag such as or is easy for me to handle, but handling
or or would require greatly expanding the Formatter class's repertoire. I suppose the more general question is, does anyone use Python's formatter module? Do we want to keep it around, or should htmllib be pushed toward doing just HTML parsing? formatter.py is a long way from being able to handle modern web pages and it would be a lot of work to build a decent renderer. --amk From skip at pobox.com Thu Oct 30 14:40:54 2003 From: skip at pobox.com (Skip Montanaro) Date: Thu Oct 30 14:41:06 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310301908.h9UJ8JhL007882@moolenaar.net> References: <200310301908.h9UJ8JhL007882@moolenaar.net> Message-ID: <16289.26950.782796.422409@montanaro.dyndns.org> (better on python-list@python.org than here, btw) Bram> Is there a trick to make compiling patterns go faster? Not really. Note though that the sre module caches compiled regular expressions. How many it caches depends on the size of sre._MAXCACHE (default is 100). If you have many more regular expressions than that, you'll spend a lot of time compiling them. You might find it helpful to boost that number. If you're adventurous, you might investigate recasting the sre_compile._compile function as C code. If you use an Intel CPU, another alternative might be to use psyco. Skip From fincher.8 at osu.edu Thu Oct 30 16:03:15 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Thu Oct 30 15:05:00 2003 Subject: [Python-Dev] HTML parsing: anyone use formatter? In-Reply-To: <20031030192718.GA13220@rogue.amk.ca> References: <20031030192718.GA13220@rogue.amk.ca> Message-ID: <200310301603.15437.fincher.8@osu.edu> On Thursday 30 October 2003 02:27 pm, amk@amk.ca wrote: > I suppose the more general question is, does anyone use Python's formatter > module? Do we want to keep it around, or should htmllib be pushed toward > doing just HTML parsing? 
formatter.py is a long way from being able to > handle modern web pages and it would be a lot of work to build a decent > renderer. I've never used it myself, though I'll admit that some software I've used (for searching the IMDB) does use it. Jeremy From guido at python.org Thu Oct 30 15:18:32 2003 From: guido at python.org (Guido van Rossum) Date: Thu Oct 30 15:18:41 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: Your message of "Thu, 30 Oct 2003 13:59:28 EST." <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> Message-ID: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> > > PEP 318 seems a ways off. > > Because of lack of consensus on syntax, or is it controversial in some > other way? Both. This is the kind of syntactic change that require much deep thought before committing. Unfortunately I don't have time for that right now, so please don't ask. 
--Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Thu Oct 30 15:55:58 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 15:56:32 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> References: <000501c39ea9$a414e400$8cb6958d@oemcomputer> <200310300959.33587.aleaxit@yahoo.com> <200310301543.h9UFh4G01375@12-236-54-216.client.attbi.com> <200310301720.37743.aleaxit@yahoo.com> <5.1.1.6.0.20031030124652.03062630@telecommunity.com> <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <5.1.1.6.0.20031030135759.02435e70@telecommunity.com> <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> Message-ID: <1067547357.5295.163.camel@anthem> On Thu, 2003-10-30 at 15:18, Guido van Rossum wrote: > > > PEP 318 seems a ways off. > > > > Because of lack of consensus on syntax, or is it controversial in some > > other way? > > Both. This is the kind of syntactic change that require much deep > thought before committing. Unfortunately I don't have time for that > right now, so please don't ask. I won't, but I do hope this is something that we can settle for Python 2.4. I've been using the functionality in Python 2.3 for a while now and it is wonderful, but the tedium and clumsiness of the current syntax really puts a damper on its use. -Barry From barry at python.org Thu Oct 30 16:03:00 2003 From: barry at python.org (Barry Warsaw) Date: Thu Oct 30 16:03:15 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <16289.26950.782796.422409@montanaro.dyndns.org> References: <200310301908.h9UJ8JhL007882@moolenaar.net> <16289.26950.782796.422409@montanaro.dyndns.org> Message-ID: <1067547779.5295.168.camel@anthem> On Thu, 2003-10-30 at 14:40, Skip Montanaro wrote: > Not really. Note though that the sre module caches compiled regular > expressions. 
How many it caches depends on the size of sre._MAXCACHE > (default is 100). If you have many more regular expressions than that, > you'll spend a lot of time compiling them. You might find it helpful to > boost that number. Of course you can just assign your compiled regular expression objects to a global or local and use that. Instant caching! Which is what I tend to do. -Barry From tdelaney at avaya.com Thu Oct 30 17:09:54 2003 From: tdelaney at avaya.com (Delaney, Timothy C (Timothy)) Date: Thu Oct 30 17:10:02 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DED6B5CC@au3010avexu1.global.avaya.com> > From: "Martin v. L?wis" [mailto:martin@v.loewis.de] > > My answer is "it depends": If you did not do that, and, for example, > explain why it *can't* be done, than this is a good thesis, > provided you > give qualified scientific rationale for why it can't be done. If you > say you did not do it, but it could be done in this and that way if > you had 50 person years available, then this could be a good thesis > as well, provided the strategy you outline, and the rationale for > computing the 50 person years is convincing. If you just say, "Oops, > I did not finish it because it is too much work", then this would be > a bad thesis. Yep - that was what I was getting at, and your explanation corresponds exactly with my gut feeling. Cheers. Tim Delaney From martin at v.loewis.de Thu Oct 30 17:16:48 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Thu Oct 30 17:16:53 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310301908.h9UJ8JhL007882@moolenaar.net> References: <200310301908.h9UJ8JhL007882@moolenaar.net> Message-ID: Bram Moolenaar writes: > Is there a trick to make compiling patterns go faster? 
If you compile the same regular expression at every program startup, and want to reduce the time for that, you can cPickle the compiled expression, and restore it from the string. If that fails (because the format of compiled expressions has changed), you should fall back to compiling expressions, and optionally save the new version. Regards, Martin From mhammond at skippinet.com.au Thu Oct 30 17:21:06 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu Oct 30 17:20:48 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310300230.h9U2UId08398@oma.cosc.canterbury.ac.nz> Message-ID: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Greg:
> Neil Schemenauer:
>
> > Okay. Perhaps I am missing something but would fixing it be as
> > simple as adding another field to the tp_as_buffer struct?
> >
> >     /* references returned by the buffer functions are valid while
> >      * the object remains alive */
> >     #define PyBuffer_FLAG_SAFE 1
>
> That's completely different from what I had in mind, which was:
>
> (1) Keep a reference to the base object in the buffer object, and
>
> (2) Use the buffer API to fetch a fresh pointer from the
>     base object each time it's needed.
>
> Is there some reason that still wouldn't be safe enough?

That would work, be less intrusive, and allow all existing code to work unchanged. My only concern is that it does not go anywhere towards fixing the buffer interface itself. To my mind, the buffer object is fairly useless and I never use it - so I really don't care. However, I do have real world uses for the buffer interface. The most compelling is for async IO in the Windows world - I need to pass a buffer Windows will fill in the background, and the buffer interface provides the solution - except for the flaws that also drip down to the buffer object, and leaves us with this problem. Thus, my preference is to fix the buffer object by fixing the interface as much as possible. 
Here is a sketch of a solution, incorporating both Neil and Greg's ideas:

* Type object gets a new flag - TP_HAS_BUFFER_INFO, corresponding to a new 'getbufferinfoproc' slot in the PyBufferProcs structure (note - a function pointer, not static flags as Neil suggested)
* New function 'getbufferinfoproc' returns a bitmask - Py_BUFFER_FIXED is one (and currently the only) flag that can be returned.
* New buffer functions PyObject_AsFixedCharBuffer, etc. These check the new flag (and a type lacking TP_HAS_BUFFER_INFO is assumed to *not* be fixed)
* Buffer object keeps a reference to the existing object (as it does now). Its getbufferinfoproc delegates to the underlying object.
* Buffer object *never* keeps a pointer to the buffer - only to the object. Functions like tp_hash always re-fetch the buffer on demand.

The buffer returned by the buffer object is then guaranteed to be as reliable as the underlying object. (This may be a semantic issue with hash(), but conceptually seems fine. Potential solution here - add Py_BUFFER_READONLY as a buffer flag, then hash() semantics could do the right thing)

After all that, I can't help noticing Greg's solution would be far less work, Mark. From exarkun at intarweb.us Thu Oct 30 19:21:51 2003 From: exarkun at intarweb.us (Jp Calderone) Date: Thu Oct 30 19:22:37 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <3F9F72DF.9080101@thule.no> References: <000501c39de7$0019c180$3403a044@oemcomputer> <3F9F72DF.9080101@thule.no> Message-ID: <20031031002151.GA26628@intarweb.us> On Wed, Oct 29, 2003 at 08:57:19AM +0100, Troels Walsted Hansen wrote:
> Raymond Hettinger wrote:
>
> >At least the builtin buffer function should go away.
> >Even if someone had a use for it, it would not make-up for all the time
> >lost by all the other people trying to figure what it was good for.
>
> I trust you will preserve the functionality though?
> I have used the buffer() function to achieve great leaps in performance
> in applications which send data from a string buffer to a socket.
> Slicing kills performance in this scenario once buffer sizes get beyond
> a few 100 kB.
>
> Below is an example from an asyncore.dispatcher subclass. This code sends
> chunks with maximum size, without ever slicing the buffer.
>
>     def handle_write(self):
>         if self.buffer_offset:
>             sent = self.send(buffer(self.buffer, self.buffer_offset))
>         else:
>             sent = self.send(self.buffer)
>         self.buffer_offset += sent
>         if self.buffer_offset == len(self.buffer):
>             del self.buffer

Twisted uses buffer() similarly. It originally sliced, but a company using the library complained of performance problems. Switching to buffer() alleviated those problems. Jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
Url : http://mail.python.org/pipermail/python-dev/attachments/20031030/15556250/attachment.bin
From bac at OCF.Berkeley.EDU Thu Oct 30 21:01:42 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:01:55 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0C2F7.40409@v.loewis.de> References: <338366A6D2E2CA4C9DAEAE652E12A1DED6B3F8@au3010avexu1.global.avaya.com> <3FA06DC5.70407@ocf.berkeley.edu> <3FA0C2F7.40409@v.loewis.de> Message-ID: <3FA1C286.2030409@ocf.berkeley.edu> Martin v. Löwis wrote:
> Brett C. wrote:
>
>>> Whether it is too large for a Masters thesis I don't know. Does a
>>> Masters thesis require *success* in the stated goal? I've been
>>> thinking about doing my own Masters in the not-too-distant future if
>>> I can find the time ...
>>
>> Success as in what you set out to do was actually beneficial? No,
>> just as long as something is learned. Successful as actually
>> finishing the darn thing? Yes. 
> > > He actually meant "success in the stated goal". I.e. if you go out to > implement free threading, would it be considered as a failure of the > Master's project if you come back and say: "I did not actually do that"? > Ah, OK. My mistake. > My answer is "it depends": If you did not do that, and, for example, > explain why it *can't* be done, than this is a good thesis, provided you > give qualified scientific rationale for why it can't be done. If you > say you did not do it, but it could be done in this and that way if > you had 50 person years available, then this could be a good thesis > as well, provided the strategy you outline, and the rationale for > computing the 50 person years is convincing. If you just say, "Oops, > I did not finish it because it is too much work", then this would be > a bad thesis. > I would have to agree with that assessment. Just have to convince my thesis adviser. =) -Brett From bac at OCF.Berkeley.EDU Thu Oct 30 21:11:35 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:11:39 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA0C16A.8030203@v.loewis.de> References: <3F9F1F82.2090209@ocf.berkeley.edu> <3FA063BF.6050207@ocf.berkeley.edu> <3FA0C16A.8030203@v.loewis.de> Message-ID: <3FA1C4D7.4010403@ocf.berkeley.edu> Martin v. L?wis wrote: > Brett C. wrote: > >>> - floating point: provide IEEE-794 (or some such) in a portable >>> yet efficient way >> >> >> >> You mean like how we have longs? So code up in C our own way of >> storing 794 independent of the CPU? > > > Not longs, but floats. And you would not attempt to store it independent > of the CPU, but instead, you would make as much use of the CPU as > possible, and only implement things in C that the CPU gets wrong. The > portion of emulation would vary from CPU to CPU. > OK, so in other words play cleanup for how the CPU handles floating point by having custom code that deals with its mix-ups. 
> As a starting point, you might look at the Java strictfp mode (which
> only got added after the initial Java release). Java 1.0 was where
> Python is today: expose whatever the platform provides. In Java, they
> have the much stronger desire to provide bit-for-bit reproducibility
> on all systems, so they added strictfp as a trade-off of performance
> vs. write-once-run-anywhere.

Remembrances of Tim mentioning FPU exceptions start to flood back into my mind. =)

>>> - deterministic finalization: provide a way to get objects destroyed
>>> implicitly at certain points in control flow; a use case would be
>>> thread-safety/critical regions
>>
>> I think you get what you mean by this, but I am not totally sure since
>> I can't come up with a use beyond threads killing themselves properly
>> when the whole program is shutting down.
>
> Not at all. In Python, you currently do
>
>     def bump_counter(self):
>         self.mutex.acquire()
>         try:
>             self.counter = self.counter+1
>             more_actions()
>         finally:
>             self.mutex.release()
>
> In C++, you do
>
>     void bump_counter(){
>         MutexAcquisition acquire(this);
>         this->counter+=1;
>         more_actions();
>     }
>
> I.e. you can acquire the mutex at the beginning (as a local object),
> and it gets destroyed automatically at the end of the function. So
> they have the "resource acquisition is construction, resource release
> is destruction" design pattern. This is very powerful and convenient,
> and works almost in CPython, but not in Python - as there is no
> guarantee when objects get destroyed.

Ah, OK.

>> Have no clue what this is since I don't know C#. Almost sounds like
>> Michael's def func() [] proposal at the method level. Or just a lot
>> of descriptors. =)
>
> Yes, the func()[] proposal might do most of it. However, I'm uncertain
> whether it puts in place all pieces of the puzzle - one would actually
> have to try to use that stuff to see whether it really works
> sufficiently. 
You would have to set goals first (what is it supposed to > do), and then investigate, whether these things can actually be done > with it. As I said: static, class, synchronized, final methods might > all be candidates; perhaps along with some of the .NET features, like > security evidence check (caller must have permission to write files > in order to call this method), webmethod (method is automagically > exposed as a SOAP/XML-RPC method), etc. > I remember that static and classmethod were reasons cited why the func()[] proposal was desired. It's an idea. -Brett From bac at OCF.Berkeley.EDU Thu Oct 30 21:19:57 2003 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Oct 30 21:21:40 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2mhe1rj7n8.fsf@starship.python.net> References: <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> Message-ID: <3FA1C6CD.6050201@ocf.berkeley.edu> Michael Hudson wrote: > "Brett C." writes: > > >>Dennis Allison wrote: >> >> >>>Brett -- >>>You might put together a list of all the ideas (maybe even a ranked >>>list) >>>and post it as a unit to the list for archival purposes. Thanks. >>> >> >>Way ahead of you, Dennis. I have already started to come up with a >>reST doc for writing up all of these suggestions. It just might be a >>little while before I get it up since I will need to do some >>preliminary research on each idea to measure the amount of work they >>will be. > > > Could go on the Python Wiki? > Could. Let me get it done in reST locally, then I can look at adding it to the wiki. > I take it from your posting of last week that you've thought about > other ways of implementing exception handling? I guess a > non-reference count based GC is a prerequisite for that... > Yeah, I have tossed the exception handling idea around in my head a little, but the culmination was what I posted. And a non-refcount GC would definitely help, even if the exception handling wasn't changed. 
More places where you could just return NULL instead of having to deal with DECREFing objects. -Brett From greg at cosc.canterbury.ac.nz Thu Oct 30 22:37:51 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:38:06 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <20031030171939.GA374@mems-exchange.org> Message-ID: <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> Neil Schemenauer : > I don't think the buffer hash method can depend on the data being > pointed to. There is nothing in the buffer interface that tells > you if the data is immutable. The hash method could return the id > of the buffer object but I'm not sure how useful that would be. How about just having it call the hash method of the base object? If the base object is hashable, this will do something reasonable, and if not, it will fail in the expected way. Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 30 22:42:33 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:42:45 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <16289.26950.782796.422409@montanaro.dyndns.org> Message-ID: <200310310342.h9V3gXf17558@oma.cosc.canterbury.ac.nz> > If you're adventurous, you might investigate recasting the > sre_compile._compile function as C code. Or Pyrex code. :-) Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg@cosc.canterbury.ac.nz +--------------------------------------+ From greg at cosc.canterbury.ac.nz Thu Oct 30 22:47:59 2003 From: greg at cosc.canterbury.ac.nz (Greg Ewing) Date: Thu Oct 30 22:48:46 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Message-ID: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> > Thus, my preference is to fix the buffer object by fixing the interface as > much as possible. > > Here is a sketch of a solution, incorporating both Neil and Greg's ideas: Hang on, didn't we already go through the process of designing a new buffer interface not long ago? What was decided about the results of that? Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+ From aleaxit at yahoo.com Fri Oct 31 02:59:40 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 02:59:46 2003 Subject: PEP 318 (was Re: [Python-Dev] Re: Guido's Magic Code was: inline sort option) In-Reply-To: <1067547357.5295.163.camel@anthem> References: <200310302018.h9UKIW701916@12-236-54-216.client.attbi.com> <1067547357.5295.163.camel@anthem> Message-ID: <200310310859.40837.aleaxit@yahoo.com> On Thursday 30 October 2003 09:55 pm, Barry Warsaw wrote: > On Thu, 2003-10-30 at 15:18, Guido van Rossum wrote: > > > > PEP 318 seems a ways off. > > > > > > Because of lack of consensus on syntax, or is it controversial in some > > > other way? > > > > Both. This is the kind of syntactic change that require much deep > > thought before committing. Unfortunately I don't have time for that > > right now, so please don't ask. > > I won't, but I do hope this is something that we can settle for Python > 2.4. 
I've been using the functionality in Python 2.3 for a while now > and it is wonderful, but I the tedium and clumsiness of the current > syntax really puts a damper on its use. Not on mine (my use), but, yes, I _have_ seen some Pythonistas be rather perplexed by it. Giving it a neat, cool look will be good. BTW, when we do come around to PEP 318, I would suggest the 'as' clause on a class statement as the best way to specify a metaclass. 'class Newstyle as type:' for example is IMHO neater -- and thus more encouraging to the generalized use of newstyle classes -- than the "inheriting from object" idea or the setting of __metaclass__; it reads well AND makes what one's doing more obvious when a custom MC is involved, because it's so "up-front". Besides, it's STILL syntax for a call to the thingy on the RHS of 'as', just like, say, def foop() as staticmethod: is, even though the details of how that call is performed are different for metaclasses (called with classname/bases/classdict) and function decorators (called with the function object). BTW, the PEP isn't very clear about this, but, I would hope the 'as' clause applies uniformly to ANY def or class statement, right? No reason to specialcase, that I can see -- "def ... as" may well be used mostly inside classbodies, because we do have decorators ready for that, but the 'synchronized(lock)' decorator used in the PEP's examples would seem just as applicable to freestanding functions as to methods. 
Alex From aleaxit at yahoo.com Fri Oct 31 03:03:35 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 03:03:41 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> References: <5.1.1.6.0.20031030130628.02da0c80@telecommunity.com> <200310301819.h9UIJOM01799@12-236-54-216.client.attbi.com> Message-ID: <200310310903.35941.aleaxit@yahoo.com> On Thursday 30 October 2003 07:19 pm, Guido van Rossum wrote: > > >I'm not convinced that we have a problem (beyond Alex lying awake at > > >night, that it :-). As it happens I just had a very unusual ten-hours-of-sleep night, so I don't think you need to worry:-). > > on instances of the class), and thus it would be nice for it to be > > exposed for folks like me who'd take advantage of it. (Especially if PEP > > 318 is being implemented.) > > I mostly just proposed it to placate Alex; I think he's overly worried > in this case. PEP 318 seems a ways off. OK, then it does appear to me that new descriptors may wait for PEP 318 to mature, and list.sorted be left as is for now. Hopefully both can be taken into consideration before 2.4 is finalized, since that time is also "a ways off", no doubt. Alex From theller at python.net Fri Oct 31 03:03:52 2003 From: theller at python.net (Thomas Heller) Date: Fri Oct 31 03:04:39 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> (Greg Ewing's message of "Fri, 31 Oct 2003 16:47:59 +1300 (NZDT)") References: <200310310347.h9V3lwa17730@oma.cosc.canterbury.ac.nz> Message-ID: Greg Ewing writes: >> Thus, my preference is to fix the buffer object by fixing the interface as >> much as possible. >> >> Here is a sketch of a solution, incorporating both Neil and Greg's ideas: > > Hang on, didn't we already go through the process of > designing a new buffer interface not long ago? > > What was decided about the results of that? 
That was pep 298. I withdraw it (well, it's still labeled as draft) because I didn't have enough time to finish the specification. But if anyone wants to take it over, please do so. Thomas From Bram at moolenaar.net Fri Oct 31 06:20:44 2003 From: Bram at moolenaar.net (Bram Moolenaar) Date: Fri Oct 31 06:22:22 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <1067547779.5295.168.camel@anthem> Message-ID: <200310311120.h9VBKikP001404@moolenaar.net> Barry Warsaw wrote: > On Thu, 2003-10-30 at 14:40, Skip Montanaro wrote: > > Not really. Note though that the sre module caches compiled regular > > expressions. How many it caches depends on the size of sre._MAXCACHE > > (default is 100). If you have many more regular expressions than that, > > you'll spend a lot of time compiling them. You might find it helpful to > > boost that number. > > Of course you can just assign your compiled regular expression objects > to a global or local and use that. Instant caching! Which is what I > tend to do. I'm already caching all the compiled patterns. It's the first-time compile that is consuming time, there are a lot of patterns. But half a second to compile them is too much, the whole program may not run longer than a second. BTW. I've changed the code to use pre.py on Python 2.3 (with the warning removed) as a temporary solution. The problem will be back with 2.4... The reason I sent this to the development list is that I thought this could be solved on the library side. Changing the Python code sounds like working around the real problem. -- BRIDGEKEEPER: What is your favorite colour? GAWAIN: Blue ... No yelloooooww! 
"Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD /// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\ /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\ \\\ Project leader for A-A-P -- http://www.A-A-P.org /// \\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html /// From mwh at python.net Fri Oct 31 06:36:51 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 06:36:55 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> (Mark Hammond's message of "Fri, 31 Oct 2003 09:21:06 +1100") References: <0b8e01c39f34$1d31e1f0$0500a8c0@eden> Message-ID: <2mad7h8wq4.fsf@starship.python.net> "Mark Hammond" writes: > That would work, be less intrusive, and allow all existing code to work > unchanged. My only concern is that it does not go anywhere towards fixing > the buffer interface itself. I think that is a different issue entirely. While it may be interesting and important, can we at least try to keep them separate? Cheers, mwh -- This is the fixed point problem again; since all some implementors do is implement the compiler and libraries for compiler writing, the language becomes good at writing compilers and not much else! -- Brian Rogoff, comp.lang.functional From mwh at python.net Fri Oct 31 06:42:52 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 06:42:56 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <3FA1C6CD.6050201@ocf.berkeley.edu> (Brett C.'s message of "Thu, 30 Oct 2003 18:19:57 -0800") References: <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> Message-ID: <2m65i58wg3.fsf@starship.python.net> "Brett C." writes: >> I take it from your posting of last week that you've thought about >> other ways of implementing exception handling? I guess a >> non-reference count based GC is a prerequisite for that... 
>> > > Yeah, I have tossed the exception handling idea around in my head a > little, but the culmination was what I posted. > > And a non-refcount GC would definitely help, even if the exception > handling wasn't changed. More places where you could just return NULL > instead of having to deal with DECREFing objects. And reducing the memory overhead of objects. Here's my crazy idea that's been knocking around my head for a while. I wonder if anyone can shoot in down in flames. Remove the ob_type field from all PyObjects. Make pymalloc mandatory, make it use type specific pools and store a pointer to the type object at the start of each pool. So instead of p->ob_type it's *(p&MASK) I think having each type in its own pools would also let you lose the gc_next & gc_prev fields. Combined with a non-refcount GC, you could hammer sizeof(PyIntObject) down to sizeof(long)! (Actually, a potential killer is assigning to __class__ -- maybe you could only do this for heaptypes) Cheers, mwh -- To summarise the summary of the summary:- people are a problem. -- The Hitch-Hikers Guide to the Galaxy, Episode 12 From mhammond at skippinet.com.au Fri Oct 31 08:03:56 2003 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri Oct 31 08:03:42 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <2mad7h8wq4.fsf@starship.python.net> Message-ID: <009601c39faf$72617a70$0500a8c0@eden> Michael Hudson > "Mark Hammond" writes: > > > That would work, be less intrusive, and allow all existing > code to work > > unchanged. My only concern is that it does not go anywhere > towards fixing > > the buffer interface itself. > > I think that is a different issue entirely. While it may be > interesting and important, can we at least try to keep them separate? I don't see how. The only problem I see is in the buffer interface. We could worm around the buffer interface problem in the buffer object, but I don't see how that is keeping them separate. Am I missing something? Mark. 
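[Michael's `*(p&MASK)` trick above relies on every pool being aligned to its own size, so the low bits of any object address can be masked off to reach the pool header, where the shared type pointer would live. A sketch of just that arithmetic in Python — the 4 KiB pool size is an assumed figure, not pymalloc's actual layout:]

```python
POOL_SIZE = 4096                 # assumed power-of-two pool size
POOL_MASK = ~(POOL_SIZE - 1)     # clears the low 12 bits of an address

def pool_base(addr):
    # Objects never straddle pools, so masking any object's address
    # yields the start of the pool it lives in.
    return addr & POOL_MASK

base = 7 * POOL_SIZE             # a pool starts at an aligned address
assert all(pool_base(base + off) == base for off in range(POOL_SIZE))
```

This is why the idea needs pymalloc to be mandatory: the masking only works if every object is guaranteed to sit inside such an aligned pool.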
From martin at v.loewis.de Fri Oct 31 08:45:35 2003 From: martin at v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=) Date: Fri Oct 31 08:45:52 2003 Subject: [Python-Dev] Speeding up regular expression compilation In-Reply-To: <200310311120.h9VBKikP001404@moolenaar.net> References: <200310311120.h9VBKikP001404@moolenaar.net> Message-ID: Bram Moolenaar writes: > The reason I sent this to the development list is that I thought this > could be solved on the library side. Changing the Python code sounds > like working around the real problem. It probably can be changed. However, it appears that few people would ever worry about the compilation speed, so it is unlikely that any effort will be made to improve it. Contributions would be greatly appreciated. Regards, Martin From pje at telecommunity.com Fri Oct 31 08:52:03 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 31 08:51:13 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2m65i58wg3.fsf@starship.python.net> References: <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> Message-ID: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> At 11:42 AM 10/31/03 +0000, Michael Hudson wrote: >"Brett C." writes: > > >> I take it from your posting of last week that you've thought about > >> other ways of implementing exception handling? I guess a > >> non-reference count based GC is a prerequisite for that... > >> > > > > Yeah, I have tossed the exception handling idea around in my head a > > little, but the culmination was what I posted. > > > > And a non-refcount GC would definitely help, even if the exception > > handling wasn't changed. More places where you could just return NULL > > instead of having to deal with DECREFing objects. > >And reducing the memory overhead of objects.
OTOH, maybe you could see whether INCREF/DECREF can be used to control synchronization of objects between threads, and thus get a multiprocessor Python. Note that if an object's refcount is 1, it's not being shared between threads. INCREF could be looked at as, "I'm about to use this object", so if the object isn't "owned" by the current thread, then lock it and increment an ownership count. Or was that how the experimental free-threading Python worked? >Here's my crazy idea that's been knocking around my head for a while. >I wonder if anyone can shoot it down in flames. > >Remove the ob_type field from all PyObjects. Make pymalloc mandatory, >make it use type specific pools and store a pointer to the type object >at the start of each pool. How would you get from the pointer to the pool head? From mwh at python.net Fri Oct 31 09:07:28 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 09:07:31 2003 Subject: [Python-Dev] Deprecate the buffer object? In-Reply-To: <009601c39faf$72617a70$0500a8c0@eden> (Mark Hammond's message of "Sat, 1 Nov 2003 00:03:56 +1100") References: <009601c39faf$72617a70$0500a8c0@eden> Message-ID: <2mznfh7b6n.fsf@starship.python.net> "Mark Hammond" writes: > Michael Hudson > >> "Mark Hammond" writes: >> >> > That would work, be less intrusive, and allow all existing >> code to work >> > unchanged. My only concern is that it does not go anywhere >> towards fixing >> > the buffer interface itself. >> >> I think that is a different issue entirely. While it may be >> interesting and important, can we at least try to keep them separate? > > I don't see how. The only problem I see is in the buffer interface. We > could worm around the buffer interface problem in the buffer object, but I > don't see how that is keeping them separate. Am I missing something? Well, there are two things people complain about: a) the buffer INTERFACE, b) the buffer OBJECT. Are the issues plaguing both the same? I wasn't under the impression they were.
It's entirely possible I'm wrong, though. Cheers, mwh -- [1] If you're lost in the woods, just bury some fibre in the ground carrying data. Fairly soon a JCB will be along to cut it for you - follow the JCB back to civilisation/hitch a lift. -- Simon Burr, cam.misc From mwh at python.net Fri Oct 31 09:10:16 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 09:10:54 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> (Phillip J. Eby's message of "Fri, 31 Oct 2003 08:52:03 -0500") References: <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> Message-ID: <2mvfq57b1z.fsf@starship.python.net> "Phillip J. Eby" writes: >>Here's my crazy idea that's been knocking around my head for a while. >>I wonder if anyone can shoot it down in flames. >> >>Remove the ob_type field from all PyObjects. Make pymalloc mandatory, >>make it use type specific pools and store a pointer to the type object >>at the start of each pool. > > How would you get from the pointer to the pool head? Did you read the rest of my mail? Maybe I was too terse, but my thinking was that the pools are aligned on a known size boundary (e.g. 4K) so to get to the head you just mask off the 12 (or whatever) least significant bits. Wouldn't work for zeta-c[1], I'd have to admit, but do we care? Cheers, mwh [1] http://www.cliki.net/Zeta-C -- SPIDER: 'Scuse me. [scuttles off] ZAPHOD: One huge spider. FORD: Polite though. -- The Hitch-Hikers Guide to the Galaxy, Episode 11 From nas-python at python.ca Fri Oct 31 09:12:08 2003 From: nas-python at python.ca (Neil Schemenauer) Date: Fri Oct 31 09:11:13 2003 Subject: [Python-Dev] Deprecate the buffer object?
In-Reply-To: <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> References: <20031030171939.GA374@mems-exchange.org> <200310310337.h9V3bpB17539@oma.cosc.canterbury.ac.nz> Message-ID: <20031031141208.GA3566@mems-exchange.org> On Fri, Oct 31, 2003 at 04:37:51PM +1300, Greg Ewing wrote: > How about just having it call the hash method of the base > object? If the base object is hashable, this will do something > reasonable, and if not, it will fail in the expected way. The buffer can reference a subset of the original data ('size' and 'offset' parameters). Neil From pje at telecommunity.com Fri Oct 31 11:20:35 2003 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Oct 31 11:21:08 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <2mvfq57b1z.fsf@starship.python.net> References: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20031031111429.03110880@telecommunity.com> At 02:10 PM 10/31/03 +0000, Michael Hudson wrote: >"Phillip J. Eby" writes: > > >>Here's my crazy idea that's been knocking around my head for a while. > >>I wonder if anyone can shoot it down in flames. > >> > >>Remove the ob_type field from all PyObjects. Make pymalloc mandatory, > >>make it use type specific pools and store a pointer to the type object > >>at the start of each pool. > > > > How would you get from the pointer to the pool head? > >Did you read the rest of my mail? Maybe I was too terse, but my Yes, and yes. :) >thinking was that the pools are aligned on a known size boundary >(e.g. 4K) so to get to the head you just mask off the 12 (or whatever) >least significant bits. Ah. But since even the most trivial of Python operations require access to the type, wouldn't this take longer?
I mean, for every ob->ob_type->tp_whatever you'll now have something like *(ob & mask)->tp_whatever. So there are still two memory accesses, but now there's a bitmasking operation added in. I suppose that for some object types you could be getting a 12-25% decrease in memory use for the base object, though. From mwh at python.net Fri Oct 31 12:08:36 2003 From: mwh at python.net (Michael Hudson) Date: Fri Oct 31 12:08:45 2003 Subject: [Python-Dev] Looking for master thesis ideas involving Python In-Reply-To: <5.1.1.6.0.20031031111429.03110880@telecommunity.com> (Phillip J. Eby's message of "Fri, 31 Oct 2003 11:20:35 -0500") References: <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <3FA1C6CD.6050201@ocf.berkeley.edu> <3FA0A210.10605@ocf.berkeley.edu> <2mhe1rj7n8.fsf@starship.python.net> <3FA1C6CD.6050201@ocf.berkeley.edu> <5.1.0.14.0.20031031084822.01e5a020@mail.telecommunity.com> <5.1.1.6.0.20031031111429.03110880@telecommunity.com> Message-ID: <2mad7h72sr.fsf@starship.python.net> "Phillip J. Eby" writes: >>thinking was that the pools are aligned on a known size boundary >>(e.g. 4K) so to get to the head you just mask off the 12 (or whatever) >>least significant bits. > > Ah. But since even the most trivial of Python operations require > access to the type, wouldn't this take longer? I mean, for every > ob->ob_type->tp_whatever you'll now have something like *(ob & > mask)->tp_whatever. Well, I dunno. I doubt the masking would add significant overhead -- it'd only be one instruction, after all -- but the fact that you'd have to haul the start of the pool into the cache to get the pointer to the type object might hurt. You'd have to try it and measure, I guess. > So there are still two memory accesses, but now there's a bitmasking > operation added in. I suppose that for some object types you could > be getting a 12-25% decrease in memory use for the base object, > though. More than that in the good cases.
Something I forgot was that you'd probably have to knock variable-length types on the head. Cheers, mwh -- I would hereby duly point you at the website for the current pedal powered submarine world underwater speed record, except I've lost the URL. -- Callas, cam.misc From FBatista at uniFON.com.ar Fri Oct 31 13:36:02 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 31 13:36:50 2003 Subject: [Python-Dev] prePEP: Decimal data type Message-ID: Here I send it. Suggestions and all kinds of recommendations are more than welcome. If it all goes ok, it'll be a PEP when I finish writing/modifying the code. Thank you. . Facundo ------------------------------------------------------------------------ PEP: XXXX Title: Decimal data type Version: $Revision: 0.1 $ Last-Modified: $Date: 2003/10/31 15:25:00 $ Author: Facundo Batista Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 17-Oct-2003 Python-Version: 2.3.3 Abstract ======== The idea is to have a Decimal data type, for every use where decimals are needed but floating point is too inexact. The Decimal data type should support the Python standard functions and operations and must comply with the decimal arithmetic ANSI standard X3.274-1996. Rationale ========= I must separate the requirements into two sections. The first is to comply with the ANSI standard. Everything needed for this is specified in Mike Cowlishaw's work at http://www2.hursley.ibm.com/decimal/. Cowlishaw also provided a **lot** of test cases. The second section of requirements (support for standard Python functions, usability, etc.) is detailed in the `Requirements`_ section. Here I'll include all the decisions made and why, and all the subjects still being discussed. The requirements will be numbered, to simplify discussion on each point. This work is based on code and test functions written by Eric Price, Aahz and Tim Peters.
Currently I'm working on the Decimal.py code in the sandbox (at python/nondist/sandbox/decimal in SourceForge). Some of the explanations in this PEP are taken from Cowlishaw's work. Items In Discussion ------------------- When in a case like ``Decimal op otherType`` (see point 12 in Requirements_ for details), what should happen? if otherType is an int or long: a. an exception is raised b. otherType is converted to Decimal c. Decimal is converted to int or long (with ``int()`` or ``long()``) if otherType is a float: d. an exception is raised e. otherType is converted to Decimal (rounding? see next item in discussion) f. Decimal is converted to float (with ``float()``) if otherType is a string: g. an exception is raised h. otherType is converted to Decimal i. Decimal is converted to string (bizarre, huh?) When passing floating point to the constructor, what should happen? j. ``Decimal(1.1) == Decimal('1.1')`` k. ``Decimal(1.1) == Decimal('110000000000000008881784197001252...e-51')`` Requirements ============ 1. The syntax should be ``Decimal(value)``. 2. The value could be of one of these types: - another Decimal - int or long - float - string 3. There must exist a Context. The context represents the user-selectable parameters and rules which govern the results of arithmetic operations. In the context the user defines: - what will happen with the exceptional conditions - what precision will be used - what rounding method will be used 4. The Context must be omnipresent, meaning that changes to it affect all the current and future Decimal instances. 5. The exceptional conditions should be grouped into signals, which could be controlled individually. The context should contain a flag and a trap-enabler for each signal. The signals should be: clamped, division-by-zero, inexact, invalid-operation, overflow, rounded, subnormal and underflow. 6. For each of the signals, the corresponding flag should be set to 1 when the signal occurs. It is only reset to 0 by explicit user action. 7.
For each of the signals, the corresponding trap-enabler will indicate which action is to be taken when the signal occurs. If 0, a defined result should be supplied, and execution should continue. If 1, the execution of the operation should end and an exception should be raised. 8. The precision (maximum number of significant digits that can result from an arithmetic operation) must be positive (greater than 0). 9. To have different kinds of rounding; you can choose the algorithm through the context: - ``round-down``: (Round toward 0, truncate) The discarded digits are ignored; the result is unchanged:: 1.123 --> 1.12 1.128 --> 1.12 1.125 --> 1.12 1.135 --> 1.13 - ``round-half-up``: If the discarded digits represent greater than or equal to half (0.5) then the result should be incremented by 1 (rounded up); otherwise the discarded digits are ignored:: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.13 1.135 --> 1.14 - ``round-half-even``: If the discarded digits represent greater than half (0.5) then the result coefficient should be incremented by 1 (rounded up); if they represent less than half, then the result is not adjusted (that is, the discarded digits are ignored); otherwise the result is unaltered if its rightmost digit is even, or incremented by 1 (rounded up) if its rightmost digit is odd (to make an even digit):: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.12 1.135 --> 1.14 - ``round-ceiling``: If all of the discarded digits are zero or if the sign is negative the result is unchanged; otherwise, the result should be incremented by 1 (rounded up):: 1.123 --> 1.13 1.128 --> 1.13 -1.123 --> -1.12 -1.128 --> -1.12 - ``round-floor``: If all of the discarded digits are zero or if the sign is positive the result is unchanged; otherwise, the absolute value of the result should be incremented by 1:: 1.123 --> 1.12 1.128 --> 1.12 -1.123 --> -1.13 -1.128 --> -1.13 - ``round-half-down``: If the discarded digits represent greater than half (0.5) then the result should be
incremented by 1 (rounded up); otherwise the discarded digits are ignored:: 1.123 --> 1.12 1.128 --> 1.13 1.125 --> 1.12 1.135 --> 1.13 - ``round-up``: (Round away from 0) If all of the discarded digits are zero the result is unchanged. Otherwise, the result should be incremented by 1 (rounded up):: 1.123 --> 1.13 1.128 --> 1.13 1.125 --> 1.13 1.135 --> 1.14 10. Strings with floats in engineering notation will be supported. 11. Calling repr() should round-trip, meaning that:: m = Decimal(...) m == eval(repr(m)) 12. To support the basic arithmetic (``+, -, *, /, //, **, %, divmod``) and comparison (``==, !=, <, >, <=, >=, cmp``) operators in the following cases: - Decimal op Decimal - Decimal op otherType - otherType op Decimal - Decimal op= Decimal - Decimal op= otherType Check `Items In Discussion`_ to see what types otherType could be, and what happens in each case. 13. To support unary operators (``-, +, abs``). 14. To support the built-in methods: - min, max - float, int, long - str, repr - hash - copy, deepcopy - bool (0 is false, otherwise true) 15. To be immutable. Reference Implementation ======================== To be included later: - code - test code - documentation Copyright ========= This document has been placed in the public domain. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20031031/04dbf328/attachment-0001.html From aleaxit at yahoo.com Fri Oct 31 14:42:41 2003 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Oct 31 14:42:49 2003 Subject: [Python-Dev] prePEP: Decimal data type In-Reply-To: References: Message-ID: <200310312042.41751.aleaxit@yahoo.com> On Friday 31 October 2003 07:36 pm, Batista, Facundo wrote: ... > If it all goes ok, it'll be a PEP when I finish writing/modifying the code. I'll gladly help fix the English if needed then, let me know. > When passing floating point to the constructor, what should happen? > > j. ``Decimal(1.1) == Decimal('1.1')`` > k. ``Decimal(1.1) == > Decimal('110000000000000008881784197001252...e-51')`` You forgot an alternative that's likely to be popular on python-dev: "an exception is raised". (This would change requirement 2. later, of course). Alex From jeremy at alum.mit.edu Fri Oct 31 14:44:08 2003 From: jeremy at alum.mit.edu (Jeremy Hylton) Date: Fri Oct 31 14:46:56 2003 Subject: [Python-Dev] proposed change to compiler package Message-ID: <1067629448.24165.150.camel@localhost.localdomain> The top-level walk() function in the compiler package returns the visitor object that is passed to walk. I'd like to change it to return the result of the top-level dispatch() or visit() call.
Right now, visitor methods can return a value, which is useful for a visit() call that is internal to a visitor, but can't return it to the caller of walk(). The current return value is pretty useless, since the caller of walk() must pass the visitor as one of the arguments. That is, walk() returns one of its arguments. The change might break some code, but only in a trivial way, and it will make it possible to write visitors that don't have any state -- simple combinators. Example:

    class NameVisitor:
        """Compute a dotted name from an expression."""

        def visitGetattr(self, node):
            return "%s.%s" % (self.visit(node.expr), node.attrname)

        def visitName(self, node):
            return node.name

Jeremy From FBatista at uniFON.com.ar Fri Oct 31 14:51:53 2003 From: FBatista at uniFON.com.ar (Batista, Facundo) Date: Fri Oct 31 14:52:27 2003 Subject: [Python-Dev] prePEP: Decimal data type Message-ID: #- On Friday 31 October 2003 07:36 pm, Batista, Facundo wrote: #- ... #- > If it all goes ok, it'll be a PEP when I finish #- writing/modifying the code. #- #- I'll gladly help fix the English if needed then, let me know. Always welcome too, :) #- > When passing floating point to the constructor, what should happen? #- > #- > j. ``Decimal(1.1) == Decimal('1.1')`` #- > k. ``Decimal(1.1) == #- > Decimal('110000000000000008881784197001252...e-51')`` #- #- You forgot an alternative that's likely to be popular on #- python-dev: "an #- exception is raised". (This would change requirement 2. later, of #- course). You're right. So the 'm' choice (votable as the others) is "an exception is raised". .
Facundo From fincher.8 at osu.edu Fri Oct 31 19:40:39 2003 From: fincher.8 at osu.edu (Jeremy Fincher) Date: Fri Oct 31 18:42:19 2003 Subject: [Python-Dev] Re: Guido's Magic Code was: inline sort option In-Reply-To: <200310301454.48290.aleaxit@yahoo.com> References: <1067518878.3fa10b9e91afb@mcherm.com> <200310301454.48290.aleaxit@yahoo.com> Message-ID: <200310311940.39491.fincher.8@osu.edu> On Thursday 30 October 2003 08:54 am, Alex Martelli wrote: > just like in about ALL cases > of Python calls *except* "aclass.baz(aninst)" which is an exceptional > case in which Python itself does (enforced) typechecking for you. Out of curiosity, why does Python do this typechecking? I just ran into a situation where such calls in my subclass of sets.Set fail if the sets module gets reloaded. Is there some really important reason why in this case (and only this case) Python does typechecking on pure-Python classes? Jeremy
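The reload failure Jeremy describes comes from Python 2's unbound methods doing an isinstance() check on their first argument: after reload(sets), the name sets.Set is bound to a brand-new class object, so instances built from the old class no longer pass the check. The mechanism can be sketched without touching any real module file (the module name "fakemod" and class "Base" below are made up for illustration):

```python
import types

# Build a throwaway module and define a class in it, as an import would.
mod = types.ModuleType("fakemod")
exec("class Base(object):\n    def name(self):\n        return 'base'",
     mod.__dict__)
obj = mod.Base()

# Re-execute the module body, which is essentially what reload() does:
# the name Base is rebound to a brand-new class object.
exec("class Base(object):\n    def name(self):\n        return 'base'",
     mod.__dict__)

# The old instance is not an instance of the *new* class...
print(isinstance(obj, mod.Base))        # False

# ...which is exactly the check Python 2's unbound method call
# mod.Base.name(obj) enforced, hence the TypeError after a reload.
```

So the typecheck itself is not specific to sets.Set; any call through an unbound method of a reloaded class hits the same isinstance failure.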