From anthony at interlink.com.au Fri Apr 1 05:27:36 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Apr 1 05:30:33 2005 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib/logging handlers.py, 1.19, 1.19.2.1 In-Reply-To: <003101c53634$a60b87e0$d2bc958d@oemcomputer> References: <003101c53634$a60b87e0$d2bc958d@oemcomputer> Message-ID: <200504011327.37278.anthony@interlink.com.au> On Friday 01 April 2005 07:00, Raymond Hettinger wrote: > > Tag: release24-maint > > handlers.py > > Log Message: > > Added optional encoding argument to File based handlers and improved > > error handling for SysLogHandler > > Are you sure you want to backport an API change and new feature? What Raymond said. Please don't add new features to the maintenance branch. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From bac at OCF.Berkeley.EDU Fri Apr 1 11:39:01 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Apr 1 11:39:06 2005 Subject: [Python-Dev] python-dev Summary for 2005-03-16 through 2005-03-31 [draft] Message-ID: <424D16B5.4090204@ocf.berkeley.edu> OK, so here is my final Summary. Like to send it out some time this weekend so please get corrections in ASAP. -------------------------------- ===================== Summary Announcements ===================== --------------- My last summary --------------- So, after nearly 2.5 years, this is my final python-dev Summary. Steve Bethard, Tim Lesher, and Tony Meyer will be taking over for me starting with the April 1 - April 15 summary (and no, this is not an elaborate April Fool's). I have learned a ton during my time doing the Summaries and I appreciate python-dev allowing me to do them all this time. Hopefully I will be able to contribute more now in a programming capacity thanks to having more free time. -------------------- PyCon was fantastic! -------------------- For those of you who missed PyCon, you missed a great one! It is actually my favorite PyCon to date. 
Already looking forward to next year. -------------------- Python fireside chat -------------------- Scott David Daniels requested a short little blurb from me expounding on my thoughts on Python. Not one to pass on an opportunity to just open myself and possibly shoot myself in the foot, I figured I would take up the idea. So hear we go. First, I suspect Python 3000 stuff will start to make its way into Python. Stuff that doesn't break backwards compatibility will most likely start to be implemented as we head toward the Python 2.9 barrier (Guido has stated several times that there will never be a Python 2.10). Things that are not backwards-compatible will most likely end up being hashed out in various PEPs. All of this will allow the features in Python 3000 to be worked in over time so there is not a huge culture shock. As for things behind the scenes, work on the back-end will move forward. Guido himself has suggested that JIT work should be looked into (according to an interview at http://www.devsource.com/article2/0,1759,1778272,00.asp). I know I plan to fiddle with the back-end to see if the compiler can be made to do more work. Otherwise I expect changes to be made, flame wars to come and go, and for someone else to write the python-dev Summaries. =) ========= Summaries ========= ---------------- Python 2.4.1 out ---------------- Anthony Baxter, on behalf of python-dev, has released `Python 2.4.1`_. .. _Python 2.4.1: http://www.python.org/2.4.1/ Contributing threads: - `RELEASED Python 2.4.1, release candidate 1 `__ - `RELEASED Python 2.4.1, release candidate 2 `__ - `BRANCH FREEZE for 2.4.1 final, 2005-03-30 00:00 UTC `__ - `RELEASED Python 2.4.1 (final) `__ ----------------- AST branch update ----------------- I, along with some other people, sprinted on the AST branch at PyCon. 
This led to a much more fleshed out design document (found in Python/compile.txt in the AST branch), the ability to build on Windows, and applying Nick Coghlan's fix for hex numbers. Nick also did some more patch work and asked how AST work should be tagged. There is now an AST category on SourceForge that people should use to flag things as for the AST. They should also, by default, assign such items to me ("bcannon" on SF). We have also taken to flagging threads on the AST with "[AST]" as the first item in the subject line.

There was also a slight discussion/clarification on the functions named marshal_write_*() that output a byte format for the AST that is supposed to be agnostic of implementation. This will most likely end up being used as the way to pass AST objects back and forth between C and Python code. But with the name collision of the word "marshal" with the actual 'marshal' module, it needs to be changed. I have suggested

- byte_encode
- linear_form
- zephyr_encoding
- flat_form
- flat_prefix
- prefix_form

while Nick Coghlan suggested

- linear_ast
- bytestream_ast

Obviously I prefer "form" and Nick prefers "ast". With Nick's reply being independent of mine it will most likely have "linear" or "byte" in the name.

With the patches for descriptors and generator expressions sitting on SF, syntactic support for all of Python 2.4 should get applied shortly. After that it will come down to bug hunting and such. There is a todo list in the design doc for those interested in helping out.

Contributing threads:

- `Procedure for AST Branch patches `__
- `[AST] A somewhat less trivial patch than the last one. . . `__
- `[AST] question about marshal_write_*() fxns `__

-------------------------------------------------------
Putting docstrings before function declarations is ugly
-------------------------------------------------------

The idea of moving docstrings after a 'def' was proposed, making it like most other practices in other languages.
But very quickly people spoke up against the suggestion. A main argument was people just like the current way much better. I personally like the style so much that even in my C code I put the comment for all functions after the first curly brace, indented to match the flow of code.

There was also an issue of ambiguity. How do you tell where the docstring for a module is when there is a function definition with a comment right after?::

    """Module doc"""

    """Fxn doc"""
    def foo(): pass

There is an ambiguity there thanks to constant string concatenation. In the end no one seemed to like the idea.

Contributing threads:

- `docstring before function declaration `__

-------------------------------------------
PyPI improvements thanks to PyCon sprinting
-------------------------------------------

Thanks to the hard work of Richard Jones, "Fred Drake, Sean Reifschneider, Martin v. Löwis, Mick Twomey, John Camara, Andy Harrington, Andrew Kuchling, David Goodger and Ian Bicking (with Barry Warsaw in a supporting role)" according to Richard, there are a bunch of new features to PyPI_ (pronounced "pippy" to prevent name clashes with PyPy). These improvements include using reST_ for descriptions, a new 'upload' feature for Distutils (requires Python 2.5), ability to sign releases using OpenPGP (requires Python 2.5), metadata fields are now expected to be UTF-8 encoded, interface cleanup, and saner URLs for projects (e.g., http://www.python.org/pypi/roundup/0.8.2).

.. _PyPI: http://www.python.org/pypi/

Contributing threads:

- `New PyPI broken package editing `__
- `Re: python/dist/src/Lib/distutils/command upload.py, 1.3, 1.4 `__

-------------------------------
Decorators for class statements
-------------------------------

The desire to have decorators applied to class statements was brought up once again. Guido quickly responded, though, stating that unless a compelling use case showed them to be much more useful than metaclasses, it just would not happen.
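As an illustration of the metaclass alternative Guido is pointing to, here is a hypothetical plugin registry (the names are invented for this sketch, and it uses the modern ``metaclass=`` syntax rather than the ``__metaclass__`` spelling of the time):

```python
registry = []

class PluginMeta(type):
    """Record every class created with this metaclass."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        registry.append(cls)
        return cls

class Plugin(metaclass=PluginMeta):
    pass

# Subclasses inherit the metaclass, so they are registered too,
# with no per-class decoration needed.
class JSONPlugin(Plugin):
    pass
```

Since the metaclass is inherited, every subclass is picked up automatically, which is one reason it covers many of the use cases proposed for class decorators.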
Contributing threads:

- `@decoration of classes `__

===============
Skipped Threads
===============

+ itertools.walk()
+ Problems with definition of _POSIX_C_SOURCE
+ thread semantics for file objects Assume nothing is thread-safe
+ Draft PEP to make file objects support non-blocking mode.
+ Faster Set.discard() method?
+ __metaclass__ problem
+ Example workaround classes for using Unicode with csv module...
+ Change 'env var BROWSER override' semantics in webbrowser.py
+ bdist_deb checkin comments
+ Python 2.4 | 7.3 The for statement
+ Patch review: all webbrowser.py related patches up to 2005-03-20
+ webbrowser.py: browser >/dev/null 2>&1
+ C API for the bool type?
+ Shorthand for lambda
+ FYI: news items about Burton Report on P-languages
+ using SCons to build Python
+ 64-bit sequence and buffer protocol
+ Pickling instances of nested classes
+ python.org/sf URLs aren't working?

From walter at livinglogic.de Fri Apr 1 13:17:24 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri Apr 1 13:17:29 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424C561F.5060409@strakt.com>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com>
Message-ID: <424D2DC4.4060904@livinglogic.de>

Samuele Pedroni wrote:
>> [...]
>> And having the full name of the class available would certainly help
>> in debugging.
>
> that's probably the only plus point but the names would be confusing wrt
> modules vs. classes.

You'd probably need a different separator in repr.
XIST does this:

>>> from ll.xist.ns import html
>>> html.a.Attrs.href

> My point was that enabling reduce hooks at the metaclass level has
> probably other interesting applications, is far less complicated than
> your proposal to implement, it does not further complicate the notion of
> what happens at class creation time, and indeed avoids the
> implementation costs (for all python impls) of your proposal and still
> allows fairly generic solutions to the problem at hand because the
> solution can be formulated at the metaclass level.

Pickling classes like objects (i.e. by using the pickling methods in their (meta-)classes) solves only the second part of the problem: Finding the nested classes in the module on unpickling. The other problem is to add additional info to the inner class, which gets pickled and makes it findable on unpickling.

> If pickle.py is patched along these lines [*] (strawman impl, not much
> tested but test_pickle.py still passes, needs further work to support
> __reduce_ex__ and cPickle would need similar changes) then this example
> works:
>
> class HierarchMeta(type):
>     """metaclass such that inner classes know their outer class, with
>     pickling support"""
>     def __new__(cls, name, bases, dic):
>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]

I did something similar to this in XIST, but the problem with this approach is that in:

class Foo(Elm):
    pass

class Bar(Elm):
    Baz = Foo

the class Foo will get its _outer_ set to Bar although it shouldn't.

> [...]
>     def __reduce__(cls):
>         if hasattr(cls, '_outer_'):
>             return getattr, (cls._outer_, cls.__name__)
>         else:
>             return cls.__name__

I like this approach: Instead of hardcoding how references to classes are pickled (pickle the __name__), delegate it to the metaclass.

BTW, if classes and functions are picklable, why aren't modules:

>>> import urllib, cPickle
>>> cPickle.dumps(urllib.URLopener)
'curllib\nURLopener\np1\n.'
>>> cPickle.dumps(urllib.splitport)
'curllib\nsplitport\np1\n.'
>>> cPickle.dumps(urllib)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle module objects

We'd just have to pickle the module name.

Bye,
Walter Dörwald

From pedronis at strakt.com Fri Apr 1 15:57:37 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Fri Apr 1 15:57:47 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424D2DC4.4060904@livinglogic.de>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com> <424D2DC4.4060904@livinglogic.de>
Message-ID: <424D5351.1070305@strakt.com>

Walter Dörwald wrote:
> Samuele Pedroni wrote:
>
>>> [...]
>>> And having the full name of the class available would certainly help
>>> in debugging.
>>
>> that's probably the only plus point but the names would be confusing wrt
>> modules vs. classes.
>
> You'd probably need a different separator in repr. XIST does this:
>
> >>> from ll.xist.ns import html
> >>> html.a.Attrs.href
>
>> My point was that enabling reduce hooks at the metaclass level has
>> probably other interesting applications, is far less complicated than
>> your proposal to implement, it does not further complicate the notion of
>> what happens at class creation time, and indeed avoids the
>> implementation costs (for all python impls) of your proposal and still
>> allows fairly generic solutions to the problem at hand because the
>> solution can be formulated at the metaclass level.
>
> Pickling classes like objects (i.e. by using the pickling methods in
> their (meta-)classes) solves only the second part of the problem:
> Finding the nested classes in the module on unpickling. The other
> problem is to add additional info to the inner class, which gets
> pickled and makes it findable on unpickling.
>
>> If pickle.py is patched along these lines [*] (strawman impl, not much
>> tested but test_pickle.py still passes, needs further work to support
>> __reduce_ex__ and cPickle would need similar changes) then this
>> example works:
>>
>> class HierarchMeta(type):
>>     """metaclass such that inner classes know their outer class, with
>>     pickling support"""
>>     def __new__(cls, name, bases, dic):
>>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
>
> I did something similar to this in XIST, but the problem with this
> approach is that in:
>
> class Foo(Elm):
>     pass
>
> class Bar(Elm):
>     Baz = Foo
>
> the class Foo will get its _outer_ set to Bar although it shouldn't.

this should approximate that behavior better: [not tested]

import sys

....
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
        newtype = type.__new__(cls, name, bases, dic)
        for x in sub:
            if (not hasattr(x, '_outer_') and
                getattr(sys.modules.get(x.__module__), x.__name__, None) is not x):
                x._outer_ = newtype
        return newtype
.....

we don't set _outer_ if a way to pickle the class is already there

From tjreedy at udel.edu Fri Apr 1 18:21:40 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Apr 1 18:23:55 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31[draft]
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID:

>This led to a much more fleshed out design document
> (found in Python/compile.txt in the AST branch),

The directory URL

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/?only_with_tag=ast-branch

or even the file URL

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.10&only_with_tag=ast-branch&view=auto

would be helpful to people not fully familiar with the depository and the required prefix to 'Python' (versus 'python').
I initially found the two-year-old

http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/sandbox/ast/

>The idea of moving docstrings after a 'def' was proposed

/after/before/

From Scott.Daniels at Acm.Org Fri Apr 1 19:01:08 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Fri Apr 1 19:02:35 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31 [draft]
In-Reply-To: <424D16B5.4090204@ocf.berkeley.edu>
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID:

Brett C. wrote:
> ... I figured I would take up the idea. So hear
                                            ^^ here ^^
> we go.

From walter at livinglogic.de Fri Apr 1 21:29:44 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri Apr 1 21:29:47 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424D5351.1070305@strakt.com>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com> <424D2DC4.4060904@livinglogic.de> <424D5351.1070305@strakt.com>
Message-ID: <424DA128.2060809@livinglogic.de>

Samuele Pedroni wrote:
> [...]
>
> this should approximate that behavior better: [not tested]
>
> import sys
>
> ....
>     def __new__(cls, name, bases, dic):
>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
>         newtype = type.__new__(cls, name, bases, dic)
>         for x in sub:
>             if not hasattr(x, '_outer_') and
>                 getattr(sys.modules.get(x.__module__), x.__name__, None) is not x:
>                 x._outer_ = newtype
>         return newtype
>
> .....
>
> we don't set _outer_ if a way to pickle the class is already there

This doesn't fix

class Foo:
    class Bar:
        pass

class Baz:
    Bar = Foo.Bar

but this should be a simple fix.
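For reference, the lookup problem this thread wrestles with is exactly what later Python versions solved by pickling classes under their qualified name (pickle protocol 4 and ``__qualname__``); a minimal sketch in modern Python, not the 2005 behavior being patched above:

```python
import pickle

class Outer:
    class Inner:
        pass

# Protocol 4 records the qualified name "Outer.Inner", so the nested
# class can be located again on unpickling without any _outer_
# bookkeeping on the class objects themselves.
data = pickle.dumps(Outer.Inner, protocol=4)
restored = pickle.loads(data)
```

Unpickling walks the dotted qualname from the module, which is essentially the ``getattr(cls._outer_, cls.__name__)`` idea generalized.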
Bye,
Walter Dörwald

From ejones at uwaterloo.ca Fri Apr 1 21:36:07 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Fri Apr 1 21:35:31 2005
Subject: [Python-Dev] Unicode byte order mark decoding
Message-ID: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca>

I recently rediscovered this strange behaviour in Python's Unicode handling. I *think* it is a bug, but before I go and try to hack together a patch, I figure I should run it by the experts here on Python-Dev. If you understand Unicode, please let me know if there are problems with making these minor changes.

>>> import codecs
>>> codecs.BOM_UTF8.decode( "utf8" )
u'\ufeff'
>>> codecs.BOM_UTF16.decode( "utf16" )
u''

Why does the UTF-16 decoder discard the BOM, while the UTF-8 decoder turns it into a character? The UTF-16 decoder contains logic to correctly handle the BOM. It even handles byte swapping, if necessary. I propose that the UTF-8 decoder should have the same logic: it should remove the BOM if it is detected at the beginning of a string. This will remove a bit of manual work for Python programs that deal with UTF-8 files created on Windows, which frequently have the BOM at the beginning. The Unicode standard is unclear about how it should be handled (version 4, section 15.9):

> Although there are never any questions of byte order with UTF-8 text,
> this sequence can serve as signature for UTF-8 encoded text where the
> character set is unmarked. [...] Systems that use the byte order mark
> must recognize when an initial U+FEFF signals the byte order. In those
> cases, it is not part of the textual content and should be removed
> before processing, because otherwise it may be mistaken for a
> legitimate zero width no-break space.

At the very least, it would be nice to add a note about this to the documentation, and possibly add this example function that implements the "UTF-8 or ASCII?"
logic:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else: return s.decode( "ascii" )

As a second issue, the UTF-16LE and UTF-16BE decoders almost do the right thing: They turn the BOM into a character, just like the Unicode specification says they should.

>>> codecs.BOM_UTF16_LE.decode( "utf-16le" )
u'\ufeff'
>>> codecs.BOM_UTF16_BE.decode( "utf-16be" )
u'\ufeff'

However, they also *incorrectly* handle the reversed byte order mark:

>>> codecs.BOM_UTF16_BE.decode( "utf-16le" )
u'\ufffe'

This is *not* a valid Unicode character. The Unicode specification (version 4, section 15.8) says the following about non-characters:

> Applications are free to use any of these noncharacter code points
> internally but should never attempt to exchange them. If a
> noncharacter is received in open interchange, an application is not
> required to interpret it in any way. It is good practice, however, to
> recognize it as a noncharacter and to take appropriate action, such as
> removing it from the text. Note that Unicode conformance freely allows
> the removal of these characters. (See C10 in Section 3.2, Conformance
> Requirements.)

My interpretation of the specification means that Python should silently remove the character, resulting in a zero length Unicode string. Similarly, both of the following lines should also result in a zero length Unicode string:

>>> '\xff\xfe\xfe\xff'.decode( "utf16" )
u'\ufffe'
>>> '\xff\xfe\xff\xff'.decode( "utf16" )
u'\uffff'

Thanks for your feedback,

Evan Jones

From mal at egenix.com Fri Apr 1 22:19:40 2005
From: mal at egenix.com (M.-A.
Lemburg) Date: Fri Apr 1 22:19:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> Message-ID: <424DACDC.4080601@egenix.com> Evan Jones wrote: > I recently rediscovered this strange behaviour in Python's Unicode > handling. I *think* it is a bug, but before I go and try to hack > together a patch, I figure I should run it by the experts here on > Python-Dev. If you understand Unicode, please let me know if there are > problems with making these minor changes. > > >>>> import codecs >>>> codecs.BOM_UTF8.decode( "utf8" ) > u'\ufeff' >>>> codecs.BOM_UTF16.decode( "utf16" ) > u'' > > Why does the UTF-16 decoder discard the BOM, while the UTF-8 decoder > turns it into a character? The BOM (byte order mark) was a non-standard Microsoft invention to detect Unicode text data as such (MS always uses UTF-16-LE for Unicode text files). It is not needed for the UTF-8 because that format doesn't rely on the byte order and the BOM character at the beginning of a stream is a legitimate ZWNBSP (zero width non breakable space) code point. The "utf-16" codec detects and removes the mark, while the two others "utf-16-le" (little endian byte order) and "utf-16-be" (big endian byte order) don't. > The UTF-16 decoder contains logic to > correctly handle the BOM. It even handles byte swapping, if necessary. I > propose that the UTF-8 decoder should have the same logic: it should > remove the BOM if it is detected at the beginning of a string. -1; there's no standard for UTF-8 BOMs - adding it to the codecs module was probably a mistake to begin with. You usually only get UTF-8 files with BOM marks as the result of recoding UTF-16 files into UTF-8. > This will > remove a bit of manual work for Python programs that deal with UTF-8 > files created on Windows, which frequently have the BOM at the > beginning. 
> The Unicode standard is unclear about how it should be
> handled (version 4, section 15.9):
>
>> Although there are never any questions of byte order with UTF-8 text,
>> this sequence can serve as signature for UTF-8 encoded text where the
>> character set is unmarked. [...] Systems that use the byte order mark
>> must recognize when an initial U+FEFF signals the byte order. In those
>> cases, it is not part of the textual content and should be removed
>> before processing, because otherwise it may be mistaken for a
>> legitimate zero width no-break space.
>
> At the very least, it would be nice to add a note about this to the
> documentation, and possibly add this example function that implements
> the "UTF-8 or ASCII?" logic:
>
> def autodecode( s ):
>     if s.startswith( codecs.BOM_UTF8 ):
>         # The byte string s is UTF-8
>         out = s.decode( "utf8" )
>         return out[1:]
>     else: return s.decode( "ascii" )

Well, I'd say that's a very English way of dealing with encoded text ;-)

BTW, how do you know that s came from the start of a file and not from slicing some already loaded file somewhere in the middle ?

> As a second issue, the UTF-16LE and UTF-16BE decoders almost do the
> right thing: They turn the BOM into a character, just like the Unicode
> specification says they should.
>
>>>> codecs.BOM_UTF16_LE.decode( "utf-16le" )
> u'\ufeff'
>>>> codecs.BOM_UTF16_BE.decode( "utf-16be" )
> u'\ufeff'
>
> However, they also *incorrectly* handle the reversed byte order mark:
>
>>>> codecs.BOM_UTF16_BE.decode( "utf-16le" )
> u'\ufffe'
>
> This is *not* a valid Unicode character. The Unicode specification
> (version 4, section 15.8) says the following about non-characters:
>
>> Applications are free to use any of these noncharacter code points
>> internally but should never attempt to exchange them. If a
>> noncharacter is received in open interchange, an application is not
>> required to interpret it in any way.
It is good practice, however, to >> recognize it as a noncharacter and to take appropriate action, such as >> removing it from the text. Note that Unicode conformance freely allows >> the removal of these characters. (See C10 in Section3.2, Conformance >> Requirements.) > > > My interpretation of the specification means that Python should silently > remove the character, resulting in a zero length Unicode string. > Similarly, both of the following lines should also result in a zero > length Unicode string: > >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > u'\ufffe' >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > u'\uffff' Hmm, wouldn't it be better to raise an error ? After all, a reversed BOM mark in the stream looks a lot like you're trying to decode a UTF-16 stream assuming the wrong byte order ?! Other than that: +1 on fixing this case. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 01 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From bac at OCF.Berkeley.EDU Fri Apr 1 22:52:47 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Apr 1 22:52:53 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31[draft]
In-Reply-To:
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID: <424DB49F.4060607@ocf.berkeley.edu>

Terry Reedy wrote:
>>This led to a much more fleshed out design document
>>(found in Python/compile.txt in the AST branch),
>
> The directory URL
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/?only_with_tag=ast-branch
>
> or even the file URL
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.10&only_with_tag=ast-branch&view=auto
>
> would be helpful to people not fully familiar with the depository and the
> required prefix to 'Python' (versus 'python'). I initially found the
> two-year-old
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/sandbox/ast/

Yeah, that has become a popular suggestion. It has been fixed. Just didn't think about it. One of those instances where I have been neck-deep in python-dev for so long I forgot that not everyone has a CVS checkout. =)

>>The idea of moving docstrings after a 'def' was proposed
>
> /after/before/

Fixed. Thanks, Terry.

-Brett

From ejones at uwaterloo.ca Sat Apr 2 05:04:11 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sat Apr 2 05:03:36 2005
Subject: [Python-Dev] Unicode byte order mark decoding
In-Reply-To: <424DACDC.4080601@egenix.com>
References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com>
Message-ID:

On Apr 1, 2005, at 15:19, M.-A. Lemburg wrote:
> The BOM (byte order mark) was a non-standard Microsoft invention
> to detect Unicode text data as such (MS always uses UTF-16-LE for
> Unicode text files).

Well, its origins do not really matter since at this point the BOM is firmly encoded in the Unicode standard. It seems to me that it is in everyone's best interest to support it.
> It is not needed for the UTF-8 because that format doesn't rely on
> the byte order and the BOM character at the beginning of a stream is
> a legitimate ZWNBSP (zero width non breakable space) code point.

You are correct: it is a legitimate character. However, its use as a ZWNBSP character has been deprecated:

> The overloading of semantics for this code point has caused problems
> for programs and protocols. The new character U+2060 WORD JOINER has
> the same semantics in all cases as U+FEFF, except that it cannot be
> used as a signature. Implementers are strongly encouraged to use word
> joiner in those circumstances whenever word joining semantics is
> intended.

Also, the Unicode specification is ambiguous on what an implementation should do about a leading ZWNBSP that is encoded in UTF-16. Like I mentioned, if you look at the Unicode standard, version 4, section 15.9, it says:

> 2. Unmarked Character Set. In some circumstances, the character set
> information for a stream of coded characters (such as a file) is not
> available. The only information available is that the stream contains
> text, but the precise character set is not known.

This seems to indicate that it is permitted to strip the BOM from the beginning of UTF-8 text.

> -1; there's no standard for UTF-8 BOMs - adding it to the
> codecs module was probably a mistake to begin with. You usually
> only get UTF-8 files with BOM marks as the result of recoding
> UTF-16 files into UTF-8.

This is clearly incorrect. The UTF-8 BOM is specified in the Unicode standard version 4, section 15.9:

> In UTF-8, the BOM corresponds to the byte sequence <EF BB BF>.

I normally find files with UTF-8 BOMs from many Windows applications when you save a text file as UTF8. I think that Notepad or WordPad does this, for example. I think UltraEdit also does the same thing. I know that Scintilla definitely does.
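The byte sequence in question is easy to check directly, and Python later added a "utf-8-sig" codec (in 2.5, after this discussion) that strips exactly this signature on decode; a small sketch in modern Python:

```python
import codecs

# The UTF-8 signature is the three bytes EF BB BF.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" keeps the BOM as a leading U+FEFF character,
# while "utf-8-sig" removes it during decoding.
kept = raw.decode("utf-8")
stripped = raw.decode("utf-8-sig")
```

The "utf-8-sig" codec also leaves BOM-less input untouched, so it behaves like the autodecode idea without the manual check.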
>> At the very least, it would be nice to add a note about this to the
>> documentation, and possibly add this example function that implements
>> the "UTF-8 or ASCII?" logic.
>
> Well, I'd say that's a very English way of dealing with encoded
> text ;-)

Please note I am saying only that something like this may want to be considered for addition to the documentation, and not to the Python standard library. This example function more closely replicates the logic that is used by those Windows applications when opening ".txt" files. It uses the default locale if there is no BOM:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else: return s.decode()

> BTW, how do you know that s came from the start of a file
> and not from slicing some already loaded file somewhere
> in the middle ?

Well, the same argument could be applied to the UTF-16 decoder: how does it know that the string came from the start of a file, and not from slicing some already loaded file? The standard states that:

> In the UTF-16 encoding scheme, U+FEFF at the very beginning of a file
> or stream explicitly signals the byte order.

So it is perfectly permissible to perform this type of processing if you consider a string to be equivalent to a stream.

>> My interpretation of the specification means that Python should
>> silently remove the character, resulting in a zero length Unicode string.
>
> Hmm, wouldn't it be better to raise an error ? After all,
> a reversed BOM mark in the stream looks a lot like you're
> trying to decode a UTF-16 stream assuming the wrong
> byte order ?!

Well, either one is possible, however the Unicode standard suggests, but does not require, silently removing them:

> It is good practice, however, to recognize it as a noncharacter and to
> take appropriate action, such as removing it from the text. Note that
> Unicode conformance freely allows the removal of these characters.
I would prefer silently ignoring them from the str.decode() function, since I believe in "be strict in what you emit, but liberal in what you accept." I think that this only applies to str.decode(). Any other attempt to create non-characters, such as unichr( 0xffff ), *should* raise an exception because clearly the programmer is making a mistake. > Other than that: +1 on fixing this case. Cool! Evan Jones From irmen at xs4all.nl Sat Apr 2 17:24:31 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sat Apr 2 17:24:34 2005 Subject: [Python-Dev] New bug, directly assigned, okay? Message-ID: <424EB92F.5080308@xs4all.nl> I just added a new bug on SF (1175396) and because I think that it is related to other bugs that were assigned to Walter Doerwald, I assigned this new bug directly to Walter too. Is that good practice or does someone else usually assign SF bugs to people? --Irmen From ncoghlan at iinet.net.au Sat Apr 2 17:57:14 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Apr 2 17:57:21 2005 Subject: [Python-Dev] New bug, directly assigned, okay? In-Reply-To: <424EB92F.5080308@xs4all.nl> References: <424EB92F.5080308@xs4all.nl> Message-ID: <424EC0DA.1020307@iinet.net.au> Irmen de Jong wrote: > I just added a new bug on SF (1175396) and because I think > that it is related to other bugs that were assigned to > Walter Doerwald, I assigned this new bug directly to Walter too. > > Is that good practice or does someone else usually assign SF bugs to people? I've certainly done that a few times myself - I figure that even if I get it wrong, the recipient will either pass it on to a more appropriate person, or simply revert it back to unassigned. I usually try to put in a comment to say *why* I've assigned it the way I have, though. Picking an assignee at random should probably be discouraged, but if there is someone that makes sense, then I don't see a problem with asking them to look at it directly. Cheers, Nick. 
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From irmen at xs4all.nl Sat Apr 2 18:04:21 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sat Apr 2 18:04:23 2005 Subject: [Python-Dev] New bug, directly assigned, okay? In-Reply-To: <424EC0DA.1020307@iinet.net.au> References: <424EB92F.5080308@xs4all.nl> <424EC0DA.1020307@iinet.net.au> Message-ID: <424EC285.8020703@xs4all.nl> Nick Coghlan wrote: > Irmen de Jong wrote: > >> I just added a new bug on SF (1175396) and because I think >> that it is related to other bugs that were assigned to >> Walter Doerwald, I assigned this new bug directly to Walter too. >> >> Is that good practice or does someone else usually assign SF bugs to >> people? > > > I've certainly done that a few times myself - I figure that even if I > get it wrong, the recipient will either pass it on to a more appropriate > person, or simply revert it back to unassigned. Ah, okay. > I usually try to put in a comment to say *why* I've assigned it the way > I have, though. Picking an assignee at random should probably be > discouraged, but if there is someone that makes sense, then I don't see > a problem with asking them to look at it directly. Yep, that's what I've done. In my bug report (about codecs.readline) I referenced the two other bugs related to it (those were assigned to Walter). Thanks, Irmen. From ottrey at py.redsoft.be Sat Apr 2 09:22:41 2005 From: ottrey at py.redsoft.be (ottrey@py.redsoft.be) Date: Sat Apr 2 21:15:12 2005 Subject: [Python-Dev] hierarchicial named groups extension to the re library Message-ID: I've written an extension to the re library, to provide a more complete matching of hierarchical named groups in regular expressions. 
I've set up a sourceforge project for it:
http://pyre2.sourceforge.net/

re2 extracts a hierarchy of named groups matches from a string, rather
than the flat, incomplete dictionary that the standard re module
returns.

(ie. the re library only returns the ~last~ match for named groups -
not a list of ~all~ the matches for the named groups. And the hierarchy
of those named groups is non-existent in the flat dictionary of matches
that results. )

eg.

>>> import re
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat1=re.compile(regex)
>>> m=pat1.match(buf)
>>> m.groupdict()
{'verse': '10 lords a-leaping', 'number': '10', 'activity': 'lords a-leaping'}

>>> import re2
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'verse': [{'number': '12', 'activity': 'drummers drumming'},
{'number': '11', 'activity': 'pipers piping'},
{'number': '10', 'activity': 'lords a-leaping'}]}

(See http://pyre2.sourceforge.net/ for more details.)

I am wondering what would be the best direction to take this project in.

Firstly is it, (or can it be made) useful enough to be included in the
python stdlib? (ie. Should I bother writing a PEP for it.)

And if so, would it be best to merge its functionality in with the re
library, or to leave it as a separate module?

And, also are there any suggestions/criticisms on the library itself?
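The "last match only" behaviour described above can be approximated with the stdlib alone by iterating the inner sub-pattern with finditer — a minimal sketch (a workaround, not what re2 does internally):

```python
import re

# Run finditer with just the inner "verse" sub-pattern instead of
# anchoring the whole line; each match carries its own groupdict,
# so no match is lost.
buf = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
verse = re.compile(r'(?P<number>\d+) (?P<activity>[^,]+)')
verses = [m.groupdict() for m in verse.finditer(buf)]
print(verses)
# -> [{'number': '12', 'activity': 'drummers drumming'}, ...]
```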
From nidoizo at yahoo.com Sat Apr 2 23:01:40 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Apr 2 23:00:44 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To:
References:
Message-ID:

ottrey@py.redsoft.be wrote:
>>>>import re2
>>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>>>pat2=re2.compile(regex)
>>>>x=pat2.extract(buf)
>>>>x
>
> {'verse': [{'number': '12', 'activity': 'drummers
> drumming'}, {'number': '11', 'activity': 'pipers
> piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}

Is a dictionary the right container or should another class be used?
Because in the example the content of the "verse" group is lost,
excluding its sub-groups. Something like a hierarchic MatchObject could
provide access to both information, the sub-groups and the group itself.
Also, should it be limited to named groups?

> I am wondering what would be the best direction to take this project in.
>
> Firstly is it, (or can it be made) useful enough to be included in the
> python stdlib? (ie. Should I bother writing a PEP for it.)
>
> And if so, would it be best to merge its functionality in with the re
> library, or to leave it as a separate module?
>
> And, also are there any suggestions/criticisms on the library itself?

I find the feature very interesting, but being used to live without it,
I have difficulty evaluating its usefulness. However, it reminds me how
much at first I found strange that only the last match was kept, so I
think, FWIW, that on a purist point of view the functionality would make
sense in the stdlib in some way or another.
Regards,
Nicolas

From jcarlson at uci.edu Sun Apr 3 01:01:57 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun Apr 3 01:11:45 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To:
References:
Message-ID: <20050402134150.7215.JCARLSON@uci.edu>

Nicolas Fleury wrote:
>
> ottrey@py.redsoft.be wrote:
> >>>>import re2
> >>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
> >>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
> >>>>pat2=re2.compile(regex)
> >>>>x=pat2.extract(buf)

If one wanted to match the API of the re module, one should use
pat2.findall(buf), which would return a list of 'hierarchical match
objects', though with the above, one should really return a list of
'verse' items (the way the regular expression is written).

> >>>>x
> >
> > {'verse': [{'number': '12', 'activity': 'drummers
> > drumming'}, {'number': '11', 'activity': 'pipers
> > piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>
> Is a dictionary the good container or should another class be used?
> Because in the example the content of the "verse" group is lost,
> excluding its sub-groups. Something like a hierarchic MatchObject could
> provide access to both information, the sub-groups and the group itself.

Its contents are not lost, look at the overall dictionary...  In any
case, I think one can do better than a dictionary.

>>> x=pat2.match(buf) #or x=pat2.findall(buf)[0]
>>> x
'12 drummers drumming,'
>>> dir(x)
['verse']
>>> x.verse
'12 drummers drumming,'
>>> dir(x.verse)
['number', 'activity']
>>> x.verse.number
'12'
>>> x.verse.activity
'drummers drumming'

...would get my vote (or using obj.group(i) semantics I discuss below).
I notice that this is basically what the re2 module already does (having
read the web page), though rather than...

>>> pat2.extract(buf).verse[1].activity
'pipers piping'

I would prefer...
>>> pat2.findall(buf)[1].verse.activity
'pipers piping'

For .verse[1] or .verse[2] to make sense, it implies that the pattern is
something like...

((?P<verse>... )(?P<verse>...))

... which it isn't.  I understand that the decision was probably made to
make it similar to the case of...

((?P<foo>... (?P<goo>...)+))

... where multiple matches for goo would require x.foo.goo[i].

> Also, should it be limited to named groups?

Probably not.  I would suggest using matchobj.group(i) semantics to
match the standard re module semantics, though only allow returning
items in the current level of the hierarchy.  That is, one could use
x.verse.group(1) and get back '12', but x.group(1) would return '12
pipers piping'

> > I am wondering what would be the best direction to take this project in.
> >
> > Firstly is it, (or can it be made) useful enough to be included in the
> > python stdlib? (ie. Should I bother writing a PEP for it.)
> >
> > And if so, would it be best to merge its functionality in with the re
> > library, or to leave it as a separate module?
> >
> > And, also are there any suggestions/criticisms on the library itself?
>
> I find the feature very interesting, but being used to live without it,
> I have difficulty evaluating its usefulness. However, it reminds me how
> much at first I found strange that only the last match was kept, so I
> think, FWIW, that on a purist point of view the functionality would make
> sense in the stdlib in some way or another.

re2 can be used as a limited structural parser.  This makes the re
module useful for more things than it is currently.  The question of it
being in the standard library, however, I think should be made based on
the criteria used previously (whatever they were).
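The attribute-style hierarchy sketched above could be modeled roughly like this (a hypothetical illustration — the `HMatch` class and its layout are assumptions for the sake of discussion, not re2's actual implementation):

```python
class HMatch(str):
    """Hypothetical sketch: a matched string that also exposes its
    named sub-matches as attributes, so x.verse.number works while
    x itself still behaves as the matched text."""
    def __new__(cls, text, **children):
        obj = super().__new__(cls, text)
        for name, child in children.items():
            setattr(obj, name, child)
        return obj

x = HMatch('12 drummers drumming,',
           verse=HMatch('12 drummers drumming',
                        number=HMatch('12'),
                        activity=HMatch('drummers drumming')))
print(x.verse.number)      # -> 12
print(x.verse.activity)    # -> drummers drumming
```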
- Josiah

From nidoizo at yahoo.com Sun Apr 3 02:16:44 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sun Apr 3 02:14:49 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <20050402134150.7215.JCARLSON@uci.edu>
References: <20050402134150.7215.JCARLSON@uci.edu>
Message-ID:

Josiah Carlson wrote:
> Nicolas Fleury wrote:
>>ottrey@py.redsoft.be wrote:
>>
>>>>>>import re2
>>>>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>>>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>>>>>pat2=re2.compile(regex)
>>>>>>x=pat2.extract(buf)
>
> If one wanted to match the API of the re module, one should use
> pat2.findall(buf), which would return a list of 'hierarchical match
> objects', though with the above, one should really return a list of
> 'verse' items (the way the regular expression is written).

As far as I can understand, the two are orthogonal.  findall is used to
match the regular expression multiple times; in that case the regular
expression is still matched only once.

>>>{'verse': [{'number': '12', 'activity': 'drummers
>>>drumming'}, {'number': '11', 'activity': 'pipers
>>>piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>>
>>Is a dictionary the good container or should another class be used?
>>Because in the example the content of the "verse" group is lost,
>>excluding its sub-groups. Something like a hierarchic MatchObject could
>>provide access to both information, the sub-groups and the group itself.
>
> Its contents are not lost, look at the overall dictionary...  In any
> case, I think one can do better than a dictionary.

In that specific example, I meant that the space between "10" and "lords
a-leaping" was not stored in the dictionary, unless you talk about the
dictionary from re instead of re2.  Your proposal fixes that, by making
the entire content of the parent group (verse) accessible.
>>>>x=pat2.match(buf) #or x=pat2.findall(buf)[0]
>>>>x
>
> '12 drummers drumming,'
>
>>>>dir(x)
>
> ['verse']
>
>>>>x.verse
>
> '12 drummers drumming,'
>

It is very easy to use, but I doubt it is a good idea as a return value
for match (maybe a match object could have a function to return this
easy-to-use object).  It would mean that the names of the groups are
limited by the interface of the match object returned (what would happen
if a group is named "start", "end" or simply "group"?).  Another
solution is to use x["verse"] instead (or continue to use a "group"
method).

>> Also, should it be limited to named groups?
>
> Probably not.  I would suggest using matchobj.group(i) semantics to
> match the standard re module semantics, though only allow returning
> items in the current level of the hierarchy.  That is, one could use
> x.verse.group(1) and get back '12', but x.group(1) would return '12
> pipers piping'

Totally agree that matchobj.group interface should be matched.  Should
group return another match object?  Or maybe another function to get
match objects of groups?  Something like:

x.groupobj("verse").group("number")

or

str(x["verse"]["number"])

Regards,
Nicolas

From martin at v.loewis.de Sun Apr 3 08:48:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Apr 3 08:48:20 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <20050402134150.7215.JCARLSON@uci.edu>
References: <20050402134150.7215.JCARLSON@uci.edu>
Message-ID: <424F91B0.2050307@v.loewis.de>

Josiah Carlson wrote:
> re2 can be used as a limited structural parser.  This makes the re
> module useful for more things than it is currently.  The question of it
> being in the standard library, however, I think should be made based on
> the criteria used previously (whatever they were).

In general, if developers can readily agree that a functionality should
be added (i.e. it is "obvious" for some reason), it is added right away.
Otherwise, a PEP should be written, and reviewed by the community.

In the specific case, Chris Ottrey submitted a link to his project to
the SF patches tracker, asking for inclusion.  I felt that there was
likely no immediate agreement, and suggested that he ask on python-dev,
and write a PEP.

If this kind of functionality would fall on immediate rejection for
some reason, even writing the PEP might be pointless.  If the
functionality is generally considered useful, a PEP can be written, and
then implemented according to the PEP procedures (i.e. collect
feedback, discuss alternatives, ask for BDFL pronouncement).

I personally think that the proposed functionality should *not* live
in a separate module, but somehow be integrated into SRE.  Whether or
not the proposed functionality is useful in the first place, I don't
know.  I never have nested named groups in my regular expressions.

Regards,
Martin

From ottrey at py.redsoft.be Sun Apr 3 09:24:49 2005
From: ottrey at py.redsoft.be (ottrey@py.redsoft.be)
Date: Sun Apr 3 09:25:01 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
Message-ID: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>

Nicolas Fleury wrote:
>
> ottrey at py.redsoft.be wrote:
> >>>>import re2
> >>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
> >>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
> >>>>pat2=re2.compile(regex)
> >>>>x=pat2.extract(buf)
> >>>>x
> >
> > {'verse': [{'number': '12', 'activity': 'drummers
> > drumming'}, {'number': '11', 'activity': 'pipers
> > piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>
> Is a dictionary the good container or should another class be used?
> Because in the example the content of the "verse" group is lost,
> excluding its sub-groups. Something like a hierarchic MatchObject could
> provide access to both information, the sub-groups and the group itself.

Yes, very good point.
Actually it ~is~ a container (that uses dict as its base class).
(I probably should add the following lines to the example.)

>>> type(x)

>>> x._value
'12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> x.verse[0]._value
'12 drummers drumming'

Josiah Carlson <jcarlson@uci.edu> wrote:
> If one wanted to match the API of the re module, one should use
> pat2.findall(buf), which would return a list of 'hierarchical match
> objects'

Well, that would be something I'd want to discuss here, as I'm not sure
if I actually ~want~ to match the API of the re module.

> Also, should it be limited to named groups?

I have given that some thought as well.

Internally un-named groups are recursively given the names _group0,
_group1 etc as they are found.  And then those groups are recursively
matched.  And in the final step the resulting _Match object is
compressed and those un-named groups are discarded.

IMO, if you don't bother to name a group then you probably aren't going
to be interested in it anyway - so why keep a reference to it?

eg. If you only wanted to extract the numbers from those verses...

>>> regex='^(((?P<number>\d+) ([^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'number': ['12', '11', '10']}

Before the compression stage the _Match object actually looked like
this:

{'_group0': {'_value': '12 drummers drumming, 11 pipers piping, 10
lords a-leaping', '_group0': [{'_value': '12 drummers drumming, ',
'_group1': ', ', '_group0': {'_value': '12 drummers drumming',
'_group1': 'drummers drumming', 'number': '12'}}, {'_value': '11
pipers piping, ', '_group1': ', ', '_group0': {'_value': '11 pipers
piping', '_group1': 'pipers piping', 'number': '11'}}, {'_value': '10
lords a-leaping', '_group0': {'_value': '10 lords a-leaping',
'_group1': 'lords a-leaping', 'number': '10'}}]}}

But the compression algorithm collected the named groups and brought
them to the surface, to return the much nicer looking:

{'number': ['12', '11', '10']}

NB. There are also a few other tricks up the sleeve of re2. eg.
It allows for named groups to be repeated in different branches of a
named group hierarchy, without the name redefinition error that the re
library will complain about. eg.

>>> pat1=re2.compile( '(?P<parents>(?P<mother>(?P<name>[\w ]+)),(?P<father>(?P<name>[\w ]+)))' )
>>> pat1.extract('Mum,Dad')
{'parents': {'father': {'name': 'Dad'}, 'mother': {'name': 'Mum'}}}

> I find the feature very interesting, but being used to live without it,
> I have difficulty evaluating its usefulness.

Yes - this is a good point too, because it ~is~ different from the re
library.  re2 aims to do all that searching, grouping, iterating and
collecting and constructing work for you.

> However, it reminds me how much at first I found strange that only the
> last match was kept, so I think, FWIW, that on a purist point of view
> the functionality would make sense in the stdlib in some way or another.

Actually that "last match only" confusion was part of the motivation
for writing it in the first place.

> For .verse[1] or .verse[2] to make sense, it implies that the pattern is
> something like...
> ((?P<verse>... )(?P<verse>...))
> ... which it isn't.

Good pickup!  You've seen through my smoke and mirrors. ;-)

That list of verses was actually created in the compression stage.
(The stage that I failed to mention in my first post.)

ie. The regex was:

((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*

Which returns an un-named list of verse groups.  Something like:

{'_group0': [
  {'verse': {'number': '12', 'activity': 'drummers drumming'}},
  {'verse': {'number': '11', 'activity': 'pipers piping'}},
  {'verse': {'number': '10', 'activity': 'lords a-leaping'}}]}

But the compression algorithm discarded that '_group0' key and brought
the 'verse' groups to the surface, then grouped them together in one
'verse' list. ie. to make:

{'verse': [{'number': '12', 'activity': 'drummers drumming'},
{'number': '11', 'activity': 'pipers piping'},
{'number': '10', 'activity': 'lords a-leaping'}]}

> > Also, should it be limited to named groups?
>
> Probably not.
> I would suggest using matchobj.group(i) semantics to
> match the standard re module semantics, though only allow returning
> items in the current level of the hierarchy.  That is, one could use
> x.verse.group(1) and get back '12', but x.group(1) would return '12
> pipers piping'

Actually, I ~would~ like to limit it to just named groups.
I reckon, if you're not going to bother naming a group, then why would
you have any interest in it?
I guess it's up for discussion how confusing this "new" way of thinking
could be and what drawbacks it might have.

Regards.

Chris.

From pierre.barbier at cirad.fr Sun Apr 3 11:13:51 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Sun Apr 3 11:12:48 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
References: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
Message-ID: <424FB3CF.7020102@cirad.fr>

ottrey@py.redsoft.be a écrit :
> Nicolas Fleury wrote:
>
> [...]
>
> Actually, I ~would~ like to limit it to just named groups.
> I reckon, if you're not going to bother naming a group, then why would
> you have any interest in it?
> I guess it's up for discussion how confusing this "new" way of thinking
> could be and what drawbacks it might have.

I would find it interesting to match every group without naming them!
For example, if the position in the father group is meaningful enough,
why bother with names?  If you just allow the user to skip the
compression stage it will do the trick!

That leads me to a question: would it be possible to use, as names for
unnamed groups, integers instead of strings?  That way, you could
access unnamed groups by their rank in their father group, for example.
A small example of what I would want:

>>> buf="123 234 345, 123 256, and 123 289"
>>> regex=r'^(( *\d+)+,)+ *(?P<logic>[^ ]+)(( *\d+)+).*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{ 0: {'_value': "123 234 345,", 0: "123", 1: " 234", 2: " 345"},
  1: {'_value': " 123 256,", 0: " 123", 1: " 256"},
  'logic': {'_value': 'and'},
  3: {'_value': " 123 289", 1: " 123", 2: " 289"} }

Pierre

> Regards.
>
> Chris.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/pierre.barbier%40cirad.fr

--
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel : (33) 4 67 61 65 77    fax : (33) 4 67 61 56 68

From pje at telecommunity.com Sun Apr 3 16:30:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Apr 3 16:26:59 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <424F91B0.2050307@v.loewis.de>
References: <20050402134150.7215.JCARLSON@uci.edu>
 <20050402134150.7215.JCARLSON@uci.edu>
Message-ID: <5.1.1.6.0.20050403102549.0350bec0@mail.telecommunity.com>

At 08:48 AM 4/3/05 +0200, Martin v. Löwis wrote:
>I personally think that the proposed functionality should *not* live
>in a separate module, but somehow be integrated into SRE.

+1.

> Whether or
>not the proposed functionality is useful in the first place, I don't
>know. I never have nested named groups in my regular expressions.

Neither have I, but only because it doesn't do what re2 does.  :)

I'd like to suggest that the addition also allow you to match a group
by a named reference, thus allowing a complete grammar to be formed.
Of course, I don't know if the underlying regular expression engine
could actually do that, but it would be nice if it could, since it
would allow simple grammars to be more easily parsed without recourse
to a more complex parsing module.

From mwh at python.net Sun Apr 3 17:14:16 2005
From: mwh at python.net (Michael Hudson)
Date: Sun Apr 3 17:14:18 2005
Subject: [Python-Dev] longobject.c & ob_size
Message-ID: <2mk6njdh9z.fsf@starship.python.net>

Asking mostly out of curiosity, how hard would it be to have longs
store their sign bit somewhere less aggravating?  It seems to me that
the top bit of ob_digit[0] is always 0, for example, and while I'm sure
this would result in no less convolution in longobject.c, it'd be
considerably more localized convolution.

Cheers,
mwh

--
CDATA is not an integration strategy.
-- from Twisted.Quotes

From martin at v.loewis.de Sun Apr 3 18:03:29 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Apr 3 18:03:32 2005
Subject: [Python-Dev] longobject.c & ob_size
In-Reply-To: <2mk6njdh9z.fsf@starship.python.net>
References: <2mk6njdh9z.fsf@starship.python.net>
Message-ID: <425013D1.6090302@v.loewis.de>

Michael Hudson wrote:
> Asking mostly out of curiosity, how hard would it be to have longs
> store their sign bit somewhere less aggravating?  It seems to me that
> the top bit of ob_digit[0] is always 0, for example, and while I'm sure
> this would result in no less convolution in longobject.c, it'd be
> considerably more localized convolution.

I think the amount of special-casing that you need would remain the
same - i.e. you would have to mask out the sign before performing the
algorithms, then bring it back in.  Masking out the bit from digit[0]
might slow down the algorithms somewhat, because you would probably
mask it out from every digit, not only digit[0] (or else test for
digit[0], which test would then be performed for all digits).  You
would also have to keep the special case for 0L, which has ob_size==0
(i.e.
doesn't have digit[0]). That said, I think the change could be implemented within a few hours, taking a day to make the testsuite run again; depending on the review process, you might need two releases to fix the bugs (but then, it is also reasonable to expect to get it right the first time). Regards, Martin From gustavo at niemeyer.net Mon Apr 4 02:16:19 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Mon Apr 4 02:16:43 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library In-Reply-To: <424F91B0.2050307@v.loewis.de> References: <20050402134150.7215.JCARLSON@uci.edu> <424F91B0.2050307@v.loewis.de> Message-ID: <20050404001619.GA11017@burma.localdomain> Greetings, > If this kind of functionality would fall on immediate rejection for > some reason, even writing the PEP might be pointless. If the [...] In my opinion the functionality is useful. > I personally think that the proposed functionality should *not* live > in a separate module, but somehow be integrated into SRE. Whether or [...] Agreed. I propose to integrate this functionality into the SRE syntax, so that this special kind of group may be used when explicitly wanted. This would avoid backward compatibility problems, would give each regular expression a single meaning, and would allow interleaving hierarchical/non-hierarchical groups. I offer myself to integrate the change once we decide on the right way to implement it, and achieve consensus on its adoption. 
Best regards,

--
Gustavo Niemeyer
http://niemeyer.net

From gustavo at niemeyer.net Mon Apr 4 03:17:17 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Mon Apr 4 03:17:43 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
References: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
Message-ID: <20050404011717.GA11463@burma.localdomain>

Greetings Chris,

> Well, that would be something I'd want to discuss here. As I'm not
> sure if I actually ~want~ to match the API of the re module.

If this feature is considered a good addition for the standard
library, integrating it into re would be an interesting option.
But given what you say above, I'm not sure if *you* want to
make it a part of re itself.

[...]
> IMO, if you don't bother to name a group then you probably aren't going
> to be interested in it anyway - so why keep a reference to it?

That's not true.  There's a lot of code out there using unnamed groups
genuinely.  The syntax (?: ) is used when the group content is not
considered useful.
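A minimal stdlib illustration of that point, with the non-capturing form written out as (?:...):

```python
import re

# (?:...) groups the digits for the pattern's structure but does not
# capture them; only the parenthesized \w+ group is kept.
m = re.match(r'(?:\d+) (\w+)', '12 drummers')
print(m.groups())   # -> ('drummers',)  -- the (?:...) part captured nothing
```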
> > I find the feature very interesting, but being used to live without it, > > I have difficulty evaluating its usefulness. > > Yes - this is a good point too, because it ~is~ different from the re > library. re2 aims to do all that searching, grouping, iterating and > collecting and constructing work for you. [...] > Actually, I ~would~ like to limit it to just named groups. > I reckon, if you're not going to bother naming a group, then why would > you have any interest in it. > I guess its up for discussion how confusing this "new" way of thinking > could be and what drawbacks it might have. Your target seems to be a new kind of regular expressions indeed. In that case, I'm not sure if "re2" is the right name for it, given that you haven't written an improved SRE, but a completely new kind of regular expression matching which depends on SRE itself rather than extending it on a compatible way. While I would like to see *some* kind of successive matching implemented in SRE (besides the Scanner which is already available), I'm not in favor of that specific implementation. I'm open to discuss that further. -- Gustavo Niemeyer http://niemeyer.net From arigo at tunes.org Mon Apr 4 08:10:43 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Apr 4 08:17:23 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2mk6njdh9z.fsf@starship.python.net> References: <2mk6njdh9z.fsf@starship.python.net> Message-ID: <20050404061043.GA2960@vicky.ecs.soton.ac.uk> Hi Michael, On Sun, Apr 03, 2005 at 04:14:16PM +0100, Michael Hudson wrote: > Asking mostly for curiousity, how hard would it be to have longs store > their sign bit somewhere less aggravating? As I guess your goal is to get rid of all the "if (size < 0) size = -size" in object.c and friends, I should point out that longobject.c has set out an example that might have been followed by C extension writers. 
Maybe it is too late to say now that ob_size cannot be negative
any more :-(

Armin

From ottrey at py.redsoft.be Mon Apr 4 08:27:46 2005
From: ottrey at py.redsoft.be (ottrey@py.redsoft.be)
Date: Mon Apr 4 08:27:57 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <20050404011717.GA11463@burma.localdomain>
Message-ID:

Hi Gustavo!

On 4/4/2005, "Gustavo Niemeyer" <gustavo@niemeyer.net> wrote:
>> Well, that would be something I'd want to discuss here. As I'm not
>> sure if I actually ~want~ to match the API of the re module.
>
>If this feature is considered a good addition for the standard
>library, integrating it into re would be an interesting option.
>But given what you say above, I'm not sure if *you* want to
>make it a part of re itself.
>

After taking in the great comments made in this discussion, I'm now
thinking that it ~would~ be best to try and integrate the new
functionality with the existing re library (matching the current API),
as there is (at least some) re2 functionality that I think could fit
neatly into the existing re API.

As, like you say:
>Martin v. Löwis wrote:
>In general, if developers can readily agree that a functionality should
>be added (i.e. it is "obvious" for some reason), it is added right away.
>Otherwise, a PEP should be written, and reviewed by the community

I'd like to call the current functionality a "work in progress".
ie. I'd like to work on it more, taking on board the comments made here.

I'd also like to take this discussion off the python-dev list now and
shift it to pyre2.  (possibly to come back with a more polished
proposal.)

We've set up a development wiki here:
http://py.redsoft.be/pyre2/wiki/
(feel free to add any more suggestions.)

And there is also a mailing list, if anyone is interested and would
like to subscribe:
http://lists.sourceforge.net/lists/listinfo/pyre2-devel

Regards.

Chris.

From olsongt at verizon.net Mon Apr 4 23:51:34 2005
From: olsongt at verizon.net (Grant Olson)
Date: Mon Apr 4 23:54:18 2005
Subject: [Python-Dev] Mail.python.org
Message-ID: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net>

Not a big deal, but I noticed that https://mail.python.org/ is live
and shows a generic "Welcome to your new home in cyberspace!" message.
One of the webmasters may want to automatically redirect to
http://mail.python.org.

-Grant

From stephen at xemacs.org Tue Apr 5 08:25:09 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue Apr 5 08:25:20 2005
Subject: [Python-Dev] Unicode byte order mark decoding
In-Reply-To: <424DACDC.4080601@egenix.com> (M.'s message of "Fri, 01 Apr
 2005 22:19:40 +0200")
References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca>
 <424DACDC.4080601@egenix.com>
Message-ID: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "MAL" == M  writes:

    MAL> The BOM (byte order mark) was a non-standard Microsoft
    MAL> invention to detect Unicode text data as such (MS always uses
    MAL> UTF-16-LE for Unicode text files).

The Japanese "memopado" (Notepad) uses UTF-8 signatures; it even adds
them to existing UTF-8 files lacking them.
MAL> -1; there's no standard for UTF-8 BOMs - adding it to the MAL> codecs module was probably a mistake to begin with. You MAL> usually only get UTF-8 files with BOM marks as the result of MAL> recoding UTF-16 files into UTF-8. There is a standard for UTF-8 _signatures_, however. I don't have the most recent version of the ISO-10646 standard, but Amendment 2 (which defined UTF-8 for ISO-10646) specifically added the UTF-8 signature to Annex F of that standard. Evan quotes Version 4 of the Unicode standard, which explicitly defines the UTF-8 signature. So there is a standard for the UTF-8 signature, and I know of applications which produce it. While I agree with you that Python's codecs shouldn't produce it (by default), providing an option to strip is a good idea. However, this option should be part of the initialization of an IO stream which produces Unicodes, _not_ an operation on arbitrary internal strings (whether raw or Unicode). MAL> BTW, how do you know that s came from the start of a file and MAL> not from slicing some already loaded file somewhere in the MAL> middle ? The programmer or the application might, but Python's codecs don't. The point is that this is also true of rawstrings that happen to contain UTF-16 or UTF-32 data. The UTF-16 ("auto-endian") codec shouldn't strip leading BOMs either, unless it has been told it has the beginning of the string. MAL> Evan Jones wrote: >> This is *not* a valid Unicode character. The Unicode >> specification (version 4, section 15.8) says the following >> about non-characters: >> >>> Applications are free to use any of these noncharacter code >>> points internally but should never attempt to exchange >>> them. If a noncharacter is received in open interchange, an >>> application is not required to interpret it in any way. It is >>> good practice, however, to recognize it as a noncharacter and >>> to take appropriate action, such as removing it from the >>> text. 
Note that Unicode conformance freely allows the removal >>> of these characters. (See C10 in Section3.2, Conformance >>> Requirements.) >> >> My interpretation of the specification means that Python should The specification _permits_ silent removal; it does not recommend. >> silently remove the character, resulting in a zero length >> Unicode string. Similarly, both of the following lines should >> also result in a zero length Unicode string: >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > u'\ufffe' >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > u'\uffff' I strongly disagree; these decisions should be left to a higher layer. In the case of specified UTFs, the codecs should simply invert the UTF to Python's internal encoding. MAL> Hmm, wouldn't it be better to raise an error ? After all, a MAL> reversed BOM mark in the stream looks a lot like you're MAL> trying to decode a UTF-16 stream assuming the wrong byte MAL> order ?! +1 on (optionally) raising an error. -1 on removing it or anything like that, unless under control of the application (ie, the program written in Python, not Python itself). It's far too easy for software to generate broken Unicode streams[1], and the choice of how to deal with those should be with the application, not with the implementation language. Footnotes: [1] An egregious example was the Outlook Express distributed with early Win2k betas, which produced MIME bodies with apparent Content-Type: text/html; charset=utf-16, but the HTML tags and newlines were 7-bit ASCII! -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
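The decoding behaviour under debate here can be checked concretely. This is an illustration using a modern Python, where the "utf-8-sig" codec that grew out of this thread (added in Python 2.5) sits alongside plain "utf-8":

```python
# Plain "utf-8" preserves a leading signature as the character U+FEFF;
# "utf-8-sig" strips it on decoding and prepends it on encoding.
data = b"\xef\xbb\xbfab"  # UTF-8 signature (EF BB BF) followed by "ab"

assert data.decode("utf-8") == "\ufeffab"   # BOM survives as a character
assert data.decode("utf-8-sig") == "ab"     # signature stripped
assert "ab".encode("utf-8-sig") == data     # signature written once

# The auto-endian "utf-16" codec likewise consumes a leading BOM:
assert b"\xff\xfe\x61\x00".decode("utf-16") == "a"
```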
From martin at v.loewis.de Tue Apr 5 10:03:15 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 10:03:19 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42524643.3070604@v.loewis.de> Stephen J. Turnbull wrote: > So there is a standard for the UTF-8 signature, and I know of > applications which produce it. While I agree with you that Python's > codecs shouldn't produce it (by default), providing an option to strip > is a good idea. I would personally like to see an "utf-8-bom" codec (perhaps better named "utf-8-sig", which strips the BOM on reading (if present) and generates it on writing. > However, this option should be part of the initialization of an IO > stream which produces Unicodes, _not_ an operation on arbitrary > internal strings (whether raw or Unicode). With the UTF-8-SIG codec, it would apply to all operation modes of the codec, whether stream-based or from strings. Whether or not to use the codec would be the application's choice. Regards, Martin From mal at egenix.com Tue Apr 5 12:19:49 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 5 12:19:52 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42524643.3070604@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> Message-ID: <42526645.3010600@egenix.com> Martin v. L?wis wrote: > Stephen J. Turnbull wrote: > >> So there is a standard for the UTF-8 signature, and I know of >> applications which produce it. While I agree with you that Python's >> codecs shouldn't produce it (by default), providing an option to strip >> is a good idea. 
> > I would personally like to see an "utf-8-bom" codec (perhaps better > named "utf-8-sig", which strips the BOM on reading (if present) > and generates it on writing. +1. >> However, this option should be part of the initialization of an IO >> stream which produces Unicodes, _not_ an operation on arbitrary >> internal strings (whether raw or Unicode). > > > With the UTF-8-SIG codec, it would apply to all operation modes of > the codec, whether stream-based or from strings. Whether or not to > use the codec would be the application's choice. I'd suggest to use the same mode of operation as we have in the UTF-16 codec: it removes the BOM mark on the first call to the StreamReader .decode() method and writes a BOM mark on the first call to .encode() on a StreamWriter. Note that the UTF-16 codec is strict w/r to the presence of the BOM mark: you get a UnicodeError if a stream does not start with a BOM mark. For the UTF-8-SIG codec, this should probably be relaxed to not require the BOM. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From walter at livinglogic.de Tue Apr 5 12:31:06 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 12:31:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42526645.3010600@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> Message-ID: <425268EA.7070703@livinglogic.de> M.-A. Lemburg wrote: >> [...] 
>>With the UTF-8-SIG codec, it would apply to all operation modes of >>the codec, whether stream-based or from strings. Whether or not to >>use the codec would be the application's choice. > > I'd suggest to use the same mode of operation as we have in > the UTF-16 codec: it removes the BOM mark on the first call > to the StreamReader .decode() method and writes a BOM mark > on the first call to .encode() on a StreamWriter. > > Note that the UTF-16 codec is strict w/r to the presence > of the BOM mark: you get a UnicodeError if a stream does > not start with a BOM mark. For the UTF-8-SIG codec, this > should probably be relaxed to not require the BOM. I've started writing such a codec. Making the BOM optional on decoding definitely simplifies the implementation. Bye, Walter D?rwald From mal at egenix.com Tue Apr 5 12:34:53 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 5 12:34:58 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <425269CD.3090009@egenix.com> Stephen J. Turnbull wrote: >>>>>>"MAL" == M writes: > > > MAL> The BOM (byte order mark) was a non-standard Microsoft > MAL> invention to detect Unicode text data as such (MS always uses > MAL> UTF-16-LE for Unicode text files). > > The Japanese "memopado" (Notepad) uses UTF-8 signatures; it even adds > them to existing UTF-8 files lacking them. Is that a MS application ? AFAIK, notepad, wordpad and MS Office always use UTF-16-LE + BOM when saving text as "Unicode text". > MAL> -1; there's no standard for UTF-8 BOMs - adding it to the > MAL> codecs module was probably a mistake to begin with. You > MAL> usually only get UTF-8 files with BOM marks as the result of > MAL> recoding UTF-16 files into UTF-8. > > There is a standard for UTF-8 _signatures_, however. 
I don't have the > most recent version of the ISO-10646 standard, but Amendment 2 (which > defined UTF-8 for ISO-10646) specifically added the UTF-8 signature to > Annex F of that standard. Evan quotes Version 4 of the Unicode > standard, which explicitly defines the UTF-8 signature. Ok, as signature the BOM does make some sense - whether to strip signatures from a document is a good idea or not is a different matter, though. Here's the Unicode Cons. FAQ on the subject: http://www.unicode.org/faq/utf_bom.html#22 They also explicitly warn about adding BOMs to UTF-8 data since it can break applications and protocols that do not expect such a signature. > So there is a standard for the UTF-8 signature, and I know of > applications which produce it. While I agree with you that Python's > codecs shouldn't produce it (by default), providing an option to strip > is a good idea. > > However, this option should be part of the initialization of an IO > stream which produces Unicodes, _not_ an operation on arbitrary > internal strings (whether raw or Unicode). Right. > MAL> BTW, how do you know that s came from the start of a file and > MAL> not from slicing some already loaded file somewhere in the > MAL> middle ? > > The programmer or the application might, but Python's codecs don't. > The point is that this is also true of rawstrings that happen to > contain UTF-16 or UTF-32 data. The UTF-16 ("auto-endian") codec > shouldn't strip leading BOMs either, unless it has been told it has > the beginning of the string. The UTF-16 stream codecs implement this logic. The UTF-16 encode and decode functions will however always strip the BOM mark from the beginning of a string. If the application doesn't want this stripping to happen, it should use the UTF-16-LE or -BE codec resp. > MAL> Evan Jones wrote: > > >> This is *not* a valid Unicode character. 
The Unicode > >> specification (version 4, section 15.8) says the following > >> about non-characters: > >> > >>> Applications are free to use any of these noncharacter code > >>> points internally but should never attempt to exchange > >>> them. If a noncharacter is received in open interchange, an > >>> application is not required to interpret it in any way. It is > >>> good practice, however, to recognize it as a noncharacter and > >>> to take appropriate action, such as removing it from the > >>> text. Note that Unicode conformance freely allows the removal > >>> of these characters. (See C10 in Section3.2, Conformance > >>> Requirements.) > >> > >> My interpretation of the specification means that Python should > > The specification _permits_ silent removal; it does not recommend. > > >> silently remove the character, resulting in a zero length > >> Unicode string. Similarly, both of the following lines should > >> also result in a zero length Unicode string: > > >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > > u'\ufffe' > >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > > u'\uffff' > > I strongly disagree; these decisions should be left to a higher layer. > In the case of specified UTFs, the codecs should simply invert the UTF > to Python's internal encoding. > > MAL> Hmm, wouldn't it be better to raise an error ? After all, a > MAL> reversed BOM mark in the stream looks a lot like you're > MAL> trying to decode a UTF-16 stream assuming the wrong byte > MAL> order ?! > > +1 on (optionally) raising an error. The advantage of raising an error is that the application can deal with the situation in whatever way seems fit (by registering a special error handler or by simply using "ignore" or "replace"). I agree that much of this lies outside the scope of codecs and should be handled at an application or protocol level. > -1 on removing it or anything > like that, unless under control of the application (ie, the program > written in Python, not Python itself). 
It's far too easy for software > to generate broken Unicode streams[1], and the choice of how to deal > with those should be with the application, not with the implementation > language. > > Footnotes: > [1] An egregious example was the Outlook Express distributed with > early Win2k betas, which produced MIME bodies with apparent > Content-Type: text/html; charset=utf-16, but the HTML tags and > newlines were 7-bit ASCII! > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From stephen at xemacs.org Tue Apr 5 14:03:19 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 5 14:03:25 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42524643.3070604@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 05 Apr 2005 10:03:15 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> Message-ID: <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v L?wis writes: Martin> Stephen J. Turnbull wrote: >> However, this option should be part of the initialization of an >> IO stream which produces Unicodes, _not_ an operation on >> arbitrary internal strings (whether raw or Unicode). Martin> With the UTF-8-SIG codec, it would apply to all operation Martin> modes of the codec, whether stream-based or from strings. I had in mind the ability to treat a string as a stream. Martin> Whether or not to use the codec would be the application's Martin> choice. 
What I think should be provided is a stateful object encapsulating the codec. Ie, to avoid the need to write out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Tue Apr 5 15:04:34 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 5 15:04:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <425269CD.3090009@egenix.com> (M.'s message of "Tue, 05 Apr 2005 12:34:53 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <425269CD.3090009@egenix.com> Message-ID: <87zmwdcr31.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>>>"MAL" == M writes: MAL> Stephen J. Turnbull wrote: >> The Japanese "memopado" (Notepad) uses UTF-8 signatures; it >> even adds them to existing UTF-8 files lacking them. MAL> Is that a MS application ? AFAIK, notepad, wordpad and MS MAL> Office always use UTF-16-LE + BOM when saving text as "Unicode MAL> text". Yes, it is an MS application. I'll have to borrow somebody's box to check, but IIRC UTF-8 is the native "text" encoding for Japanese now. (Japanized applications generally behave differently from everything else, as there are so many "standards" for encoding Japanese.) M> The UTF-16 stream codecs implement this logic. M> The UTF-16 encode and decode functions will however always M> strip the BOM mark from the beginning of a string. M> If the application doesn't want this stripping to happen, it M> should use the UTF-16-LE or -BE codec resp. That sounds like it would work fine almost all the time. If it doesn't it's straightforward to work around, and certainly would be more convenient for the non-standards-geek programmer. 
-- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From skip at pobox.com Tue Apr 5 15:57:16 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Apr 5 15:57:19 2005 Subject: [Python-Dev] Mail.python.org In-Reply-To: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net> References: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net> Message-ID: <16978.39228.310785.460397@montanaro.dyndns.org> Grant> Not a big deal, but I noticed that https://mail.python.org/ is Grant> live and shows a generic "Welcome to your new home in Grant> cyberspace!" message. One of the webmasters may want to Grant> automatically redirect to http://mail.python.org. Thanks, I forwarded this along to the folks who can deal with this. Skip From martin at v.loewis.de Tue Apr 5 20:44:47 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 20:44:49 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4252DC9F.50000@v.loewis.de> Stephen J. Turnbull wrote: > Martin> With the UTF-8-SIG codec, it would apply to all operation > Martin> modes of the codec, whether stream-based or from strings. > > I had in mind the ability to treat a string as a stream. Hmm. A string is not a stream, but it could be the contents of a stream. A typical application of codecs goes like this: data = stream.read() [analyze data, e.g. by checking whether there is encoding= in an <?xml ...?> declaration] So people do use the "decode-it-all" mode, where no sequential access is necessary - yet the beginning of the string is still the beginning of what once was a stream. This case must be supported. > Martin> Whether or not to use the codec would be the application's > Martin> choice.
> > What I think should be provided is a stateful object encapsulating the > codec. Ie, to avoid the need to write > > out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") No. People who want streaming should use cStringIO, i.e. >>> s=cStringIO.StringIO() >>> s1=codecs.getwriter("utf-8")(s) >>> s1.write(u"Hallo") >>> s.getvalue() 'Hallo' Regards, Martin From walter at livinglogic.de Tue Apr 5 21:33:00 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 21:33:04 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <425268EA.7070703@livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> Message-ID: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Walter D?rwald sagte: > M.-A. Lemburg wrote: > >>> [...] >>>With the UTF-8-SIG codec, it would apply to all operation >>> modes of the codec, whether stream-based or from strings. Whether >>>or not to use the codec would be the application's choice. >> >> I'd suggest to use the same mode of operation as we have in >> the UTF-16 codec: it removes the BOM mark on the first call >> to the StreamReader .decode() method and writes a BOM mark >> on the first call to .encode() on a StreamWriter. >> >> Note that the UTF-16 codec is strict w/r to the presence >> of the BOM mark: you get a UnicodeError if a stream does >> not start with a BOM mark. For the UTF-8-SIG codec, this >> should probably be relaxed to not require the BOM. > > I've started writing such a codec. Making the BOM optional > on decoding definitely simplifies the implementation. 
OK, here is the patch: http://www.python.org/sf/1177307 The stateful decoder has a little problem: At least three bytes have to be available from the stream until the StreamReader decides whether these bytes are a BOM that has to be skipped. This means that if the file only contains "ab", the user will never see these two characters. A solution for this would be to add an argument named final to the decode and read methods that tells the decoder that the stream has ended and the remaining buffered bytes have to be handled now. Bye, Walter Dörwald From ejones at uwaterloo.ca Tue Apr 5 21:53:05 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Tue Apr 5 21:52:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: On Apr 5, 2005, at 15:33, Walter Dörwald wrote: > The stateful decoder has a little problem: At least three bytes > have to be available from the stream until the StreamReader > decides whether these bytes are a BOM that has to be skipped. > This means that if the file only contains "ab", the user will > never see these two characters. Shouldn't the decoder be capable of doing a partial match and quitting early? After all, "ab" is encoded in UTF8 as <61> <62> but the BOM is <ef> <bb> <bf>. If it did this type of partial matching, this issue would be avoided except in rare situations. > A solution for this would be to add an argument named final to > the decode and read methods that tells the decoder that the > stream has ended and the remaining buffered bytes have to be > handled now. This functionality is provided by a flush() method on similar objects, such as the zlib compression objects.
Evan Jones From fdrake at acm.org Tue Apr 5 21:56:46 2005 From: fdrake at acm.org (Fred Drake) Date: Tue Apr 5 21:57:35 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <200504051556.46947.fdrake@acm.org> On Tuesday 05 April 2005 15:53, Evan Jones wrote: > This functionality is provided by a flush() method on similar objects, > such as the zlib compression objects. Or by close() on other objects (htmllib, HTMLParser, the SAX incremental parser, etc.). Too bad there's more than one way to do it. :-( -Fred -- Fred L. Drake, Jr. From martin at v.loewis.de Tue Apr 5 22:05:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 22:05:16 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <4252EF7A.8080804@v.loewis.de> Walter D?rwald wrote: > The stateful decoder has a little problem: At least three bytes > have to be available from the stream until the StreamReader > decides whether these bytes are a BOM that has to be skipped. > This means that if the file only contains "ab", the user will > never see these two characters. This can be improved, of course: If the first byte is "a", it most definitely is *not* an UTF-8 signature. So we only need a second byte for the characters between U+F000 and U+FFFF, and a third byte only for the characters U+FEC0...U+FEFF. 
But with the first byte being \xef, we need three bytes *anyway*, so we can always decide with the first byte only whether we need to wait for three bytes. > A solution for this would be to add an argument named final to > the decode and read methods that tells the decoder that the > stream has ended and the remaining buffered bytes have to be > handled now. Shouldn't an empty read from the underlying stream be taken as an EOF? Regards, Martin From tim.peters at gmail.com Tue Apr 5 22:11:14 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 5 22:11:17 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2mk6njdh9z.fsf@starship.python.net> References: <2mk6njdh9z.fsf@starship.python.net> Message-ID: <1f7befae05040513113c825c92@mail.gmail.com> [Michael Hudson] > Asking mostly for curiousity, how hard would it be to have longs store > their sign bit somewhere less aggravating? Depends on where that is. > It seems to me that the top bit of ob_digit[0] is always 0, for example, Yes, the top bit of ob_digit[i], for all relevant i, is 0 on all platforms now. > and I'm sure this would result no less convolution in longobject.c it'd be > considerably more localized convolution. I'd much rather give struct _longobject a distinct sign member (say, 0 == zero, -1 = non-zero negative, 1 == non-zero positive). That would simplify code. It would cost no extra bytes for some longs, and 8 extra bytes for others (since obmalloc rounds up to a multiple of 8); I don't care about that (e.g., I never use millions of longs simultaneously, but often use a few dozen very big longs simultaneously; the memory difference is in the noise then). Note that longintrepr.h isn't included by Python.h. Only longobject.h is, and longobject.h doesn't reveal the internal structure of longs. IOW, changing the internal layout of longs shouldn't even hurt binary compatibility. The ob_size member of PyObject_VAR_HEAD would also be redeclared as size_t in an ideal world. 
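The layout Tim argues for — digits that never use their top bit, plus a separate sign field — can be mimicked in pure Python. A toy sketch only, not CPython's actual code; the 15-bit digit size follows the longintrepr.h of that era, and the (sign, digits) names are invented here for illustration:

```python
# Sign-magnitude representation: digits are unsigned, least-significant
# first, and the sign lives in its own field (0 zero, -1 neg, 1 pos)
# instead of being folded into ob_size.
SHIFT = 15            # CPython longs of this era used 15-bit digits
BASE = 1 << SHIFT

def to_long(n):
    """Return (sign, digits) with least-significant digit first."""
    sign = (n > 0) - (n < 0)
    digits = []
    n = abs(n)
    while n:
        digits.append(n % BASE)   # top bit of each digit stays 0
        n //= BASE
    return sign, digits

def from_long(sign, digits):
    magnitude = sum(d << (SHIFT * i) for i, d in enumerate(digits))
    return sign * magnitude

assert from_long(*to_long(-123456789)) == -123456789
assert to_long(0) == (0, [])   # zero needs no digits at all
```

With the sign kept separately, negation and comparison touch only one field, which is the simplification Tim describes.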
From walter at livinglogic.de Tue Apr 5 22:37:24 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 22:37:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252EF7A.8080804@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> Message-ID: <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> Martin v. L?wis sagte: > Walter D?rwald wrote: >> The stateful decoder has a little problem: At least three bytes >> have to be available from the stream until the StreamReader >> decides whether these bytes are a BOM that has to be skipped. >> This means that if the file only contains "ab", the user will >> never see these two characters. > > This can be improved, of course: If the first byte is "a", > it most definitely is *not* an UTF-8 signature. > > So we only need a second byte for the characters between U+F000 > and U+FFFF, and a third byte only for the characters > U+FEC0...U+FEFF. But with the first byte being \xef, we need > three bytes *anyway*, so we can always decide with the first > byte only whether we need to wait for three bytes. OK, I've updated the patch so that the first bytes will only be kept in the buffer if they are a prefix of the BOM. >> A solution for this would be to add an argument named final to >> the decode and read methods that tells the decoder that the >> stream has ended and the remaining buffered bytes have to be >> handled now. > > Shouldn't an empty read from the underlying stream be taken > as an EOF? There are situations where the byte stream might be temporarily exhausted, e.g. 
an XML parser that tries to support the IncrementalParser interface, or when you want to decode encoded data piecewise, because you want to give a progress report. Bye, Walter D?rwald From walter at livinglogic.de Tue Apr 5 22:43:03 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 22:43:06 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <1811.84.56.104.122.1112733783.squirrel@isar.livinglogic.de> Evan Jones sagte: > On Apr 5, 2005, at 15:33, Walter D?rwald wrote: >> The stateful decoder has a little problem: At least three bytes >> have to be available from the stream until the StreamReader >> decides whether these bytes are a BOM that has to be skipped. >> This means that if the file only contains "ab", the user will >> never see these two characters. > > Shouldn't the decoder be capable of doing a partial match and quitting early? After all, "ab" is encoded in UTF8 as <61> > <62> but the BOM is . If it did this type of partial matching, this issue would be avoided except in rare > situations. > >> A solution for this would be to add an argument named final to >> the decode and read methods that tells the decoder that the >> stream has ended and the remaining buffered bytes have to be >> handled now. > > This functionality is provided by a flush() method on similar objects, such as the zlib compression objects. Theoretically the name is unimportant, but read(..., final=True) or flush() or close() should subject the pending bytes to normal error handling and must return the result of decoding these pending bytes just like the other methods do. 
This would mean that we would have to implement a decodeclose(), a readclose() and a readlineclose(). IMHO it would be best to add this argument to decode, read and readline directly. But I'm not sure what this would mean for iterating through a StreamReader. Bye, Walter Dörwald From martin at v.loewis.de Tue Apr 5 22:52:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 22:52:18 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> Message-ID: <4252FA7E.3090206@v.loewis.de> Walter Dörwald wrote: > There are situations where the byte stream might be temporarily > exhausted, e.g. an XML parser that tries to support the > IncrementalParser interface, or when you want to decode > encoded data piecewise, because you want to give a progress > report. Yes, but these are not file-like objects. In the IncrementalParser, it is *not* the case that a read operation returns an empty string. Instead, the application repeatedly feeds data explicitly. For a file-like object, returning "" indicates EOF. Regards, Martin From raymond.hettinger at verizon.net Tue Apr 5 12:47:07 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed Apr 6 00:47:20 2005 Subject: [Python-Dev] Developer list update Message-ID: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> FYI, I'm starting a project to see what has become of some of the inactive developers. Essentially, it involves sending them a note to see if they still have use for their checkin permissions.
If not, then we can make the change and improve security a bit. Also, to help with institutional memory, I started a log of changes to developer permissions. The goal is to remember who was given access, by whom, and why (some folks are given access for a one-shot project for example). The file is at Misc/developers. The first entry is for Nick Coghlan who was just granted tracker permissions so he can help manage outstanding bugs and patches. Raymond Hettinger From fdrake at acm.org Wed Apr 6 01:06:34 2005 From: fdrake at acm.org (Fred Drake) Date: Wed Apr 6 01:07:13 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> Message-ID: <200504051906.34590.fdrake@acm.org> On Tuesday 05 April 2005 06:47, Raymond Hettinger wrote: > Also, to help with institutional memory, I started a log of changes to > developer permissions. The goal is to remember who was given access, by > whom, and why (some folks are given access for a one-shot project for > example). The file is at Misc/developers. Thanks, Raymond! Would anyone here object to renaming the file to developers.txt, though? -Fred -- Fred L. Drake, Jr. From barry at python.org Wed Apr 6 01:20:36 2005 From: barry at python.org (Barry Warsaw) Date: Wed Apr 6 01:20:41 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504051906.34590.fdrake@acm.org> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> <200504051906.34590.fdrake@acm.org> Message-ID: <1112743236.18820.178.camel@geddy.wooz.org> On Tue, 2005-04-05 at 19:06, Fred Drake wrote: > Would anyone here object to renaming the file to developers.txt, though? +1, please! -Barry 
From stephen at xemacs.org Wed Apr 6 02:32:01 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed Apr 6 02:32:07 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252DC9F.50000@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 05 Apr 2005 20:44:47 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> Message-ID: <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> So people do use the "decode-it-all" mode, where no Martin> sequential access is necessary - yet the beginning of the Martin> string is still the beginning of what once was a Martin> stream. This case must be supported. Of course it must be supported. My point is that many strings (in my applications, all but those strings that result from slurping in a file or process output in one go -- example, not a statistically valid sample!) are not the beginning of "what once was a stream". It is error-prone (not to mention unaesthetic) to not make that distinction. "Explicit is better than implicit." Martin> Whether or not to use the codec would be the application's Martin> choice. >> What I think should be provided is a stateful object >> encapsulating the codec. Ie, to avoid the need to write >> out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") Martin> No. People who want streaming should use cStringIO, i.e. >>> s=cStringIO.StringIO() >>> s1=codecs.getwriter("utf-8")(s) >>> s1.write(u"Hallo") >>> s.getvalue() 'Hallo' Yes! 
Exactly (except in reverse, we want to _read_ from the slurped stream-as-string, not write to one)! ... and there's no need for a utf-8-sig codec for strings, since you can support the usage in exactly this way. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From tim.peters at gmail.com Wed Apr 6 03:00:43 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Apr 6 03:00:47 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1112743236.18820.178.camel@geddy.wooz.org> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> <200504051906.34590.fdrake@acm.org> <1112743236.18820.178.camel@geddy.wooz.org> Message-ID: <1f7befae0504051800452bcca7@mail.gmail.com> [Fred Drake] >> Would anyone here object to renaming the file to developers.txt, though? [Barry Warsaw] > +1, please! I voted with my DOS box. From alex.nanou at gmail.com Wed Apr 6 03:29:44 2005 From: alex.nanou at gmail.com (Alex A. Naanou) Date: Wed Apr 6 03:29:47 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... Message-ID: <36f889220504051829266cea1e@mail.gmail.com> Hi! here is a simple piece of code
---cut---
class Dict(dict):
    def __init__(self, dct={}):
        self._dict = dct
    def __getitem__(self, name):
        return self._dict[name]
    def __setitem__(self, name, value):
        self._dict[name] = value
    def __delitem__(self, name):
        del self._dict[name]
    def __contains__(self, name):
        return name in self._dict
    def __iter__(self):
        return iter(self._dict)

class A(object):
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        o.__dict__ = Dict()
        return o

a = A()
a.xxx = 123
print a.__dict__._dict
a.__dict__._dict['yyy'] = 321
print a.yyy

--uncut--
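[Editor's sketch, not part of the original message: a Python 3 rendition of the snippet above (with the `_dct`/`_dict` typo resolved) makes the reported inconsistency concrete. Instance attribute access goes through the concrete dict C API on `__dict__`, so the overridden mapping methods are silently bypassed. The class names mirror Alex's example; the demonstration itself is hypothetical.]

```python
# Dict overrides the mapping protocol but redirects storage to self._dict.
class Dict(dict):
    def __init__(self, dct=None):
        self._dict = {} if dct is None else dct

    def __getitem__(self, name):
        return self._dict[name]

    def __setitem__(self, name, value):
        self._dict[name] = value


class A:
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        o.__dict__ = Dict()   # accepted: Dict is a dict subclass
        return o


a = A()
a.xxx = 123                                # does NOT call Dict.__setitem__
print(a.__dict__._dict)                    # {}  -- the override never ran
print(dict.__getitem__(a.__dict__, "xxx")) # 123 -- stored in the base dict
a.__dict__._dict["yyy"] = 321
print(hasattr(a, "yyy"))                   # False -- lookup bypasses __getitem__
```

Running this shows the attribute machinery talking to the `dict` base storage directly, which is the `PyDict_*` fast path Brett describes in his reply.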
Here there are two problems, the first is minor, and it is that anything assigned to the __dict__ attribute is checked to be a descendant of the dict class (mixing this in does not seem to work)... and the second problem is a real annoyance, it is that the mapping protocol supported by the Dict object in the example above is not used by the attribute access mechanics (the same thing that once happened in exec)... P.S. (IMHO) the type check here is not that necessary (at least in its current state), as what we need to assert is not the relation to the dict class but the support of the mapping protocol.... thanks. -- Alex. From bac at OCF.Berkeley.EDU Wed Apr 6 04:46:07 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 6 04:46:18 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <36f889220504051829266cea1e@mail.gmail.com> References: <36f889220504051829266cea1e@mail.gmail.com> Message-ID: <42534D6F.40200@ocf.berkeley.edu> Alex A. Naanou wrote: > Hi! > > here is a simple piece of code >
> ---cut---
> class Dict(dict):
>     def __init__(self, dct={}):
>         self._dict = dct
>     def __getitem__(self, name):
>         return self._dict[name]
>     def __setitem__(self, name, value):
>         self._dict[name] = value
>     def __delitem__(self, name):
>         del self._dict[name]
>     def __contains__(self, name):
>         return name in self._dict
>     def __iter__(self):
>         return iter(self._dict)
> 
> class A(object):
>     def __new__(cls, *p, **n):
>         o = object.__new__(cls)
>         o.__dict__ = Dict()
>         return o
> 
> a = A()
> a.xxx = 123
> print a.__dict__._dict
> a.__dict__._dict['yyy'] = 321
> print a.yyy
> 
> --uncut--
> 
> > Here there are two problems, the first is minor, and it is that > anything assigned to the __dict__ attribute is checked to be a > descendant of the dict class (mixing this in does not seem to work)... > and the second problem is a real annoyance, it is that the mapping > protocol supported by the Dict object in the example above is not used > by the attribute access mechanics (the same thing that once happened > in exec)... > Actually, overriding __getattribute__() does work; __getattr__() and __getitem__() don't. This was brought up last month at some point without any resolution (I think Steve Bethard pointed it out). > P.S. (IMHO) the type check here is not that necessary (at least in its > current state), as what we need to assert is not the relation to the > dict class but the support of the mapping protocol.... > Semantically necessary, no. But simplicity- and performance-wise, maybe. If you grep around in Objects/classobject.c, for instance, you will see PyClassObject.cl_dict is accessed using PyDict_GetItem() and I spotted at least one use of PyDict_DelItem(). To use the mapping protocol would require changing all of these to PyObject_GetItem() and such, which would be a performance penalty compared to PyDict_GetItem(). So the question is whether the flexibility is worth it. -Brett From martin at v.loewis.de Wed Apr 6 08:06:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 6 08:06:12 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42537C50.8000608@v.loewis.de> Stephen J. Turnbull wrote: > Of course it must be supported. 
My point is that many strings (in my > applications, all but those strings that result from slurping in a > file or process output in one go -- example, not a statistically valid > sample!) are not the beginning of "what once was a stream". It is > error-prone (not to mention unaesthetic) to not make that distinction. > > "Explicit is better than implicit." I can't put these two paragraphs together. If you think that explicit is better than implicit, why do you not want to make different calls for the first chunk of a stream, and the subsequent chunks? > >>> s=cStringIO.StringIO() > >>> s1=codecs.getwriter("utf-8")(s) > >>> s1.write(u"Hallo") > >>> s.getvalue() > 'Hallo' > > Yes! Exactly (except in reverse, we want to _read_ from the slurped > stream-as-string, not write to one)! ... and there's no need for a > utf-8-sig codec for strings, since you can support the usage in > exactly this way. However, if there is an utf-8-sig codec for streams, there is currently no way of *preventing* this codec to also be available for strings. The very same code is used for streams and for strings, and automatically so. Regards, Martin From walter at livinglogic.de Wed Apr 6 10:32:59 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Apr 6 10:33:02 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252FA7E.3090206@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> <4252FA7E.3090206@v.loewis.de> Message-ID: <1520.84.56.99.39.1112776379.squirrel@isar.livinglogic.de> Martin v. 
Löwis sagte: > Walter Dörwald wrote: >> There are situations where the byte stream might be temporarily >> exhausted, e.g. an XML parser that tries to support the >> IncrementalParser interface, or when you want to decode >> encoded data piecewise, because you want to give a progress >> report. > > Yes, but these are not file-like objects. True, on the outside there are no file-like objects. But the IncrementalParser gets passed the XML bytes in chunks, so it has to use a stateful decoder for decoding. Unfortunately this means that it has to use a stream API. (See http://www.python.org/sf/1101097 for a patch that somewhat fixes that.) (Another option would be to completely ignore the stateful API and handcraft stateful decoding (or only support stateless decoding), like most XML parsers for Python do now.) > In the IncrementalParser, > it is *not* the case that a read operation returns an empty > string. Instead, the application repeatedly feeds data explicitly. That's true, but the parser has to wrap this data into an object that can be passed to the StreamReader constructor. (See the Queue class in Lib/test/test_codecs.py for an example.) > For a file-like object, returning "" indicates EOF. Not necessarily. In the example above the IncrementalParser gets fed a chunk of data, it stuffs this data into the Queue, so that the StreamReader can decode it. Once the data from the Queue is exhausted, there won't be any further data until the user calls feed() on the IncrementalParser again. Bye, Walter Dörwald From stephen at xemacs.org Wed Apr 6 11:31:21 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed Apr 6 11:31:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42537C50.8000608@v.loewis.de> (Martin v. 
=?iso-8859-1?q?L=F6wis's?= message of "Wed, 06 Apr 2005 08:06:08 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> Message-ID: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> I can't put these two paragraphs together. If you think Martin> that explicit is better than implicit, why do you not want Martin> to make different calls for the first chunk of a stream, Martin> and the subsequent chunks? Because the signature/BOM is not a chunk, it's a header. Handling the signature/BOM is part of stream initialization, not translation, to my mind. The point is that explicitly using a stream shows that initialization (and finalization) matter. The default can be BOM or not, as a pragmatic matter. But then the stream data itself can be treated homogeneously, as implied by the notion of stream. I think it probably also would solve Walter's conundrum about buffering the signature/BOM if responsibility for that were moved out of the codecs and into the objects where signatures make sense. I don't know whether that's really feasible in the short run---I suspect there may be a lot of stream-like modules that would need to be updated---but it would be saner in the long run. >> Yes! Exactly (except in reverse, we want to _read_ from the >> slurped stream-as-string, not write to one)! ... and there's >> no need for a utf-8-sig codec for strings, since you can >> support the usage in exactly this way. Martin> However, if there is an utf-8-sig codec for streams, there Martin> is currently no way of *preventing* this codec to also be Martin> available for strings. The very same code is used for Martin> streams and for strings, and automatically so. 
And of course it should be. But if it's not possible to move the -sig facility out of the codecs into the streams, that would be a shame. I think we should encourage people to use streams where initialization or finalization semantics are non-trivial, as they are with signatures. But as long as both utf-8-we-dont-need-no-steenkin-sigs-in-strings and utf-8-sig are available, I can program as I want to (and refer those whose strings get cratered by stray BOMs to you). -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From walter at livinglogic.de Wed Apr 6 13:48:48 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Apr 6 13:48:51 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4253CCA0.9020008@livinglogic.de> Stephen J. Turnbull wrote: >>>>> "Martin" == Martin v Löwis writes: > > Martin> I can't put these two paragraphs together. If you think > Martin> that explicit is better than implicit, why do you not want > Martin> to make different calls for the first chunk of a stream, > Martin> and the subsequent chunks? > > Because the signature/BOM is not a chunk, it's a header. Handling the > signature/BOM is part of stream initialization, not translation, to my > mind. > > The point is that explicitly using a stream shows that initialization
The default can be BOM or not, as a > pragmatic matter. But then the stream data itself can be treated > homogeneously, as implied by the notion of stream. > > I think it probably also would solve Walter's conundrum about > buffering the signature/BOM if responsibility for that were moved out > of the codecs and into the objects where signatures make sense. Not really. In every encoding where a sequence of more than one byte maps to one Unicode character, you will always need some kind of buffering. If we remove the handling of initial BOMs from the codecs (except for UTF-16 where it is required), this wouldn't change any buffering requirements. > I don't know whether that's really feasible in the short run---I > suspect there may be a lot of stream-like modules that would need to > be updated---but it would be saner in the long run. I'm not exactly sure what you're proposing here. That all codecs (even UTF-16) pass the BOM through and some other infrastructure is responsible for dropping it? > [...] Bye, Walter Dörwald From mwh at python.net Wed Apr 6 11:37:22 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 6 14:02:12 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <1f7befae05040513113c825c92@mail.gmail.com> (Tim Peters's message of "Tue, 5 Apr 2005 16:11:14 -0400") References: <2mk6njdh9z.fsf@starship.python.net> <1f7befae05040513113c825c92@mail.gmail.com> Message-ID: <2m4qek9rfx.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] >> Asking mostly for curiousity, how hard would it be to have longs store >> their sign bit somewhere less aggravating? > > Depends on where that is. > >> It seems to me that the top bit of ob_digit[0] is always 0, for example, > > Yes, the top bit of ob_digit[i], for all relevant i, is 0 on all > platforms now. > >> and I'm sure this would result in no less convolution in longobject.c; it'd be >> considerably more localized convolution. 
> > I'd much rather give struct _longobject a distinct sign member (say, 0 > == zero, -1 = non-zero negative, 1 == non-zero positive). Well, that would indeed be simpler. > That would simplify code. It would cost no extra bytes for some > longs, and 8 extra bytes for others (since obmalloc rounds up to a > multiple of 8); I don't care about that (e.g., I never use millions > of longs simultaneously, but often use a few dozen very big longs > simultaneously; the memory difference is in the noise then). > > Note that longintrepr.h isn't included by Python.h. Only longobject.h > is, and longobject.h doesn't reveal the internal structure of longs. > IOW, changing the internal layout of longs shouldn't even hurt binary > compatibility. Bonus. > The ob_size member of PyObject_VAR_HEAD would also be redeclared as > size_t in an ideal world. As nature intended. I might do a patch, at some point... Cheers, mwh -- Indeed, when I design my killer language, the identifiers "foo" and "bar" will be reserved words, never used, and not even mentioned in the reference manual. Any program using one will simply dump core without comment. Multitudes will rejoice. -- Tim Peters, 29 Apr 1998 From tim.peters at gmail.com Wed Apr 6 14:50:46 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Apr 6 14:50:50 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules mathmodule.c, 2.74, 2.75 In-Reply-To: References: Message-ID: <1f7befae050406055026a1e00c@mail.gmail.com> [mwh@users.sourceforge.net] > Modified Files: > mathmodule.c > Log Message: > Add a comment explaining the import of longintrepr.h. > > Index: mathmodule.c ... > #include "Python.h" > -#include "longintrepr.h" > +#include "longintrepr.h" // just for SHIFT The intent is fine, but please use a standard C (not C++) comment. That is, /*...*/, not //. 
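[Editor's sketch, not part of the original message: the explicit-sign representation Tim proposes can be modelled in pure Python as a sign member in {-1, 0, 1} plus unsigned base-2**SHIFT magnitude digits. SHIFT matches the digit size in longintrepr.h of the era; the function names here are hypothetical.]

```python
SHIFT = 15          # bits per digit in CPython's longs at the time
BASE = 1 << SHIFT

def to_sign_digits(n):
    """Split an int into (sign, magnitude digits), least significant first."""
    sign = (n > 0) - (n < 0)      # 0 == zero, -1 == negative, 1 == positive
    n = abs(n)
    digits = []
    while n:
        digits.append(n & (BASE - 1))   # low SHIFT bits become one digit
        n >>= SHIFT
    return sign, digits

def from_sign_digits(sign, digits):
    """Inverse: rebuild the int from the explicit-sign representation."""
    n = 0
    for d in reversed(digits):
        n = (n << SHIFT) | d
    return sign * n

print(to_sign_digits(-70000))   # (-1, [4464, 2])
```

Because the digits carry only the magnitude, nothing about the sign has to be smuggled into ob_size, which is what would let it become an unsigned size_t as suggested above.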
From mwh at python.net Wed Apr 6 15:57:24 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 6 15:57:26 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2m4qek9rfx.fsf@starship.python.net> (Michael Hudson's message of "Wed, 06 Apr 2005 10:37:22 +0100") References: <2mk6njdh9z.fsf@starship.python.net> <1f7befae05040513113c825c92@mail.gmail.com> <2m4qek9rfx.fsf@starship.python.net> Message-ID: <2mwtrg80u3.fsf@starship.python.net> Michael Hudson writes: > Tim Peters writes: > >> [Michael Hudson] >>> Asking mostly for curiousity, how hard would it be to have longs store >>> their sign bit somewhere less aggravating? >> >> Depends on where that is. [...] >> I'd much rather give struct _longobject a distinct sign member (say, 0 >> == zero, -1 = non-zero negative, 1 == non-zero positive). I ended up doing -1 non-zero negative, 1 zero and positive, but I don't know if this is really clearer than what you suggest overall. I suspect it's a wash. [...] > I might do a patch, at some point... http://python.org/sf/1177779 Assigned to you, but unassign if you don't have time (testing the patch is probably more worthwhile than reading it!). Cheers, mwh -- Linux: Horse. Like a wild horse, fun to ride. Also prone to throwing you and stamping you into the ground because it doesn't like your socks. -- Jim's pedigree of operating systems, asr From steven.bethard at gmail.com Wed Apr 6 16:43:44 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed Apr 6 16:43:46 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <42534D6F.40200@ocf.berkeley.edu> References: <36f889220504051829266cea1e@mail.gmail.com> <42534D6F.40200@ocf.berkeley.edu> Message-ID: On Apr 5, 2005 8:46 PM, Brett C. wrote: > Alex A. 
Naanou wrote: > > Here there are two problems, the first is minor, and it is that > > anything assigned to the __dict__ attribute is checked to be a > > descendant of the dict class (mixing this in does not seem to work)... > > and the second problem is a real annoyance, it is that the mapping > > protocol supported by the Dict object in the example above is not used > > by the attribute access mechanics (the same thing that once happened > > in exec)... > > Actually, overriding __getattribute__() does work; __getattr__() and > __getitem__() doesn't. This was brought up last month at some point without > any resolve (I think Steve Bethard pointed it out). Yeah, here's the link: http://mail.python.org/pipermail/python-dev/2005-March/051837.html I've pointed out three possible "solutions" there, but they all have some significant drawbacks. I took the complete silence on the topic as an indication that none of the options were acceptable. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From ncoghlan at gmail.com Wed Apr 6 13:31:44 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 6 17:17:20 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <36f889220504051829266cea1e@mail.gmail.com> References: <36f889220504051829266cea1e@mail.gmail.com> Message-ID: <4253C8A0.2050509@gmail.com> > P.S. (IMHO) the type check here is not that necessary (at least in its > current state), as what we need to assert is not the relation to the > dict class but the support of the mapping protocol.... The type-check is basically correct - as you have discovered, type & object use the PyDict_* API internally (for speed reasons, as I understand it), so supporting the mapping API is not really sufficient for something assigned to __dict__. 
Changing this for exec is one thing, as speed of access to the locals dict isn't likely to have a major impact on the overall performance of such code, but I would expect changing class dictionary access code in a similar way would have a major (detrimental) performance impact. Depending on the use case, it is possible to work around the problem by defining __dict__, __getattribute__, __setattr__ and __delattr__ in the class. Defining __dict__ sidesteps the type error, and defining the other three methods then lets you get around the fact that the standard C-level dict pointer is no longer being updated, as well as making sure the general mapping API is used, rather than the concrete PyDict_* API. This is kinda ugly, but it works as long as any C code using the class __dict__ goes via the attribute access machinery and doesn't try to get the dictionary automatically supplied by Python by digging directly into the type structure.

=====================
from UserDict import DictMixin

class Dict(DictMixin):
    def __init__(self, dct=None):
        if dct is None:
            dct = {}
        self._dict = dct
    def __getitem__(self, name):
        return self._dict[name]
    def __setitem__(self, name, value):
        self._dict[name] = value
    def __delitem__(self, name):
        del self._dict[name]
    def keys(self):
        return self._dict.keys()

class A(object):
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        super(A, o).__setattr__('__dict__', Dict())
        return o
    __dict__ = None
    def __getattr__(self, attr):
        try:
            return self.__dict__[attr]
        except KeyError:
            raise AttributeError("%s" % attr)
    def __setattr__(self, attr, value):
        if attr in self.__dict__ or not hasattr(self, attr):
            self.__dict__[attr] = value
        else:
            super(A, self).__setattr__(attr, value)
    def __delattr__(self, attr):
        if attr in self.__dict__:
            del self.__dict__[attr]
        else:
            super(A, self).__delattr__(attr)

Py> a = A()
Py> a.__dict__._dict
{}
Py> a.xxx = 123
Py> a.__dict__._dict
{'xxx': 123}
Py> a.__dict__._dict['yyy'] = 321
Py> a.yyy
321
Py> a.__dict__._dict
{'xxx': 123, 'yyy': 321}
Py> del a.xxx
Py> a.__dict__._dict
{'yyy': 321}
Py> del a.xxx
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 21, in __delattr__
AttributeError: xxx
Py> a.__dict__ = {}
Py> a.yyy
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 11, in __getattr__
AttributeError: yyy

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From martin at v.loewis.de Wed Apr 6 22:22:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 6 22:22:22 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <425444FB.6040407@v.loewis.de> Stephen J. Turnbull wrote: > Because the signature/BOM is not a chunk, it's a header. Handling the > signature/BOM is part of stream initialization, not translation, to my > mind. I'm sorry, but I'm losing track as to what precisely you are trying to say. You seem to be using a mental model that is entirely different from mine. > The point is that explicitly using a stream shows that initialization > (and finalization) matter. The default can be BOM or not, as a > pragmatic matter. But then the stream data itself can be treated > homogeneously, as implied by the notion of stream. But what follows from that point? So it shows some kind of matter... what does that mean for actual changes to Python API? 
> I think it probably also would solve Walter's conundrum about > buffering the signature/BOM if responsibility for that were moved out > of the codecs and into the objects where signatures make sense. > > I don't know whether that's really feasible in the short run---I > suspect there may be a lot of stream-like modules that would need to > be updated---but it would be saner in the long run. What is "that" which might be really feasible? To "solve Walter's conundrum"? That "signatures make sense"? So I can't really respond to your message in a meaningful way; I just let it rest... Regards, Martin From kbk at shore.net Thu Apr 7 04:12:08 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu Apr 7 04:12:23 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504070212.j372C8Nw030750@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches : 308 open (+11) / 2819 closed ( +7) / 3127 total (+18)
Bugs    : 882 open (+11) / 4913 closed (+13) / 5795 total (+24)
RFE     : 176 open ( +1) / 151 closed ( +1) / 327 total ( +2)

New / Reopened Patches
______________________

improvement of the script adaptation for the win32 platform (2005-03-30) http://python.org/sf/1173134 opened by Vivian De Smedt
unicodedata docstrings (2005-03-30) CLOSED http://python.org/sf/1173245 opened by Jeremy Yallop
__slots__ for subclasses of variable length types (2005-03-30) http://python.org/sf/1173475 opened by Michael Hudson
Python crashes in pyexpat.c if malformed XML is parsed (2005-03-31) http://python.org/sf/1173998 opened by pdecat
hierarchical regular expression (2005-04-01) CLOSED http://python.org/sf/1174589 opened by Chris Ottrey
site enhancements (2005-04-01) http://python.org/sf/1174614 opened by Bob Ippolito
Export more libreadline API functions (2005-04-01) http://python.org/sf/1175004 opened by Bruce Edge
Export more libreadline API functions (2005-04-01) CLOSED http://python.org/sf/1175048 opened by Bruce Edge
Patch for whitespace enforcement (2005-04-01) CLOSED http://python.org/sf/1175070 opened by Guido van Rossum
Allow weak referencing of classic classes (2005-04-03) http://python.org/sf/1175850 opened by Greg Chapman
threading.Condition.wait() return value indicates timeout (2005-04-03) http://python.org/sf/1175933 opened by Martin Blais
Make subprocess.Popen support file-like objects (win) (2005-04-03) http://python.org/sf/1175984 opened by Nicolas Fleury
Implemented new 'class foo():pass' syntax (2005-04-03) http://python.org/sf/1176019 opened by logistix
locale._build_localename treatment for utf8 (2005-04-05) http://python.org/sf/1176504 opened by Hye-Shik Chang
Clarify unicode.(en|de)code.() docstrings (2005-04-04) CLOSED http://python.org/sf/1176578 opened by Brett Cannon
UTF-8-Sig codec (2005-04-05) http://python.org/sf/1177307 opened by Walter Dörwald
Complex commented (2005-04-06) http://python.org/sf/1177597 opened by engelbert gruber
explicit sign variable for longs (2005-04-06) http://python.org/sf/1177779 opened by Michael Hudson

Patches Closed
______________

unicodedata docstrings (2005-03-30) http://python.org/sf/1173245 closed by perky
hierarchical regular expression (2005-04-01) http://python.org/sf/1174589 closed by loewis
Export more libreadline API functions (2005-04-01) http://python.org/sf/1175048 closed by loewis
Patch for whitespace enforcement (2005-04-01) http://python.org/sf/1175070 closed by gvanrossum
ast for decorators (2005-03-21) http://python.org/sf/1167709 closed by nascheme
[ast branch] unicode literal fixes (2005-03-25) http://python.org/sf/1170272 closed by nascheme
Clarify unicode.(en|de)code.() docstrings (2005-04-04) http://python.org/sf/1176578 closed by bcannon

New / Reopened Bugs
___________________

very minor doc bug in 'listsort.txt' (2005-03-30) CLOSED http://python.org/sf/1173407 opened by gyrof
quit should quit (2005-03-30) CLOSED http://python.org/sf/1173637 opened by Matt Chaput
multiple broken links in profiler docs (2005-03-30) http://python.org/sf/1173773 opened by Ilya Sandler
Reading /dev/zero causes SystemError (2005-04-01) http://python.org/sf/1174606 opened by Adam Olsen
subclassing ModuleType and another built-in type (2005-04-01) http://python.org/sf/1174712 opened by Armin Rigo
PYTHONPATH is not working (2005-04-01) CLOSED http://python.org/sf/1174795 opened by Alexander Belchenko
property example code error (2005-04-01) http://python.org/sf/1175022 opened by John Ridley
import statement likely to crash if module launches threads (2005-04-01) http://python.org/sf/1175194 opened by Jeff Stearns
python hangs if import statement launches threads (2005-04-01) CLOSED http://python.org/sf/1175202 opened by Jeff Stearns
codecs.readline sometimes removes newline chars (2005-04-02) CLOSED http://python.org/sf/1175396 opened by Irmen de Jong
poorly named variable in urllib2.py (2005-04-03) http://python.org/sf/1175848 opened by Roy Smith
StringIO and cStringIO don't provide 'name' attribute (2005-04-03) http://python.org/sf/1175967 opened by logistix
compiler module didn't get updated for "class foo():pass" (2005-04-03) http://python.org/sf/1176012 opened by logistix
Python garbage collector isn't detecting deadlocks (2005-04-04) CLOSED http://python.org/sf/1176467 opened by Nathan Marushak
Readline segfault (2005-04-05) http://python.org/sf/1176893 opened by Walter Dörwald
[PyPI] Password reset problem. (2005-04-05) CLOSED http://python.org/sf/1177077 opened by Darek Suchojad
random.py/os.urandom robustness (2005-04-05) http://python.org/sf/1177468 opened by Fazal Majid
error locale.getlocale() with LANGUAGE=eu_ES (2005-04-06) CLOSED http://python.org/sf/1177674 opened by Zunbeltz Izaola
Exec Inside A Function (2005-04-06) http://python.org/sf/1177811 opened by Andrew Wilkinson
(?(id)yes|no) only works when referencing the first group (2005-04-06) http://python.org/sf/1177831 opened by André Malo
Iterator on Fileobject gives no MemoryError (2005-04-06) http://python.org/sf/1177964 opened by Folke Lemaitre
cgitb.py support for frozen images (2005-04-06) http://python.org/sf/1178136 opened by Barry Alan Scott
urllib.py overwrite HTTPError code with 200 (2005-04-06) http://python.org/sf/1178141 opened by Barry Alan Scott
urllib2.py assumes 206 is an error (2005-04-06) http://python.org/sf/1178145 opened by Barry Alan Scott
cgitb.py report wrong line number (2005-04-07) http://python.org/sf/1178148 opened by Barry Alan Scott

Bugs Closed
___________

The readline module can cause python to segfault (2005-03-19) http://python.org/sf/1166660 closed by mwh
very minor doc bug in 'listsort.txt' (2005-03-30) http://python.org/sf/1173407 closed by rhettinger
Property access with decorator makes interpreter crash (2005-03-17) http://python.org/sf/1165306 closed by mwh
"cmp" should be "key" in sort doc (2005-03-29) http://python.org/sf/1172581 closed by rhettinger
why should URL be required for all packages (2005-03-25) http://python.org/sf/1170424 closed by loewis
Possible windows+python bug (2005-03-22) http://python.org/sf/1168427 closed by holo9
quit should quit (2005-03-30) http://python.org/sf/1173637 closed by loewis
PYTHONPATH is not working (2005-04-01) http://python.org/sf/1174795 closed by bcannon
python hangs if import statement launches threads (2005-04-02) http://python.org/sf/1175202 closed by loewis
codecs.readline sometimes removes newline chars (2005-04-02) http://python.org/sf/1175396 closed by doerwalter
Python garbage collector isn't detecting deadlocks (2005-04-04) http://python.org/sf/1176467 closed by nascheme
[PyPI] Password reset problem. 
(2005-04-05) http://python.org/sf/1177077 closed by jafo Minor error in section 3.2 (2005-03-11) http://python.org/sf/1161595 closed by jyby error locale.getlocale() with LANGUAGE=eu_ES (2005-04-06) http://python.org/sf/1177674 closed by perky New / Reopened RFE __________________ add "reload" function (2005-04-03) http://python.org/sf/1175686 opened by paul rubin Add a settimeout to ftplib.FTP object (2005-04-06) http://python.org/sf/1177998 opened by Juan Antonio Vali?o Garc?a RFE Closed __________ file() on a file (2005-03-03) http://python.org/sf/1155485 closed by loewis From nbastin at opnet.com Thu Apr 7 05:09:24 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 05:09:47 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42526645.3010600@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> Message-ID: <2019f504df72a18fb04061248e3f55d8@opnet.com> On Apr 5, 2005, at 6:19 AM, M.-A. Lemburg wrote: > Note that the UTF-16 codec is strict w/r to the presence > of the BOM mark: you get a UnicodeError if a stream does > not start with a BOM mark. For the UTF-8-SIG codec, this > should probably be relaxed to not require the BOM. I've actually been confused about this point for quite some time now, but never had a chance to bring it up. I do not understand why UnicodeError should be raised if there is no BOM. I know that PEP-100 says: 'utf-16': 16-bit variable length encoding (little/big endian) and: Note: 'utf-16' should be implemented by using and requiring byte order marks (BOM) for file input/output. But this appears to be in error, at least in the current unicode standard. 'utf-16', as defined by the unicode standard, is big-endian in the absence of a BOM: --- 3.10.D42: UTF-16 encoding scheme: ... * The UTF-16 encoding scheme may or may not begin with a BOM. 
However, when there is no BOM, and in the absence of a higher-level protocol, the byte order of the UTF-16 encoding scheme is big-endian. --- The current implementation of the utf-16 codecs makes for some irritating gymnastics to write the BOM into the file before reading it if it contains no BOM, which seems quite like a bug in the codec. I allow for the possibility that this was ambiguous in the standard when the PEP was written, but it is certainly not ambiguous now. -- Nick From stephen at xemacs.org Thu Apr 7 06:20:53 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu Apr 7 06:21:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4253CCA0.9020008@livinglogic.de> (Walter =?iso-8859-1?q?D=F6rwald's?= message of "Wed, 06 Apr 2005 13:48:48 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> <4253CCA0.9020008@livinglogic.de> Message-ID: <87br8r9pzu.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Walter" == Walter D?rwald writes: Walter> Not really. In every encoding where a sequence of more Walter> than one byte maps to one Unicode character, you will Walter> always need some kind of buffering. If we remove the Walter> handling of initial BOMs from the codecs (except for Walter> UTF-16 where it is required), this wouldn't change any Walter> buffering requirements. Sure. My point is that codecs should be stateful only to the extent needed to assemble semantically meaningful units (ie, multioctet coded characters). In particular, they should not need to know about location at the beginning, middle, or end of some stream---because in the context of operating on a string they _can't_. 
>> I don't know whether that's really feasible in the short
>> run---I suspect there may be a lot of stream-like modules that
>> would need to be updated---but it would be a saner in the long
>> run.

Walter> I'm not exactly sure, what you're proposing here. That all
Walter> codecs (even UTF-16) pass the BOM through and some other
Walter> infrastructure is responsible for dropping it?

Not exactly. I think that at the lowest level codecs should not implement complex mode-switching internally, but rather explicitly abdicate responsibility to a more appropriate codec. For example, autodetecting UTF-16 on input would be implemented by a Python program that does something like

    data = stream.read()
    for detector in ["utf-16-signature", "utf-16-statistical"]:
        # for the UTF-16 detectors, OUT will always be u"" or None
        out, data, codec = data.decode(detector)
        if codec:
            break
    while codec:
        more_out, data, codec = data.decode(codec)
        out = out + more_out
    if data:
        # a real program would complain about it
        pass
    process(out)

where decode("utf-16-signature") would be implemented

    def utf_16_signature_internal(data):
        if data[0:2] == "\xfe\xff":
            return (u"", data[2:], "utf-16-be")
        elif data[0:2] == "\xff\xfe":
            return (u"", data[2:], "utf-16-le")
        else:
            # note: data is undisturbed if the detector fails
            return (None, data, None)

The main point is that the detector is just a codec that stops when it figures out what the next codec should be, touches only data that would be incorrect to pass to the next codec, and leaves the data alone if detection fails. utf-16-signature only handles the BOM (if present), and does not handle arbitrary "chunks" of data. Instead, it passes on the rest of the data (including the first chunk) to be handled by the appropriate utf-16-?e codec. I think that the temptation to encapsulate this logic in a utf-16 codec that "simplifies" things by calling the appropriate utf-16-?e codec itself should be deprecated, but YMMV.
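The detector sketch above can be made runnable as a plain function (the name `utf16_signature_detect` and the tuple-returning convention are taken from the pseudocode, not from any real codec API — `bytes.decode` does not actually return a 3-tuple, so the detector is written as an ordinary function):

```python
def utf16_signature_detect(data):
    """Sketch of the 'utf-16-signature' detector: consume only the BOM
    (if any) and name the codec that should handle the remaining bytes.
    Returns (decoded_prefix, remaining_bytes, next_codec); the prefix is
    '' on success and None when detection fails."""
    if data[:2] == b'\xfe\xff':
        return '', data[2:], 'utf-16-be'
    elif data[:2] == b'\xff\xfe':
        return '', data[2:], 'utf-16-le'
    # detection failed: the data is passed back undisturbed
    return None, data, None

out, rest, codec = utf16_signature_detect(b'\xff\xfeh\x00i\x00')
if codec:
    out += rest.decode(codec)   # out is now 'hi'
```

Written this way, the detector touches only the two BOM bytes and hands everything else — including the first chunk — to the concrete `utf-16-?e` codec, which is exactly the hand-off being argued for.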
What I would really like is for the above style to be easier to achieve than it currently is. BTW, I appreciate your patience in exploring this; after Martin's remark about different mental models I have to suspect this approach is just somehow un-Pythonic, but fleshing it out this way I can see how it will be useful in the context of a different project. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From anthony at interlink.com.au Thu Apr 7 09:27:02 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 7 09:27:21 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the =?iso-8859-1?q?re=09library?= In-Reply-To: <424F91B0.2050307@v.loewis.de> References: <20050402134150.7215.JCARLSON@uci.edu> <424F91B0.2050307@v.loewis.de> Message-ID: <200504071727.03601.anthony@interlink.com.au> On Sunday 03 April 2005 16:48, Martin v. L?wis wrote: > If this kind of functionality would fall on immediate rejection for > some reason, even writing the PEP might be pointless. Note that even if something is rejected, the PEP itself is useful - it collects knowledge in a format that's far more accessible than searching the mailing list archives. (note that I'm not talking about this particular case, but about PEPs in general - I have no opinion on the current proposal, because I'm not a heavy user of REs) -- Anthony Baxter It's never too late to have a happy childhood. From mal at egenix.com Thu Apr 7 11:07:58 2005 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu Apr 7 11:08:02 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <2019f504df72a18fb04061248e3f55d8@opnet.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> Message-ID: <4254F86E.4000203@egenix.com> Nicholas Bastin wrote:
> On Apr 5, 2005, at 6:19 AM, M.-A. Lemburg wrote:
>> Note that the UTF-16 codec is strict w/r to the presence
>> of the BOM mark: you get a UnicodeError if a stream does
>> not start with a BOM mark. For the UTF-8-SIG codec, this
>> should probably be relaxed to not require the BOM.
>
> I've actually been confused about this point for quite some time now,
> but never had a chance to bring it up. I do not understand why
> UnicodeError should be raised if there is no BOM. I know that PEP-100
> says:
>
> 'utf-16': 16-bit variable length encoding (little/big endian)
>
> and:
>
> Note: 'utf-16' should be implemented by using and requiring byte order
> marks (BOM) for file input/output.
>
> But this appears to be in error, at least in the current unicode
> standard. 'utf-16', as defined by the unicode standard, is big-endian
> in the absence of a BOM:
>
> ---
> 3.10.D42: UTF-16 encoding scheme:
> ...
> * The UTF-16 encoding scheme may or may not begin with a BOM. However,
> when there is no BOM, and in the absence of a higher-level protocol, the
> byte order of the UTF-16 encoding scheme is big-endian.
> ---

The problem is "in the absence of a higher level protocol": the codec doesn't know anything about a protocol - it's the application using the codec that knows which protocol gets used. It's a lot safer to require the BOM for UTF-16 streams and raise an exception to have the application decide whether to use UTF-16-BE or the by far more common UTF-16-LE.
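The risk Marc-Andre describes — garbage entering the application undetected — is easy to demonstrate: BOM-less bytes decode without error under either byte order, but only one result is the intended text (a minimal check using Python's standard UTF-16 codecs; modern codec spellings assumed):

```python
data = b'\x00h\x00i'  # BOM-less UTF-16 for 'hi', big-endian byte order

as_be = data.decode('utf-16-be')  # 'hi' -- the intended text
as_le = data.decode('utf-16-le')  # '\u6800\u6900' -- valid CJK codepoints,
                                  # complete garbage, and no exception raised
```

Because the wrong guess still yields perfectly valid code points, nothing downstream would flag the mistake — which is the argument for raising up front and making the application choose.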
Unlike for the UTF-8 codec, the BOM for UTF-16 is a configuration parameter, not merely a signature. In terms of history, I don't recall whether your quote was already in the standard at the time I wrote the PEP. You are the first to have reported a problem with the current implementation (which has been around since 2000), so I believe that application writers are more comfortable with the way the UTF-16 codec is currently implemented. Explicit is better than implicit :-) > The current implementation of the utf-16 codecs makes for some > irritating gymnastics to write the BOM into the file before reading it > if it contains no BOM, which seems quite like a bug in the codec. The codec writes a BOM in the first call to .write() - it doesn't write a BOM before reading from the file. > I allow for the possibility that this was ambiguous in the standard when > the PEP was written, but it is certainly not ambiguous now. See above. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Thu Apr 7 10:41:03 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 11:49:31 2005 Subject: [Python-Dev] threading (GilState) question Message-ID: <2mmzsb7zds.fsf@starship.python.net> I recently redid how the readline module handled threads around callbacks into Python (the previous code was insane). This resulted in the following bug report: http://www.python.org/sf/1176893 Which is correctly assigned to me as it's clearly a result of my recent checkin. However, I think my code is correct and the fault lies elsewhere. 
Basically, if you call PyGilState_Release before PyEval_InitThreads you crash, because PyEval_ReleaseThread gets called while interpreter_lock is NULL. This is very simple to make go away -- the problem is that there are several ways! Point the first is that I really think this is a bug in the GilState APIs: the readline API isn't inherently multi-threaded and so it would be insane to call PyEval_InitThreads() in initreadline, yet it has to cope with being called in a multithreaded situation. If you can't use the GilState APIs in this situation, what are they for? Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release(). Non-invasive, but bleh. Option 2) Call PyEval_SaveThread() instead of PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which is PyEval_SaveThread()s "mate") and I guess you can distill this long mail into the question "why doesn't PyGilState_Release do this already?" Option 3) Make PyEval_ReleaseThread() not crash when interpreter_lock == NULL. Easy, but it's actually documented that you can't do this. Opinions? Am I placing too much trust into PyGilState_Release()s existing choice of function? Cheers, mwh [1] The issue of having almost-but-not-quite identical variations of API functions -- here PyEval_AcquireThread/PyEval_ReleaseThread vs. PyEval_RestoreThread/PyEval_SaveThread -- is something I can rant about at length, if anyone is interested :) -- I located the link but haven't bothered to re-read the article, preferring to post nonsense to usenet before checking my facts. 
-- Ben Wolfson, comp.lang.python From ncoghlan at gmail.com Thu Apr 7 13:21:00 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 7 13:21:06 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mmzsb7zds.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> Message-ID: <4255179C.4090608@gmail.com> Michael Hudson wrote: > Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release(). > Non-invasive, but bleh. Tim rejected this option back when PyEval_ThreadsInitialized() was added to the API [1]. Gustavo was having a similar problem with pygtk, and the end result was to add the ThreadsInitialized API so that pygtk could make its own check without slowing down the default case in the core. > Option 2) Call PyEval_SaveThread() instead of > PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my > favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which > is PyEval_SaveThread()s "mate") and I guess you can distill this long > mail into the question "why doesn't PyGilState_Release do this > already?" See above. Although I'm now wondering about the opposite question: Why doesn't PyGilState_Ensure use PyEval_AcquireThread? Cheers, Nick. 
[1] http://sourceforge.net/tracker/?func=detail&aid=1044089&group_id=5470&atid=305470
[2] http://mail.python.org/pipermail/python-dev/2004-August/047870.html

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net

From mwh at python.net Thu Apr 7 14:27:16 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 14:27:19 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <4255179C.4090608@gmail.com> (Nick Coghlan's message of "Thu, 07 Apr 2005 21:21:00 +1000") References: <2mmzsb7zds.fsf@starship.python.net> <4255179C.4090608@gmail.com> Message-ID: <2mis2y93h7.fsf@starship.python.net> Nick Coghlan writes:

> Michael Hudson wrote:
>> Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release().
>> Non-invasive, but bleh.
>
> Tim rejected this option back when PyEval_ThreadsInitialized() was
> added to the API [1].

Well, not really. The patch that was rejected was much larger than any proposal of mine. My option 1) is this:

    --- pystate.c	09 Feb 2005 10:56:18 +0000	2.39
    +++ pystate.c	07 Apr 2005 13:19:55 +0100
    @@ -502,7 +502,8 @@
     		PyThread_delete_key_value(autoTLSkey);
     	}
     	/* Release the lock if necessary */
    -	else if (oldstate == PyGILState_UNLOCKED)
    -		PyEval_ReleaseThread(tcur);
    +	else if (oldstate == PyGILState_UNLOCKED
    +		 && PyEval_ThreadsInitialized())
    +		PyEval_ReleaseThread(tcur);
     }
     #endif /* WITH_THREAD */

> Gustavo was having a similar problem with pygtk, and the end result
> was to add the ThreadsInitialized API so that pygtk could make its
> own check without slowing down the default case in the core.

Well, Gustavo seemed to be complaining about the cost of the locking. I'm complaining about crashes.

>> Option 2) Call PyEval_SaveThread() instead of
>> PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my
>> favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which
>> is PyEval_SaveThread()s "mate") and I guess you can distill this long
>> mail into the question "why doesn't PyGilState_Release do this
>> already?"

This option corresponds to this patch:

    --- pystate.c	09 Feb 2005 10:56:18 +0000	2.39
    +++ pystate.c	07 Apr 2005 13:24:33 +0100
    @@ -503,6 +503,6 @@
     	}
     	/* Release the lock if necessary */
     	else if (oldstate == PyGILState_UNLOCKED)
    -		PyEval_ReleaseThread(tcur);
    +		PyEval_SaveThread();
     }
     #endif /* WITH_THREAD */

> See above. Although I'm now wondering about the opposite question: Why
> doesn't PyGilState_Ensure use PyEval_AcquireThread?

Well, that would make more sense than what we have now. OTOH, I'd *much* rather make the PyGilState functions more tolerant -- I thought being vaguely easy to use was part of their point. I fail to believe the patch associated with option 2) has any detectable performance cost.

Cheers, mwh

-- People think I'm a nice guy, and the fact is that I'm a scheming, conniving bastard who doesn't care for any hurt feelings or lost hours of work if it just results in what I consider to be a better system. -- Linus Torvalds

From nbastin at opnet.com Thu Apr 7 16:19:37 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 16:19:58 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4254F86E.4000203@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> Message-ID: <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> On Apr 7, 2005, at 5:07 AM, M.-A. Lemburg wrote:
>>> The current implementation of the utf-16 codecs makes for some
>>> irritating gymnastics to write the BOM into the file before reading it
>>> if it contains no BOM, which seems quite like a bug in the codec.
> > The codec writes a BOM in the first call to .write() - it > doesn't write a BOM before reading from the file. Yes, see, I read a *lot* of UTF-16 that comes from other sources. It's not a matter of writing with python and reading with python. -- Nick From tim.peters at gmail.com Thu Apr 7 17:21:39 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Apr 7 17:21:44 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mmzsb7zds.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> Message-ID: <1f7befae050407082140a591fd@mail.gmail.com> [Michael Hudson] > ... > Point the first is that I really think this is a bug in the GilState > APIs: the readline API isn't inherently multi-threaded and so it would > be insane to call PyEval_InitThreads() in initreadline, yet it has to > cope with being called in a multithreaded situation. If you can't use > the GilState APIs in this situation, what are they for? That's explained in the PEP -- of course : http://www.python.org/peps/pep-0311.html Under "Limitations and Exclusions" it specifically disowns responsibility for worrying about whether Py_Initialize() and PyEval_InitThreads() have been called: This API will not perform automatic initialization of Python, or initialize Python for multi-threaded operation. Extension authors must continue to call Py_Initialize(), and for multi-threaded applications, PyEval_InitThreads(). The reason for this is that the first thread to call PyEval_InitThreads() is nominated as the "main thread" by Python, and so forcing the extension author to specify the main thread (by forcing her to make this first call) removes ambiguity. As Py_Initialize() must be called before PyEval_InitThreads(), and as both of these functions currently support being called multiple times, the burden this places on extension authors is considered reasonable. 
That doesn't mean there isn't a clever way to get the same effect anyway, but I don't have time to think about it, and reassigned the bug report to Mark (who may or may not have time). From mal at egenix.com Thu Apr 7 17:35:38 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu Apr 7 17:35:41 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> Message-ID: <4255534A.2090505@egenix.com> Nicholas Bastin wrote:
> On Apr 7, 2005, at 5:07 AM, M.-A. Lemburg wrote:
>>> The current implementation of the utf-16 codecs makes for some
>>> irritating gymnastics to write the BOM into the file before reading it
>>> if it contains no BOM, which seems quite like a bug in the codec.
>>
>> The codec writes a BOM in the first call to .write() - it
>> doesn't write a BOM before reading from the file.
>
> Yes, see, I read a *lot* of UTF-16 that comes from other sources. It's
> not a matter of writing with Python and reading with Python.

Ok, but I don't really follow you here: you are suggesting to relax the current UTF-16 behavior and to start defaulting to UTF-16-BE if no BOM is present - that's most likely going to cause more problems than it seems to solve: namely complete garbage if the data turns out to be UTF-16-LE encoded and, what's worse, enters the application undetected. If you do have UTF-16 without a BOM mark it's much better to let a short function analyze the text by reading the first few bytes of the file and then make an educated guess based on the findings. You can then process the file using one of the other codecs UTF-16-LE or -BE.
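The "short function" Marc-Andre suggests could look like the following sketch (the BOM check plus a NUL-byte heuristic for ASCII-heavy text; the function name and the heuristic are illustrative assumptions, not an existing API, and Python 3 `bytes` semantics are assumed):

```python
def sniff_utf16(data):
    """Guess the byte order of UTF-16 data: trust a BOM when present;
    otherwise look at where the NUL bytes of ASCII-range characters
    fall (NULs mostly in even positions => high byte first => BE)."""
    if data[:2] == b'\xfe\xff':
        return 'utf-16-be'
    if data[:2] == b'\xff\xfe':
        return 'utf-16-le'
    sample = data[:64]
    if sample[0::2].count(0) > sample[1::2].count(0):
        return 'utf-16-be'
    return 'utf-16-le'

raw = b'\x00h\x00i'                  # BOM-less, big-endian bytes
text = raw.decode(sniff_utf16(raw))  # 'hi'
```

The heuristic only works for text with plenty of ASCII-range characters, which is exactly the "educated guess" caveat: the application, not the codec, decides whether such guessing is acceptable.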
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Thu Apr 7 18:00:12 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 18:29:59 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <1f7befae050407082140a591fd@mail.gmail.com> (Tim Peters's message of "Thu, 7 Apr 2005 11:21:39 -0400") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> Message-ID: <2m7jje8tmb.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] >> ... >> Point the first is that I really think this is a bug in the GilState >> APIs: the readline API isn't inherently multi-threaded and so it would >> be insane to call PyEval_InitThreads() in initreadline, yet it has to >> cope with being called in a multithreaded situation. If you can't use >> the GilState APIs in this situation, what are they for? > > That's explained in the PEP -- of course : > > http://www.python.org/peps/pep-0311.html Gnarr. Of course, I read this passage. I think it's missing a use case. > Under "Limitations and Exclusions" it specifically disowns > responsibility for worrying about whether Py_Initialize() and > PyEval_InitThreads() have been called: > [snip quote] This suggests that I should call PyEval_InitThreads() in initreadline(), which seems daft. > That doesn't mean there isn't a clever way to get the same effect > anyway, Pah. There's a very simple way (see my reply to Nick). It even works in the case that PyEval_InitThreads() is called in between the call to PyGilState_Ensure() and PyGilState_Release(). 
> but I don't have time to think about it, and reassigned the bug > report to Mark (who may or may not have time). He gets a week :) Cheers, mwh -- Or here's an even simpler indicator of how much C++ sucks: Print out the C++ Public Review Document. Have someone hold it about three feet above your head and then drop it. Thus you will be enlightened. -- Thant Tessman From nbastin at opnet.com Thu Apr 7 22:27:03 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 22:27:55 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255534A.2090505@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: On Apr 7, 2005, at 11:35 AM, M.-A. Lemburg wrote: > Ok, but I don't really follow you here: you are suggesting to > relax the current UTF-16 behavior and to start defaulting to > UTF-16-BE if no BOM is present - that's most likely going to > cause more problems that it seems to solve: namely complete > garbage if the data turns out to be UTF-16-LE encoded and, > what's worse, enters the application undetected. The crux of my argument is that the spec declares that UTF-16 without a BOM is BE. If the file is encoded in UTF-16LE and it doesn't have a BOM, it doesn't deserve to be processed correctly. That being said, treating it as UTF-16BE if it's LE will result in a lot of invalid code points, so it shouldn't be non-obvious that something has gone wrong. > If you do have UTF-16 without a BOM mark it's much better > to let a short function analyze the text by reading for first > few bytes of the file and then make an educated guess based > on the findings. You can then process the file using one > of the other codecs UTF-16-LE or -BE. 
This is about what we do now - we catch UnicodeError and then add a BOM to the file, and read it again. We know our files are UTF-16BE if they don't have a BOM, as the files are written by code which observes the spec. We can't use UTF-16BE all the time, because sometimes they're UTF-16LE, and in those cases the BOM is set. It would be nice if you could optionally specify that the codec would assume UTF-16BE if no BOM was present, and not raise UnicodeError in that case, which would preserve the current behaviour as well as allow users' to ask for behaviour which conforms to the standard. I'm not saying that you can't work around the issue now, what I'm saying is that you shouldn't *have* to - I think there is a reasonable expectation that the UTF-16 codec conforms to the spec, and if you wanted it to do something else, it is those users who should be forced to come up with a workaround. -- Nick From walter at livinglogic.de Thu Apr 7 23:32:28 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Apr 7 23:32:31 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> Nicholas Bastin sagte: > On Apr 7, 2005, at 11:35 AM, M.-A. Lemburg wrote: > > [...] >> If you do have UTF-16 without a BOM mark it's much better >> to let a short function analyze the text by reading for first >> few bytes of the file and then make an educated guess based >> on the findings. You can then process the file using one >> of the other codecs UTF-16-LE or -BE. 
> > This is about what we do now - we catch UnicodeError and > then add a BOM to the file, and read it again. We know > our files are UTF-16BE if they don't have a BOM, as the > files are written by code which observes the spec. > We can't use UTF-16BE all the time, because sometimes > they're UTF-16LE, and in those cases the BOM is set. > > It would be nice if you could optionally specify that the > codec would assume UTF-16BE if no BOM was present, > and not raise UnicodeError in that case, which would > preserve the current behaviour as well as allow users' > to ask for behaviour which conforms to the standard. It should be feasible to implement your own codec for that based on Lib/encodings/utf_16.py. Simply replace the line in StreamReader.decode(): raise UnicodeError,"UTF-16 stream does not start with BOM" with: self.decode = codecs.utf_16_be_decode and you should be done. > [...] Bye, Walter D?rwald From martin at v.loewis.de Thu Apr 7 23:38:39 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Apr 7 23:38:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: <4255A85F.1080307@v.loewis.de> Nicholas Bastin wrote: > It would be nice if you could optionally specify that the codec would > assume UTF-16BE if no BOM was present, and not raise UnicodeError in > that case, which would preserve the current behaviour as well as allow > users' to ask for behaviour which conforms to the standard. Alternatively, the UTF-16BE codec could support the BOM, and do UTF-16LE if the "other" BOM is found. This would also support your usecase, and in a better way. 
The Unicode assertion that UTF-16 is BE by default is void these days - there is *always* a higher layer protocol, and it more often than not specifies (perhaps not in English words, but only in the source code of the generator) that the default should be LE. Regards, Martin

From walter at livinglogic.de Thu Apr 7 23:47:07 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Apr 7 23:47:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> Message-ID: <1329.84.56.111.122.1112910427.squirrel@isar.livinglogic.de>

Walter Dörwald wrote: > Nicholas Bastin wrote: > > It should be feasible to implement your own codec for that > based on Lib/encodings/utf_16.py. Simply replace the line > in StreamReader.decode(): > raise UnicodeError,"UTF-16 stream does not start with BOM" > with: > self.decode = codecs.utf_16_be_decode > and you should be done.

Oops, this only works if you have a big endian system. Otherwise you have to redecode the input with:

    codecs.utf_16_ex_decode(input, errors, 1, False)

Bye, Walter Dörwald

From mal at egenix.com Fri Apr 8 00:12:53 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Fri Apr 8 00:12:57 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255A85F.1080307@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <4255A85F.1080307@v.loewis.de> Message-ID: <4255B065.9040101@egenix.com>

Martin v. Löwis wrote: > Nicholas Bastin wrote: > >>It would be nice if you could optionally specify that the codec would >>assume UTF-16BE if no BOM was present, and not raise UnicodeError in >>that case, which would preserve the current behaviour as well as allow >>users to ask for behaviour which conforms to the standard. > > > Alternatively, the UTF-16BE codec could support the BOM, and do > UTF-16LE if the "other" BOM is found.

That would violate the Unicode standard - the BOM character for UTF-16-LE and -BE must be interpreted as ZWNBSP.

> This would also support your use case, and in a better way. The > Unicode assertion that UTF-16 is BE by default is void these > days - there is *always* a higher layer protocol, and it more > often than not specifies (perhaps not in English words, but > only in the source code of the generator) that the default should > be LE.

I've checked the various versions of the Unicode standard docs: it seems that the quote you have was silently introduced between 3.0 and 4.0. Python currently uses version 3.2.0 of the standard and I don't think enough people are aware of the change in the standard to make a case for dropping the exception raising in the case of the UTF-16 codec finding a stream without a BOM mark. By the time we switch to 4.1 or later, we can then make the change in the native UTF-16 codec as you requested.
Personally, I think that the Unicode consortium should not have introduced a default for the UTF-16 encoding byte order. Using big endian as default in a world where most Unicode data is created on little endian machines is not very realistic either. Note that the UTF-16 codec starts reading data in the machine's native byte order and then learns a possibly different byte order by looking for BOMs. Implementing a codec which implements the 4.0 behavior is easy, though. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From stephen at xemacs.org Fri Apr 8 04:22:50 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri Apr 8 04:23:11 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255B065.9040101@egenix.com> (M.'s message of "Fri, 08 Apr 2005 00:12:53 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <4255A85F.1080307@v.loewis.de> <4255B065.9040101@egenix.com> Message-ID: <87vf6y6m85.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "MvL" == "Martin v. Löwis" writes: MvL> This would also support your use case, and in a better way.
MvL> The Unicode assertion that UTF-16 is BE by default is void MvL> these days - there is *always* a higher layer protocol, and MvL> it more often than not specifies (perhaps not in English MvL> words, but only in the source code of the generator) that the MvL> default should be LE. That is _not_ a protocol. A protocol is a published specification, not merely a frequent accident of implementation. Anyway, both ISO 10646 and the Unicode standard consider that "internal use" and there is no requirement at all placed on those data. And such generators typically take great advantage of that freedom---have you looked in a .doc file recently? Have you noticed how many different options (previous implementations) of .doc are offered in the Import menu? >>>>> "MAL" == "M.-A. Lemburg" writes: MAL> I've checked the various versions of the Unicode standard MAL> docs: it seems that the quote you have was silently MAL> introduced between 3.0 and 4.0. Probably because ISO 10646 was _always_ BE until the standards were unified. But note that ISO 10646 standardizes only use as a communications medium. Neither ISO 10646 nor Unicode makes any specification about internal usage. Conformance in internal processing is a matter of the programmer's convenience in producing conforming output. MAL> Python currently uses version 3.2.0 of the standard and I MAL> don't think enough people are aware of the change in the MAL> standard There's only one (corporate) person that matters: Microsoft. MAL> By the time we switch to 4.1 or later, we can then make the MAL> change in the native UTF-16 codec as you requested. While in principle I sympathize with Nick, pragmatically Microsoft is unlikely to conform. They will take the position that files created by Windows are "internal" to the Windows environment, except where explicitly intended for exchange with arbitrary platforms, and only then will they conform. As Martin points out, that is what really matters for these defaults.
I think you should look to see what Microsoft does. MAL> Personally, I think that the Unicode consortium should not MAL> have introduced a default for the UTF-16 encoding byte MAL> order. Using big endian as default in a world where most MAL> Unicode data is created on little endian machines is not very MAL> realistic either. It's not a default for the UTF-16 encoding byte order. It's a default for the UTF-16 encoding byte order _when UTF-16 is a communications medium_. Given that the generic network byte order is bigendian, I think it would be insane to specify littleendian as Unicode's default. With Unicode same as network, you specify UTF-16 strings internally as an array of uint16_t, and when you put them on the wire (including saving them to a file that might be put on the wire as octet-stream) you apply htons(3) to it. On reading, you apply ntohs(3) to it. The source code is portable, the file is portable. How can you beat that? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From python at rcn.com Thu Apr 7 16:58:11 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 8 04:58:25 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504051906.34590.fdrake@acm.org> Message-ID: <000301c53b82$3834d160$4baf958d@oemcomputer> Does anyone know what has become of the following developers and perhaps have their current email addresses? Are any of these folks still active in Python development? 
Ben Gertzfield Charles G Waldman Eric Price Finn Bock Ken Manheimer Moshe Zadka Raymond Hettinger From aleaxit at yahoo.com Fri Apr 8 05:50:53 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Apr 8 05:50:59 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: On Apr 7, 2005, at 07:58, Raymond Hettinger wrote: > Does anyone know what has become of the following developers and > perhaps > have their current email addresses? Are any of these folks still > active > in Python development? > > Ben Gertzfield > Charles G Waldman > Eric Price > Finn Bock > Ken Manheimer > Moshe Zadka Moshe was at Pycon (sorry I didn't think of introducing you to each other!) so I do assume he's still active. Alex From greg.ewing at canterbury.ac.nz Fri Apr 8 07:03:42 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 8 07:03:59 2005 Subject: [Python-Dev] New style classes and operator methods Message-ID: <425610AE.5070605@canterbury.ac.nz> I think I've found a small flaw in the implementation of binary operator methods for new-style Python classes. If the left and right operands are of the same class, and the class implements a right operand method but not a left operand method, the right operand method is not called. Instead, two attempts are made to call the left operand method. I'm surmising this is because both calls are funnelled through the same C-level method, which is using the types of the operands to decide whether to call the left or right Python methods. I suppose this isn't really a serious problem, since it's easily worked around by always defining at least a left operand method. But I thought I'd point it out anyway. 
The following example illustrates the problem:

    class NewStyleSpam(object):
        def __add__(self, other):
            print "NewStyleSpam.__add__", self, other
            return NotImplemented
        def __radd__(self, other):
            print "NewStyleSpam.__radd__", self, other
            return 42

    x1 = NewStyleSpam()
    x2 = NewStyleSpam()
    print x1 + x2

which produces:

    NewStyleSpam.__add__ <__main__.NewStyleSpam object at 0x4019062c> <__main__.NewStyleSpam object at 0x4019056c>
    NewStyleSpam.__add__ <__main__.NewStyleSpam object at 0x4019062c> <__main__.NewStyleSpam object at 0x4019056c>
    Traceback (most recent call last):
      File "/home/cosc/staff/research/greg/tmp/foo.py", line 27, in ?
        print x1 + x2
    TypeError: unsupported operand type(s) for +: 'NewStyleSpam' and 'NewStyleSpam'

Old-style classes, on the other hand, work as expected:

    class OldStyleSpam:
        def __add__(self, other):
            print "OldStyleSpam.__add__", self, other
            return NotImplemented
        def __radd__(self, other):
            print "OldStyleSpam.__radd__", self, other
            return 42

    y1 = OldStyleSpam()
    y2 = OldStyleSpam()
    print y1 + y2

produces:

    OldStyleSpam.__add__ <__main__.OldStyleSpam instance at 0x4019054c> <__main__.OldStyleSpam instance at 0x401901ec>
    OldStyleSpam.__radd__ <__main__.OldStyleSpam instance at 0x401901ec> <__main__.OldStyleSpam instance at 0x4019054c>
    42

-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From fdrake at acm.org Fri Apr 8 15:31:38 2005 From: fdrake at acm.org (Fred Drake) Date: Fri Apr 8 15:32:02 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <200504080931.38652.fdrake@acm.org> On Thursday 07 April 2005 10:58, Raymond Hettinger wrote: > Eric Price Eric Price was an intern at CNRI; I think it's safe to remove him from the list, as I've not seen anything from him in a *long* time. -Fred -- Fred L. Drake, Jr. From jhylton at gmail.com Fri Apr 8 15:53:07 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Apr 8 15:53:09 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504080931.38652.fdrake@acm.org> References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> Message-ID: On Apr 8, 2005 9:31 AM, Fred Drake wrote: > On Thursday 07 April 2005 10:58, Raymond Hettinger wrote: > > Eric Price > > Eric Price was an intern at CNRI; I think it's safe to remove him from the > list, as I've not seen anything from him in a *long* time. Eric Price did some of the work on the decimal package, which was only two summers ago. He wasn't an intern at CNRI. Jeremy From eyal.lotem at gmail.com Fri Apr 8 16:01:02 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Fri Apr 8 16:01:05 2005 Subject: [Python-Dev] Security capabilities in Python Message-ID: I would like to experiment with security based on Python references as security capabilities. Unfortunately, there are several problems that make Python references invalid as capabilities: * There is no way to create secure proxies because there are no private attributes. * Lots of Python objects are reachable unnecessarily, breaking the principle of least privilege (i.e: object.__subclasses__() etc.)
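Eyal's second point is easy to demonstrate: any code that can name `object` can walk the type hierarchy to every class in the interpreter, so by default nothing is unreachable. A quick illustration (the class name here is hypothetical, not part of his proposal):

```python
class PrivateThing(object):
    """A class its author might wish were unreachable from outside."""
    secret = "s3kr1t"

# Any code at all can rediscover it through object's subclass list,
# since object.__subclasses__() enumerates every direct subclass:
found = [cls for cls in object.__subclasses__()
         if cls.__name__ == 'PrivateThing']
assert found and found[0].secret == "s3kr1t"
```

This is exactly the kind of ambient authority a capability system has to cut off, since holding a reference to `object` is unavoidable.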
I was wondering if any such effort has already begun or if there are other considerations making Python unusable as a capability platform? (Please cc the reply to my email) From fdrake at acm.org Fri Apr 8 16:02:18 2005 From: fdrake at acm.org (Fred Drake) Date: Fri Apr 8 16:02:25 2005 Subject: [Python-Dev] Developer list update In-Reply-To: References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> Message-ID: <200504081002.18073.fdrake@acm.org> On Friday 08 April 2005 09:53, Jeremy Hylton wrote: > Eric Price did some of the work on the decimal package, which was only > two summers ago. He wasn't an intern at CNRI. A different Eric Price, then. Mea culpa. (Or am I misremembering the intern's name? Hmm.) -Fred -- Fred L. Drake, Jr. From jim at zope.com Fri Apr 8 16:45:22 2005 From: jim at zope.com (Jim Fulton) Date: Fri Apr 8 16:45:31 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: <42569902.9030307@zope.com> You might take a look at zope.security: http://svn.zope.org/Zope3/trunk/src/zope/security/ It isn't a capability-based system, but it does address similar problems and might have some useful ideas. See the README.txt and untrustedinterpreter.txt. Jim Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. > > Unfortunately, there are several problems that make Python references > invalid as capabilities: > > * There is no way to create secure proxies because there are no > private attributes. > * Lots of Python objects are reachable unnecessarily, breaking the > principle of least privilege (i.e: object.__subclasses__() etc.) > > I was wondering if any such effort has already begun or if there are > other considerations making Python unusable as a capability platform?
> > (Please cc the reply to my email) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jim%40zope.com -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From barry at python.org Fri Apr 8 16:58:28 2005 From: barry at python.org (Barry Warsaw) Date: Fri Apr 8 16:58:34 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <1112972308.19892.6.camel@geddy.wooz.org> On Thu, 2005-04-07 at 10:58, Raymond Hettinger wrote: > Ben Gertzfield Ben did a lot of work on the i18n parts of the email package. I haven't heard from him in quite a while. > Ken Manheimer Ken's still around. I'll send you his current email address in a separate (pvt) message. -Barry From tim.peters at gmail.com Fri Apr 8 19:01:33 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 19:01:37 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <200504051906.34590.fdrake@acm.org> <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <1f7befae05040810013a338acc@mail.gmail.com> [Raymond Hettinger] > Does anyone know what has become of the following developers and perhaps > have their current email addresses?
How about we exploit that if someone is a Python developer on SF, they necessarily have an SF email address ($(SFNAME)@users.sourceforge.net, like I'm tim_one@users.sourceforge.net)? Then, IMO, if someone with SF commit privs can't be reached via their SF address, they shouldn't have SF commit privs. From tim.peters at gmail.com Fri Apr 8 19:05:41 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 19:05:47 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504081002.18073.fdrake@acm.org> References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> <200504081002.18073.fdrake@acm.org> Message-ID: <1f7befae05040810052e3d40df@mail.gmail.com> [Jeremy] >> Eric Price did some of the work on the decimal package, which was only >> two summers ago. He wasn't an intern at CNRI. [Fred] > A different Eric Price, then. Mea culpa. > > (Or am I misremembering the intern's name? Hmm.) Yes, Eric Price was "the PythonLabs intern", for the brief time that lasted. I'll add info about him to developers.txt. He was given SF developer status specifically to work on the decimal module, which then lived in the Python sandbox. There isn't a reason for him to remain a developer. From python at rcn.com Fri Apr 8 09:02:42 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 8 21:02:56 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1f7befae05040810013a338acc@mail.gmail.com> Message-ID: <000101c53c08$f5b4d100$d122a044@oemcomputer> > [Raymond Hettinger] > > Does anyone know what has become of the following developers and perhaps > > have their current email addresses? [Tim Peters] > How about we exploit that if someone is a Python developer on SF, they > necessarily have an SF email address ($(SFNAME)@users.sourceforge.net, > like I'm tim_one@users.sourceforge.net)? I used those addresses and sent notes to everyone who hasn't made a recent checkin. 
For the most part, we've gotten lots of cheerful responses (with one notable exception) indicating a continuing use for the checkin privs. A few people no longer have a use for the access and I'm recording those as we go. > Then, IMO, if someone with SF commit privs can't be reached via their > SF address, they shouldn't have SF commit privs. I'm taking a lighter approach and making every effort to get in contact. If they respond, I'll ask them to update their SF address. Raymond From tim.peters at gmail.com Fri Apr 8 21:54:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 21:55:05 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c53c08$f5b4d100$d122a044@oemcomputer> References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae05040812546478a677@mail.gmail.com> ... [Uncle "Bad Cop" Timmy] >> Then, IMO, if someone with SF commit privs can't be reached via their >> SF address, they shouldn't have SF commit privs. [Raymond "Good Cop" Hettinger] > I'm taking a lighter approach and making every effort to get in contact. > If they respond, I'll ask them to update their SF address. Of course! I would too, if I were you. But given that I'm still me, the annotated attributions above should clarify the role I'm playing here <0.9 wink>. From tim.peters at gmail.com Fri Apr 8 21:54:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 21:57:30 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c53c08$f5b4d100$d122a044@oemcomputer> References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae05040812546478a677@mail.gmail.com> ... [Uncle "Bad Cop" Timmy] >> Then, IMO, if someone with SF commit privs can't be reached via their >> SF address, they shouldn't have SF commit privs. [Raymond "Good Cop" Hettinger] > I'm taking a lighter approach and making every effort to get in contact. 
> If they respond, I'll ask them to update their SF address. Of course! I would too, if I were you. But given that I'm still me, the annotated attributions above should clarify the role I'm playing here <0.9 wink>.

From tjreedy at udel.edu Fri Apr 8 22:26:36 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Apr 8 22:28:21 2005 Subject: [Python-Dev] Re: Security capabilities in Python References: Message-ID: "Eyal Lotem" wrote in message news:b64f365b0504080701206af8d3@mail.gmail.com... >I would like to experiment with security based on Python references as > security capabilities. I am pretty sure that there was a prolonged discussion on Python, security, and capability on this list a year or two ago. Perhaps you can find it in the summary archives or the archives themselves. tjr

From skip at pobox.com Fri Apr 8 22:30:09 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Apr 8 22:30:19 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <200504051906.34590.fdrake@acm.org> <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <16982.59857.373473.929701@montanaro.dyndns.org> Raymond> Does anyone know what has become of ... Raymond> Charles G Waldman I'd scratch Charles from the list. I work at the same company he did. Nobody here has been in touch with him for over a year. Several of us have tried to get ahold of him but to no avail. Skip

From greg at electricrain.com Fri Apr 8 23:42:18 2005 From: greg at electricrain.com (Gregory P.
-- Scott David Daniels Scott.Daniels@Acm.Org From fredrik at pythonware.com Sat Apr 9 01:32:21 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 01:32:33 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: Message-ID: Scott David Daniels wrote: > What should marshal / unmarshal do with floating point NaNs (the case we > are worrying about is Infinity) ? The current behavior is not perfect. > > Michael Spencer chased down a supposed "Idle" problem to (on Win2k): > marshal.dumps(1e10000) == 'f\x061.#INF' > marshal.loads('f\x061.#INF') == 1.0 > > Should loads raise an exception? > Somehow, I thing 1.0 is not the best possible representation for +Inf. looks like marshal uses atof to parse the string, without bothering to check for trailing junk... it should probably use a strtod instead, and raise an exception if there's enough junk left at the end (see PyFloat_ FromString for sample code). fwiw, here's what I get on a linux box: >>> import marshal >>> marshal.dumps(1e10000) 'f\x03inf' >>> marshal.loads(_) inf and yes, someone should fix the NaN mess, but I guess everyone's too busy removing unworthy developers from sourceforge to bother working on stuff that's actually useful for real Python users... From tim.peters at gmail.com Sat Apr 9 01:38:24 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Apr 9 01:38:28 2005 Subject: [Python-Dev] marshal / unmarshal In-Reply-To: References: Message-ID: <1f7befae0504081638145d3b4c@mail.gmail.com> [Scott David Daniels] > What should marshal / unmarshal do with floating point NaNs (the case we > are worrying about is Infinity) ? The current behavior is not perfect. All Python behavior in the presence of a NaN, infinity, or signed zero is a platform-dependent accident. This is because C89 has no such concepts, and Python is written to the C89 standard. 
It's not easy to fix across all platforms (because there is no portable way to do so in standard C), although it may be reasonably easy to fix if all anyone cares about is gcc and MSVC (every platform C compiler has its own set of gimmicks for "dealing with" these things). If marshal could reliably detect a NaN, then of course unmarshal should reliably reproduce the NaN -- provided the platform on which it's unpacked supports NaNs. > Should loads raise an exception? Never for a quiet NaN, unless the platform doesn't support NaNs. It's harder to know what to with a signaling NaN, because Python doesn't have any of 754's trap-enable or exception status flags either (the new ``decimal`` module does, but none of that is integrated with the _rest_ of Python yet). Should note that what the fp literal 1e10000 does across boxes is also an accident -- Python defers to the platform C libraries for string<->float conversions. From fredrik at pythonware.com Sat Apr 9 01:37:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 01:43:17 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library References: Message-ID: wrote: > (ie. the re library only returns the ~last~ match for named groups - not > a list of ~all~ the matches for the named groups. And the hierarchy of those named groups is non-existant in the flat dictionary of matches > that results. ) are you 100% sure that this can be implemented on top of other RE engines (CPython isn't the only Python implementation out there). (generally speaking, trying to turn an RE engine into a parser is a lousy idea. the library would benefit more from a simple parser toolkit than it benefits from more non-standard and highly specialized RE hacks...) 
From tim.peters at gmail.com Sat Apr 9 02:06:06 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Apr 9 02:06:10 2005 Subject: [Python-Dev] Re: Developer list update In-Reply-To: References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae0504081706160c3fa@mail.gmail.com> [Raymond Hettinger wrote: >> I used those addresses and sent notes to everyone who hasn't made a >> recent checkin. [Fredrik Lundh] > where recent obviously was defined as "after 2.4" for checkins, and "last week" > for tracker activities. Raymond didn't mention tracker activity above, and that's a different issue -- it's possible now to separate commit privileges from tracker privileges on SourceForge. Like it or not (I think I can guess which), every person with commit privs implies at least one box that can become a security hole, and at least 5 people who in fact never commit anymore were agreeable to giving up SF developer privs. > python-dev was a lot more fun in the old days. Ya, but you were too -- and so was I. I expect these all go together, given that (the collective) we _are_ python-dev. So what have you been up to lately? Skip it unless the answer's fun . From tjreedy at udel.edu Sat Apr 9 03:59:32 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Apr 9 03:59:40 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: "Tim Peters" wrote in message news:1f7befae0504081638145d3b4c@mail.gmail.com... > All Python behavior in the presence of a NaN, infinity, or signed zero > is a platform-dependent accident. The particular issue here is not platform dependence as such but within-platform usage dependence, as in the same code giving radically different answers in a standard interactive console window and an idle window, or when you run it the first time (from xx.py) versus subsequent times (from xx.pyc) until you edit the file again. 
(I verified this on 2.2, but MSpencer claimed to have tested on 2.4). Having the value of an expression such as '100 < 1e1000' flip back and forth between True and False from run to run *is* distressing for some people ;-). I know that this has come up before as 'wont fix' bug, but it might be better to have invalid floats like 1e1000, etc, not compile and raise an exception (at least on Windows) instead of breaking the reasonable expectation that unmarshal(marshal(codeob)) == codeob. That would force people (at least on Windows) to do something more more within-platform deterministic. >If marshal could reliably detect a NaN, then of course unmarshal >should reliably reproduce the NaN -- provided the platform on which >it's unpacked supports NaNs Windows seems to support +- INF just fine, doing arithmetic and comparisons 'correctly'. So it seems that detection or reproduction is the problem. Terry J. Reedy From Scott.Daniels at Acm.Org Sat Apr 9 04:20:30 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat Apr 9 04:20:51 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: Terry Reedy wrote: > "Tim Peters" wrote in message > news:1f7befae0504081638145d3b4c@mail.gmail.com... > >>All Python behavior in the presence of a NaN, infinity, or signed zero >>is a platform-dependent accident. > > > The particular issue here is not platform dependence as such but > within-platform usage dependence, as in the same code giving radically > different answers in a standard interactive console window and an idle > window, or when you run it the first time (from xx.py) versus subsequent > times (from xx.pyc) until you edit the file again. (I verified this on 2.2, > but MSpencer claimed to have tested on 2.4). Having the value of an > expression such as '100 < 1e1000' flip back and forth between True and > False from run to run *is* distressing for some people ;-). 
> > I know that this has come up before as 'wont fix' bug, but it might be > better to have invalid floats like 1e1000, etc, not compile and raise an > exception (at least on Windows) instead of breaking the reasonable > expectation that unmarshal(marshal(codeob)) == codeob. That would force > people (at least on Windows) to do something more more within-platform > deterministic. > > >>If marshal could reliably detect a NaN, then of course unmarshal >>should reliably reproduce the NaN -- provided the platform on which >>it's unpacked supports NaNs > > > Windows seems to support +- INF just fine, doing arithmetic and comparisons > 'correctly'. So it seems that detection or reproduction is the problem. > > Terry J. Reedy > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > I can write the Windows-dependent detect code if that is what is wanted. I just want to know what the consensus is on the "should." If we cause exceptions, should they be one encode or decode or both? If not, do we replicate all NaNs, Infs of both signs, Indeterminates?.... --Scott David Daniels Scott.Daniels@Acm.Org From python-dev at zesty.ca Sat Apr 9 07:13:40 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 07:13:45 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: On Fri, 8 Apr 2005, Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. This is an interesting and worthwhile thought. Several people (including myself) have talked about the possibility of doing this in the past. I believe the two problems you mention can be addressed without modifying the Python core. > * There is no way to create secure proxies because there are no > private attributes. 
Attributes are not private, but local variables are. If you use lexical scoping to restrict variable access (as one would in Scheme, E, etc.) you can create secure proxies. See below.

> * Lots of Python objects are reachable unnecessarily breaking the
> principle of least privelege (i.e: object.__subclasses__() etc.)

True. However, Python's restricted execution mode prevents access to these attributes, allowing you to enforce encapsulation. (At least, that is part of the intent of restricted execution mode, though currently we do not make official guarantees about it.) Replacing __builtins__ activates restricted execution mode.

Here is a simple facet function.

    def facet(target, allowed_attrs):
        class Facet:
            def __repr__(self):
                return '' % (allowed_attrs, target)
            def __getattr__(self, name):
                if name in allowed_attrs:
                    return getattr(target, name)
                raise NameError(name)
        return Facet()

    def restrict():
        global __builtins__
        __builtins__ = __builtins__.__dict__.copy()

    # Here's an example.

    list = [1, 2, 3]
    immutable_facet = facet(list, ['__getitem__', '__len__', '__iter__'])

    # Here's another example.

    class Counter:
        def __init__(self):
            self.n = 0

        def increment(self):
            self.n += 1

        def value(self):
            return self.n

    counter = Counter()
    readonly_facet = facet(counter, ['value'])

If i've done this correctly, it should be impossible to alter the contents of the list or the counter, given only the immutable_facet or the readonly_facet, after restrict() has been called. (Try it out and let me know if you can poke holes in it...)

The upshot of all this is that i think you can do secure programming in Python if you just use a different style. Unfortunately, this style is incompatible with the way classes are usually written in Python, which means you can't safely use much of the standard library, but i believe the language itself is not fatally flawed.
-- ?!ng From fredrik at pythonware.com Sat Apr 9 07:32:37 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 07:32:37 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: Tim Peters wrote: > All Python behavior in the presence of a NaN, infinity, or signed zero > is a platform-dependent accident. This is because C89 has no such > concepts, and Python is written to the C89 standard. It's not easy to > fix across all platforms (because there is no portable way to do so in > standard C), although it may be reasonably easy to fix if all anyone > cares about is gcc and MSVC which probably represents very close to 100% of all python interpreter instances out there. making floats behave the same on standard builds for windows, mac os x, and linux would be a great step forward. +1.0 from me. >> Should loads raise an exception? > > Never for a quiet NaN, unless the platform doesn't support NaNs. It's > harder to know what to with a signaling NaN, because Python doesn't > have any of 754's trap-enable or exception status flags either (the > new ``decimal`` module does, but none of that is integrated with the > _rest_ of Python yet). > > Should note that what the fp literal 1e10000 does across boxes is also > an accident -- Python defers to the platform C libraries for > string<->float conversions. yeah, but the problem here is that MSVC cannot read its own NaN:s; float() checks for that, but loads doesn't. compare and contrast: >>> float(str(1e10000)) Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): 1.#INF >>> import marshal >>> marshal.loads(marshal.dumps(1e10000)) 1.0 on the other hand, >>> marshal.loads("\f\x01x") Traceback (most recent call last): File "", line 1, in ? 
ValueError: bad marshal data adding basic error checking shouldn't be very hard (you could probably call the string->float converter in the float object module, and just map any exceptions to "bad marshal data") From steve at holdenweb.com Sat Apr 9 08:37:15 2005 From: steve at holdenweb.com (Steve Holden) Date: Sat Apr 9 08:37:36 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: Message-ID: <4257781B.4050704@holdenweb.com> Fredrik Lundh wrote: [...] > > and yes, someone should fix the NaN mess, but I guess everyone's too > busy removing unworthy developers from sourceforge to bother working > on stuff that's actually useful for real Python users... > That's not at all true. Some of us are busy giving up commit privileges in order to avoid the impression that we might one day work on stuff that actually useful to real Python users. Except, possibly, conferences. The effbot is at least averagely cantankerous this month :-) unworthi-ly y'rs - steve -- Steve Holden +1 703 861 4237 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ From exarkun at divmod.com Sat Apr 9 11:02:23 2005 From: exarkun at divmod.com (Jp Calderone) Date: Sat Apr 9 11:02:32 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: Message-ID: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> On Sat, 9 Apr 2005 00:13:40 -0500 (CDT), Ka-Ping Yee wrote: >On Fri, 8 Apr 2005, Eyal Lotem wrote: > > I would like to experiment with security based on Python references as > > security capabilities. > > This is an interesting and worthwhile thought. Several people > (including myself) have talked about the possibility of doing > this in the past. I believe the two problems you mention can be > addressed without modifying the Python core. > > > * There is no way to create secure proxies because there are no > > private attributes. > > Attributes are not private, but local variables are. 
If you use > lexical scoping to restrict variable access (as one would in > Scheme, E, etc.) you can create secure proxies. See below. > > > * Lots of Python objects are reachable unnecessarily breaking the > > principle of least privelege (i.e: object.__subclasses__() etc.) > > True. However, Python's restricted execution mode prevents access > to these attributes, allowing you to enforce encapsulation. (At > least, that is part of the intent of restricted execution mode, > though currently we do not make official guarantees about it.) > Replacing __builtins__ activates restricted execution mode. > > Here is a simple facet function. > > def facet(target, allowed_attrs): > class Facet: > def __repr__(self): > return '' % (allowed_attrs, target) > def __getattr__(self, name): > if name in allowed_attrs: > return getattr(target, name) > raise NameError(name) > return Facet() > > def restrict(): > global __builtins__ > __builtins__ = __builtins__.__dict__.copy() > > # Here's an example. > > list = [1, 2, 3] > immutable_facet = facet(list, ['__getitem__', '__len__', '__iter__']) > > # Here's another example. > > class Counter: > def __init__(self): > self.n = 0 > > def increment(self): > self.n += 1 > > def value(self): > return self.n > > counter = Counter() > readonly_facet = facet(counter, ['value']) > > If i've done this correctly, it should be impossible to alter the > contents of the list or the counter, given only the immutable_facet > or the readonly_facet, after restrict() has been called. > > (Try it out and let me know if you can poke holes in it...) > > The upshot of all this is that i think you can do secure programming > in Python if you just use a different style. Unfortunately, this > style is incompatible with the way classes are usually written in > Python, which means you can't safely use much of the standard library, > but i believe the language itself is not fatally flawed. > Does using the gc module to bypass this security count? 
If so: exarkun@boson:~$ python -i facet.py >>> import gc >>> c = readonly_facet.__getattr__.func_closure[1] >>> r = gc.get_referents(c)[0] >>> r.n = 'hax0r3d' >>> readonly_facet.value() 'hax0r3d' >>> This is the easiest way of which I know to bypass the use of cells as a security mechanism. I believe there are other more involved (and fragile, probably) ways, though. Jp From martin at v.loewis.de Sat Apr 9 13:12:33 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Apr 9 13:12:37 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: <4257B8A1.6000902@v.loewis.de> Terry Reedy wrote: > The particular issue here is not platform dependence as such but > within-platform usage dependence, as in the same code giving radically > different answers in a standard interactive console window and an idle > window, or when you run it the first time (from xx.py) versus subsequent > times (from xx.pyc) until you edit the file again. Yet, this *still* is a platform dependence. Python makes no guarantee that 1e1000 is a supported float literal on any platform, and indeed, on your platform, 1e1000 is not supported on your platform. Furthermore, Python makes no guarantee that it will report when an unsupported float-literal is found, so you just get different behaviour, by accident. This, in turn, is a violation of the principle "errors should never pass silently". Alas, nobody found the time to detect the error, yet. Just don't do that, then. Regards, Martin From skip at pobox.com Sat Apr 9 14:30:04 2005 From: skip at pobox.com (Skip Montanaro) Date: Sat Apr 9 14:30:07 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <4257B8A1.6000902@v.loewis.de> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> Message-ID: <16983.51916.407019.489590@montanaro.dyndns.org> Martin> Yet, this *still* is a platform dependence. 
Python makes no Martin> guarantee that 1e1000 is a supported float literal on any Martin> platform, and indeed, on your platform, 1e1000 is not supported Martin> on your platform. Are float("inf") and float("nan") supported everywhere? I don't have ready access to a Windows machine, but on the couple Linux and MacOS machines at-hand they are. As a starting point can it be agreed on whether they should be supported? (There is a unique IEEE-754 representation for both values, right? Should we try and support any other floating point format?) If so, the float("1e10000") == float("inf") in all cases, right? If not, then Python's lexer should be trained to know what out-of-range floats are and complain when it encounters them. In either case, we should then know how to fix marshal.loads (and probably pickle.loads). That seems like it would be a start in the right direction. Skip From fredrik at pythonware.com Sat Apr 9 14:53:15 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 14:54:00 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Are float("inf") and float("nan") supported everywhere? nope. >>> float("inf") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): inf >>> float("nan") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): nan >>> 1e10000 1.#INF >>> float("1.#INF") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): 1.#INF > As a starting point can it be agreed on whether they should be supported? that would be nice. > In either case, we should then know how to fix marshal.loads (and probably > pickle.loads). pickle doesn't have the INF=>1.0 bug: >>> import pickle >>> pickle.loads(pickle.dumps(1e10000)) ... 
ValueError: invalid literal for float(): 1.#INF >>> import cPickle >>> cPickle.loads(cPickle.dumps(1e10000)) ... ValueError: could not convert string to float >>> import marshal >>> marshal.loads(marshal.dumps(1e10000)) 1.0 From martin at v.loewis.de Sat Apr 9 15:32:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Apr 9 15:32:09 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <16983.51916.407019.489590@montanaro.dyndns.org> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: <4257D956.80402@v.loewis.de> Skip Montanaro wrote: > Martin> Yet, this *still* is a platform dependence. Python makes no > Martin> guarantee that 1e1000 is a supported float literal on any > Martin> platform, and indeed, on your platform, 1e1000 is not supported > Martin> on your platform. > > Are float("inf") and float("nan") supported everywhere? I would not expect that, but Tim will correct me if I'm wrong. > As a starting point can it be agreed on whether they > should be supported? (There is a unique IEEE-754 representation for both > values, right? Perhaps yes for inf, but I think maybe no for nan. There are multiple IEEE-754 representations for NaN. However, I understand all NaN are meant to compare unequal - even if they use the same representation. > If so, the float("1e10000") == float("inf") in all cases, right? Currently, not necessarily: if a large-enough exponent is supported (which might be the case with a IEEE "long double", dunno), 1e10000 would be a regular value. > That seems like it would be a start in the right direction. Pieces of it would be a start in the right direction. 
Regards, Martin From fredrik at pythonware.com Sat Apr 9 18:36:48 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 18:36:57 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: > pickle doesn't have the INF=>1.0 bug: > >>>> import pickle >>>> pickle.loads(pickle.dumps(1e10000)) > ... > ValueError: invalid literal for float(): 1.#INF > >>>> import cPickle >>>> cPickle.loads(cPickle.dumps(1e10000)) > ... > ValueError: could not convert string to float > >>>> import marshal >>>> marshal.loads(marshal.dumps(1e10000)) > 1.0 should I check in a fix for this? the code in PyFloat_FromString contains lots of trickery to deal with more or less broken literals, and more or less broken C libraries. unfortunately, and unlike most other functions with similar names, PyFloat_FromString takes a Python object, not a char pointer. would it be a good idea to add a variant that takes a char*? if so, should PyFloat_FromString use the new function, or are we avoiding that kind of refactoring for speed reasons these days? any opinions? From fredrik at pythonware.com Sat Apr 9 19:43:26 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 19:44:00 2005 Subject: [Python-Dev] Re: Security capabilities in Python References: Message-ID: Ka-Ping wrote: > counter = Counter() > readonly_facet = facet(counter, ['value']) > > If i've done this correctly, it should be impossible to alter the > contents of the list or the counter, given only the immutable_facet > or the readonly_facet, after restrict() has been called. I'm probably missing something, but a straightforward reflection approach seems to work on my machine: >>> restrict() >>> readonly_facet = facet(counter, ['value']) >>> print readonly_facet.value() 0 >>> readonly_facet.value.im_self.n = "oops!" >>> print readonly_facet.value() oops! 
>>> class mycounter: ... def value(self): return "muhaha!" ... >>> readonly_facet.value.im_self.__class__ = mycounter >>> print readonly_facet.value() muhaha! ... >>> readonly_facet.value.im_func.func_globals["readonly_facet"] = myinstance ... and so on does that restrict() function really do the right thing, or is my python install broken? From mwh at python.net Sat Apr 9 20:13:04 2005 From: mwh at python.net (Michael Hudson) Date: Sat Apr 9 20:13:05 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> (Jp Calderone's message of "Sat, 09 Apr 2005 09:02:23 GMT") References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <2mhdif7r9r.fsf@starship.python.net> Jp Calderone writes: > Does using the gc module to bypass this security count? If so: > > exarkun@boson:~$ python -i facet.py > >>> import gc > >>> c = readonly_facet.__getattr__.func_closure[1] > >>> r = gc.get_referents(c)[0] > >>> r.n = 'hax0r3d' > >>> readonly_facet.value() > 'hax0r3d' > >>> > > This is the easiest way of which I know to bypass the use of cells > as a security mechanism. I believe there are other more involved > (and fragile, probably) ways, though. The funniest I know is part of PyPy: def extract_cell_content(c): """Get the value contained in a CPython 'cell', as read through the func_closure of a function object.""" # yuk! this is all I could come up with that works in Python 2.2 too class X(object): def __eq__(self, other): self.other = other x = X() x_cell, = (lambda: x).func_closure x_cell == c return x.other It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this process became impossible. Cheers, mwh -- Java sucks. [...] Java on TV set top boxes will suck so hard it might well inhale people from off their sofa until their heads get wedged in the card slots. 
--- Jon Rabone, ucam.chat From mwh at python.net Sat Apr 9 20:15:46 2005 From: mwh at python.net (Michael Hudson) Date: Sat Apr 9 22:17:09 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <20050408214218.GE24751@zot.electricrain.com> (Gregory P. Smith's message of "Fri, 8 Apr 2005 14:42:18 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> Message-ID: <2md5t37r59.fsf@starship.python.net> "Gregory P. Smith" writes: >> > Under "Limitations and Exclusions" it specifically disowns >> > responsibility for worrying about whether Py_Initialize() and >> > PyEval_InitThreads() have been called: >> > >> [snip quote] >> >> This suggests that I should call PyEval_InitThreads() in >> initreadline(), which seems daft. > > fwiw, Modules/_bsddb.c does exactly that. Interesting. The problem with readline.c doing this is that it gets implicitly imported by the interpreter -- although only for interactive sessions. Maybe that's not that big a deal. I'd still prefer to change the functions (would updating the PEP be in order here? Obviously, I'd update the api documentation). Cheers, mwh -- It's relatively seldom that desire for sex is involved in technology procurement decisions. -- ESR at EuroPython 2002 From bob at redivi.com Sat Apr 9 22:54:30 2005 From: bob at redivi.com (Bob Ippolito) Date: Sat Apr 9 22:54:35 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2md5t37r59.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> Message-ID: <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> On Apr 9, 2005, at 11:15 AM, Michael Hudson wrote: > "Gregory P. 
Smith" writes: > >>>> Under "Limitations and Exclusions" it specifically disowns >>>> responsibility for worrying about whether Py_Initialize() and >>>> PyEval_InitThreads() have been called: >>>> >>> [snip quote] >>> >>> This suggests that I should call PyEval_InitThreads() in >>> initreadline(), which seems daft. >> >> fwiw, Modules/_bsddb.c does exactly that. > > Interesting. The problem with readline.c doing this is that it gets > implicitly imported by the interpreter -- although only for > interactive sessions. Maybe that's not that big a deal. I'd still > prefer to change the functions (would updating the PEP be in order > here? Obviously, I'd update the api documentation). Is there a good reason to *not* call PyEval_InitThreads when using a threaded Python? Sounds like it would just be easier to implicitly call it during Py_Initialize some day. -bob From python-dev at zesty.ca Sat Apr 9 22:56:46 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 22:56:55 2005 Subject: [Python-Dev] Re: Security capabilities in Python In-Reply-To: References: Message-ID: On Sat, 9 Apr 2005, Fredrik Lundh wrote: > Ka-Ping wrote: > > counter = Counter() > > readonly_facet = facet(counter, ['value']) > > > > If i've done this correctly, it should be impossible to alter the > > contents of the list or the counter, given only the immutable_facet > > or the readonly_facet, after restrict() has been called. > > I'm probably missing something, but a straightforward reflection > approach seems to work on my machine: That's funny. After i called restrict() Python didn't let me get im_self. >>> restrict() >>> readonly_facet.value > >>> readonly_facet.value.im_self Traceback (most recent call last): File "", line 1, in ? RuntimeError: restricted attribute >>> It doesn't matter if i make the facet before or after restrict(). >>> restrict() >>> rf2 = facet(counter, ['value']) >>> rf2.value > >>> rf2.value.im_self Traceback (most recent call last): File "", line 1, in ? 
RuntimeError: restricted attribute >>> I'm using Python 2.3 (#1, Sep 13 2003, 00:49:11) [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin -- ?!ng From foom at fuhm.net Sat Apr 9 22:58:54 2005 From: foom at fuhm.net (James Y Knight) Date: Sat Apr 9 22:59:06 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <2mhdif7r9r.fsf@starship.python.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> Message-ID: <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> On Apr 9, 2005, at 2:13 PM, Michael Hudson wrote: > The funniest I know is part of PyPy: > > def extract_cell_content(c): > """Get the value contained in a CPython 'cell', as read through > the func_closure of a function object.""" > # yuk! this is all I could come up with that works in Python 2.2 > too > class X(object): > def __eq__(self, other): > self.other = other > x = X() > x_cell, = (lambda: x).func_closure > x_cell == c > return x.other > > It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this > process became impossible. It would be quite fortunate if you didn't have to do all that, and cell just had a "value" attribute, though. James From python-dev at zesty.ca Sat Apr 9 23:37:34 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 23:37:41 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: On Sat, 9 Apr 2005, Jp Calderone wrote: > Does using the gc module to bypass this security count? If so: > > exarkun@boson:~$ python -i facet.py > >>> import gc > >>> c = readonly_facet.__getattr__.func_closure[1] > >>> r = gc.get_referents(c)[0] > >>> r.n = 'hax0r3d' > >>> readonly_facet.value() > 'hax0r3d' > >>> You can't get func_closure in restricted mode. (Or at least, i can't, using the Python included with Mac OS 10.3.8.) 
>>> restrict() >>> readonly_facet.__getattr__.func_closure Traceback (most recent call last): File "", line 1, in ? RuntimeError: restricted attribute >>> Even though this particular example doesn't work in restricted mode, it's true that the gc module violates capability discipline, and you would have to forbid its import. In any real use case, you would have to restrict imports anyway to prevent access to sys.modules or loading of arbitrary binaries. For a version that restricts imports, see: http://zesty.ca/python/facet.py Let me know if you figure out how to defeat that. (This is a fun exercise, but with a potential purpose -- it would be nice to have a coherent story on this for Python 3000, or maybe even Python 2.x.) -- ?!ng From python-dev at zesty.ca Sat Apr 9 23:46:11 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 23:46:16 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <2mhdif7r9r.fsf@starship.python.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> Message-ID: On Sat, 9 Apr 2005, Michael Hudson wrote: > The funniest I know is part of PyPy: > > def extract_cell_content(c): > """Get the value contained in a CPython 'cell', as read through > the func_closure of a function object.""" > # yuk! this is all I could come up with that works in Python 2.2 too > class X(object): > def __eq__(self, other): > self.other = other > x = X() > x_cell, = (lambda: x).func_closure > x_cell == c > return x.other That's pretty amazing. > It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this > process became impossible. Not a problem. func_closure is already a restricted attribute. IMHO, the clean way to do this is to provide a built-in function to get the cell content in a more direct and reliable way, and then put that in a separate module with other interpreter hacks. 
That both makes it easier to do stuff like this, and easier to prevent it simply by forbidding import of that module. -- ?!ng From pedronis at strakt.com Sat Apr 9 23:50:48 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Sat Apr 9 23:49:19 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <42584E38.6060203@strakt.com> Ka-Ping Yee wrote: > On Sat, 9 Apr 2005, Jp Calderone wrote: > >> Does using the gc module to bypass this security count? If so: >> >> exarkun@boson:~$ python -i facet.py >> >>> import gc >> >>> c = readonly_facet.__getattr__.func_closure[1] >> >>> r = gc.get_referents(c)[0] >> >>> r.n = 'hax0r3d' >> >>> readonly_facet.value() >> 'hax0r3d' >> >>> > > > You can't get func_closure in restricted mode. (Or at least, i can't, > using the Python included with Mac OS 10.3.8.) > > >>> restrict() > >>> readonly_facet.__getattr__.func_closure > Traceback (most recent call last): > File "", line 1, in ? > RuntimeError: restricted attribute > >>> > > Even though this particular example doesn't work in restricted mode, > it's true that the gc module violates capability discipline, and you > would have to forbid its import. In any real use case, you would have > to restrict imports anyway to prevent access to sys.modules or loading > of arbitrary binaries. > > For a version that restricts imports, see: > > http://zesty.ca/python/facet.py > > Let me know if you figure out how to defeat that. you should probably search the list and look at my old attacks against restricted execution, there's reason why is not much supported anymore. One can still try to use it but needs to be extremely careful or use C defined proxies... etc. > > (This is a fun exercise, but with a potential purpose -- it would be > nice to have a coherent story on this for Python 3000, or maybe even > Python 2.x.) 
> > > -- ?!ng > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com From foom at fuhm.net Sun Apr 10 00:02:22 2005 From: foom at fuhm.net (James Y Knight) Date: Sun Apr 10 00:02:36 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> On Apr 9, 2005, at 5:37 PM, Ka-Ping Yee wrote: > Let me know if you figure out how to defeat that. You can protect against this, too, but it does show that it's *really* hard to get restricting code right...I'm of the opinion that it's not really worth it -- you should just use OS protections. untrusted_module.py: class foostr(str): def __eq__(self, other): return True def have_at_it(immutable_facet, readonly_facet): getattr(immutable_facet, foostr('append'))(5) print immutable_facet James From python-dev at zesty.ca Sun Apr 10 00:34:12 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sun Apr 10 00:34:20 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: On Sat, 9 Apr 2005, James Y Knight wrote: > You can protect against this, too, but it does show that it's *really* > hard to get restricting code right... Good point. If you can't trust ==, then you're hosed. > I'm of the opinion that it's not > really worth it -- you should just use OS protections. This i disagree with, however. OS protections are a few orders of magnitude more heavyweight and vastly more error-prone than using a language with simple, clear semantics. Predictable code behaviour is good. 
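[James's str-subclass attack from this exchange can be reproduced verbatim today. The facet is redefined below so the snippet stands alone; the `__hash__` assignment is an addition needed on Python 3, where defining `__eq__` discards the inherited hash.]

```python
def facet(target, allowed_attrs):
    # Minimal facet, as in Ka-Ping's earlier message in this thread.
    class Facet:
        def __getattr__(self, name):
            if name in allowed_attrs:
                return getattr(target, name)
            raise AttributeError(name)
    return Facet()

class foostr(str):
    # Claims equality with everything, so the "name in allowed_attrs"
    # membership test succeeds for any attribute name.
    def __eq__(self, other):
        return True
    __hash__ = str.__hash__  # Python 3: __eq__ would otherwise reset __hash__

lst = [1, 2, 3]
immutable_facet = facet(lst, ['__getitem__', '__len__', '__iter__'])
getattr(immutable_facet, foostr('append'))(5)  # slips past the check
print(lst)  # the "immutable" list has been mutated
```

The lesson matches James's point: any guard built on `==` (and `in` is built on `==`) can be subverted by an object that lies about equality.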
-- ?!ng From Scott.Daniels at Acm.Org Sun Apr 10 16:43:22 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun Apr 10 16:44:23 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Fredrik Lundh wrote: >>pickle doesn't have the INF=>1.0 bug: >>>>>import pickle >>>>>pickle.loads(pickle.dumps(1e10000)) >>... >>ValueError: invalid literal for float(): 1.#INF >>>>>import cPickle >>>>>cPickle.loads(cPickle.dumps(1e10000)) >>... >>ValueError: could not convert string to float >>>>>import marshal >>>>>marshal.loads(marshal.dumps(1e10000)) >>1.0 > should I check in a fix for this? > > the code in PyFloat_FromString contains lots of trickery to deal with more or less > broken literals, and more or less broken C libraries. > > unfortunately, and unlike most other functions with similar names, PyFloat_FromString > takes a Python object, not a char pointer. would it be a good idea to add a variant > that takes a char*? if so, should PyFloat_FromString use the new function, or are we > avoiding that kind of refactoring for speed reasons these days? > > any opinions? > > From yesterday's sprint, we found a smallest-change style fix. At the least a change like this will catch the unpacking: in marshal.c (around line 500) in function r_object: ... 
case TYPE_FLOAT: { char buf[256]; + char *endptr; double dx; n = r_byte(p); if (n == EOF || r_string(buf, (int)n, p) != n) { PyErr_SetString(PyExc_EOFError, "EOF read where object expected"); return NULL; } buf[n] = '\0'; PyFPE_START_PROTECT("atof", return 0) - dx = PyOS_ascii_atof(buf); + dx = PyOS_ascii_strtod(buf, &endptr); PyFPE_END_PROTECT(dx) + if (buf + n != endptr) { + PyErr_SetString(PyExc_ValueError, + "not all marshalled float text read"); + return NULL; + } return PyFloat_FromDouble(dx); } -- Scott David Daniels Scott.Daniels@Acm.Org From mwh at python.net Sun Apr 10 17:22:15 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 17:22:18 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> (Bob Ippolito's message of "Sat, 9 Apr 2005 13:54:30 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> Message-ID: <2m8y3q7j2w.fsf@starship.python.net> Bob Ippolito writes: > Is there a good reason to *not* call PyEval_InitThreads when using a > threaded Python? Well, it depends how expensive one's OS's locking primitives are, I think. There were some numbers posted to the twisted list recently that showed it didn't make a whole lot of difference on some platform or other... I don't have the knowledge or the courage to make that call. > Sounds like it would just be easier to implicitly call it during > Py_Initialize some day. That might indeed be simpler. Cheers, mwh -- The gripping hand is really that there are morons everywhere, it's just that the Americon morons are funnier than average.
-- Pim van Riezen, alt.sysadmin.recovery From fredrik at pythonware.com Sun Apr 10 17:29:24 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Apr 10 17:29:39 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Scott David Daniels wrote: > From yesterday's sprint sprint? I was beginning to wonder why nobody cared about this; guess I missed the announcement ;-) > At the least a change like this will catch the unpacking: > in marshal.c (around line 500) in function r_object: > PyFPE_START_PROTECT("atof", return 0) > - dx = PyOS_ascii_atof(buf); > + dx = PyOS_ascii_strtod(buf, &endptr); > PyFPE_END_PROTECT(dx) the PROTECT contents should probably match the function you're using. > + if (buf + n != endptr) { > + PyErr_SetString(PyExc_ValueError, > + "not all marshalled float text read"); > + return NULL; this will fix the problem, sure. I still think it would be cleaner to reuse the float() semantics, since marshal.dumps uses repr(). to do that, you should use the code in floatobject.c (it wraps strtod in additional logic designed to take care of various platform quirks). but nevermind, you have a patch and I don't. if nobody objects, go ahead and check it in. From mwh at python.net Sun Apr 10 17:34:17 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 17:34:20 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> (James Y.
Knight's message of "Sat, 9 Apr 2005 16:58:54 -0400") References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> Message-ID: <2m4qee7iiu.fsf@starship.python.net> James Y Knight writes: > On Apr 9, 2005, at 2:13 PM, Michael Hudson wrote: > >> The funniest I know is part of PyPy: >> >> def extract_cell_content(c): >> """Get the value contained in a CPython 'cell', as read through >> the func_closure of a function object.""" >> # yuk! this is all I could come up with that works in Python 2.2 >> too >> class X(object): >> def __eq__(self, other): >> self.other = other >> x = X() >> x_cell, = (lambda: x).func_closure >> x_cell == c >> return x.other >> >> It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this >> process became impossible. > > It would be quite fortunate if you didn't have to do all that, and > cell just had a "value" attribute, though. Indeed. The 2.2 compatibility issue remains, though. Cheers, mwh -- Presumably pronging in the wrong place zogs it. -- Aldabra Stoddart, ucam.chat From eyal.lotem at gmail.com Sun Apr 10 18:08:01 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun Apr 10 18:08:05 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: It may be really hard to get it right, unless we are overlooking some simple solution. I disagree that we should "just use OS protections". The reason I am interested in Pythonic protection is because it is so much more powerful than OS protections. The capability model is much more powerful than the ACL model used by all OS's these days, and allows for interesting security concepts. What about implementing the facet in C? This could avoid the class of problems you have just mentioned. 
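[Editorial aside on the cell-extraction exchange above: James got his wish in later Pythons, where cell objects expose a cell_contents attribute, so the rigged-__eq__ contortion is only needed on 2.2-era interpreters. A sketch in modern spelling (func_closure became __closure__); it relies on cells comparing by their contents, which is what makes the original trick work:]

```python
def make_closure():
    x = 42
    return lambda: x

cell, = make_closure().__closure__      # func_closure in the 2.x spelling

# The direct route James wanted:
print(cell.cell_contents)               # 42

# The PyPy workaround still works: comparing two cells compares their
# contents, so a rigged __eq__ is handed the other cell's value.
class X:
    def __eq__(self, other):
        self.other = other

def extract_cell_content(c):
    x = X()
    x_cell, = (lambda: x).__closure__
    x_cell == c                         # invokes X.__eq__ with c's contents
    return x.other

print(extract_cell_content(cell))       # 42
```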
On Apr 9, 2005 2:02 PM, James Y Knight wrote: > On Apr 9, 2005, at 5:37 PM, Ka-Ping Yee wrote: > > Let me know if you figure out how to defeat that. > > You can protect against this, too, but it does show that it's *really* > hard to get restricting code right...I'm of the opinion that it's not > really worth it -- you should just use OS protections. > > untrusted_module.py: > > class foostr(str): > def __eq__(self, other): > return True > > def have_at_it(immutable_facet, readonly_facet): > getattr(immutable_facet, foostr('append'))(5) > print immutable_facet > > James -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050410/600d8b8e/attachment.htm From tim.peters at gmail.com Sun Apr 10 19:23:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sun Apr 10 19:23:35 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: <1f7befae05041010237d11d7a9@mail.gmail.com> marshal shouldn't be representing doubles as decimal strings to begin with. All code for (de)serialing C doubles should go thru _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and struct (std mode) already do; marshal is the oddball. But as the docs (floatobject.h) for these say: ... * Bug: What this does is undefined if x is a NaN or infinity. * Bug: -0.0 and +0.0 produce the same string. */ PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); ... * Bug: What this does is undefined if the string represents a NaN or * infinity. 
*/ PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le); PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le); From fredrik at pythonware.com Sun Apr 10 20:26:44 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Apr 10 20:27:09 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: Tim Peters wrote: > marshal shouldn't be representing doubles as decimal strings to begin > with. All code for (de)serialing C doubles should go thru > _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and > struct (std mode) already do; marshal is the oddball. is changing the marshal format really the right thing to do at this point? From mwh at python.net Sun Apr 10 20:08:56 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 22:34:00 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041010237d11d7a9@mail.gmail.com> (Tim Peters's message of "Sun, 10 Apr 2005 13:23:32 -0400") References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: <2mmzs65wsn.fsf@starship.python.net> Tim Peters writes: > marshal shouldn't be representing doubles as decimal strings to begin > with. All code for (de)serialing C doubles should go thru > _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and > struct (std mode) already do; marshal is the oddball. > > But as the docs (floatobject.h) for these say: > > ... > * Bug: What this does is undefined if x is a NaN or infinity. > * Bug: -0.0 and +0.0 produce the same string. > */ > PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); > PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); > ... 
> * Bug: What this does is undefined if the string represents a NaN or > * infinity. > */ > PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le); > PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le); OTOH, the implementation has this comment: /*---------------------------------------------------------------------------- * _PyFloat_{Pack,Unpack}{4,8}. See floatobject.h. * * TODO: On platforms that use the standard IEEE-754 single and double * formats natively, these routines could simply copy the bytes. */ Doing that would fix these problems, surely?[1] The question, of course, is how to tell. I suppose one could just do it unconditionally and wait for one of the three remaining VAX users[2] to compile Python 2.5 and then notice. More conservatively, one could just do this on Windows, linux/most architectures and Mac OS X. Cheers, mwh [1] I'm slightly worried about oddball systems that do insane things with the FPU by default -- but don't think the mooted change would make things any worse. [2] Exaggeration, I realize -- but how many non 754 systems are out there? How many will see Python 2.5? -- If you give someone Fortran, he has Fortran. If you give someone Lisp, he has any language he pleases. -- Guy L. Steele Jr, quoted by David Rush in comp.lang.scheme.scsh From skip at pobox.com Sun Apr 10 22:44:52 2005 From: skip at pobox.com (Skip Montanaro) Date: Sun Apr 10 22:44:19 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2mmzs65wsn.fsf@starship.python.net> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> Message-ID: <16985.36932.105169.855614@montanaro.dyndns.org> Michael> I suppose one could just do it unconditionally and wait for one Michael> of the three remaining VAX users[2] to compile Python 2.5 and Michael> then notice.
You forgot the two remaining CRAY users. Since their machines are so much more powerful than VAXen, they have much more influence over Python development. Skip From foom at fuhm.net Sun Apr 10 22:54:08 2005 From: foom at fuhm.net (James Y Knight) Date: Sun Apr 10 22:54:25 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2m8y3q7j2w.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> Message-ID: <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: > Bob Ippolito writes: > >> Is there a good reason to *not* call PyEval_InitThreads when using a >> threaded Python? > > Well, it depends how expensive ones OS's locking primitives are, I > think. There were some numbers posted to the twisted list recently > that showed it didn't make a whole lot of difference on some platform > or other... I don't have the knowledge or the courage to make that > call. > >> Sounds like it would just be easier to implicitly call it during >> Py_Initialize some day. > > That might indeed be simpler. Here's the numbers. It looks like something changed between python 2.2 and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, it doesn't seem to make a whole lot of difference on recent versions of Python. Three test programs: ${PYTHON} -c 'import pystone, time; print pystone.pystones(200000)' ${PYTHON} -c 'import thread, pystone, time; print pystone.pystones(200000)' ${PYTHON} -c 'import thread, pystone, time; thread.start_new_thread(lambda: time.sleep(10000), ()); print pystone.pystones(200000)' All tests run using the same copy of pystone. 
System 1: RH73, dual 3GHz Xeon [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-110)] -------- Python 1.5.2 (#1, Apr 3 2002, 18:16:26) (8.15, 24540) (8.28, 24155) (12.78, 15649) Python 2.2.2 (#1, Jul 23 2003, 13:47:48) (6.32, 31646) (6.27, 31898) (11.1, 18018) Python 2.4.1 (#1, Apr 4 2005, 17:19:27) (4.60, 43478) (4.61, 43384) (4.74, 42194) System 2, FC3/64, dual 2.4GHz athlon 64. [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] -------- Python 2.3.4 (#1, Oct 26 2004, 16:45:38) (3.84, 52083) (3.80, 52632) (3.98, 50251) Python 2.4.1 (#1, Apr 10 2005, 15:47:53) (3.09, 64725) (3.08, 64935) (3.26, 61350) Python 2.4.1 (#1, Apr 1 2005, 16:45:07) *compiled in 32 bit mode* (3.35, 59701) (3.42, 58480) (3.57, 56022) From mwh at python.net Sun Apr 10 23:48:59 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 23:49:02 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> (James Y. Knight's message of "Sun, 10 Apr 2005 16:54:08 -0400") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> Message-ID: <2mis2u5mlw.fsf@starship.python.net> James Y Knight writes: > On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: > >> Bob Ippolito writes: >> >>> Is there a good reason to *not* call PyEval_InitThreads when using a >>> threaded Python? >> >> Well, it depends how expensive ones OS's locking primitives are, I >> think. There were some numbers posted to the twisted list recently >> that showed it didn't make a whole lot of difference on some platform >> or other... I don't have the knowledge or the courage to make that >> call. >> >>> Sounds like it would just be easier to implicitly call it during >>> Py_Initialize some day. 
>> >> That might indeed be simpler. > > Here's the numbers. It looks like something changed between python 2.2 > and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, > it doesn't seem to make a whole lot of difference on recent versions > of Python. Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 here). It's very much a guess, but could this patch: [ 525532 ] Add support for POSIX semaphores be the one to thank? Cheers, mwh -- Now this is what I don't get. Nobody said absolutely anything bad about anything. Yet it is always possible to just pull random flames out of ones ass. -- http://www.advogato.org/person/vicious/diary.html?start=60 From bob at redivi.com Mon Apr 11 00:26:00 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 00:26:06 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mis2u5mlw.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> Message-ID: On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: > James Y Knight writes: > >> On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: >> >>> Bob Ippolito writes: >>> >>>> Is there a good reason to *not* call PyEval_InitThreads when using a >>>> threaded Python? >>> >>> Well, it depends how expensive ones OS's locking primitives are, I >>> think. There were some numbers posted to the twisted list recently >>> that showed it didn't make a whole lot of difference on some platform >>> or other... I don't have the knowledge or the courage to make that >>> call. >>> >>>> Sounds like it would just be easier to implicitly call it during >>>> Py_Initialize some day. >>> >>> That might indeed be simpler. 
>> >> Here's the numbers. It looks like something changed between python 2.2 >> and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, >> it doesn't seem to make a whole lot of difference on recent versions >> of Python. > > Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 > here). > > It's very much a guess, but could this patch: > > [ 525532 ] Add support for POSIX semaphores > > be the one to thank? No, Mac OS X doesn't implement POSIX semaphores. -bob From tim.peters at gmail.com Mon Apr 11 00:37:44 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 00:37:48 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2mmzs65wsn.fsf@starship.python.net> References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> Message-ID: <1f7befae05041015372cf17e91@mail.gmail.com> [mwh] > OTOH, the implementation has this comment: > > /*---------------------------------------------------------------------------- > * _PyFloat_{Pack,Unpack}{4,8}. See floatobject.h. > * > * TODO: On platforms that use the standard IEEE-754 single and double > * formats natively, these routines could simply copy the bytes. > */ > > Doing that would fix these problems, surely?[1] The 754 standard doesn't say anything about how the difference between signaling and quiet NaNs is represented. So it's possible that a qNaN on one box would "look like" an sNaN on a different box, and vice versa. But since most people run with all FPU traps disabled, and Python doesn't expose a way to read the FPU status flags, they couldn't tell the difference. Copying bytes works perfectly for all other cases (signed zeroes, non-zero finites, infinities), because their representations are wholly defined, although it's possible that a subnormal on one box will be treated like a zero (with the same sign) on a partially-conforming box. 
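[Editorial aside: Tim's "copying bytes works perfectly" claim is easy to check from present-day Python with struct's fixed-layout "d" code, which produces the little-endian IEEE-754 image that _PyFloat_Pack8 aims for. A sketch, not the marshal patch itself:]

```python
import math
import struct

def pack8(x):
    # Little-endian IEEE-754 double: the 8 raw bytes of the value.
    return struct.pack('<d', x)

def unpack8(b):
    return struct.unpack('<d', b)[0]

# Signed zeroes, finites and infinities round-trip exactly...
for x in (0.0, -0.0, 1.5, -2.75, 1e308, float('inf'), float('-inf')):
    y = unpack8(pack8(x))
    assert y == x
    assert math.copysign(1.0, y) == math.copysign(1.0, x)

# ...and a NaN round-trips bit-for-bit even though it compares unequal.
nan = float('nan')
assert pack8(unpack8(pack8(nan))) == pack8(nan)

print(pack8(1.5).hex())   # 000000000000f83f
```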
> [1] I'm slighyly worried about oddball systems that do insane things > with the FPU by default -- but don't think the mooted change would > make things any worse. Sorry, don't know what that means. > The question, of course, is how to tell. Store a few small doubles at module initialization time and stare at their bits. That's enough to settle whether a 754 format is in use, and, if it is, whether it's big-endian or little-endian. ... > [2] Exaggeration, I realize -- but how many non 754 systems are out > there? How many will see Python 2.5? No idea here. The existing pack routines strive to do a good job of _creating_ an IEEE-754-format representation regardless of platform representation. I assume that code would still be present, so "oddball" platforms would be left no worse off than they are now. From tim.peters at gmail.com Mon Apr 11 00:42:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 00:42:23 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: <1f7befae0504101542361dd121@mail.gmail.com> [Fredrik Lundh] > is changing the marshal format really the right thing to do at this > point? I don't see anything special about "this point" -- it's just sometime between 2.4.1 and 2.5a0. What do you have in mind? Like pickle formats, I expect a change to marshal would add a new format code, not take away an older code, so older marshal strings could still be read. Etc. 
From mwh at python.net Mon Apr 11 01:08:18 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 01:08:20 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: (Bob Ippolito's message of "Sun, 10 Apr 2005 15:26:00 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> Message-ID: <2mekdi5ixp.fsf@starship.python.net> Bob Ippolito writes: > On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: > >> James Y Knight writes: >> >>> Here's the numbers. It looks like something changed between python 2.2 >>> and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, >>> it doesn't seem to make a whole lot of difference on recent versions >>> of Python. >> >> Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 >> here). >> >> It's very much a guess, but could this patch: >> >> [ 525532 ] Add support for POSIX semaphores >> >> be the one to thank? > > No, Mac OS X doesn't implement POSIX semaphores. Well, does OS X show the same effect between 2.2 and 2.3? I don't have a 2.2 on OS X any more, I was just talking about James' results on linux. Cheers, mwh -- Slim Shady is fed up with your shit, and he's going to kill you. 
-- Eminem, "Public Service Announcement 2000" From bob at redivi.com Mon Apr 11 01:24:17 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 01:24:22 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mekdi5ixp.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> <2mekdi5ixp.fsf@starship.python.net> Message-ID: <4531465672695d375c721c56af37fe87@redivi.com> On Apr 10, 2005, at 4:08 PM, Michael Hudson wrote: > Bob Ippolito writes: > >> On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: >> >>> James Y Knight writes: >>> >>>> Here's the numbers. It looks like something changed between python >>>> 2.2 >>>> and 2.3 that made calling PyEval_InitThreads a lot less expensive. >>>> So, >>>> it doesn't seem to make a whole lot of difference on recent versions >>>> of Python. >>> >>> Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have >>> 2.2 >>> here). >>> >>> It's very much a guess, but could this patch: >>> >>> [ 525532 ] Add support for POSIX semaphores >>> >>> be the one to thank? >> >> No, Mac OS X doesn't implement POSIX semaphores. > > Well, does OS X show the same effect between 2.2 and 2.3? I don't > have a 2.2 on OS X any more, I was just talking about James' results > on linux. I don't have 2.2 on OS X any more, either. 
-bob From aleaxit at yahoo.com Mon Apr 11 07:07:00 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Apr 11 07:07:06 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <16985.36932.105169.855614@montanaro.dyndns.org> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <16985.36932.105169.855614@montanaro.dyndns.org> Message-ID: <538f17711d377e0154e295e4b9f924bc@yahoo.com> On Apr 10, 2005, at 13:44, Skip Montanaro wrote: > > Michael> I suppose one could jsut do it unconditionally and wait > for one > Michael> of the three remaining VAX users[2] to compile Python 2.5 > and > Michael> then notice. > > You forgot the two remaining CRAY users. Since their machines are so > much > more powerful than VAXen, they have much more influence over Python > development. The latest ads I've seen from Cray were touting AMD-64 processors anyway...;-) Alex From fredrik at pythonware.com Mon Apr 11 09:33:09 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon Apr 11 09:43:21 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: Tim Peters wrote: > [Fredrik Lundh] >> is changing the marshal format really the right thing to do at this >> point? > > I don't see anything special about "this point" -- it's just sometime > between 2.4.1 and 2.5a0. What do you have in mind? I was under the impression that the marshal format has been stable for quite a long time (people are using it for various RPC protocols, among other things). I might be wrong. 
From bob at redivi.com Mon Apr 11 10:00:50 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 10:00:55 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: <93c9ae04e805845e5bdf84671c6802c7@redivi.com> On Apr 11, 2005, at 12:33 AM, Fredrik Lundh wrote: > Tim Peters wrote: >> [Fredrik Lundh] >>> is changing the marshal format really the right thing to do at this >>> point? >> >> I don't see anything special about "this point" -- it's just sometime >> between 2.4.1 and 2.5a0. What do you have in mind? > > I was under the impression that the marshal format has been stable for > quite a long time (people are using it for various RPC protocols, among > other things). I might be wrong. The documentation for marshal explicitly states that you should not use it for such purposes. There's also a version argument to dumps and dump (though the argument list in the dump documentation doesn't say so), where version 0 is pre-2.4, and version 1 is 2.4+. I don't think it's out of the question to add a version 2 for 2.5+ that uses a better serialization for floats (and it should probably add set/frozenset too since those are builtins now). 
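[Editorial aside: the version knob Bob describes is still visible in present-day marshal, and, as it turned out, version 2 did add a binary code for floats while the old decimal-string code stayed readable — exactly the "add a new format code, not take away an older code" path Tim anticipated. A small sketch:]

```python
import marshal

# Version 0 writes floats as repr() text; version 2 and later use the
# 8-byte binary form this thread was asking for.
text_form = marshal.dumps(1.5, 0)
binary_form = marshal.dumps(1.5, 2)
assert text_form != binary_form

# Old codes stayed readable: loads() accepts either serialization.
assert marshal.loads(text_form) == 1.5
assert marshal.loads(binary_form) == 1.5

# The 1e10000 -> 1.0 bug that started the thread is gone.
inf = 1e10000
assert marshal.loads(marshal.dumps(inf)) == inf
```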
-bob From mwh at python.net Mon Apr 11 15:37:58 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 15:38:00 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041015372cf17e91@mail.gmail.com> (Tim Peters's message of "Sun, 10 Apr 2005 18:37:44 -0400") References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> Message-ID: <2maco55t8p.fsf@starship.python.net> Tim Peters writes: > The 754 standard doesn't say anything about how the difference between > signaling and quiet NaNs is represented. So it's possible that a qNaN > on one box would "look like" an sNaN on a different box, and vice > versa. But since most people run with all FPU traps disabled, and > Python doesn't expose a way to read the FPU status flags, they > couldn't tell the difference. OK. Do you have any intuition as to whether 754 implementations actually *do* differ on this point? > Copying bytes works perfectly for all other cases (signed zeroes, > non-zero finites, infinities), because their representations are > wholly defined, although it's possible that a subnormal on one box > will be treated like a zero (with the same sign) on a > partially-conforming box. I'd find struggling to care about that pretty hard. >> [1] I'm slighyly worried about oddball systems that do insane things >> with the FPU by default -- but don't think the mooted change would >> make things any worse. > > Sorry, don't know what that means. Neither do I, now. Oh well . >> The question, of course, is how to tell. > > Store a few small doubles at module initialization time and stare at ./configure time, surely? > their bits. That's enough to settle whether a 754 format is in use, > and, if it is, whether it's big-endian or little-endian. Do you have a pointer to code that does this? 
Googling around the subject appears to turn up lots of Python stuff... >> [2] Exaggeration, I realize -- but how many non 754 systems are out >> there? How many will see Python 2.5? > > No idea here. The existing pack routines strive to do a good job of > _creating_ an IEEE-754-format representation regardless of platform > representation. I assume that code would still be present, so > "oddball" platforms would be left no worse off than they are now. Well, yes, given the above. The text this footnote was attached to was asking if just assuming 754 float formats would inconvenience anyone. Cheers, mwh -- I don't have any special knowledge of all this. In fact, I made all the above up, in the hope that it corresponds to reality. -- Mark Carroll, ucam.chat From tim.peters at gmail.com Mon Apr 11 17:27:43 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 17:27:46 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2maco55t8p.fsf@starship.python.net> References: <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> Message-ID: <1f7befae050411082733dca644@mail.gmail.com> [Tim] >> The 754 standard doesn't say anything about how the difference between >> signaling and quiet NaNs is represented. So it's possible that a qNaN >> on one box would "look like" an sNaN on a different box, and vice >> versa. But since most people run with all FPU traps disabled, and >> Python doesn't expose a way to read the FPU status flags, they >> couldn't tell the difference. [mwh] > OK. Do you have any intuition as to whether 754 implementations > actually *do* differ on this point? Not anymore -- hasn't been part of my job, or a hobby, for over a decade. There were differences a decade+ ago. 
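[Editorial aside: one can at least inspect what the local hardware does by pulling the 64 bits of a NaN apart with struct. This sketch assumes the common MSB-of-the-mantissa quiet-bit convention — set for quiet, clear for signaling — which is what mainstream hardware settled on:]

```python
import struct

bits, = struct.unpack('>Q', struct.pack('>d', float('nan')))
exponent = (bits >> 52) & 0x7FF
quiet_bit = (bits >> 51) & 1

assert exponent == 0x7FF    # all exponent bits set: some kind of NaN
assert quiet_bit == 1       # CPython's float('nan') is quiet on x86/ARM
print(hex(bits))            # typically 0x7ff8000000000000
```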
All NaNs have all exponent bits set, and at least one mantissa bit set, and every bit pattern of that form represents a NaN. That's all the standard says. The most popular way to distinguish quiet from signaling NaNs keyed off the most-significant mantissa bit: set for a qNaN, clear for an sNaN. It's possible that all 754 HW does that now. There's at least still that Pentium hardware adds a third not-a-number possibility: in addition to 754's quiet and signaling NaNs, it also has "indeterminate" values. Here w/ native Windows Python 2.4 on a Pentium: >>> inf = 1e300 * 1e300 >>> inf - inf # indeterminate -1.#IND >>> - _ # but the negation of IND is a quiet NaN 1.#QNAN >>> Do the same thing under Cygwin Python on the same box and it prints "NaN" twice. Do people care about this? I don't know. It seems unlikely -- in effect, IND just gives a special string name to a single one of the many bit patterns that represent a quiet NaN. OTOH, Pentium hardware still preserves this distinction, and MS library docs do too. IND isn't part of the 754 standard (although, IIRC, it was part of a pre-standard draft, which Intel implemented and is now stuck with). >> Copying bytes works perfectly for all other cases (signed zeroes, >> non-zero finites, infinities), because their representations are >> wholly defined, although it's possible that a subnormal on one box >> will be treated like a zero (with the same sign) on a >> partially-conforming box. > I'd find struggling to care about that pretty hard. Me too. >>> The question, of course, is how to tell. >> Store a few small doubles at module initialization time and stare at > ./configure time, surely? Unsure. Not all Python platforms _have_ "./configure time". Module initialization code is harder to screw up for that reason (the code is in an obvious place then, self-contained, and doesn't require any relevant knowledge of any platform porter unless/until it breaks). >> their bits. 
That's enough to settle whether a 754 format is in use, >> and, if it is, whether it's big-endian or little-endian. > Do you have a pointer to code that does this? No. Pemberton's enquire.c contains enough code to do it. Given how few distinct architectures still exist, it's probably enough to store just double x = 1.5 and stare at it. >>> [2] Exaggeration, I realize -- but how many non 754 systems are out >>> there? How many will see Python 2.5? >> No idea here. The existing pack routines strive to do a good job of >> _creating_ an IEEE-754-format representation regardless of platform >> representation. I assume that code would still be present, so >> "oddball" platforms would be left no worse off than they are now. > Well, yes, given the above. The text this footnote was attached to > was asking if just assuming 754 float formats would inconvenience > anyone. I think I'm still missing your intent here. If you're asking whether Python can blindly assume that 754 is in use, I'd say that's undesirable but defensible if necessary. From mwh at python.net Mon Apr 11 18:01:49 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 18:01:51 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050411082733dca644@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 11:27:43 -0400") References: <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> Message-ID: <2msm1x480i.fsf@starship.python.net> Tim Peters writes: > [Tim] >>> The 754 standard doesn't say anything about how the difference between >>> signaling and quiet NaNs is represented. So it's possible that a qNaN >>> on one box would "look like" an sNaN on a different box, and vice >>> versa.
But since most people run with all FPU traps disabled, and >>> Python doesn't expose a way to read the FPU status flags, they >>> couldn't tell the difference. > > [mwh] >> OK. Do you have any intuition as to whether 754 implementations >> actually *do* differ on this point? > > Not anymore -- hasn't been part of my job, or a hobby, for over a > decade. There were differences a decade+ ago. All NaNs have all > exponent bits set, and at least one mantissa bit set, and every bit > pattern of that form represents a NaN. That's all the standard says. > The most popular way to distinguish quiet from signaling NaNs keyed > off the most-significant mantissa bit: set for a qNaN, clear for an > sNaN. It's possible that all 754 HW does that now. [snip details] OK, so the worst that could happen here is that moving marshal data from one box to another could turn one sort of NaN into another? This doesn't seem very bad. [denorms] >> I'd find struggling to care about that pretty hard. > > Me too. Good. >>>> The question, of course, is how to tell. > >>> Store a few small doubles at module initialization time and stare at > >> ./configure time, surely? > > Unsure. Not all Python platforms _have_ "./configure time". But they all have pyconfig.h. > Module initialization code is harder to screw up for that reason > (the code is in an obvious place then, self-contained, and doesn't > require any relevant knowledge of any platform porter unless/until > it breaks). Well, sure, but false negatives here are not a big deal here. >>> their bits. That's enough to settle whether a 754 format is in use, >>> and, if it is, whether it's big-endian or little-endian. > >> Do you have a pointer to code that does this? > > No. Pemberton's enquire.c contains enough code to do it. Yikes! And much else besides. > Given how few distinct architectures still exist, it's probably > enough to store just double x = 1.5 and stare at it. 
Something along these lines: double x = 1.5; is_big_endian_ieee_double = sizeof(double) == 8 && \ memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); ? [me being obscure] > I think I'm still missing your intent here. If you're asking whether > Python can blindly assume that 754 is in use, I'd say that's > undesirable but defensible if necessary. Yes, that's what I was asking, in a rather obscure way. Cheers, mwh -- Strangely enough I saw just such a beast at the grocery store last night. Starbucks sells Javachip. (It's ice cream, but that shouldn't be an obstacle for the Java marketing people.) -- Jeremy Hylton, 29 Apr 1997 From sdementen at hotmail.com Fri Apr 8 11:32:37 2005 From: sdementen at hotmail.com (Sébastien de Menten) Date: Mon Apr 11 20:05:03 2005 Subject: [Python-Dev] args attribute of Exception objects Message-ID: Hi, When I need to make sense of a python exception, I often need to parse the string exception in order to retrieve the data. Example: try: print foo except NameError, e: print e.args symbol = e.args[0][17:-16] ==> ("NameError: name 'foo' is not defined", ) or try: (4).foo except NameError, e: print e.args ==> ("'int' object has no attribute 'foo'",) Moreover, in the documentation about Exception, I read """Warning: Messages to exceptions are not part of the Python API. Their contents may change from one version of Python to the next without warning and should not be relied on by code which will run under multiple versions of the interpreter. """ So even args could not be relied upon ! Two questions: 1) did I miss something in dealing with exceptions ? 2) Could this be changed to .args more in line with: a) first example: e.args = ('foo', "NameError: name 'foo' is not defined") b) second example: e.args = (4, 'foo', "'int' object has no attribute 'foo'",) the message of the string can even be retrieved with str(e) so it is also redundant.
BTW, the Warning in the doc makes it possible to change this :-) To be backward compatible, the error message could also be the first element of the tuple. Seb ps: There may be problems (that I am not aware of) with an exception keeping references to other objects From tim.peters at gmail.com Mon Apr 11 20:28:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 20:28:24 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2msm1x480i.fsf@starship.python.net> References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> Message-ID: <1f7befae05041111284b61992a@mail.gmail.com> ... [mwh] > OK, so the worst that could happen here is that moving marshal data > from one box to another could turn one sort of NaN into another? Right. Assuming source and destination boxes both use 754 format, and the implementation adjusts endianess if necessary. Heh. I have a vague half-memory of _some_ box that stored the two 4-byte "words" in an IEEE double in one order, but the bytes within each word in the opposite order. It's always something ... > This doesn't seem very bad. Not bad at all: But since most people run with all FPU traps disabled, and Python doesn't expose a way to read the FPU status flags, they couldn't tell the difference. >>>> Store a few small doubles at module initialization time and stare at >>> ./configure time, surely? >> Unsure. Not all Python platforms _have_ "./configure time". > But they all have pyconfig.h. Yes, and then a platform porter has to understand what to #define/#undefine, and why. People doing cross-compilation may have an especially confusing time of it. Module initialization code "just works", so I certainly understand why it doesn't appeal to the Unix frame of mind .
>> Module initialization code is harder to screw up for that reason >> (the code is in an obvious place then, self-contained, and doesn't >> require any relevant knowledge of any platform porter unless/until >> it breaks). > Well, sure, but false negatives here are not a big deal here. Sorry, unsure what "false negative" means here. ... > Something along these lines: > > double x = 1.5; > is_big_endian_ieee_double = sizeof(double) == 8 && \ > memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); Right, it's that easy -- at least under MSVC and gcc. From arigo at tunes.org Mon Apr 11 22:47:42 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Apr 11 22:54:55 2005 Subject: [Python-Dev] New style classes and operator methods In-Reply-To: <425610AE.5070605@canterbury.ac.nz> References: <425610AE.5070605@canterbury.ac.nz> Message-ID: <20050411204742.GA362@vicky.ecs.soton.ac.uk> Hi Greg, On Fri, Apr 08, 2005 at 05:03:42PM +1200, Greg Ewing wrote: > If the left and right operands are of the same class, > and the class implements a right operand method but > not a left operand method, the right operand method > is not called. Instead, two attempts are made to call > the left operand method. This is not a general rule. The rule is that if both elements are of the same class, only the non-reversed method is ever called. The confusing bit is about having it called twice. Funnily enough, this only occurs for some operators (I think only add and mul). The reason is that internally, the C core distinguishes between number adding vs sequence concatenation, and number multiplying vs sequence repetition. So __add__() and __mul__() are called twice: once as a numeric computation and once as a sequence operation... Could be fixed with more strange special cases in abstract.c, but I'm not sure it's worth it.
Armin From martin at v.loewis.de Mon Apr 11 23:35:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Apr 11 23:35:53 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: <425AEDB6.8000009@v.loewis.de> Fredrik Lundh wrote: > I was under the impression that the marshal format has been stable for > quite a long time (people are using it for various RPC protocols, among > other things). I might be wrong. Python 2.4 introduced support for string sharing in marshal files, with an option to suppress sharing if an application needs to suppress it for backwards compatibility. Regards, Martin From mwh at python.net Mon Apr 11 22:08:12 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 23:56:53 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041111284b61992a@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 14:28:20 -0400") References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> Message-ID: <2moecl3wlv.fsf@starship.python.net> I've just submitted http://python.org/sf/1180995 which adds format codes for binary marshalling of floats if version > 1, but it doesn't quite have the effect I expected (see below): >>> inf = 1e308*1e308 >>> nan = inf/inf >>> marshal.dumps(nan, 2) Traceback (most recent call last): File "", line 1, in ? ValueError: unmarshallable object frexp(nan, &e), it turns out, returns nan, which results in this (to be expected if you read _PyFloat_Pack8 and know that I'm using a new-ish GCC -- it might be different for MSVC 6). 
Also (this is the same thing, really): >>> struct.pack('>d', inf) Traceback (most recent call last): File "", line 1, in ? SystemError: frexp() result out of range Although I was a little surprised by this: >>> struct.pack('d', inf) '\x7f\xf0\x00\x00\x00\x00\x00\x00' (this is a big-endian system). Again, reading the source explains the behaviour. Tim Peters writes: > ... > > [mwh] >> OK, so the worst that could happen here is that moving marshal data >> from one box to another could turn one sort of NaN into another? > > Right. Assuming source and destination boxes both use 754 format, and > the implementation adjusts endianess if necessary. Well, I was assuming marshal would do floats little-endian-wise, as it does for integers. > Heh. I have a vague half-memory of _some_ box that stored the two > 4-byte "words" in an IEEE double in one order, but the bytes within > each word in the opposite order. It's always something ... I recall stories of machines that stored the bytes of long in some crazy order like that. I think Python would already be broken on such a system, but, also, don't care. >>>>> Store a few small doubles at module initialization time and stare at > >>>> ./configure time, surely? > >>> Unsure. Not all Python platforms _have_ "./configure time". > >> But they all have pyconfig.h. > > Yes, and then a platform porter has to understand what to > #define/#undefine, and why. People doing cross-compilation may have > an especially confusing time of it. Well, they can always not #define HAVE_IEEE_DOUBLES and not suffer all that much (this is what I meant by false negatives below). > Module initialization code "just works", so I certainly understand > why it doesn't appeal to the Unix frame of mind . It just strikes me as silly to test at runtime something that is so obviously not going to change between invocations. But it's not a big deal either way. > ...
> >> Something along these lines: >> >> double x = 1.5; >> is_big_endian_ieee_double = sizeof(double) == 8 && \ >> memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); > > Right, it's that easy Cool. > -- at least under MSVC and gcc. Huh? Now it's my turn to be confused (for starters, under MSVC ieee doubles really can be assumed...). Cheers, mwh -- You sound surprised. We're talking about a government department here - they have procedures, not intelligence. -- Ben Hutchings, cam.misc From tim.peters at gmail.com Tue Apr 12 01:39:23 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 12 01:39:26 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2moecl3wlv.fsf@starship.python.net> References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> Message-ID: <1f7befae050411163959e7b90c@mail.gmail.com> [Michael Hudson] > I've just submitted http://python.org/sf/1180995 which adds format > codes for binary marshalling of floats if version > 1, but it doesn't > quite have the effect I expected (see below): > >>> inf = 1e308*1e308 > >>> nan = inf/inf > >>> marshal.dumps(nan, 2) > Traceback (most recent call last): > File "", line 1, in ? > ValueError: unmarshallable object I don't understand. Does "binary marshalling" _not_ mean just copying the bytes on a 754 platform? If so, that won't work. I pointed out the relevant comments before: /* The pack routines write 4 or 8 bytes, starting at p. ... * Bug: What this does is undefined if x is a NaN or infinity. * Bug: -0.0 and +0.0 produce the same string. 
*/ PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); > frexp(nan, &e), it turns out, returns nan, This is an undefined case in C89 (all 754 special values are). > which results in this (to be expected if you read _PyFloat_Pack8 and > know that I'm using a new-ish GCC -- it might be different for MSVC 6). > > Also (this is the same thing, really): Right. So is pickling with proto >= 1. Changing the pack/unpack routines to copy bytes instead (when possible) "fixes" all of these things at one stroke, on boxes where it applies. > >>> struct.pack('>d', inf) > Traceback (most recent call last): > File "", line 1, in ? > SystemError: frexp() result out of range > > Although I was a little surprised by this: > > >>> struct.pack('d', inf) > '\x7f\xf0\x00\x00\x00\x00\x00\x00' > > (this is a big-endian system). Again, reading the source explains the > behaviour. >>> OK, so the worst that could happen here is that moving marshal data >>> from one box to another could turn one sort of NaN into another? >> Right. Assuming source and destination boxes both use 754 format, and >> the implementation adjusts endianess if necessary. > Well, I was assuming marshal would do floats little-endian-wise, as it > does for integers. Then on a big-endian 754 system, loads() will have to reverse the bytes in the little-endian marshal bytestring, and dumps() likewise. That's all "if necessary" meant -- sometimes cast + memcpy isn't enough, and regardless of which direction marshal decides to use. >> Heh. I have a vague half-memory of _some_ box that stored the two >> 4-byte "words" in an IEEE double in one order, but the bytes within >> each word in the opposite order. It's always something ... > I recall stories of machines that stored the bytes of long in some > crazy order like that. I think Python would already be broken on such > a system, but, also, don't care. 
Python does very little that depends on internal native byte order, and C hides it in the absence of casting abuse. Copying internal native bytes across boxes is plain ugly -- can't get more brittle than that. In this case it looks like a good tradeoff, though. > ... > Well, they can always not #define HAVE_IEEE_DOUBLES and not suffer all > that much (this is what I meant by false negatives below). > ... > It just strikes as silly to test at runtime sometime that is so > obviously not going to change between invocations. But it's not a big > deal either way. It isn't to me either. It just strikes me as silly to give porters another thing to wonder about and screw up when it's possible to solve it completely with a few measly runtime cycles . >>> Something along these lines: >>> >>> double x = 1.5; >>> is_big_endian_ieee_double = sizeof(double) == 8 && \ >>> memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); >> Right, it's that easy > Cool. >> -- at least under MSVC and gcc. > Huh? Now it's my turn to be confused (for starters, under MSVC ieee > doubles really can be assumed...). So you have no argument with the "at least under MSVC" part . There's nothing to worry about here -- I was just tweaking. 
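[The runtime probe Tim describes, plus the 754 bit-level NaN facts from earlier in the thread, can be sketched in a few lines of (much later) Python; the helper names here are invented for illustration, and a real check would of course live in C at module-initialization time.]

```python
import struct

# Big-endian IEEE-754 bytes of 1.5 -- the same constant as the octal
# "\077\370\000..." string in Michael's C sketch (0x3F 0xF8 0x00 ...).
IEEE_BE_1_5 = b"\x3f\xf8\x00\x00\x00\x00\x00\x00"

def probe_double_format():
    """Pack 1.5 in native byte order and 'stare at the bytes'."""
    native = struct.pack("=d", 1.5)  # native order, standard 8-byte size
    if native == IEEE_BE_1_5:
        return "ieee-big-endian"
    if native == IEEE_BE_1_5[::-1]:
        return "ieee-little-endian"
    return "unknown"  # a non-754 (or mixed-endian) oddball box

def looks_like_nan(d):
    """True iff d's bits have all 11 exponent bits set and a non-zero
    52-bit fraction -- which is everything 754 guarantees about NaNs."""
    bits = struct.unpack(">Q", struct.pack(">d", d))[0]
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return exponent == 0x7FF and fraction != 0
```

On today's common boxes the probe reports one of the two IEEE answers; anything else would fall back to the portable pack/unpack path. (A C transcription of the 1.5 comparison wants `memcmp(...) == 0`, since memcmp returns zero on a match.)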
From mwh at python.net Tue Apr 12 09:39:08 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 12 09:39:10 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050411163959e7b90c@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 19:39:23 -0400") References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> Message-ID: <2m4qece95v.fsf@starship.python.net> My mail is experiencing random delays of up to a few hours at the moment. I wrote this before I saw your comments on my patch. Tim Peters writes: > [Michael Hudson] >> I've just submitted http://python.org/sf/1180995 which adds format >> codes for binary marshalling of floats if version > 1, but it doesn't >> quite have the effect I expected (see below): > >> >>> inf = 1e308*1e308 >> >>> nan = inf/inf >> >>> marshal.dumps(nan, 2) >> Traceback (most recent call last): >> File "", line 1, in ? >> ValueError: unmarshallable object > > I don't understand. Does "binary marshalling" _not_ mean just copying > the bytes on a 754 platform? No, it means using _PyFloat_Pack8/Unpack8, like the patch description says. Making those functions just fiddle bytes when they can I regard as a separate project (watch a patch manager near you, though). > If so, that won't work. I can tell! >>> Right. Assuming source and destination boxes both use 754 format, and >>> the implementation adjusts endianess if necessary. > >> Well, I was assuming marshal would do floats little-endian-wise, as it >> does for integers. > > Then on a big-endian 754 system, loads() will have to reverse the > bytes in the little-endian marshal bytestring, and dumps() likewise. Really? Even I had worked this out...
>>> Heh. I have a vague half-memory of _some_ box that stored the two >>> 4-byte "words" in an IEEE double in one order, but the bytes within >>> each word in the opposite order. It's always something ... > >> I recall stories of machines that stored the bytes of long in some >> crazy order like that. I think Python would already be broken on such >> a system, but, also, don't care. > > Python does very little that depends on internal native byte order, > and C hides it in the absence of casting abuse. This surely does: PyObject * PyLong_FromLongLong(PY_LONG_LONG ival) { PY_LONG_LONG bytes = ival; int one = 1; return _PyLong_FromByteArray( (unsigned char *)&bytes, SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); } It occurs that in the IEEE case, special values can be detected with reliability -- by picking the exponent field out by force -- and a warning emitted or exception raised. Good idea? Hard to say, to me. Cheers, mwh Oh, by the way: http://python.org/sf/1181301 -- It is time-consuming to produce high-quality software. However, that should not alone be a reason to give up the high standards of Python development. -- Martin von Loewis, python-dev From tim.peters at gmail.com Tue Apr 12 17:16:22 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 12 17:16:29 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2m4qece95v.fsf@starship.python.net> References: <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> Message-ID: <1f7befae050412081676898e80@mail.gmail.com> ... [mwh] >>>> I recall stories of machines that stored the bytes of long in some >>>> crazy order like that.
I think Python would already be broken on such >>> a system, but, also, don't care. [Tim] >> Python does very little that depends on internal native byte order, >> and C hides it in the absence of casting abuse. [mwh] > This surely does: > > PyObject * > PyLong_FromLongLong(PY_LONG_LONG ival) > { > PY_LONG_LONG bytes = ival; > int one = 1; > return _PyLong_FromByteArray( > (unsigned char *)&bytes, > SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); > } Yes, that's "casting abuse". Python does very little of that. If it becomes necessary, it's straightforward but long-winded to rewrite the above in wholly portable C (peel the bytes out of ival, least-significant first, via shifting and masking 8 times; "ival & 0xff" is the least-significant byte regardless of memory storage order; etc). BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and more deeply than does the visible cast there. > It occurs that in the IEEE case, special values can be detected with > reliability -- by picking the exponent field out by force Right, that works for NaNs and infinities; signed zeroes are a bit trickier to detect. > -- and a warning emitted or exception raised. Good idea? Hard to say, to me. It's not possible to _create_ a NaN or infinity from finite operands in 754 without signaling some exceptional condition. Once you have one, though, there's generally nothing exceptional about _using_ it. Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. Using a quiet NaN never signals; using a signaling NaN almost always signals. So packing a nan or inf shouldn't complain. On a 754 box, unpacking one shouldn't complain either. Unpacking a nan or inf on a non-754 box probably should complain, since there's in general nothing it can be unpacked _to_ that makes any sense ("errors should never pass silently").
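[Tim's shifting-and-masking recipe looks like this when transcribed into Python -- a hypothetical helper for illustration; the real rewrite would of course be C.]

```python
import struct

def pack_longlong_le(ival, width=8):
    """Portable little-endian serialization of a signed integer:
    peel bytes out least-significant first via shifting and masking,
    independent of the machine's in-memory byte order."""
    if ival < 0:
        ival += 1 << (8 * width)  # two's-complement encoding for negatives
    out = bytearray()
    for _ in range(width):
        out.append(ival & 0xFF)  # the least-significant byte, always
        ival >>= 8
    return bytes(out)

# Agrees with the casting version's result, but without ever
# looking at native memory layout:
assert pack_longlong_le(-2) == struct.pack("<q", -2)
```

Because `ival & 0xff` always yields the least-significant byte, the output is identical on big-endian, little-endian, or crazier boxes.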
From mwh at python.net Tue Apr 12 17:32:17 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 12 23:20:49 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050412081676898e80@mail.gmail.com> (Tim Peters's message of "Tue, 12 Apr 2005 11:16:22 -0400") References: <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> <2m4qece95v.fsf@starship.python.net> <1f7befae050412081676898e80@mail.gmail.com> Message-ID: <2mk6n8c8ou.fsf@starship.python.net> Tim Peters writes: > ... > > [mwh] >>>> I recall stories of machines that stored the bytes of long in some >>>> crazy order like that. I think Python would already be broken on such >>>> a system, but, also, don't care. > > [Tim] >>> Python does very little that depends on internal native byte order, >>> and C hides it in the absence of casting abuse. > > [mwh] >> This surely does: >> >> PyObject * >> PyLong_FromLongLong(PY_LONG_LONG ival) >> { >> PY_LONG_LONG bytes = ival; >> int one = 1; >> return _PyLong_FromByteArray( >> (unsigned char *)&bytes, >> SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); >> } > > Yes, that's "casting abuse'. Python does very little of that. If it > becomes necessary, it's straightforward but long-winded to rewrite the > above in wholly portable C (peel the bytes out of ival, > least-signficant first, via shifting and masking 8 times; "ival & > 0xff" is the least-significant byte regardless of memory storage > order; etc). Not arguing with that. > BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and > more deeply than does the visible cast there. 
I'd like to claim that was part of my point :) There is a certain, small level of assumption in Python that "big-endian or little-endian" is the only question to ask -- and I don't think that's a problem! Even if this isn't a big deal, at least if we choose a more interesting 'probe value' than 1.5, it will just lead to the oddball box degrading to the non-ieee code. >> It occurs that in the IEEE case, special values can be detected with >> reliability -- by picking the exponent field out by force > > Right, that works for NaNs and infinities; signed zeroes are a bit > trickier to detect. Hmm. Don't think they're such a big deal. >> -- and a warning emitted or exception raised. Good idea? Hard to >> say, to me. > > It's not possible to _create_ a NaN or infinity from finite operands > in 754 without signaling some exceptional condition. Once you have > one, though, there's generally nothing exceptional about _using_ it. > Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. > Using a quiet NaN never signals; using a signaling NaN almost always > signals. > > So packing a nan or inf shouldn't complain. On a 754 box, unpacking > one shouldn't complain either. Unpacking a nan or inf on a non-754 > box probably should complain, since there's in general nothing it can > be unpacked _to_ that makes any sense ("errors should never pass > silently"). This sounds like good behaviour to me. I'll try to update the patch soon. Cheers, mwh -- BUGS Never use this function. This function modifies its first argument. The identity of the delimiting character is lost. This function cannot be used on constant strings.
-- the glibc manpage for strtok(3) From python at rcn.com Tue Apr 12 14:03:50 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Apr 13 02:04:39 2005 Subject: [Python-Dev] args attribute of Exception objects In-Reply-To: Message-ID: <000001c53f57$b0fe7d20$c2bd2c81@oemcomputer> [Sébastien de Menten] > 2) Could this be changed to .args more in line with: > a) first example: e.args = ('foo', "NameError: name 'foo' is not > defined") > b) second example: e.args = (4, 'foo', "'int' object has no attribute > 'foo'",) > the message of the string can even be retrieved with str(e) so it is > also > redundant. Something like this ought to be explored at some point. It would certainly improve the exception API to be able to get references to the objects without parsing strings. The balancing forces are backwards compatibility and a need to keep the exception mechanism as lightweight as possible. Please log a feature request on SF. Note that the idea is only for making builtin exceptions more informative. User defined exceptions can already attach arbitrary objects: >>> class Boom(Exception): pass >>> x = 10 >>> if x != 5: raise Boom("Value must be a five", x) Traceback (most recent call last): File "", line 2, in -toplevel- raise Boom("Value must be a five", x) Boom: ('Value must be a five', 10) Raymond Hettinger From prabu333 at hotpop.com Wed Apr 13 08:21:07 2005 From: prabu333 at hotpop.com (Senthil Prabu.S) Date: Wed Apr 13 08:21:24 2005 Subject: [Python-Dev] Python tests fails on HP-UX 11.11 and core dumps Message-ID: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Hello Experts, I tried Python 2.4.1 on an HP-UX 11.11 PA machine. I was able to build Python: gmake passes, but gmake test results in an error. Python reported that test_pty fails when running this test alone. Can anyone help me find out why it core dumps when running the test_subprocess.py test? Also, how can I solve it? Has anyone faced the same problem earlier?
The details are given below; # ../../python test_pty.py Calling master_open() Got master_fd '3', slave_name '/dev/pts/0' Calling slave_open('/dev/pts/0') Got slave_fd '4' Traceback (most recent call last): File "test_pty.py", line 58, in ? test_basic_pty() File "test_pty.py", line 29, in test_basic_pty if not os.isatty(slave_fd): File "test_pty.py", line 50, in handle_sig raise TestFailed, "isatty hung" test.test_support.TestFailed: isatty hung # # ../../python test_subprocess.py test_args_string (__main__.ProcessTestCase) ... ok test_call_kwargs (__main__.ProcessTestCase) ... ok test_call_seq (__main__.ProcessTestCase) ... ok test_call_string (__main__.ProcessTestCase) ... ok test_communicate (__main__.ProcessTestCase) ... ok test_communicate_pipe_buf (__main__.ProcessTestCase) ... ok test_communicate_returns (__main__.ProcessTestCase) ... ok test_cwd (__main__.ProcessTestCase) ... ok test_env (__main__.ProcessTestCase) ... ok test_exceptions (__main__.ProcessTestCase) ... ok test_executable (__main__.ProcessTestCase) ... ok test_invalid_args (__main__.ProcessTestCase) ... ok test_invalid_bufsize (__main__.ProcessTestCase) ... ok test_list2cmdline (__main__.ProcessTestCase) ... ok test_no_leaking (__main__.ProcessTestCase) ... ok test_poll (__main__.ProcessTestCase) ... ok test_preexec (__main__.ProcessTestCase) ... ok test_run_abort (__main__.ProcessTestCase) ... ok test_shell_sequence (__main__.ProcessTestCase) ... ok test_shell_string (__main__.ProcessTestCase) ... ok test_stderr_filedes (__main__.ProcessTestCase) ... ok test_stderr_fileobj (__main__.ProcessTestCase) ... ok test_stderr_none (__main__.ProcessTestCase) ... ok test_stderr_pipe (__main__.ProcessTestCase) ... ok test_stdin_filedes (__main__.ProcessTestCase) ... ok test_stdin_fileobj (__main__.ProcessTestCase) ... ok test_stdin_none (__main__.ProcessTestCase) ... ok test_stdin_pipe (__main__.ProcessTestCase) ... ok test_stdout_filedes (__main__.ProcessTestCase) ... 
ok test_stdout_fileobj (__main__.ProcessTestCase) ... ok this bit of output is from a test of stdout in a different process ... test_stdout_none (__main__.ProcessTestCase) ... ok test_stdout_pipe (__main__.ProcessTestCase) ... ok test_stdout_stderr_file (__main__.ProcessTestCase) ... ok test_stdout_stderr_pipe (__main__.ProcessTestCase) ... ok test_universal_newlines (__main__.ProcessTestCase) ... ok test_universal_newlines_communicate (__main__.ProcessTestCase) ... ok test_wait (__main__.ProcessTestCase) ... ok test_writes_before_communicate (__main__.ProcessTestCase) ... ok ---------------------------------------------------------------------- Ran 38 tests in 8.171s Analysing the core file through GDB; # gdb ../../python core HP gdb 4.5 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00 and target hppa1.1-hp-hpux11.00. Copyright 1986 - 2001 Free Software Foundation, Inc. Hewlett-Packard Wildebeest 4.5 (based on GDB) is covered by the GNU General Public License. Type "show copying" to see the conditions to change it and/or distribute copies. Type "show warranty" for warranty/support. .. Core was generated by `python'. Program terminated with signal 6, Aborted. 
#0 0xc020bad0 in kill+0x10 () from /usr/lib/libc.2 (gdb) bt #0 0xc020bad0 in kill+0x10 () from /usr/lib/libc.2 #1 0xc01a655c in raise+0x24 () from /usr/lib/libc.2 #2 0xc01e69a8 in abort_C+0x160 () from /usr/lib/libc.2 #3 0xc01e6a04 in abort+0x1c () from /usr/lib/libc.2 #4 0xffbe4 in posix_abort (self=0x40029098, noargs=0x0) at ./Modules/posixmodule.c:7158 #5 0xc9b7c in PyEval_EvalFrame (f=0x40028e54) at Python/ceval.c:3531 #6 0xc01a655c in raise+0x24 () from /usr/lib/libc.2 #7 0x475b0 in freechildren (n=0x0) at Parser/node.c:131 (gdb) Build Environment; GCC - gcc version 3.4.3 HP-UX omega B.11.11 U 9000/800 ./configure --prefix=/opt/iexpress/python --disable-ipv6 --with-signal-module --with-threads Earlier, I faced a problem during gmake, and made changes as per the following link; https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1071597&group_id=5470 And I was able to build Python successfully. Also, the overall results of the tests are; 250 tests OK. 1 test failed: test_pty 39 tests skipped: test_aepack test_al test_applesingle test_bsddb test_bsddb185 test_bsddb3 test_bz2 test_cd test_cl test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses test_dl test_gdbm test_gl test_imgfile test_largefile test_linuxaudiodev test_locale test_macfs test_macostools test_nis test_normalization test_ossaudiodev test_pep277 test_plistlib test_scriptpackages test_socket_ssl test_socketserver test_sunaudiodev test_tcl test_timeout test_urllib2net test_urllibnet test_winreg test_winsound 2 skips unexpected on hp-ux11: test_tcl test_bz2 gmake: *** [test] Error 1 -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20050413/9ad9e89f/attachment.htm From python-dev at zesty.ca Wed Apr 13 11:03:43 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Wed Apr 13 11:03:56 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: On Sun, 10 Apr 2005, Eyal Lotem wrote: > It may be really hard to get it right, unless we are overlooking some simple > solution. To "get it right", you at least need to know exactly what your operators mean. I messed up because i failed to realize that '==' can be redefined, and 'in' depends on '==' to work properly. > What about implementing the facet in C? This could avoid the class of > problems you have just mentioned. I don't think that's a good solution. A facet is just one basic programming pattern that you can build in a capability system; it would be silly to have to go back to C every time you wanted to build some other construct. A better way would be to start with capabilities that behave simply and correctly; then you can build whatever you want. -- ?!ng From ncoghlan at iinet.net.au Wed Apr 13 11:23:37 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Apr 13 11:27:31 2005 Subject: [Python-Dev] Unified or context diffs? Message-ID: <425CE519.4090700@iinet.net.au> Are context diffs still favoured for patches? The patch submission guidelines [1] still say that, but is it actually true these days? I personally prefer unified diffs, but have been generating context diffs because of what the guidelines say. Brett can probably guess why I'm asking :) Cheers, Nick. 
[1] http://www.python.org/patches/ -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From phil at riverbankcomputing.co.uk Wed Apr 13 11:41:24 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Wed Apr 13 11:41:03 2005 Subject: [Python-Dev] super_getattro() Behaviour Message-ID: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> In PyQt, wrapped types implement lazy access to the type dictionary through tp_getattro. If the normal attribute lookup fails, then private tables are searched and the attribute (if found) is created on the fly and returned. It is also put into the type dictionary so that it is found next time through the normal lookup. This is done to speed up the import of, and reduce the memory consumed by, the qt module, which contains thousands of class methods. This all works fine - except when super is used. The implementation of super_getattro() doesn't use the normal attribute lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO hierarchy itself and searches instance dictionaries explicitly. This means that attributes that have not yet been referenced (ie. not yet been cached in the type dictionary) will not be found. Questions... 1. What is the reason why it doesn't go via tp_getattro? Bug or feature? 2. A possible workaround is to subvert the ma_lookup function of the type dictionary after creating the type to do something similar to what my tp_getattro function is doing. Are there any inherent problems with that? 3. Why, when creating a new type and eventually calling type_new(), is a copy of the dictionary passed in made? Why not take a reference to it? This would allow a dict sub-class to be used as the type dictionary. I could then implement a lazy-dict sub-class with the behaviour I need. 4. Am I missing a more correct/obvious technique? (There is no need to support classic classes.)
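A pure-Python sketch of the situation described above (hypothetical names; modern class syntax for brevity): a metaclass __getattr__ plays the role of the custom tp_getattro fallback. Ordinary class-attribute access triggers the lazy fallback and caches the result in the type dictionary, but super's own MRO walk searches the type dictionaries directly and misses anything not yet cached.

```python
class LazyMeta(type):
    # Stand-in for the wrapped type's tp_getattro fallback: create a
    # missing class attribute on demand and cache it in the type dict.
    def __getattr__(cls, name):
        if name == "greet":
            def greet(self):
                return "made on demand"
            setattr(cls, name, greet)   # cache it, as the qt wrapper does
            return greet
        raise AttributeError(name)

class Base(metaclass=LazyMeta):
    pass

class Derived(Base):
    def via_super(self):
        return super().greet()

d = Derived()
try:
    d.via_super()               # super() walks the MRO type dicts directly
except AttributeError:
    print("super() missed the lazy attribute")

Base.greet                      # a normal lookup creates and caches it...
print(d.via_super())            # ...and now super() finds it: 'made on demand'
```

The first super() call fails even though ordinary attribute access would have succeeded; once a normal lookup has cached the method in the type dictionary, super() finds it too.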
Many thanks, Phil From prabu333 at hotpop.com Wed Apr 13 13:11:54 2005 From: prabu333 at hotpop.com (Senthil Prabu.S) Date: Wed Apr 13 13:12:10 2005 Subject: [Python-Dev] IPV6 with Python- 4.2.1 on HPUX References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Message-ID: <03ee01c54019$9af8bdc0$1f0110ac@sesco> Hi Experts, I am pretty new to Python. I have been trying to compile Python on an HP-UX 11.23 IPF machine. I tried to build with the following configure options. ./configure --prefix=/opt/iexpress/python --enable-ipv6 --with-signal-module --with-threads machine info : HP-UX beta B.11.23 U ia64 gcc : gcc version 3.4.3 While running configure, I faced the following problem: checking ipv6 stack type... ./configure[13033]: /usr/xpg4/bin/grep: not found. unknown Then I checked the config.log to find the entries for IPv6: configure:12811: checking if --enable-ipv6 is specified configure:12822: result: yes configure:12954: checking ipv6 stack type conftest.c:78:22: features.h: No such file or directory conftest.c:78:48: /usr/local/v6/include/sys/v6config.h: No such file or directory configure:13111: result: unknown But configure did not produce any error message. So please advise whether Python supports the IPv6 option on HP-UX, because I know IPv6 differs between Linux and HP-UX. If I need not worry about this and can simply build Python, how do I check whether the IPv6 option works well with my Python? Could anyone please help with how to test the IPv6 functionality? Is there any specific IPv6 test available with Python? I could not find any specific test suite for IPv6 under the test directory. Please share your comments. Thanks in advance, Senthil Prabu.S -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050413/c4903408/attachment.html From python at rcn.com Wed Apr 13 07:28:18 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Apr 13 19:28:45 2005 Subject: [Python-Dev] Unified or context diffs?
In-Reply-To: <425CE519.4090700@iinet.net.au> Message-ID: <003301c53fe9$9995fa40$3c36c797@oemcomputer> [Nick Coghlan] > Are context diffs still favoured for patches? > > The patch submission guidelines [1] still say that, but is it actually > true > these days? I personally prefer unified diffs, but have been generating > context > diffs because of what the guidelines say. Submit whichever is the most informative. For some changes, it is easier to see the changed lines immediately above and below each other. For others, it helps to be able to see the whole algorithm. Raymond From irmen at xs4all.nl Wed Apr 13 19:38:40 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Wed Apr 13 19:38:44 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <003301c53fe9$9995fa40$3c36c797@oemcomputer> References: <003301c53fe9$9995fa40$3c36c797@oemcomputer> Message-ID: <425D5920.3050402@xs4all.nl> Raymond Hettinger wrote: > [Nick Coghlan] > >>Are context diffs still favoured for patches? >> >>The patch submission guidelines [1] still say that, but is it actually >>true >>these days? I personally prefer unified diffs, but have been > > generating > >>context >>diffs because of what the guidelines say. > > > Submit whichever is the most informative. For some changes, it is > easier to see the changed lines immediately above and below each other. > For others, it helps to be able to see the whole algorithm. And for the 'patch' tool, it doesn't really matter what you use, right? --Irmen From bac at OCF.Berkeley.EDU Wed Apr 13 21:54:08 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 13 21:54:17 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> References: <425CE519.4090700@iinet.net.au> Message-ID: <425D78E0.5070605@ocf.berkeley.edu> Nick Coghlan wrote: > Are context diffs still favoured for patches? > > The patch submission guidelines [1] still say that, but is it actually > true these days? 
I personally prefer unified diffs, but have been > generating context diffs because of what the guidelines say. > I personally like unified diffs a lot more since you can see exactly how a line changed compared to the previous version, but that's me. I just checked the dev FAQ and it consistently says contextual diffs as well. > Brett can probably guess why I'm asking :) > =) > Cheers, > Nick. > > [1] http://www.python.org/patches/ > I didn't even know that page existed! I thought at one point this question came up and the general consensus was that unified diffs were preferred? -Brett From nas at arctrix.com Wed Apr 13 22:09:43 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Wed Apr 13 22:09:48 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425D78E0.5070605@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> Message-ID: <20050413200943.GA24038@mems-exchange.org> On Wed, Apr 13, 2005 at 12:54:08PM -0700, Brett C. wrote: > I thought at one point this question came up and the general > consensus was that unified diffs were preferred? Guido used to prefer context diffs but says he now doesn't mind unified diffs. I think unified diffs are much more common these days so that's probably what most people are used to. As Raymond says, for certain types of changes, context diffs are more readable. Still, I always use unified diffs. Neil From barry at python.org Wed Apr 13 22:49:14 2005 From: barry at python.org (Barry Warsaw) Date: Wed Apr 13 22:49:18 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425D78E0.5070605@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> Message-ID: <1113425354.10345.37.camel@geddy.wooz.org> On Wed, 2005-04-13 at 15:54, Brett C. wrote: > I thought at one point this question came up and the general consensus was that > unified diffs were preferred? 
Back in the day, we preferred context diffs, and I think of the original Python core group, Guido was the last holdout. But IIRC, a few years ago the issue came up again; Guido had changed his mind so we changed syncmail to produce unified diffs. IMO unifieds are preferred when the diffs are for human consumption, but when they're only for machine consumption, anything that the patch program accepts is fine. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050413/c2db5280/attachment.pgp From martin at v.loewis.de Wed Apr 13 23:11:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 13 23:11:31 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> References: <425CE519.4090700@iinet.net.au> Message-ID: <425D8AFF.5090508@v.loewis.de> Nick Coghlan wrote: > Are context diffs still favoured for patches? Just for the record: I also prefer unified over context diffs. Regards, Martin From bac at OCF.Berkeley.EDU Wed Apr 13 23:26:12 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 13 23:26:33 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <1113425354.10345.37.camel@geddy.wooz.org> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> <1113425354.10345.37.camel@geddy.wooz.org> Message-ID: <425D8E74.3030708@ocf.berkeley.edu> Barry Warsaw wrote: > On Wed, 2005-04-13 at 15:54, Brett C. wrote: > > >>I thought at one point this question came up and the general consensus was that >>unified diffs were preferred? > > > Back in the day, we preferred context diffs, and I think of the original > Python core group, Guido was the last holdout. 
But IIRC, a few years > ago the issue came up again; Guido had changed his mind so we changed > syncmail to produce unified diffs. > Eh. Guido doesn't deal with patches anymore, so his opinion doesn't count. =) > IMO unifieds are preferred when the diffs are for human consumption, but > when they're only for machine consumption, anything that the patch > program accepts is fine. > OK, it seems like everyone who cares enough to speak up has said so far that unified diffs are better, so I will change the docs some time between now and when I keel over dead to have people use unified diffs, assuming some rush of people don't suddenly start saying they prefer contextual diffs. -Brett From martin at v.loewis.de Wed Apr 13 23:31:28 2005 From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 13 23:31:31 2005 Subject: [Python-Dev] Python tests fails on HP-UX 11.11 and core dumps In-Reply-To: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Message-ID: <425D8FB0.3030404@v.loewis.de> Senthil Prabu.S wrote: > I tried python -4.2.1 on a HP-UX 11.11 PA machine. I was able to > build Python. Gmake passes, gmake test results in error. Python reported > that test_pty fails, when running this test alone. > > Can anyone help me find why it core dumps when running the > *test_subprocess.py* test. > Also, how can I solve it? Please understand that python-dev is not the place to get free consulting. If you are willing to investigate somewhat further, try to understand the problem, and propose patches, then I would be willing to review the patches, comment on their correctness, and perhaps integrate them into the Python CVS. As it stands, I can personally take no more time to help with HP-UX problems for the near future (say, ten years :-) I do recall that there are serious problems with pseudo-terminals in Python and HP-UX, so yes, we have heard of this before. If I knew a solution, it would already have been applied to Python.
Please understand that this perhaps hostile-sounding response is just my personal view; if somebody else responds more gracefully, just ignore me. Regards, Martin From mwh at python.net Wed Apr 13 12:06:13 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 13 23:32:47 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> (Nick Coghlan's message of "Wed, 13 Apr 2005 19:23:37 +1000") References: <425CE519.4090700@iinet.net.au> Message-ID: <2m3btvc7oq.fsf@starship.python.net> Nick Coghlan writes: > Are context diffs still favoured for patches? If you want me to review it, yes, probably, but see below... > The patch submission guidelines [1] still say that, but is it actually > true these days? I personally prefer unified diffs, but have been > generating context diffs because of what the guidelines say. Emacs 21's diff-mode can convert between the two with a keypress. People who continue to abuse themselves by not using Emacs can probably find other tools to do this job. So *I* don't regard this as a big deal. Plain diffs are of course, right out. Cheers, mwh -- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy From mwh at python.net Wed Apr 13 13:52:32 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 13 23:55:59 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> (Phil Thompson's message of "Wed, 13 Apr 2005 10:41:24 +0100 (BST)") References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> Message-ID: <2mu0mac2rj.fsf@starship.python.net> "Phil Thompson" writes: > In PyQt, wrapped types implement lazy access to the type dictionary > through tp_getattro. If the normal attribute lookup fails, then private > tables are searched and the attribute (if found) is created on the fly and > returned. 
It is also put into the type dictionary so that it is found next > time through the normal lookup. This is done to speed up the import of, > and the memory consumed by, the qt module which contains thousands of > class methods. > > This all works fine - except when super is used. > > The implementation of super_getattro() doesn't use the normal attribute > lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO > hierarchy itself and searches instance dictionaries explicitly. This means > that attributes that have not yet been referenced (ie. not yet been cached > in the type dictionary) will not be found. > > Questions... > > 1. What is the reason why it doesn't go via tp_getattro? Because it wouldn't work if it did? I'm not sure what you're suggesting here. > 2. A possible workaround is to subvert the ma_lookup function of the type > dictionary after creating the type to do something similar to what my > tp_getattro function is doing. Eek! > Are there any inherent problems with that? Well, I think the layout of dictionaries is fiercely private. IIRC, the only reason it's in a public header is to allow some optimizations in ceval.c (though this isn't at all obvious from the headers, so maybe I'm mistaken). > 3. Why, when creating a new type and eventually calling type_new() is a > copy of the dictionary passed in made? I think this is to prevent changes to tp_dict behind the type's back. It's important to keep the dict and the slots in sync. > Why not take a reference to it? This would allow a dict sub-class > to be used as the type dictionary. I could then implement a > lazy-dict sub-class with the behaviour I need. Well, not really, because super_getattro uses PyDict_GetItem, which doesn't respect subclasses... > 4. Am I missing a more correct/obvious technique? (There is no need to > support classic classes.) Hum, I can't think of one, I'm afraid.
There has been some vague talk of having a tp_lookup slot in typeobjects, so PyDict_GetItem(t->tp_dict, x); would become t->tp_lookup(x); (well, ish, it might make more sense to only do that if the dict lookup fails). For now, not being lazy seems your only option :-/ (it's what PyObjC does). Cheers, mwh -- Many of the posts you see on Usenet are actually from moths. You can tell which posters they are by their attraction to the flames. -- Internet Oracularity #1279-06 From anthony at interlink.com.au Thu Apr 14 05:14:19 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 14 05:15:12 2005 Subject: [Python-Dev] IPV6 with Python- 4.2.1 on HPUX In-Reply-To: <03ee01c54019$9af8bdc0$1f0110ac@sesco> References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> <03ee01c54019$9af8bdc0$1f0110ac@sesco> Message-ID: <200504141314.21335.anthony@interlink.com.au> On Wednesday 13 April 2005 21:11, Senthil Prabu.S wrote: > Hi Experts, > I am pretty new to Python. I have been trying to compile python > on HP-UX 11.23 IPF machine. I tried to build with following configure > option. > > ./configure --prefix=/opt/iexpress/python --enable-ipv6 > --with-signal-module --with-threads machine info : HP-UX beta B.11.23 U > ia64 > gcc : gcc version 3.4.3 > While configure, I faced the following pbm, Last time I tried, gcc on HPUX/ia64 was completely unable to build a working version of Python - this was not the fault of Python, but simply that gcc on that platform was utterly broken. Please try with the HP compiler instead, see if that is any better. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Thu Apr 14 05:17:54 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 14 05:18:42 2005 Subject: [Python-Dev] Unified or context diffs? 
In-Reply-To: <425D8E74.3030708@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <1113425354.10345.37.camel@geddy.wooz.org> <425D8E74.3030708@ocf.berkeley.edu> Message-ID: <200504141317.57623.anthony@interlink.com.au> On Thursday 14 April 2005 07:26, Brett C. wrote: > OK, it seems like everyone who cares enough to speak up has said so far > that unified diffs are better I will change the docs some time between now > and when I keel over dead to have people use unified diffs assuming some > rush of people don't suddenly start saying they prefer contextual diffs. Should probably say either context or unified diffs - I'm sure there are vendor-supplied 'diff' programs out there that don't support -u. ed-style patches, of course, are RIGHT OUT. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From bob at redivi.com Thu Apr 14 05:35:49 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Apr 14 05:35:56 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <200504141317.57623.anthony@interlink.com.au> References: <425CE519.4090700@iinet.net.au> <1113425354.10345.37.camel@geddy.wooz.org> <425D8E74.3030708@ocf.berkeley.edu> <200504141317.57623.anthony@interlink.com.au> Message-ID: <42a12315867d56b187a49306f578ac52@redivi.com> On Apr 13, 2005, at 11:17 PM, Anthony Baxter wrote: > On Thursday 14 April 2005 07:26, Brett C. wrote: >> OK, it seems like everyone who cares enough to speak up has said so >> far >> that unified diffs are better I will change the docs some time >> between now >> and when I keel over dead to have people use unified diffs assuming >> some >> rush of people don't suddenly start saying they prefer contextual >> diffs.
It might be worth mentioning that if/when subversion is used to replace CVS, unified diffs are going to be the obvious way to do it, because I don't think that subversion supports context diffs without using an external diff command. -bob From phil at riverbankcomputing.co.uk Thu Apr 14 10:24:55 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Thu Apr 14 10:25:11 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <2mu0mac2rj.fsf@starship.python.net> References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> Message-ID: <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> > "Phil Thompson" writes: > >> In PyQt, wrapped types implement lazy access to the type dictionary >> through tp_getattro. If the normal attribute lookup fails, then private >> tables are searched and the attribute (if found) is created on the fly >> and >> returned. It is also put into the type dictionary so that it is found >> next >> time through the normal lookup. This is done to speed up the import of, >> and the memory consumed by, the qt module which contains thousands of >> class methods. >> >> This all works fine - except when super is used. >> >> The implementation of super_getattro() doesn't use the normal attribute >> lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO >> hierarchy itself and searches instance dictionaries explicitly. This >> means >> that attributes that have not yet been referenced (ie. not yet been >> cached >> in the type dictionary) will not be found. >> >> Questions... >> >> 1. What is the reason why it doesn't go via tp_getattro? > > Because it wouldn't work if it did? I'm not sure what you're > suggesting here. I'm asking for an explanation for the current implementation. Why wouldn't it work if it got the attribute via tp_getattro? >> 2. 
A possible workaround is to subvert the ma_lookup function of the >> type >> dictionary after creating the type to do something similar to what my >> tp_getattro function is doing. > > Eek! Agreed. >> Are there any inherent problems with that? > > Well, I think the layout of dictionaries is fiercely private. IIRC, > the only reason it's in a public header is to allow some optimzations > in ceval.c (though this isn't at all obvious from the headers, so > maybe I'm mistaken). Yes, having looked in more detail at the dict implementation I really don't want to go there. >> 3. Why, when creating a new type and eventually calling type_new() is a >> copy of the dictionary passed in made? > > I think this is to prevent changes to tp_dict behind the type's back. > It's important to keep the dict and the slots in sync. > >> Why not take a reference to it? This would allow a dict sub-class >> to be used as the type dictionary. I could then implement a >> lazy-dict sub-class with the behaviour I need. > > Well, not really, because super_getattro uses PyDict_GetItem, which > doesn't respect subclasses... I suppose I was hoping for more C++ like behaviour. >> 4. Am I missing a more correct/obvious technique? (There is no need to >> support classic classes.) > > Hum, I can't think of one, I'm afraid. > > There has been some vague talk of having a tp_lookup slot in > typeobjects, so > > PyDict_GetItem(t->tp_dict, x); > > would become > > t->tp_lookup(x); > > (well, ish, it might make more sense to only do that if the dict > lookup fails). That would be perfect. I can't Google any reference to a discussion - can you point me at something? > For now, not being lazy seems your only option :-/ (it's what PyObjC > does). Not practical I'm afraid. I think I can only document that super doesn't work in this context. 
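The dict-copying behaviour discussed above is easy to observe from pure Python (a sketch with hypothetical names): type() copies the namespace into a plain dict when the type is created, so a lazy dict subclass loses its magic, just as question 3 anticipates.

```python
class LazyDict(dict):
    # A dict subclass that fabricates missing values on demand --
    # the kind of lazy namespace question 3 asks about.
    def __missing__(self, key):
        value = "made for %s" % key
        self[key] = value
        return value

ns = LazyDict(x=1)
T = type("T", (object,), ns)    # type_new() copies ns into a plain dict

print(ns["anything"])           # the subclass itself works: 'made for anything'
print(T.x)                      # copied entries survive: 1
try:
    T.anything                  # class lookup hits the plain-dict copy...
except AttributeError:
    print("lazy hook lost")     # ...so __missing__ never fires
```

This is the same effect at the Python level as super_getattro's use of PyDict_GetItem: lookups on the type dictionary never consult subclass hooks.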
Thanks, Phil From mwh at python.net Thu Apr 14 10:56:43 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 14 10:56:46 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> (Phil Thompson's message of "Thu, 14 Apr 2005 09:24:55 +0100 (BST)") References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> Message-ID: <2mwtr5ag8k.fsf@starship.python.net> "Phil Thompson" writes: >>> Questions... >>> >>> 1. What is the reason why it doesn't go via tp_getattro? >> >> Because it wouldn't work if it did? I'm not sure what you're >> suggesting here. > > I'm asking for an explanation for the current implementation. Why wouldn't > it work if it got the attribute via tp_getattro? Well, using type->tp_getattro is just different to looking in tp_dict -- it finds metamethods, for example. Hmm. Well, I'm fairly sure there is a difference, I'm not sure I can explain it right now :( >>> 2. A possible workaround is to subvert the ma_lookup function of the >>> type >>> dictionary after creating the type to do something similar to what my >>> tp_getattro function is doing. [...] > Yes, having looked in more detail at the dict implementation I really > don't want to go there. Good :) >>> 4. Am I missing a more correct/obvious technique? (There is no need to >>> support classic classes.) >> >> Hum, I can't think of one, I'm afraid. >> >> There has been some vague talk of having a tp_lookup slot in >> typeobjects, so >> >> PyDict_GetItem(t->tp_dict, x); >> >> would become >> >> t->tp_lookup(x); >> >> (well, ish, it might make more sense to only do that if the dict >> lookup fails). > > That would be perfect. I can't Google any reference to a discussion - can > you point me at something? 
Well, most of the discussion so far has been in my head :) There was a little talk of it in the thread "can we stop pretending _PyType_Lookup is internal" here and possibly on pyobjc-dev around the same time. I'm not that likely to work on it soon -- I have enough moderately complex patches to core Python I'm persuading people to think about :-/. >> For now, not being lazy seems your only option :-/ (it's what PyObjC >> does). > > Not practical I'm afraid. I think I can only document that super doesn't > work in this context. Oh well. I can't even think of a way to make it fail reliably... Cheers, mwh -- Java sucks. [...] Java on TV set top boxes will suck so hard it might well inhale people from off their sofa until their heads get wedged in the card slots. --- Jon Rabone, ucam.chat From phil at riverbankcomputing.co.uk Thu Apr 14 11:15:34 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Thu Apr 14 11:15:11 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <2mwtr5ag8k.fsf@starship.python.net> References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> <2mwtr5ag8k.fsf@starship.python.net> Message-ID: <27767.82.68.80.137.1113470134.squirrel@river-bank.demon.co.uk> >>>> 4. Am I missing a more correct/obvious technique? (There is no need to >>>> support classic classes.) >>> >>> Hum, I can't think of one, I'm afraid. >>> >>> There has been some vague talk of having a tp_lookup slot in >>> typeobjects, so >>> >>> PyDict_GetItem(t->tp_dict, x); >>> >>> would become >>> >>> t->tp_lookup(x); >>> >>> (well, ish, it might make more sense to only do that if the dict >>> lookup fails). >> >> That would be perfect. I can't Google any reference to a discussion - >> can >> you point me at something? 
> > Well, most of the discussion so far has been in my head :) > > There was a little talk of it in the thread "can we stop pretending > _PyType_Lookup is internal" here and possibly on pyobjc-dev around the > same time. > > I'm not that likely to work on it soon -- I have enough moderately > complex patches to core Python I'm persuading people to think about > :-/. Anything I can do to help push it along? Phil From fredrik at pythonware.com Thu Apr 14 12:41:34 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Apr 14 12:43:25 2005 Subject: [Python-Dev] Re: Unified or context diffs? References: <425CE519.4090700@iinet.net.au><1113425354.10345.37.camel@geddy.wooz.org><425D8E74.3030708@ocf.berkeley.edu><200504141317.57623.anthony@interlink.com.au> <42a12315867d56b187a49306f578ac52@redivi.com> Message-ID: Bob Ippolito wrote: > It might be worth mentioning that if/when subversion is used to replace CVS, unified diffs are > going to be the obvious way to do it, because I don't think that subversion supports context diffs > without using an external diff command. subversion? you meant bazaar-ng, right? From python-dev at zesty.ca Thu Apr 14 13:27:20 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Thu Apr 14 13:27:24 2005 Subject: [Python-Dev] Check out a new way to read threaded conversations. Message-ID: I hope you will not mind too much if I ask a small favor. Sorry for this off-topic post. I am working on a new design for displaying online conversations. (Some of you saw this at PyCon.) I'm conducting a short survey to gather some opinions on the current design. If you have just a few minutes to spare, would you please visit: http://zesty.ca/threadmap/pydev.cgi You'll see a new way of looking at this discussion list that you may find pretty interesting. I look forward to learning what you think of it. I am very grateful for your time and assistance. 
(If you reply to this message, please reply to me only -- I don't want to clutter up python-dev with lots of off-topic messages.)

-- Ping

From drobinow at gmail.com Thu Apr 14 15:08:30 2005
From: drobinow at gmail.com (David Robinow)
Date: Thu Apr 14 15:08:45 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <1f7befae05041111284b61992a@mail.gmail.com>
References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com>
Message-ID: <4eb0089f0504140608493fe329@mail.gmail.com>

On 4/11/05, Tim Peters wrote:
> Heh. I have a vague half-memory of _some_ box that stored the two
> 4-byte "words" in an IEEE double in one order, but the bytes within
> each word in the opposite order. It's always something ...
I believe this was the Floating Instruction Set on the PDP 11/35. The fact that it's still remembered 30 years later shows how unusual it was.

From ldlandis at gmail.com Thu Apr 14 15:23:57 2005
From: ldlandis at gmail.com (LD "Gus" Landis)
Date: Thu Apr 14 15:30:55 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <16983.51916.407019.489590@montanaro.dyndns.org>
References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org>
Message-ID: 

Hi,

For AIX:

Python 2.2 (#1, Feb 17 2003, 21:43:03) [C] on aix4
Type "help", "copyright", "credits" or "license" for more information.
>>> import marshal
>>> marshal.dumps(1e10000)
'f\x03INF'
>>> marshal.loads(marshal.dumps(1e10000))
INF
>>> float("INF")
INF
>>> float("NaN")
NaNQ
>>>

On 4/9/05, Skip Montanaro wrote:
> > Martin> Yet, this *still* is a platform dependence.
Python makes no > Martin> guarantee that 1e1000 is a supported float literal on any > Martin> platform, and indeed, on your platform, 1e1000 is not supported > Martin> on your platform. > > Are float("inf") and float("nan") supported everywhere? > -- LD Landis - N0YRQ - from the St Paul side of Minneapolis From nbastin at opnet.com Thu Apr 14 19:57:43 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 14 20:18:28 2005 Subject: [Python-Dev] PyCallable_Check redeclaration Message-ID: <43ed01d7a659a9a79793ddfcff0e957e@opnet.com> Why is PyCallable_Check declared in both object.h and abstract.h? It appears that it's been this way for quite some time (exists in both 2.3.4 and 2.4.1). -- Nick From kbk at shore.net Thu Apr 14 20:43:28 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu Apr 14 20:43:56 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504141843.j3EIhSNw007456@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 314 open ( +6) / 2824 closed ( +5) / 3138 total (+11) Bugs : 898 open (+16) / 4921 closed ( +8) / 5819 total (+24) RFE : 177 open ( +1) / 151 closed ( +0) / 328 total ( +1) New / Reopened Patches ______________________ typos in rpc.py (2005-04-09) CLOSED http://python.org/sf/1179503 opened by engelbert gruber [AST] Fix for core in test_grammar.py (2005-04-08) http://python.org/sf/1179513 opened by logistix no html file for modulefinder (2005-04-10) http://python.org/sf/1180012 opened by George Yoshida fix typos in Library Reference (2005-04-10) http://python.org/sf/1180062 opened by George Yoshida great improvement for locale.py formatting functions (2005-04-10) http://python.org/sf/1180296 opened by Georg Brandl clarify behavior of StringIO objects when preinitialized (2005-04-10) CLOSED http://python.org/sf/1180305 opened by Georg Brandl st_gen and st_birthtime support for FreeBSD (2005-04-11) http://python.org/sf/1180695 opened by Antti Louko binary formats for marshalling floats 
(2005-04-11) http://python.org/sf/1180995 opened by Michael Hudson make float packing copy bytes when they can (2005-04-12) http://python.org/sf/1181301 opened by Michael Hudson range() in for loops, again (2005-04-12) http://python.org/sf/1181334 opened by Armin Rigo HMAC hexdigest and general review (2005-04-13) http://python.org/sf/1182394 opened by Shane Holloway Patches Closed ______________ Complex commented (2005-04-06) http://python.org/sf/1177597 closed by loewis typos in rpc.py (2005-04-08) http://python.org/sf/1179503 closed by rhettinger clarify behavior of StringIO objects when preinitialized (2005-04-10) http://python.org/sf/1180305 closed by rhettinger Improved output for unittest failUnlessEqual (2003-04-22) http://python.org/sf/725569 closed by purcell [AST] Generator expressions (2005-03-21) http://python.org/sf/1167628 closed by bcannon New / Reopened Bugs ___________________ 256 should read 255 in operator module docs (2005-04-06) CLOSED http://python.org/sf/1178255 opened by Dan Everhart operator.isMappingType and isSequenceType on instances (2005-04-06) CLOSED http://python.org/sf/1178269 opened by Dan Everhart Erroneous line number error in Py2.4.1 (2005-04-07) http://python.org/sf/1178484 opened by Timo Linna configure: refuses setgroups (2005-04-07) http://python.org/sf/1178510 opened by zosh 2.4.1 breaks pyTTS (2005-04-07) http://python.org/sf/1178624 opened by Dieter Deyke Variable.__init__ uses self.set(), blocking specialization (2005-04-07) http://python.org/sf/1178863 opened by Emil Variable.__init__ uses self.set(), blocking specialization (2005-04-07) http://python.org/sf/1178872 opened by Emil IDLE bug - changing shortcuts (2005-04-08) http://python.org/sf/1179168 opened by Przemysław Gocyła can't import thru cygwin symlink (2005-04-08) http://python.org/sf/1179412 opened by steveward Missing def'n of equality for set elements (2005-04-09) CLOSED http://python.org/sf/1179957 opened by Skip Montanaro codecs.readline sometimes 
removes newline chars (2005-04-02) http://python.org/sf/1175396 reopened by doerwalter locale.format question (2005-04-10) CLOSED http://python.org/sf/1180002 opened by Andrew Ma test_posix fails on cygwin (2005-04-10) http://python.org/sf/1180147 opened by Henrik Wist subprocess.Popen fails with closed stdout (2005-04-10) http://python.org/sf/1180160 opened by neuhauser broken pyc files (2005-04-10) http://python.org/sf/1180193 opened by Armin Rigo Python keeps file references after calling close methode (2005-04-10) http://python.org/sf/1180237 opened by Eelco expanding platform module and making it work as it should (2005-04-10) http://python.org/sf/1180267 opened by Nikos Kouremenos StringIO's docs should mention overwriting of initial value (2005-04-10) CLOSED http://python.org/sf/1180392 opened by Leif K-Brooks BaseHTTPServer uses deprecated mimetools.Message (2005-04-11) http://python.org/sf/1180470 opened by Paul Jimenez lax error-checking in new-in-2.4 marshal stuff (2005-04-11) http://python.org/sf/1180997 opened by Michael Hudson Bad sys.executable value for bdist_wininst install script (2005-04-12) http://python.org/sf/1181619 opened by follower asyncore.loop() documentation (2005-04-13) http://python.org/sf/1181939 opened by Graham re.escape(s) prints wrong for chr(0) (2005-04-13) http://python.org/sf/1182603 opened by Nick Jacobson dir() does not include _ (2005-04-13) http://python.org/sf/1182614 opened by Nick Jacobson ZipFile __del__/close problem with longint/long files (2005-04-14) http://python.org/sf/1182788 opened by Robert Kiendl Bugs Closed ___________ 256 should read 255 in operator module docs (2005-04-06) http://python.org/sf/1178255 closed by rhettinger operator.isMappingType and isSequenceType on instances (2005-04-06) http://python.org/sf/1178269 closed by rhettinger GNU readline 4.2 prompt issue (2002-12-30) http://python.org/sf/660083 closed by mwh non-ascii readline input crashes python (2004-08-14) http://python.org/sf/1009263 
closed by mwh readline+no threads (2003-09-24) http://python.org/sf/811844 closed by mwh compiler module didn't get updated for "class foo():pass" (2005-04-03) http://python.org/sf/1176012 closed by bcannon Missing def'n of equality for set elements (2005-04-09) http://python.org/sf/1179957 closed by rhettinger locale.format question (2005-04-10) http://python.org/sf/1180002 closed by andrewma StringIO's docs should mention overwriting of initial value/ (2005-04-10) http://python.org/sf/1180392 closed by rhettinger New / Reopened RFE __________________ making builtin exceptions more informative (2005-04-13) http://python.org/sf/1182143 opened by Sebastien de Menten From irmen at xs4all.nl Fri Apr 15 03:17:53 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Fri Apr 15 03:17:57 2005 Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py Message-ID: <425F1641.9080708@xs4all.nl> Hello, A modification was made in setup.py, cvs rel 1.213 (see diff here: http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/setup.py?r1=1.212&r2=1.213 ) which appears to be wrong. At least, on my system, the spwd module is never built anymore, because the if statement is never true. Actually, the sysconfig doesn't contain *any* of the HAVE_XXXX vars that occur in pyconfig.h (I checked by printing all vars). I don't really understand the distutils magic that is done in setup.py, but it appears to me that either the if statement is wrong (because the vars never exist) or distutils does something wrong by leaving out all HAVE_XXX vars from pyconfig.h. Please advise? I want my spwd module back ;-) --Irmen de Jong PS I checked that pyconfig.h correctly #defines both HAVE_GETSPNAM and HAVE_GETSPENT to 1 on my system (Mandrake linux 10.1), so the rest of the configure script runs fine (it should, I created the original patches for it... 
see SF patch # 579435)

From janssen at parc.com Fri Apr 15 04:28:20 2005
From: janssen at parc.com (Bill Janssen)
Date: Fri Apr 15 04:47:06 2005
Subject: [Python-Dev] Check out a new way to read threaded conversations.
In-Reply-To: Your message of "Thu, 14 Apr 2005 04:27:20 PDT."
Message-ID: <05Apr14.192823pdt."58617"@synergy1.parc.xerox.com>

> http://zesty.ca/threadmap/pydev.cgi

Very reminiscent of Paula Newman's work at PARC several years ago. Check out http://www2.parc.com/istl/groups/hdi/papers/psn_emailvis01.pdf, particularly page 5.

Bill

From barry at python.org Fri Apr 15 05:46:04 2005
From: barry at python.org (Barry Warsaw)
Date: Fri Apr 15 05:46:08 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
Message-ID: <1113536764.23564.310.camel@geddy.wooz.org>

I've noticed an apparent inconsistency in the exception thrown for read-only properties for C extension types vs. Python new-style classes. I'm wondering if this is intentional, a bug, a bug worth fixing, or whether I'm just missing something.

class other(object):
    def __init__(self, value):
        self._value = value
    def _get_value(self):
        return self._value
    value = property(_get_value)

With this class, if you attempt "other(1).value = 7" you will get an AttributeError. However, if you define something similar in C using a tp_getset, where the structure has NULL for the setter, you will get a TypeError (code available upon request).

At best, this is inconsistent. What's the "right" exception to raise? I think the documentation I've seen (e.g. Raymond's How To for Descriptors) describes AttributeError as the thing to raise when trying to set read-only properties.

Thoughts? Should this be fixed (in 2.4?).

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050414/433916e8/attachment-0001.pgp

From martin at v.loewis.de Fri Apr 15 06:59:08 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri Apr 15 06:59:10 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425F1641.9080708@xs4all.nl>
References: <425F1641.9080708@xs4all.nl>
Message-ID: <425F4A1C.9080505@v.loewis.de>

Irmen de Jong wrote:
> Please advise?

setup.py should refer to config_h_vars, which in turn should be set earlier.

Regards,
Martin

From irmen at xs4all.nl Fri Apr 15 19:06:00 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Fri Apr 15 19:06:03 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425F4A1C.9080505@v.loewis.de>
References: <425F1641.9080708@xs4all.nl> <425F4A1C.9080505@v.loewis.de>
Message-ID: <425FF478.8070607@xs4all.nl>

Martin v. Löwis wrote:
> Irmen de Jong wrote:
>
>> Please advise?
>
> setup.py should refer to config_h_vars, which in turn should be set
> earlier.
>
> Regards,
> Martin

Ah so the setup.py script is flawed.
However, the sysconfig object doesn't contain a config_h_vars...
So I guess distutils must be patched too?

--Irmen

From gvanrossum at gmail.com Fri Apr 15 22:05:32 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 15 22:05:38 2005
Subject: [Python-Dev] PyCon 2005 keynote on-line
Message-ID: 

http://python.org/doc/essays/ppt/ -- scroll to the end.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Apr 15 22:33:21 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 15 22:33:27 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425FF478.8070607@xs4all.nl>
References: <425F1641.9080708@xs4all.nl> <425F4A1C.9080505@v.loewis.de> <425FF478.8070607@xs4all.nl>
Message-ID: <42602511.2010200@ocf.berkeley.edu>

Irmen de Jong wrote:
> Martin v. Löwis wrote:
>
>> Irmen de Jong wrote:
>>
>>> Please advise?
>>
>> setup.py should refer to config_h_vars, which in turn should be set
>> earlier.
>>
>> Regards,
>> Martin
>
> Ah so the setup.py script is flawed.
> However, the sysconfig object doesn't contain a config_h_vars...
> So I guess distutils must be patched too?

While it probably should be included in distutils.sysconfig, config_h_vars was created later on in setup.py by some code dealing with whether to compile expat. I just moved that up to the top of the function so that it can be used sooner. Fixed in rev. 1.217.

Sorry about the bad check-in that broke the building of it in the first place. =)

-Brett

From Jack.Jansen at cwi.nl Sat Apr 16 00:19:02 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sat Apr 16 00:19:05 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <4eb0089f0504140608493fe329@mail.gmail.com>
References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <4eb0089f0504140608493fe329@mail.gmail.com>
Message-ID: 

On 14-apr-05, at 15:08, David Robinow wrote:
> On 4/11/05, Tim Peters wrote:
>
>> Heh. I have a vague half-memory of _some_ box that stored the two
>> 4-byte "words" in an IEEE double in one order, but the bytes within
>> each word in the opposite order. It's always something ...
> I believe this was the Floating Instruction Set on the PDP 11/35.
> The fact that it's still remembered 30 years later shows how unusual
> it was.

I think it was actually "logical", because all PDP-11s (there were 2 or 3 FPU instruction sets/architectures in the family IIRC) stored 32 bit integers in middle-endian (high-order word first, but low-order byte first). But note that none of the PDP-11 FPUs were IEEE, that was a much later invention. At least, I didn't come across it until much later:-)

-- 
Jack Jansen, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

From barry at python.org Sun Apr 17 01:24:27 2005
From: barry at python.org (Barry Warsaw)
Date: Sun Apr 17 01:24:31 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113536764.23564.310.camel@geddy.wooz.org>
References: <1113536764.23564.310.camel@geddy.wooz.org>
Message-ID: <1113693867.32074.80.camel@presto.wooz.org>

On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote:
> I've noticed an apparent inconsistency in the exception thrown for
> read-only properties for C extension types vs. Python new-style
> classes.

I haven't seen any follow ups on this, so I've gone ahead and posted a patch, assigning it to Raymond:

http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470

I would have attached a patch to that issue but SF is being finicky.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050416/db516c1d/attachment.pgp

From jack at performancedrivers.com Sun Apr 17 17:53:31 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Sun Apr 17 17:53:36 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113693867.32074.80.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> Message-ID: <20050417155331.GC25115@performancedrivers.com> On Sat, Apr 16, 2005 at 07:24:27PM -0400, Barry Warsaw wrote: > On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote: > > I've noticed an apparent inconsistency in the exception thrown for > > read-only properties for C extension types vs. Python new-style > > classes. > > I haven't seen any follow ups on this, so I've gone ahead and posted a > patch, assigning it to Raymond: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470 > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits from both TypeError and AttributeError? If anyone currently does catch the error raising only AttributeError will break their code. 2.5 should just raise an AttributeError, of course. If that's acceptable I'll gladly submit a similar patch for mmap.get_byte() PyErr_SetString (PyExc_ValueError, "read byte out of range"); has always irked me (the same thing with mmap[i] is an IndexError). I hadn't thought of a clean way to fix it, but MI on the error might work. -jackdied From barry at python.org Sun Apr 17 17:57:11 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 17:57:14 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417155331.GC25115@performancedrivers.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <1113753431.32079.269.camel@presto.wooz.org> On Sun, 2005-04-17 at 11:53, Jack Diederich wrote: > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits > from both TypeError and AttributeError? If anyone currently does catch the > error raising only AttributeError will break their code. 
2.5 should just > raise an AttributeError, of course. Without introducing a new exception class (which I think is out of the question for anything but 2.5), the only common base is StandardError, which seems too general for this exception. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/9152f245/attachment.pgp From jack at performancedrivers.com Sun Apr 17 18:07:20 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Sun Apr 17 18:07:24 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417155331.GC25115@performancedrivers.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <20050417160720.GD25115@performancedrivers.com> On Sun, Apr 17, 2005 at 11:53:31AM -0400, Jack Diederich wrote: > On Sat, Apr 16, 2005 at 07:24:27PM -0400, Barry Warsaw wrote: > > On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote: > > > I've noticed an apparent inconsistency in the exception thrown for > > > read-only properties for C extension types vs. Python new-style > > > classes. > > > > I haven't seen any follow ups on this, so I've gone ahead and posted a > > patch, assigning it to Raymond: > > > > http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470 > > > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits > from both TypeError and AttributeError? If anyone currently does catch the > error raising only AttributeError will break their code. 2.5 should just > raise an AttributeError, of course. 
>
> If that's acceptable I'll gladly submit a similar patch for mmap.get_byte()
> PyErr_SetString (PyExc_ValueError, "read byte out of range");
> has always irked me (the same thing with mmap[i] is an IndexError).
> I hadn't thought of a clean way to fix it, but MI on the error might work.
>

I just did a quick grep for raised ValueErrors with "range" in the explanation string and didn't find any general consensus. I dunno what that means, if anything.

wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep ValueError | grep range | wc -l
13
wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep IndexError | grep range | wc -l
31

(long versions below)

-jackdied

wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep -n IndexError | grep range
./Modules/arraymodule.c:599: PyErr_SetString(PyExc_IndexError, "array index out of range");
./Modules/arraymodule.c:997: PyErr_SetString(PyExc_IndexError, "pop index out of range");
./Modules/mmapmodule.c:639: PyErr_SetString(PyExc_IndexError, "mmap index out of range");
./Modules/mmapmodule.c:727: PyErr_SetString(PyExc_IndexError, "mmap index out of range");
./Modules/_heapqmodule.c:19: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:58: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:136: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:173: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:310: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:349: PyErr_SetString(PyExc_IndexError, "index out of range");
./Objects/bufferobject.c:403: PyErr_SetString(PyExc_IndexError, "buffer index out of range");
./Objects/listobject.c:876: PyErr_SetString(PyExc_IndexError, "pop index out of range");
./Objects/rangeobject.c:94: PyErr_SetString(PyExc_IndexError,
./Objects/stringobject.c:1055: PyErr_SetString(PyExc_IndexError, "string index out of range");
./Objects/structseq.c:62: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/tupleobject.c:104: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/tupleobject.c:310: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/unicodeobject.c:5164: PyErr_SetString(PyExc_IndexError, "string index out of range");
./Python/exceptions.c:1504:PyDoc_STRVAR(IndexError__doc__, "Sequence index out of range.");
./RISCOS/Modules/drawfmodule.c:534: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/drawfmodule.c:555: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/drawfmodule.c:578: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/swimodule.c:113: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:124: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:136: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:150: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:164: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:225: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:237: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:248: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:264: { PyErr_SetString(PyExc_IndexError,"block index out of range");
wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep -n ValueError | grep range
./Modules/mmapmodule.c:181: PyErr_SetString (PyExc_ValueError, "read byte out of range");
./Modules/mmapmodule.c:301: PyErr_SetString (PyExc_ValueError, "data out of range");
./Modules/mmapmodule.c:524: PyErr_SetString (PyExc_ValueError, "seek out of range");
./Modules/timemodule.c:405: PyErr_SetString(PyExc_ValueError, "month out of range");
./Modules/timemodule.c:409: PyErr_SetString(PyExc_ValueError, "day of month out of range");
./Modules/timemodule.c:413: PyErr_SetString(PyExc_ValueError, "hour out of range");
./Modules/timemodule.c:417: PyErr_SetString(PyExc_ValueError, "minute out of range");
./Modules/timemodule.c:421: PyErr_SetString(PyExc_ValueError, "seconds out of range");
./Modules/timemodule.c:427: PyErr_SetString(PyExc_ValueError, "day of week out of range");
./Modules/timemodule.c:431: PyErr_SetString(PyExc_ValueError, "day of year out of range");
./Objects/rangeobject.c:61: PyErr_SetString(PyExc_ValueError, "xrange() arg 3 must not be zero");
./Objects/rangeobject.c:106: PyErr_SetString(PyExc_ValueError,
./RISCOS/Modules/drawfmodule.c:450: { PyErr_SetString(PyExc_ValueError,"Object out of range");

From aahz at pythoncraft.com Sun Apr 17 18:25:09 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sun Apr 17 18:25:13 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113753431.32079.269.camel@presto.wooz.org>
References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113753431.32079.269.camel@presto.wooz.org>
Message-ID: <20050417162509.GA3300@panix.com>

On Sun, Apr 17, 2005, Barry Warsaw wrote:
> On Sun, 2005-04-17 at 11:53, Jack Diederich wrote:
>>
>> In 2.4 & 2.3 does it make sense to raise an exception that multiply
>> inherits from both TypeError and AttributeError? If anyone currently
>> does catch the error raising only AttributeError will break their
>> code. 2.5 should just raise an AttributeError, of course.
>
> Without introducing a new exception class (which I think is out of the
> question for anything but 2.5), the only common base is StandardError,
> which seems too general for this exception.
Why is changing an exception more acceptable than creating a new one? (I don't have a strong opinion either way, but I'd like some reasoning; Jack's approach at least doesn't break code.) Especially if the new exception isn't "public" (in the builtins with other exceptions).

-- 
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR

From gvanrossum at gmail.com Sun Apr 17 20:36:21 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Apr 17 20:36:26 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <20050417155331.GC25115@performancedrivers.com>
References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com>
Message-ID: 

> In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits
> from both TypeError and AttributeError? If anyone currently does catch the
> error raising only AttributeError will break their code. 2.5 should just
> raise an AttributeError, of course.

I think that sets a bad precedent. I understand you want to do this for backwards compatibility, but it's a real ugly thing in the exception inheritance tree and once it's in it's hard to get rid of. It's also introducing a new feature so it's a no-no to do this for 2.3 or 2.4 anyway.

I wonder if long-term, AttributeError shouldn't inherit from TypeError? AttributeError really feels to me like a particular case of the stuff that typically raises TypeError. Unfortunately this is *also* a b/w compatibility problem, since people currently might have code like this:

try:
    ...
except TypeError:
    ...
except AttributeError:
    ...

and the AttributeError branch would become unreachable.
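[Editor's note: the "unreachable branch" point is easy to demonstrate. The sketch below uses an invented HypotheticalAttributeError class to stand in for an AttributeError that inherits from TypeError; this is not Python's real exception hierarchy.]

```python
# Hypothetical sketch: pretend AttributeError were a subclass of TypeError.
class HypotheticalAttributeError(TypeError):
    pass

def handle(exc):
    try:
        raise exc
    except TypeError:
        return "TypeError branch"
    except HypotheticalAttributeError:
        # Never reached: the earlier, more general TypeError clause
        # already matches any instance of the subclass.
        return "AttributeError branch"

print(handle(HypotheticalAttributeError()))  # prints "TypeError branch"
```

Because except clauses are tried in order and match subclasses, existing code that lists TypeError first would silently stop using its AttributeError branch, which is why changing the inheritance reaches much more code than changing the one exception raised for read-only properties.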
Personally, I think it would be fine to just change the TypeError to AttributeError. I expect that very few people would be hurt by that change (they'd be building *way* too much specific arcane knowledge into their program if they had code for which it mattered). So why, given two different backwards incompatible choices, do I prefer changing the exception raised in this specific case over making AttributeError inherit from TypeError? Because the latter change has a much larger scope; it can affect much more code (including code that doesn't have anything to do with the problem we're trying to solve). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Sun Apr 17 21:09:41 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 21:10:07 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417162509.GA3300@panix.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113753431.32079.269.camel@presto.wooz.org> <20050417162509.GA3300@panix.com> Message-ID: <1113764981.32079.284.camel@presto.wooz.org> On Sun, 2005-04-17 at 12:25, Aahz wrote: > Why is changing an exception more acceptable than creating a new one? > (I don't have a strong opinion either way, but I'd like some reasoning; > Jack's approach at least doesn't break code.) Especially if the new > exception isn't "public" (in the builtins with other exceptions). Adding an exception that we have to live with forever (even if it's localized to this one module) seems like it would fall under the new feature rubric, whereas I think the choice of exception was just a bug. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/94ad9087/attachment.pgp From barry at python.org Sun Apr 17 21:17:19 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 21:17:21 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <1113765439.32081.287.camel@presto.wooz.org> On Sun, 2005-04-17 at 14:36, Guido van Rossum wrote: > Personally, I think it would be fine to just change the TypeError to > AttributeError. I expect that very few people would be hurt by that > change (they'd be building *way* too much specific arcane knowledge > into their program if they had code for which it mattered). Unless there are any objections in the next few days, I will take this as a pronouncement and make the change at least in 2.5 and 2.4. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/6f5d5f24/attachment.pgp From gvanrossum at gmail.com Sun Apr 17 21:44:43 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Apr 17 21:44:48 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: > > Personally, I think it would be fine to just change the TypeError to > > AttributeError. 
I expect that very few people would be hurt by that > > change (they'd be building *way* too much specific arcane knowledge > > into their program if they had code for which it mattered). > > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. You meant 2.5 only of course. It's still a new feature and as such can't be changed in 2.4. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Sun Apr 17 22:14:39 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 17 22:14:41 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> (Barry Warsaw's message of "Sun, 17 Apr 2005 15:17:19 -0400") References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <2macnx88k0.fsf@starship.python.net> Barry Warsaw writes: > On Sun, 2005-04-17 at 14:36, Guido van Rossum wrote: > >> Personally, I think it would be fine to just change the TypeError to >> AttributeError. I expect that very few people would be hurt by that >> change (they'd be building *way* too much specific arcane knowledge >> into their program if they had code for which it mattered). > > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. I don't think this should be changed in 2.4. Cheers, mwh -- As far as I'm concerned, the meat pie is the ultimate unit of currency. -- from Twisted.Quotes From barry at python.org Sun Apr 17 23:48:35 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 23:48:37 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? 
In-Reply-To: References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <1113774515.32074.300.camel@presto.wooz.org> On Sun, 2005-04-17 at 15:44, Guido van Rossum wrote: > You meant 2.5 only of course. It's still a new feature and as such > can't be changed in 2.4. Fair enough. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/87f9a6d4/attachment.pgp From anthony at interlink.com.au Mon Apr 18 04:07:27 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon Apr 18 04:07:43 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <200504181207.28667.anthony@interlink.com.au> On Monday 18 April 2005 05:17, Barry Warsaw wrote: > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. God no - this isn't suitable for a bugfix release. It seems fine for 2.5, though. -- Anthony Baxter It's never too late to have a happy childhood. From gvanrossum at gmail.com Mon Apr 18 16:49:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 18 16:49:24 2005 Subject: [Python-Dev] Fwd: CFP: DLS05: ACM Dynamic Languages Symposium In-Reply-To: <6b0be5726fb86107ff97205642357f4f@ulb.ac.be> References: <6b0be5726fb86107ff97205642357f4f@ulb.ac.be> Message-ID: See you all at OOPSLA! 
---------- Forwarded message ---------- From: Roel Wuyts Date: Apr 17, 2005 10:59 PM Subject: CFP: DLS05: ACM Dynamic Languages Symposium To: python-announce-list@python.org CALL FOR PAPERS FOR THE ACM Dynamic Languages Symposium 2005 October 18, 2005 San Diego, California (co-located with OOPSLA'05) URL: http://decomp.ulb.ac.be:8082/events/dls05/ ----------- Abstract ----------- In industry, static languages (such as Java, C++ and C#) are much more widely used than their dynamic counterparts (like CLOS, Python, Self, Perl, php or Smalltalk). So it appears as though dynamic language concepts were forgotten and lost the race. But this is not the case. Java and C#, the latest mainstream static languages, popularized to a certain extent dynamic language features such as garbage collection, portability and (limited forms of) reflection. In the near future, we expect this dynamicity to increase even further. E.g., it is getting clearer year after year that pervasive computing is becoming the rule and that concepts such as meta programming, reflection, mobility, dynamic reconfigurability and distribution are becoming increasingly popular. All of these features are the domain of dynamic languages, and hence it is only logical that more dynamic language concepts have to be taken up by static languages, or that dynamic languages can make a breakthrough. Currently, the dynamic language community is fragmented, split over a multitude of paradigms (from functional over logic to object-oriented), languages and syntaxes. This fragmentation severely hinders research as well as acceptance, and results in either language wars or, even worse, language ignorance. The goal of this symposium is to provide a highly visible, international forum for researchers working on dynamic features and languages. We explicitly invite submissions from all kinds of paradigms (object-oriented, functional, logic, ...), as can be seen from the structure of the program committee. 
Areas of interest include, but are not limited to: - closures - delegation - actors, active objects - constraint systems - mixins and traits - reflection and meta-programming - language symbiosis and multi-paradigm languages - experience reports on successful application of dynamic languages Accepted Papers will be published in the ACM Digital Library. ------------------------------- Submission Guidelines ------------------------------- Papers will need to be submitted using an online tracking system, of which the URL will be given later. All papers must be submitted electronically in PDF format (or PostScript, if you do not have access to PDF-producing programs, but this is not recommended). Submissions, as well as final versions, must be formatted to conform to ACM Proceedings requirements: Nine point font on ten point baseline, two columns per page, each column 3.33 inches wide by 9 inches tall, with a column gutter of 0.33 inches, etc. See the ACM Proceedings Guidelines. You can save preparation time by using one of the templates from that page. Note that MS Word documents must be converted to PDF before being submitted. ---------------------- Important Dates ---------------------- - Deadline for receipt of submissions: June 24th 2005 - Notification of acceptance or rejection: August 5th 2005 - Final version for the proceedings: To be announced later --------------------------- Program Committee --------------------------- - Gilad Bracha - Wolfgang De Meuter - Stephane Ducasse - Gopal Gupta - Robert Hirschfeld - Dan Ingalls - Yukihiro Matsumoto - Mark Miller - Eliot Miranda - Philippe Mougin - Oscar Nierstrasz - Dave Thomas - David Ungar - Guido Van Rossum - Peter Van Roy - Jon L White (G) - Roel Wuyts (Chair) -- Roel Wuyts DeComp roel.wuyts@ulb.ac.be Université
Libre de Bruxelles http://homepages.ulb.ac.be/~rowuyts/ Belgique Vice-President of the European Smalltalk Users Group: www.esug.org -- http://mail.python.org/mailman/listinfo/python-announce-list Support the Python Software Foundation: http://www.python.org/psf/donations.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tlesher at gmail.com Mon Apr 18 19:19:04 2005 From: tlesher at gmail.com (Tim Lesher) Date: Mon Apr 18 19:19:07 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] Message-ID: <9613db60050418101934f0e3e8@mail.gmail.com> Here's the first draft of the python-dev summary for the first half of April. Please send any corrections or suggestions to the summarizers. ====================== Summary Announcements ====================== --------------------------- New python-dev summary team --------------------------- This summary marks the first by the team of Steve Bethard, Tim Lesher, and Tony Meyer. We're trying a collaborative approach to the summaries: each fortnight, we'll be getting together in a virtual smoke-filled back room to divide up the interesting threads. Then we'll stitch together the summaries in roughly the same form as you've seen in the past. We'll mark each editor's entries with his initials. Thanks to Brett Cannon for sixty-one excellent python-dev summaries. Also, thanks for providing scripts to help get the new summaries off the ground! We're looking forward to the contributions you'll make to the Python core, now that the summaries aren't taking up all your time. [TDL] ========= Summaries ========= ---------------------- Right Operator Methods ---------------------- Greg Ewing explored an issue with new-style classes that define only right operator methods (__radd__, __rmul__, etc.). Instances of such a class cannot be added/multiplied/etc. together as Python raises a TypeError. 
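A minimal sketch of the behaviour being described (the class name is made up; the semantics are unchanged in current Pythons):

```python
# A new-style class that defines only the reflected addition method.
class RightOnly(object):
    def __radd__(self, other):
        return "radd called"

a, b = RightOnly(), RightOnly()

# With operands of different types, the reflected method is tried
# after int.__add__ returns NotImplemented:
assert 1 + a == "radd called"

# With two instances of the same class, only the non-reversed
# __add__ is consulted, so the operation fails:
try:
    a + b
except TypeError:
    print("TypeError, as described above")
```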
Armin Rigo explained the rule: if the instances on both sides of an operator are of the same class, only the non-reversed method is ever called. Armin also explained that an __add__ or __mul__ method that returns NotImplemented may be called twice when Python attempts to differentiate between numeric and sequence operations. Contributing threads: - `New style classes and operator methods `__ [SJB] ------------------------------------------ Hierarchical groups in regular expressions ------------------------------------------ Chris Ottrey demoed his `pyre2 project`_ that can extract a hierarchy of strings when nested groups match in a regular expression. The current re module (in the stdlib) only matches the last occurrence of a group in the string, throwing away any preceding matches. People discussed some of pyre2's proposed API, with the main suggestion being to extend the API to support unnamed (positional) groups in addition to named groups. Though a number of people expressed interest in the idea, it was not clear whether the functionality should be included in the standard library. However, most agreed that if it was included, it should be integrated with the existing re module. Gustavo Niemeyer offered to perform this integration if an API could be agreed upon. Further discussion was moved to the pyre2 `development wiki`_ and `mailing list`_. Contributing threads: - `hierarchicial named groups extension to the re library `__ .. _pyre2 project: http://pyre2.sourceforge.net/ .. _development wiki: http://py.redsoft.be/pyre2/wiki/ .. _mailing list: http://lists.sourceforge.net/lists/listinfo/pyre2-devel [SJB] ------------------------------- Security capabilities in Python ------------------------------- The issue of security came up again, and Ka-Ping Yee suggested that in Python's restricted execution mode secure proxies can be created by using lexical scoping. 
He posted `some code`_ for revealing only certain "facets" of an object by using a function to declare a proxy class that used function local variables to build the proxy. Thus to access the attributes used in the proxy class, you need to access things like im_func or func_closure, which are not accessible in restricted execution mode. James Y Knight illustrated how strategic overriding of __eq__ in a subclass of str could allow access to the hidden "facets". Eyal Lotem suggested that such an attack could be countered by implementing "facets" in C, but having to turn to C every time you needed a particular security construct seemed unappealing. Contributing threads: - `Security capabilities in Python `__ .. _some code: http://zesty.ca/python/facet.py [SJB] --------------------------------- Improving GilState API Robustness --------------------------------- Michael Hudson noted that his changes to thread handling in the readline module appeared to trigger `bug 1176893`_ ("Readline segfault"). However, he believed the problem lay in the GilState API, rather than in his changes: PyGilState_Release crashes if PyEval_InitThreads wasn't called, even if the code you're writing doesn't use multiple threads. He proposed several solutions, none of which met with resounding approbation, and Tim Peters noted that `PEP 311`_, Simplified Global Interpreter Lock Acquisition for Extensions, "specifically disowns responsibility for worrying about whether Py_Initialize and PyEval_InitThreads have been called." Bob Ippolito wondered whether just calling PyEval_InitThreads directly in Py_Initialize might be a better idea. No objections were raised, so long as the underlying OS locking mechanisms weren't overly expensive; some initial benchmarks indicated that this approach was viable, at least on Linux and OS X. Contributing threads: - `threading (GilState) question `__ .. _bug 1176893: http://sourceforge.net/tracker/index.php?func=detail&aid=1176893&group_id=5470&atid=105470 .. 
_PEP 311: http://www.python.org/peps/pep-0311.html [TDL] ---------------------------------------- Unicode byte order mark decoding ---------------------------------------- Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft platforms, add it. Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a long discussion on the history of Unicode and Microsoft's influence over its evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the codec. Contributing threads: - `Unicode byte order mark decoding `__ .. _patch: http://sourceforge.net/tracker/index.php?func=detail&aid=1177307&group_id=5470&atid=305470 [TDL] --------------- Developers List --------------- Raymond Hettinger has started a `project to track developers`_ and the (tracker and commit) privileges they have, and who gave them the privileges, and why (for example, was it for a one-shot project). Removing inactive developers should improve clarity, institutional memory, and security, and make everything tidier. Raymond has begun contacting recently inactive developers to check whether they still require the privileges they have. Contributing threads: - `Developer list update `__ .. _project to track developers: http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/dist/src/Misc/developers.txt [TAM] -------------------- Marshalling Infinity -------------------- Scott David Daniels kicked off a very long thread by asking what (un)marshal should do with floating point NaNs. The current behaviour (as with any NaN, infinity, or signed zero) is undefined: a platform-dependent accident, because Python is written to C89, which has no such concepts. 
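For reference, the 8-byte IEEE-754 round-trip under discussion can be sketched with the struct module (math.inf and math.nan are modern spellings, used here purely for illustration; the thread predates them):

```python
import math
import struct

# Pack a double into its 8-byte IEEE-754 representation, as a
# byte-copying _PyFloat_Pack8 would on a little-endian IEEE platform.
packed_inf = struct.pack('<d', math.inf)
packed_nan = struct.pack('<d', math.nan)

# Round-tripping infinity preserves the value exactly.
assert struct.unpack('<d', packed_inf)[0] == math.inf

# NaN never compares equal to itself, so the round trip is checked
# with math.isnan rather than equality.
assert math.isnan(struct.unpack('<d', packed_nan)[0])
```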
Tim Peters pointed out that all code for (de)serializing C doubles should go through _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation suggests that the routines could simply copy bytes on platforms that use the standard IEEE-754 single and double formats natively. Michael Hudson obliged by creating a `patch to implement this`_. The consensus was that the correct behaviour is that packing a NaN or infinity shouldn't cause an exception. When unpacking, an IEEE-754 platform shouldn't cause an exception, but a non-754 platform should, since there's no sensible value that it can be unpacked to, and errors should never pass silently. Contributing threads: - `marshal / unmarshal `__ .. _patch to implement this: http://python.org/sf/1181301 [TAM] --------------------------------- Location of the sign bit in longs --------------------------------- Michael Hudson asked about the possibility of longs storing the sign bit somewhere other than the current location, suggesting the top bit of ob_digit[0]. Tim Peters suggested that it would be better to give struct _longobject a distinct sign member. This simplifies code, costs no extra bytes for some longs, and 8 extra bytes for others, and shouldn't hurt binary compatibility. Michael coughed up a `longobject patch`_, which seems likely to be checked in. Contributing threads: - `marshal / unmarshal `__ .. _longobject patch: http://python.org/sf/1177779 [TAM] ----------------------- Acceptable diff formats ----------------------- Nick Coghlan asked if context diffs are still favoured for patches. Historically, context diffs were preferred, but it appears that unified diffs are today's choice. Raymond Hettinger made the sensible suggestion that whichever is most informative for the particular patch should be used, and Bob Ippolito pointed out that if CVS is replaced with subversion, unified diffs will have better support. 
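The difference between the two formats can be illustrated with the stdlib's own difflib (file names and contents are made up):

```python
import difflib

old = ['a\n', 'b\n']        # original file contents
new = ['a\n', 'c\n']        # edited file contents

# Unified format: one hunk, changed lines marked with -/+ and
# surrounded by shared context lines.
unified = ''.join(difflib.unified_diff(old, new,
                                       'example.py.orig', 'example.py'))

# Context format: separate "before" and "after" hunks, with
# changed lines marked by "!".
context = ''.join(difflib.context_diff(old, new,
                                       'example.py.orig', 'example.py'))

assert '-b' in unified and '+c' in unified
assert '! b' in context and '! c' in context
```

The same change appears once in the unified output but twice (in the old and new hunks) in the context output, which is one reason unified diffs are usually more compact.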
The `patch submission guidelines`_ will be updated at some point to reflect the preference for unified diffs, although if your diff program doesn't support '-u', then context diffs are ok - plain patches are, of course, not. Contributing threads: - `Unified or context diffs? `__ .. _patch submission guidelines: http://www.python.org/patches/ [TAM] =============== Skipped Threads =============== - python-dev Summary for 2005-03-16 through 2005-03-31 [draft] - [Python-checkins] python/dist/src/Lib/logging handlers.py, 1.19, 1.19.2.1 - [Python-checkins] python/dist/src/Modules mathmodule.c, 2.74, 2.75 - Weekly Python Patch/Bug Summary - Mail.python.org - New bug, directly assigned, okay? - inconsistency when swapping obj.__dict__ with a dict-like object... - Pickling instances of nested classes - args attribute of Exception objects -- Tim Lesher From mal at egenix.com Mon Apr 18 19:47:03 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Apr 18 19:47:06 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: <4263F297.3020205@egenix.com> Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. > > Unfortunately, there are several problems that make Python references > invalid as capabilities: > > * There is no way to create secure proxies because there are no > private attributes. > * Lots of Python objects are reachable unnecessarily breaking the > principle of least privilege (i.e: object.__subclasses__() etc.) > > I was wondering if any such effort has already begun or if there are > other considerations making Python unusable as a capability platform? You might want to have a look at mxProxy objects. These were created to provide secure wrappers around Python objects with a well-defined access mechanism, e.g. 
by defining a list of methods/attributes which can be accessed from the outside or by creating a method which then decides whether access is granted or not: http://www.egenix.com/files/python/mxProxy.html Note that the new-style classes may have introduced some security leaks. If you find any, please let me know. PS: A nice side-effect of these proxy objects is that you can create weak references to all Python objects (not just those that support the protocol). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 18 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Mon Apr 18 20:05:52 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 18 20:10:26 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> (Tim Lesher's message of "Mon, 18 Apr 2005 13:19:04 -0400") References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <2mzmvw6jun.fsf@starship.python.net> Tim Lesher writes: > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. > > ====================== > Summary Announcements > ====================== > > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. Nice work! 
An update: > --------------------------------- > Improving GilState API Robustness > --------------------------------- > > Michael Hudson noted that his changes to thread handling in the > readline module appeared to trigger `bug 1176893`_ ("Readline > segfault"). However, he believed the problem lay in the GilState API, > rather than in his changes: PyGilState_Release crashes if > PyEval_InitThreads wasn't called, even if the code you're writing > doesn't use multiple threads. > > He proposed several solutions, none of which met with resounding > approbation, Nevertheless, I've checked one of them in :) After reading a fair bit of code, and docs, I went for option 2) in the linked mail. > and Tim Peters noted that `PEP 311`_, Simplified Global Interpreter > Lock Acquisition for Extensions, "specifically disowns > responsibility for worrying about whether Py_Initialize and > PyEval_InitThreads have been called." I think this reading is a bit of a stretch of the wording of the PEP. It also contradicts the documentation ("regardless of the current state of Python"). Finally, the current behaviour has a strong whiff of being accidental. > -------------------- > Marshalling Infinity > -------------------- > > Scott David Daniels kicked off a very long thread by asking what (un)marshal > should do with floating point NaNs. The current behaviour (as with any NaN, > infinity, or signed zero) is undefined: a platform-dependant accident, > because Python is written to C89, which has no such concepts. Tim Peters > pointed out all code for (de)serialing C doubles should go through > _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation > suggests that the routines could simply copy bytes on platforms that use the > standard IEEE-754 single and double formats natively. Michael Hudson > obliged by creating a `patch to implement this`_. I hope to check this in soon. Note that the patch is in two pieces, one to marshal floats in binary format and one ... 
> The consensus was that the correct behaviour is that packing a NaN or > infinity shouldn't cause an exception. When unpacking, an IEEE-754 platform > shouldn't cause an exception, but a non-754 platform should, since there's > no sensible value that it can be unpacked to, and errors should never pass > silently. ... to do this bit. > --------------------------------- > Location of the sign bit in longs > --------------------------------- > > Michael Hudson asked about the possibility of longs storing the sign bit > somewhere other than the current location, suggesting the top bit of > ob_digit[0]. Tim Peters suggested that it would be better to give struct > _longobject a distinct sign member. This simplifies code, costs no extra > bytes for some longs, and 8 extra bytes for others, and shouldn't hurt > binary compatibility. > > Michael coughed up a `longobject patch`_, which seems likely to be checked > in. I'm actually in less of a rush to get this one in :) (Hmm, had a busy couple of weeks, didn't I? :) > Contributing threads: > > - `marshal / unmarshal > `__ ? Cheers, mwh -- we should write an os YES * itamar starts a sourceforge project -- from Twisted.Quotes From aahz at pythoncraft.com Mon Apr 18 20:27:48 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon Apr 18 20:27:51 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <20050418182748.GA4709@panix.com> On Mon, Apr 18, 2005, Tim Lesher wrote: > > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. Good show! One suggestion: might want to order threads in order of relevance to random python-dev readers (the bit that triggered this comment was seeing the unified vs. context diffs thread so far down). 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From bac at OCF.Berkeley.EDU Mon Apr 18 21:02:54 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Apr 18 21:03:10 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <4264045E.9000300@ocf.berkeley.edu> Tim Lesher wrote: > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. > > ====================== > Summary Announcements > ====================== > > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. We're trying a collaborative approach to the > summaries: each fortnight, we'll be getting together in a virtual > smoke-filled back room to divide up the interesting threads. Then > we'll stitch together the summaries in roughly the same form as you've > seen in the past. We'll mark each editor's entries with his initials. > Woohoo! Once again, thanks for doing this guys. > Thanks to Brett Cannon for sixty-one excellent python-dev summaries. > Also, thanks for providing scripts to help get the new summaries off > the ground! We're looking forward to the contributions you'll make to > the Python core, now that the summaries aren't taking up all your > time. > Gee, no pressure. 
=) [SNIP] > ------------------------------- > Security capabilities in Python > ------------------------------- > > The issue of security came up again, and Ka-Ping Yee suggested that in > Python's restricted execution mode secure proxies can be created by > using lexical scoping. He posted `some code`_ for revealing only > certain "facets" of an object by using a function to declare a proxy > class that used function local variables to build the proxy. Thus to "... that used a function's local variables ..." [SNIP] > > --------------------------------- > Improving GilState API Robustness > --------------------------------- > > Michael Hudson noted that his changes to thread handling in the > readline module appeared to trigger `bug 1176893`_ ("Readline > segfault"). However, he believed the problem lay in the GilState API, > rather than in his changes: PyGilState_Release crashes if > PyEval_InitThreads wasn't called, even if the code you're writing > doesn't use multiple threads. > > He proposed several solutions, none of which met with resounding > approbation, and Tim Peters noted that `PEP 311`_, Simplified Global > Interpreter Lock Acquisition for Extensions, "specifically disowns > responsibility for worrying about whether Py_Initialize and > PyEval_InitThreads have been called." > > Bob Ippolito wondered whether just calling PyEval_InitThreads directly > in Py_Initialize might be a better idea. No objections were raised, > so long as the underlying OS locking mechanisms weren't overly > expensive; some initial benchmarks indicated that this approach was > viable, at least on Linux and OS X. > > Contributing threads: > > - `threading (GilState) question > `__ > > .. _bug 1176893: > http://sourceforge.net/tracker/index.php?func=detail&aid=1176893&group_id=5470&atid=105470 > For any tracker item, the easiest way to do a URL is to use the python.org shortcut: http://www.python.org/sf/##### . So the above would be http://www.python.org/sf/1176893 . > .. 
_PEP 311: http://www.python.org/peps/pep-0311.html > > [TDL] > > ---------------------------------------- > Unicode byte order mark decoding > ---------------------------------------- > > Evan Jones saw that the UTF-16 decoder discards the byte-order mark > (BOM) from Unicode files, while the UTF-8 decoder doesn't. Although > the BOM isn't really required in UTF-8 files, many Unicode-generating > applications, especially on Microsoft platforms, add it. > > Walter D?rwald created a patch_ to add a UTF-8-Sig codec that generates > a BOM on writing and skips it on reading, but after a long discussion > on the history of the Unicode, Microsoft's influence over its "... of Unicode and Microsoft's influence ..." [SNIP] > --------------- > Developers List > --------------- > > Raymond Hettinger has started a `project to track developers`_ and the > (tracker and commit) privileges they have, and who gave them the privileges, > and why (for example, was it for a one-shot project). Removing inactive > developers should improve clarity, institutional memory, security, and makes > everything tidier. Raymond has begun contacting recently inactive > developers to check whether they still require the privileges they have. > > Contributing threads: > > - `Developer list update > `__ > > .. _project to track developers: > http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/dist/src/Misc/developers.txt > > [TAM] > > -------------------- > Marshalling Infinity > -------------------- > > Scott David Daniels kicked off a very long thread by asking what (un)marshal > should do with floating point NaNs. The current behaviour (as with any NaN, > infinity, or signed zero) is undefined: a platform-dependant accident, > because Python is written to C89, which has no such concepts. 
Tim Peters > pointed out all code for (de)serialing C doubles should go through > _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation > suggests that the routines could simply copy bytes on platforms that use the > standard IEEE-754 single and double formats natively. Michael Hudson > obliged by creating a `patch to implement this`_. > > The consensus was that the correct behaviour is that packing a NaN or "... behaviour of packing a NaN ..." [SNIP] Well done guys! Very impressed; succinct, clear, and a ton fewer errors than I used to put into the first draft. =) When you are happy with the draft just email me the plaintext and I will get it up on python.org for you. -Brett From walter at livinglogic.de Mon Apr 18 23:33:58 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Mon Apr 18 23:34:00 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <1260.84.56.100.23.1113860038.squirrel@isar.livinglogic.de> Tim Lesher sagte: > Here's the first draft of the python-dev summary for the first half of April. Please send any corrections or suggestions to > the summarizers. > [...] > ---------------------------------------- > Unicode byte order mark decoding > ---------------------------------------- > > Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder > doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft > platforms, add it. 
> > Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a > long discussion on the history of the Unicode, Microsoft's influence over its > evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the > codec. All codecs provide a stream API, so there is no higher level. Bye, Walter Dörwald From python at rcn.com Mon Apr 18 12:08:21 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 19 00:08:48 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15[draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <000201c543fe$8dd18580$e827a044@oemcomputer> > ====================== > Summary Announcements > ====================== Executive summary: Hudson goes wild fixing obscure bugs. > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. We're trying a collaborative approach to the > summaries: each fortnight, we'll be getting together in a virtual > smoke-filled back room to divide up the interesting threads. Both your process and results are excellent. Raymond Hettinger From oliphant at ee.byu.edu Tue Apr 19 02:47:31 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 19 02:47:35 2005 Subject: [Python-Dev] Pickling buffer objects. Message-ID: <42645523.3000109@ee.byu.edu> Before submitting a patch to pickle.py and cPickle.c, I'd be interested in knowing how likely it is that a patch allowing Python to pickle the buffer object would be accepted. The problem being solved is that Numeric currently has to copy all of its data into a string before writing it out to a pickle. Yes, I know there are ways to write directly to a file. But, it is desirable to have Numeric arrays interact seamlessly with other pickleable types without a separate stream. 
This is especially utilized for network transport. The patch would simply write the opcode for a Python string to the stream and then write the character-interpreted data (without making an intermediate copy) of the void * pointer of the buffer object. Yes, I know all of the old arguments about the buffer object and that it should be replaced with something better. I've read all the old posts and am quite familiar with the issues about it. But, this can be considered a separate issue. Since the buffer object exists, it ought to be pickleable, and it would make a lot of applications a lot faster. I'm proposing to pickle the buffer object so that it unpickles as a string. Arguably, there should be a separate mutable-byte object opcode so that buffer objects unpickle as mutable-byte buffer objects. If that is more desirable, I'd even offer a patch to do that (though such pickles wouldn't unpickle under earlier versions of Python). I suspect that the buffer object would need to be reworked into something more along the lines of the previously-proposed bytes object before a separate bytecode for pickleable mutable-bytes is accepted, however. -Travis Oliphant From greg.ewing at canterbury.ac.nz Tue Apr 19 06:39:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 19 06:39:27 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42645523.3000109@ee.byu.edu> References: <42645523.3000109@ee.byu.edu> Message-ID: <42648B6A.3070700@canterbury.ac.nz> Travis Oliphant wrote: > > I'm proposing to pickle the buffer object so that it unpickles as a > string. Wouldn't this mean you're only solving half the problem? Unpickling a Numeric array this way would still use an intermediate string. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Tue Apr 19 07:16:16 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue Apr 19 07:16:20 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42648B6A.3070700@canterbury.ac.nz> References: <42645523.3000109@ee.byu.edu> <42648B6A.3070700@canterbury.ac.nz> Message-ID: <42649420.3040501@v.loewis.de> Greg Ewing wrote: > Wouldn't this mean you're only solving half the problem? > Unpickling a Numeric array this way would still use an > intermediate string. Precisely my concern. Martin From prakash.ayyardevar at gmail.com Tue Apr 19 13:39:12 2005 From: prakash.ayyardevar at gmail.com (Prakash A) Date: Tue Apr 19 13:39:22 2005 Subject: [Python-Dev] Python 2.1 in HP-UX Message-ID: <025b01c544d4$6b7a7470$1a0110ac@PRACO> Hello All, I using jython 2.1. For that i need of Python 2.1 ( i am sure about this, pls clarify me if any version of Python can be used with Jython). and i am working HP-UX platform. I need to know that, whether Python can be built in HP-UX, because i seeing some of the mails saying Python 2.1 did not compile in HP-UX and Python can not build with HP-UX. Please tell me, whether Python 2.1 can be built in HP-UX. If yes, please give me the steps to do that. Thanks in Advance, Prakash.A -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050419/5b7cea30/attachment.html From aahz at pythoncraft.com Tue Apr 19 16:35:03 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 19 16:35:25 2005 Subject: [Python-Dev] Python 2.1 in HP-UX In-Reply-To: <025b01c544d4$6b7a7470$1a0110ac@PRACO> References: <025b01c544d4$6b7a7470$1a0110ac@PRACO> Message-ID: <20050419143503.GA24331@panix.com> On Tue, Apr 19, 2005, Prakash A wrote: > > I using jython 2.1.
For that i need of Python 2.1 ( i am sure about > this, pls clarify me if any version of Python can be used with > Jython). and i am working HP-UX platform. I need to know that, > whether Python can be built in HP-UX, because i seeing some of the > mails saying Python 2.1 did not compile in HP-UX and Python can not > build with HP-UX. Please tell me, whether Python 2.1 can be built in > HP-UX. If yes, please give me the steps to do that. python-dev is for development of the Python project. Please use comp.lang.python for other questions. Thank you. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From flinkkettel at yahoo.com Tue Apr 19 17:34:46 2005 From: flinkkettel at yahoo.com (Ralph Hilton) Date: Tue Apr 19 17:34:49 2005 Subject: [Python-Dev] How do you get yesterday from a time object Message-ID: <20050419153446.30407.qmail@web60116.mail.yahoo.com> i'm a beginning python programmer. I want to get the date for yesterday nowTime = time.localtime(time.time()) print nowTime. oneDay = 60*60*24 # number seconds in a day yday = nowTime - oneDay # <-- generates an error print yday.strftime("%Y-%m-%d") How can I just get yesterday's day? It a simple concept yet it seems to be so hard to figure out. What i'm worried about is if today is say June 1, 2023 what is yesterday? and how do i compute that? Ralph Hilton __________________________________ Do you Yahoo!? Plan great trips with Yahoo! Travel: Now over 17,000 guides!
http://travel.yahoo.com/p-travelguide From simon.brunning at gmail.com Tue Apr 19 17:57:31 2005 From: simon.brunning at gmail.com (Simon Brunning) Date: Tue Apr 19 17:57:33 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <8c7f10c6050419085724d6e8e5@mail.gmail.com> On 4/19/05, Ralph Hilton wrote: > i'm a beginning python programmer. > > I want to get the date for yesterday This is the wrong place for this question. Nip over to http://mail.python.org/mailman/listinfo/python-list, and I'd be more than happy answer it there... -- Cheers, Simon B, simon@brunningonline.net, http://www.brunningonline.net/simon/blog/ From p.f.moore at gmail.com Tue Apr 19 17:59:07 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue Apr 19 17:59:11 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <79990c6b0504190859316a9e69@mail.gmail.com> On 4/19/05, Ralph Hilton wrote: > i'm a beginning python programmer. > > I want to get the date for yesterday > > nowTime = time.localtime(time.time()) > print nowTime. > oneDay = 60*60*24 # number seconds in a day > yday = nowTime - oneDay # <-- generates an error > print yday.strftime("%Y-%m-%d") > > How can I just get yesterday's day? It a simple > concept yet it seems to be so hard to figure out. > > What i'm worried about is if today is say > June 1, 2023 > what is yesterday? and how do i compute that? You don't want the python-dev list for this type of question. Python-dev is for development *of* Python. For usage questions such as this, you would be better asking on python-list (or, equivalently, the Usenet group comp.lang.python). 
To assist with your question, though, I'd suggest you look at the documentation of the datetime module, which allows you to do what you are after (and much more). Regards, Paul From pdecat at gmail.com Tue Apr 19 18:07:17 2005 From: pdecat at gmail.com (Patrick DECAT) Date: Tue Apr 19 18:07:19 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <3dd9f8f6050419090737913059@mail.gmail.com> Hi, I believe it's not the appropriate place to ask such questions. You should check the Python users' list ( http://python.org/community/lists.html ) Anyway, here you go : now = time.time() nowTuple = time.localtime(now) yesterdayTuple = time.localtime(now-60*60*24) Regards, Patrick. 2005/4/19, Ralph Hilton : > i'm a beginning python programmer. > > I want to get the date for yesterday > > nowTime = time.localtime(time.time()) > print nowTime. > oneDay = 60*60*24 # number seconds in a day > yday = nowTime - oneDay # <-- generates an error > print yday.strftime("%Y-%m-%d") > > How can I just get yesterday's day? It a simple > concept yet it seems to be so hard to figure out. > > What i'm worried about is if today is say > June 1, 2023 > what is yesterday? and how do i compute that? > > Ralph Hilton > > __________________________________ > Do you Yahoo!? > Plan great trips with Yahoo! Travel: Now over 17,000 guides! 
> http://travel.yahoo.com/p-travelguide > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pdecat%40gmail.com > From lcaamano at gmail.com Tue Apr 19 19:10:03 2005 From: lcaamano at gmail.com (Luis P Caamano) Date: Tue Apr 19 19:10:08 2005 Subject: [Python-Dev] os.urandom uses closed FD (sf 1177468) Message-ID: We're running into the problem described in bug 1177468, where urandom tries to use a cached file descriptor that was closed by a daemonizing function. A quick fix/workaround is to have os.urandom open /dev/urandom every time it gets called instead of using a cached fd. Would that create any problems other than those related to the additional system call overhead? BTW, I added the traceback we're getting as a comment to the bug. Thanks PS This is with Python 2.4.1 -- Luis P Caamano Atlanta, GA USA -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050419/3d0e7848/attachment.htm From jjinux at gmail.com Tue Apr 19 20:35:21 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Tue Apr 19 20:35:32 2005 Subject: [Python-Dev] anonymous blocks Message-ID: (I apologize that this is my first post. Please don't flame me into oblivion or think I'm a quack!) Have you guys considered the following syntax for anonymous blocks? I think it's possible to parse given Python's existing syntax: items.doFoo( def (a, b) { return a + b }, def (c, d) { return c + d } ) Notice the trick is that there is no name between the def and the "(", and the ")" is followed by a "{". I understand that there is hesitance to use "{}". However, you can think of this as a Python special case on the same level as using ";" between statements on a single line. From that perspective, it's not inconsistent at all.
Best Regards, -jj From tjreedy at udel.edu Tue Apr 19 20:48:59 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 19 20:50:43 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: "Shannon -jj Behrens" wrote in message news:c41f67b90504191135e85c8b5@mail.gmail.com... >Have you guys considered the following syntax for anonymous blocks? There have probably been about 10 such proposals bandied about over the years, mostly on comp.lang.python, which is the more appropriate place for speculative proposals such as this. >I understand that there is hesitance to use "{}". For some, there is more than 'hesitance'. If you understood why, as has been discussed on c.l.p several times, I doubt you would bother proposing such. I won't repeat them here. I should hope there is a Python FAQ entry on this. Terry J. Reedy From gvanrossum at gmail.com Tue Apr 19 20:55:02 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 20:55:14 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: > (I apologize that this is my first post. Please don't flame me into > oblivion or think I'm a quack!) (Having met JJ I can assure he's not a quack. But don't let that stop the flames. :-) > Have you guys considered the following syntax for anonymous blocks? I > think it's possible to parse given Python's existing syntax: > > items.doFoo( > def (a, b) { > return a + b > }, > def (c, d) { > return c + d > } > ) > > Notice the trick is that there is no name between the def and the "(", > and the ")" is followed by a "{". > > I understand that there is hesitance to use "{}". However, you can > think of this as a Python special case on the same level as using ";" > between statements on a single line. From that perspective, it's not > inconsistent at all. It would be a lot less inconsistent if {...} would be acceptable alternative block syntax everywhere. But what exactly are you trying to accomplish here?
I think that putting the defs *before* the call (and giving the anonymous blocks temporary local names) actually makes the code clearer: def block1(a, b): return a + b def block2(c, d): return c + d items.doFoo(block1, block2) This reflects a style pattern that I've come to appreciate more recently: when breaking a call with a long argument list to fit on your screen, instead of trying to find the optimal break points in the argument list, take one or two of the longest arguments and put them in local variables. Thus, instead of this: self.disentangle(0x40, self.subcalculation("The quick brown fox jumps over the lazy dog"), self.indent+1) I'd recommend this: tri = self.subcalculation("The quick brown fox jumps over the lazy dog") self.disentangle(0x40, tri, self.indent+1) IMO this is clearer, and even shorter! If we apply this to the anonymous block problem, we may end up finding lambda the ultimate compromise -- like a gentleman in the back of my talk last week at baypiggies observed (unfortunately I don't know his name). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Tue Apr 19 21:11:01 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 21:11:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: Shannon -jj Behrens wrote: > Have you guys considered the following syntax for anonymous blocks? I > think it's possible to parse given Python's existing syntax: > > items.doFoo( > def (a, b) { > return a + b > }, > def (c, d) { > return c + d > } > ) > There was a proposal in the last few days on comp.lang.python that allows you to do this in a way that requires less drastic changes to python's syntax. See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) (an earlier, similar proposal is here: http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list %40python.org ).
In short, if doFoo is defined like: def doFoo(func1, func2): pass You would be able to call it like: doFoo(**): def func1(a, b): return a + b def func2(c, d): return c + d That is, a suite can be used to define keyword arguments. -Brian From oliphant at ee.byu.edu Tue Apr 19 21:13:40 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 19 21:13:44 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42648B6A.3070700@canterbury.ac.nz> References: <42645523.3000109@ee.byu.edu> <42648B6A.3070700@canterbury.ac.nz> Message-ID: <42655864.5030903@ee.byu.edu> Greg Ewing wrote: > Travis Oliphant wrote: > >> >> I'm proposing to pickle the buffer object so that it unpickles as a >> string. > > > Wouldn't this mean you're only solving half the problem? > Unpickling a Numeric array this way would still use an > intermediate string. Well, actually, unpickling in the new numeric uses the intermediate string as the memory (yes, I know it's not supposed to be "mutable", but without a mutable bytes object what else are you supposed to do?). Thus, ideally we would have a mutable-bytes object with a separate pickle opcode. Without this, we overuse the string object. But, since the string is only created by the pickle (and nobody else uses it), what's the real harm? So, in reality the previously-mentioned patch together with modifications to Numeric's unpickling code actually solves the whole problem. -Travis From gvanrossum at gmail.com Tue Apr 19 21:24:25 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 21:24:28 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: > See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) > (an earlier, similar proposal is here: > http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list > %40python.org ).
> > In short, if doFoo is defined like: > > def doFoo(func1, func2): > pass > > You would be able to call it like: > > doFoo(**): > def func1(a, b): > return a + b > def func2(c, d): > return c + d > > That is, a suite can be used to define keyword arguments. I'm still not sure how this is particularly solving a pressing problem that isn't solved by putting the function definitions in front of the call. I saw the first version of the proto-PEP and didn't think that the motivating example (keeping the getx/setx methods passed to a property definition out of the class namespace) was all that valuable. Two more issues: (1) It seems that *every* name introduced in the block automatically becomes a keyword argument. This looks like a problem, since you could easily need temporary variables there. (I don't see that a problem with class bodies because the typical use there is only method and property definitions and the occasional instance variable default.) (2) This seems to be attaching a block to a specific function call but there are more general cases: e.g. you might want to assign the return value of doFoo() to a variable, or you might want to pass it as an argument to another call. *If* we're going to create syntax for anonymous blocks, I think the primary use case ought to be cleanup operations to replace try/finally blocks for locking and similar things. 
I'd love to have syntactical support so I can write blahblah(myLock): code code code instead of myLock.acquire() try: code code code finally: myLock.release() -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Tue Apr 19 21:28:31 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 21:30:32 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: Brian Sabbey wrote: > In short, if doFoo is defined like: > > def doFoo(func1, func2): > pass > > You would be able to call it like: > > doFoo(**): > def func1(a, b): > return a + b > def func2(c, d): > return c + d > > That is, a suite can be used to define keyword arguments. umm. isn't that just an incredibly obscure way to write def func1(a, b): return a + b def func2(c, d): return c + d doFoo(func1, func2) but with more indentation? From pje at telecommunity.com Tue Apr 19 21:39:56 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Apr 19 21:35:59 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> At 11:55 AM 04/19/2005 -0700, Guido van Rossum wrote: >I'd recommend this: > >tri = self.subcalculation("The quick brown fox jumps over the lazy dog") >self.disentangle(0x40, tri, self.indent+1) > >IMO this is clearer, and even shorter! What was your opinion on "where" as a lambda replacement? i.e. foo = bar(callback1, callback2) where: def callback1(x): print "hello, " def callback2(x): print "world!" I suspect that you like the define-first approach because of your tendency to ask questions first and read later. That is, you want to know what callback1 and callback2 are before you see them passed to something. However, other people seem to like to have the context first, then fill in the details of each callback later. Interestingly, this syntax also works to do decoration, though it's not a syntax that was ever proposed for that. 
e.g.: foo = classmethod(foo) where: def foo(cls,x,y,z): # etc. foo = property(get_foo,set_foo) where: def get_foo(self): # ... def set_foo(self): # ... I don't mind @decorators, of course, but maybe they wouldn't be needed here. >If we apply this to the anonymous block problem, we may end up finding >lambda the ultimate compromise -- like a gentleman in the back of my >talk last week at baypiggies observed (unfortunately I don't know his >name). > >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/pje%40telecommunity.com From pje at telecommunity.com Tue Apr 19 21:47:57 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Apr 19 21:43:58 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> References: Message-ID: <5.1.1.6.0.20050419154251.00a8c9a0@mail.telecommunity.com> At 03:39 PM 04/19/2005 -0400, Phillip J. Eby wrote: >I suspect that you like the define-first approach because of your tendency >to ask questions first and read later. Oops; I forgot to put the smiley on that. It was supposed to be a humorous reference to a comment Guido made in private e-mail about the Dr. Dobbs article I wrote on decorators. He had said something similar about the way he reads articles, expecting the author to answer all his questions up front. Without that context, the above sentence sounds like some sort of snippy remark that I did not intend it to be. Sorry. 
:( From facundobatista at gmail.com Tue Apr 19 21:49:08 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 19 21:49:11 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: On 4/19/05, Guido van Rossum wrote: > I'm still not sure how this is particularly solving a pressing problem > that isn't solved by putting the function definitions in front of the Well. From what I've read in my short python experience, people want to change the language *not* because they have a problem that can not be solved in a different way, but because they *like* to solve it in a different way. And you, making a stand against this, are a main Python feature. . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From sabbey at u.washington.edu Tue Apr 19 21:55:33 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 21:55:37 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: Guido van Rossum wrote: >> See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) >> (an earlier, similar proposal is here: >> http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list >> %40python.org ). >> >> In short, if doFoo is defined like: >> >> def doFoo(func1, func2): >> pass >> >> You would be able to call it like: >> >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > I'm still not sure how this is particularly solving a pressing problem > that isn't solved by putting the function definitions in front of the > call. I saw the first version of the proto-PEP and didn't think that > the motivating example (keeping the getx/setx methods passed to a > property definition out of the class namespace) was all that valuable. OK. I think most people (myself included) who would prefer to define properties (and event handlers, etc.)
in this way are motivated by the perception that the current method is just ugly. I don't know that it solves any pressing problems. > Two more issues: > > (1) It seems that *every* name introduced in the block automatically > becomes a keyword argument. This looks like a problem, since you could > easily need temporary variables there. (I don't see that a problem > with class bodies because the typical use there is only method and > property definitions and the occasional instance variable default.) Combining the suite-based keywords proposal with the earlier, 'where' proposal (linked in my above post), you would be able to name variables individually in the case that temporary variables are needed: f(x=x): x = [i**2 for i in [1,2,3]] > (2) This seems to be attaching a block to a specific function call but > there are more general cases: e.g. you might want to assign the return > value of doFoo() to a variable, or you might want to pass it as an > argument to another call. The 'where' proposal also doesn't have this problem. Any expression is allowed. > *If* we're going to create syntax for anonymous blocks, I think the > primary use case ought to be cleanup operations to replace try/finally > blocks for locking and similar things. I'd love to have syntactical > support so I can write > > blahblah(myLock): > code > code > code > > instead of > > myLock.acquire() > try: > code > code > code > finally: > myLock.release() Well, that was my other proposal, "pre-PEP: Simple Thunks" (there is also an implementation). It didn't seem to go over all that well. I am going to try to rewrite it and give more motivation and explanation (and maybe use 'with' and 'from' instead of 'do' and 'in' as keywords). 
-Brian From gvanrossum at gmail.com Tue Apr 19 22:00:50 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 22:00:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: > What was your opinion on "where" as a lambda replacement? i.e. > > foo = bar(callback1, callback2) where: > def callback1(x): > print "hello, " > def callback2(x): > print "world!" I don't recall seeing this proposed, but I might have -- I thought of pretty much exactly this syntax in the shower a few days ago. Unfortunately it doesn't solve the lock-release use case that is more pressing in my mind. Also, if you want top-down programming (which is a fine coding style!), we already have several ways to do that. > I suspect that you like the define-first approach because of your tendency > to ask questions first and read later. That is, you want to know what > callback1 and callback2 are before you see them passed to > something. However, other people seem to like to have the context first, > then fill in the details of each callback later. I think it all depends, not so much on the personality of the reader, but on the specifics of the program. When callback1 and callback2 are large chunks of code, we probably all agree that it's better to have them out of the way, either way up or way down -- purely because of their size they deserve to be abstracted away when we're reading on how they are being used. A more interesting use case may be when callback1 and callback2 are very *small* amounts of code, since that's the main use case for lambda; there knowing what callback1 and callback2 stand for is probably important. I have to say that as long as it's only a few lines away I don't care much whether the detail is above or below its application, since it will all fit on a single screen and I can look at it all together. 
So then the 'where' syntax isn't particularly attractive because it doesn't solve a problem I'm experiencing. > Interestingly, this syntax also works to do decoration, though it's not a > syntax that was ever proposed for that. e.g.: > > foo = classmethod(foo) where: > def foo(cls,x,y,z): > # etc. This requires you to write foo three times, which defeats at least half of the purpose of decorators. > foo = property(get_foo,set_foo) where: > def get_foo(self): > # ... > def set_foo(self): > # ... > > I don't mind @decorators, of course, but maybe they wouldn't be needed here. As I said before, I'm not sure why keeping get_foo etc. out of the class namespace is such a big deal. In fact, I like having them there (sometimes they can even be handy, e.g. you might be able to pass the unbound get_foo method as a sort key). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Tue Apr 19 22:06:44 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 22:06:49 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: Fredrik Lundh wrote: >> In short, if doFoo is defined like: >> >> def doFoo(func1, func2): >> pass >> >> You would be able to call it like: >> >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > umm. isn't that just an incredibly obscure way to write > > def func1(a, b): > return a + b > def func2(c, d): > return c + d > doFoo(func1, func2) > > but with more indentation? If suites were commonly used as above to define properties, event handlers and other callbacks, then I think most people would be able to comprehend what the first example above is doing much more quickly than the second. So, I don't find it obscure for any reason other than because no one does it. 
Also, the two examples above are not exactly the same since the two functions are defined in a separate namespace in the top example. -Brian From python-kbutler at sabaydi.com Tue Apr 19 22:14:54 2005 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Tue Apr 19 22:12:58 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <20050419185608.F19941E4014@bag.python.org> References: <20050419185608.F19941E4014@bag.python.org> Message-ID: <426566BE.60403@sabaydi.com> > > >From: Guido van Rossum > ... >This reflects a style pattern that I've come to appreciate more >recently: when breaking a call with a long argument list to fit on >your screen, instead of trying to find the optimal break points in the >argument list, take one or two of the longest arguments and put them >in local variables. > ... >If we apply this to the anonymous block problem, we may end up finding >lambda the ultimate compromise -- like a gentleman in the back of my >talk last week at baypiggies observed (unfortunately I don't know his >name). > > I like it: Lambda: The Ultimate Compromise (c.f. http://library.readscheme.org/page1.html) kb From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 19 22:14:16 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 19 22:16:11 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: Guido van Rossum wrote: >> What was your opinion on "where" as a lambda replacement? i.e. >> >> foo = bar(callback1, callback2) where: >> def callback1(x): >> print "hello, " >> def callback2(x): >> print "world!" > > I don't recall seeing this proposed, but I might have -- I thought of > pretty much exactly this syntax in the shower a few days ago. Gee, the time machine again! Lots of proposals on c.l.py are based on the introduction of "expression suites", that is, suites embedded in arbitrary expressions.
My opinion is that one will never find a suitable (;-) syntax, there's always the question of where to put the code that follows the suite (and is part of the same statement). yours, Reinhold -- Mail address is perfectly valid! From barry at python.org Tue Apr 19 22:18:41 2005 From: barry at python.org (Barry Warsaw) Date: Tue Apr 19 22:18:47 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <1113941921.14525.39.camel@geddy.wooz.org> On Tue, 2005-04-19 at 15:24, Guido van Rossum wrote: > *If* we're going to create syntax for anonymous blocks, I think the > primary use case ought to be cleanup operations to replace try/finally > blocks for locking and similar things. I'd love to have syntactical > support so I can write > > blahblah(myLock): > code > code > code > > instead of > > myLock.acquire() > try: > code > code > code > finally: > myLock.release() Indeed, it would be very cool to have these kind of (dare I say) block decorators for managing resources. The really nice thing about that is when I have to protect multiple resources in a safe, but clean way inside a single block. Too many nested try/finally's cause you to either get sloppy, or really ugly (or both!). RSMotD (random stupid musing of the day): so I wonder if the decorator syntax couldn't be extended for this kind of thing. @acquire(myLock): code code code -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050419/c2780ce7/attachment.pgp From eric.nieuwland at xs4all.nl Tue Apr 19 22:20:07 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue Apr 19 22:20:10 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Guido van Rossum wrote: > tri = self.subcalculation("The quick brown fox jumps over the lazy > dog") > self.disentangle(0x40, tri, self.indent+1) > > IMO this is clearer, and even shorter! But it clutters the namespace with objects you don't need. So the complete equivalent would be more close to: tri = self.subcalculation("The quick brown fox jumps over the lazy dog") self.disentangle(0x40, tri, self.indent+1) del tri which seems a bit odd to me. > If we apply this to the anonymous block problem, we may end up finding > lambda the ultimate compromise -- like a gentleman in the back of my > talk last week at baypiggies observed (unfortunately I don't know his > name). It wasn't me ;-) It seems this keeps getting back at you. Wish I had thought of this argument before. --eric From gvanrossum at gmail.com Tue Apr 19 22:27:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 22:27:05 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Message-ID: > > IMO this is clearer, and even shorter! > But it clutters the namespace with objects you don't need. Why do people care about cluttering namespaces so much? I thought thats' what namespaces were for -- to put stuff you want to remember for a bit. A function's local namespace in particular seems a perfectly fine place for temporaries. 
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From glyph at divmod.com  Tue Apr 19 22:27:57 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Tue Apr 19 22:28:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: 
Message-ID: <426569CD.1010701@divmod.com>

Guido van Rossum wrote:

> But what exactly are you trying to accomplish here?  I think that
> putting the defs *before* the call (and giving the anonymous blocks
> temporary local names) actually makes the code clearer:

I'm afraid that 'block1', 'block2', and 'doFoo' aren't really making
anything clear for me - can you show a slightly more concrete example?

> def block1(a, b):
>     return a + b
> def block2(c, d):
>     return c + d
> items.doFoo(block1, block2)

Despite being guilty of propagating this style for years myself, I have
to disagree.  Consider the following network-conversation using Twisted
style (which, I might add, would be generalizable to other Twisted-like
systems if they existed ;-)):

    def strawman(self):
        def sayGoodbye(mingleResult):
            def goAway(goodbyeResult):
                self.loseConnection()
            self.send("goodbye").addCallback(goAway)
        def mingle(helloResult):
            self.send("nice weather we're having").addCallback(sayGoodbye)
        self.send("hello").addCallback(mingle)

On the wire, this would look like:

    > hello
    < (response) hello
    > nice weather we're having
    < (response) nice weather we're having
    > goodbye
    < (response) goodbye
    FIN

Note that the temporal order of events here is _exactly backwards_ to
the order of calls in the code, because we have to name everything
before it can happen.
Now, with anonymous blocks (using my own pet favorite syntax, of
course):

    def tinman(self):
        self.send("hello").addCallback(def (x):
            self.send("nice weather we're having").addCallback(def (y):
                self.send("goodbye").addCallback(def (z):
                    self.loseConnection())))

Now, of course, this is written as network I/O because that is my
bailiwick, but you could imagine an identical example with a nested
chain of dialog boxes in a GUI, or a state machine controlling a robot.

For completeness, the same example _can_ be written in the same order
as events actually occur, but it takes twice the number of lines and
ends up creating a silly number of extra names:

    def lion(self):
        d1 = self.send("hello")
        def d1r(x):
            d2 = self.send("nice weather we're having")
            def d2r(y):
                d3 = self.send("goodbye")
                def d3r(z):
                    self.loseConnection()
                d3.addCallback(d3r)
            d2.addCallback(d2r)
        d1.addCallback(d1r)

but this only works if you have a callback-holding object like
Twisted's Deferred.  If you have to pass a callback function as an
argument, as many APIs require, you really have to define the functions
before they're called.

My point here is not that my proposed syntax is particularly great, but
that anonymous blocks are a real win in terms of both clarity and line
count.  I'm glad Guido is giving them a moment in the limelight :).

Should there be a PEP about this?

From gvanrossum at gmail.com  Tue Apr 19 22:33:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Apr 19 22:33:18 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <1113941921.14525.39.camel@geddy.wooz.org>
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: 

> @acquire(myLock):
>     code
>     code
>     code

It would certainly solve the problem of which keyword to use! :-) And
I think the syntax isn't even ambiguous -- the trailing colon
distinguishes this from the function decorator syntax. I guess it
would morph '@xxx' into "user-defined-keyword".

How would acquire be defined?
I guess it could be this, returning a
function that takes a callable as an argument just like other
decorators:

    def acquire(aLock):
        def acquirer(block):
            aLock.acquire()
            try:
                block()
            finally:
                aLock.release()
        return acquirer

and the substitution of

    @EXPR:
        CODE

would become something like

    def __block():
        CODE
    EXPR(__block)

I'm not yet sure whether to love or hate it. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From sabbey at u.washington.edu  Tue Apr 19 22:46:16 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Tue Apr 19 22:46:20 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: 

Guido van Rossum wrote:
>> @acquire(myLock):
>>     code
>>     code
>>     code
>
> It would certainly solve the problem of which keyword to use! :-) And
> I think the syntax isn't even ambiguous -- the trailing colon
> distinguishes this from the function decorator syntax. I guess it
> would morph '@xxx' into "user-defined-keyword".
>
> How would acquire be defined? I guess it could be this, returning a
> function that takes a callable as an argument just like other
> decorators:
>
> def acquire(aLock):
>     def acquirer(block):
>         aLock.acquire()
>         try:
>             block()
>         finally:
>             aLock.release()
>     return acquirer
>
> and the substitution of
>
> @EXPR:
>     CODE
>
> would become something like
>
> def __block():
>     CODE
> EXPR(__block)

Why not have the block automatically be inserted into acquire's argument
list?  It would probably get annoying to have to define inner functions
like that every time one simply wants to use arguments.  For example:

    def acquire(block, aLock):
        aLock.acquire()
        try:
            block()
        finally:
            aLock.release()

    @acquire(myLock):
        code
        code
        code

Of course, augmenting the argument list in that way would be different
than the behavior of decorators as they are now.
-Brian From fredrik at pythonware.com Tue Apr 19 23:06:48 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 23:09:32 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: Message-ID: Brian Sabbey wrote: > If suites were commonly used as above to define properties, event handlers > and other callbacks, then I think most people would be able to comprehend > what the first example above is doing much more quickly than the second. wonderful logic, there. good luck with your future adventures in language design. From fredrik at pythonware.com Tue Apr 19 23:13:14 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 23:14:44 2005 Subject: [Python-Dev] Re: anonymous blocks References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Message-ID: Guido van Rossum wrote: > This reflects a style pattern that I've come to appreciate more > recently: what took you so long? ;-) > Why do people care about cluttering namespaces so much? I thought > thats' what namespaces were for -- to put stuff you want to remember > for a bit. A function's local namespace in particular seems a > perfectly fine place for temporaries. and by naming stuff, you can often eliminate a comment or three. this is python. names are cheap. From gvanrossum at gmail.com Tue Apr 19 23:28:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 23:28:24 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: > Why not have the block automatically be inserted into acquire's argument > list? It would probably get annoying to have to define inner functions > like that every time one simply wants to use arguments. But the number of *uses* would be much larger than the number of "block decorators" you'd be coding. If you find yourself writing new block decorators all the time that's probably a sign you're too much in love with the feature. 
:-)

> For example:
>
> def acquire(block, aLock):
>     aLock.acquire()
>     try:
>         block()
>     finally:
>         aLock.release()
>
> @acquire(myLock):
>     code
>     code
>     code
>
> Of course, augmenting the argument list in that way would be different
> than the behavior of decorators as they are now.

I don't like implicit modifications of argument lists other than by
method calls. It's okay for method calls because in the x.foo(a) <==>
foo(x, a) equivalence, x is really close to the beginning of the
argument list.

And your proposal would preclude parameterless block decorators (or
turn them into an ugly special case), which I think might be quite
useful:

    @forever:
        infinite loop body

    @ignore:
        not executed at all

    @require:
        assertions go here

and so on.

(In essence, we're inventing the opposite of "barewords" in Perl here,
right?)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tcdelaney at optusnet.com.au  Tue Apr 19 23:47:14 2005
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Tue Apr 19 23:47:18 2005
Subject: [Python-Dev] anonymous blocks
References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com>
Message-ID: <006101c54529$59427370$f100a8c0@ryoko>

Guido van Rossum wrote:

> As I said before, I'm not sure why keeping get_foo etc. out of the
> class namespace is such a big deal. In fact, I like having them there
> (sometimes they can even be handy, e.g. you might be able to pass the
> unbound get_foo method as a sort key).

Not to mention that it's possible to override get_foo in subclasses if
done right ...
Two approaches are here: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 Tim Delaney From sabbey at u.washington.edu Tue Apr 19 23:48:01 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 23:48:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: Guido van Rossum wrote: >> Why not have the block automatically be inserted into acquire's argument >> list? It would probably get annoying to have to define inner functions >> like that every time one simply wants to use arguments. > > But the number of *uses* would be much larger than the number of > "block decorators" you'd be coding. If you find yourself writing new > block decorators all the time that's probably a sign you're too much > in love with the feature. :-) Ok, but in explanations of how to use such blocks, they appear about equally often. They will therefore seem more difficult to use than they have to. > I don't like implicit modifications of argument lists other than by > method calls. It's okay for method calls because in the x.foo(a) <==> > foo(x, a) equivalence, x is really close to the beginning of the > argument list. There is a rough equivalence: @foo(x): block <==> @foo(block, x) Of course, the syntax does not allow such an equivalence, but conceptually it's there. To improve the appearance of equivalence, the block could be made the last element in the argument list. > And your proposal would preclude parameterless block decorators (or > turn them into an ugly special case), which I think might be quite > useful: > > @forever: > infinite loop body > > @ignore: > not executed at all > > @require: > assertions go here > > and so on. > > (In essence, we're inventing the opposite of "barewords" in Perl here, right?) I don't understand this. Why not: @forever(): infinite loop body etc.? The same is done with methods: x.foo() (or am I missing something?). 
I actually prefer this because using '()' makes it clear that you are
making a call to 'forever'.  Importantly, 'forever' can throw exceptions
at you.  Without the '()' one does not get this reminder.  I also
believe it is more difficult to read without '()'.  The call to the
function is implicit in the fact that it sits next to '@'.

But, again, if such argument list augmentation were done, something
other than '@' would need to be used so as to not conflict with function
decorator behavior.

-Brian

From jcarlson at uci.edu  Wed Apr 20 00:21:27 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 00:23:32 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: <20050419145855.6391.JCARLSON@uci.edu>

[Guido van Rossum]
> @EXPR:
>     CODE
>
> would become something like
>
> def __block():
>     CODE
> EXPR(__block)
>
> I'm not yet sure whether to love or hate it. :-)

Is it preferable for CODE to execute in its own namespace (the above
being a literal translation of the given code), or for it to execute in
the originally defined namespace?  Deferring to Greg Ewing for a moment
[1]:

    They should be lexically scoped, not dynamically scoped.

Wrapped blocks in an old namespace, I believe, is the way to go,
especially for things like...

    @synchronize(fooLock):
        a = foo.method()

I cannot come up with any code for which CODE executing in its own
namespace makes sense.  Can anyone else?
- Josiah [1] http://mail.python.org/pipermail/python-dev/2005-March/052239.html From sabbey at u.washington.edu Wed Apr 20 00:43:45 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Wed Apr 20 00:43:49 2005 Subject: [Python-Dev] Re: Re: anonymous blocks In-Reply-To: References: Message-ID: Fredrik Lundh wrote: > Brian Sabbey wrote: > >> If suites were commonly used as above to define properties, event handlers >> and other callbacks, then I think most people would be able to comprehend >> what the first example above is doing much more quickly than the second. > > wonderful logic, there. good luck with your future adventures in language > design. > > I'm just trying to help python improve. Maybe I'm not doing a very good job, I don't know. Either way, there's no need to be rude. If I've broken some sort of unspoken code of behavior for this list, then maybe it would be easier if you just 'spoke' it (perhaps in a private email or in the description of this list on python.org). I'm not sure what your point is exactly. Are you saying that any language feature that needs to be commonly used to be comprehendible will never be comprehendible because it will never be commonly used? If so, then I do not think you have a valid point. I never claimed that keyword suites *need* to be commonly used to be comprehendible. I only said that if they were commonly used they would be more comprehendible than the alternative. I happen to also believe that seeing them once or twice is enough to make them about equally as comprehendible as the alternative. 
-Brian From bjourne at gmail.com Wed Apr 20 00:57:05 2005 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed Apr 20 00:57:15 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <1113941921.14525.39.camel@geddy.wooz.org> References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: <740c3aec0504191557505d6e9f@mail.gmail.com> > RSMotD (random stupid musing of the day): so I wonder if the decorator > syntax couldn't be extended for this kind of thing. > > @acquire(myLock): > code > code > code Would it be useful for anything other than mutex-locking? And wouldn't it be better to make a function of the block wrapped in a block-decorator and then use a normal decorator? -- mvh Bj?rn From bac at OCF.Berkeley.EDU Wed Apr 20 01:17:33 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 01:17:42 2005 Subject: [Python-Dev] Proper place to put extra args for building Message-ID: <4265918D.7040700@ocf.berkeley.edu> I am currently adding some code for a Py_COMPILER_DEBUG build for use on the AST branch. I thought that OPT was the proper variable to put stuff like this into for building (``-DPy_COMPILER_DEBUG``), but that erases ``-g -Wall -Wstrict-prototypes``. Obviously I could just tack all of that into my own thing, but that seems like an unneeded step. >From looking at Makefile.pre.in it seems like CFLAGSFORSHARED is meant for extra arguments to the compiler. Is that right? And I will document this in Misc/Specialbuilds.txt to fix my initial blunderous checkin of specifying OPT (or at least clarifying it). -Brett From pje at telecommunity.com Wed Apr 20 01:23:26 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Apr 20 01:19:31 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> At 01:00 PM 04/19/2005 -0700, Guido van Rossum wrote: > > Interestingly, this syntax also works to do decoration, though it's not a > > syntax that was ever proposed for that. e.g.: > > > > foo = classmethod(foo) where: > > def foo(cls,x,y,z): > > # etc. > >This requires you to write foo three times, which defeats at least >half of the purpose of decorators. Well, you could do 'foo = classmethod(x) where: def x(...)', but that *is* kind of kludgy. I'm just suggesting that if 'where:' had existed before decorators, people might have griped about the three-time typing or kludged around it, but there wouldn't likely have been strong support for creating a syntax "just" for decorators. Indeed, if somebody had proposed this syntax during the decorator debates I would have supported it, but of course Bob Ippolito (whose PyObjC use cases involve really long function names) might have disagreed. > > foo = property(get_foo,set_foo) where: > > def get_foo(self): > > # ... > > def set_foo(self): > > # ... > > > > I don't mind @decorators, of course, but maybe they wouldn't be needed > here. > >As I said before, I'm not sure why keeping get_foo etc. out of the >class namespace is such a big deal. That's a relatively minor thing, compared to being able to logically group them with the property, which I think enhances readability, even more than the sometimes-proposed '@property.getter' and '@property.setter' decorators. Anyway, just to be clear, I don't personally think 'where:' is needed in Python 2.x; lambda and decorators suffice for all but the most Twisted use cases. ;) I was just viewing it as a potential alternative to lambda in Py3K. 
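[Summary note: on Tim Delaney's point that get_foo can be overridable
"if done right" -- one way to get that effect without any new syntax is
to defer the method lookup to access time, so the property consults the
instance rather than capturing the function at class-definition time.
This is just one possible approach, not necessarily either recipe from
the link above:]

```python
class Base:
    def get_foo(self):
        return "base"

    # The lambda looks up get_foo on the instance each time the property
    # is read, so a subclass override takes effect without redefining
    # the property itself.
    foo = property(lambda self: self.get_foo())

class Derived(Base):
    def get_foo(self):
        return "derived"
```

Reading `Base().foo` calls `Base.get_foo`, while `Derived().foo` picks
up the override, because the binding is deferred until the attribute is
actually accessed.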
From jjinux at gmail.com Wed Apr 20 01:42:08 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Wed Apr 20 01:42:11 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> Message-ID: I apologize for sparking such debate on this list instead of on c.l.py. By the way, the only reason I brought this up was as a replacement for lambdas in Py3K. Guido, in response to your much earlier comment about supporting "{}" for normal defs as a matter of consistency within my proposal, yes, I agree. Just like ";", you should rarely use them. Best Regards, -jj On 4/19/05, Phillip J. Eby wrote: > At 01:00 PM 04/19/2005 -0700, Guido van Rossum wrote: > > > Interestingly, this syntax also works to do decoration, though it's not a > > > syntax that was ever proposed for that. e.g.: > > > > > > foo = classmethod(foo) where: > > > def foo(cls,x,y,z): > > > # etc. > > > >This requires you to write foo three times, which defeats at least > >half of the purpose of decorators. > > Well, you could do 'foo = classmethod(x) where: def x(...)', but that *is* > kind of kludgy. I'm just suggesting that if 'where:' had existed before > decorators, people might have griped about the three-time typing or kludged > around it, but there wouldn't likely have been strong support for creating > a syntax "just" for decorators. > > Indeed, if somebody had proposed this syntax during the decorator debates I > would have supported it, but of course Bob Ippolito (whose PyObjC use cases > involve really long function names) might have disagreed. > > > > > foo = property(get_foo,set_foo) where: > > > def get_foo(self): > > > # ... > > > def set_foo(self): > > > # ... > > > > > > I don't mind @decorators, of course, but maybe they wouldn't be needed > > here. 
> > > >As I said before, I'm not sure why keeping get_foo etc. out of the > >class namespace is such a big deal. > > That's a relatively minor thing, compared to being able to logically group > them with the property, which I think enhances readability, even more than > the sometimes-proposed '@property.getter' and '@property.setter' decorators. > > Anyway, just to be clear, I don't personally think 'where:' is needed in > Python 2.x; lambda and decorators suffice for all but the most Twisted use > cases. ;) I was just viewing it as a potential alternative to lambda in Py3K. > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jjinux%40gmail.com > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From jack at performancedrivers.com Wed Apr 20 03:01:58 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Wed Apr 20 03:02:03 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: <20050420010158.GA18907@performancedrivers.com> On Tue, Apr 19, 2005 at 01:33:15PM -0700, Guido van Rossum wrote: > > @acquire(myLock): > > code > > code > > code > > It would certainly solve the problem of which keyword to use! :-) And > I think the syntax isn't even ambiguous -- the trailing colon > distinguishes this from the function decorator syntax. I guess it > would morph '@xxx' into "user-defined-keyword". > > How would acquire be defined? I guess it could be this, returning a > function that takes a callable as an argument just like other > decorators: [snip] > and the substitution of > > @EXPR: > CODE > > would become something like > > def __block(): > CODE > EXPR(__block) > > I'm not yet sure whether to love or hate it. 
:-) > I don't know what the purpose of these things is, but I do think they should be like something else to avoid learning something new. Okay, I lied, I do know what these are: "namespace decorators" Namespaces are currently modules or classes, and decorators currently apply only to functions. The dissonance is that function bodies are evaluated later and namespaces (modules and classes) are evaluated immediately. I don't know if adding a namespace that is only evaluated later makes sense. It is only an extra case but it is one extra case to remember. At best I have only channeled Guido once, and by accident[1] so I'll stay out of the specifics (for a bit). -jackdied [1] during the decorator syntax bru-ha-ha at a Boston PIG meeting I suggested Guido liked the decorator-before-function because it made more sense in Dutch. I was kidding, but someone who knows a little Dutch (Deibel?) stated this was, in fact, the case. From michael.walter at gmail.com Wed Apr 20 03:55:52 2005 From: michael.walter at gmail.com (Michael Walter) Date: Wed Apr 20 03:55:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <740c3aec0504191557505d6e9f@mail.gmail.com> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> Message-ID: <877e9a170504191855445e0f4d@mail.gmail.com> On 4/19/05, BJ?rn Lindqvist wrote: > > RSMotD (random stupid musing of the day): so I wonder if the decorator > > syntax couldn't be extended for this kind of thing. > > > > @acquire(myLock): > > code > > code > > code > > Would it be useful for anything other than mutex-locking? And wouldn't > it be better to make a function of the block wrapped in a > block-decorator and then use a normal decorator? Yes. Check how blocks in Smalltalk and Ruby are used for starters. 
Regards,
Michael

From aleaxit at yahoo.com  Wed Apr 20 04:07:45 2005
From: aleaxit at yahoo.com (Alex Martelli)
Date: Wed Apr 20 04:07:50 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <740c3aec0504191557505d6e9f@mail.gmail.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<740c3aec0504191557505d6e9f@mail.gmail.com>
Message-ID: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>

On Apr 19, 2005, at 15:57, Björn Lindqvist wrote:

>> RSMotD (random stupid musing of the day): so I wonder if the decorator
>> syntax couldn't be extended for this kind of thing.
>>
>> @acquire(myLock):
>>     code
>>     code
>>     code
>
> Would it be useful for anything other than mutex-locking? And wouldn't

Well, one obvious use might be, say:

    @withfile('foo.bar', 'r'):
        content = thefile.read()

but that would require the decorator and block to be able to interact
in some way, so that inside the block 'thefile' is defined suitably.

> it be better to make a function of the block wrapped in a
> block-decorator and then use a normal decorator?

From a viewpoint of namespaces, I think it would be better to have the
block execute in the same namespace as the code surrounding it, not a
separate one (assigning to 'content' would not work otherwise), so a
nested function would not be all that useful.  The problem might be,
how does the _decorator_ affect that namespace.  Perhaps:

    def withfile(filename, mode='r'):
        openfile = open(filename, mode)
        try:
            block(thefile=openfile)
        finally:
            openfile.close()

i.e., let the block take keyword arguments to tweak its namespace (but
assignments within the block should still affect its _surrounding_
namespace, it seems to me...).
Alex From shane at hathawaymix.org Wed Apr 20 06:31:36 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed Apr 20 06:31:38 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <4265DB28.8050905@hathawaymix.org> Fredrik Lundh wrote: > Brian Sabbey wrote: >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > > umm. isn't that just an incredibly obscure way to write > > def func1(a, b): > return a + b > def func2(c, d): > return c + d > doFoo(func1, func2) > > but with more indentation? Brian's suggestion makes the code read more like an outline. In Brian's example, the high-level intent stands out from the details, while in your example, there is no visual cue that distinguishes the details from the intent. Of course, lambdas are even better, when it's possible to use them: doFoo((lambda a, b: a + b), (lambda c, d: c + d)) Shane From jcarlson at uci.edu Wed Apr 20 06:39:32 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Apr 20 06:42:43 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <877e9a170504191855445e0f4d@mail.gmail.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> Message-ID: <20050419212423.63AD.JCARLSON@uci.edu> Michael Walter wrote: > > On 4/19/05, BJ?rn Lindqvist wrote: > > > RSMotD (random stupid musing of the day): so I wonder if the decorator > > > syntax couldn't be extended for this kind of thing. > > > > > > @acquire(myLock): > > > code > > > code > > > code > > > > Would it be useful for anything other than mutex-locking? And wouldn't > > it be better to make a function of the block wrapped in a > > block-decorator and then use a normal decorator? > > Yes. Check how blocks in Smalltalk and Ruby are used for starters. 
See the previous two discussions on thunks here on python-dev, and
notice how the only problems that seem bettered via blocks/thunks /in
Python/ are those which are of the form...

    #setup
    try:
        block
    finally:
        #finalization

... and depending on the syntax, properties.

I once asked "Any other use cases for one of the most powerful features
of Ruby, in Python?"  I have yet to hear any sort of reasonable
response.

Why am I getting no response to my question?  Either it is because I am
being ignored, or no one has taken the time to translate one of these
'killer features' from Smalltalk or Ruby, or perhaps such translations
show that there is a better way in Python already.

Now, don't get me wrong, I have more than a few examples of the
try/finally block in my code, so I would personally find it useful, but
just because this one pattern is made easier, doesn't mean that it
should see syntax.

 - Josiah

P.S. If I'm sounding like a broken record to you, don't be surprised.
But until my one question is satisfactorily answered, I'll keep poking
at its soft underbelly.

From shane.holloway at ieee.org  Wed Apr 20 07:10:20 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Wed Apr 20 07:10:55 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: 
Message-ID: <4265E43C.4080707@ieee.org>

> *If* we're going to create syntax for anonymous blocks, I think the
> primary use case ought to be cleanup operations to replace try/finally
> blocks for locking and similar things. I'd love to have syntactical
> support so I can write

I heartily agree!  Especially when you have very similar try/finally
code you use in many places, and wish to refactor it into a common
area.
If this is done, you are forced into a callback form like follows::

    def withFile(filename, callback):
        aFile = open(filename, 'r')
        try:
            result = callback(aFile)
        finally:
            aFile.close()
        return result

    class Before:
        def readIt(self, filename):
            def doReading(aFile):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)
            withFile(filename, doReading)

Which is certainly functional.  I actually use the idiom frequently.
However, my opinion is that it does not read smoothly.  This form
requires that I say what I'm doing with something before I know the
context of what that something is.  For me, blocks are not about
shortening the code, but rather clarifying *intent*.

With this proposed change, the code becomes::

    class After:
        def readIt(self, filename):
            withFile(filename):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)

In my opinion, this is much smoother to read.

This particular example brings up the question of how arguments like
"aFile" get passed and named into the block.  I anticipate the need for
a place to put an argument declaration list.  ;) And no, I'm not
particularly fond of Smalltalk's solution with "| aFile |", but that's
just another opinion of aesthetics.

Another set of questions arose for me when Barry started musing over
the combination of blocks and decorators.  What are blocks?  Well,
obviously they are callable.  What do they return?  The local namespace
they created/modified?

How do blocks work with control flow statements like "break",
"continue", "yield", and "return"?  I think these questions have good
answers, we just need to figure out what they are.  Perhaps "break" and
"continue" raise exceptions similar to StopIteration in this case?

As to the control flow questions, I believe those answers depend on how
the block is used.  Perhaps a few different invocation styles are
applicable.
For instance, the method block.suite() could return a tuple such as
(returnedValue, locals()), where block.__call__() would simply return
like any other callable.  It would be good to figure out what the
control flow difference is between::

    def readAndReturn(self, filename):
        withFile(filename):
            a = self.readPartA(aFile)
            b = self.readPartB(aFile)
            c = self.readPartC(aFile)
            return (a, b, c)

and::

    def readAndReturn(self, filename):
        withFile(filename):
            a = self.readPartA(aFile)
            b = self.readPartB(aFile)
            c = self.readPartC(aFile)
        return (a, b, c)

Try it with yield to further vex the puzzle.  ;)

Thanks for your time!
-Shane Holloway

From steven.bethard at gmail.com  Wed Apr 20 07:23:56 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed Apr 20 07:23:59 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<740c3aec0504191557505d6e9f@mail.gmail.com>
	<560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>
Message-ID:

On 4/19/05, Alex Martelli wrote:
> Well, one obvious use might be, say:
>
>     @withfile('foo.bar', 'r'):
>         content = thefile.read()
>
> but that would require the decorator and block to be able to interact
> in some way, so that inside the block 'thefile' is defined suitably.
>
> > it be better to make a function of the block wrapped in a
> > block-decorator and then use a normal decorator?
>
> From a viewpoint of namespaces, I think it would be better to have the
> block execute in the same namespace as the code surrounding it, not a
> separate one (assigning to 'content' would not work otherwise), so a
> nested function would not be all that useful.  The problem might be,
> how does the _decorator_ affect that namespace.
> Perhaps:
>
>     def withfile(filename, mode='r'):
>         openfile = open(filename, mode)
>         try:
>             block(thefile=openfile)
>         finally:
>             openfile.close()
>
> i.e., let the block take keyword arguments to tweak its namespace (but
> assignments within the block should still affect its _surrounding_
> namespace, it seems to me...).

I'm not a big fan of this means of tweaking the block's namespace.  It
means that if you use a "block decorator", you might find that names
have been 'magically' added to your namespace.  This has a bad code
smell of too much implicitness to me...

I believe this was one of the reasons Brian Sabbey's proposal looked
something like:

    do <unpack-list> in <expression>:
        <block>

This way you could write the block above as something like:

    def withfile(filename, mode='r'):
        def _(block):
            openfile = open(filename, mode)
            try:
                block(openfile)
            finally:
                openfile.close()
        return _

    do thefile in withfile('foo.bar', 'r'):
        content = thefile.read()

where 'thefile' is explicitly named in the do/in-statement's unpack
list.  Personally, I found the 'do' and 'in' keywords very confusing,
but I do like the fact that the parameters passed to the thunk/block
are expanded in an explicit unpack list.  Using @, I don't see an easy
way to insert such an unpack list...

Of course, even with the unpack list, you still have to know what kind
of arguments the function calls your block with.  And because these
only appear within the code, e.g.

    block(openfile)

you can't rely on easily accessible things like the function's
signature.  It means that unlike other callables that can basically
document parameters and return type, "block decorators" would have to
document parameters, return type, and the parameters with which they
call the block...

STeVe

--
You can wordify anything if you just verb it.
    --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de  Wed Apr 20 08:27:34 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed Apr 20 08:27:37 2005
Subject: [Python-Dev] Proper place to put extra args for building
In-Reply-To: <4265918D.7040700@ocf.berkeley.edu>
References: <4265918D.7040700@ocf.berkeley.edu>
Message-ID: <4265F656.4020305@v.loewis.de>

Brett C. wrote:
> I am currently adding some code for a Py_COMPILER_DEBUG build for use
> on the AST branch.  I thought that OPT was the proper variable to put
> stuff like this into for building (``-DPy_COMPILER_DEBUG``), but that
> erases ``-g -Wall -Wstrict-prototypes``.  Obviously I could just tack
> all of that into my own thing, but that seems like an unneeded step.

Actually, this step is needed.

> From looking at Makefile.pre.in it seems like CFLAGSFORSHARED is meant
> for extra arguments to the compiler.  Is that right?

No.  This is the set of flags to be passed to the compiler when
compiling with --enable-shared.  It is set in configure.in.

It might be reasonable to add a variable that will just take additional
compiler flags, and never be modified in configure.

Regards,
Martin

From fredrik at pythonware.com  Wed Apr 20 08:47:51 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 20 08:48:16 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID:

Josiah Carlson wrote:

> See the previous two discussions on thunks here on python-dev, and
> notice how the only problems that seem bettered via blocks/thunks /in
> Python/ are those which are of the form...
>
>     #setup
>     try:
>         block
>     finally:
>         #finalization
>
> ... and depending on the syntax, properties.  I once asked "Any other
> use cases for one of the most powerful features of Ruby, in Python?"
> I have yet to hear any sort of reasonable response.
>
> Why am I getting no response to my question?  Either it is because I am
> being ignored, or no one has taken the time to translate one of these
> 'killer features' from Smalltalk or Ruby, or perhaps such translations
> show that there is a better way in Python already.

for my purposes, I've found that the #1 callback killer in contemporary
Python is for-in:s support for the iterator protocol: instead of

    def callback(x):
        code

    dosomething(callback)

or with the "high-level intent"-oriented syntax:

    dosomething(**):
        def libraryspecifiedargumentname(x):
            code

I simply write

    for x in dosomething():
        code

and get shorter code that runs faster.  (see cElementTree's iterparse
for an excellent example.  for typical use cases, it's nearly three
times faster than pyexpat, which is the fastest callback-based XML
parser we have)

unfortunately,

    def do():
        print "setup"
        try:
            yield None
        finally:
            print "tear down"

doesn't quite work (if it did, all you would need is syntactic sugar
for "for dummy in").

PS. a side effect of the for-in pattern is that I'm beginning to feel
that Python might need a nice "switch" statement based on dictionary
lookups, so I can replace multiple callbacks with a single loop body,
without writing too many if/elif clauses.

From fredrik at pythonware.com  Wed Apr 20 08:55:03 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 20 08:55:24 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <4265DB28.8050905@hathawaymix.org>
Message-ID:

Shane Hathaway wrote:

> Brian's suggestion makes the code read more like an outline.  In
> Brian's example, the high-level intent stands out from the details

that assumes that when you call a library function, the high-level
intent of *your* code is obvious from the function name in the library,
and to some extent, by the argument names chosen by the library
implementor.  I'm not so sure that's always a valid assumption.
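[On the "switch" aside in Fredrik's PS: the dictionary-lookup dispatch
he describes can already be spelled as a plain dict of callables.  A
minimal sketch — the handler names and event tuples are illustrative,
not taken from any proposal:]

```python
def handle_start(data):
    # illustrative handler for a 'start' event
    return "start:" + data

def handle_end(data):
    return "end:" + data

def handle_default(data):
    # fallback for event names with no registered handler
    return "unknown:" + data

# the dict plays the role of a switch statement: one lookup replaces
# an if/elif chain over event names
dispatch = {
    "start": handle_start,
    "end": handle_end,
}

def process(events):
    results = []
    for name, data in events:
        # dict.get with a default callable acts as the 'else' clause
        handler = dispatch.get(name, handle_default)
        results.append(handler(data))
    return results
```

The single loop body stays fixed; adding a case is just another dict
entry rather than another elif.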
> while in your example, there is no visual cue that distinguishes the
> details from the intent.

carefully chosen function names (that you chose yourself) plus blank
lines can help with that.

> Of course, lambdas are even better, when it's possible to use them:
>
>     doFoo((lambda a, b: a + b), (lambda c, d: c + d))

that only tells you that you're calling "doFoo", with no clues
whatsoever to what the code in the lambdas is doing.  keyword arguments
are a step up from that, as long as your intent matches the library
writer's intent.

From jjinux at gmail.com  Wed Apr 20 10:01:55 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed Apr 20 10:01:58 2005
Subject: [Python-Dev] Re: anonymous blocks (off topic: match)
In-Reply-To:
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID:

> PS. a side effect of the for-in pattern is that I'm beginning to feel
> that Python might need a nice "switch" statement based on dictionary
> lookups, so I can replace multiple callbacks with a single loop body,
> without writing too many if/elif clauses.

That's funny.  I keep wondering if "match" from the ML world would make
sense in Python.  I keep thinking it'd be a really nice thing to have.

-jj

--
I have decided to switch to Gmail, but messages to my Yahoo account
will still get through.

From p.f.moore at gmail.com  Wed Apr 20 11:43:32 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed Apr 20 11:43:34 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: <79990c6b05042002435ce91e79@mail.gmail.com>

On 4/19/05, Brian Sabbey wrote:
> Guido van Rossum wrote:
> >> @acquire(myLock):
> >>     code
> >>     code
> >>     code
> >
> > It would certainly solve the problem of which keyword to use! :-) And
> > I think the syntax isn't even ambiguous -- the trailing colon
> > distinguishes this from the function decorator syntax.
> > I guess it would morph '@xxx' into "user-defined-keyword".

Hmm, this looks to me like a natural extension of decorators.  Whether
that is a good or a bad thing, I'm unable to decide :-)  [I can think
of a number of uses for it, PEP 310-style with-blocks being one, but I
can't decide if "lots of potential uses" is too close to "lots of
potential for abuse" :-)]

> > How would acquire be defined?  I guess it could be this, returning a
> > function that takes a callable as an argument just like other
> > decorators:
> >
> >     def acquire(aLock):
> >         def acquirer(block):
> >             aLock.acquire()
> >             try:
> >                 block()
> >             finally:
> >                 aLock.release()
> >         return acquirer

It really has to be this, IMO, otherwise the parallel with decorators
becomes confusing, rather than helpful.

> > and the substitution of
> >
> >     @EXPR:
> >         CODE
> >
> > would become something like
> >
> >     def __block():
> >         CODE
> >     EXPR(__block)

The question of whether assignments within CODE are executed within a
new namespace, as this implies, or in the surrounding namespace,
remains open.  I can see both as reasonable (new namespace = easier to
describe/understand, more in line with decorators, probably far easier
to implement; surrounding namespace = probably more useful/practical...)

> Why not have the block automatically be inserted into acquire's
> argument list?  It would probably get annoying to have to define inner
> functions like that every time one simply wants to use arguments.

If this syntax is to be considered, in my view it *must* follow
established decorator practice - and that includes the
define-an-inner-function-and-return-it idiom.

> Of course, augmenting the argument list in that way would be different
> than the behavior of decorators as they are now.

Exactly.

Paul.
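[The acquire() definition and the @EXPR: translation quoted above can
be exercised today by writing the __block function out by hand.  A
runnable sketch using threading.Lock — the explicit def stands in for
the anonymous block, and the list is just there to observe the lock
state:]

```python
import threading

def acquire(aLock):
    # returns a function that takes a callable, just like other decorators
    def acquirer(block):
        aLock.acquire()
        try:
            block()
        finally:
            aLock.release()
    return acquirer

myLock = threading.Lock()
observed = []

# manual expansion of the proposed "@acquire(myLock): CODE" form:
# define the block, then apply EXPR to it
def __block():
    # CODE runs while the lock is held
    observed.append(myLock.locked())

acquire(myLock)(__block)
# once the block finishes, the finally clause has released the lock
```

This is exactly the define-an-inner-function-and-return-it shape the
decorator parallel requires; the proposed syntax would only remove the
boilerplate def.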
From flaig at sanctacaris.net Wed Apr 20 13:07:36 2005 From: flaig at sanctacaris.net (flaig@sanctacaris.net) Date: Wed Apr 20 13:07:46 2005 Subject: [Python-Dev] Re: anonymous blocks Message-ID: <200504201107.j3KB7a0G016148@ger5.wwwserver.net> I guess I should begin by introducing myself: My name is Rüdiger Flaig, I live in Heidelberg/Germany (yes indeed, there are not only tourists there) and am a JOAT by profession (Jack Of All Trades). Among other weird things, I am currently teaching immunology and bioinformatics at the once-famous University of Heidelberg. Into this little secluded world of ours, so far dominated by rigid C++ stalwarts, I have successfully introduced Python! I have been lurking on this list for quite a while, interested to watch the further development of the streaked reptile. As students keep on asking me about the differences between languages and the pros and cons, I think I may claim some familiarity with other languages too, especially Python's self-declared antithesis, Ruby. The recent discussion about anonymous blocks immediately brought Ruby to my mind once more, since -- as you will know -- Ruby does have ABs, and rubynos are very proud of them, as they are generally of their more "flexible" program structure. However, I have seen lots of Ruby code and do not really feel that this contributes in any way to the expressiveness of the language. Lambdas are handy for very microscopic matters, but in general I think that one of Python's greatest strengths is the way in which its rather rigid layout combines with the overall approach to force coders to disentangle complex operations. So I cannot really see any benefit in ABs... Just the 0.02 of a serpent lover, but maybe someone's interested in hearing something like an outsider's opinion. Cheers, Rüdiger === Chevalier Dr. Dr. 
Ruediger Marcus Flaig
Institute for Immunology, University of Heidelberg
Im Neuenheimer Feld 305, D-69120 Heidelberg, FRG

"Drain you of your sanity,
 Face the Thing That Should Not Be."

--
This e-mail was sent with http://www.mail-inspector.de
Mail Inspector is a free service of http://www.is-fun.net
The sender of this e-mail had the IP: 129.206.124.135

From Michaels at rd.bbc.co.uk  Wed Apr 20 14:24:05 2005
From: Michaels at rd.bbc.co.uk (Michael Sparks)
Date: Wed Apr 20 14:30:34 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References:
Message-ID: <200504201324.05551.Michaels@rd.bbc.co.uk>

On Tuesday 19 Apr 2005 20:24, Guido van Rossum wrote:
..
> *If* we're going to create syntax for anonymous blocks, I think the
> primary use case ought to be cleanup operations to replace try/finally
> blocks for locking and similar things.  I'd love to have syntactical
> support so I can write
>
>     blahblah(myLock):
>         code
>         code
>         code

I've got a basic parser that I wrote last summer which was an
experiment in a generic "python-esque" parser.  It might be useful for
playing with these things since it accepted the above syntax without
change, among many others, happily.  (Added as a 38th test, and
probably the sixth "language syntax" it understands)  It's also
entirely keyword free, which strikes me as a novelty.

The abstract syntax tree that's generated for it is rather unwieldy and
over the top, but that's as far as the parser goes.
(As I said I was interested in a generic parser, not a language :)

The (entire) grammar resulting was essentially this: (and is LR-parsable)

    program -> block
    block -> BLOCKSTART statement_list BLOCKEND
    statement_list -> statement*
    statement -> (expression | expression ASSIGNMENT expression | ) EOL
    expression -> oldexpression (COMMA expression)*
    oldexpression -> (factor [factorlist] | factor INFIXOPERATOR expression )
    factorlist -> factor* factor
    factor -> ( bracketedexpression | constructorexpression | NUMBER
              | STRING | ID | factor DOT dotexpression | factor trailer
              | factor trailertoo )
    dotexpression -> (ID bracketedexpression | factor )
    bracketedexpression -> BRA [ expression ] KET
    constructorexpression -> BRA3 [ expression ] KET3
    trailer -> BRA2 expression KET2
    trailertoo -> COLON EOL block

The parser.out file for the curious is here:
   * http://www.cerenity.org/SWP/parser.out (31 productions)

The parser uses a slightly modified PLY based parser and might be
useful for playing around with constructs (Might not, but it's the
reason I'm mentioning it).

The approach taken is to treat ":" as always starting a code block to
be passed.  The first token on the line is treated as a function name.
The idea was that "def", "class", "if", etc then become simple function
calls that get various arguments which may include one or more code
blocks.

The parser was also written entirely test first (as an experiment to
see what that's like for writing a parser) and includes a variety of
sample programs that pass.
(39 different program tests)

I've put a tarball here:
   * http://www.cerenity.org/SWP-0.0.0.tar.gz
     (includes the modified version of PLY)
   * Also browsable here: http://www.cerenity.org/SWP/
   * Some fun examples:
     * Python-like: http://www.cerenity.org/SWP/progs/expr_29.p
       (this example is an earlier version of the parser)
     * LOGO like: http://www.cerenity.org/SWP/progs/expr_33.p
     * L-System definition: http://www.cerenity.org/SWP/progs/expr_34.p
     * SML-like: http://www.cerenity.org/SWP/progs/expr_35.p
     * Amiga E/Algol like: http://www.cerenity.org/SWP/progs/expr_37.p

Needs the modified version of PLY installed first, and the tests can be
run using "runtests.sh".

Provided in case people want to play around with something; I'm happy
with the language as it is. :-)

Best Regards,


Michael,
--
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks@rd.bbc.co.uk, British Broadcasting Corporation,
Research and Development, Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From foom at fuhm.net  Wed Apr 20 16:29:48 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed Apr 20 16:30:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <79990c6b05042002435ce91e79@mail.gmail.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<79990c6b05042002435ce91e79@mail.gmail.com>
Message-ID: <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>

On Apr 20, 2005, at 5:43 AM, Paul Moore wrote:
>>> and the substitution of
>>>
>>>     @EXPR:
>>>         CODE
>>>
>>> would become something like
>>>
>>>     def __block():
>>>         CODE
>>>     EXPR(__block)
>
> The question of whether assignments within CODE are executed within a
> new namespace, as this implies, or in the surrounding namespace,
> remains open.  I can see both as reasonable (new namespace = easier to
> describe/understand, more in line with decorators, probably far easier
> to implement; surrounding namespace = probably more
> useful/practical...)
If it were possible to assign to a variable bound outside your
function, but still in your lexical scope, I think it would fix this
issue.  That's always something I've thought should be possible,
anyways.  I propose to make it possible via a declaration similar to
'global'.  E.g. (stupid example, but it demonstrates the syntax):

    def f():
        count = 0
        def addCount():
            lexical count
            count += 1
        assert count == 0
        addCount()
        assert count == 1

Then, there's two choices for the block decorator: either automatically
mark all variable names in the immediately surrounding scope "lexical",
or don't.  Both of those choices are still consistent with the block
just being a "normal function", which I think is an important attribute.

James

From aahz at pythoncraft.com  Wed Apr 20 17:15:07 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Apr 20 17:15:09 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4265E43C.4080707@ieee.org>
References: <4265E43C.4080707@ieee.org>
Message-ID: <20050420151506.GA1285@panix.com>

On Tue, Apr 19, 2005, Shane Holloway (IEEE) wrote:
>
> I heartily agree!  Especially when you have very similar try/finally
> code you use in many places, and wish to refactor it into a common
> area.  If this is done, you are forced into a callback form like
> follows::
>
>     def withFile(filename, callback):
>         aFile = open(filename, 'r')
>         try:
>             result = callback(aFile)
>         finally:
>             aFile.close()
>         return result
>
>     class Before:
>         def readIt(self, filename):
>             def doReading(aFile):
>                 self.readPartA(aFile)
>                 self.readPartB(aFile)
>                 self.readPartC(aFile)
>             withFile(filename, doReading)
>
> Which is certainly functional.  I actually use the idiom frequently.
> However, my opinion is that it does not read smoothly.  This form
> requires that I say what I'm doing with something before I know the
> context of what that something is.  For me, blocks are not about
> shortening the code, but rather clarifying *intent*.

Hmmmm....
How is this different from defining functions before they're called?
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From aahz at pythoncraft.com  Wed Apr 20 17:18:11 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Apr 20 17:18:14 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
References: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
Message-ID: <20050420151811.GB1285@panix.com>

On Wed, Apr 20, 2005, flaig@sanctacaris.net wrote:
>
> As students keep on asking me about the differences between languages
> and the pros and cons, I think I may claim some familiarity with
> other languages too, especially Python's self-declared antithesis,
> Ruby.

That seems a little odd to me.  To the extent that Python has an
antithesis, it would be either C++ or Perl.  Ruby is antithetical to
some of Python's core ideology because it borrows from Perl, but Ruby
is much more similar to Python than Perl is.
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From amk at amk.ca  Wed Apr 20 17:53:14 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Wed Apr 20 17:54:32 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050420151811.GB1285@panix.com>
References: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
	<20050420151811.GB1285@panix.com>
Message-ID: <20050420155314.GA18070@rogue.amk.ca>

On Wed, Apr 20, 2005 at 08:18:11AM -0700, Aahz wrote:
> antithesis, it would be either C++ or Perl.
> Ruby is antithetical to some
> of Python's core ideology because it borrows from Perl, but Ruby is
> much more similar to Python than Perl is.

I'm not that familiar with the Ruby community; might it be that they
consider Ruby to be Python's antithesis, in that it returns to
bracketing instead of Python's indentation?

--amk

From pedronis at strakt.com  Wed Apr 20 18:23:01 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Wed Apr 20 18:23:08 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To:
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID: <426681E5.8050203@strakt.com>

> def do():
>     print "setup"
>     try:
>         yield None
>     finally:
>         print "tear down"
>
> doesn't quite work (if it did, all you would need is syntactic sugar
> for "for dummy in").

PEP 325 is about that

From shane.holloway at ieee.org  Wed Apr 20 18:32:17 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Wed Apr 20 18:32:54 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <20050420151506.GA1285@panix.com>
References: <4265E43C.4080707@ieee.org> <20050420151506.GA1285@panix.com>
Message-ID: <42668411.8090907@ieee.org>

Aahz wrote:
> On Tue, Apr 19, 2005, Shane Holloway (IEEE) wrote:
>> However, my opinion is that it does not read smoothly.  This form
>> requires that I say what I'm doing with something before I know the
>> context of what that something is.  For me, blocks are not about
>> shortening the code, but rather clarifying *intent*.
>
> Hmmmm....  How is this different from defining functions before
> they're called?

It's not.  In a function scope I'd prefer to read top-down.  When I
write classes, I tend to put the public methods at the top.  Utility
methods used by those entry points are placed toward the bottom.  In
this way, I read the context of what I'm doing first, and then the
details of the internal methods as I need to understand them.
Granted I could achieve this effect with::

    class Before:
        def readIt(self, filename):
            def readIt():
                withFile(filename, doReading)

            def doReading(aFile):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)

            return readIt()

Which is fine with me, but the *intent* is more obfuscated than what
the block construct offers.  And I don't think my crew would appreciate
if I did this very often.  ;)

From jcarlson at uci.edu  Wed Apr 20 18:41:33 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 18:44:08 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426681E5.8050203@strakt.com>
References: <426681E5.8050203@strakt.com>
Message-ID: <20050420094054.63B3.JCARLSON@uci.edu>

Samuele Pedroni wrote:
> > def do():
> >     print "setup"
> >     try:
> >         yield None
> >     finally:
> >         print "tear down"
> >
> > doesn't quite work (if it did, all you would need is syntactic sugar
> > for "for dummy in").
>
> PEP 325 is about that

PEP 288 can be used like that.

- Josiah

From jcarlson at uci.edu  Wed Apr 20 19:19:20 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 19:22:10 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To:
References: <20050419212423.63AD.JCARLSON@uci.edu>
Message-ID: <20050420084329.63B0.JCARLSON@uci.edu>

"Fredrik Lundh" wrote:
>
> Josiah Carlson wrote:
>
> > See the previous two discussions on thunks here on python-dev, and
> > notice how the only problems that seem bettered via blocks/thunks
> > /in Python/ are those which are of the form...
> >
> >     #setup
> >     try:
> >         block
> >     finally:
> >         #finalization
> >
> > ... and depending on the syntax, properties.  I once asked "Any
> > other use cases for one of the most powerful features of Ruby, in
> > Python?"  I have yet to hear any sort of reasonable response.
> >
> > Why am I getting no response to my question?
> > Either it is because I am
> > being ignored, or no one has taken the time to translate one of
> > these 'killer features' from Smalltalk or Ruby, or perhaps such
> > translations show that there is a better way in Python already.
>
> for my purposes, I've found that the #1 callback killer in
> contemporary Python is for-in:s support for the iterator protocol:

...

> and get shorter code that runs faster.  (see cElementTree's iterparse
> for an excellent example.  for typical use cases, it's nearly three
> times faster than pyexpat, which is the fastest callback-based XML
> parser we have)

It seems as though you are saying that because callbacks are so slow,
that blocks are a non-starter for you because of how slow it would be
to call them.  I'm thinking that if people get correct code easier,
that speed will not be as much of a concern (that's why I use Python
already).

With that said, both blocks and iterators make /writing/ such things
easier to understand, but neither really makes /reading/ much easier.
Sure, it is far more terse, but that doesn't mean it is easier to read
and understand what is going on.  Which would people prefer?

    @a(l):
        code

or

    l.acquire()
    try:
        code
    finally:
        l.release()

> unfortunately,
>
>     def do():
>         print "setup"
>         try:
>             yield None
>         finally:
>             print "tear down"
>
> doesn't quite work (if it did, all you would need is syntactic sugar
> for "for dummy in").

The use of 'for dummy in ...' would be sufficient to notify everyone.
If 'dummy' is too long, there is always '_'.  This kind of thing solves
the common case of setup/finalization, albeit in a
not-so-obvious-to-an-observer mechanism, which was recently loathed by
a nontrivial number of python-dev posters (me being one).

Looking at it again, a month or so later, I don't know.  It does solve
the problem, but it introduces a semantic where iteration is used for
something that is not really iteration.
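[The quoted do() generator stopped being illegal with Python 2.5, which
permits yield inside try/finally and ships contextlib.contextmanager as
exactly this kind of sugar.  A sketch assuming a 2.5-or-later
interpreter, with the prints replaced by list appends so the ordering
is observable:]

```python
from contextlib import contextmanager

trace = []

@contextmanager
def do():
    trace.append("setup")
    try:
        yield None          # the block body runs at this point
    finally:
        trace.append("tear down")

# the with statement plays the role of the hypothetical
# "syntactic sugar for 'for dummy in'"
with do():
    trace.append("block")
```

The finally clause is guaranteed to run even if the block body raises,
which is the generator-finalization property under discussion.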
Regardless, I believe that solving generator finalization (calling all
enclosing finally blocks in the generator) is a worthwhile problem to
solve.  Whether that be by PEP 325, 288, 325+288, etc., that should be
discussed.  Whether people use it as a pseudo-block, or decide that
blocks are further worthwhile, I suppose we could wait and see.

> PS. a side effect of the for-in pattern is that I'm beginning to feel
> that Python might need a nice "switch" statement based on dictionary
> lookups, so I can replace multiple callbacks with a single loop body,
> without writing too many if/elif clauses.

If I remember correctly, Raymond was working on a peephole optimization
that automatically translated if/elif/else clauses to a dictionary
lookup when the objects were hashable and only the == operator was
used.  I've not heard anything about it in over a month, but then
again, I've not finished the implementation of an alternate import
semantic either.

 - Josiah

From tim.peters at gmail.com  Wed Apr 20 20:10:58 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Apr 20 20:11:03 2005
Subject: [Python-Dev] Newish test failures
Message-ID: <1f7befae0504201110188425c6@mail.gmail.com>

Seeing three seemingly related test failures today, on CVS HEAD:

test_csv
test test_csv failed -- errors occurred; run in verbose mode for details

test_descr
test test_descr crashed -- exceptions.AttributeError: attribute
'__dict__' of 'type' objects is not writable

test_file
test test_file crashed -- exceptions.AttributeError: attribute 'closed'
of 'file' objects is not writable

3 tests failed:
    test_csv test_descr test_file

Drilling into test_csv:

ERROR: test_reader_attrs (test.test_csv.Test_Csv)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Code\python\lib\test\test_csv.py", line 62, in test_reader_attrs
self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable ====================================================================== ERROR: test_writer_attrs (test.test_csv.Test_Csv) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_csv.py", line 65, in test_writer_attrs self._test_default_attrs(csv.writer, StringIO()) File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable From fredrik at pythonware.com Wed Apr 20 20:32:27 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Apr 20 20:34:17 2005 Subject: [Python-Dev] Re: Newish test failures References: <1f7befae0504201110188425c6@mail.gmail.com> Message-ID: > File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs > self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') > File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises > callableObj(*args, **kwargs) > AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable looks like someone didn't run the test suite... From: bwarsaw@users.sourceforge.net Subject: python/dist/src/Objects descrobject.c,2.38,2.39 ... As per discussion on python-dev, descriptors defined in C with a NULL setter now raise AttributeError instead of TypeError, for consistency with their pure-Python equivalent. ... 
From barry at python.org  Wed Apr 20 21:11:30 2005
From: barry at python.org (Barry Warsaw)
Date: Wed Apr 20 21:11:39 2005
Subject: [Python-Dev] Re: Newish test failures
In-Reply-To:
References: <1f7befae0504201110188425c6@mail.gmail.com>
Message-ID: <1114024290.10439.130.camel@geddy.wooz.org>

On Wed, 2005-04-20 at 14:32, Fredrik Lundh wrote:
> >   File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs
> >     self.assertRaises(TypeError, delattr, obj.dialect, 'quoting')
> >   File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises
> >     callableObj(*args, **kwargs)
> > AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable
>
> looks like someone didn't run the test suite...

My bad, I didn't check everything in.  Will do so as soon as SF cvs is
working for me again. :/

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050420/80a608f6/attachment.pgp

From gvanrossum at gmail.com  Wed Apr 20 21:55:47 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 20 21:55:58 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <42668411.8090907@ieee.org>
References: <4265E43C.4080707@ieee.org> <20050420151506.GA1285@panix.com>
	<42668411.8090907@ieee.org>
Message-ID:

[Shane Holloway]
> When I write classes, I tend to put the public methods at the top.
> Utility methods used by those entry points are placed toward the
> bottom.  In this way, I read the context of what I'm doing first, and
> then the details of the internal methods as I need to understand them.
> > Granted I could achieve this effect with:: > > class Before: > def readIt(self, filename): > def readIt(): > withFile(filename, doReading) > > def doReading(aFile): > self.readPartA(aFile) > self.readPartB(aFile) > self.readPartC(aFile) > > return readIt() > > Which is fine with me, but the *intent* is more obfuscated than what the > block construct offers. And I don't think my crew would appreciate if I > did this very often. ;) I typically solve that by making doReading() a method: class Before: def readit(self, filename): withFile(filename, self._doReading) def _doReading(self, aFile): self.readPartA(aFile) self.readPartB(aFile) self.readPartC(aFile) Perhaps not as Pure, but certainly Practical. :-) And you could even use __doReading to make it absolutely clear that doReading is a local artefact, if you care about such things. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Wed Apr 20 22:50:02 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 22:50:18 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4265F656.4020305@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> Message-ID: <4266C07A.9090503@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>I am currently adding some code for a Py_COMPILER_DEBUG build for use on the >>AST branch. I thought that OPT was the proper variable to put stuff like this >>into for building (``-DPy_COMPILER_DEBUG``), but that erases ``-g -Wall >>-Wstrict-prototypes``. Obviously I could just tack all of that into my own >>thing, but that seems like an unneeded step. > > > Actually, this step is needed. > Damn. OK. [SNIP] > It might be reasonable to add a variable that will just take additional > compiler flags, and never be modified in configure.
The other option is to not make configure.in skip injecting arguments when a pydebug build is done based on whether OPT is defined in the environment. So configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. The line for a non-debug build could stay as-is since if people are bothering to tweak those settings for a normal build they are going out of their way to tweak settings. Seems like special-casing this for pydebug builds makes sense since the default values will almost always be desired for a pydebug build. And those rare cases you don't want them you could just edit the generated Makefile by hand. Besides it just makes our lives easier and the special builds even more usual since it is one less thing to have to tweak. Sound reasonable? -Brett From martin at v.loewis.de Wed Apr 20 23:08:57 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed Apr 20 23:09:01 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C07A.9090503@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> Message-ID: <4266C4E9.5060709@v.loewis.de> Brett C. wrote: > The other option is to not make configure.in skip injecting arguments when a > pydebug build is done based on whether OPT is defined in the environment. So > configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. That's a procedural question: do we want to accept environment settings only when running configure, or do we also want to honor environment or make command line settings when make is invoked. IOW, it is ok if export OPT=-O6 ./configure make works. But what about ./configure export OPT=-O6 make or ./configure make OPT=-O6 All three can be only supported for environment variables that are never explicitly set in Makefile, be it explicitly in Makefile.pre.in, or implicitly through configure.
> The line for a non-debug build could stay as-is since if people are bothering > to tweak those settings for a normal build they are going out of their way to > tweak settings. Seems like special-casing this for pydebug builds makes sense > since the default values will almost always be desired for a pydebug build. > And those rare cases you don't want them you could just edit the generated > Makefile by hand. Besides it just makes our lives easier and the special > builds even more usual since it is one less thing to have to tweak. > > Sound reasonable? No. I thought you were talking about extra args, such as -fbrett-cannon. But now you seem to be talking about arguments that replace the ones that configure comes up with. Either of these might be reasonable, but they require different treatment. Replacing configure results is possible already From bac at OCF.Berkeley.EDU Wed Apr 20 23:21:21 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 23:21:37 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C4E9.5060709@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> Message-ID: <4266C7D1.700@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>The other option is to not make configure.in skip injecting arguments when a >>pydebug build is done based on whether OPT is defined in the environment. So >>configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. > > > That's a procedural question: do we want to accept environment settings > only when running configure, or do we also want to honor environment or > make command line settings when make is invoked. IOW, it is ok if > > export OPT=-O6 > ./configure > make > > works.
But what about > > ./configure > export OPT=-O6 > make > > or > > ./configure > make OPT=-O6 > > All three can be only supported for environment variables that are never > explicitly set in Makefile, be it explicitly in Makefile.pre.in, or > implicitly through configure. > Hmm. OK, that is an interesting idea. Would make rebuilding a lot easier if it was just an environment variable that was part of the default OPT value; ``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". I say we go with that. What is a good name, though? PY_OPT? > >>The line for a non-debug build could stay as-is since if people are bothering >>to tweak those settings for a normal build they are going out of their way to >>tweak settings. Seems like special-casing this for pydebug builds makes sense >>since the default values will almost always be desired for a pydebug build. >>And those rare cases you don't want them you could just edit the generated >>Makefile by hand. Besides it just makes our lives easier and the special >>builds even more usual since it is one less thing to have to tweak. >> >>Sound reasonable? > > > No. I thought you were talking about extra args, such as -fbrett-cannon. I am, specifically ``-DPy_COMPILER_DEBUG`` to be tacked on as a flag to gcc. > But now you seem to be talking about arguments that replace the ones > that configure comes up with. Either of these might be reasonable, but > they require different treatment. Replacing configure results is > possible already I am only talking about that because that is how OPT is currently structured; configure.in replaces the defaults with what the user provides if the environment variable is set. This is what I don't want. From mal at egenix.com Wed Apr 20 23:40:25 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Wed Apr 20 23:40:28 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <740c3aec0504191557505d6e9f@mail.gmail.com><877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> Message-ID: <4266CC49.9080901@egenix.com> Fredrik Lundh wrote: > PS. a side effect of the for-in pattern is that I'm beginning to feel > that Python > might need a nice "switch" statement based on dictionary lookups, so I can > replace multiple callbacks with a single loop body, without writing too > many > if/elif clauses. PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) My use case for switch is that of a parser switching on tokens. mxTextTools applications would greatly benefit from being able to branch on tokens quickly. Currently, there's only callbacks, dict-to-method branching or long if-elif-elif-...-elif-else. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 20 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mfb at lotusland.dyndns.org Wed Apr 20 23:59:34 2005 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Wed Apr 20 23:59:42 2005 Subject: [Python-Dev] Reference counting when entering and exiting scopes Message-ID: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> Someone on python-help suggested that I forward this question to python-dev. I've been studying Python's core compiler and bytecode interpreter as a model for my own interpreted language, and I've come across what appears to be a reference counting problem in the `symtable_exit_scope' function in . At this point I assume that I'm just misunderstanding what's going on. 
So I was hoping to contact one of the core developers before I go filing what could very well be a spurious bug report against Python's core. Here's the function copied from CVS HEAD: static int symtable_exit_scope(struct symtable *st) { int end; if (st->st_pass == 1) symtable_update_free_vars(st); Py_DECREF(st->st_cur); end = PyList_GET_SIZE(st->st_stack) - 1; st->st_cur = (PySymtableEntryObject *)PyList_GET_ITEM(st->st_stack, end); if (PySequence_DelItem(st->st_stack, end) < 0) return -1; return 0; } My issue is with the use of PyList_GET_ITEM to fetch a new value for the current scope. As I understand it, PyList_GET_ITEM does not increment the reference count for the returned value. So in effect we're borrowing the reference to the symtable entry object from the tail of the scope stack. But then we turn around and delete the object from the tail of the scope stack, which DOES decrement the reference count. So `symtable_exit_scope' has a net effect of decrementing the reference count of the new current symtable entry object, when it seems to me like it should stay the same. Shouldn't the reference count be incremented when we assign to "st->st_cur" (either explicitly or by fetching the object using the PySequence API instead of PyList)? Can someone explain the rationale here? --------------------------------------------------------------------------- As an addendum to the previous question, further study of the code has made me believe that there's a reference counting problem in the `symtable_enter_scope' function as well (pasted below from CVS HEAD). Namely, that `prev' should be Py_XDECREF'd at some point in the function (at the end of the first IF block, perhaps?). 
static void symtable_enter_scope(struct symtable *st, char *name, int type, int lineno) { PySymtableEntryObject *prev = NULL; if (st->st_cur) { prev = st->st_cur; if (PyList_Append(st->st_stack, (PyObject *)st->st_cur) < 0) { st->st_errors++; return; } } st->st_cur = (PySymtableEntryObject *) PySymtableEntry_New(st, name, type, lineno); if (st->st_cur == NULL) { st->st_errors++; return; } if (strcmp(name, TOP) == 0) st->st_global = st->st_cur->ste_symbols; if (prev && st->st_pass == 1) { if (PyList_Append(prev->ste_children, (PyObject *)st->st_cur) < 0) st->st_errors++; } } Thanks, Matthew Barnes From jjinux at gmail.com Thu Apr 21 02:59:04 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Thu Apr 21 02:59:07 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <4266CC49.9080901@egenix.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> Message-ID: On 4/20/05, M.-A. Lemburg wrote: > Fredrik Lundh wrote: > > PS. a side effect of the for-in pattern is that I'm beginning to feel > > that Python > > might need a nice "switch" statement based on dictionary lookups, so I can > > replace multiple callbacks with a single loop body, without writing too > > many > > if/elif clauses. > > PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) > > My use case for switch is that of a parser switching on tokens. > > mxTextTools applications would greatly benefit from being able > to branch on tokens quickly. Currently, there's only callbacks, > dict-to-method branching or long if-elif-elif-...-elif-else. I think "match" from Ocaml would be a much nicer addition to Python than "switch" from C. -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From bac at OCF.Berkeley.EDU Thu Apr 21 03:59:34 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Thu Apr 21 03:59:45 2005 Subject: [Python-Dev] Reference counting when entering and exiting scopes In-Reply-To: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> Message-ID: <42670906.2000308@ocf.berkeley.edu> Matthew F. Barnes wrote: > Someone on python-help suggested that I forward this question to > python-dev. > > I've been studying Python's core compiler and bytecode interpreter as a > model for my own interpreted language, Might want to take a peek at the AST branch in CVS; that is what the compiler is going to change to as soon as it is complete. > and I've come across what appears > to be a reference counting problem in the `symtable_exit_scope' function > in . > > At this point I assume that I'm just misunderstanding what's going on. So > I was hoping to contact one of the core developers before I go filing what > could very well be a spurious bug report against Python's core. > Spurious bug reports are fine. If they turn out to be spurious, they get closed as such. Either way, time is spent checking the report whether it goes there or here. But at least with a bug report it can be tracked more easily. So for future reference, just go ahead and file the bug report. > Here's the function copied from CVS HEAD: > > static int > symtable_exit_scope(struct symtable *st) > { > int end; > > if (st->st_pass == 1) > symtable_update_free_vars(st); > Py_DECREF(st->st_cur); > end = PyList_GET_SIZE(st->st_stack) - 1; > st->st_cur = (PySymtableEntryObject *)PyList_GET_ITEM(st->st_stack, > end); > if (PySequence_DelItem(st->st_stack, end) < 0) > return -1; > return 0; > } > > My issue is with the use of PyList_GET_ITEM to fetch a new value for the > current scope. As I understand it, PyList_GET_ITEM does not increment the > reference count for the returned value. So in effect we're borrowing the > reference to the symtable entry object from the tail of the scope stack.
> But then we turn around and delete the object from the tail of the scope > stack, which DOES decrement the reference count. > > So `symtable_exit_scope' has a net effect of decrementing the reference > count of the new current symtable entry object, when it seems to me like > it should stay the same. Shouldn't the reference count be incremented > when we assign to "st->st_cur" (either explicitly or by fetching the > object using the PySequence API instead of PyList)? > > Can someone explain the rationale here? > If you look at how symtable_enter_scope() and symtable_exit_scope() work together you will notice there is actually no leak. symtable_enter_scope() appends the existing PySymtableEntryObject on to the symtable stack and then places a new PySymtableEntryObject into st->st_cur. Both at this point have a refcount of one; enough to stay alive. Now look at symtable_exit_scope(). When the current PySymtableEntryObject is no longer needed, it is DECREF'ed, putting it at 0 and thus leading to eventual collection. What is on top of the symtable stack, which has a refcount of 1 still, is then put into st->st_cur. So no leak. Yes, there should be more explicit refcounting to be proper, but the compiler cheats in a couple of places for various reasons. But basically everything is fine since st->st_cur and st->st_stack are only played with refcount-wise by either symtable_enter_scope() or symtable_exit_scope() and they are always called in pairs in the end.
-Brett From greg.ewing at canterbury.ac.nz Thu Apr 21 04:58:07 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 04:58:25 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> Message-ID: <426716BF.6090803@canterbury.ac.nz> Alex Martelli wrote: > > def withfile(filename, mode='r'): > openfile = open(filename, mode) > try: > block(thefile=openfile) > finally: > openfile.close() > > i.e., let the block take keyword arguments to tweak its namespace I don't think I like that idea, because it means that from the point of view of the user of withfile, the name 'thefile' magically appears in the namespace without it being obvious where it comes from. > (but > assignments within the block should still affect its _surrounding_ > namespace, it seems to me...). I agree with that much. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 05:11:38 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 05:11:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <20050419212423.63AD.JCARLSON@uci.edu> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> Message-ID: <426719EA.7050605@canterbury.ac.nz> Josiah Carlson wrote: > I once asked "Any other > use cases for one of the most powerful features of Ruby, in Python?" I > have yet to hear any sort of reasonable response. > > Why am I getting no response to my question? 
Either it is because I am > being ignored, or no one has taken the time to translate one of these > 'killer features' from Smalltalk or Ruby, or perhaps such translations > show that there is a better way in Python already. My feeling is that it's the latter. I don't know about Ruby, but in Smalltalk, block-passing is used so heavily because it's the main way of implementing control structures there. While-loops, for-loops, even if-then-else, are not built into the language, but are implemented by methods that take block parameters. In Python, most of these are taken care of by built-in statements, or various uses of iterators and generators. There isn't all that much left that people want to do on a regular basis. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 05:55:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 05:56:01 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> Message-ID: <42672441.7080909@canterbury.ac.nz> Steven Bethard wrote: > Of course, even with the unpack list, you still have to know what kind > of arguments the function calls your block with. And because these > only appear within the code, e.g. > block(openfile) > you can't rely on easily accessible things like the function's > signature. You can't rely on a function's signature alone to tell you much in any case. A distressingly large number of functions found in third-party extension modules have a help() string that just says something like fooble(arg,...) There's really no substitute for a good docstring! 
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 06:01:35 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 06:01:52 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <4265E43C.4080707@ieee.org> References: <4265E43C.4080707@ieee.org> Message-ID: <4267259F.6050902@canterbury.ac.nz> Shane Holloway (IEEE) wrote: > class After: > def readIt(self, filename): > withFile(filename): > self.readPartA(aFile) > self.readPartB(aFile) > self.readPartC(aFile) > > In my opinion, this is much smoother to read. This particular example > brings up the question of how arguments like "aFile" get passed and > named into the block. I anticipate the need for a place to put an > argument declaration list. ;) My current thought is that it should look like this: with_file(filename) as f: do_something_with(f) The success of this hinges on how many use cases can be arranged so that the word 'as' makes sense in that position. What we need is a corpus of use cases so we can try out different phrasings on them and see what looks the best for the most cases. I also have a thought concerning whether the block argument to the function should come first or last or whatever. My solution is that the function should take exactly *one* argument, which is the block. Any other arguments are dealt with by currying. In other words, with_file above would be defined as def with_file(filename): def func(block): f = open(filename) try: block(f) finally: f.close() return func This would also make implementation much easier. 
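[Summary editor's note: the curried with_file above can be tried today by passing an ordinary function where the proposal would pass a block. A small self-contained sketch -- the temporary-file setup and the read_block helper are invented for the example:]

```python
import os
import tempfile

# The curried form from Greg's message: with_file(filename) returns a
# function that expects the block as its single argument.
def with_file(filename):
    def func(block):
        f = open(filename)
        try:
            block(f)
        finally:
            f.close()
    return func

# Create a throwaway file to read back.
fd, name = tempfile.mkstemp()
os.write(fd, b"part A")
os.close(fd)

collected = []
def read_block(f):           # stands in for the anonymous block
    collected.append(f.read())

with_file(name)(read_block)  # the proposed syntax would hide this call
os.remove(name)
print(collected[0])          # -> part A
```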
The parser isn't going to know that it's dealing with anything other than a normal expression statement until it gets to the 'as' or ':', by which time going back and radically re-interpreting a previous function call could be awkward. This way, the syntax is just expr ['as' assignment_target] ':' suite and the expr is evaluated quite normally. > Another set of question arose for me when Barry started musing over the > combination of blocks and decorators. What are blocks? Well, obviously > they are callable. What do they return? The local namespace they > created/modified? I think the return value of a block should be None. In constructs like with_file, the block is being used for its side effect, not to compute a value for consumption by the block function. I don't see a great need for blocks to be able to return values. > How do blocks work with control flow statements like > "break", "continue", "yield", and "return"? Perhaps > "break" and "continue" raise exceptions similar to StopIteration in this > case? Something like that, yes. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Apr 21 06:11:47 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu Apr 21 06:11:52 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C7D1.700@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> Message-ID: <42672803.3080208@v.loewis.de> Brett C. wrote: > Hmm. OK, that is an interesting idea. 
Would make rebuilding a lot easier if > it was just an environment variable that was part of the default OPT value; > ``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". > > I say we go with that. What is a good name, though? PY_OPT? I think EXTRA_CFLAGS is common, and it would not specifically be part of OPT, but rather of CFLAGS. > I am only talking about that because that is how OPT is currently structured; > configure.in replaces the defaults with what the user provides if the > environment variable is set. This is what I don't want. The question is whether the user is supposed to provide a value for OPT in the first place. "OPT" is a set of flags that (IMO) should control the optimization level of the compiler, which, in the wider sense, also includes the question whether debug information should be generated. It should be possible to link object files compiled with different OPT settings, so flags that will give binary-incompatible object files should not be in OPT. It might be desirable to allow the user to override OPT, e.g. to specify that the compiler should not use -O3 but, say, -O1. I don't think there is much point in allowing OPT to be extended. But then, it is already possible to override OPT (when invoking make), which might be enough control. Regards, Martin From steven.bethard at gmail.com Thu Apr 21 07:13:02 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 07:13:05 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <42672441.7080909@canterbury.ac.nz> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Steven Bethard wrote: > > Of course, even with the unpack list, you still have to know what kind > > of arguments the function calls your block with. And because these > > only appear within the code, e.g.
> > block(openfile) > > you can't rely on easily accessible things like the function's > > signature. > > You can't rely on a function's signature alone to tell > you much in any case. A distressingly large number of > functions found in third-party extension modules have > a help() string that just says something like > > fooble(arg,...) > > There's really no substitute for a good docstring! True enough. But the point still stands. Currently, if we describe a function's input (parameters) and output (return value), we can basically fully document the function (given a thorough enough description of course).[1] Functions that accept thunks/blocks require documentation for an additional piece of information that is not part of the input or output of the function: the parameters with which the thunk/block is called. So while: fooble(arg) is pretty nasty, documentation that tells me that 'arg' is a string is probably enough to set me on the right track. But if the documentation tells me that arg is a thunk/block, that's almost certainly not enough to get me going. I also need to know how that thunk/block will be called. True, if arg is not a thunk/block, but another type of callable, I may still need to know how it will be called. But I think with non thunks/blocks, there are a lot of cases where this is not necessary. Consider the variety of decorator recipes.[2] Most don't document what parameters the wrapped function will be called with because they simply pass all arguments on through with *args and **kwargs. Thus the wrapped function will take the same parameters as the original function did. Or if they're different, they're often a relatively simple modification of the original function's parameters, ala classmethod or staticmethod. But thunks/blocks don't work this way. They're not wrapping a function that already takes arguments. They're wrapping a code block that doesn't. 
So they certainly can't omit the parameter description entirely, and they can't even describe it in terms of a modification to an already existing set of parameters. Because the parameters passed from a thunk/block-accepting function to a thunk are generated by the function itself, all the parameter documentation must be contained within the thunk/block-accepting function. It's not like it's the end of the world of course. ;-) I can certainly learn to document my thunks/blocks thoroughly. I just think it's worth noting that there *would* be a learning process because there are additional pieces of information I'm not used to having to document. STeVe [1] I'm ignoring the issue of functions that modify parameters or globals, but this would also be required for thunks/blocks, so I don't think it detracts from the argument. [2] Probably worth noting that a very large portion of the functions I've written that accepted other functions as parameters were decorators. I lean towards a fairly OO style of programming, so I don't pass around a lot of callbacks. Presumably someone who relies heavily on callbacks would be much more used to documenting the parameters with which a function is called. Still, I think there is probably a large enough group that has similar style to mine that my argument is still valid. -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From bac at OCF.Berkeley.EDU Thu Apr 21 08:07:52 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 21 08:08:04 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42672803.3080208@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> Message-ID: <42674338.80009@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>Hmm. OK, that is an interesting idea.
Would make rebuilding a lot easier if >>it was just an environment variable that was part of the default OPT value; >>``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". >> >>I say we go with that. What is a good name, though? PY_OPT? > > > I think EXTRA_CFLAGS is common, and it would not specifically be part of > OPT, but rather of CFLAGS. > Works for me. If no one objects I will check in the change for CFLAGS to make it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting it enough to make sure that it isn't evaluated by configure but left as a string to be evaluated by the shell when the Makefile is running?). > >>I am only talking about that because that is how OPT is currently structured; >>configure.in replaces the defaults with what the user provides if the >>environment variable is set. This is what I don't want. > > > The question is whether the user is supposed to provide a value for OPT > in the first place. "OPT" is a set of flags that (IMO) should control > the optimization level of the compiler, which, in the wider sense, also > includes the question whether debug information should be generated. > It should be possible to link object files compiled with different > OPT settings, so flags that will give binary-incompatible object files > should not be in OPT. > OK, that makes sense to me. > It might be desirable to allow the user to override OPT, e.g. to specify > that the compiler should not use -O3 but, say, -O1. I don't think there > is much point in allowing OPT to be extended. But then, it is already > possible to override OPT (when invoking make), which might be enough > control. > Probably. I think as long as we state somewhere that EXTRA_CFLAGS is the place to put binary-altering flags and to leave OPT for only binary-compatible flags then that should be enough of a separation that most people probably won't touch OPT most of the time since the defaults are good, but can if they want.
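[Summary editor's note: in shell terms, the split being agreed on here looks roughly like the following -- the flag values are illustrative only, not the actual Makefile contents:]

```shell
# OPT keeps the binary-compatible defaults; EXTRA_CFLAGS carries
# binary-altering extras such as -DPy_COMPILER_DEBUG; CFLAGS combines them.
BASECFLAGS="-fno-strict-aliasing"
OPT="-g -Wall -Wstrict-prototypes"
EXTRA_CFLAGS="-DPy_COMPILER_DEBUG"
CFLAGS="$BASECFLAGS $OPT $EXTRA_CFLAGS"
echo "$CFLAGS"
```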
I assume this info should all be spelled out in the README and Misc/Specialbuilds.txt. Anywhere else? -Brett From martin at v.loewis.de Thu Apr 21 08:14:55 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu Apr 21 08:14:59 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42674338.80009@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> Message-ID: <426744DF.2030309@v.loewis.de> Brett C. wrote: > Works for me. If no one objects I will check in the change for CFLAGS to make > it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting it enough to make > sure that it isn't evaluated by configure but left as a string to be evaluated > by the shell when the Makefile is running?). If you put it into Makefile.pre.in, the only way to avoid having configure evaluate it is not to use @FOO@. OTOH, putting a $ in front of it is not good enough for make: $EXTRA_CFLAGS evaluates the variable E, and then appends XTRA_CFLAGS. Regards, Martin From Ben.Young at risk.sungard.com Thu Apr 21 10:15:10 2005 From: Ben.Young at risk.sungard.com (Ben.Young@risk.sungard.com) Date: Thu Apr 21 10:08:48 2005 Subject: Fw: [Python-Dev] anonymous blocks Message-ID: Reply to Michael Sparks ... > That's very bizarre! I've done almost exactly the same thing, though in my case I was playing around with a python-like language. In my language, code uses "ast literals" to allow things like class :foo(bar): suite to be handled programmatically within the language. This can be used to implement anonymous blocks. If anyone is interested I've attached the files. Use them for whatever purpose you want: Run frameparser for an example. Sorry about the code quality. I hadn't intended to release it to anyone yet!
Cheers,
Ben

-------------- next part --------------
A non-text attachment was scrubbed...
Name: frame.zip
Type: application/zip
Size: 6856 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050421/f232a821/frame.zip

From p.f.moore at gmail.com Thu Apr 21 10:38:52 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Apr 21 10:38:54 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426681E5.8050203@strakt.com>
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <426681E5.8050203@strakt.com>
Message-ID: <79990c6b050421013879300013@mail.gmail.com>

On 4/20/05, Samuele Pedroni wrote:
> >
> > def do():
> >     print "setup"
> >     try:
> >         yield None
> >     finally:
> >         print "tear down"
> >
> > doesn't quite work (if it did, all you would need is syntactic sugar
> > for "for dummy in").
>
> PEP 325 is about that

And, of course, PEP 310 is all about encapsulating before/after
(acquire/release) actions.

Paul.

From mwh at python.net Thu Apr 21 11:30:58 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 11:31:00 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: (Shannon's message of "Wed, 20 Apr 2005 17:59:04 -0700")
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com>
Message-ID: <2m64yg79yl.fsf@starship.python.net>

Shannon -jj Behrens writes:

> On 4/20/05, M.-A. Lemburg wrote:
>
>> My use case for switch is that of a parser switching on tokens.
>>
>> mxTextTools applications would greatly benefit from being able
>> to branch on tokens quickly. Currently, there's only callbacks,
>> dict-to-method branching or long if-elif-elif-...-elif-else.
>
> I think "match" from OCaml would be a much nicer addition to Python
> than "switch" from C.

Can you post a quick summary of how you think this would work?
Cheers,
mwh

--
We did requirements and task analysis, iterative design, and user
testing. You'd almost think programming languages were an interface
between people and computers. -- Steven Pemberton
(one of the designers of Python's direct ancestor ABC)

From fredrik at pythonware.com Thu Apr 21 12:28:21 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 12:28:49 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <426569CD.1010701@divmod.com>
Message-ID: 

Glyph Lefkowitz wrote:

> Despite being guilty of propagating this style for years myself, I have to
> disagree. Consider the following network-conversation using Twisted style
> (which, I might add, would be generalizable to other Twisted-like systems
> if they existed ;-)):
>
> def strawman(self):
>     def sayGoodbye(mingleResult):
>         def goAway(goodbyeResult):
>             self.loseConnection()
>         self.send("goodbye").addCallback(goAway)
>     def mingle(helloResult):
>         self.send("nice weather we're having").addCallback(sayGoodbye)
>     self.send("hello").addCallback(mingle)

def iterman(self):
    yield "hello"
    yield "nice weather we're having"
    yield "goodbye"

From andrew at indranet.co.nz Thu Apr 21 12:36:56 2005
From: andrew at indranet.co.nz (Andrew McGregor)
Date: Thu Apr 21 12:38:24 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <2m64yg79yl.fsf@starship.python.net>
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net>
Message-ID: 

I can post an alternative, inspired by this bit of Haskell (I've deliberately
left out the Haskell type annotation for this):

zoneOpts argv =
    case getOpt Permute options argv of
        (o,n,[]) -> return (o,n)
        (_,_,errs) -> error errs

which could, in a future Python, look something like:

def zoneOpts(argv):
    case i of getopt(argv, options, longoptions):
        i[2]: raise OptionError(i[2])
        True: return i[:2]

The intent is that within
the case, the bit before each : is a boolean expression, they're evaluated in
order, and the following block is executed for the first one that evaluates to
be True. I know we have exceptions for this specific example, but it's just an
example. I'm also assuming for the time being that getopt returns a 3-tuple
(options, arguments, errors) like the Haskell version does, just for the sake
of argument, and there's an OptionError constructor that will do something
with that error list.

Yes, that is very different semantics from a Haskell case expression, but it
kind of looks like a related idea. A more closely related idea would be to
borrow the Haskell patterns:

def zoneOpts(argv):
    case getopt(argv, options, longoptions):
        (o,n,[]): return o,n
        (_,_,errs): raise OptionError(errs)

where _ matches anything, a presently unbound name is bound for the following
block by mentioning it, a bound name would match whatever value it referred
to, and a literal matches only itself. The first matching block gets executed.

Come to think of it, it should be possible to do both. Not knowing OCaml, I'd
have to presume that 'match' is somewhat similar.

Andrew

On 21/04/2005, at 9:30 PM, Michael Hudson wrote:

> Shannon -jj Behrens writes:
>
>> On 4/20/05, M.-A. Lemburg wrote:
>>
>>> My use case for switch is that of a parser switching on tokens.
>>>
>>> mxTextTools applications would greatly benefit from being able
>>> to branch on tokens quickly. Currently, there's only callbacks,
>>> dict-to-method branching or long if-elif-elif-...-elif-else.
>>
>> I think "match" from OCaml would be a much nicer addition to Python
>> than "switch" from C.
>
> Can you post a quick summary of how you think this would work?
>
> Cheers,
> mwh
>
> --
> We did requirements and task analysis, iterative design, and user
> testing. You'd almost think programming languages were an interface
> between people and computers.
-- Steven Pemberton > (one of the designers of Python's direct ancestor ABC) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/ > andrew%40indranet.co.nz > > From fredrik at pythonware.com Thu Apr 21 12:42:00 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Apr 21 12:42:35 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <20050419212423.63AD.JCARLSON@uci.edu> <20050420084329.63B0.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > > for my purposes, I've found that the #1 callback killer in contemporary Python > > is for-in:s support for the iterator protocol: > ... > > and get shorter code that runs faster. (see cElementTree's iterparse for > > an excellent example. for typical use cases, it's nearly three times faster > > than pyexpat, which is the fastest callback-based XML parser we have) > > It seems as though you are saying that because callbacks are so slow, > that blocks are a non-starter for you because of how slow it would be to > call them. Not really -- I see the for-in loop body as the block. The increased speed is just a bonus. > I'm thinking that if people get correct code easier, that speed will not be as much > of a concern (that's why I use Python already). (Slightly OT, but speed is always a concern. I no longer buy the "it's python, it has to be slow" line of reasoning; when done correctly, Python code is often faster than anything else. cElementTree is one such example; people have reported that cElementTree plus Python code can be a lot faster than dedicated XPath/XSLT engines; the Python bytecode engine is extremely fast, also compared to domain-specific interpreters... And in this case, you get improved usability *and* improved speed at the same time. That's the way it should be.) 
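The callback-vs-iterator contrast described above can be sketched with a toy tokenizer. All names here are invented for illustration; this is not the pyexpat or cElementTree API, just the shape of the two styles:

```python
# Toy contrast between a callback-driven API and an iterator-driven one.

def parse_with_callback(text, on_token):
    # Callback style: the library drives the loop and calls back
    # into user code for every token.
    for token in text.split():
        on_token(token)

def parse_iter(text):
    # Iterator style: the user drives the loop with for-in, and the
    # "callback body" is simply inlined in the loop.
    for token in text.split():
        yield token

# Callback style pushes the handler into a separate function:
seen = []
parse_with_callback("a b c", seen.append)

# Iterator style inlines the handler in the loop body:
inlined = []
for token in parse_iter("a b c"):
    inlined.append(token.upper())
```

The second form is the "loop over the callback source" pattern: the handler code sits right where it runs, with no separate configuration step.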
> With that said, both blocks and iterators make /writing/ such things
> easier to understand, but neither really makes /reading/ much easier.
> Sure, it is far more terse, but that doesn't mean it is easier to read
> and understand what is going on.

Well, I was talking about reading here: with the for-in pattern, you loop over
the "callback source", and the "callback" itself is inlined. You don't have to
think in "here is the callback, here I configure the callback source" terms;
just make a function call and loop over the result.

> Regardless, I believe that solving generator finalization (calling all
> enclosing finally blocks in the generator) is a worthwhile problem to
> solve. Whether that be by PEP 325, 288, 325+288, etc., that should be
> discussed. Whether people use it as a pseudo-block, or decide that
> blocks are further worthwhile, I suppose we could wait and see.

Agreed.

From bob at redivi.com Thu Apr 21 13:04:45 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Apr 21 13:04:59 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: <426569CD.1010701@divmod.com>
Message-ID: <951559a5329cda690f153ee8894e0636@redivi.com>

On Apr 21, 2005, at 6:28 AM, Fredrik Lundh wrote:

> Glyph Lefkowitz wrote:
>
>> Despite being guilty of propagating this style for years myself, I
>> have to disagree. Consider the following network-conversation using
>> Twisted style (which, I might add, would be generalizable to other
>> Twisted-like systems if they existed ;-)):
>>
>> def strawman(self):
>>     def sayGoodbye(mingleResult):
>>         def goAway(goodbyeResult):
>>             self.loseConnection()
>>         self.send("goodbye").addCallback(goAway)
>>     def mingle(helloResult):
>>         self.send("nice weather we're having").addCallback(sayGoodbye)
>>     self.send("hello").addCallback(mingle)
>
> def iterman(self):
>     yield "hello"
>     yield "nice weather we're having"
>     yield "goodbye"

Which, more or less, works for a literal translation of the straw-man above.
However, you're missing the point. These deferred operations actually return
results. Generators offer no sane way to pass results back in. If they did,
then this use case could be mostly served by generators.

-bob

From steve at holdenweb.com Thu Apr 21 13:11:56 2005
From: steve at holdenweb.com (Steve Holden)
Date: Thu Apr 21 13:13:02 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl>
Message-ID: <42678A7C.9020106@holdenweb.com>

Guido van Rossum wrote:
>>> IMO this is clearer, and even shorter!
>>
>> But it clutters the namespace with objects you don't need.
>
> Why do people care about cluttering namespaces so much? I thought
> that's what namespaces were for -- to put stuff you want to remember
> for a bit. A function's local namespace in particular seems a
> perfectly fine place for temporaries.

Indeed. The way people bang on about "cluttering namespaces" you'd be
forgiven for thinking that they are like attics, permanently attached to the
house and liable to become cluttered over years. Most function namespaces are
in fact extremely short-lived, and there is little point worrying about
clutter as long as there's no chance of confusion.

regards
Steve

--
Steve Holden         +1 703 861 4237  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/
Python Web Programming     http://pydish.holdenweb.com/

From mcherm at mcherm.com Thu Apr 21 14:03:45 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu Apr 21 14:03:47 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050421050345.hmlz8a46bbscw844@mcherm.com>

Andrew McGregor writes:
> I can post an alternative, inspired by this bit of Haskell [...]
> The intent is that within the case, the bit before each : is a boolean
> expression, they're evaluated in order, and the following block is
> executed for the first one that evaluates to be True.
If we're going to be evaluating a series of booleans, then the One Proper
Format in Python is:

    if <condition-1>:
        <suite-1>
    elif <condition-2>:
        <suite-2>
    ...
    else:
        <default-suite>

When people speak of introducing a "switch" statement they are speaking of a
construct in which the decision of which branch to take requires time
proportional to something LESS than a linear function of the number of
branches (it's not O(n) in the number of branches).

Now the pattern matching is more interesting, but again, I'd need to see a
proposed syntax for Python before I could begin to consider it. If I
understand it properly, pattern matching in Haskell relies primarily on
Haskell's excellent typing system, which is absent in Python.

-- Michael Chermside

From mfb at lotusland.dyndns.org Thu Apr 21 14:26:09 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Thu Apr 21 14:26:17 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To: <42670906.2000308@ocf.berkeley.edu>
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu>
Message-ID: <1114086369.5763.7.camel@workstation>

On Wed, 2005-04-20 at 18:59 -0700, Brett C. wrote:
> So no leak. Yes, there should be more explicit refcounting to be proper, but
> the compiler cheats in a couple of places for various reasons. But basically
> everything is fine since st->st_cur and st->st_stack are only played with
> refcount-wise by either symtable_enter_scope() and symtable_exit_scope() and
> they are always called in pairs in the end.

... except for the "global" scope, for which symtable_exit_scope() is never
called. But the last reference to *that* scope (st->st_cur) gets cleaned up
in PySymtable_Free(). Correct?

So the two things I thought were glitches are actually cancelling each other
out. Very good. Thanks for your help.
Matthew Barnes

From ncoghlan at gmail.com Thu Apr 21 14:39:46 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu Apr 21 14:39:53 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <20050421050345.hmlz8a46bbscw844@mcherm.com>
References: <20050421050345.hmlz8a46bbscw844@mcherm.com>
Message-ID: <42679F12.8080902@gmail.com>

Michael Chermside wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

There's no real need for special syntax in Python - an appropriate tuple
subclass will do the trick quite nicely:

class pattern(tuple):
    ignore = object()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __hash__(self):
        raise NotImplementedError

    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for item, other_item in zip(self, other):
            if item is pattern.ignore:
                continue
            if item != other_item:
                return False
        return True

Py> x = (1, 2, 3)
Py> print x == pattern(1, 2, 3)
True
Py> print x == pattern(1, pattern.ignore, pattern.ignore)
True
Py> print x == pattern(1, pattern.ignore, 3)
True
Py> print x == pattern(2, pattern.ignore, pattern.ignore)
False
Py> print x == pattern(1)
False

It's not usable in a dict-based switch statement, obviously, but it's
perfectly compatible with the current if/elif idiom.

Cheers,
Nick.
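The tuple subclass above runs unchanged on later Pythons (modulo print syntax); a self-contained sketch of it driving the if/elif idiom it is said to be compatible with. The classify function and its "shapes" are invented for illustration:

```python
# Condensed, runnable version of the pattern class, used for dispatch.

class pattern(tuple):
    ignore = object()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __hash__(self):
        raise NotImplementedError  # patterns are not meant to be dict keys

    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for item, other_item in zip(self, other):
            if item is pattern.ignore:
                continue
            if item != other_item:
                return False
        return True

def classify(point):
    # Plain tuple == pattern works because Python tries the subclass's
    # __eq__ first when the subclass appears on the right-hand side.
    if point == pattern(0, 0, 0):
        return "origin"
    elif point == pattern(0, pattern.ignore, pattern.ignore):
        return "on the x=0 plane"
    else:
        return "elsewhere"
```

Note the reliance on reflected-operand priority: with a tuple on the left and a tuple subclass on the right, the subclass's `__eq__` is consulted first, which is what makes the wildcard comparison work.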
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From michael.walter at gmail.com Thu Apr 21 14:46:42 2005 From: michael.walter at gmail.com (Michael Walter) Date: Thu Apr 21 14:46:44 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <42679F12.8080902@gmail.com> References: <20050421050345.hmlz8a46bbscw844@mcherm.com> <42679F12.8080902@gmail.com> Message-ID: <877e9a1705042105465df1f925@mail.gmail.com> On 4/21/05, Nick Coghlan wrote: > Michael Chermside wrote: > > Now the pattern matching is more interesting, but again, I'd need to > > see a proposed syntax for Python before I could begin to consider it. > > If I understand it properly, pattern matching in Haskell relies > > primarily on Haskell's excellent typing system, which is absent in > > Python. > > There's no real need for special syntax in Python - an appropriate tuple > subclass will do the trick quite nicely: You are missing the more interesting part of pattern matching, namely that it is used for deconstructing values/binding subvalues. 
Ex.:

case lalala of
    Foo f -> f
    Bar (Baz brzzzzz) _ meep -> (brzzzzz, meep)

or Python-ish:

match doThis() with:
    Foo as f: return f
    (_,* as bar,_): return bar
    Baz(boink as brzzz, meep=10): return brzzz

"* as bar" is Not Very Nice (tm) :/

Michael

From fredrik at pythonware.com Thu Apr 21 15:11:23 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 15:12:51 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <426569CD.1010701@divmod.com> <951559a5329cda690f153ee8894e0636@redivi.com>
Message-ID: 

Bob Ippolito wrote:

>>> def strawman(self):
>>>     def sayGoodbye(mingleResult):
>>>         def goAway(goodbyeResult):
>>>             self.loseConnection()
>>>         self.send("goodbye").addCallback(goAway)
>>>     def mingle(helloResult):
>>>         self.send("nice weather we're having").addCallback(sayGoodbye)
>>>     self.send("hello").addCallback(mingle)
>>
>> def iterman(self):
>>     yield "hello"
>>     yield "nice weather we're having"
>>     yield "goodbye"
>
> Which, more or less, works for a literal translation of the straw-man above.
> However, you're missing the point. These deferred operations actually
> return results. Generators offer no sane way to pass results back in.

that's why you need a context object (=self, in this case).

def iterman(self):
    yield "hello"
    print self.data
    yield "nice weather we're having"
    print self.data
    yield "goodbye"

also see:

http://effbot.org/zone/asyncore-generators.htm

> If they did, then this use case could be mostly served by generators.

exactly.
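The context-object trick can be sketched as a toy event loop: the driver stores each "network reply" on the context object before resuming the generator, so the generator reads results via self between yields. The Conversation class, drive function, and canned replies below are invented for illustration (the effbot page linked above covers the real asyncore version):

```python
class Conversation:
    def __init__(self):
        self.data = None   # last reply, set by the driver
        self.log = []      # replies as seen from inside the generator

    def iterman(self):
        yield "hello"
        self.log.append(self.data)   # reply to "hello" is visible here
        yield "nice weather we're having"
        self.log.append(self.data)   # reply to the small talk
        yield "goodbye"

def drive(conv, replies):
    # Toy event loop: send each outgoing line, then record the canned
    # reply on the context object before resuming the generator.
    sent = []
    for outgoing, reply in zip(conv.iterman(), replies):
        sent.append(outgoing)
        conv.data = reply
    return sent

conv = Conversation()
sent = drive(conv, ["hi there", "indeed", "bye"])
```

The generator never receives values directly; it just reads `self.data`, which the driver mutated while the generator was suspended. (Python 2.5's `generator.send()` later made this possible without the side channel.)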
From pedronis at strakt.com Thu Apr 21 16:02:38 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Thu Apr 21 16:02:50 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2m64yg79yl.fsf@starship.python.net> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> Message-ID: <4267B27E.9030107@strakt.com> Michael Hudson wrote: >Shannon -jj Behrens writes: > > > >>On 4/20/05, M.-A. Lemburg wrote: >> >> >> >>>My use case for switch is that of a parser switching on tokens. >>> >>>mxTextTools applications would greatly benefit from being able >>>to branch on tokens quickly. Currently, there's only callbacks, >>>dict-to-method branching or long if-elif-elif-...-elif-else. >>> >>> >>I think "match" from Ocaml would be a much nicer addition to Python >>than "switch" from C. >> >> > >Can you post a quick summary of how you think this would work? > > > Well, Python lists are used more imperatively and are not made up with cons cells, we have dictionaries which because of ordering issues are not trivial to match, and no general ordered records with labels. We have objects and not algebraic data types. Literature on the topic usually indicates the visitor pattern as the moral equivalent of pattern matching in an OO-context vs. algebraic data types/functional one. I agree with that point of view and Python has idioms for the visitor pattern. Interestingly even in the context of objects one can leverage the infrastructure that is there for generalized copying/pickling to allow generalized pattern matching of nested object data structures. Whether it is practical I don't know. >>> class Pt: ... def __init__(self, x,y): ... self.x = x ... self.y = y ... 
>>> p(lambda _: Pt(1, _()) ).match(Pt(1,3))
(3,)
>>> p(lambda _: Pt(1, Pt(_(),_()))).match(Pt(1,Pt(Pt(5,6),3)))
(<__main__.Pt instance at 0x40200b4c>, 3)

http://codespeak.net/svn/user/pedronis/match.py

is an experiment in that direction (preceding this discussion and inspired
while reading a book that was using OCaml for its examples).

Notice that this is quite grossly subclassing pickling infrastructure (the
innocent bystander should probably not try that); a cleaner approach redoing
that logic with matching in mind is possible and would be preferable.

From gvanrossum at gmail.com Thu Apr 21 16:28:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:31:32 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To: <1114086369.5763.7.camel@workstation>
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu> <1114086369.5763.7.camel@workstation>
Message-ID: 

> So the two things I thought were glitches are actually cancelling each
> other out. Very good. Thanks for your help.

Though I wonder why it was written so delicately. Would explicit
INCREF/DECREF really have hurt the performance that much? This is only the
bytecode compiler, which isn't on the critical path.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Thu Apr 21 16:37:50 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:38:22 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: 

[Brian Sabbey]
> >> If suites were commonly used as above to define properties, event handlers
> >> and other callbacks, then I think most people would be able to comprehend
> >> what the first example above is doing much more quickly than the second.

[Fredrik]
> > wonderful logic, there. good luck with your future adventures in language
> > design.

[Brian again]
> I'm just trying to help python improve.
Maybe I'm not doing a very good > job, I don't know. Either way, there's no need to be rude. > > If I've broken some sort of unspoken code of behavior for this list, then > maybe it would be easier if you just 'spoke' it (perhaps in a private > email or in the description of this list on python.org). In his own inimitable way, Fredrik is pointing out that your argument is a tautology (or very close to one): rephrased, it sounds like "if X were commonly used, you'd recognize it easily", which isn't a sufficient argument for anything. While I've used similar arguments occasionally to shut up folks whose only remaining argument against a new feature was "but nobody will understand it the first time they encounter it" (which is true of *everything* you see for the first time), such reasoning isn't strong enough to support favoring one thing over another. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 21 16:49:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 21 16:51:59 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: > So while: > fooble(arg) > is pretty nasty, documentation that tells me that 'arg' is a string is > probably enough to set me on the right track. But if the > documentation tells me that arg is a thunk/block, that's almost > certainly not enough to get me going. I also need to know how that > thunk/block will be called. This argument against thunks sounds bogus to me. The signature of any callable arguments is recursively part of the signature of the function you're documenting. Just like the element type of any sequence arguments is part of the argument type. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Thu Apr 21 16:52:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:52:43 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4267259F.6050902@canterbury.ac.nz>
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: 

[Greg Ewing]
> My current thought is that it should look like this:
>
>     with_file(filename) as f:
>         do_something_with(f)
>
> The success of this hinges on how many use cases can
> be arranged so that the word 'as' makes sense in that
> position.
[...]
> This way, the syntax is just
>
>     expr ['as' assignment_target] ':' suite
>
> and the expr is evaluated quite normally.

Perhaps it could be even simpler:

    [assignment_target '=']* expr ':' suite

This would just be an extension of the regular assignment statement.

(More in a longer post I'm composing off-line while picking cherries off the
thread.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at strakt.com Thu Apr 21 16:58:52 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Thu Apr 21 16:59:06 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
In-Reply-To: 
References: <20050419212423.63AD.JCARLSON@uci.edu> <20050420084329.63B0.JCARLSON@uci.edu>
Message-ID: <4267BFAC.9060402@strakt.com>

Fredrik Lundh wrote:
>> Regardless, I believe that solving generator finalization (calling all
>> enclosing finally blocks in the generator) is a worthwhile problem to
>> solve. Whether that be by PEP 325, 288, 325+288, etc., that should be
>> discussed. Whether people use it as a pseudo-block, or decide that
>> blocks are further worthwhile, I suppose we could wait and see.
>
> Agreed.
>
>
>
I agree; in fact, I think that solving that issue is very important before/if
ever introducing a generalized block statement, because otherwise things that
would naturally be expressible with for and generators will use the block
construct, which allows more variety and so possibly less immediate clarity,
just because generators are not good at resource handling.

From mcherm at mcherm.com Thu Apr 21 17:10:30 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu Apr 21 17:10:33 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050421081030.yky429mt9jgo4gg0@mcherm.com>

I wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

Nick Coghlan replies:
> There's no real need for special syntax in Python - an appropriate tuple
> subclass will do the trick quite nicely:
[... sample code matching tuples ...]

Aha, but now you've answered my question about syntax, and I can see that
your syntax lacks most of the power of Haskell's pattern matching. First of
all, it can only match tuples ... most things in Python are NOT tuples.
Secondly (as Michael Walter explained) it doesn't allow name binding to parts
of the pattern.

Honestly, while I understand that pattern matching is extremely powerful, I
don't see how to apply it in Python. We have powerful introspective
abilities, which seems to be helpful, but on the other hand we lack types,
which are typically a key feature of such matching. And then there's the fact
that many of the elegant uses of pattern matching use recursion to traverse
data structures... a no-no in a CPython that lacks tail-recursion
elimination.

There is one exception... matching strings.
There we have a powerful means of specifying patterns (regular expressions),
and a multi-way branch based on the content of a string is a common
situation. A new way to write this:

    s = get_some_string_value()
    if s == '':
        continue
    elif re.match('#.*$', s):
        handle_comment()
    elif s == 'DEFINE':
        handle_define()
    elif s == 'UNDEF':
        handle_undefine()
    elif re.match('[A-Za-z][A-Za-z0-9]*$', s):
        handle_identifier()
    else:
        syntax_error()

might be nice, but I can't figure out how to make it work more efficiently
than the simple if-elif-else structure, nor an elegant syntax.

-- Michael Chermside

From fredrik at pythonware.com Thu Apr 21 17:22:29 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 17:24:22 2005
Subject: [Python-Dev] Re: Re: switch statement
References: <20050421081030.yky429mt9jgo4gg0@mcherm.com>
Message-ID: 

Michael Chermside wrote:

> There is one exception... matching strings. There we have a powerful
> means of specifying patterns (regular expressions), and a multi-way
> branch based on the content of a string is a common situation. A new
> way to write this:
>
>     s = get_some_string_value()
>     if s == '':
>         continue
>     elif re.match('#.*$', s):
>         handle_comment()
>     elif s == 'DEFINE':
>         handle_define()
>     elif s == 'UNDEF':
>         handle_undefine()
>     elif re.match('[A-Za-z][A-Za-z0-9]*$', s):
>         handle_identifier()
>     else:
>         syntax_error()
>
> might be nice, but I can't figure out how to make it work
> more efficiently than the simple if-elif-else structure, nor an
> elegant syntax.
somewhat related: http://mail.python.org/pipermail/python-dev/2003-April/035075.html From steven.bethard at gmail.com Thu Apr 21 17:27:00 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 17:27:04 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: > > So while: > > fooble(arg) > > is pretty nasty, documentation that tells me that 'arg' is a string is > > probably enough to set me on the right track. But if the > > documentation tells me that arg is a thunk/block, that's almost > > certainly not enough to get me going. I also need to know how that > > thunk/block will be called. > > This argument against thunks sounds bogus to me. The signature of any > callable arguments is recursively part of the signature of the > function you're documenting. Just like the element type of any > sequence arguments is part of the argument type. It wasn't really an argument against thunks. (See the disclaimer I gave at the bottom of my previous email.) Think of it as an early documentation request for the thunks in the language reference -- I'd like to see it remind users of thunks that part of the thunk-accepting function interface is the parameters the thunk will be called with, and that these should be documented. In case my point about the difference between thunks and other callables (specifically decorators) slipped by, consider the documentation for staticmethod, which takes a callable. All the staticmethod documentation says about that callable's parameters is: "A static method does not receive an implicit first argument" Pretty simple I'd say. Or classmethod: "A class method receives the class as implicit first argument, just like an instance method receives the instance." Again, pretty simple. Why are these simple? 
Because decorators generally pass on pretty much the same arguments as the
callables they wrap. My point was just that because thunks don't wrap other
normal callables, they can't make such abbreviations.

STeVe

--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From gvanrossum at gmail.com Thu Apr 21 17:59:43 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 18:07:27 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz>
Message-ID: 

> In case my point about the difference between thunks and other
> callables (specifically decorators) slipped by, consider the
> documentation for staticmethod, which takes a callable. All the
> staticmethod documentation says about that callable's parameters is:
> "A static method does not receive an implicit first argument"
> Pretty simple I'd say. Or classmethod:
> "A class method receives the class as implicit first argument,
> just like an instance method receives the instance."
> Again, pretty simple. Why are these simple? Because decorators
> generally pass on pretty much the same arguments as the callables they
> wrap. My point was just that because thunks don't wrap other normal
> callables, they can't make such abbreviations.

You've got the special-casing backwards. It's not thunks that are special,
but staticmethod (and decorators in general) because they take *any*
callable. That's unusual -- most callable arguments have a definite
signature, think of map(), filter(), sort() and Button callbacks.
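A sketch of that point: for most higher-order APIs, the callback's signature is a fixed, documented part of the interface. Button below is an invented stand-in, not any real toolkit class:

```python
class Button:
    """Toy widget. The documented contract: on_click callbacks take
    exactly one argument, the running click count."""

    def __init__(self, on_click):
        self.on_click = on_click   # callable taking (count)
        self.count = 0

    def click(self):
        self.count += 1
        self.on_click(self.count)

# Any one-argument callable satisfies the documented contract:
clicks = []
b = Button(clicks.append)
b.click()
b.click()

# Likewise, map() documents its callable as taking one item:
doubled = list(map(lambda n: n * 2, [1, 2, 3]))
```

Decorators like staticmethod are the odd ones out precisely because they accept a callable of *any* signature and pass it through unchanged.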
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Thu Apr 21 18:20:50 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 21 18:20:53 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

James Y Knight wrote:
> If it was possible to assign to a variable bound outside
> your function, but still in your lexical scope, I think it would fix
> this issue.  That's always something I've thought should be possible,
> anyways.  I propose to make it possible via a declaration similar to
> 'global'.
>
> E.g. (stupid example, but it demonstrates the syntax):
> def f():
>     count = 0
>     def addCount():
>         lexical count
>         count += 1
>     assert count == 0
>     addCount()
>     assert count == 1

It strikes me that with something like this lexical declaration, we could abuse decorators as per Carl Banks's recipe[1] to get the equivalent of thunks:

    def withfile(filename, mode='r'):
        def _(func):
            f = open(filename, mode)
            try:
                func(f)
            finally:
                f.close()
        return _

and used like:

    line = None

    @withfile("readme.txt")
    def print_readme(fileobj):
        lexical line
        for line in fileobj:
            print line

    print "last line:", line

As the recipe notes, the main difference between print_readme and a real "code block" is that print_readme doesn't have access to the lexical scope.  Something like James's suggestion would solve this problem.  One advantage I see of this route (i.e. using defs + lexical scoping instead of new syntactic support) is that because we're using a normal function, the parameter list is not an issue -- arguments to the "thunk" are bound to names just as they are in any other function.
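For the record, the recipe above works today almost verbatim: `nonlocal` (added to Python 3 by PEP 3104) plays exactly the role James proposed for `lexical`.  A runnable sketch, with `io.StringIO` standing in for the hypothetical readme file:

```python
import io

def withfile(fileobj):
    # Hypothetical helper mirroring Carl Banks's recipe: call the
    # decorated function with fileobj, then close it.  The decorated
    # name ends up rebound to None, as the thread notes.
    def _(func):
        try:
            func(fileobj)
        finally:
            fileobj.close()
    return _

def demo():
    line = None   # the name we want the "block" to rebind
    seen = []

    @withfile(io.StringIO("spam\neggs\n"))
    def print_readme(f):
        nonlocal line   # modern spelling of the proposed `lexical`
        for line in f:
            seen.append(line)

    return line, seen, print_readme

last, seen, leftover = demo()
print(last)      # 'eggs\n' -- the rebinding escaped the "block"
print(leftover)  # None -- the decorator returned nothing
```

The `leftover is None` result is the "big disadvantage" Steven goes on to describe: the decorated name is clobbered rather than bound to a callable.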
The big disadvantage I see is that my normal expectations for decorators are wrong here -- after the decorator is applied print_readme is set to None, not a new callable object. Guess I'm still riding the fence. ;-) STeVe [1]http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391199 -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Thu Apr 21 18:27:22 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 18:27:27 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: > > In case my point about the difference between thunks and other > > callables (specifically decorators) slipped by, consider the > > documentation for staticmethod, which takes a callable. All the > > staticmethod documentation says about that callable's parameters is: > > "A static method does not receive an implicit first argument" > > Pretty simple I'd say. Or classmethod: > > "A class method receives the class as implicit first argument, > > just like an instance method receives the instance." > > Again, pretty simple. Why are these simple? Because decorators > > generally pass on pretty much the same arguments as the callables they > > wrap. My point was just that because thunks don't wrap other normal > > callables, they can't make such abbreviations. > > You've got the special-casing backwards. It's not thinks that are > special, but staticmethod (and decorators in general) because they > take *any* callable. That's unusual -- most callable arguments have a > definite signature, think of map(), filter(), sort() and Button > callbacks. Yeah, that was why I footnoted that most of my use for callables taking callables was decorators. 
But while I don't use map, filter or Button callbacks, I am guilty of using sort and helping to add a key= argument to min and max, so I guess I can't be too serious about only using decorators. ;-)

STeVe
--
You can wordify anything if you just verb it.

From gvanrossum at gmail.com Thu Apr 21 18:38:03 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 18:43:20 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

> It strikes me that with something like this lexical declaration, we
> could abuse decorators as per Carl Banks's recipe[1] to get the
> equivalent of thunks:

"abuse" being the operative word.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Thu Apr 21 19:05:35 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 21 19:05:37 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

On 4/21/05, Guido van Rossum wrote:
> > It strikes me that with something like this lexical declaration, we
> > could abuse decorators as per Carl Banks's recipe[1] to get the
> > equivalent of thunks:
>
> "abuse" being the operative word.

Yup.  I was just drawing the parallel between:

    @withfile("readme.txt")
    def thunk(fileobj):
        for line in fileobj:
            print line

and

    @withfile("readme.txt"):
        # called by withfile as thunk(fileobj=)
        for line in fileobj:
            print line

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From mwh at python.net Thu Apr 21 19:10:05 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 19:10:27 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <4267B27E.9030107@strakt.com> (Samuele Pedroni's message of "Thu, 21 Apr 2005 16:02:38 +0200")
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> <4267B27E.9030107@strakt.com>
Message-ID: <2mu0m05a4y.fsf@starship.python.net>

Samuele Pedroni writes:
> Michael Hudson wrote:

[pattern matching]

>> Can you post a quick summary of how you think this would work?
>
> Well, Python lists are used more imperatively and are not made up
> with cons cells, we have dictionaries which because of ordering
> issues are not trivial to match, and no general ordered records with
> labels.

That's a better way of putting it than "pattern matching and python don't really seem to fit together", for sure :)

(I'd quite like records with labels, tangentially, but am not so wild about ordering)

> We have objects and not algebraic data types.  Literature on the
> topic usually indicates the visitor pattern as the moral equivalent
> of pattern matching in an OO-context vs. algebraic data
> types/functional one.  I agree with that point of view and Python has
> idioms for the visitor pattern.

But the visitor pattern is pretty grim, really.  It would be nice (tm) to have something like:

    match node in:
        Assign(lhs=Var(_), rhs=_):
            # lhs, rhs bound in here
        Assign(lhs=Subscr(_,_), rhs=_):
            # ditto
        Assign(lhs=Slice(*_), rhs=_):
            # ditto
        Assign(lhs=_, rhs=_):
            raise SyntaxError

in Lib/compiler.  Vyper had something like this, I think.

>
> Interestingly even in the context of objects one can leverage the
> infrastructure that is there for generalized copying/pickling to
> allow generalized pattern matching of nested object data
> structures.
> Whether it is practical I don't know.
>
> >>> class Pt:
> ...     def __init__(self, x,y):
> ...         self.x = x
> ...         self.y = y
> ...
> >>> p(lambda _: Pt(1, _()) ).match(Pt(1,3))
> (3,)
> >>> p(lambda _: Pt(1, Pt(_(),_()))).match(Pt(1,Pt(Pt(5,6),3)))
> (<__main__.Pt instance at 0x40200b4c>, 3)
>
> http://codespeak.net/svn/user/pedronis/match.py is an experiment in
> that direction (preceding this discussion
> and inspired while reading a book that was using OCaml for its examples).

Yikes!

> Notice that this is quite grossly subclassing pickling infrastructure
> (the innocent bystander should probably not try that), a cleaner
> approach redoing that logic with matching in mind is possible and
> would be preferable.

Also, the syntax is disgusting.  But that's a separate issue, I guess.

Cheers,
mwh
--
/* I'd just like to take this moment to point out that C has all
   the expressive power of two dixie cups and a string. */
-- Jamie Zawinski from the xkeycaps source

From mwh at python.net Thu Apr 21 19:52:11 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 19:52:14 2005
Subject: [Python-Dev] marshal / unmarshal
In-Reply-To: (Scott David Daniels's message of "Fri, 08 Apr 2005 16:15:39 -0700")
References:
Message-ID: <2moec8586s.fsf@starship.python.net>

Scott David Daniels writes:
> What should marshal / unmarshal do with floating point NaNs (the case we
> are worrying about is Infinity) ?  The current behavior is not perfect.

So, after a fair bit of hacking, I think I have most of a solution to this, in two patches:

    make float packing copy bytes when they can
    http://python.org/sf/1181301

    binary formats for marshalling floats
    http://python.org/sf/1180995

I'd like to check them both in pretty soon, but would really appreciate a review, especially of the first one as it's gotten a little hairy, mainly so I could then write some detailed tests.
That said, if there are no objections I'm going to check them in anyway, so if they turn out to suck, it'll be YOUR fault for not reviewing the patches :) Cheers, mwh -- (Of course SML does have its weaknesses, but by comparison, a discussion of C++'s strengths and flaws always sounds like an argument about whether one should face north or east when one is sacrificing one's goat to the rain god.) -- Thant Tessman From jjinux at gmail.com Thu Apr 21 23:10:14 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Thu Apr 21 23:10:17 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2m64yg79yl.fsf@starship.python.net> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> Message-ID: On 4/21/05, Michael Hudson wrote: > Shannon -jj Behrens writes: > > > On 4/20/05, M.-A. Lemburg wrote: > > > >> My use case for switch is that of a parser switching on tokens. > >> > >> mxTextTools applications would greatly benefit from being able > >> to branch on tokens quickly. Currently, there's only callbacks, > >> dict-to-method branching or long if-elif-elif-...-elif-else. > > > > I think "match" from Ocaml would be a much nicer addition to Python > > than "switch" from C. > > Can you post a quick summary of how you think this would work? Sure. Now that I'm actually trying to come up with an example, I'm noticing that Ocaml is very different than Python because Python distinguishes statements and expressions, unlike say, Scheme. Furthermore, it's important to minimize the number of new keywords and avoid excessive punctuation (which Ocaml is full of). 
Hence, I propose something like:

    def handle_token(token):
        match token:
            NUMBER:
                return number / a
            WHITESPACE if token.value == "\n":
                return NEWLINE
            (a, b):
                return a / b
            else:
                return token

Hence, the syntax is something like (in pseudo EBNF):

    'match' expr ':' {match_expression ':' block}* 'else' ':' block
    match_expr ::= lvalue | constant_expression

Semantically, the above example translates into:

    def handle_token(token):
        if token == NUMBER:
            return number / a
        elif token == WHITESPACE and token.value == "\n":
            return NEWLINE
        elif "setting (a, b) = token succeeds":
            return a / b
        else:
            return token

However, unlike the code above, you can more easily and more aggressively optimize.

Best Regards,
-jj
--
I have decided to switch to Gmail, but messages to my Yahoo account will still get through.

From sabbey at u.washington.edu Fri Apr 22 00:21:18 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Fri Apr 22 00:21:24 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4267259F.6050902@canterbury.ac.nz>
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

Greg Ewing wrote:
> I also have a thought concerning whether the block
> argument to the function should come first or last or
> whatever.  My solution is that the function should take
> exactly *one* argument, which is the block.  Any other
> arguments are dealt with by currying.  In other words,
> with_file above would be defined as
>
> def with_file(filename):
>     def func(block):
>         f = open(filename)
>         try:
>             block(f)
>         finally:
>             f.close()
>     return func
>
> This would also make implementation much easier.  The
> parser isn't going to know that it's dealing with anything
> other than a normal expression statement until it gets to
> the 'as' or ':', by which time going back and radically
> re-interpreting a previous function call could be awkward.

I made an example implementation, and this wasn't an issue.
It took some code to stick the thunk into the argument list, but it was pretty straightforward.  The syntax that is actually used by the parser can be the same regardless of whether or not argument list augmentation is done, so the parser will not find one more awkward than the other.

> This way, the syntax is just
>
>     expr ['as' assignment_target] ':' suite
>
> and the expr is evaluated quite normally.

Requiring arguments other than the block to be dealt with by currying can lead to problems.  I won't claim these problems are serious, but they will be annoying.  Say, for example, you create a block-accepting function that takes no arguments.  Naturally, you would define it like this:

    def f(block):
        do_something_with_block

Now, say you want to add to this function an optional argument, so you wrap another function around it like in your 'with_file' example above.  Unfortunately, now you need to go find every call of this function and add empty parentheses.  This is annoying.  Remember the first time you added optional arguments to a function and what a relief it was not to have to go find every call to that function and stick in the extra argument?  Those days are over! (well, in this case anyway.)

Some people, aware of this problem of adding optional arguments, will define *all* of their block-accepting functions so that they are wrapped in another function, even if that function takes no arguments (and wars, annoying ones, will be fought over whether this is the "right" way to do it or not!):

    def f():
        def real_func(block):
            pass
        return real_func

Now the documentation gets confusing.  Just saying that the function doesn't take any non-block arguments isn't enough.  You would need very specific language, which many library authors will not provide.  And there will always be that extra step in thought: do I need the stupid parentheses or not?  There will inevitably be people (including me) who get the parentheses wrong because of absentmindedness or carelessness.
This will be an extra little speed bump. Now, you may say that all these arguments apply to function decorators, so why have none of these problems appeared? The difference is that defining a function takes a long time, so a little speed bump when decorating it isn't a big deal. But blocks can be defined almost instantly. Much of their purpose has to do with making things quicker. Speed bumps are therefore a bigger deal. This will also be an issue for beginners who use python. A beginner won't necessarily have a good understanding of a function that returns a function. But such an understanding would be required simply to *use* block-accepting functions. Otherwise it would be completely mysterious why sometimes one sees this f(a,b,c) as i: pass and sometimes this g as i: pass even though both of these cases just seem to call the function that appears next to 'as' (imagine you don't have the source of 'f' and 'g'). Even worse, imagine finally learning the rule that parentheses are not allowed if there are zero arguments, and then seeing: h() as i: pass Now it would just seem arbitrary whether or not parentheses are required or disallowed. Such an issue may seem trivial to an experienced programmer, but can be very off-putting for a beginner. >> Another set of question arose for me when Barry started musing over the >> combination of blocks and decorators. What are blocks? Well, obviously >> they are callable. What do they return? The local namespace they >> created/modified? > > I think the return value of a block should be None. > In constructs like with_file, the block is being used for > its side effect, not to compute a value for consumption > by the block function. I don't see a great need for blocks > to be able to return values. If you google "filetype:rb yield", you can see many the uses of yield in ruby. By looking for the uses in which yield's return value is used, you can find blocks that return values. 
For example, "t = yield()" or "unless yield()" indicate that a block is returning a value. It is true that most of the time blocks do not return values, but I estimate that maybe 20% of the hits returned by google contain at least one block that does. Of course, this information is alone is not very informative, one would like to understand each case individually. But, as a first guess, it seems that people do find good uses for being able to return a value from a block. Probably 'continue ', which I had proposed earlier, is awful syntax for returning a value from a block. But 'produce ' or some other verb may not be so bad. In cases that the block returns no value, 'continue' could still be used to indicate that control should return to the function that called the block. -Brian From gvanrossum at gmail.com Fri Apr 22 01:40:28 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 22 01:40:41 2005 Subject: [Python-Dev] anonymous blocks Message-ID: I've been thinking about this a lot, but haven't made much progess. Here's a brain dump. 
I've been thinking about integrating PEP 325 (Resource-Release Support for Generators) into the for-loop code, so that you could replace

    the_lock.acquire()
    try:
        BODY
    finally:
        the_lock.release()

with

    for dummy in synchronized(the_lock):
        BODY

or perhaps even (making "for VAR" optional in the for-loop syntax) with

    in synchronized(the_lock):
        BODY

Then synchronized() could be written cleanly as follows:

    def synchronized(lock):
        lock.acquire()
        try:
            yield None
        finally:
            lock.release()

But then every for-loop would have to contain an extra try-finally clause; the translation of

    for VAR in EXPR:
        BODY

would become

    __it = iter(EXPR)
    try:
        while True:
            try:
                VAR = __it.next()
            except StopIteration:
                break
            BODY
    finally:
        if hasattr(__it, "close"):
            __it.close()

which I don't particularly like: most for-loops DON'T need this, since they don't use a generator but some other form of iterator, or even if they use a generator, not all generators have a try/finally loop.  But the bytecode compiler can't know that, so it will always have to generate this code.  It also changes the semantics of using a generator in a for-loop slightly: if you break out of the for-loop before the generator is exhausted you will still get the close() call.

It's also a bit funny to see this approach used with the only other use case for try/finally we've looked at, which requires passing a variable into the block: the "with_file" use case.  We now can write with_file as a nice and clean generator:

    def with_file(filename):
        f = open(filename)
        try:
            yield f
        finally:
            f.close()

but the use looks very odd because it is syntactically a for-loop but there's only one iteration:

    for f in with_file("/etc/passwd"):
        for line in f:
            print line[:line.find(":")]

Seeing this example makes me cringe -- why two nested for loops to loop over the lines of one file???

So I think that this is probably not the right thing to pursue, and we might be better off with something along the lines of PEP 310.
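(With hindsight: the generator sketched above is, give or take spelling, what later shipped as `contextlib.contextmanager` in PEP 343, with a `with` statement rather than a for-loop driving the single yield.  A sketch assuming a modern Python:)

```python
from contextlib import contextmanager
import threading

@contextmanager
def synchronized(lock):
    # Same shape as the generator in the mail: acquire, yield once
    # inside try, release in finally -- driven by `with`, not `for`.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

lock = threading.Lock()
with synchronized(lock):
    assert lock.locked()     # held inside the block
assert not lock.locked()     # released on the way out
```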
The authors of PEP 310 agree; under Open Issues they wrote:

    There are some similarities in concept between 'with ...' blocks
    and generators, which have led to proposals that for loops could
    implement the with block functionality[3].  While neat on some
    levels, we think that for loops should stick to being loops.

(Footnote [3] references the thread that originated PEP 325.)

Perhaps the most important lesson we've learned in this thread is that the 'with' keyword proposed in PEP 310 is redundant -- the syntax could just be

    [VAR '=']* EXPR ':'
        BODY

IOW the regular assignment / expression statement gets an optional colon-plus-suite at the end.

So now let's assume we accept PEP 310 with this change.  Does this leave any use cases for anonymous blocks uncovered?  Ruby's each() pattern is covered by generators; personally I prefer Python's

    for var in seq: ...

over Ruby's much-touted

    seq.each() {|var| ...}

The try/finally use case is covered by PEP 310.  (If you want to combine this with a for-loop in a single operation, you'll need PEP 325.)

The use cases where the block actually returns a value are probably callbacks for things like sort() or map(); I have to admit that I'd rather keep lambda for these (and use named functions for longer blocks) than introduce an anonymous block syntax that can return values!  I also note that if you *already* have a comparison function, Ruby's Array sort method doesn't let you pass it in as a function argument; you have to give it a block that calls the comparison function, because blocks are not the same as callables (and I'm not sure that Ruby even *has* callables -- everything seems to be a block).

My tentative conclusion remains: Python doesn't need Ruby blocks.  Brian Sabbey ought to come up with more examples rather than arguments why his preferred syntax and semantics are best.
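(For comparison, the PEP 310-style protocol discussed above is close to the `__enter__`/`__exit__` protocol that eventually shipped in PEP 343.  A minimal sketch with a made-up `Tracking` class:)

```python
class Tracking:
    # Made-up example class implementing the protocol that eventually
    # shipped (PEP 343): __enter__ supplies the bound value, __exit__
    # receives exception details and runs even on error.
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not swallow exceptions

t = Tracking()
with t as handle:
    assert handle is t
print(t.events)  # ['enter', 'exit']
```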
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python-dev at zesty.ca Fri Apr 22 01:49:36 2005
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Fri Apr 22 01:49:47 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

On Thu, 21 Apr 2005, Guido van Rossum wrote:
> Perhaps it could be even simpler:
>
>     [assignment_target '=']* expr ':' suite
>
> This would just be an extension of the regular assignment statement.

It sounds like you are very close to simply translating

    expression...
    function_call(args):
        suite

into

    expression...
    function_call(args)(suitefunc)

If i understand what you proposed above, you're using assignment as a special case to pass arguments to the inner suite, right?  So:

    inner_args = function_call(outer_args):
        suite

becomes:

    def suitefunc(inner_args):
        suite
    function_call(outer_args)(suitefunc)

?  This could get a little hard to understand if the right-hand side of the assignment is more complex than a single function call.  I think the meaning would be unambiguous, just non-obvious.  The only interpretation i see for this:

    x = spam('foo') + eggs('bar'):
        suite

is this:

    def suitefunc(x):
        suite
    spam('foo') + eggs('bar')(suitefunc)

but that could seem a little too mysterious.  Or you could (in a later compiler pass) forbid more complex expressions on the RHS.

On another note, would there be any difference between

    x = spam():
        suite

and

    x = spam:
        suite

?

--
?!ng

From gvanrossum at gmail.com Fri Apr 22 01:54:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 22 01:54:22 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

[Ping]
> It sounds like you are very close to simply translating
>
>     expression...
>     function_call(args):
>         suite
>
> into
>
>     expression...
>     function_call(args)(suitefunc)

Actually, I'm abandoning this interpretation; see my separate (long) post.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Apr 22 01:55:14 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 01:56:10 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To:
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu> <1114086369.5763.7.camel@workstation>
Message-ID: <42683D62.9080101@ocf.berkeley.edu>

Guido van Rossum wrote:
>> So the two things I thought were glitches are actually cancelling each
>> other out.  Very good.  Thanks for your help.
>
> Though I wonder why it was written so delicately.

Don't know; Jeremy wrote those functions back in 2001 to add nested scopes.  If he remembers he deserves a cookie for having such a good memory.

> Would explicit
> INCREF/DECREF really have hurt the performance that much?  This is only
> the bytecode compiler, which isn't on the critical path.

Probably not.  But at this point I doubt it is worth fixing since the AST branch will replace it eventually (work is on-going, just slow since my thesis is on the home stretch; initial draft is done and now I am editing to hand over for final revision by my advisor).

-Brett

From bac at OCF.Berkeley.EDU Fri Apr 22 02:25:16 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 02:25:25 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References:
Message-ID: <4268446C.6010301@ocf.berkeley.edu>

Guido van Rossum wrote:
> I've been thinking about this a lot, but haven't made much
> progress.  Here's a brain dump.
>
> I've been thinking about integrating PEP 325 (Resource-Release Support
> for Generators) into the for-loop code, so that you could replace

[SNIP - using 'for' syntax to delineate the block and resource]

> So I think that this is probably not the right thing to pursue,

I totally agree with your reasoning on this.

> and we
> might be better off with something along the lines of PEP 310.  The
> authors of PEP 310 agree; under Open Issues they wrote:
>
>     There are some similarities in concept between 'with ...' blocks
>     and generators, which have led to proposals that for loops could
>     implement the with block functionality[3].  While neat on some
>     levels, we think that for loops should stick to being loops.
>
> (Footnote [3] references the thread that originated PEP 325.)
>
> Perhaps the most important lesson we've learned in this thread is that
> the 'with' keyword proposed in PEP 310 is redundant -- the syntax
> could just be
>
>     [VAR '=']* EXPR ':'
>         BODY
>
> IOW the regular assignment / expression statement gets an optional
> colon-plus-suite at the end.

Sure, but is the redundancy *that* bad?  You should be able to pick up visually that something is an anonymous block from the indentation but I don't know how obvious it would be.  Probably, in the end, this minimal syntax would be fine, but it just seems almost too plain in terms of screaming at me that something special is going on there (the '=' in an odd place just doesn't quite cut it for me for my meaning of "special").

> So now let's assume we accept PEP 310 with this change.  Does this
> leave any use cases for anonymous blocks uncovered?  Ruby's each()
> pattern is covered by generators; personally I prefer Python's
>
>     for var in seq: ...
>
> over Ruby's much-touted
>
>     seq.each() {|var| ...}
>
> The try/finally use case is covered by PEP 310.  (If you want to
> combine this with a for-loop in a single operation, you'll need PEP
> 325.)
> > The use cases where the block actually returns a value are probably > callbacks for things like sort() or map(); I have to admit that I'd > rather keep lambda for these (and use named functions for longer > blocks) than introduce an anonymous block syntax that can return > values! I also note that if you *already* have a comparison function, > Ruby's Array sort method doesn't let you pass it in as a function > argument; you have to give it a block that calls the comparison > function, because blocks are not the same as callables (and I'm not > sure that Ruby even *has* callables -- everything seems to be a > block). > > My tentative conclusion remains: Python doesn't need Ruby blocks. > Brian Sabbey ought to come up with more examples rather than arguments > why his preferred syntax and semantics are best. > I think I agree with Samuele that it would be more pertinent to put all of this effort into trying to come up with some way to handle cleanup in a generator. -Brett From gvanrossum at gmail.com Fri Apr 22 02:34:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 22 02:34:32 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <4268446C.6010301@ocf.berkeley.edu> References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: [Brett] > I think I agree with Samuele that it would be more pertinent to put all of this > effort into trying to come up with some way to handle cleanup in a generator. I.e. PEP 325. But (as I explained, and you agree) that still doesn't render PEP 310 unnecessary, because abusing the for-loop for implied cleanup semantics is ugly and expensive, and would change generator semantics; and it bugs me that the finally clause's reachability depends on the destructor executing. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python-dev at zesty.ca Fri Apr 22 02:39:18 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Fri Apr 22 02:39:22 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: On Thu, 21 Apr 2005, Guido van Rossum wrote: > The use cases where the block actually returns a value are probably > callbacks for things like sort() or map(); I have to admit that I'd > rather keep lambda for these (and use named functions for longer > blocks) than introduce an anonymous block syntax that can return > values! It seems to me that, in general, Python likes to use keywords for statements and operators for expressions. Maybe the reason lambda looks like such a wart is that it uses a keyword in the middle of an expression. It also uses the colon *not* to introduce an indented suite, which is a strange thing to the Pythonic eye. This suggests that an operator might fit better. A possible operator for lambda might be ->. sort(items, key=x -> x.lower()) Anyway, just a thought. -- ?!ng From pedronis at strakt.com Fri Apr 22 02:44:17 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Fri Apr 22 02:42:34 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: <426848E1.6050703@strakt.com> Guido van Rossum wrote: > [Brett] > >>I think I agree with Samuele that it would be more pertinent to put all of this >>effort into trying to come up with some way to handle cleanup in a generator. > > > I.e. PEP 325. > > But (as I explained, and you agree) that still doesn't render PEP 310 > unnecessary, because abusing the for-loop for implied cleanup > semantics is ugly and expensive, and would change generator semantics; > and it bugs me that the finally clause's reachability depends on the > destructor executing. 
>

yes, PEP325 would work in combination with PEP310, whether a combined thing (which cannot be the current for as discussed) is desirable is a different issue: these anyway

    f = file(...):
        for line in f:
            ...

vs.

    it = gen():
        for val in it:
            ...

would be analogous in a PEP310+325 world.

From jcarlson at uci.edu Fri Apr 22 02:59:59 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Apr 22 03:02:17 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To:
References: <4268446C.6010301@ocf.berkeley.edu>
Message-ID: <20050421173945.63C3.JCARLSON@uci.edu>

Guido van Rossum wrote:
>
> [Brett]
> > I think I agree with Samuele that it would be more pertinent to put all of this
> > effort into trying to come up with some way to handle cleanup in a generator.
>
> I.e. PEP 325.
>
> But (as I explained, and you agree) that still doesn't render PEP 310
> unnecessary, because abusing the for-loop for implied cleanup
> semantics is ugly and expensive, and would change generator semantics;
> and it bugs me that the finally clause's reachability depends on the
> destructor executing.

Yes and no.  PEP 325 offers a method to generators that handles cleanup if necessary and calls it close().  Obviously calling it close is a mistake.  Actually, calling it anything is a mistake, and trying to combine try/finally handling in generators with __exit__/close (inside or outside of generators) is also a mistake.

Start by saying, "If a non-finalized generator is garbage collected, it will be finalized."  Whether this be by an exception or forcing a return, so be it.

If this were to happen, we have generator finalization handled by the garbage collector, and don't need to translate /any/ for loop.  As long as the garbage collection requirement is documented, we are covered (yay!).

What about ...

    i.__enter__()
    try:
        ...
    finally:
        i.__exit__()

... types of things?  Well, you seem to have offered a syntax ...

    [VAR '=']* EXPR:
        BODY

...
which seems to translate into

    [VAR = ] __var = EXPR
    try:
        BODY
    finally:
        __var.__exit__()

or something like that.  Great!  We've got a syntax for resource
allocation/freeing outside of generators, and a non-syntax for resource
allocation/freeing inside of generators.

 - Josiah

From bob at redivi.com Fri Apr 22 03:47:14 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Apr 22 03:47:29 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To: <20050421173945.63C3.JCARLSON@uci.edu>
References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu>
Message-ID: <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>

On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote:
> Guido van Rossum wrote:
>>
>> [Brett]
>>> I think I agree with Samuele that it would be more pertinent to put
>>> all of this
>>> effort into trying to come up with some way to handle cleanup in a
>>> generator.
>>
>> I.e. PEP 325.
>>
>> But (as I explained, and you agree) that still doesn't render PEP 310
>> unnecessary, because abusing the for-loop for implied cleanup
>> semantics is ugly and expensive, and would change generator semantics;
>> and it bugs me that the finally clause's reachability depends on the
>> destructor executing.
>
> Yes and no.  PEP 325 offers a method to generators that handles cleanup
> if necessary and calls it close().  Obviously calling it close is a
> mistake.  Actually, calling it anything is a mistake, and trying to
> combine try/finally handling in generators with __exit__/close (inside
> or outside of generators) is also a mistake.
>
> Start by saying, "If a non-finalized generator is garbage collected, it
> will be finalized."  Whether this be by an exception or forcing a
> return, so be it.
>
> If this were to happen, we have generator finalization handled by the
> garbage collector, and don't need to translate /any/ for loop.
As long
> as the garbage collection requirement is documented, we are covered
> (yay!).

Well, for the CPython implementation, couldn't you get away with using
garbage collection to do everything?  Maybe I'm missing something..

import weakref

class ResourceHandle(object):
    def __init__(self, acquire, release):
        acquire()
        # if I understand correctly, this is safer than __del__
        self.ref = weakref.ref(self, lambda o: release())

class FakeLock(object):
    def acquire(self):
        print "acquired"
    def release(self):
        print "released"

def with_lock(lock):
    r = ResourceHandle(lock.acquire, lock.release)
    yield None
    del r

>>> x = with_lock(FakeLock())
>>> del x
>>> with_lock(FakeLock()).next()
acquired
released
>>> for ignore in with_lock(FakeLock()):
...     print ignore
...
acquired
None
released

I could imagine someone complaining about generators that are never used
missing out on the acquire/release.  That could be solved with a trivial
rewrite:

def with_lock(lock):
    def _with_lock(r):
        yield None
        del r
    return _with_lock(ResourceHandle(lock.acquire, lock.release))

>>> x = with_lock(FakeLock())
acquired
>>> del x
released

Of course, this just exaggerates Guido's "it bugs me that the finally
clause's reachability depends on the destructor executing".. but it does
work, in CPython.  It seems to me that this pattern would be painless
enough to use without a syntax change...

-bob

From aahz at pythoncraft.com Fri Apr 22 03:51:21 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Apr 22 03:51:24 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID: <20050422015121.GB18897@panix.com>

On Thu, Apr 21, 2005, Guido van Rossum wrote:
>
> Perhaps the most important lesson we've learned in this thread is that
> the 'with' keyword proposed in PEP 310 is redundant -- the syntax
> could just be
>
>     [VAR '=']* EXPR ':'
>         BODY
>
> IOW the regular assignment / expression statement gets an optional
> colon-plus-suite at the end.

Yes, it could.
The question then becomes whether it should.  Because it's easy to indent
Python code when you're not using a block (consider function calls with
lots of args), my opinion is that like the "optional" colon after ``for``
and ``if``, the resource block *should* have a keyword.
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From tjreedy at udel.edu Fri Apr 22 04:30:22 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Apr 22 04:30:52 2005
Subject: [Python-Dev] Re: anonymous blocks
References: Message-ID:

I do not know that I have ever needed 'anonymous blocks', and I have
therefore not followed this discussion in detail, but I appreciate
Python's beauty and want to see it maintained.  So I have three comments
and yet-another syntax proposal that I do not remember seeing (but could
have missed).

1. Python's integration of for loops, iterators, and generators is, to
me, a gem of programming language design that distinguishes Python from
other languages I have used.  Using them to not iterate but to do
something else may be cute, but in a perverted sort of way.  I would
rather have 'something else' done some other way.

2. General-purpose passable block objects with parameters look a lot
like general-purpose anonymous functions ('full lambdas').  I bet they
would be used as such if at all possible.  This seems to me like the
wrong direction.

3. The specific use-cases for Python not handled better by current
syntax seem to be rather specialized: resource management around a
block.

So I cautiously propose:

    with <keyword> <expression>:
        <codeblock>

with the exact semantics dependent on <keyword>.  In particular:

    with lock somelock:
        codeblock

could abbreviate and mean

    somelock.acquire()
    try:
        codeblock
    finally:
        somelock.release()

(Guido's example).
    with file somefile:
        codeblock

might translate to (the bytecode equivalent of)

    if isinstance(somefile, basestring?):
        somefile = open(somefile, defaults)
    codeblock
    somefile.close()

The compound keywords could be 'underscored' but I presume they could be
parsed as is, much like 'not in'.

Terry J. Reedy

From skip at pobox.com Fri Apr 22 05:16:55 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 22 05:17:13 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID: <17000.27815.106381.198125@montanaro.dyndns.org>

    Guido> or perhaps even (making "for VAR" optional in the for-loop syntax)
    Guido> with
    Guido> in synchronized(the_lock):
    Guido> BODY

This could be a new statement, so the problematic issue of implicit
try/finally in every for statement wouldn't be necessary.  That
complication would only be needed for the above form.  (Of course, if
you've dispensed with this I am very likely missing something
fundamental.)

Skip

From steven.bethard at gmail.com Fri Apr 22 05:55:42 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Apr 22 05:55:46 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID:

Ka-Ping Yee wrote:
> It seems to me that, in general, Python likes to use keywords for
> statements and operators for expressions.

Probably worth noting that 'for', 'in' and 'if' in generator expressions
and list comprehensions blur this distinction somewhat...

Steve
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com Fri Apr 22 06:25:22 2005
From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri Apr 22 06:24:36 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2mu0m05a4y.fsf@starship.python.net> References: <4267B27E.9030107@strakt.com> <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> <4267B27E.9030107@strakt.com> Message-ID: <5.1.1.6.0.20050421150027.021c60b0@mail.telecommunity.com> At 06:10 PM 04/21/2005 +0100, Michael Hudson wrote: >But the visitor pattern is pretty grim, really. It would be nice (tm) >to have something like: > > match node in: > Assign(lhs=Var(_), rhs=_): > # lhs, rhs bound in here > Assign(lhs=Subscr(_,_), rhs=_): > # ditto > Assign(lhs=Slice(*_), rhs=_): > # ditto > Assign(lhs=_, rhs=_): > raise SyntaxError > >in Lib/compiler. FWIW, I do intend to add this sort of thing to PyProtocols' predicate dispatch system. Actually, I can dispatch on rules like the above now, it's just that you have to spell out the cases as e.g.: @do_it.when("isinstance(node, Assign) and isinstance(node.lhs, Subscr)") def do_subscript_assign(node, ...): ... I'd like to create a syntax sugar for pattern matching though, that would let you 1) use a less verbose way of saying the same thing, and 2) let you bind the intermediate values to variables that then become accessible in the function body as locals. Anyway, the main holdup on this is deciding what sort of Python syntax abuse should represent variable bindings. :) Maybe something like this will be suitably horrific: @do_it.when("node in Assign.match(lhs=`lhs` in Subscr,rhs=`rhs`)") def do_subscript_assign((lhs,rhs), node, ...): ... But I think maybe here the cure is worse than the disease. :) Pushed this far, it seems to beg for new syntax to accommodate in-expression variable bindings, something like 'var:=value'. Really, though, the problem is probably just that inline variable binding is downright unpythonic. 
The only time Python does anything vaguely similar is with the 'except type,var:' syntax. From bac at OCF.Berkeley.EDU Fri Apr 22 06:26:00 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Apr 22 06:26:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: <42687CD8.3000204@ocf.berkeley.edu> Guido van Rossum wrote: > [Brett] > >>I think I agree with Samuele that it would be more pertinent to put all of this >>effort into trying to come up with some way to handle cleanup in a generator. > > > I.e. PEP 325. > > But (as I explained, and you agree) that still doesn't render PEP 310 > unnecessary, because abusing the for-loop for implied cleanup > semantics is ugly and expensive, and would change generator semantics; Right, I'm not saying PEP 310 shouldn't also be considered. It just seems like we are beginning to pile a lot on this discussion by bringing in PEP 310 and PEP 325 in at the same time since, as pointed out, there is no guarantee that anything will be called in a generator and thus making PEP 310 work in generators does not seem guaranteed to solve that problem (although I might have missed something; just started really following the thread today). At this point anonymous blocks just don't seem to be happening, at least not like in Ruby. Fine, I didn't want them anyway. Now we are trying to simplify resource cleanup and handling. What I am trying to say is that generators differ just enough as to possibly warrant a separate discussion from all of this other resource handling "stuff". So I am advocating a more focused generator discussion since resource handling in generators is much more difficult than the general case in non-generator situations. I mean obviously in the general case all of this is handled already in Python today with try/finally. 
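[Note: the destructor-dependent behaviour being debated here can be seen
directly in today's CPython, where PEP 342 later made try/finally legal
around a yield. The logging generator below is purely illustrative, not
code from the thread.]

```python
events = []

def reader():
    events.append("open")       # stands in for acquiring a real resource
    try:
        yield 1
        yield 2
    finally:
        events.append("close")  # cleanup that should always run

g = reader()
next(g)        # advances to the first yield; "open" is recorded
del g          # in CPython the refcount hits zero, the generator is
               # finalized, and the finally clause runs immediately
print(events)  # -> ['open', 'close']
```

In CPython the cleanup is prompt only because of reference counting; on a
tracing collector the finally clause could run arbitrarily late, which is
exactly the objection raised above.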
But with generators you have to jump through some extra hoops to get
similar support (passing in anything that needs to be cleaned up, hoping
that garbage collection will eventually handle things, etc.).

> and it bugs me that the finally clause's reachability depends on the
> destructor executing.

Yeah, I don't like it either.  I would rather see something like:

    def gen():
        FILE = open("stuff.txt", 'rU')
        for line in FILE:
            yield line
        cleanup:
            FILE.close()

and have whatever is in the 'cleanup' block be either accessible from a
method in the generator or have it become the equivalent of a __del__
for the generator, or maybe even both (which would remove contention
that whatever needs to be cleaned up is done too late thanks to gc not
guaranteeing immediate cleanup).  This way you get the guaranteed
cleanup regardless and you don't have to worry about creating everything
outside of the generator, passing it in, and then handling cleanup in a
try/finally that contains the next() calls to the generator (or any
other contortion you might have to go through).

Anyway, my random Python suggestion for the day.

-Brett

From bac at OCF.Berkeley.EDU Fri Apr 22 06:28:42 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 06:28:50 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To: <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>
References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu> <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>
Message-ID: <42687D7A.5010206@ocf.berkeley.edu>

Bob Ippolito wrote:
>
> On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote:
>
>> Guido van Rossum wrote:
>>
>>>
>>> [Brett]
>>>
>>>> I think I agree with Samuele that it would be more pertinent to put
>>>> all of this
>>>> effort into trying to come up with some way to handle cleanup in a
>>>> generator.
>>>
>>>
>>> I.e. PEP 325.
>>> >>> But (as I explained, and you agree) that still doesn't render PEP 310 >>> unnecessary, because abusing the for-loop for implied cleanup >>> semantics is ugly and expensive, and would change generator semantics; >>> and it bugs me that the finally clause's reachability depends on the >>> destructor executing. >> >> >> Yes and no. PEP 325 offers a method to generators that handles cleanup >> if necessary and calls it close(). Obviously calling it close is a >> mistake. Actually, calling it anything is a mistake, and trying to >> combine try/finally handling in generators with __exit__/close (inside >> or outside of generators) is also a mistake. >> >> >> Start by saying, "If a non-finalized generator is garbage collected, it >> will be finalized." Whether this be by an exception or forcing a return, >> so be it. >> >> If this were to happen, we have generator finalization handled by the >> garbage collector, and don't need to translate /any/ for loop. As long >> as the garbage collection requirement is documented, we are covered >> (yay!). > > > Well, for the CPython implementation, couldn't you get away with using > garbage collection to do everything? Maybe I'm missing something.. > [SNIP] Well, if you are missing something then so am I since your suggestion is basically correct. The only issue is that people will want more immediate execution of the cleanup code which gc cannot guarantee. That's why the ability to call a method with the PEP 325 approach gets rid of that worry. 
-Brett From bob at redivi.com Fri Apr 22 06:49:25 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Apr 22 06:49:36 2005 Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization) In-Reply-To: <42687D7A.5010206@ocf.berkeley.edu> References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu> <4052ed44a40fa22f767a33e8d73d85fb@redivi.com> <42687D7A.5010206@ocf.berkeley.edu> Message-ID: <69746cadca283f5b9c1e76686d2ccb01@redivi.com> On Apr 22, 2005, at 12:28 AM, Brett C. wrote: > Bob Ippolito wrote: >> >> On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote: >> >>> Guido van Rossum wrote: >>> >>>> >>>> [Brett] >>>> >>>>> I think I agree with Samuele that it would be more pertinent to put >>>>> all of this >>>>> effort into trying to come up with some way to handle cleanup in a >>>>> generator. >>>> >>>> >>>> I.e. PEP 325. >>>> >>>> But (as I explained, and you agree) that still doesn't render PEP >>>> 310 >>>> unnecessary, because abusing the for-loop for implied cleanup >>>> semantics is ugly and expensive, and would change generator >>>> semantics; >>>> and it bugs me that the finally clause's reachability depends on the >>>> destructor executing. >>> >>> >>> Yes and no. PEP 325 offers a method to generators that handles >>> cleanup >>> if necessary and calls it close(). Obviously calling it close is a >>> mistake. Actually, calling it anything is a mistake, and trying to >>> combine try/finally handling in generators with __exit__/close >>> (inside >>> or outside of generators) is also a mistake. >>> >>> >>> Start by saying, "If a non-finalized generator is garbage collected, >>> it >>> will be finalized." Whether this be by an exception or forcing a >>> return, >>> so be it. >>> >>> If this were to happen, we have generator finalization handled by the >>> garbage collector, and don't need to translate /any/ for loop. As >>> long >>> as the garbage collection requirement is documented, we are covered >>> (yay!). 
>>
>> Well, for the CPython implementation, couldn't you get away with using
>> garbage collection to do everything?  Maybe I'm missing something..
>>
>
> [SNIP]
>
> Well, if you are missing something then so am I since your suggestion
> is basically correct.  The only issue is that people will want more
> immediate execution of the cleanup code which gc cannot guarantee.
> That's why the ability to call a method with the PEP 325 approach gets
> rid of that worry.

Well in CPython, if you are never assigning the generator to any local
or global, then you should be guaranteed that it gets cleaned up at the
right time unless it's alive in a traceback somewhere (maybe you WANT it
to be!) or some insane trace hook keeps too many references to frames
around..  It seems *reasonably* certain that for reasonable uses this
solution WILL clean it up optimistically.

-bob

From bac at OCF.Berkeley.EDU Fri Apr 22 06:56:31 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 06:56:38 2005
Subject: [Python-Dev] Proper place to put extra args for building
In-Reply-To: <426744DF.2030309@v.loewis.de>
References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de>
Message-ID: <426883FF.5060009@ocf.berkeley.edu>

Martin v. Löwis wrote:
> Brett C. wrote:
>
>> Works for me.  If no one objects I will check in the change for CFLAGS
>> to make it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting
>> it enough to make sure that it isn't evaluated by configure but left
>> as a string to be evaluated by the shell when the Makefile is
>> running?).
>
> If you put it into Makefile.pre.in, the only thing to avoid that
> configure evaluates is is not to use @FOO@.
OTOH, putting a $
> in front of it is not good enough for make: $EXTRA_CFLAGS evaluates
> the variable E, and then appends XTRA_CFLAGS.

Yep, you're right.  I initially thought that the parentheses meant it
was a Makefile-only variable, but it actually goes to the environment
for those unknown values.

Before I check it in, though, should setup.py be tweaked to use it as
well?  I say yes.

-Brett

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:45 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: <42689781.20505@canterbury.ac.nz>

Ka-Ping Yee wrote:
> Can you explain what you meant by currying here?  I know what
> the word "curry" means, but i am having a hard time seeing how
> it applies to your example.

It's currying in the sense that instead of one function which takes all
the args at once, you have a function that takes some of them (all
except the thunk) and returns another one that takes the rest (the
thunk).

> Could you make up an example that uses more arguments?

    def with_file(filename, mode):
        def func(block):
            f = open(filename, mode)
            try:
                block(f)
            finally:
                f.close()
        return func

Usage example:

    with_file("foo.txt", "w") as f:
        f.write("My hovercraft is full of parrots.")

Does that help?

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:48 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:05 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: <42689784.80305@canterbury.ac.nz>

Guido van Rossum wrote:
> Perhaps it could be even simpler:
>
>     [assignment_target '=']* expr ':' suite

I don't like that so much.  It looks like you're assigning the result of
expr to assignment_target, and then doing something else.

> This would just be an extension of the regular assignment statement.

Syntactically, yes, but semantically it's more complicated than just a
"simple extension", to my mind.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:50 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:09 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID: <42689786.7090400@canterbury.ac.nz>

Steven Bethard wrote:
> line = None
> @withfile("readme.txt")
> def print_readme(fileobj):
>     lexical line
>     for line in fileobj:
>         print line
> print "last line:", line

Since the name of the function isn't important, that could be reduced to

    @withfile("readme.txt")
    def _(fileobj):
        ...

(Disclaimer: This post should not be taken as an endorsement of this
abuse!  I'd still much rather have a proper language feature for it.)
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:53 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 22 08:20:13 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz> Message-ID: <42689789.5020004@canterbury.ac.nz> Brian Sabbey wrote: > I made an example implementation, and this wasn't an issue. It took > some code to stick the thunk into the argument list, but it was pretty > straightforward. What does your implementation do with something like f() + g(): ... ? (A syntax error, I would hope.) While no doubt it can be done, I still don't like the idea very much. It seems like a violation of modularity in the grammar, so to speak. The syntax effectively allowed for the expression is severely limited by the fact that a block follows it, which is a kind of backward effect that violates the predominantly LL-flavour of the rest of the syntax. There's a backward effect in the semantics, too -- you can't properly understand what the otherwise-normal-looking function call is doing without knowing what comes later. An analogy has been made with the insertion of "self" into the arguments of a method. But that is something quite different. In x.meth(y), the rules are being followed quite consistently: the result of x.meth is being called with y (and only y!) as an argument; the insertion of self happens later. But here, insertion of the thunk would occur *before* any call was made at all, with no clue from looking at the call itself. > Requiring arguments other than the block to be dealt with by currying > can lead to problems. I won't claim these problems are serious, but > they will be annoying. 
You have some valid concerns there.  You've given me something to think
about.

Here's another idea.  Don't write the parameters in the form of a call
at all; instead, do this:

    with_file "foo.txt", "w" as f:
        f.write("Spam!")

This would have the benefit of making it look more like a control
structure and less like a funny kind of call.

I can see some problems with that, though.  Juxtaposing two expressions
doesn't really work, because the result can end up looking like a
function call or indexing operation.  I don't want to put a keyword in
between because that would mess up how it reads.  Nor do I want to put
some arbitrary piece of punctuation in there.  The best I can think of
right now is

    with_file {"foo.txt", "w"} as f:
        f.write("Spam!")

> If you google "filetype:rb yield", you can see many of the uses of
> yield in ruby.

I'm sure that use cases can be found, but the pertinent question is
whether a substantial number of those use cases from Ruby fall into the
class of block-uses which aren't covered by other Python facilities.

Also, I have a gut feeling that it's a bad idea to try to provide for
this.  I think the reason is this: We're trying to create something that
feels like a user-defined control structure with a suite, and there's
currently no concept in Python of a suite returning a value to be
consumed by its containing control structure.  It would be something
new, and it would require some mental gymnastics to understand what it
was doing.  We already have "return" and "yield"; this would be a third
similar-yet-different thing.

If it were considered important enough, it could easily be added later,
without disturbing anything.  But I think it's best left out of an
initial specification.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From facundobatista at gmail.com Fri Apr 22 15:30:11 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Fri Apr 22 15:30:16 2005 Subject: [Python-Dev] Caching objects in memory Message-ID: Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)? If not, could please somebody point me out where this is implemented for strings? Thank you! . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From mwh at python.net Fri Apr 22 15:50:57 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 22 15:51:01 2005 Subject: [Python-Dev] Caching objects in memory In-Reply-To: (Facundo Batista's message of "Fri, 22 Apr 2005 10:30:11 -0300") References: Message-ID: <2moec6539a.fsf@starship.python.net> Facundo Batista writes: > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? No. > If not, could please somebody point me out where this is implemented > for strings? In PyString_FromStringAndSize and PyString_FromString, it seems to me. Cheers, mwh -- I also feel it essential to note, [...], that Description Logics, non-Monotonic Logics, Default Logics and Circumscription Logics can all collectively go suck a cow. Thank you. -- http://advogato.org/person/Johnath/diary.html?start=4 From fredrik at pythonware.com Fri Apr 22 15:50:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Apr 22 15:57:43 2005 Subject: [Python-Dev] Re: Caching objects in memory References: Message-ID: Facundo Batista wrote: > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? why do you think you need to know? > If not, could please somebody point me out where this is implemented > for strings? Objects/stringobject.c (where else? 
;-)

From theller at python.net Fri Apr 22 16:57:26 2005
From: theller at python.net (Thomas Heller)
Date: Fri Apr 22 16:57:34 2005
Subject: [Python-Dev] Error checking in init functions
Message-ID:

I always wondered why there usually is very sloppy error checking in
init functions.  Usually it goes like this (I removed declarations and
some other lines for clarity):

    PyMODINIT_FUNC
    PyInit_zlib(void)
    {
        m = Py_InitModule4("zlib", zlib_methods,
                           zlib_module_documentation,
                           (PyObject*)NULL, PYTHON_API_VERSION);
        ZlibError = PyErr_NewException("zlib.error", NULL, NULL);
        if (ZlibError != NULL) {
            Py_INCREF(ZlibError);
            PyModule_AddObject(m, "error", ZlibError);
        }
        PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS);
        PyModule_AddIntConstant(m, "DEFLATED", DEFLATED);
        ver = PyString_FromString(ZLIB_VERSION);
        if (ver != NULL)
            PyModule_AddObject(m, "ZLIB_VERSION", ver);
        PyModule_AddStringConstant(m, "__version__", "1.0");
    }

Why isn't the result checked in the PyModule_... functions?  Why is the
failure of PyErr_NewException silently ignored?

The problem is that when one of these things fail (although they are
probably supposed to NOT fail) you end up with a module missing
something, without any error message.

What would be the correct thing to do - I assume something like

    if (PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS)) {
        PyErr_Print();
        return;
    }

Thomas

From mwh at python.net Fri Apr 22 17:05:29 2005
From: mwh at python.net (Michael Hudson)
Date: Fri Apr 22 17:05:31 2005
Subject: [Python-Dev] Error checking in init functions
In-Reply-To: (Thomas Heller's message of "Fri, 22 Apr 2005 16:57:26 +0200")
References: Message-ID: <2mk6mu4zt2.fsf@starship.python.net>

Thomas Heller writes:
> I always wondered why there usually is very sloppy error checking in
> init functions.

Laziness, I presume...

> The problem is that when one of these things fail (although they are
> probably supposed to NOT fail) you end up with a module missing
> something, without any error message.

Err.
There's a call to PyErr_Occurred() after the init function is called, so
you should get an error message.  Carrying on regardless after an error
runs the risk that the exception will be cleared, of course.

> What would be the correct thing to do - I assume something like
>
>     if (PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS)) {
>         PyErr_Print();
>         return;
>     }

Just return, I think.

Cheers,
mwh
--
The meaning of "brunch" is as yet undefined.
        -- Simon Booth, ucam.chat

From jimjjewett at gmail.com Fri Apr 22 17:06:16 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 22 17:06:19 2005
Subject: [Python-Dev] Re: switch statement
Message-ID:

Michael Chermside wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

Why not just use classes?  With either mixins or new-style classes, it
is quite reasonable to use many small classes for fine distinctions.

Change

    if predicate1(obj):
        action1(obj)
    elif predicate2(obj):
        action2(obj)
    ...
    else:
        default(obj)

into either

    try:
        obj.action(locals())
    except AttributeError:
        default(obj, locals())

or

    if hasattr(obj, "action"):
        obj.action(locals())
    else:

And then define an action method (perhaps through inheritance from a
mixin) for any object that should not take the default path.  The
object's own methods will have access to any variables used in the match
and locals will have access to the current scope.

If you have at least one class per "switch", you have a switch
statement.

The down sides are that

(1) Your domain objects will have to conform to at least a weak OO model
(or take the default path)

(2) Logic that should be together will be split up.  Either classes will
be modified externally, or the "switch statement" logic will be broken
up between different classes.
If single-method mixins are used to keep the logic close, then real
objects will have to pick an ancestor for what may seem like arbitrary
reasons.

These objections apply to any matching system based on types; the
difference is that other languages have often already paid the price.
For Python it is an incremental cost incurred by the match system.

-jJ

From ncoghlan at gmail.com Fri Apr 22 18:41:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 22 18:42:19 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <17000.27815.106381.198125@montanaro.dyndns.org>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
Message-ID: <42692956.5070305@gmail.com>

Skip Montanaro wrote:
>     Guido> or perhaps even (making "for VAR" optional in the for-loop syntax)
>     Guido> with
>     Guido> in synchronized(the_lock):
>     Guido> BODY
>
> This could be a new statement, so the problematic issue of implicit
> try/finally in every for statement wouldn't be necessary.  That complication
> would only be needed for the above form.

s/in/with/ to get PEP 310.

A parallel which has been bugging me is the existence of the iterator
protocol (__iter__, next()) which you can implement manually if you
want, and the existence of generators, which provide a nice clean way of
writing iterators as functions.  I'm wondering if something similar
can't be found for the __enter__/__exit__ resource protocol.

Guido's recent screed crystallised the idea of writing resources as
two-part generators:

    def my_resource():
        print "Hi!"   # Do entrance code
        yield None    # Go on with the contents of the 'with' block
        print "Bye!"  # Do exit code

Giving the internal generator object an enter method that calls
self.next() (expecting None to be returned), and an exit method that
does the same (but expects StopIteration to be raised) should suffice to
make this possible with a PEP 310 style syntax.
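[Note: Nick's idea is recognisably the ancestor of what became
contextlib.contextmanager. A sketch of the wrapper he describes, using
today's with statement; the Resource class name is invented here, and
unlike the real contextlib no exception is injected into the generator.]

```python
class Resource(object):
    """Drive a two-part generator via the __enter__/__exit__ protocol."""
    def __init__(self, gen):
        self.gen = gen
    def __enter__(self):
        return next(self.gen)  # run the entrance code up to the yield
    def __exit__(self, exc_type, exc, tb):
        try:
            next(self.gen)     # run the exit code ...
        except StopIteration:
            pass               # ... and expect the generator to finish
        return False           # never swallow exceptions

log = []

def my_resource():
    log.append("Hi!")   # Do entrance code
    yield None          # the body of the block runs here
    log.append("Bye!")  # Do exit code

with Resource(my_resource()):
    log.append("body")

print(log)  # -> ['Hi!', 'body', 'Bye!']
```

The enter/exit pair maps one-to-one onto the two halves of the generator,
which is exactly the iterator-protocol/generator parallel drawn above.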
Interestingly, with this approach, "for dummy in my_resource()" would still wrap the block of code in the entrance/exit code (because my_resource *is* a generator), but it wouldn't get the try/finally semantics. An alternative would be to replace the 'yield None' with a 'break' or 'continue', and create an object which supports the resource protocol and NOT the iterator protocol. Something like: def my_resource(): print "Hi!" # Do entrance code continue # Go on with the contents of the 'with' block print "Bye!" # Do exit code (This is currently a SyntaxError, so it isn't ambiguous in any way) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From python at rcn.com Thu Apr 21 19:01:03 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 22 19:01:12 2005 Subject: [Python-Dev] Caching objects in memory In-Reply-To: Message-ID: <000001c54693$b3bd6d80$ccb72c81@oemcomputer> [Facundo Batista] > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? 
The caches get cleaned up before Python exits, so you can find them all
listed together in the code in Python/pythonrun.c:

    /* Sundry finalizers */
    PyMethod_Fini();
    PyFrame_Fini();
    PyCFunction_Fini();
    PyTuple_Fini();
    PyList_Fini();
    PyString_Fini();
    PyInt_Fini();
    PyFloat_Fini();

#ifdef Py_USING_UNICODE
    /* Cleanup Unicode implementation */
    _PyUnicode_Fini();
#endif

Raymond Hettinger

From shane at hathawaymix.org Fri Apr 22 19:11:25 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Fri Apr 22 19:16:30 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <42692956.5070305@gmail.com>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com>
Message-ID: <4269303D.6030205@hathawaymix.org>

Nick Coghlan wrote:
> An alternative would be to replace the 'yield None' with a 'break' or
> 'continue', and create an object which supports the resource protocol
> and NOT the iterator protocol. Something like:
>
> def my_resource():
>     print "Hi!"  # Do entrance code
>     continue     # Go on with the contents of the 'with' block
>     print "Bye!" # Do exit code
>
> (This is currently a SyntaxError, so it isn't ambiguous in any way)

That's a very interesting suggestion. I've been lurking, thinking about a
way to use something like PEP 310 to help manage database transactions.
Here is some typical code that changes something under transaction
control:

    begin_transaction()
    try:
        changestuff()
        changemorestuff()
    except:
        abort_transaction()
        raise
    else:
        commit_transaction()

There's a lot of boilerplate code there.
Using your suggestion, I could write that something like this:

    def transaction():
        begin_transaction()
        try:
            continue
        except:
            abort_transaction()
            raise
        else:
            commit_transaction()

    with transaction():
        changestuff()
        changemorestuff()

Shane

From reinhold-birkenfeld-nospam at wolke7.net Fri Apr 22 19:20:46 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri Apr 22 19:23:31 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <42692956.5070305@gmail.com>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com>
Message-ID: 

Nick Coghlan wrote:
> Interestingly, with this approach, "for dummy in my_resource()" would
> still wrap the block of code in the entrance/exit code (because
> my_resource *is* a generator), but it wouldn't get the try/finally
> semantics.
>
> An alternative would be to replace the 'yield None' with a 'break' or
> 'continue', and create an object which supports the resource protocol
> and NOT the iterator protocol. Something like:
>
> def my_resource():
>     print "Hi!"  # Do entrance code
>     continue     # Go on with the contents of the 'with' block
>     print "Bye!" # Do exit code
>
> (This is currently a SyntaxError, so it isn't ambiguous in any way)

Oh, it is ambiguous, as soon as you insert a for/while statement in your
resource function and want to call continue in there.

Other than that, it's very neat. Maybe "yield" alone (which is always a
SyntaxError) could be used.

Reinhold

-- 
Mail address is perfectly valid!

From jimjjewett at gmail.com Fri Apr 22 23:36:19 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 22 23:36:22 2005
Subject: [Python-Dev] defmacro (was: Anonymous blocks)
Message-ID: 

As best I can tell, the anonymous blocks are used to take care of
boilerplate code without changing the scope -- exactly what macros are
used for. The only difference I see is that in this case, the macros are
limited to entire (possibly compound) statements.
To make this more concrete, Guido: >> in synchronized(the_lock): >> BODY Nick Coghlan: > s/in/with/ to get PEP 310. ... >Guido's recent screed crystallised the idea of writing resources > as two-part generators: ... [Adding Reinhold Birkenfeld's suggestion of a blank yield] > def my_resource(): > print "Hi!" # Do entrance code > yield # Go on with the contents of the 'with' block > print "Bye!" # Do exit code The macro itself looks reasonable -- so long as there is only ever one changing block inside the macro. I'm not sure that is a reasonable restriction, but the alternative is ugly enough that maybe passing around locals() starts to be just as good. What about a block that indicates the enclosed namespaces will collapse a level? defmacro myresource(filename): with myresource("thefile"): def reader(): ... def writer(): ... def fn(): .... Then myresource, reader, writer, and fn would share a namespace without having to manually pass it around. -jJ From martin at v.loewis.de Sat Apr 23 00:14:44 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat Apr 23 00:14:47 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <426883FF.5060009@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> Message-ID: <42697754.1000707@v.loewis.de> Brett C. wrote: > Yep, you're right. I initially thought that the parentheses meant it was a > Makefile-only variable, but it actually goes to the environment for those > unknown values. > > Before I check it in, though, should setup.py be tweaked to use it as well? I > say yes. You means sysconfig.py, right? Probably yes. This is a mess. distutils should just do what Makefile does for builtin modules, i.e. 
use CFLAGS from the Makefile. Instead, it supports CFLAGS as being
additive to the Makefile value CFLAGS, which in turn it just *knows* is
$(BASECFLAGS) $(OPT).

Regards,
Martin

From ncoghlan at gmail.com Sat Apr 23 01:48:40 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat Apr 23 01:49:01 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4269303D.6030205@hathawaymix.org>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com> <4269303D.6030205@hathawaymix.org>
Message-ID: <42698D58.4090902@gmail.com>

Shane Hathaway wrote:
> There's a lot of boilerplate code there. Using your suggestion, I could
> write that something like this:
>
> def transaction():
>     begin_transaction()
>     try:
>         continue
>     except:
>         abort_transaction()
>         raise
>     else:
>         commit_transaction()
>
> with transaction():
>     changestuff()
>     changemorestuff()

For that to work, the behaviour would need to differ slightly from what I
envisioned (which was that the 'continue' would be behaviourally
equivalent to a 'yield None'). Alternatively, something equivalent to the
above could be written as:

    def transaction():
        begin_transaction()
        continue
        ex = sys.exc_info()
        if ex[0] is not None:
            abort_transaction()
        else:
            commit_transaction()

Note that you could do this with a normal resource, too:

    class transaction(object):
        def __enter__(self):
            begin_transaction()
        def __exit__(self):
            ex = sys.exc_info()
            if ex[0] is not None:
                abort_transaction()
            else:
                commit_transaction()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From hpk at trillke.net Sat Apr 23 01:51:12 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 01:51:15 2005
Subject: [Python-Dev] PEP 310 and exceptions
Message-ID: <20050422235112.GK22996@solar.trillke.net>

Hi all,

probably unsurprisingly i am still pondering the idea of having an
optional __except__ hook on block handlers.
The PEP says this about this: An extension to the protocol to include an optional __except__ handler, which is called when an exception is raised, and which can handle or re-raise the exception, has been suggested. It is not at all clear that the semantics of this extension can be made precise and understandable. For example, should the equivalent code be try ... except ... else if an exception handler is defined, and try ... finally if not? How can this be determined at compile time, in general? In fact, i think the translation even to python code is not that tricky: x = X(): ... basically translates to: if hasattr(x, '__enter__'): x.__enter__() try: ... except: if hasattr(x, '__except__'): x.__except__(...) else: x.__exit__() else: x.__exit__() this is the original definition from the PEP with the added except clause. Handlers are free to call 'self.__exit__()' from the except clause. I don't think that anything needs to be determined at compile time. (the above can probably be optimized at the bytecode level but that is a side issue). Moreover, i think that there are more than the "transactional" use cases mentioned in the PEP. For example, a handler may want to log exceptions to some tracing utility or it may want to swallow certain exceptions when its block does IO operations that are ok to fail. cheers, holger From jcarlson at uci.edu Sat Apr 23 04:03:20 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat Apr 23 04:05:00 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <20050422190222.63D2.JCARLSON@uci.edu> hpk@trillke.net (holger krekel) wrote: > basically translates to: > > if hasattr(x, '__enter__'): > x.__enter__() > try: > ... > except: > if hasattr(x, '__except__'): x.__except__(...) > else: x.__exit__() > else: > x.__exit__() Nope... >>> def foo(): ... try: ... print 1 ... return ... except: ... print 2 ... else: ... 
print 3 ... >>> foo() 1 >>> - Josiah From aleaxit at yahoo.com Sat Apr 23 05:15:10 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Apr 23 05:15:15 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <1acac02fe434d6433ad197731c43db1b@yahoo.com> On Apr 22, 2005, at 16:51, holger krekel wrote: > Moreover, i think that there are more than the "transactional" > use cases mentioned in the PEP. For example, a handler > may want to log exceptions to some tracing utility > or it may want to swallow certain exceptions when > its block does IO operations that are ok to fail. I entirely agree! In fact, I was discussing this very issue recently with colleagues at Google, most of them well acquainted with Python but not all of them Python enthusiasts, and I was surprised to see unanimity on how PEP 310 *with* __except__ would be a huge step up in usefulness wrt the simple __enter__/__exit__ model, which is roughly equivalent in power to the C++ approach (destructors of auto variables) whose absence from Python and Java some people were bemoaning (which is how the whole discussion got started...). The use cases appear to be aleph-0 or more...;-). Essentially, think of it of encapsulating into reusable forms many common patterns of try/except use, much like iterators/generators can encapsulate looping and recursive constructs, and a new vista of uses open up... Imagine that in two or three places in your code you see something like... try: ...different blocks here... except FooError, foo: # some FooError cases need whizbang resetting before they propagate if foo.wobble > FOOBAR_RESET_THRESHOLD: whizbang.reset_all() raise With PEP 310 and __except__, this would become: with foohandler: ...whatever block.. in each and every otherwise-duplicated-logic case... now THAT is progress!!! IOW, +1 ... ! 
Alex From ncoghlan at gmail.com Sat Apr 23 05:26:06 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 23 05:26:43 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <4269C04E.5040108@gmail.com> holger krekel wrote: > Moreover, i think that there are more than the "transactional" > use cases mentioned in the PEP. For example, a handler > may want to log exceptions to some tracing utility > or it may want to swallow certain exceptions when > its block does IO operations that are ok to fail. With the current PEP 310 definition, these can be manually handled using sys.exc_info() in the __exit__ method. Cleaning up my earlier transaction handler example: class transaction(object): def __enter__(self): begin_transaction() def __exit__(self): ex = sys.exc_info() if ex[0] is not None: abort_transaction() else: commit_transaction() Alternately, PEP 310 could be defined as equivalent to: if hasattr(x, '__enter__'): x.__enter__() try: try: ... except: if hasattr(x, '__except__'): x.__except__(*sys.exc_info()) else: raise finally: x.__exit__() Then the transaction handler would look like: class transaction(object): def __enter__(self): self.aborted = False begin_transaction() def __except__(self, *exc_info): self.aborted = True abort_transaction() def __exit__(self): if not self.aborted: commit_transaction() Cheers, Nick. 
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Sat Apr 23 05:41:57 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 23 05:42:03 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C04E.5040108@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <4269C405.1050008@gmail.com> Nick Coghlan wrote: > Alternately, PEP 310 could be defined as equivalent to: > > if hasattr(x, '__enter__'): > x.__enter__() > try: > try: > ... > except: > if hasattr(x, '__except__'): > x.__except__(*sys.exc_info()) > else: > raise > finally: > x.__exit__() > In light of Alex's comments, I'd actually like to suggest the below as a potential new definition for PEP 310 (making __exit__ optional, and adding an __else__ handler): if hasattr(x, '__enter__'): x.__enter__() try: try: # Contents of 'with' block except: if hasattr(x, '__except__'): if not x.__except__(*sys.exc_info()): # [1] raise else: raise else: if hasattr(x, '__else__'): x.__else__() finally: if hasattr(x, '__exit__'): x.__exit__() [1] A possible tweak to this line would be to have it swallow the exception by default (by removing the conditional reraise). I'd prefer to make the silencing of the exception explicit, by returning 'True' from the exception handling, and have 'falling off the end' of the exception handler cause the exception to propagate. Whichever way that point goes, this definition would allow PEP 310 to handle Alex's example of factoring out standardised exception handling, as well as the original use case of resource cleanup, and the transaction handling: class transaction(object): def __enter__(self): begin_transaction() def __except__(self, *exc_info): abort_transaction() def __else__(self): commit_transaction() Cheers, Nick. 
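Nick's proposed expansion can be exercised directly by transcribing it into a helper function. Everything here is a sketch: `run_with` and the `trace` list are illustrative, and the begin/abort/commit calls are replaced by trace appends so the sequencing is visible.

```python
import sys

def run_with(x, block):
    # Direct transcription of the expansion proposed above (sketch).
    if hasattr(x, '__enter__'):
        x.__enter__()
    try:
        try:
            block()
        except:
            if hasattr(x, '__except__'):
                if not x.__except__(*sys.exc_info()):  # [1]
                    raise
            else:
                raise
        else:
            if hasattr(x, '__else__'):
                x.__else__()
    finally:
        if hasattr(x, '__exit__'):
            x.__exit__()

trace = []

class transaction(object):
    def __enter__(self):
        trace.append('begin')
    def __except__(self, *exc_info):
        trace.append('abort')
        return True  # swallow the exception, just for this demo
    def __else__(self):
        trace.append('commit')

run_with(transaction(), lambda: trace.append('work'))
run_with(transaction(), lambda: 1 / 0)
print(trace)  # ['begin', 'work', 'commit', 'begin', 'abort']
```

The success path runs `__else__`, the failing path runs `__except__`, and neither handler is needed for the other's job, which is the separation the proposal is arguing for.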
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From firemoth at gmail.com Sat Apr 23 05:42:49 2005 From: firemoth at gmail.com (Timothy Fitz) Date: Sat Apr 23 05:42:52 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <972ec5bd0504222042700b6f42@mail.gmail.com> On 4/21/05, Guido van Rossum wrote: > for dummy in synchronized(the_lock): > BODY > > or perhaps even (making "for VAR" optional in the for-loop syntax) > with > > in synchronized(the_lock): > BODY > > Then synchronized() could be written cleanly as follows: > > def synchronized(lock): > lock.acquire() > try: > yield None > finally: > lock.release() How is this different from: def synchronized(lock): def synch_fn(block): lock.acquire() try: block() finally: lock.release() return synch_fn @synchronized def foo(): BLOCK True, it's non-obvious that foo is being immediately executed, but regardless I like the way synchronized is defined, and doesn't use yield (which in my opinion is a non-obvious solution) From bac at OCF.Berkeley.EDU Sat Apr 23 06:12:42 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Apr 23 06:12:48 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42697754.1000707@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> Message-ID: <4269CB3A.40306@ocf.berkeley.edu> Martin v. L?wis wrote: > Brett C. wrote: > >>Yep, you're right. I initially thought that the parentheses meant it was a >>Makefile-only variable, but it actually goes to the environment for those >>unknown values. 
>> >>Before I check it in, though, should setup.py be tweaked to use it as well? I >>say yes. > > > You means sysconfig.py, right? No, I mean Python's setup.py; line 174. > Probably yes. > You mean Distutils' sysconfig, right? I can change that as well if you want. -Brett From ilya at bluefir.net Sat Apr 23 06:23:20 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Sat Apr 23 06:24:08 2005 Subject: [Python-Dev] a few SF bugs which can (probably) be closed Message-ID: Good morning/evening/: Here a few sourceforge bugs which can probably be closed: [ 1168983 ] : ftplib.py string index out of range Original poster reports that the problem disappeared after a patch committed by Raymond [ 1178863 ] Variable.__init__ uses self.set(), blocking specialization seems like a dup of 1178872 [ 415492 ] Compiler generates relative filenames seems to have been fixed at some point. I could not reproduce it with python2.4 [ 751612 ] smtplib crashes Windows Kernal. Seems like an obvious Windows bug (not python's bug) and seems to be unreproducible Ilya From martin at v.loewis.de Sat Apr 23 09:28:25 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat Apr 23 09:28:28 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4269CB3A.40306@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> Message-ID: <4269F919.6070901@v.loewis.de> Brett C. wrote: >>You means sysconfig.py, right? Right. > No, I mean Python's setup.py; line 174. Ah, ok. > You mean Distutils' sysconfig, right? I can change that as well if you want. Please do; otherwise, people might see strange effects. 
Regards,
Martin

From hpk at trillke.net Sat Apr 23 10:10:41 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 10:10:43 2005
Subject: [Python-Dev] PEP 310 and exceptions
In-Reply-To: <20050422190222.63D2.JCARLSON@uci.edu>
References: <20050422235112.GK22996@solar.trillke.net>
	<20050422190222.63D2.JCARLSON@uci.edu>
Message-ID: <20050423081041.GA30548@solar.trillke.net>

On Fri, Apr 22, 2005 at 19:03 -0700, Josiah Carlson wrote:
> hpk@trillke.net (holger krekel) wrote:
> > basically translates to:
> >
> > if hasattr(x, '__enter__'):
> >     x.__enter__()
> > try:
> >     ...
> > except:
> >     if hasattr(x, '__except__'): x.__except__(...)
> >     else: x.__exit__()
> > else:
> >     x.__exit__()
>
> Nope...
>
> >>> def foo():
> ...     try:
> ...         print 1
> ...         return
> ...     except:
> ...         print 2
> ...     else:
> ...         print 3
> ...
> >>> foo()
> 1
> >>>

doh! of course, you are right. So it indeed better translates to a nested
try-finally/try-except when transformed to python code. Nick Coghlan
points at the correct ideas below in this thread.

At the time i was implementing things by modifying ceval.c rather than by
just a compiling addition, i have to admit.
cheers, holger From aahz at pythoncraft.com Sat Apr 23 15:50:02 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Apr 23 15:50:17 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C405.1050008@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> Message-ID: <20050423135002.GA17909@panix.com> On Sat, Apr 23, 2005, Nick Coghlan wrote: > > In light of Alex's comments, I'd actually like to suggest the below as a > potential new definition for PEP 310 (making __exit__ optional, and adding > an __else__ handler): > > if hasattr(x, '__enter__'): > x.__enter__() > try: > try: > # Contents of 'with' block > except: > if hasattr(x, '__except__'): > if not x.__except__(*sys.exc_info()): # [1] > raise > else: > raise > else: > if hasattr(x, '__else__'): > x.__else__() > finally: > if hasattr(x, '__exit__'): > x.__exit__() +1, but prior to reading this post I was thinking along similar lines with your __exit__ named __finally__ and your __else__ named __exit__. My reasoning for that is that most of the time, people want their exit condition aborted if an exception is raised; having the "normal" exit routine called __else__ would be confusing except to people who do lots of exception handling. (I'm a bit sensitive to that right now; this week I wasted an hour because I didn't understand exceptions as well as I thought I did, although it was related more to the precise mechanics of raising and catching exceptions. Perhaps I'll submit a doc bug; I didn't find this explained in _Learning Python_ or Nutshell...) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
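The generator formulation this thread keeps circling eventually shipped in Python 2.5 as `contextlib.contextmanager` (PEP 343). A sketch of the transaction pattern in that style, with appends to an `events` list standing in for a real transaction API:

```python
from contextlib import contextmanager

events = []

@contextmanager
def transaction():
    # begin/commit/abort are stand-ins for a real transaction API
    events.append("begin")
    try:
        yield  # the body of the 'with' block runs here
    except Exception:
        events.append("abort")
        raise
    else:
        events.append("commit")

with transaction():
    events.append("work")

try:
    with transaction():
        raise KeyError("oops")
except KeyError:
    pass

print(events)  # ['begin', 'work', 'commit', 'begin', 'abort']
```

Exceptions raised in the block are thrown into the generator at the `yield`, so the ordinary try/except/else inside the generator plays the role that `__except__` and the "normal exit" handler play in the proposals above.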
From hpk at trillke.net Sat Apr 23 18:06:49 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 18:06:52 2005
Subject: __except__ use cases (was: Re: [Python-Dev] PEP 310 and exceptions)
In-Reply-To: <4269C405.1050008@gmail.com>
References: <20050422235112.GK22996@solar.trillke.net>
	<4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com>
Message-ID: <20050423160649.GC30548@solar.trillke.net>

On Sat, Apr 23, 2005 at 13:41 +1000, Nick Coghlan wrote:
> Nick Coghlan wrote:
> In light of Alex's comments, I'd actually like to suggest the below as a
> potential new definition for PEP 310 (making __exit__ optional, and adding
> an __else__ handler):
>
>     if hasattr(x, '__enter__'):
>         x.__enter__()
>     try:
>         try:
>             # Contents of 'with' block
>         except:
>             if hasattr(x, '__except__'):
>                 if not x.__except__(*sys.exc_info()): # [1]
>                     raise

On a side note, I don't see too much point in having __except__ return
something when it is otherwise easy to say:

    def __except__(self, typ, val, tb):
        self.abort_transaction()
        raise typ, val, tb

But actually i'd like to mention some other than transaction-use cases
for __except__, for example with

    class MyObject:
        def __except__(self, typ, val, tb):
            if isinstance(val, KeyboardInterrupt):
                raise
            # process exception and swallow it

you can use it like:

    x = MyObject():
        # do some long running stuff

and MyObject() can handle internal problems appropriately and present
clean Exceptions to the outside without changing the "calling side".

With my implementation i also played with little things like:

    def __getattr__(self, name):
        Key2AttributeError:
            return self._cache[key]
        ...

with an obvious __except__() implementation for Key2AttributeError.

Similar to what Alex points out, i generally think that being able to
define API/object specific exception handling in *one* place is a great
thing.

I am willing to help with the PEP and implementation, btw.
cheers, holger From pje at telecommunity.com Sat Apr 23 19:50:14 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Apr 23 19:46:06 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C405.1050008@gmail.com> References: <4269C04E.5040108@gmail.com> <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <5.1.1.6.0.20050423134807.03ab79d0@mail.telecommunity.com> At 01:41 PM 4/23/05 +1000, Nick Coghlan wrote: >Whichever way that point goes, this definition would allow PEP 310 to >handle Alex's example of factoring out standardised exception handling, as >well as the original use case of resource cleanup, and the transaction >handling: > >class transaction(object): > def __enter__(self): > begin_transaction() > > def __except__(self, *exc_info): > abort_transaction() > > def __else__(self): > commit_transaction() I'd like to suggest '__success__' in place of '__else__' and '__before__'/'__after__' instead of '__enter__'/'__exit__', if you do take this approach, so that what they do is a bit more obvious. From bh at intevation.de Sat Apr 23 19:59:29 2005 From: bh at intevation.de (Bernhard Herzog) Date: Sat Apr 23 19:59:53 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C04E.5040108@gmail.com> (Nick Coghlan's message of "Sat, 23 Apr 2005 13:26:06 +1000") References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: Nick Coghlan writes: > holger krekel wrote: >> Moreover, i think that there are more than the "transactional" >> use cases mentioned in the PEP. For example, a handler may want to >> log exceptions to some tracing utility or it may want to swallow >> certain exceptions when >> its block does IO operations that are ok to fail. > > With the current PEP 310 definition, these can be manually handled using > sys.exc_info() in the __exit__ method. With the proposed implementation of PEP 310 rev. 1.5 it wouldn't work. 
sys.exc_info returns a tuple of Nones unless an except: clause has been entered. Either sys.exc_info() would have to be changed to always return exception information after an exception has been raised or the implementation would have to be changed to do the equivalent of e.g. if hasattr(var, "__enter__"): var.__enter__() try: try: suite except: pass finally: var.__exit__() An empty except: suite suffices. In C that's equivalent to a call to PyErr_NormalizeException AFAICT. Bernhard -- Intevation GmbH http://intevation.de/ Skencil http://skencil.org/ Thuban http://thuban.intevation.org/ From ncoghlan at gmail.com Sun Apr 24 03:22:17 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 03:22:22 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <426AF4C9.6000008@gmail.com> Bernhard Herzog wrote: > With the proposed implementation of PEP 310 rev. 1.5 it wouldn't work. > sys.exc_info returns a tuple of Nones unless an except: clause has been > entered. Either sys.exc_info() would have to be changed to always > return exception information after an exception has been raised or the > implementation would have to be changed to do the equivalent of e.g. Interesting. Although the 'null' except block should probably be a bare 'raise', rather than a 'pass': Py> try: ... try: ... raise TypeError("I'm an error!") ... except: ... raise ... finally: ... print sys.exc_info() ... (, , ) Traceback (most recent call last): File "", line 3, in ? TypeError: I'm an error! All the more reason to consider switching to a nested try/finally + try/except/else definition for 'with' blocks, I guess. Cheers, Nick. 
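Bernhard's observation still holds in modern Python: `sys.exc_info()` only reports an exception while an except block is actually handling it, and returns a tuple of Nones otherwise. A small sketch:

```python
import sys

def exc_type_while_handling():
    try:
        raise TypeError("I'm an error!")
    except TypeError:
        # Inside the except block, the exception is "being handled"
        # and exc_info() reports it.
        return sys.exc_info()[0]

print(exc_type_while_handling())  # <class 'TypeError'>

# Once handling is over, exc_info() reverts to a tuple of Nones.
outside = sys.exc_info()
print(outside)
```

This is why an expansion that wants exception details in its exit hook has to either enter an except clause first (even a bare re-raise) or pass the exception information along explicitly.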
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Sun Apr 24 03:58:45 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 03:58:50 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050423135002.GA17909@panix.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> Message-ID: <426AFD55.4020804@gmail.com> Aahz wrote: > On Sat, Apr 23, 2005, Nick Coghlan wrote: > >>In light of Alex's comments, I'd actually like to suggest the below as a >>potential new definition for PEP 310 (making __exit__ optional, and adding >>an __else__ handler): >> >> if hasattr(x, '__enter__'): >> x.__enter__() >> try: >> try: >> # Contents of 'with' block >> except: >> if hasattr(x, '__except__'): >> if not x.__except__(*sys.exc_info()): # [1] >> raise >> else: >> raise >> else: >> if hasattr(x, '__else__'): >> x.__else__() >> finally: >> if hasattr(x, '__exit__'): >> x.__exit__() > > > +1, but prior to reading this post I was thinking along similar lines > with your __exit__ named __finally__ and your __else__ named __exit__. > My reasoning for that is that most of the time, people want their exit > condition aborted if an exception is raised; having the "normal" exit > routine called __else__ would be confusing except to people who do lots > of exception handling. In the original motivating use cases (file handles, synchronisation objects), the resource release is desired unconditionally. The aim is to achieve something similar to C++ scope-delimited objects (which release their resources unconditionally as the scope is exited). This parallel is also probably the source of the names of the two basic functions ('enter'ing the contained block, 'exit'ing the contained block). 
So, I think try/finally is the right semantics for the basic
__enter__/__exit__ use case (consider that PEP 310 is seen as possibly
worthwhile with *only* these semantics!).

For error logging type use cases, only the exception handling is
required.

The issue of a 'no exception raised' handler only comes up for cases like
transactions, where the commit operation is conditional on no exception
being triggered. I understand you agree that, for those cases, the best
spot to call the handler is an else clause on the inner try/except block.
That way, it is skipped by default if an exception goes off, but the
exception handling method can still invoke the method directly if desired
(e.g. an exception is determined to be 'harmless').

However, I do agree with you that the use of '__else__' as a name is
exposing too much of the underlying implementation (i.e. you need to
understand the implementation for the name to make sense). I think
renaming '__exit__' to '__finally__' would be a similar error, though.

Which means finding a different name for '__else__'. Two possibilities
that occur to me are '__ok__' or '__no_except__'. The latter makes a fair
amount of sense, since I can't think of a way to refer to the thing other
than as a 'no exception' handler.

Cheers,
Nick.

P.S.
I'm ignoring my housemate's suggestion of '__accept__' for the
no-exception handler :)

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From ncoghlan at gmail.com Sun Apr 24 04:40:04 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun Apr 24 04:40:11 2005
Subject: [Python-Dev] Re: __except__ use cases
In-Reply-To: <20050423160649.GC30548@solar.trillke.net>
References: <20050422235112.GK22996@solar.trillke.net>
	<4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com>
	<20050423160649.GC30548@solar.trillke.net>
Message-ID: <426B0704.9050901@gmail.com>

holger krekel wrote:
> On a side note, I don't see too much point in having __except__
> return something when it is otherwise easy to say:
>
> def __except__(self, typ, val, tb):
>     self.abort_transaction()
>     raise typ, val, tb

It has to do with "Errors should never pass silently, unless explicitly
silenced". Consider:

    def __except__(self, typ, val, tb):
        self.abort_transaction()

With __except__ returning a value, the implicit 'return None' means that
the exception is propagated by default. Without the 'suppress exception'
boolean return value, this naive handler would not only abort the
transaction, but swallow each and every exception that occurred inside
the 'with' block. Another common error with a manual reraise would
involve not including the traceback properly, leading to difficulties
with debugging.

IOW, returning a value from __except__ should make the exception handlers
cleaner, and easier to 'do right' (since reraising simply means returning
a value that evaluates to False, or falling off the end of the function).
Suppressing the exception would require actively adding 'return True' to
the end of the handler.
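The return-value design Nick argues for here is the one PEP 343 ultimately adopted: `__exit__` (which absorbed the role of `__except__`) receives the exception info and suppresses the exception only by explicitly returning a true value. A sketch in modern Python:

```python
class SwallowKeyError:
    # Sketch of the "errors pass only when explicitly silenced" design:
    # a true return value from the exit hook silences the exception,
    # while the implicit None return propagates it.
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        return exc_type is not None and issubclass(exc_type, KeyError)

with SwallowKeyError():
    {}["missing"]      # KeyError raised here is silenced
print("reached")       # prints: reached
```

Any other exception type makes `__exit__` return False and therefore propagates, so a handler that forgets to return anything fails safe, which was exactly the point of the boolean protocol.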
> But actually i'd like to mention some other than
> transaction-use cases for __except__, for example with
>
>     class MyObject:
>         def __except__(self, typ, val, tb):
>             if isinstance(val, KeyboardInterrupt):
>                 raise
>             # process exception and swallow it

s/raise/return True/ for the return value version.

>     def __getattr__(self, name):
>         Key2AttributeError:
>             return self._cache[key]
>     ...
>
> with an obvious __except__() implementation for
> Key2AttributeError.

Seeing this example has convinced me of something. PEP 310 should use the 'with' keyword, and 'expression block' syntax should be used to denote the 'default object' semantics proposed for Python 3K. For example:

    class Key2AttributeError(object):
        def __init__(self, obj, attr):
            self:
                .obj_type = type(obj)
                .attr = attr
        def __except__(self, ex_type, ex_val, ex_tb):
            if isinstance(ex_type, KeyError):
                self:
                    raise AttributeError("%s instance has no attribute %s"
                                         % (.obj_type, .attr))

    # Somewhere else. . .
    def __getattr__(self, name):
        with Key2AttributeError(self, key):
            self:
                return ._cache[key]

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From shane at hathawaymix.org Sun Apr 24 06:07:37 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Sun Apr 24 06:07:40 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426AFD55.4020804@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> <426AFD55.4020804@gmail.com> Message-ID: <426B1B89.6040700@hathawaymix.org> Nick Coghlan wrote:
> Which means finding a different name for '__else__'. Two possibilities
> that occur to me are '__ok__' or '__no_except__'. The latter makes a
> fair amount of sense, since I can't think of a way to refer to the thing
> other than as a 'no exception' handler.
While we're on the subject of block handler method names, do the method names need four underscores? 'enter' and 'exit' look better than '__enter__' and '__exit__'. Shane From ncoghlan at gmail.com Sun Apr 24 08:42:18 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 08:48:19 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426B1B89.6040700@hathawaymix.org> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> <426AFD55.4020804@gmail.com> <426B1B89.6040700@hathawaymix.org> Message-ID: <426B3FCA.2040605@gmail.com> Shane Hathaway wrote: > Nick Coghlan wrote: > >> Which means finding a different name for '__else__'. Two possibilities that >> occur to me are '__ok__' or '__no_except__'. The latter makes a fair >> amount of sense, since I can't think of a way to refer to the thing other >> than as a 'no exception' handler. > > > While we're on the subject of block handler method names, do the method names > need four underscores? 'enter' and 'exit' look better than '__enter__' and > '__exit__'. It's traditional for slots (or pseudo-slots) to have magic method names. It implies that the methods are expected to be called implicitly via special syntax or builtin functions, rather than explicitly in a normal method call. The only exception I can think of is the 'next' method of the iterator protocol. That method is often called explicitly, so the exception makes sense. For resources, there doesn't seem to be any real reason to call the methods directly - the calls will generally be hidden behind the 'with' block syntax. Hence, magic methods. Cheers, Nick. 
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From arigo at tunes.org Sun Apr 24 15:10:09 2005 From: arigo at tunes.org (Armin Rigo) Date: Sun Apr 24 15:11:45 2005 Subject: [Python-Dev] Error checking in init functions In-Reply-To: References: Message-ID: <20050424131009.GB11964@vicky.ecs.soton.ac.uk> Hi Thomas, On Fri, Apr 22, 2005 at 04:57:26PM +0200, Thomas Heller wrote:

> PyMODINIT_FUNC
> PyInit_zlib(void)
> {
>     m = Py_InitModule4("zlib", zlib_methods,
>                        zlib_module_documentation,
>                        (PyObject*)NULL, PYTHON_API_VERSION);

I've seen a lot of code like this where laziness is actually bugginess. If the Py_InitModule4() fails, you get a NULL in m, and that results in a segfault in most of the cases. Armin From jjl at pobox.com Sun Apr 24 19:12:03 2005 From: jjl at pobox.com (John J Lee) Date: Sun Apr 24 19:11:25 2005 Subject: [Python-Dev] Re: __except__ use cases In-Reply-To: <426B0704.9050901@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423160649.GC30548@solar.trillke.net> <426B0704.9050901@gmail.com> Message-ID: On Sun, 24 Apr 2005, Nick Coghlan wrote: [...]

> Seeing this example has convinced me of something. PEP 310 should use the 'with'
> keyword, and 'expression block' syntax should be used to denote the 'default
> object' semantics proposed for Python 3K. For example:
>
>     class Key2AttributeError(object):
>         def __init__(self, obj, attr):
>             self:
>                 .obj_type = type(obj)
>                 .attr = attr
>         def __except__(self, ex_type, ex_val, ex_tb):
>             if isinstance(ex_type, KeyError):
>                 self:
>                     raise AttributeError("%s instance has no attribute %s"
>                                          % (.obj_type, .attr))
>
>     # Somewhere else. . .
>     def __getattr__(self, name):
>         with Key2AttributeError(self, key):
>             self:
>                 return ._cache[key]

[...] +1 Purely based on my aesthetic reaction, that is.
Never having used other languages with this 'attribute lookup shorthand' feature, that seems to align *much* more with what I expect than the other way around. If 'with' is used in other languages as the keyword for attribute lookup shorthand, though, perhaps it will confuse other people, or at least make them frown :-( John From tdickenson at devmail.geminidataloggers.co.uk Sun Apr 24 19:31:56 2005 From: tdickenson at devmail.geminidataloggers.co.uk (Toby Dickenson) Date: Sun Apr 24 19:31:58 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426B3FCA.2040605@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <426B1B89.6040700@hathawaymix.org> <426B3FCA.2040605@gmail.com> Message-ID: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
> Shane Hathaway wrote:
> > While we're on the subject of block handler method names, do the method
> > names need four underscores? 'enter' and 'exit' look better than
> > '__enter__' and '__exit__'.

I quite like .acquire() and .release(). There are plenty of classes (and not just in the threading module) which already have methods with those names that could be controlled by a 'with'. Those names also make the most sense in the C++ 'resource acquisition' model. -- Toby Dickenson From jcarlson at uci.edu Sun Apr 24 20:05:22 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun Apr 24 20:06:37 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> References: <426B3FCA.2040605@gmail.com> <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: <20050424110041.63E4.JCARLSON@uci.edu> Toby Dickenson wrote:
> > On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
> > Shane Hathaway wrote:
> > > > While we're on the subject of block handler method names, do the method
> > > names need four underscores?
'enter' and 'exit' look better than
> > > '__enter__' and '__exit__'.
>
> I quite like .acquire() and .release().
>
> There are plenty of classes (and not just in the threading module) which
> already have methods with those names that could be controlled by a 'with'.
>
> Those names also make the most sense in the C++ 'resource acquisition' model.

Perhaps, but names for the equivalent of "acquire resource" and "release resource" are not consistent across modules. Also, re-read Nick Coghlan's email with message id <426B3FCA.2040605@gmail.com>. - Josiah From bac at OCF.Berkeley.EDU Mon Apr 25 00:31:42 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Apr 25 00:31:46 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4269F919.6070901@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> <4269F919.6070901@v.loewis.de> Message-ID: <426C1E4E.4060809@ocf.berkeley.edu> OK, EXTRA_CFLAGS support has been checked into Makefile.pre.in and distutils.sysconfig. Martin, please double-check I tweaked sysconfig the way you wanted. I also wasn't sure of compatibility for Distutils (first time touching it); checked PEP 291 but Distutils wasn't listed. I went ahead and used a genexp; hope that is okay. I also did it through Lib/distutils instead of the separate distutils top directory in CVS. I didn't bother with touching setup.py because I realized that sysconfig should take care of that. If that is wrong let me know and I can check in a change (and if I am right that line dealing with OPT in setup.py could probably go). Here are the revisions.
    Checking in Makefile.pre.in;
    /cvsroot/python/python/dist/src/Makefile.pre.in,v  <--  Makefile.pre.in
    new revision: 1.152; previous revision: 1.151
    done
    Checking in README;
    /cvsroot/python/python/dist/src/README,v  <--  README
    new revision: 1.188; previous revision: 1.187
    done
    Checking in Lib/distutils/sysconfig.py;
    /cvsroot/python/python/dist/src/Lib/distutils/sysconfig.py,v  <--  sysconfig.py
    new revision: 1.64; previous revision: 1.63
    done
    Checking in Misc/SpecialBuilds.txt;
    /cvsroot/python/python/dist/src/Misc/SpecialBuilds.txt,v  <--  SpecialBuilds.txt
    new revision: 1.20; previous revision: 1.19
    done
    Checking in Misc/NEWS;
    /cvsroot/python/python/dist/src/Misc/NEWS,v  <--  NEWS
    new revision: 1.1288; previous revision: 1.1287
    done

-Brett From hpk at trillke.net Mon Apr 25 00:34:20 2005 From: hpk at trillke.net (holger krekel) Date: Mon Apr 25 00:34:23 2005 Subject: [Python-Dev] Re: __except__ use cases In-Reply-To: <426B0704.9050901@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423160649.GC30548@solar.trillke.net> <426B0704.9050901@gmail.com> Message-ID: <20050424223420.GD30548@solar.trillke.net> Hi Nick, On Sun, Apr 24, 2005 at 12:40 +1000, Nick Coghlan wrote:
> Seeing this example has convinced me of something. PEP 310 should use the
> 'with' keyword, and 'expression block' syntax should be used to denote the
> 'default object' semantics proposed for Python 3K. For example:

While that may be true, i don't care too much about the syntax yet but more about the idea and semantics of an __except__ hook. I simply followed the syntax that Guido currently seems to prefer.
holger From ncoghlan at gmail.com Mon Apr 25 01:37:59 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon Apr 25 01:38:05 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> References: <20050422235112.GK22996@solar.trillke.net> <426B1B89.6040700@hathawaymix.org> <426B3FCA.2040605@gmail.com> <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: <426C2DD7.7070507@gmail.com> Toby Dickenson wrote:
> On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
>>Shane Hathaway wrote:
>>>While we're on the subject of block handler method names, do the method
>>>names need four underscores? 'enter' and 'exit' look better than
>>>'__enter__' and '__exit__'.
>
> I quite like .acquire() and .release().
>
> There are plenty of classes (and not just in the threading module) which
> already have methods with those names that could be controlled by a 'with'.
>
> Those names also make the most sense in the C++ 'resource acquisition' model.

Such existing pairings can be easily handled with a utility class like the one below. Besides, this part of the naming was considered for the original development of PEP 310 - entering and exiting the block is common to _all_ uses of the syntax, whereas other names are more specific to particular use cases.

    class resource(object):
        def __init__(self, enter, exit):
            self.enter = enter
            self.exit = exit
        def __enter__(self):
            self.enter()
        def __exit__(self):
            self.exit()

    with resource(my_resource.acquire, my_resource.release):
        # Do stuff!

Cheers, Nick.
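[For what it's worth, the adapter above works as-is with today's threading.Lock. Since the 'with' statement does not exist yet, the PEP 310-style methods are driven by hand below, purely for illustration.]

```python
import threading

class resource(object):
    def __init__(self, enter, exit):
        self.enter = enter
        self.exit = exit
    def __enter__(self):
        self.enter()
    def __exit__(self):
        self.exit()

# Drive the protocol manually, as a 'with' statement eventually would.
lock = threading.Lock()
r = resource(lock.acquire, lock.release)
r.__enter__()
assert lock.locked()
r.__exit__()
assert not lock.locked()
```

The same adapter would cover any acquire/release or open/close pairing without renaming the underlying methods.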
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From gvanrossum at gmail.com Mon Apr 25 01:57:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 01:57:18 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: After reading a lot of contributions (though perhaps not all -- this thread seems to bifurcate every time someone has a new idea :-) I'm back to liking yield for the PEP 310 use case. I think maybe it was Doug Landauer's post mentioning Beta, plus scanning some more examples of using yield in Ruby. Jim Jewett's post on defmacro also helped, as did Nick Coghlan's post explaining why he prefers 'with' for PEP 310 and a bare expression for the 'with' feature from Pascal (and other languages :-).

It seems that the same argument that explains why generators are so good for defining iterators, also applies to the PEP 310 use case: it's just much more natural to write

    def with_file(filename):
        f = open(filename)
        try:
            yield f
        finally:
            f.close()

than having to write a class with __entry__ and __exit__ and __except__ methods (I've lost track of the exact proposal at this point). At the same time, having to use it as follows:

    for f in with_file(filename):
        for line in f:
            print process(line)

is really ugly, so we need new syntax, which also helps with keeping 'for' semantically backwards compatible. So let's use 'with', and then the using code becomes again this:

    with f = with_file(filename):
        for line in f:
            print process(line)

Now let me propose a strawman for the translation of the latter into existing semantics.
Let's take the generic case:

    with VAR = EXPR:
        BODY

This would translate to the following code:

    it = EXPR
    err = None
    while True:
        try:
            if err is None:
                VAR = it.next()
            else:
                VAR = it.next_ex(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            if not hasattr(it, "next_ex"):
                raise

(The variables 'it' and 'err' are not user-visible variables, they are internal to the translation.) This looks slightly awkward because of backward compatibility; what I really want is just this:

    it = EXPR
    err = None
    while True:
        try:
            VAR = it.next(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            pass

but for backwards compatibility with the existing argument-less next() API I'm introducing a new iterator API next_ex() which takes an exception argument. If that argument is None, it should behave just like next(). Otherwise, if the iterator is a generator, this will raise that exception in the generator's frame (at the point of the suspended yield). If the iterator is something else, the something else is free to do whatever it likes; if it doesn't want to do anything, it can just re-raise the exception.

Also note that, unlike the for-loop translation, this does *not* invoke iter() on the result of EXPR; that's debatable but given that the most common use case should not be an alternate looping syntax (even though it *is* technically a loop) but a more general "macro statement expansion", I think we can expect EXPR to produce a value that is already an iterator (rather than merely an iterable).

Finally, I think it would be cool if the generator could trap occurrences of break, continue and return occurring in BODY.
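[The strawman expansion can be simulated with today's generators. In this sketch, gen.throw() stands in for the hypothetical next_ex(), and run_with/with_list are illustrative names only, not proposed APIs.]

```python
def run_with(it, body):
    """Mimic the proposed expansion of 'with VAR = EXPR: BODY'."""
    err = None
    while True:
        try:
            if err is None:
                var = next(it)
            else:
                var = it.throw(err)  # stand-in for it.next_ex(err)
        except StopIteration:
            break
        try:
            err = None
            body(var)
        except Exception as e:
            err = e

def with_list(items, log):
    # Plays the role of with_file: set up, yield once, clean up.
    log.append('enter')
    try:
        yield items
    finally:
        log.append('exit')

log = []
run_with(with_list([1, 2, 3], log), lambda seq: log.append(sum(seq)))
assert log == ['enter', 6, 'exit']
```

If the body raises, the exception is thrown back into the generator, whose finally clause still runs, matching the cleanup guarantee of the expansion.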
We could introduce a new class of exceptions for these, named ControlFlow, and (only in the body of a with statement), break would raise BreakFlow, continue would raise ContinueFlow, and return EXPR would raise ReturnFlow(EXPR) (EXPR defaulting to None of course). So a block could return a value to the generator using a return statement; the generator can catch this by catching ReturnFlow. (Syntactic sugar could be "VAR = yield ..." like in Ruby.) With a little extra magic we could also get the behavior that if the generator doesn't handle ControlFlow exceptions but re-raises them, they would affect the code containing the with statement; this means that the generator can decide whether return, break and continue are handled locally or passed through to the containing block. Note that EXPR doesn't have to return a generator; it could be any object that implements next() and next_ex(). (We could also require next_ex() or even next() with an argument; perhaps this is better.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tcdelaney at optusnet.com.au Mon Apr 25 02:37:44 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Mon Apr 25 02:37:46 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: <004201c5492e$ff05ca60$f100a8c0@ryoko> Guido van Rossum wrote: > but for backwards compatibility with the existing argument-less next() > API I'm introducing a new iterator API next_ex() which takes an > exception argument. If that argument is None, it should behave just > like next(). Otherwise, if the iterator is a generator, this will Might this be a good time to introduce __next__ (having the same signature and semantics as your proposed next_ex) and builtin next(obj, exception=None)? 
    def next(obj, exception=None):
        if hasattr(obj, '__next__'):
            return obj.__next__(exception)
        if exception is not None:
            # Will raise an appropriate exception
            return obj.next(exception)
        return obj.next()

Tim Delaney From bob at redivi.com Mon Apr 25 04:16:28 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 04:16:33 2005 Subject: [Python-Dev] site enhancements (request for review) Message-ID: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> A few weeks ago I put together a patch to site.py for Python 2.5 that solves three major deficiencies:

(1) All site dirs must exist on the filesystem: Since PEP 302 (New Import Hooks) was adopted, this is not necessarily true. sys.meta_path and sys.path_hooks can have valid uses for non-existent paths. Even the standard zipimport hook supports in-zip-file paths (i.e. foo.zip/bar).

(2) The directories added to sys.path by .pth files are not scanned for further .pth files. If they were, you could make life much easier on developers and users of multi-user systems. For example, it would be possible for an administrator to drop in a .pth file into the system-wide site-packages to allow users to have their own local site-packages folder. Currently, you could try this, but it wouldn't work because many packages such as PIL, Numeric, and PyObjC take advantage of .pth files during their installation.

(3) To support the above use case, .pth files should be allowed to use os.path.expanduser(), so you can toss a tilde in front and do the right thing. Currently, the only way to support (2) is to use an ugly "import" pth hook.

So far, it seems that only JvR has reviewed the patch, and recommends applying it. I'd like to apply it, but it should probably have a bit more review first. If no negative comments show up for a week or two, I'll assume that people like it or don't care, and apply. -bob From bac at OCF.Berkeley.EDU Mon Apr 25 04:23:59 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Apr 25 04:24:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <426C54BF.2010906@ocf.berkeley.edu> Guido van Rossum wrote: [SNIP]
> Now let me propose a strawman for the translation of the latter into
> existing semantics. Let's take the generic case:
>
>     with VAR = EXPR:
>         BODY
>
> This would translate to the following code: [SNIP]
>
>     it = EXPR
>     err = None
>     while True:
>         try:
>             VAR = it.next(err)
>         except StopIteration:
>             break
>         try:
>             err = None
>             BODY
>         except Exception, err:  # Pretend "except Exception:" == "except:"
>             pass
>
> but for backwards compatibility with the existing argument-less next()
> API I'm introducing a new iterator API next_ex() which takes an
> exception argument.

Can I suggest the name next_exc() instead? Everything in the sys module uses "exc" as the abbreviation for "exception". I realize you might be suggesting using the "ex" as the suffix because of the use of that as the suffix in the C API for an extended API, but that usage is not prominent in the stdlib. Also, would this change in Python 3000 so that both next_ex() and next() are merged into a single method?

As for an opinion of the need of 'with', I am on the fence, leaning towards liking it. To make sure I am understanding the use case, it is to help encapsulate typical resource management with proper cleanup in another function instead of having to constantly paste boilerplate into your code, right? So the hope is to be able to create factory functions, typically implemented as a generator, that encapsulate the obtaining, temporary lending out, and cleanup of a resource? Is there some other use that I am totally missing that is obvious?

> If that argument is None, it should behave just
> like next(). Otherwise, if the iterator is a generator, this will
> raise that exception in the generator's frame (at the point of the
> suspended yield).
If the iterator is something else, the something
> else is free to do whatever it likes; if it doesn't want to do
> anything, it can just re-raise the exception.
>
> Also note that, unlike the for-loop translation, this does *not*
> invoke iter() on the result of EXPR; that's debatable but given that
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion", I think we can expect EXPR to produce a value
> that is already an iterator (rather than merely an iterable).
>
> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).
>
> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)
>
> With a little extra magic we could also get the behavior that if the
> generator doesn't handle ControlFlow exceptions but re-raises them,
> they would affect the code containing the with statement; this means
> that the generator can decide whether return, break and continue are
> handled locally or passed through to the containing block.

Honestly, I am not very comfortable with this magical meaning of 'break', 'continue', and 'return' in a 'with' block. I realize 'return' already has special meaning in a generator, but I don't think that is really needed either. It leads to this odd dichotomy where a non-exception-related statement directly triggers an exception in other code.
It seems like code doing something behind my back; "remember, it looks like a 'continue', but it really is a method call with a specific exception instance. Surprise!" Personally, what I would rather see is to have next_ex(), for a generator, check if the argument is a subclass of Exception. If it is, raise it as such. If not, have the 'yield' statement return the passed-in argument. This use of it would make sense for using the next_ex() name. Then again I guess having exceptions triggering a method call instead of hitting an 'except' statement is already kind of "surprise" semantics anyway. =) Still, I would like to minimize the surprises that we could spring. And before anyone decries the fact that this might confuse a newbie (which seems to happen with every advanced feature ever dreamed up), remember this will not be meant for a newbie but for someone who has experience in Python and iterators at the minimum, and hopefully with generators. Not exactly meant for someone for whom raw_input() still holds a "wow" factor. =)

> Note that EXPR doesn't have to return a generator; it could be any
> object that implements next() and next_ex(). (We could also require
> next_ex() or even next() with an argument; perhaps this is better.)

Yes, that requirement would be good. Will make sure people don't try to use an iterator with the 'with' statement that has not been designed properly for use within the 'with'. And the precedent of requiring an API is set by 'for' since it needs to be an iterable or define __getitem__() as it is. -Brett From bob at redivi.com Mon Apr 25 04:32:35 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 04:32:41 2005 Subject: [Python-Dev] zipfile still has 2GB boundary bug Message-ID: The "2GB bug" that was supposed to be fixed in was not actually fixed.
The zipinfo offsets in the structures are still signed longs, so the fix allows you to write one file that extends past the 2G boundary, but if any extend past that point you are screwed. I have opened a new bug and patch that should fix this issue . This is a backport candidate to 2.4.2 and 2.3.6 (if that ever happens). On a related note, if anyone else has a bunch of really big and ostensibly broken zip archives created by dumb versions of the zipfile module, I have written a script that can rebuild the central directory in-place. Ping me off-list if you're interested and I'll clean it up. Someone should think about rewriting the zipfile module to be less hideous, include a repair feature, and be up to date with the latest specifications . Additionally, it'd also be useful if someone were to include support for Apple's "extensions" to the zip format (the __MACOSX folder and its contents) that show up when BOM (private framework) is used to create archives (i.e. Finder in Mac OS X 10.3+). I'm not sure if these are documented anywhere, but I can help with reverse engineering if someone is interested in writing the code. On that note, Mac OS X 10.4 (Tiger) is supposed to have new APIs (or changes to existing APIs?) to facilitate resource fork preservation, ACLs, and Spotlight hooks in tar, cp, mv, etc. Someone should spend some time looking at the Darwin 8 sources for these tools (when they're publicly available in the next few weeks) to see what would need to be done in Python to support them in the standard library (the os, tarfile, etc. modules). 
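[As a rough diagnostic for the boundary Bob describes (not the fix in the patch), the sketch below flags members of an existing archive whose recorded header offset would not fit in a signed 32-bit field; oversized_offsets is an illustrative helper name.]

```python
import zipfile

SIGNED_32_MAX = 2**31 - 1  # largest offset a signed 32-bit long can hold

def oversized_offsets(path_or_file):
    """Names of members whose header offset exceeds a signed 32-bit int."""
    zf = zipfile.ZipFile(path_or_file)
    try:
        return [info.filename for info in zf.infolist()
                if info.header_offset > SIGNED_32_MAX]
    finally:
        zf.close()
```

Any archive under 2 GB trivially returns an empty list; the interesting case is an archive whose later members start past the boundary.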
-bob From steven.bethard at gmail.com Mon Apr 25 05:12:47 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon Apr 25 05:12:49 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: Guido van Rossum wrote: [snip illustration of how generators (and other iterators) can be modified to be used in with-blocks]
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion"

I'm sure I could get used to it, but my intuition for

    with f = with_file(filename):
        for line in f:
            print process(line)

is that the f = with_file(filename) executes only once. That is, as you said, I don't expect this to be a looping syntax. Of course, as long as the generators (or other objects) here yield only one value (like with_file does), then the with-block will execute only once. But because the implementation lets you make the with-block loop if you want, it makes me nervous... I guess it would be helpful to see an example where the looping with-block is useful. So far, I think all the examples I've seen have been like with_file, which only executes the block once. Of course, the loop allows you to do anything that you would normally do in a for-loop, but my feeling is that this is probably better done by composing a with-block that executes the block only once with a normal Python for-loop. I'd almost like to see the with-block translated into something like

    it = EXPR
    try:
        VAR = it.next()
    except StopIteration:
        raise WithNotStartedException
    err = None
    try:
        BODY
    except Exception, err:  # Pretend "except Exception:" == "except:"
        pass
    try:
        it.next_ex(err)
    except StopIteration:
        pass
    else:
        raise WithNotEndedException

where there is no looping at all, and the iterator is expected to yield exactly one item and then terminate.
Of course this looks a lot like:

    it = EXPR
    VAR = it.__enter__()
    err = None
    try:
        BODY
    except Exception, err:  # Pretend "except Exception:" == "except:"
        pass
    it.__exit__(err)

So maybe I'm just still stuck on the enter/exit semantics. ;-) STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From pje at telecommunity.com Mon Apr 25 05:32:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Apr 25 05:28:26 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote:
>So a block could return a value to the generator using a return
>statement; the generator can catch this by catching ReturnFlow.
>(Syntactic sugar could be "VAR = yield ..." like in Ruby.)

[uncontrolled drooling, followed by much rejoicing] If this were available to generators in general, you could untwist Twisted. I'm basically simulating this sort of exception/value passing in peak.events to do exactly that, except I have to do:

    yield somethingBlocking(); result=events.resume()

where events.resume() magically receives a value or exception from outside the generator and either returns or raises it. If next()-with-argument and next_ex() are available normally on generators, this would allow you to simulate co-routines without the events.resume() magic; the above would simply read:

    result = yield somethingBlocking()

The rest of the peak.events coroutine simulation would remain around to manage the generator stack and scheduling, but the syntax would be cleaner and the operation of it entirely unmagical.
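[The value/exception passing Phillip wants can be sketched with a small generator. The send()/throw() methods that later Python versions grew serve here as stand-ins for the proposed next()-with-argument and next_ex(); task and the yielded strings are purely illustrative.]

```python
def task():
    # A coroutine-style generator: the value of the yield expression
    # is supplied from outside, or an exception is raised at the yield.
    try:
        result = yield 'ready'   # would be: result = yield somethingBlocking()
    except ValueError:
        yield 'recovered'
    else:
        yield 'got %s' % result

g = task()
assert g.send(None) == 'ready'   # run to the first yield
assert g.send(42) == 'got 42'    # resume the yield expression with a value

g2 = task()
assert g2.send(None) == 'ready'
assert g2.throw(ValueError) == 'recovered'  # resume with an exception
```

A scheduler that loops over send()/throw() in this way is all the "trampoline" a Twisted-style framework would need, which is exactly the untwisting Phillip describes.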
From bob at redivi.com Mon Apr 25 05:39:52 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 05:39:57 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> Message-ID: <10a241223d5ad978b6a582adc1cc4954@redivi.com> On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: > At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >> So a block could return a value to the generator using a return >> statement; the generator can catch this by catching ReturnFlow. >> (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > [uncontrolled drooling, followed by much rejoicing] > > If this were available to generators in general, you could untwist > Twisted. I'm basically simulating this sort of exception/value > passing in peak.events to do exactly that, except I have to do: > > yield somethingBlocking(); result=events.resume() > > where events.resume() magically receives a value or exception from > outside the generator and either returns or raises it. If > next()-with-argument and next_ex() are available normally on > generators, this would allow you to simulate co-routines without the > events.resume() magic; the above would simply read: > > result = yield somethingBlocking() > > The rest of the peak.events coroutine simulation would remain around > to manage the generator stack and scheduling, but the syntax would be > cleaner and the operation of it entirely unmagical. Only if "result = yield somethingBlocking()" could also raise an exception. -bob From pje at telecommunity.com Mon Apr 25 05:57:37 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon Apr 25 05:53:34 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <10a241223d5ad978b6a582adc1cc4954@redivi.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> At 11:39 PM 4/24/05 -0400, Bob Ippolito wrote: >On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: > >>At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >>>So a block could return a value to the generator using a return >>>statement; the generator can catch this by catching ReturnFlow. >>>(Syntactic sugar could be "VAR = yield ..." like in Ruby.) >> >>[uncontrolled drooling, followed by much rejoicing] >> >>If this were available to generators in general, you could untwist >>Twisted. I'm basically simulating this sort of exception/value passing >>in peak.events to do exactly that, except I have to do: >> >> yield somethingBlocking(); result=events.resume() >> >>where events.resume() magically receives a value or exception from >>outside the generator and either returns or raises it. If >>next()-with-argument and next_ex() are available normally on generators, >>this would allow you to simulate co-routines without the events.resume() >>magic; the above would simply read: >> >> result = yield somethingBlocking() >> >>The rest of the peak.events coroutine simulation would remain around to >>manage the generator stack and scheduling, but the syntax would be >>cleaner and the operation of it entirely unmagical. > >Only if "result = yield somethingBlocking()" could also raise an exception. Read Guido's post again; he proposed that passing a result would occur by raising a ReturnFlow exception! In other words, it's the result passing that's the exceptional exception, while returning an exception is unexceptional. 
:) From bob at redivi.com Mon Apr 25 06:08:02 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 06:08:07 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> Message-ID: On Apr 24, 2005, at 11:57 PM, Phillip J. Eby wrote: > At 11:39 PM 4/24/05 -0400, Bob Ippolito wrote: > >> On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: >> >>> At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >>>> So a block could return a value to the generator using a return >>>> statement; the generator can catch this by catching ReturnFlow. >>>> (Syntactic sugar could be "VAR = yield ..." like in Ruby.) >>> >>> [uncontrolled drooling, followed by much rejoicing] >>> >>> If this were available to generators in general, you could untwist >>> Twisted. I'm basically simulating this sort of exception/value >>> passing in peak.events to do exactly that, except I have to do: >>> >>> yield somethingBlocking(); result=events.resume() >>> >>> where events.resume() magically receives a value or exception from >>> outside the generator and either returns or raises it. If >>> next()-with-argument and next_ex() are available normally on >>> generators, this would allow you to simulate co-routines without the >>> events.resume() magic; the above would simply read: >>> >>> result = yield somethingBlocking() >>> >>> The rest of the peak.events coroutine simulation would remain around >>> to manage the generator stack and scheduling, but the syntax would >>> be cleaner and the operation of it entirely unmagical. >> >> Only if "result = yield somethingBlocking()" could also raise an >> exception. > > Read Guido's post again; he proposed that passing a result would occur > by raising a ReturnFlow exception! 
In other words, it's the result
> passing that's the exceptional exception, while returning an exception
> is unexceptional. :)

Oh, right. Too much cold medicine tonight I guess :) You're right, of
course.

This facility would be VERY nice to ab^Wuse when writing any event
driven software.. not just Twisted.

-bob

From pje at telecommunity.com Mon Apr 25 06:20:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Apr 25 06:15:58 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>

At 09:12 PM 4/24/05 -0600, Steven Bethard wrote:
>I guess it would be helpful to see example where the looping
>with-block is useful.

Automatically retry an operation a set number of times before hard
failure:

    with auto_retry(times=3):
        do_something_that_might_fail()

Process each row of a database query, skipping and logging those that
cause a processing error:

    with x,y,z = log_errors(db_query()):
        do_something(x,y,z)

You'll notice, by the way, that some of these "runtime macros" may be
stackable in the expression. I'm somewhat curious what happens to
yields in the body of the macro block, but I assume they'll just do
what would normally occur. Somehow it seems strange, though, to be
yielding to something other than the enclosing 'with' object.

In any case, I'm personally more excited about the part where this
means we get to build co-routines with less magic. The 'with'
statement itself is of interest mainly for acquisition/release and
atomic/rollback scenarios, but being able to do retries or skip items
that cause errors is often handy. Sometimes you have a list of things
(such as event callbacks) where you need to call all of them, even if
one handler fails, but you can't afford to silence the errors either.
Code that deals with that scenario well is a bitch to write, and a
looping 'with' would make it a bit easier to write once and reuse many.
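The retry use case Phillip wants the looping 'with' for can already be approximated with a plain generator and an explicit for/try/except; the looping block would mainly hide this boilerplate. A rough sketch — auto_retry and the failing operation here are invented stand-ins, not an existing API:

```python
# Rough sketch of what a looping "with auto_retry(times=3)" would hide:
# a plain generator plus explicit for/try/except. All names invented.

def auto_retry(times):
    # Yield one attempt number per allowed try.
    for attempt in range(times):
        yield attempt

calls = []

def do_something_that_might_fail():
    # Fails twice, then succeeds, to exercise the retry path.
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"

result = None
for attempt in auto_retry(times=3):
    try:
        result = do_something_that_might_fail()
        break  # success: stop retrying
    except IOError:
        if attempt == 2:  # attempts exhausted: hard failure
            raise

print(result, len(calls))  # -> ok 3
```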
From steven.bethard at gmail.com Mon Apr 25 07:51:46 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon Apr 25 07:51:48 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>
References: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>
Message-ID: 

On 4/24/05, Phillip J. Eby wrote:
> At 09:12 PM 4/24/05 -0600, Steven Bethard wrote:
> >I guess it would be helpful to see example where the looping
> >with-block is useful.
>
> Automatically retry an operation a set number of times before hard failure:
>
>     with auto_retry(times=3):
>         do_something_that_might_fail()
>
> Process each row of a database query, skipping and logging those that cause
> a processing error:
>
>     with x,y,z = log_errors(db_query()):
>         do_something(x,y,z)

Thanks for the examples! If I understand your point here right, the
examples that can't be easily rewritten by composing a single-execution
with-block with a for-loop are examples where the number of iterations
of the for-loop depends on the error handling of the with-block. Could
you rewrite these with PEP 288 as something like:

    gen = auto_retry(times=3)
    for _ in gen:
        try:
            do_something_that_might_fail()
        except Exception, err:  # Pretend "except Exception:" == "except:"
            gen.throw(err)

    gen = log_errors(db_query())
    for x,y,z in gen:
        try:
            do_something(x,y,z)
        except Exception, err:  # Pretend "except Exception:" == "except:"
            gen.throw(err)

Obviously, the code is cleaner using the looping with-block. I'm just
trying to make sure I understand your examples right. So assuming we
had looping with-blocks, what would be the benefit of using a for-loop
instead? Just efficiency? Or is there something that a for-loop could
do that a with-block couldn't?

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy From martin at v.loewis.de Mon Apr 25 09:17:34 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon Apr 25 09:17:39 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <426C1E4E.4060809@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> <4269F919.6070901@v.loewis.de> <426C1E4E.4060809@ocf.berkeley.edu> Message-ID: <426C998E.7070402@v.loewis.de> Brett C. wrote: > OK, EXTRA_CFLAGS support has been checked into Makefile.pre.in and > distutils.sysconfig . Martin, please double-check I tweaked sysconfig the way > you wanted. It is the way I wanted it, but it doesn't work. Just try and use it for some extension modules to see for yourself, I tried with a harmless GCC option (-fgcse). The problem is that distutils only looks at the Makefile, not at the environment variables. So I changed parse_makefile to do what make does: fall back to the environment when no makefile variable is set. This was still not sufficient, since distutils never looks at CFLAGS. So I changed setup.py and sysconfig.py to fetch CFLAGS, and not bother with BASECFLAGS and EXTRA_CFLAGS. 
setup.py 1.218
NEWS 1.1289
sysconfig.py 1.65

Regards,
Martin

From fredrik at pythonware.com Mon Apr 25 09:39:12 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Apr 25 09:40:18 2005
Subject: [Python-Dev] Re: anonymous blocks
References: 
Message-ID: 

Guido van Rossum wrote:

> At the same time, having to use it as follows:
>
>     for f in with_file(filename):
>         for line in f:
>             print process(line)
>
> is really ugly, so we need new syntax, which also helps with keeping
> 'for' semantically backwards compatible. So let's use 'with', and then
> the using code becomes again this:
>
>     with f = with_file(filename):
>         for line in f:
>             print process(line)

or

    with with_file(filename) as f:
        ...

? (assignment inside block-opening constructs isn't used in Python
today, as far as I can tell...)

> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).
>
> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)

slightly weird, but useful enough to be cool. (maybe "return value" is
enough, though. the others may be slightly too weird... or should that
return perhaps be a "continue value"? you're going back to the top of
the loop, after all).

From mal at egenix.com Mon Apr 25 10:46:28 2005
From: mal at egenix.com (M.-A.
Lemburg) Date: Mon Apr 25 10:46:30 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> Message-ID: <426CAE64.8080404@egenix.com> Shannon -jj Behrens wrote: > On 4/20/05, M.-A. Lemburg wrote: > >>Fredrik Lundh wrote: >> >>>PS. a side effect of the for-in pattern is that I'm beginning to feel >>>that Python >>>might need a nice "switch" statement based on dictionary lookups, so I can >>>replace multiple callbacks with a single loop body, without writing too >>>many >>>if/elif clauses. >> >>PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) >> >>My use case for switch is that of a parser switching on tokens. >> >>mxTextTools applications would greatly benefit from being able >>to branch on tokens quickly. Currently, there's only callbacks, >>dict-to-method branching or long if-elif-elif-...-elif-else. > > I think "match" from Ocaml would be a much nicer addition to Python > than "switch" from C. PEP 275 is about branching based on dictionary lookups which is somewhat different than pattern matching - for which we already have lots and lots of different tools. The motivation behind the switch statement idea is that of interpreting the multi-state outcome of some analysis that you perform on data. The main benefit is avoiding Python function calls which are very slow compared to branching to inlined Python code. Having a simple switch statement would enable writing very fast parsers in Python - you'd let one of the existing tokenizers such as mxTextTools, re or one of the xml libs create the token input data and then work on the result using a switch statement. Instead of having one function call per token, you'd only have a single dict lookup. BTW, has anyone in this thread actually read the PEP 275 ? 
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 25 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com Mon Apr 25 11:26:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon Apr 25 11:27:35 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <426CB7C2.8030508@gmail.com>

Guido van Rossum wrote:
> It seems that the same argument that explains why generators are so
> good for defining iterators, also applies to the PEP 310 use case:
> it's just much more natural to write
>
>     def with_file(filename):
>         f = open(filename)
>         try:
>             yield f
>         finally:
>             f.close()
>
> than having to write a class with __entry__ and __exit__ and
> __except__ methods (I've lost track of the exact proposal at this
> point).

Indeed - the transaction example is very easy to write this way:

    def transaction():
        begin_transaction()
        try:
            yield None
        except:
            abort_transaction()
            raise
        else:
            commit_transaction()

> Also note that, unlike the for-loop translation, this does *not*
> invoke iter() on the result of EXPR; that's debatable but given that
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion", I think we can expect EXPR to produce a value
> that is already an iterator (rather than merely an iterable).

Not supporting iterables makes it harder to write a class which is
inherently usable in a with block, though. The natural way to make
iterable classes is to use 'yield' in the definition of __iter__ - if
iter() is not called, then that trick can't be used.
> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).

Perhaps 'continue' could be used to pass a value into the iterator,
rather than 'return'? (I believe this has been suggested previously in
the context of for loops.) This would permit 'return' to continue to
mean breaking out of the containing function (as for other loops).

> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)

So, "VAR = yield x" would expand to something like:

    try:
        yield x
    except ReturnFlow, ex:
        VAR = ex.value

?

> With a little extra magic we could also get the behavior that if the
> generator doesn't handle ControlFlow exceptions but re-raises them,
> they would affect the code containing the with statement; this means
> that the generator can decide whether return, break and continue are
> handled locally or passed through to the containing block.

That seems a little bit _too_ magical - it would be nice if break and
continue were defined to be local, and return to be non-local, as for
the existing loop constructs. For other non-local control flow,
application-specific exceptions will still be available.

Regardless, the ControlFlow exceptions do seem like a very practical
way of handling the underlying implementation.

> Note that EXPR doesn't have to return a generator; it could be any
> object that implements next() and next_ex(). (We could also require
> next_ex() or even next() with an argument; perhaps this is better.)

With this restriction (i.e.
requiring next_ex, next_exc, or Terry's suggested __next__), then the
backwards-compatible version would be simply your desired semantics,
plus an attribute check to exclude old-style iterators:

    it = EXPR
    if not hasattr(it, "__next__"):
        raise TypeError("'with' block requires 2nd gen iterator API support")
    err = None
    while True:
        try:
            VAR = it.next(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            pass

The generator objects created by using yield would supply the new API,
so would be usable immediately inside such 'with' blocks.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.skystorm.net

From skip at pobox.com Mon Apr 25 15:11:16 2005
From: skip at pobox.com (Skip Montanaro)
Date: Mon Apr 25 15:11:21 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <17004.60532.407476.331271@montanaro.dyndns.org>

    Guido> At the same time, having to use it as follows:

    Guido> for f in with_file(filename):
    Guido>     for line in f:
    Guido>         print process(line)

    Guido> is really ugly, so we need new syntax, which also helps with
    Guido> keeping 'for' semantically backwards compatible. So let's use
    Guido> 'with', and then the using code becomes again this:

    Guido> with f = with_file(filename):
    Guido>     for line in f:
    Guido>         print process(line)

How about deferring major new syntax changes until Py3K when the grammar
and semantic options might be more numerous? Given the constraints of
backwards compatibility, adding more syntax or shoehorning new semantics
into what's an increasingly crowded space seems to always result in an
unsatisfying compromise.

    Guido> Now let me propose a strawman for the translation of the latter
    Guido> into existing semantics. Let's take the generic case:

    Guido> with VAR = EXPR:
    Guido>     BODY

What about a multi-variable case?
Will you have to introduce a new level of indentation for each 'with' var? Skip From bob at redivi.com Mon Apr 25 15:53:13 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 15:53:23 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> References: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> Message-ID: <6a8c0d96b4709c84b223395306646ef0@redivi.com> On Apr 25, 2005, at 7:53 AM, Charles Hartman wrote: >> >> Someone should think about rewriting the zipfile module to be less >> hideous, include a repair feature, and be up to date with the latest >> specifications . > > -- and allow *deleting* a file from a zipfile. As far as I can tell, > you now can't (except by rewriting everything but that to a new > zipfile and renaming). Somewhere I saw a patch request for this, but > it was languishing, a year or more old. Or am I just totally missing > something? No, you're not missing anything. Deleting is hard, I guess. Either you'd have to shuffle the zip file around to reclaim the space, or just leave that spot alone and just remove its entry in the central directory. You'd probably want to look at what other software does to decide which approach to use (by default?). I don't see any markers in the format that would otherwise let you say "this file was deleted". -bob From tjreedy at udel.edu Mon Apr 25 16:14:07 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:15:31 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: "Fredrik Lundh" wrote in message news:d4i6hg$q88$1@sea.gmane.org... > Guido van Rossum wrote: > >> At the same time, having to use it as follows: >> >> for f in with_file(filename): > < for line in f: >> print process(line) >> >> is really ugly, so we need new syntax, which also helps with keeping >> 'for' semantically backwards compatible. 
So let's use 'with', and then >> the using code becomes again this: >> >> with f = with_file(filename): >> for line in f: >> print process(line) > > or > > with with_file(filename) as f: with as : would parallel the for-statement header and read smoother to me. for as : would not need new keyword, but would require close reading to distinguish 'as' from 'in'. Terry J. Reedy From s.percivall at chello.se Mon Apr 25 16:26:06 2005 From: s.percivall at chello.se (Simon Percivall) Date: Mon Apr 25 16:26:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> On 25 apr 2005, at 16.14, Terry Reedy wrote: > with as : > > would parallel the for-statement header and read smoother to me. > > for as : > > would not need new keyword, but would require close reading to > distinguish > 'as' from 'in'. But it also moves the value to the right, removing focus. Wouldn't "from" be a good keyword to overload here? "in"/"with"/"for"/"" from : //Simon From rodsenra at gpr.com.br Mon Apr 25 16:38:51 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Mon Apr 25 16:38:28 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> References: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> Message-ID: <20050425113851.10b00c37@localhost.localdomain> [ Simon Percivall ]: > [ Terry Reedy ]: > > with as : > > > > would parallel the for-statement header and read smoother to me. > > > > for as : > > > > would not need new keyword, but would require close reading to > > distinguish > > 'as' from 'in'. > > But it also moves the value to the right, removing focus. Wouldn't > "from" > be a good keyword to overload here? > > "in"/"with"/"for"/"" from : > I do not have strong feelings about this issue, but for completeness sake... 
Mixing both suggestions: from as : That resembles an import statement which some may consider good (syntax/keyword reuse) or very bad (confusion?, value focus). cheers, Senra -- Rodrigo Senra -- MSc Computer Engineer rodsenra(at)gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br/ Personal Blog http://rodsenra.blogspot.com/ From tjreedy at udel.edu Mon Apr 25 16:38:49 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:41:01 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <426CB7C2.8030508@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:426CB7C2.8030508@gmail.com... > Guido van Rossum wrote: > > statement expansion", I think we can expect EXPR to produce a value > > that is already an iterator (rather than merely an interable). > > Not supporting iterables makes it harder to write a class which is > inherently usable in a with block, though. The natural way to make > iterable classes is to use 'yield' in the definition of __iter__ - if > iter() is not called, then that trick can't be used. Would not calling iter() (or .__iter__) explicitly, instead of depending on the implicit call of for loops, suffice to produce the needed iterator? tjr From tjreedy at udel.edu Mon Apr 25 16:41:37 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:47:33 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <17004.60532.407476.331271@montanaro.dyndns.org> Message-ID: "Skip Montanaro" wrote in message news:17004.60532.407476.331271@montanaro.dyndns.org... > Guido> with VAR = EXPR: > Guido> BODY > > What about a multi-variable case? Will you have to introduce a new level > of > indentation for each 'with' var? I would expect to see the same structure unpacking as with assignment, for loops, and function calls: with a,b,c = x,y,z and so on. Terry J. 
Reedy From ark-mlist at att.net Mon Apr 25 17:00:16 2005 From: ark-mlist at att.net (Andrew Koenig) Date: Mon Apr 25 17:00:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050425113851.10b00c37@localhost.localdomain> Message-ID: <00b301c549a7$7e8022e0$6402a8c0@arkdesktop> > Mixing both suggestions: > > from as : > > > That resembles an import statement which some > may consider good (syntax/keyword reuse) or > very bad (confusion?, value focus). I have just noticed that this whole notion is fairly similar to the "local" statement in ML, the syntax for which looks like this: local in end The idea is that the first declarations, whatever they are, are processed without putting their names into the surrounding scope, then the second declarations are processed *with* putting their names into the surrounding scope. For example: local fun add(x:int, y:int) = x+y in fun succ(x) = add(x, 1) end This defines succ in the surrounding scope, but not add. So in Python terms, I think this would be local: in: or, for example: local: = value in: blah blah blah From tjreedy at udel.edu Mon Apr 25 17:11:26 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 17:13:21 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <426C54BF.2010906@ocf.berkeley.edu> Message-ID: "Brett C." wrote in message news:426C54BF.2010906@ocf.berkeley.edu... > And before anyone decries the fact that this might confuse a newbie > (which > seems to happen with every advanced feature ever dreamed up), remember > this > will not be meant for a newbie but for someone who has experience in > Python and > iterators at the minimum, and hopefully with generators. Not exactly > meant for > someone for which raw_input() still holds a "wow" factor for. 
=) I have accepted the fact that Python has become a two-level language: basic Python for expressing algorithms + advanced features (metaclasses, decorators, CPython-specific introspection and hacks, and now possibly 'with' or whatever) for solving software engineering issues. Perhaps there should correspondingly be two tutorials. Terry J. Reedy From mcherm at mcherm.com Mon Apr 25 18:42:54 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Apr 25 18:42:57 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Jim Jewett writes: > As best I can tell, the anonymous blocks are used to take > care of boilerplate code without changing the scope -- exactly > what macros are used for. Folks, I think that Jim is onto something here. I've been following this conversation, and it sounds to me as if we are stumbling about in the dark, trying to feel our way toward something very useful and powerful. I think Jim is right, what we're feeling our way toward is macros. The problem, of course, is that Guido (and others!) are on record as being opposed to adding macros to Python. (Even "good" macros... think lisp, not cpp.) I am not quite sure that I am convinced by the argument, but let me see if I can present it: Allowing macros in Python would enable individual programmers or groups to easily invent their own "syntax". Eventually, there would develop a large number of different Python "dialects" (as some claim has happened in the Lisp community) each dependent on macros the others lack. The most important casualty would be Python's great *readability*. (If this is a strawman argument, i.e. if you know of a better reason for keeping macros OUT of Python please speak up. Like I said, I've never been completely convinced of it myself.) 
I think it would be useful if we approached it like this: either what we want is the full power of macros (in which case the syntax we choose should be guided by that choice), or we want LESS than the full power of macros. If we want less, then HOW less? In other words, rather than hearing what we'd like to be able to DO with blocks, I'd like to hear what we want to PROHIBIT DOING with blocks. I think this might be a fruitful way of thinking about the problem which might make it easier to evaluate syntax suggestions. And if the answer is that we want to prohibit nothing, then the right solution is macros. -- Michael Chermside From facundobatista at gmail.com Mon Apr 25 18:46:15 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Mon Apr 25 18:46:17 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: References: Message-ID: On 4/22/05, Fredrik Lundh wrote: > > Is there a document that details which objects are cached in memory > > (to not create the same object multiple times, for performance)? > > why do you think you need to know? I was in my second class of the Python workshop I'm giving here in one Argentine University, and I was explaining how to think using name/object and not variable/value. Using id() for being pedagogic about the objects, the kids saw that id(3) was always the same, but id([]) not. I explained to them that Python, in some circumstances, caches the object, and I kept them happy enough. But I really don't know what objects and in which circumstances. . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From gvanrossum at gmail.com Mon Apr 25 18:52:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 18:52:05 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. > > The problem, of course, is that Guido (and others!) are on record as > being opposed to adding macros to Python. (Even "good" macros... think > lisp, not cpp.) I am not quite sure that I am convinced by the argument, > but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > > (If this is a strawman argument, i.e. if you know of a better reason > for keeping macros OUT of Python please speak up. Like I said, I've > never been completely convinced of it myself.) Nor am I; though I am also not completely unconvinced! The argument as presented here is probably to generic; taken literally, it would argue against having functions and classes as well... My problem with macros is actually more practical: Python's compiler is too dumb. 
I am assuming that we want to be able to import macros from other modules, and I am assuming that macros are expanded by the compiler, not at run time; but the compiler doesn't follow imports (that happens at run time) so there's no mechanism to tell the compiler about the new syntax. And macros that don't introduce new syntax don't seem very interesting (compared to what we can do already). > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? > > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. I'm personally at a loss understanding your question here. Perhaps you could try answering it for yourself? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Apr 25 18:57:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 18:57:11 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: References: Message-ID: > I was in my second class of the Python workshop I'm giving here in one > Argentine University, and I was explaining how to think using > name/object and not variable/value. > > Using id() for being pedagogic about the objects, the kids saw that > id(3) was always the same, but id([]) not. I explained to them that > Python, in some circumstances, caches the object, and I kept them > happy enough. > > But I really don't know what objects and in which circumstances. Aargh! Bad explanation. 
Or at least you're missing something: *mutable* objects (like lists) can *never* be cached, because they have explicit object semantics. For example each time the expression [] is evaluated it *must* produce a fresh list object (though it may be recycled from a GC'ed list object -- or any other GC'ed object, for that matter). But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always). Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at strakt.com Mon Apr 25 19:08:35 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Mon Apr 25 19:08:49 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D2413.60301@strakt.com> Michael Chermside wrote: >Jim Jewett writes: > > >>As best I can tell, the anonymous blocks are used to take >>care of boilerplate code without changing the scope -- exactly >>what macros are used for. >> >> > >Folks, I think that Jim is onto something here. > >I've been following this conversation, and it sounds to me as if we >are stumbling about in the dark, trying to feel our way toward something >very useful and powerful. I think Jim is right, what we're feeling our >way toward is macros. > >The problem, of course, is that Guido (and others!) are on record as >being opposed to adding macros to Python. (Even "good" macros... think >lisp, not cpp.) I am not quite sure that I am convinced by the argument, >but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". 
Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > >(If this is a strawman argument, i.e. if you know of a better reason >for keeping macros OUT of Python please speak up. Like I said, I've >never been completely convinced of it myself.) > > The typical argument in defense of macros is that macros are just like functions: you go to the definition and see what they do. But depending on how much variation they offer over the normal grammar, even eye-parsing them may be difficult. They make it easy to mix code that is evaluated immediately with code that will be evaluated, maybe even repeatedly, later, each macro having its own rules about this. In most cases the only way to discern this and know what is what is indeed looking at the macro definition. You can get flame wars about whether introducing slightly different variations of if is warranted. <.5 wink> My personal impression is that average macro definitions (I'm thinking about Common Lisp or Dylan and similar) are much less readable than average function definitions. Reading On Lisp may give an idea about this. That means that introducing macros in Python, because of the importance that readability has in Python, would need a serious design effort to make the macro definitions themselves readable. I think that's a challenging design problem. Also agree about the technical issues that Guido cited about referencing and when macro definitions enter into effect etc. 
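Samuele's point about macros mixing immediately-evaluated code with deferred code can be felt even with plain functions and lambdas, the closest analogue Python has today. A minimal sketch (the unless helper and the log list are hypothetical, just for illustration):

```python
# Even without macros, a callable that defers part of its argument
# already splits evaluation in two: the condition is evaluated at the
# call site, the body only if and when the callee decides to run it.
def unless(cond, body):
    """Run body() only when cond is false -- a tiny 'control macro'."""
    if not cond:
        return body()

log = []
unless(False, lambda: log.append("ran"))       # body is invoked
unless(True, lambda: log.append("skipped"))    # body is never invoked
print(log)   # ['ran']
```

With real macros every definition gets to invent its own version of this rule, which is exactly the readability cost being described.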
From p.f.moore at gmail.com Mon Apr 25 20:06:24 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Mon Apr 25 20:06:27 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <79990c6b050425110660cc2f3@mail.gmail.com> On 4/25/05, Michael Chermside wrote: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. I think the key difference with macros is that they act at compile time, not at run time. There is no intention here to provide any form of compile-time processing, and that makes all the difference. What I feel is the key concept here is that of "injecting" code into a template form (try...finally, or try..except..else, or whatever) [1]. This is "traditionally" handled by macros, and I see it as a *good* sign, that the discussion has centred around runtime mechanisms rather than compile-time ones. [1] Specifically, cases where functions aren't enough. If I try to characterise precisely what those cases are, all I can come up with is "when the code being injected needs to run in the current scope, not in the scope of a template function". Is that right? Paul. From shane.holloway at ieee.org Mon Apr 25 20:23:08 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Mon Apr 25 20:23:49 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D358C.70509@ieee.org> Michael Chermside wrote: > Jim Jewett writes: > >>As best I can tell, the anonymous blocks are used to take >>care of boilerplate code without changing the scope -- exactly >>what macros are used for. 
> > > Folks, I think that Jim is onto something here. > > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. I am very excited about the discussion of blocks. I think they can potentially address two things that are sticky to express in python right now. The first is to compress the common try/finally use cases around resource usage as with files and database commits. The second is language extensibility, which makes us think of what macros did for Lisp. Language extensibility has two motivations. First and foremost is to allow the programmer to express his or her *intent*. The second motivation is to reuse code and thereby increase productivity. Since methods already allow us to reuse code, our motivation is to increase expressivity. What blocks offer is to make Python's suites something a programmer can work with. Much like metaclasses put control of class details into the programmer's hands, or decorators allow us to modify method semantics. If the uses of decorators tell us anything, I'm pretty sure there are more potential uses of blocks than we could shake many sticks at. ;) So, the question comes back to what are blocks in the language extensibility case? To me, they would be something very like a code object returned from the compile method. To this we would need to attach the globals and locals where the block was from. Then we could use the normal exec statement to invoke the block whenever needed. Perhaps we could add a new mode 'block' to allow the ControlFlow exceptions mentioned elsewhere in the thread. We still need to find a way to pass arguments to the block so we are not tempted to insert them in locals and have them magically appear in the namespace. ;) Personally, I'm rather attached to "as (x, y):" introducing the block. 
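Shane's idea of a block as a code object executed in a chosen namespace can be approximated today with compile() and exec, minus the syntax he proposes. A rough sketch (the block source and names are made up):

```python
# A "block" as a plain code object, run later in a namespace we supply;
# the block's assignments land in that namespace, much as Shane wants.
block = compile("y = x * 2", "<block>", "exec")

namespace = {"x": 21}      # stands in for the captured globals/locals
exec(block, namespace)
print(namespace["y"])      # 42
```

What is missing, and what the whole thread is circling, is a way to write the block inline as an indented suite instead of a string.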
To conclude, I mocked up some potential examples for your entertainment. ;) Thanks for your time and consideration! -Shane Holloway Interfaces:: def interface(interfaceName, *bases, ***aBlockSuite): blockGlobals = aBlockSuite.globals().copy() blockGlobals.update(aBlockSuite.locals()) blockLocals = {} exec aBlock in blockGlobals, blockLocals return iterfaceType(interfaceName, bases, blockLocals) IFoo = interface('IFoo'): def isFoo(self): pass IBar = interface('IBar'): def isBar(self): pass IBaz = interface('IBaz', IFoo, IBar): def isBaz(self): pass Event Suites:: def eventSinksFor(events, ***aBlockSuite): blockGlobals = aBlockSuite.globals().copy() blockGlobals.update(aBlockSuite.locals()) blockLocals = {} exec aBlock in blockGlobals, blockLocals for name, value in blockLocals.iteritems(): if aBlockSuite.locals().get(name) is value: continue if callable(value): events.addEventFor(name, value) def debugScene(scene): eventSinksFor(scene.events): def onMove(pos): print "pos:", pos def onButton(which, state): print "button:", which, state def onKey(which, state): print "key:", which, state From p.f.moore at gmail.com Mon Apr 25 20:28:23 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Mon Apr 25 20:28:25 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <004201c5492e$ff05ca60$f100a8c0@ryoko> References: <004201c5492e$ff05ca60$f100a8c0@ryoko> Message-ID: <79990c6b05042511285237126c@mail.gmail.com> On 4/25/05, Tim Delaney wrote: > Guido van Rossum wrote: > > > but for backwards compatibility with the existing argument-less next() > > API I'm introducing a new iterator API next_ex() which takes an > > exception argument. If that argument is None, it should behave just > > like next(). Otherwise, if the iterator is a generator, this will > > Might this be a good time to introduce __next__ (having the same signature > and semantics as your proposed next_ex) and builtin next(obj, > exception=None)? 
> > def next(obj, exception=None): > > if hasattr(obj, '__next__'): > return obj.__next__(exception) > > if exception is not None: > return obj.next(exception) # Will raise an appropriate exception > > return obj.next() Hmm, it took me a while to get this, but what you're saying is that if you modify Guido's "what I really want" solution to use VAR = next(it, exc) then this builtin next makes "API v2" stuff using __next__ work while remaining backward compatible with old-style "API v1" stuff using 0-arg next() (as long as old-style stuff isn't used in a context where an exception gets passed back in). I'd suggest that the new builtin have a "magic" name (__next__ being the obvious one :-)) to make it clear that it's an internal implementation detail. Paul. PS The first person to replace builtin __next__ in order to implement a "next hook" of some sort, gets shot :-) From aahz at pythoncraft.com Mon Apr 25 20:49:34 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon Apr 25 20:49:38 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <426D358C.70509@ieee.org> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> Message-ID: <20050425184934.GA15135@panix.com> On Mon, Apr 25, 2005, Shane Holloway (IEEE) wrote: > > Interfaces:: > > def interface(interfaceName, *bases, ***aBlockSuite): > blockGlobals = aBlockSuite.globals().copy() > blockGlobals.update(aBlockSuite.locals()) > blockLocals = {} > > exec aBlock in blockGlobals, blockLocals > > return iterfaceType(interfaceName, bases, blockLocals) > > IFoo = interface('IFoo'): > def isFoo(self): pass Where does ``aBlock`` come from? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
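Tim Delaney's next() shim quoted above can be made runnable. In this sketch it is renamed next_compat to avoid shadowing the real built-in, and the two-argument __next__ reflects the proposal's hypothetical signature, not the protocol Python actually adopted; both iterator classes are made up:

```python
def next_compat(obj, exception=None):
    """Prefer the proposed two-argument __next__, else fall back to the
    classic zero-argument next() protocol."""
    if hasattr(obj, "__next__"):
        return obj.__next__(exception)
    if exception is not None:
        return obj.next(exception)  # old-style API will raise here
    return obj.next()

class OldStyle:
    """An 'API v1' iterator with only a zero-argument next()."""
    def __init__(self):
        self.i = 0
    def next(self):
        self.i += 1
        return self.i

class NewStyle:
    """An 'API v2' iterator accepting the exception argument."""
    def __next__(self, exception=None):
        if exception is not None:
            raise exception
        return 42

it = OldStyle()
print(next_compat(it), next_compat(it))   # 1 2
print(next_compat(NewStyle()))            # 42
```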
From shane at hathawaymix.org Mon Apr 25 21:13:06 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 21:13:12 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <79990c6b050425110660cc2f3@mail.gmail.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <79990c6b050425110660cc2f3@mail.gmail.com> Message-ID: <426D4142.8020703@hathawaymix.org> Paul Moore wrote: > I think the key difference with macros is that they act at compile > time, not at run time. There is no intention here to provide any form > of compile-time processing, and that makes all the difference. > > What I feel is the key concept here is that of "injecting" code into a > template form (try...finally, or try..except..else, or whatever) [1]. > This is "traditionally" handled by macros, and I see it as a *good* > sign, that the discussion has centred around runtime mechanisms rather > than compile-time ones. > > [1] Specifically, cases where functions aren't enough. If I try to > characterise precisely what those cases are, all I can come up with is > "when the code being injected needs to run in the current scope, not > in the scope of a template function". Is that right? That doesn't hold if the code being injected is a single Python expression, since you can put an expression in a lambda and code the template as a function. I would say you need a block template when the code being injected consists of one or more statements that need to run in the current scope. 
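Shane's distinction can be sketched concretely: a single expression is easy to inject into a template through a lambda, while a statement that rebinds names in the caller's scope is not. The with_open_file template and the temp-file handling below are made up for illustration:

```python
import os
import tempfile

def with_open_file(filename, body):
    """Template: open the file, run the injected expression `body` on
    it, and guarantee the close in a finally clause."""
    f = open(filename)
    try:
        return body(f)
    finally:
        f.close()

# A throwaway file so the sketch actually runs.
fd, path = tempfile.mkstemp()
os.write(fd, b"first line\nsecond line\n")
os.close(fd)

# Injecting a single *expression* needs no new syntax at all:
first = with_open_file(path, lambda f: f.readline())
print(first)

# But a *statement* that rebinds a name in the caller's scope cannot be
# wrapped this way: `lambda f: (total = len(f.read()))` is a syntax
# error, which is exactly the gap the blocks discussion is about.
os.remove(path)
```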
Shane From shane at hathawaymix.org Mon Apr 25 21:14:47 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 21:14:53 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D41A7.1060605@hathawaymix.org> Michael Chermside wrote: > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. One thing we don't need, I believe, is arbitrary transformation of code objects. That's actually already possible, thanks to Python's compiler module, although the method isn't clean yet. Zope uses the compiler module to sandbox partially-trusted Python code. For example, it redirects all print statements and replaces operations that change an attribute with a call to a function that checks access before setting the attribute. 
Also, we don't need any of these macros, AFAICT: http://gauss.gwydiondylan.org/books/drm/drm_86.html Shane From shane.holloway at ieee.org Mon Apr 25 21:20:51 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Mon Apr 25 21:21:37 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425184934.GA15135@panix.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> <20050425184934.GA15135@panix.com> Message-ID: <426D4313.8030308@ieee.org> Aahz wrote: > On Mon, Apr 25, 2005, Shane Holloway (IEEE) wrote: > >>Interfaces:: >> >> def interface(interfaceName, *bases, ***aBlockSuite): >> blockGlobals = aBlockSuite.globals().copy() >> blockGlobals.update(aBlockSuite.locals()) >> blockLocals = {} >> >> exec aBlock in blockGlobals, blockLocals >> >> return iterfaceType(interfaceName, bases, blockLocals) >> >> IFoo = interface('IFoo'): >> def isFoo(self): pass > > > Where does ``aBlock`` come from? Sorry! I renamed ``aBlock`` to ``aBlockSuite``, but missed a few. ;) From jimjjewett at gmail.com Mon Apr 25 21:34:55 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon Apr 25 21:35:01 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: Guido: > My problem with macros is actually more practical: Python's compiler > is too dumb. I am assuming that we want to be able to import macros > from other modules, and I am assuming that macros are expanded by the > compiler, not at run time; but the compiler doesn't follow imports ... Expanding at run-time is less efficient, but it works at least as well semantically. If today's alternative is manual cut-n-paste, I would still rather have the computer do it for me, to avoid accidental forks. It could also be done (though not as cleanly) by making macros act as import hooks. import defmacro # Stop processing until defmacro is loaded. # All future lines will be preprocessed by the # hook collection ... 
from defmacro import foo # installs a foo hook, good for the rest of the file Michael Chermside: >> I think it would be useful if we approached it like this: either what >> we want is the full power of macros (in which case the syntax we choose >> should be guided by that choice), or we want LESS than the full power >> of macros. If we want less, then HOW less? >> In other words, rather than hearing what we'd like to be able to DO >> with blocks, I'd like to hear what we want to PROHIBIT DOING with >> blocks. I think this might be a fruitful way of thinking about the >> problem which might make it easier to evaluate syntax suggestions. And >> if the answer is that we want to prohibit nothing, then the right >> solution is macros. > I'm personally at a loss understanding your question here. Perhaps you > could try answering it for yourself? Why not just introduce macros? If the answer is "We should, it is just hard to code", then use a good syntax for macros. If the answer is "We don't want xx sss (S\ Guido writes: > My problem with macros is actually more practical: Python's compiler > is too dumb. I am assuming that we want to be able to import macros > from other modules, and I am assuming that macros are expanded by the > compiler, not at run time; but the compiler doesn't follow imports > (that happens at run time) so there's no mechanism to tell the > compiler about the new syntax. And macros that don't introduce new > syntax don't seem very interesting (compared to what we can do > already). That's good to hear. It expresses fairly clearly what the challenges are in implementing macros for Python, and expressing the challenges makes it easier to attack the problem. My interest comes because some recent syntax changes (generators, generator expressions) have seemed to me like true language changes, but others (decorators, anonymous-blocks) to me just cry out "this would be easy as a macro!". 
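Decorators, which Michael calls out as the "easy as a macro" case, already give a run-time, function-granularity version of this boilerplate folding. A minimal sketch (the calls list is a stand-in for real logging):

```python
import functools

calls = []  # stand-in for real logging output

def traced(func):
    """Wrap the same before/after boilerplate around any function --
    macro-like reuse, but resolved entirely at run time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("enter %s" % func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            calls.append("exit %s" % func.__name__)
    return wrapper

@traced
def add(a, b):
    return a + b

result = add(2, 3)
print(result, calls)
```

The wrapped body still runs in its own scope, though, which is why decorators fall short of the scope-sharing the block proposals ask for.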
I wrote: > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? > > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. Guido replied: > I'm personally at a loss understanding your question here. Perhaps you > could try answering it for yourself? You guys just think too fast for me. When I started this email, I replied "Fair enough. One possibility is...". But while I was trying to condense my thoughts down from 1.5 pages to something short and coherent (it takes time to write it short) everything I was thinking became obsolete as both Paul Moore and Jim Jewett did exactly the kind of thinking I was hoping to inspire: Paul: > What I feel is the key concept here is that of "injecting" code into a > template form (try...finally, or try..except..else, or whatever) [...] > Specifically, cases where functions aren't enough. If I try to > characterise precisely what those cases are, all I can come up with is > "when the code being injected needs to run in the current scope, not > in the scope of a template function". Is that right? Jim: > Why not just introduce macros? If the answer is "We should, it is just > hard to code", then use a good syntax for macros. If the answer is > "We don't want > xx sss (S\ to ever be meaningful", then we need to figure out exactly what to > prohibit. [...] > Do we want to limit the changing part (the "anonymous block") to > only a single suite? 
That does work well with the "yield" syntax, but it > seems like an arbitrary restriction unless *all* we want are resource > wrappers. > > Or do we really just want a way to say that a function should share its > local namespace with it's caller or callee? In that case, maybe the answer > is a "lexical" or "same_namespace" keyword. My own opinion is that we DO want macros. I prefer a language have a few, powerful constructs rather than lots of specialized ones. (Yet I still believe that "doing different things should look different"... which is why I prefer Python to Lisp.) I think that macros could solve a LOT of problems. There are lots of things one might want to replace within macros, from identifiers to punctuation, but I'd be willing to live with just two of them: expressions, and "series-of-statements" (that's almost the same as a block). There are only two places I'd want to be able to USE a macro: where an expression is called for, and where a series-of-statements is called for. In both cases, I'd be happy with a function-call like syntax for including the macro. Well, that's a lot of "wanting"... now I all I need to do is invent a clever syntax that allows these in an elegant fashion while also solving Guido's point about imports (hint: the answer is that it ALL happens at runtime). I'll go think some while you guys zoom past me again. -- Michael Chermside From gvanrossum at gmail.com Mon Apr 25 22:16:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 22:16:18 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: > It could also be done (though not as cleanly) by making macros act as > import hooks. > > import defmacro # Stop processing until defmacro is loaded. > # All future lines will be preprocessed by the > # hook collection > ... > from defmacro import foo # installs a foo hook, good for the rest of the file Brrr. What about imports that aren't at the top level (e.g. 
inside a function)? > Why not just introduce macros? Because I've been using Python for 15 years without needing them? Sorry, but "why not add feature X" is exactly what we're trying to AVOID here. You've got to come up with some really good use cases before we add new features. "I want macros" just doesn't cut it. > If the answer is "We should, it is just > hard to code", then use a good syntax for macros. If the answer is > "We don't want > > xx sss (S\ > to ever be meaningful", then we need to figure out exactly what to > prohibit. Lisp macros are (generally, excluding read macros) limited > to taking and generating complete S-expressions. If that isn't enough > to enforce readability, then limiting blocks to expressions (or even > statements) probably isn't enough in python. I suspect you've derailed here. Or perhaps you should use a better example; I don't understand what the point is of using an example like "xx sss (S\ Do we want to limit the changing part (the "anonymous block") to > only a single suite? That does work well with the "yield" syntax, but it > seems like an arbitrary restriction unless *all* we want are resource > wrappers. Or loops, of course. Perhaps you've missed some context here? Nobody seems to be able to come up with other use cases, that's why "yield" is so attractive. > Or do we really just want a way to say that a function should share its > local namespace with it's caller or callee? In that case, maybe the answer > is a "lexical" or "same_namespace" keyword. Or maybe just a recipe to make > exec or eval do the right thing. > > def myresource(rcname, callback, *args): > rc=open(rcname) > same_namespace callback(*args) > close(rc) > > def process(*args): > ... But should the same_namespace modifier be part of the call site or part of the callee? You seem to be tossing examples around a little easily here. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fumanchu at amor.org Mon Apr 25 22:30:31 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 22:28:57 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF6@exchange.hqamor.amorhq.net> Michael Chermside wrote: > Jim: > > Why not just introduce macros? If the answer is "We > > should, it is just hard to code", then use a good > > syntax for macros. If the answer is "We don't want > > xx sss (S\ > to ever be meaningful", then we need to figure out exactly what to > > prohibit. > [...] > > Do we want to limit the changing part (the "anonymous block") to > > only a single suite? That does work well with the "yield" > syntax, but it > > seems like an arbitrary restriction unless *all* we want > are resource > > wrappers. > > > > Or do we really just want a way to say that a function > should share its > > local namespace with it's caller or callee? In that case, > maybe the answer > > is a "lexical" or "same_namespace" keyword. > > My own opinion is that we DO want macros. I prefer a language > have a few, > powerful constructs rather than lots of specialized ones. (Yet I still > believe that "doing different things should look > different"... which is > why I prefer Python to Lisp.) I think that macros could solve a LOT of > problems. > > There are lots of things one might want to replace within macros, from > identifiers to punctuation, but I'd be willing to live with > just two of them: expressions, and "series-of-statements" > (that's almost the same as a block). There are only two places > I'd want to be able to USE a macro: where an expression is > called for, and where a series-of-statements is called for. > In both cases, I'd be happy with a function-call > like syntax for including the macro. By "function-call like syntax" you mean something like this? 
def safe_file(filename, body, cleanup): f = open(filename) try: body() finally: f.close() cleanup() ... defmacro body: for line in f: print line[:line.find(":")] defmacro cleanup: print "file closed successfully" safe_file(filename, body, cleanup) If macros were to be evaluated at runtime, I'd certainly want to see them be first-class (meaning able to be referenced and passed around); I don't have much of a need for anonymous macros. Robert Brewer MIS Amor Ministries fumanchu@amor.org From fumanchu at amor.org Mon Apr 25 23:02:55 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 23:01:18 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> Guido van Rossum wrote: > > Why not just introduce macros? > > Because I've been using Python for 15 years without needing them? > Sorry, but "why not add feature X" is exactly what we're trying to > AVOID here. You've got to come up with some really good use cases > before we add new features. "I want macros" just doesn't cut it. I had a use-case recently which could be done using macros. I'll let you all decide whether it would be "better" with macros or not. ;) My poor-man's ORM uses descriptors to handle the properties of domain objects, a lot of which need custom triggers, constraint-checking, notifications, etc. The base class has: def __set__(self, unit, value): if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: unit._properties[self.key] = value At one time, I had something like: def __set__(self, unit, value): if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: if self.pre: self.pre(unit, value) unit._properties[self.key] = value if self.post: self.post(unit, value) ...to run pre- and post-triggers. 
But that became unwieldy recently when one of my post functions depended upon calculations inside the corresponding pre function. So currently, all subclasses just override __set__, which leads to a *lot* of duplication of code. If I could write the base class' __set__ to call "macros" like this: def __set__(self, unit, value): self.begin() if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: self.pre() unit._properties[self.key] = value self.post() self.end() defmacro begin: pass defmacro pre: pass defmacro post: pass defmacro end: pass ...(which would require macro-blocks which were decidedly *not* anonymous) then I could more cleanly write a subclass with additional "macro" methods: defmacro pre: old_children = self.children() defmacro post: for child in self.children: if child not in old_children: notify_somebody("New child %s" % child) Notice that the "old_children" local gets injected into the namespace of __set__ (the caller) when "pre" is executed, and is available inside of "post". The "self" name doesn't need to be rebound, either, since it is also available in __set__'s local scope. We also avoid all of the overhead of separate frames. The above is quite ugly written with callbacks (due to excessive argument passing), and is currently fragile when overriding __set__ (due to duplicated code). I'm sure there are other cases with both 1) a relatively invariant series of statements and 2) complicated extensions of that series. Of course, you can do the above with compile() and exec. Maybe I'm just averse to code within strings. Some ideas. Now tear 'em apart. 
:) Robert Brewer MIS Amor Ministries fumanchu@amor.org From jimjjewett at gmail.com Mon Apr 25 23:04:34 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon Apr 25 23:04:37 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: Michael Chermside: > There are lots of things one might want to replace within macros, from > identifiers to punctuation, but I'd be willing to live with just two of > them: expressions, and "series-of-statements" (that's almost the same as > a block). There are only two places I'd want to be able to USE a macro: > where an expression is called for, and where a series-of-statements is > called for. In both cases, I'd be happy with a function-call like syntax > for including the macro. I have often wanted to replace (parts of) strings, either because I'm writing a wrapper or because I want a non-English version to be loadable without having to wrap strings in my own source code. This is best done as an import hook, but if I had read-write access to (a copy of) the source code, I would use it. I'm not sure I want that door opened, because if I start needing to parse regex substitutions just to get a source code listing ... I won't be happy. I do think macros should be prevented from "changing the level" of the code they replace. Any suites/statements/expressions (including parentheses and strings) that are open before the macro must still be open afterwards, and any opened inside the macro must be closed inside the macro. For example def foo(x): print x macro1(x) print x might print different values for x on the two lines, but I would be less comfortable if it could result in any of the following: def foo(x): print x while True: # An invisible loop, because of print x # Changing the indent level def foo(x): print x return # and you thought it would print twice! (This one is iffy) print x def foo(x) print x [(""" (unclosed string or paren eats up the rest of the file...) print x def foo(x) "Hah! 
my backspaces and rubouts eliminated the print statements!" def foo(x) print x def anotherfunc(x, y, z): print x # Hey, I didn't even mess with the indent! And to be honest, even def foo(x): macro1(x) stmt1() # syntax error, except for the macro, so not ambiguous expanding to def foo(x) while x: print x # macro does not end on same indent level stmt1() is ... not something I want to worry about when I'm reading. Michael Chermside: > (hint: the answer is that it ALL happens at runtime). I have mixed feelings on this. It is more powerful that way, but it also limits future implementations -- and I'm not sure the extra power is entirely a good thing. defmacro(): print x # Hey, x was in global scope at runtime when *I* tested On The Other Hand, this certainly isn't the only piece of python that could *usually* be moved to compile-time, and I suppose it could piggyback on whatever extension is used for speeding up attribute lookup. -jJ From tjreedy at udel.edu Mon Apr 25 23:13:05 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 23:15:45 2005 Subject: [Python-Dev] Re: Re: Caching objects in memory References: Message-ID: Guido: But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always). Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look... ----------- I am sure that the fact that immutables *may* be cached is in the ref manual, but I have been under the impression that the private, *mutable* specifics for CPython are intentionally omitted so that people will not think of them as either fixed or as part of the language/library. I have previously suggested that there be a separate doc for CPython implementation details like this that some people want but which are not part of the language or library definition. Terry J. 
Reedy From shane at hathawaymix.org Mon Apr 25 23:29:01 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 23:29:10 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> Message-ID: <426D611D.9060100@hathawaymix.org> Robert Brewer wrote: > So currently, all subclasses just override __set__, which leads to a > *lot* of duplication of code. If I could write the base class' __set__ > to call "macros" like this: > > def __set__(self, unit, value): > self.begin() > if self.coerce: > value = self.coerce(unit, value) > oldvalue = unit._properties[self.key] > if oldvalue != value: > self.pre() > unit._properties[self.key] = value > self.post() > self.end() > > defmacro begin: > pass > > defmacro pre: > pass > > defmacro post: > pass > > defmacro end: > pass

Here is a way to write that using anonymous blocks:

    def __set__(self, unit, value):
        with self.setting(unit, value):
            if self.coerce:
                value = self.coerce(unit, value)
            oldvalue = unit._properties[self.key]
            if oldvalue != value:
                with self.changing(oldvalue, value):
                    unit._properties[self.key] = value

    def setting(self, unit, value):
        # begin code goes here
        yield None
        # end code goes here

    def changing(self, oldvalue, newvalue):
        # pre code goes here
        yield None
        # post code goes here

> ...(which would require macro-blocks which were decidedly *not* > anonymous) then I could more cleanly write a subclass with additional > "macro" methods: > > defmacro pre: > old_children = self.children() > > defmacro post: > for child in self.children: > if child not in old_children: > notify_somebody("New child %s" % child)

    def changing(self, oldvalue, newvalue):
        old_children = self.children()
        yield None
        for child in self.children:
            if child not in old_children:
                notify_somebody("New child %s" % child)

Which do you prefer? I like fewer methods.
;-) Shane From fumanchu at amor.org Mon Apr 25 23:40:12 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 23:38:34 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Shane Hathaway wrote: > Robert Brewer wrote: > > So currently, all subclasses just override __set__, which leads to a > > *lot* of duplication of code. If I could write the base > class' __set__ > > to call "macros" like this: > > > > def __set__(self, unit, value): > > self.begin() > > if self.coerce: > > value = self.coerce(unit, value) > > oldvalue = unit._properties[self.key] > > if oldvalue != value: > > self.pre() > > unit._properties[self.key] = value > > self.post() > > self.end() > > > > defmacro begin: > > pass > > > > defmacro pre: > > pass > > > > defmacro post: > > pass > > > > defmacro end: > > pass > > Here is a way to write that using anonymous blocks: > > def __set__(self, unit, value): > with self.setting(unit, value): > if self.coerce: > value = self.coerce(unit, value) > oldvalue = unit._properties[self.key] > if oldvalue != value: > with self.changing(oldvalue, value): > unit._properties[self.key] = value > > def setting(self, unit, value): > # begin code goes here > yield None > # end code goes here > > def changing(self, oldvalue, newvalue): > # pre code goes here > yield None > # post code goes here > ... > Which do you prefer? I like fewer methods. ;-) I still prefer more methods, because my actual use-cases are more complicated. Your solution would work for the specific case I gave, but try factoring in: * A subclass which needs to share locals between begin and post, instead of pre and post. or * A set of 10 subclasses which need the same begin() but different end() code. Yielding seems both too restrictive and too inside-out to be readable, IMO. 
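Shane's pattern above, a generator whose code before the yield is the "begin" half and whose code after it is the "end" half, can be exercised today with a small hand-rolled driver in place of the proposed with-statement. This is only a sketch: run_with(), demo_setting(), and events are invented names, not anything proposed in the thread.

```python
# A generator yields once; everything before the yield is the "begin"
# half, everything after it is the "end" half.  run_with() is a made-up
# driver standing in for the proposed with-statement.
def run_with(gen, body):
    next(gen)             # run the begin half, up to the yield
    try:
        body()            # the anonymous block
    finally:
        try:
            next(gen)     # run the end half, after the yield
        except StopIteration:
            pass          # generator finished normally

events = []

def demo_setting(unit, value):
    events.append("begin")
    yield
    events.append("end")

run_with(demo_setting("unit", 42), lambda: events.append("block"))
print(events)  # ['begin', 'block', 'end']
```

The finally clause is what makes the "end" half run even if the block raises, which is the whole point of the begin/end pairing being debated.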
Robert Brewer MIS Amor Ministries fumanchu@amor.org From jjinux at gmail.com Mon Apr 25 23:52:01 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Mon Apr 25 23:52:04 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <426CAE64.8080404@egenix.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <426CAE64.8080404@egenix.com> Message-ID: On 4/25/05, M.-A. Lemburg wrote: > Shannon -jj Behrens wrote: > > On 4/20/05, M.-A. Lemburg wrote: > > > >>Fredrik Lundh wrote: > >> > >>>PS. a side effect of the for-in pattern is that I'm beginning to feel > >>>that Python > >>>might need a nice "switch" statement based on dictionary lookups, so I can > >>>replace multiple callbacks with a single loop body, without writing too > >>>many > >>>if/elif clauses. > >> > >>PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) > >> > >>My use case for switch is that of a parser switching on tokens. > >> > >>mxTextTools applications would greatly benefit from being able > >>to branch on tokens quickly. Currently, there's only callbacks, > >>dict-to-method branching or long if-elif-elif-...-elif-else. > > > > I think "match" from Ocaml would be a much nicer addition to Python > > than "switch" from C. > > PEP 275 is about branching based on dictionary lookups which > is somewhat different than pattern matching - for which we > already have lots and lots of different tools. > > The motivation behind the switch statement idea is that of > interpreting the multi-state outcome of some analysis that > you perform on data. The main benefit is avoiding Python > function calls which are very slow compared to branching to > inlined Python code. 
> > Having a simple switch statement > would enable writing very fast parsers in Python - > you'd let one of the existing tokenizers such as mxTextTools, > re or one of the xml libs create the token input data > and then work on the result using a switch statement. > > Instead of having one function call per token, you'd > only have a single dict lookup. > > BTW, has anyone in this thread actually read the PEP 275 ? I'll admit that I haven't because dict-based lookups aren't as interesting to me as an Ocaml-style match statement. Furthermore, the argument "Instead of having one function call per token, you'd only have a single dict lookup" isn't very compelling to me personally, because I don't have such a performance problem in my applications, which isn't to say that it isn't important or that you don't have a valid point. Best Regards, -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From jimjjewett at gmail.com Tue Apr 26 00:01:06 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 00:01:09 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: On 4/25/05, Guido van Rossum wrote: > > It could also be done (though not as cleanly) by making macros act as > > import hooks. > Brrr. What about imports that aren't at the top level (e.g. inside a function)? Bad style already. :D If you want to use the macro, you have to ensure it was already imported. That said, I did say it wasn't as clean; think of it like pre-caching which dictionary that resolved an attribute lookup. Don't start with the complexity, but consider not making the optimization impossible. > > Why not just introduce macros? > Because I've been using Python for 15 years without needing them? And also without anonymous blocks or generator finalizers or resource managers. > Sorry, but "why not add feature X" is exactly what we're trying to > AVOID here. 
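The dict-based branching MAL describes for token parsing, one dict lookup per token instead of an if/elif chain, can be sketched without any new syntax. The token set and handler names below are invented for illustration; a real tokenizer such as mxTextTools or re would supply the tokens.

```python
# One dict lookup per token instead of an if/elif chain -- the shape of
# the fast-parser argument above.  Tokens and handlers are invented.
def on_plus(tok):
    return ("OP", "+")

def on_minus(tok):
    return ("OP", "-")

def on_atom(tok):
    # default branch: anything the dispatch table doesn't know about
    return ("ATOM", tok)

dispatch = {"+": on_plus, "-": on_minus}

def process(tokens):
    return [dispatch.get(tok, on_atom)(tok) for tok in tokens]

print(process(["1", "+", "2"]))  # [('ATOM', '1'), ('OP', '+'), ('ATOM', '2')]
```

The cost MAL objects to is still there, of course: each branch is a function call. A switch statement would inline the branch bodies, which is exactly what this dict-of-functions idiom cannot do.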
If anything is added, it might be better to add a single generalized tool instead of several special cases -- unless the tool is so general as to be hazardous. Unlimited macros are that hazardous. > > If the answer is "We don't want > > > > xx sss (S\ > > > to ever be meaningful", then we need to figure out exactly what to > > prohibit. > I don't understand what the point is of using an example like > "xx sss (S\> [yield works great for a single "anonymous block", but not so >> great for several blocks per macro/template.] > Pehaps you've missed some context here? Nobody seems to be able to > come up with other [than resource wrappers] use cases, that's why > "yield" is so attractive. Sorry; to me it seemed obvious that you would occasionally want to interleave the macro/template and the variable portion. Robert Brewer has since provided good examples at http://mail.python.org/pipermail/python-dev/2005-April/052923.html http://mail.python.org/pipermail/python-dev/2005-April/052924.html > > Or do we really just want a way to say that a function should share its > > local namespace with it's caller or callee? In that case, maybe the answer > > is a "lexical" or "same_namespace" keyword. Or maybe just a recipe to make > > exec or eval do the right thing. > But should the same_namespace modifier be part of the call site or > part of the callee? IMHO, it should be part of the calling site, because it is the calling site that could be surprised to find its own locals modified. The callee presumably runs through a complete call before it has a chance to be surprised. I did leave the decision open because I'm not certain that mention-in-caller wouldn't end up contorting a common code style. (It effectively forces the macro to be in control, and the "meaningful" code to be callbacks.) 
-jJ From tcdelaney at optusnet.com.au Tue Apr 26 00:10:44 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Tue Apr 26 00:10:46 2005 Subject: [Python-Dev] Re: anonymous blocks References: <004201c5492e$ff05ca60$f100a8c0@ryoko> <79990c6b05042511285237126c@mail.gmail.com> Message-ID: <001201c549e3$a07d57a0$0700a8c0@ryoko> Paul Moore wrote: > Hmm, it took me a while to get this, but what you're saying is that > if you modify Guido's "what I really want" solution to use > > VAR = next(it, exc) > > then this builtin next makes "API v2" stuff using __next__ work while > remaining backward compatible with old-style "API v1" stuff using > 0-arg next() (as long as old-style stuff isn't used in a context where > an exception gets passed back in). Yes, but it could also be used (almost) anywhere an explicit obj.next() is used.

    it = iter(seq)
    while True:
        print next(it)

for loops would also change to use builtin next() rather than calling it.next() directly. > I'd suggest that the new builtin have a "magic" name (__next__ being > the obvious one :-)) to make it clear that it's an internal > implementation detail. There aren't many builtins that have magic names, and I don't think this should be one of them - it has obvious uses other than as an implementation detail. > PS The first person to replace builtin __next__ in order to implement > a "next hook" of some sort, gets shot :-) Damn! There goes the use case ;) Tim Delaney From gvanrossum at gmail.com Tue Apr 26 00:11:07 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 00:11:09 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: It seems that what you call macros is really an unlimited preprocessor. I'm even less interested in that topic than in macros, and I haven't seen anything here to change my mind.
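The two-API situation Paul and Tim discuss can be bridged by exactly the kind of builtin they describe. Here is a sketch, where next_compat and OldStyle are invented names, of a helper that prefers a new-style __next__ ("API v2") and falls back to an old-style 0-arg next() ("API v1"); Python's real next() builtin, added later, behaves much like the first branch.

```python
# Prefer the new-style __next__ method and fall back to the old-style
# next() method.  next_compat and OldStyle are illustrative names only.
def next_compat(it):
    meth = getattr(it, "__next__", None)
    if meth is None:
        meth = it.next        # old-style ("API v1") iterator protocol
    return meth()

class OldStyle:
    """An iterator exposing only the old 0-arg next() method."""
    def __init__(self):
        self.n = 0
    def next(self):
        self.n += 1
        return self.n

print(next_compat(iter([10, 20])))  # 10 -- via __next__
print(next_compat(OldStyle()))      # 1  -- via the next() fallback
```

This covers only the 0-argument case; the exception-passing form next(it, exc) from Guido's sketch would need the extra argument threaded through to a next_ex-style method.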
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Tue Apr 26 00:20:04 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 00:20:07 2005 Subject: [Python-Dev] Re: switch statement Message-ID: M.-A. Lemburg wrote: > Having a simple switch statement > would enable writing very fast parsers in Python - ... > Instead of having one function call per token, you'd > only have a single dict lookup. > BTW, has anyone in this thread actually read the PEP 275 ? I haven't actually seen any use cases outside of parsers branching on a constant token. When I see stacked elif clauses, the condition almost always includes some computation (perhaps only ".startswith" or "in" or a regex match), and there are often cases which look at a second variable. If speed for a limited number of cases is the only advantage, then I would say it belongs in (at most) the implementation, rather than the language spec. -jJ From shane at hathawaymix.org Tue Apr 26 00:30:09 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Tue Apr 26 00:30:17 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Message-ID: <426D6F71.6020103@hathawaymix.org> Robert Brewer wrote: > I still prefer more methods, because my actual use-cases are more > complicated. Your solution would work for the specific case I gave, but > try factoring in: > > * A subclass which needs to share locals between begin and post, instead > of pre and post. > > or > > * A set of 10 subclasses which need the same begin() but different end() > code. > > Yielding seems both too restrictive and too inside-out to be readable, > IMO. Ok, that makes sense. However, one of your examples seemingly pulls a name, 'old_children', out of nowhere. That's hard to fix. 
One of the greatest features of Python is the simple name scoping; we can't lose that. Shane From pedronis at strakt.com Tue Apr 26 00:45:43 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue Apr 26 00:44:02 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Message-ID: <426D7317.10104@strakt.com> Robert Brewer wrote: > Shane Hathaway wrote: > >>Robert Brewer wrote: >> >>>So currently, all subclasses just override __set__, which leads to a >>>*lot* of duplication of code. If I could write the base >> >>class' __set__ >> >>>to call "macros" like this: >>> >>> def __set__(self, unit, value): >>> self.begin() >>> if self.coerce: >>> value = self.coerce(unit, value) >>> oldvalue = unit._properties[self.key] >>> if oldvalue != value: >>> self.pre() >>> unit._properties[self.key] = value >>> self.post() >>> self.end() >>> >>> defmacro begin: >>> pass >>> >>> defmacro pre: >>> pass >>> >>> defmacro post: >>> pass >>> >>> defmacro end: >>> pass >> >>Here is a way to write that using anonymous blocks: >> >> def __set__(self, unit, value): >> with self.setting(unit, value): >> if self.coerce: >> value = self.coerce(unit, value) >> oldvalue = unit._properties[self.key] >> if oldvalue != value: >> with self.changing(oldvalue, value): >> unit._properties[self.key] = value >> >> def setting(self, unit, value): >> # begin code goes here >> yield None >> # end code goes here >> >> def changing(self, oldvalue, newvalue): >> # pre code goes here >> yield None >> # post code goes here >> > > ... > >>Which do you prefer? I like fewer methods. ;-) > > > I still prefer more methods, because my actual use-cases are more > complicated. Your solution would work for the specific case I gave, but > try factoring in: > > * A subclass which needs to share locals between begin and post, instead > of pre and post. 
> > or > > * A set of 10 subclasses which need the same begin() but different end() > code. > > Yielding seems both too restrictive and too inside-out to be readable, > IMO.

it seems what you are asking for are functions that are evaluated in the namespace of the caller:

- this seems fragile; the only safe way to implement 'begin' etc. is to know exactly what goes on in __set__ and what names are used there

- if you throw in deferred evaluation for exprs or suites passed in as arguments it gets even worse; and even without considering that, it seems pretty horrid implementation-wise

Notice that even in Common Lisp you cannot really do this; you could define a macro that produces a definition for __set__ and takes fragments corresponding to begin ... etc From abo at minkirri.apana.org.au Tue Apr 26 02:01:05 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Apr 26 02:01:21 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: Message-ID: <1114473665.3698.2.camel@schizo> On Mon, 2005-04-25 at 18:20 -0400, Jim Jewett wrote: [...] > If speed for a limited number of cases is the only advantage, > then I would say it belongs in (at most) the implementation, > rather than the language spec. Agreed. I don't find any switch syntaxes better than if/elif/else. Speed benefits belong in implementation optimisations, not new bad syntax. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From exogen at gmail.com Tue Apr 26 03:21:37 2005 From: exogen at gmail.com (Brian Beck) Date: Tue Apr 26 03:25:48 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <1114473665.3698.2.camel@schizo> References: <1114473665.3698.2.camel@schizo> Message-ID: Donovan Baarda wrote: > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > benefits belong in implementation optimisations, not new bad syntax. I posted this 'switch' recipe to the Cookbook this morning, it saves some typing over the if/elif/else construction, and people seemed to like it.
Take a look: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/410692 -- Brian Beck Adventurer of the First Order From abo at minkirri.apana.org.au Tue Apr 26 04:20:07 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Apr 26 04:21:36 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <1114473665.3698.2.camel@schizo> Message-ID: <1114482007.3698.15.camel@schizo> On Mon, 2005-04-25 at 21:21 -0400, Brian Beck wrote: > Donovan Baarda wrote: > > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > > benefits belong in implementation optimisations, not new bad syntax. > > I posted this 'switch' recipe to the Cookbook this morning, it saves > some typing over the if/elif/else construction, and people seemed to > like it. Take a look: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/410692 Very clever... you have shown that current python syntax is capable of almost exactly replicating a C case statement. My only problem is C case statements are ugly. A simple if/elif/else is much more understandable to me. The main benefit in C of case statements is the compiler can optimise them. This copy of a C case statement will be slower than an if/elif/else, and just as ugly :-) -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From tjreedy at udel.edu Tue Apr 26 04:33:23 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 26 04:33:44 2005 Subject: [Python-Dev] Re: Re: Caching objects in memory References: Message-ID: "Terry Reedy" wrote in message news:d4jm79$uji$1@sea.gmane.org... > Guido: > > But for *immutable* objects (like numbers, strings and tuples) the > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). > > Hope this helps! I would think this is in the docs somewhere but > probably not in a place where one would ever think to look... 
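The caching Guido describes can be observed directly. Note that everything below is a CPython implementation detail, which is exactly Terry's point: the cached range (he recalled "-5 and 100" in 2005; current CPython caches roughly -5 through 256) is not guaranteed by the language.

```python
# Small ints are cached by CPython (an implementation detail).  Using
# int("...") forces the objects to be created at runtime, so the result
# isn't an artifact of compile-time constant folding.
a = int("10")
b = int("10")      # both come back from the small-int cache
c = int("1000")
d = int("1000")    # outside the cache: two distinct objects
print(a is b, c is d)  # True False (in CPython)
```

Code that relies on `is` giving True here is broken by definition; only `==` is guaranteed.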
> > ----------- To be clearer, the above quotes what Guido wrote in the post of his that I am responding to. Only the below is my response. > I am sure that the fact that immutables *may* be cached is in the ref > manual, but I have been under the impression that the private, *mutable* > specifics for CPython are intentionally omitted so that people will not > think of them as either fixed or as part of the language/library. > > I have previously suggested that there be a separate doc for CPython > implementation details like this that some people want but which are not > part of the language or library definition. > Terry J. Reedy From greg.ewing at canterbury.ac.nz Tue Apr 26 04:47:37 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 04:47:54 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <001201c549e3$a07d57a0$0700a8c0@ryoko> References: <004201c5492e$ff05ca60$f100a8c0@ryoko> <79990c6b05042511285237126c@mail.gmail.com> <001201c549e3$a07d57a0$0700a8c0@ryoko> Message-ID: <426DABC9.4020801@canterbury.ac.nz> Tim Delaney wrote: > There aren't many builtins that have magic names, and I don't think this > should be one of them - it has obvious uses other than as an > implementation detail. I think there's some confusion here. As I understood the suggestion, __next__ would be the Python name of the method corresponding to the tp_next typeslot, analogously with __len__, __iter__, etc. There would be a builtin function next(obj) which would invoke obj.__next__(), for use by Python code. For loops wouldn't use it, though; they would continue to call the tp_next typeslot directly. > Paul Moore wrote: >> PS The first person to replace builtin __next__ in order to implement >> a "next hook" of some sort, gets shot :-) I think he meant next(), not __next__. And it wouldn't work anyway, since as I mentioned above, C code would bypass next() and call the typeslot directly. I'm +1 on moving towards __next__, BTW. IMO, that's the WISHBDITFP. 
:-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From michael.walter at gmail.com Tue Apr 26 05:15:52 2005 From: michael.walter at gmail.com (Michael Walter) Date: Tue Apr 26 05:15:55 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: <877e9a1705042520151e493f3f@mail.gmail.com> A couple of examples out of my tired head (solely from a user perspective) :-) Embedding domain specific language (ex.: state machine):

    stateful Person:
        state Calm(initial=True):
            def react(event):
                self.chill_pill.take()
                ignore(event)
        state Furious:
            def react(event):
                self.say("Macros are the evil :)")
                react(event)  # xD

    p = Person()
    p.become(Furious)
    p.react(42)

---

Embedding domain specific language (ex.: markup language):

    # no, i haven't thought about whether the presented syntax as such is unambiguous
    # enough to make sense
    def hello_world():
        :
            :
                : "Tralalalala"
            <body>:
                for g in uiods:
                    <h1>: uido2str(g)

---

Embedding domain-specific language (ex.: badly-designed database table):

    deftable Player:
        id: primary_key(integer)  # does this feel backward?
        handle: string
        fans: m2n_assoc(Fan)

---

Control constructs:

    forever:
        print "tralalala"

    unless you.are(LUCKY):
        print "awwww"

I'm not sure whether this is the Python you want it to become, so in a certain sense I feel kind of counterproductive now (sublanguage design is hard at 11 PM, which might actually prove someone's point that the language designer shouldn't allow people to do such things). I'm sure other people are more mature or at least less tired than me, though, so I beg to differ :-), Michael On 4/25/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > It seems that what you call macros is really an unlimited > preprocessor.
> I'm even less interested in that topic than in macros, > and I haven't seen anything here to change my mind. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > From greg.ewing at canterbury.ac.nz Tue Apr 26 05:38:48 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 05:39:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042416572da9db71@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> Message-ID: <426DB7C8.5020708@canterbury.ac.nz> Guido van Rossum wrote:

> with VAR = EXPR:
>     BODY
>
> This would translate to the following code:
>
> it = EXPR
> err = None
> while True:
>     try:
>         if err is None:
>             VAR = it.next()
>         else:
>             VAR = it.next_ex(err)
>     except StopIteration:
>         break
>     try:
>         err = None
>         BODY
>     except Exception, err:  # Pretend "except Exception:" == "except:"
>         if not hasattr(it, "next_ex"):
>             raise

I like the general shape of this, but I have one or two reservations about the details. 1) We're going to have to think carefully about the naming of functions designed for use with this statement. If 'with' is going to be in there as a keyword, then it really shouldn't be part of the function name as well. Instead of

    with f = with_file(pathname):
        ...

I would rather see something like

    with f = opened(pathname):
        ...

This sort of convention (using a past participle as a function name) would work for some other cases as well:

    with some_data.locked():
        ...

    with some_resource.allocated():
        ...

On the negative side, not having anything like 'with' in the function name means that the fact the function is designed for use in a with-statement could be somewhat non-obvious.
Since there's not going to be much other use for such a function, this is a bad thing. It could also lead people into subtle usage traps such as

    with f = open(pathname):
        ...

which would fail in a somewhat obscure way. So maybe the 'with' keyword should be dropped (again!) in favour of

    with_opened(pathname) as f:
        ...

2) I'm not sure about the '='. It makes it look rather deceptively like an ordinary assignment, and I'm sure many people are going to wonder what the difference is between

    with f = opened(pathname):
        do_stuff_to(f)

and simply

    f = opened(pathname)
    do_stuff_to(f)

or even just unconsciously read the first as the second without noticing that anything special is going on. Especially if they're coming from a language like Pascal which has a much less magical form of with-statement. So maybe it would be better to make it look more different:

    with opened(pathname) as f:
        ...

* It seems to me that this same exception-handling mechanism would be just as useful in a regular for-loop, and that, once it becomes possible to put 'yield' in a try-statement, people are going to *expect* it to work in for-loops as well. Guido has expressed concern about imposing extra overhead on all for-loops. But would the extra overhead really be all that noticeable? For-loops already put a block on the block stack, so the necessary processing could be incorporated into the code for unwinding a for-block during an exception, and little if anything would need to change in the absence of an exception. However, if for-loops also gain this functionality, we end up with the rather embarrassing situation that there is *no difference* in semantics between a for-loop and a with-statement! This could be "fixed" by making the with-statement not loop, as has been suggested. That was my initial thought as well, but having thought more deeply, I'm starting to think that Guido was right in the first place, and that a with-statement should be capable of looping. I'll elaborate in another post.
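Greg's opened() convention can be sketched under Guido's proposed translation: the generator yields the open file, and the code after the yield runs when the block is done. This is only a sketch, not Greg's actual code; run_with() is an invented driver standing in for the proposed statement, and the demo file path is made up.

```python
import os
import tempfile

# The generator yields the open file; the finally clause runs when the
# block finishes (or raises), closing it.
def opened(pathname):
    f = open(pathname)
    try:
        yield f
    finally:
        f.close()

# Invented driver: what the proposed with-statement would roughly do
# for the normal (non-exception) path -- bind each yielded value and
# run the body.
def run_with(gen, body):
    for var in gen:
        body(var)

# Set up a throwaway file to demonstrate with.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
out = open(path, "w")
out.write("hello")
out.close()

seen = []
run_with(opened(path), lambda f: seen.append(f.read()))
print(seen)  # ['hello']
```

The exception-passing half of Guido's translation (next_ex and the err plumbing) is deliberately omitted here; the sketch shows only why a past-participle name like opened() reads naturally at the call site.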
> So a block could return a value to the generator using a return > statement; the generator can catch this by catching ReturnFlow. > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) This is a very elegant idea, but I'm seriously worried by the possibility that a return statement could do something other than return from the function it's written in, especially if for-loops also gain this functionality. Intercepting break and continue isn't so bad, since they're already associated with the loop they're in, but return has always been an unconditional get-me-out-of-this-function. I'd feel uncomfortable if this were no longer true. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From kbk at shore.net Tue Apr 26 05:56:11 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Tue Apr 26 05:56:44 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504260356.j3Q3uBNw020893@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  316 open ( +2) / 2831 closed ( +7) / 3147 total ( +9)
Bugs    :  908 open (+10) / 4941 closed (+20) / 5849 total (+30)
RFE     :  178 open ( +1) /  153 closed ( +2) /  331 total ( +3)

New / Reopened Patches
______________________

package_data chops off first char of default package  (2005-04-15)
       http://python.org/sf/1183712  opened by  Wummel

[ast] fix for 1183468: return/yield in class  (2005-04-16)
       http://python.org/sf/1184418  opened by  logistix

urllib2 dloads failing through HTTP proxy w/ auth  (2005-04-18)
       http://python.org/sf/1185444  opened by  Mike Fleetwood

binascii.b2a_qp does not handle binary data correctly  (2005-04-18)
       http://python.org/sf/1185447  opened by  Eric Huss

Automatically build fpectl module from setup.py  (2005-04-18)
       http://python.org/sf/1185529  opened by  Jeff Epler

Typo in Curses-Function doc  (2005-04-20)
       http://python.org/sf/1186781  opened by  grzankam

subprocess: optional auto-reaping fixing os.wait() lossage  (2005-04-21)
       http://python.org/sf/1187312  opened by  Mattias Engdegård

Add const specifier to PySpam_System prototype  (2005-04-21)
       http://python.org/sf/1187396  opened by  Luis Bruno

Don't assume all exceptions are SyntaxError's  (2005-04-25)
       http://python.org/sf/1189210  opened by  John Ehresman

Patches Closed
______________

fix typos in Library Reference  (2005-04-10)
       http://python.org/sf/1180062  closed by  doerwalter

[AST] Fix for core in test_grammar.py  (2005-04-08)
       http://python.org/sf/1179513  closed by  nascheme

Implemented new 'class foo():pass' syntax  (2005-04-03)
       http://python.org/sf/1176019  closed by  nascheme

range() in for loops, again  (2005-04-12)
       http://python.org/sf/1181334  closed by  arigo

Info Associated with Merge to AST  (2005-01-07)
       http://python.org/sf/1097671  closed by  kbk

New / Reopened Bugs
___________________

Minor error in tutorial  (2005-04-14)
CLOSED http://python.org/sf/1183274  opened by  Konrads Smelkovs

check for return/yield outside function is wrong  (2005-04-15)
       http://python.org/sf/1183468  opened by  Neil Schemenauer

try to open /dev/null as directory  (2005-04-15)
       http://python.org/sf/1183585  opened by  Roberto A. Foglietta

PyDict_Copy() can return non-NULL value on error  (2005-04-15)
CLOSED http://python.org/sf/1183742  opened by  Phil Thompson

Popen4 wait() fails sporadically with threads  (2005-04-15)
       http://python.org/sf/1183780  opened by  Taale Skogan

return val in __init__ doesn't raise TypeError in new-style  (2005-04-15)
CLOSED http://python.org/sf/1183959  opened by  Adal Chiriliuc

dest parameter in optparse  (2005-04-15)
       http://python.org/sf/1183972  opened by  ahmado

Missing trailing newline with comment raises SyntaxError  (2005-04-15)
       http://python.org/sf/1184112  opened by  Eric Huss

example broken in section 1.12 of Extending & Embedding  (2005-04-16)
       http://python.org/sf/1184380  opened by  bamoore

Read-only property attributes raise wrong exception  (2005-04-16)
CLOSED http://python.org/sf/1184449  opened by  Barry A. Warsaw

itertools.imerge: merge sequences  (2005-04-18)
CLOSED http://python.org/sf/1185121  opened by  Jurjen N.E. Bos

pydoc doesn't find all module doc strings  (2005-04-18)
       http://python.org/sf/1185124  opened by  Kent Johnson

PyObject_Realloc bug in obmalloc.c  (2005-04-19)
       http://python.org/sf/1185883  opened by  Kristján Valur

python socketmodule dies on ^c  (2005-04-19)
CLOSED http://python.org/sf/1185931  opened by  nodata

tempnam doc doesn't include link to tmpfile  (2005-04-19)
       http://python.org/sf/1186072  opened by  Ian Bicking

[AST] genexps get scoping wrong  (2005-04-19)
       http://python.org/sf/1186195  opened by  Brett Cannon

[AST] assert failure on ``eval("u'\Ufffffffe'")``  (2005-04-19)
       http://python.org/sf/1186345  opened by  Brett Cannon

[AST] automatic unpacking of arguments broken  (2005-04-19)
       http://python.org/sf/1186353  opened by  Brett Cannon

Python Programming FAQ should be updated for Python 2.4  (2005-02-09)
CLOSED http://python.org/sf/1119439  reopened by  montanaro

nntplib shouldn't raise generic EOFError  (2005-04-20)
       http://python.org/sf/1186900  opened by  Matt Roper

TypeError message on bad iteration is misleading  (2005-04-21)
       http://python.org/sf/1187437  opened by  Roy Smith

Pickle with HIGHEST_PROTOCOL "ord() expected..."  (2005-04-22)
CLOSED http://python.org/sf/1188175  opened by  Heiko Selber

Rebuilding from source on RH9 fails (_tkinter.so missing)  (2005-04-22)
       http://python.org/sf/1188231  opened by  Marty Heyman

Python 2.4 Not Recognized by Any Programs  (2005-04-23)
       http://python.org/sf/1188637  opened by  Yoshi Nagasaki

zipfile module and 2G boundary  (2005-04-24)
       http://python.org/sf/1189216  opened by  Bob Ippolito

Seg Fault when compiling small program  (2005-04-24)
       http://python.org/sf/1189248  opened by  Reginald B. Charney

LINKCC incorrect  (2005-04-25)
       http://python.org/sf/1189330  opened by  Christoph Ludwig

LINKCC incorrect  (2005-04-25)
CLOSED http://python.org/sf/1189337  opened by  Christoph Ludwig

file.write(x) where len(x) > 64*1024**2 is unreliable  (2005-04-25)
CLOSED http://python.org/sf/1189525  opened by  Martin Gfeller

pydoc may hide non-private doc strings.  (2005-04-25)
       http://python.org/sf/1189811  opened by  J Livingston

"Atuple containing default argument values ..."  (2005-04-25)
       http://python.org/sf/1189819  opened by  Chad Whitacre

Bugs Closed
___________

Minor error in tutorial  (2005-04-14)
       http://python.org/sf/1183274  closed by  doerwalter

copy.py bug  (2005-02-03)
       http://python.org/sf/1114776  closed by  anthonybaxter

re.escape(s) prints wrong for chr(0)  (2005-04-13)
       http://python.org/sf/1182603  closed by  nascheme

PyDict_Copy() can return non-NULL value on error  (2005-04-15)
       http://python.org/sf/1183742  closed by  rhettinger

return val in __init__ doesn't raise TypeError in new-style  (2005-04-15)
       http://python.org/sf/1183959  closed by  rhettinger

dir() does not include _  (2005-04-13)
       http://python.org/sf/1182614  closed by  nickjacobson

Read-only property attributes raise wrong exception  (2005-04-16)
       http://python.org/sf/1184449  closed by  bwarsaw

Readline segfault  (2005-04-05)
       http://python.org/sf/1176893  closed by  mwh

python socketmodule dies on ^c  (2005-04-19)
       http://python.org/sf/1185931  closed by  nodata101

Bad sys.executable value for bdist_wininst install script  (2005-04-12)
       http://python.org/sf/1181619  closed by  theller

StringIO and cStringIO don't provide 'name' attribute  (2005-04-03)
       http://python.org/sf/1175967  closed by  mwh

Python Interpreter shell is crashed  (2005-01-12)
       http://python.org/sf/1100673  closed by  mwh

Python Programming FAQ should be updated for Python 2.4  (2005-02-09)
       http://python.org/sf/1119439  closed by  jafo

Dictionary Parsing Problem  (2005-02-05)
       http://python.org/sf/1117048  closed by  tjreedy

2.4.1 breaks pyTTS  (2005-04-07)
       http://python.org/sf/1178624  closed by  doerwalter

Pickle with HIGHEST_PROTOCOL "ord() expected..."
(2005-04-22) http://python.org/sf/1188175 closed by drhok multiple broken links in profiler docs (2005-03-30) http://python.org/sf/1173773 closed by isandler LINKCC incorrect (2005-04-25) http://python.org/sf/1189337 closed by cludwig file.write(x) where len(x) > 64*1024**2 is unreliable (2005-04-25) http://python.org/sf/1189525 closed by tim_one New / Reopened RFE __________________ "replace" function should accept lists. (2005-04-17) CLOSED http://python.org/sf/1184678 opened by Poromenos Make bisect.* functions accept an optional compare function (2005-04-18) http://python.org/sf/1185383 opened by Marcin Ciura RFE Closed __________ "replace" function should accept lists. (2005-04-17) http://python.org/sf/1184678 closed by rhettinger itertools.imerge: merge sequences (2005-04-18) http://python.org/sf/1185121 closed by jneb From greg.ewing at canterbury.ac.nz Tue Apr 26 06:00:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:00:34 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426C54BF.2010906@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> Message-ID: <426DBCCE.40903@canterbury.ac.nz> Brett C. wrote: > And before anyone decries the fact that this might confuse a newbie (which > seems to happen with every advanced feature ever dreamed up), remember this > will not be meant for a newbie but for someone who has experience in Python and > iterators at the minimum, and hopefully with generators. This is dangerously close to the "you don't need to know about it if you're not going to use it" argument, which is widely recognised as false. Newbies might not need to know all the details of the implementation, but they will need to know enough about the semantics of with-statements to understand what they're doing when they come across them in other people's code. Which leads me to another concern. 
How are we going to explain the externally visible semantics of a with-statement in a way that's easy to grok, without mentioning any details of the implementation? You can explain a for-loop pretty well by saying something like "It executes the body once for each item from the sequence", without having to mention anything about iterators, generators, next() methods, etc. etc. How the items are produced is completely irrelevant to the concept of the for-loop. But what is the equivalent level of description of the with-statement going to say? "It executes the body with... ???" And a related question: What are we going to call the functions designed for with-statements, and the objects they return? Calling them generators and iterators (even though they are) doesn't seem right, because they're being used for a purpose very different from generating and iterating. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 06:37:33 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:37:51 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <ca471dc20504250957753a7445@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> Message-ID: <426DC58D.2010102@canterbury.ac.nz> Guido van Rossum wrote: > But for *immutable* objects (like numbers, strings and tuples) the > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). Also, string literals that resemble Python identifiers are often interned, although this is not guaranteed. 
And this only applies to literals, not strings constructed dynamically by the program (unless you explicitly apply intern() to them). Python 2.3.4 (#1, Jun 30 2004, 16:47:37) [GCC 3.2 20020903 (Red Hat Linux 8.0 3.2-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> "foo" is "foo" True >>> "foo" is "f" + "oo" False >>> "foo" is intern("f" + "oo") True -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 06:45:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:45:30 2005 Subject: [Python-Dev] Re: Re: anonymous blocks In-Reply-To: <d4iv44$9gn$1@sea.gmane.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426CB7C2.8030508@gmail.com> <d4iv44$9gn$1@sea.gmane.org> Message-ID: <426DC75A.1010005@canterbury.ac.nz> Terry Reedy wrote: >>Not supporting iterables makes it harder to write a class which is >>inherently usable in a with block, though. The natural way to make >>iterable classes is to use 'yield' in the definition of __iter__ - if >>iter() is not called, then that trick can't be used. If you're defining it by means of a generator, you don't need a class at all -- just make the whole thing a generator function. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:00:21 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:00:39 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <1114473665.3698.2.camel@schizo> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> Message-ID: <426DCAE5.2070501@canterbury.ac.nz> Donovan Baarda wrote: > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > benefits belong in implementation optimisations, not new bad syntax. > Two things are mildly annoying about if-elif chains as a substitute for a switch statement: 1) Repeating the name of the thing being switched on all the time, and the operator being used for comparison. 2) The first case is syntactically different from subsequent ones, even though semantically all the cases are equivalent. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:12:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:12:34 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> Message-ID: <426DCDAE.8060907@canterbury.ac.nz> Bob Ippolito wrote: > A few weeks ago I put together a patch to site.py for Python 2.5 > <http://python.org/sf/1174614> that solves three major deficiencies: > > [concerning .pth files] While we're on the subject of .pth files, what about the idea of scanning the directory containing the main .py file for .pth files? 
This would make it easier to have collections of Python programs sharing a common set of modules, without having to either install them system-wide or write hairy sys.path-manipulating code or use platform-dependent symlink or PATH hacks. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:48:11 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:48:27 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <fb6fbf56050422143614d8431c@mail.gmail.com> References: <fb6fbf56050422143614d8431c@mail.gmail.com> Message-ID: <426DD61B.3030708@canterbury.ac.nz> Jim Jewett wrote: > defmacro myresource(filename): > <make explicit calls to named callback "functions", but > within the same locals() scope.> > > with myresource("thefile"): > def reader(): > ... > def writer(): > ... > def fn(): > .... -1. This is ugly. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:59:52 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 08:00:10 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426DD8D8.5040908@canterbury.ac.nz> Michael Chermside wrote: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. 
I think Jim is right, what we're feeling our > way toward is macros. I considered saying something like that about 3 posts ago, but I was afraid of getting stoned for heresy... > ... Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. From that quote, it would seem what we want to do is prohibit anything that would make code less readable. Or prohibit anything that would permit creating a new dialect. Or something. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 08:19:51 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 08:20:07 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426DDD87.60908@canterbury.ac.nz> Michael Chermside wrote: > if the answer is that we want to prohibit nothing, then the right > solution is macros. I'm not sure about that. Smalltalk manages to provide very reasonable-looking user-defined control structures without using compile-time macros, just normal runtime evaluation together with block arguments. It does this by starting out with a fairly minimal and very flexible syntax. This raises the question of why people feel the need for macros in Lisp or Scheme, which have an even more minimal and flexible syntax. 
I think part of the reason is that the syntax for passing an unevaluated block is too obtrusive. In Scheme you can define a function (not macro) that is used like this: (with-file "foo/blarg" (lambda (f) (do-something-with f))) But there is a natural tendency to want to be able to cut out the lambda cruft and just write something like: (with-file "foo/blarg" (f) (do-something-with f)) and for that you need a macro. The equivalent in Smalltalk would be something like File open: "foo/blarg" do: [:f f something] which doesn't look too bad (compared to the rest of the language!) because the block-passing syntax is fairly unobtrusive. So in summary, I don't think you necessarily *need* macros to get nice-looking user-defined control structures. It depends on other features of the language. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Tue Apr 26 08:30:14 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Apr 26 08:30:25 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DBCCE.40903@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> Message-ID: <426DDFF6.3060808@ocf.berkeley.edu> Greg Ewing wrote: > Brett C. wrote: > >> And before anyone decries the fact that this might confuse a newbie >> (which >> seems to happen with every advanced feature ever dreamed up), remember >> this >> will not be meant for a newbie but for someone who has experience in >> Python and >> iterators at the minimum, and hopefully with generators. 
> > > This is dangerously close to the "you don't need to know about > it if you're not going to use it" argument, which is widely > recognised as false. Newbies might not need to know all the > details of the implementation, but they will need to know > enough about the semantics of with-statements to understand > what they're doing when they come across them in other people's > code. > I am not saying it is totally to be ignored by people staring at Python code, but we don't need to necessarily spell out the intricacies. > Which leads me to another concern. How are we going to explain > the externally visible semantics of a with-statement in a way > that's easy to grok, without mentioning any details of the > implementation? > > You can explain a for-loop pretty well by saying something like > "It executes the body once for each item from the sequence", > without having to mention anything about iterators, generators, > next() methods, etc. etc. How the items are produced is completely > irrelevant to the concept of the for-loop. > > But what is the equivalent level of description of the > with-statement going to say? > > "It executes the body with... ???" > It executes the body, calling next() on the argument name on each time through until the iteration stops. > And a related question: What are we going to call the functions > designed for with-statements, and the objects they return? > Calling them generators and iterators (even though they are) > doesn't seem right, because they're being used for a purpose > very different from generating and iterating. > I like "managers" since they are basically managing resources most of the time for the user. 
-Brett From python-dev at zesty.ca Tue Apr 26 08:47:10 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Tue Apr 26 08:47:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DDFF6.3060808@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> <426DDFF6.3060808@ocf.berkeley.edu> Message-ID: <Pine.LNX.4.58.0504260142080.4786@server1.LFW.org> On Mon, 25 Apr 2005, Brett C. wrote: > It executes the body, calling next() on the argument name on each > time through until the iteration stops. There's a little more to it than that. But on the whole I do support the goal of finding a simple, short description of what this construct is intended to do. If it can be described accurately in a sentence or two, that's a good sign that the semantics are sufficiently clear and simple. > I like "managers" since they are basically managing resources > most of the time for the user. No, please let's not call them that. "Manager" is a very common word to describe all kinds of classes in object-oriented designs, and it is so generic as to hardly mean anything. (Sorry, i don't have a better alternative at the moment.) -- ?!ng From stephen at xemacs.org Tue Apr 26 10:36:16 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 26 10:36:22 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426DDD87.60908@canterbury.ac.nz> (Greg Ewing's message of "Tue, 26 Apr 2005 18:19:51 +1200") References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> Message-ID: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> This raises the question of why people feel the need for Greg> macros in Lisp or Scheme, which have an even more minimal Greg> and flexible syntax. 
I think part of the reason is that the Greg> syntax for passing an unevaluated block is too obtrusive. Greg> [... T]here is a natural tendency to want to be able to cut Greg> out the lambda cruft.... This doesn't feel right to me. By that argument, people would want to "improve" (mapcar (lambda (x) (car x)) list-of-lists) to (mapcar list-of-lists (x) (car x)) Have you ever heard someone complain about that lambda, though? My feeling is that the reason for macros in Lisps is that people want control structures to look like control structures, not like function calls whose actual arguments "just happen" to be anonymous function objects. In this context, the lambda does not merely bind f, it also excludes a lot of other possibilities. I mean when I see (with-locked-file "foo/blarg" (lambda (f) (do-something-with f))) I go "What's this? Oh, here the file is obviously important, and there we have a function of one formal argument with no actual arguments, so it must be that we're processing the file with the function." This emphasizes the application of this function to that file too much for my taste, and I will assume that the behavior of the block is self-contained---it had better not depend on free variables. But with (with-locked-file (f "foo/blarg") (do-something-with-as-modified-by f x)) there's no particular need for the block to exclusively concentrate on handling f, and there's nothing disconcerting about the presence of x. N.B. for non-Lispers: in Common Lisp idiom the list (f "foo/blarg") may be treated as two arguments, but associating f with "foo/blarg" in some way. I think in this context it is much more readable. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
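[Editorial aside: translating the Scheme idiom above into Python makes the lambda-passing style concrete. `with_file` and the sample usage are invented names for illustration, a sketch of the callback pattern under discussion rather than any real API.]

```python
def with_file(path, body):
    # Open the resource, hand it to the caller's "block" (an ordinary
    # function or lambda), and guarantee cleanup afterwards -- the
    # direct analogue of (with-file path (lambda (f) ...)).
    f = open(path)
    try:
        return body(f)
    finally:
        f.close()

# The block is just an ordinary callable:
#   first_line = with_file("notes.txt", lambda f: f.readline())
```

As in the Scheme version, the obtrusive part is the explicit lambda wrapping the block, which is exactly what a dedicated statement would remove.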
From bob at redivi.com Tue Apr 26 10:43:34 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Apr 26 10:44:30 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <426DCDAE.8060907@canterbury.ac.nz> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> Message-ID: <f0d8fb4c2fd52d7b16349831db70179e@redivi.com> On Apr 26, 2005, at 1:12 AM, Greg Ewing wrote: > Bob Ippolito wrote: >> A few weeks ago I put together a patch to site.py for Python 2.5 >> <http://python.org/sf/1174614> that solves three major deficiencies: > > > > [concerning .pth files] > > While we're on the subject of .pth files, what about > the idea of scanning the directory containing the main > .py file for .pth files? This would make it easier to > have collections of Python programs sharing a common > set of modules, without having to either install them > system-wide or write hairy sys.path-manipulating code > or use platform-dependent symlink or PATH hacks. I don't think I'd ever use that, but it doesn't sound like a terrible idea. -bob From stephen at xemacs.org Tue Apr 26 10:55:10 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 26 10:55:15 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <426DCAE5.2070501@canterbury.ac.nz> (Greg Ewing's message of "Tue, 26 Apr 2005 17:00:21 +1200") References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> Message-ID: <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> Two things are mildly annoying about if-elif chains as a Greg> substitute for a switch statement: Greg> 1) Repeating the name of the thing being switched on all the Greg> time, and the operator being used for comparison. What's worse, to my mind, is the not infrequent case where the thing being switched on or the operator changes. 
Sure, that's bad style, but sometimes you have to read other people's code like that. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From gvanrossum at gmail.com Tue Apr 26 11:24:53 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 11:24:56 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <ca471dc2050426022458a4ad@mail.gmail.com> > Greg> 1) Repeating the name of the thing being switched on all the > Greg> time, and the operator being used for comparison. > > What's worse, to my mind, is the not infrequent case where the thing > being switched on or the operator changes. Sure, that's bad style, > but sometimes you have to read other people's code like that. You mean like this? if x > 0: ...normal case... elif y > 0: ....abnormal case... else: ...edge case... You have guts to call that bad style! :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Tue Apr 26 11:36:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 11:36:10 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <426DCDAE.8060907@canterbury.ac.nz> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> Message-ID: <ca471dc2050426023629559cab@mail.gmail.com> > While we're on the subject of .pth files, what about > the idea of scanning the directory containing the main > .py file for .pth files? 
This would make it easier to > have collections of Python programs sharing a common > set of modules, without having to either install them > system-wide or write hairy sys.path-manipulating code > or use platform-dependent symlink or PATH hacks. I do that all the time without .pth files -- I just put all the common modules in a package and place the package in the directory containing the "main" .py files. I do have use cases where for reasons of separate development cycles (etc.) I have some code (usually experimental or "unofficial" in some way) in a different place that also needs access to the same set of common modules, and there I use explicit sys.path manipulations. I think that even if the proposed feature was available I wouldn't switch to it -- it's too easy to forget about the .pth file and be confused when it points to the wrong place. That's also the reason why I don't use symlinks or $PYTHONPATH for this purpose. EIBTI. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From flaig at sanctacaris.net Tue Apr 26 12:39:26 2005 From: flaig at sanctacaris.net (flaig@sanctacaris.net) Date: Tue Apr 26 12:39:33 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> Actually I was thinking of something related the other day: Wouldn't it be nice to be able to define/overload not only operators but also control structures? That way Python's core language could be kept simple and free of "featuritis" while at the same time everyone who desires a match/case or repeat/until statement, or anything more sophisticated, could implement it for himself. If you suspect that good old Lisp is on my mind, you are probably right :) . Actually, the idea of a programming language whose structures can be adapted to everyone's personal style is still very appealing to me. 
I have no very distinct ideas about how such a thing might be designed (and still less whether it could be made to work efficiently), but perhaps somewhat like this (just to clarify along which paths my thoughts are currently moving):

    structure whaddayacallit:   # name only as a comment
        def opening_clause:
            statements
        def alternative_clause_1:
            statements
        def *alternative_clause_2:   # the asterisk to indicate that this may occur several times
            statements
        def closing-clause:
            statements

e.g.:

    structure multiple_switch:
        condition = None
        def switch(self, c):   # condition must be passed as a lambdoid function
            self.condition, self.finished = c, False
        def *case(self, x, statements):   # so must the statements subordinate to the new "case" expression
            if self.condition( x ):
                statements()
                self.finished = True
                break structure
        def otherwise(self, statements):
            if not self.finished:
                statements()

and the application:

    switch my_favourite_language:
        case Python: print "Hi Guido"
        case Perl: print "Hi Larry"
        otherwise: print "Hi pleb"

At least to me, this has a definitively macroish flavour... and not in the #dumbdown style of C. Or rather say, macros might be a generalized way to achieve this, if they are intelligently designed. (<= That at least shouldn't be the problem, since this is the Python community and not M$'s development department :-) .) Do you think any of this might make sense? -- Rüdiger Marcus PS. Aahz: When describing Ruby as the "antithesis" of Python recently I was thinking in Laskerian rather than Hegelian terms... the differences are not really big, but Ruby has always been positioned as a deliberate challenge to Python.
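[Editorial aside: the effect of the multiple_switch example above can already be had in plain Python with a dict of callables, the usual substitute suggested in these switch threads. A minimal sketch, with all names invented for illustration:]

```python
def greet(language):
    # One callable per case; the dict lookup replaces the repeated
    # 'elif language == ...' comparisons.
    cases = {
        "Python": lambda: "Hi Guido",
        "Perl": lambda: "Hi Larry",
    }
    # dict.get with a default plays the role of the 'otherwise' clause.
    return cases.get(language, lambda: "Hi pleb")()
```

Here greet("Python") returns "Hi Guido" and any unknown language falls through to "Hi pleb", without new syntax.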
Date: Mon, 25 Apr 2005 09:42:54 -0700 > From: Michael Chermside <mcherm@mcherm.com> > Subject: RE: [Python-Dev] defmacro (was: Anonymous blocks) > To: python-dev@python.org > Cc: jimjjewett@gmail.com > Message-ID: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Jim Jewett writes: > > As best I can tell, the anonymous blocks are used to take > > care of boilerplate code without changing the scope -- exactly > > what macros are used for. > > Folks, I think that Jim is onto something here. > > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. > > The problem, of course, is that Guido (and others!) are on record as > being opposed to adding macros to Python. (Even "good" macros... think > lisp, not cpp.) I am not quite sure that I am convinced by the argument, > but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > > (If this is a strawman argument, i.e. if you know of a better reason > for keeping macros OUT of Python please speak up. Like I said, I've > never been completely convinced of it myself.) > > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? 
> > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. > > -- Michael Chermside > === Chevalier Dr. Dr. Ruediger Marcus Flaig Institute for Immunology University of Heidelberg Im Neuenheimer Feld 305, D-69120 Heidelberg, FRG <flaig@sanctacaris.net> -- [This e-mail was sent with http://www.mail-inspector.de Mail Inspector is a free service of http://www.is-fun.net The sender of this e-mail had the IP: 129.206.124.135] From gvanrossum at gmail.com Tue Apr 26 13:37:47 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 13:37:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DB7C8.5020708@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> Message-ID: <ca471dc2050426043713116248@mail.gmail.com> [Greg Ewing] > I like the general shape of this, but I have one or two > reservations about the details. That summarizes the feedback so far pretty well. I think we're on to something. And I'm not too proud to say that Ruby has led the way here to some extent (even if Python's implementation would be fundamentally different, since it's based on generators, which has some different possibilities and precludes some Ruby patterns). > 1) We're going to have to think carefully about the naming of > functions designed for use with this statement. If 'with' > is going to be in there as a keyword, then it really shouldn't > be part of the function name as well. Of course. I only used 'with_opened' because it's been the running example in this thread.
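[Editorial aside: for concreteness, the generator shape this thread keeps assuming for a helper like opened() can be sketched as below. Note the yield inside try/finally: permitting exactly that is part of what is being proposed, so this sketch is written against a later Python than the 2.4 of the discussion.]

```python
def opened(path):
    # Yield the open file exactly once; when the consumer finishes
    # (or an exception propagates back in), the finally clause runs
    # and the file is closed.
    f = open(path)
    try:
        yield f
    finally:
        f.close()
```

The with-machinery under discussion would call next() to get the file, run the block, and then resume or close the generator so that the finally clause always executes.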
> I would rather see something like > > with f = opened(pathname): > ... > > This sort of convention (using a past participle as a function > name) would work for some other cases as well: > > with some_data.locked(): > ... > > with some_resource.allocated(): > ... Or how about with synchronized(some_resource): ... > On the negative side, not having anything like 'with' in the > function name means that the fact the function is designed for > use in a with-statement could be somewhat non-obvious. Since > there's not going to be much other use for such a function, > this is a bad thing. This seems a pretty mild problem; one could argue that every function is only useful in a context where its return type makes sense, and we seem to be getting along just fine with naming conventions (or just plain clear naming). > It could also lead people into subtle usage traps such as > > with f = open(pathname): > ... > > which would fail in a somewhat obscure way. Ouch. That one hurts. (I was going to say "but f doesn't have a next() method" when I realized it *does*. :-) It is *almost* equivalent to for f in open(pathname): ... except if the "..." block raises an exception. Fortunately your proposal to use 'as' makes this mistake less likely. > So maybe the 'with' keyword should be dropped (again!) in > favour of > > with_opened(pathname) as f: > ... But that doesn't look so great for the case where there's no variable to be assigned to -- I wasn't totally clear about it, but I meant the syntax to be with [VAR =] EXPR: BLOCK where VAR would have the same syntax as the left hand side of an assignment (or the variable in a for-statement). > 2) I'm not sure about the '='. 
It makes it look rather deceptively > like an ordinary assignment, and I'm sure many people are going > to wonder what the difference is between > > with f = opened(pathname): > do_stuff_to(f) > > and simply > > f = opened(pathname) > do_stuff_to(f) > > or even just unconsciously read the first as the second without > noticing that anything special is going on. Especially if they're > coming from a language like Pascal which has a much less magical > form of with-statement. Right. > So maybe it would be better to make it look more different: > > with opened(pathname) as f: > ... Fredrik said this too, and as long as we're going to add 'with' as a new keyword, we might as well promote 'as' to become a real keyword. So then the syntax would become with EXPR [as VAR]: BLOCK I don't see a particular need for assignment to multiple VARs (but VAR can of course be a tuple of identifiers). > * It seems to me that this same exception-handling mechanism > would be just as useful in a regular for-loop, and that, once > it becomes possible to put 'yield' in a try-statement, people > are going to *expect* it to work in for-loops as well. (You can already put a yield inside a try-except, just not inside a try-finally.) > Guido has expressed concern about imposing extra overhead on > all for-loops. But would the extra overhead really be all that > noticeable? For-loops already put a block on the block stack, > so the necessary processing could be incorporated into the > code for unwinding a for-block during an exception, and little > if anything would need to change in the absence of an exception. Probably. > However, if for-loops also gain this functionality, we end up > with the rather embarrassing situation that there is *no difference* > in semantics between a for-loop and a with-statement! There would still be the difference that a for-loop invokes iter() and a with-block doesn't. Also, for-loops that don't exhaust the iterator leave it available for later use. 
I believe there are even examples of this pattern, where one for-loop searches the iterable for some kind of marker value and the next for-loop iterates over the remaining items. For example: f = open(messagefile) # Process message headers for line in f: if not line.strip(): break if line[0].isspace(): addcontinuation(line) else: addheader(line) # Process message body for line in f: addbody(line) > This could be "fixed" by making the with-statement not loop, > as has been suggested. That was my initial thought as well, > but having thought more deeply, I'm starting to think that > Guido was right in the first place, and that a with-statement > should be capable of looping. I'll elaborate in another post. So perhaps the short description of a with-statement that we give to newbies could be the following: """ The statement: for VAR in EXPR: BLOCK does the same thing as: with iter(EXPR) as VAR: # Note the iter() call BLOCK except that: - you can leave out the "as VAR" part from the with-statement; - they work differently when an exception happens inside BLOCK; - break and continue don't always work the same way. The only time you should write a with-statement is when the documentation for the function you are calling says you should. """ > > So a block could return a value to the generator using a return > > statement; the generator can catch this by catching ReturnFlow. > > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > This is a very elegant idea, but I'm seriously worried by the > possibility that a return statement could do something other > than return from the function it's written in, especially if > for-loops also gain this functionality. But they wouldn't! > Intercepting break > and continue isn't so bad, since they're already associated > with the loop they're in, but return has always been an > unconditional get-me-out-of-this-function. I'd feel uncomfortable > if this were no longer true. Me too. 
Let me explain the use cases that led me to throwing that in (I was running out of time and didn't properly explain it) and then let me propose an alternative. This is a bit long, but important! *First*, in the non-looping use cases (like acquiring and releasing a lock), a return-statement should definitely be allowed when the with-statement is contained in a function. There's lots of code like this out there: def search(self, eligible, default=None): self.lock.acquire() try: for item in self.elements: if eligible(item): return item # no eligible items return default finally: self.lock.release() and this translates quite nicely to a with-statement: def search(self, eligible, default=None): with synchronized(self.lock): for item in self.elements: if eligible(item): return item # no eligible items return default *Second*, it might make sense if break and continue would be handled the same way; here's an example: def alt_search(self): for item in self.elements: with synchronized(item): if item.abandoned(): continue if item.eligible(): break else: item = self.default_item return item.post_process() (I realize the case for continue isn't as strong as that for break, but I think we have to support both if we support one.) *Third*, if there is a try-finally block around a yield in the generator, the finally clause absolutely must be executed when control leaves the body of the with-statement, whether it is through return, break, or continue. This pretty much means these have to be turned into some kind of exception.
So the first example would first be transformed into this: def search(self, eligible, default=None): try: with synchronized(self.lock): for item in self.elements: if eligible(item): raise ReturnFlow(item) # was "return item" # no eligible items raise ReturnFlow(default) # was "return default" except ReturnFlow, exc: return exc.value before applying the transformation of the with-statement, which I won't repeat here (look it up in my previous long post in this thread). (BTW I do agree that it should use __next__(), not next_ex().) I'm assuming the following definition of the ReturnFlow exception: class ReturnFlow(Exception): def __init__(self, value=None): self.value = value The translation of break into raise BreakFlow() and continue into raise ContinueFlow() is now obvious. (BTW ReturnFlow etc. aren't great names. Suggestions?) *Fourth*, and this is what makes Greg and me uncomfortable at the same time as making Phillip and other event-handling folks drool: from the previous three points it follows that an iterator may *intercept* any or all of ReturnFlow, BreakFlow and ContinueFlow, and use them to implement whatever cool or confusing magic they want. For example, a generator can decide that for the purposes of break and continue, the with-statement that calls it is a loop, and give them the usual semantics (or the opposite, if you're into that sort of thing :-). Or a generator can receive a value from the block via a return statement. Notes: - I think there's a better word than Flow, but I'll keep using it until we find something better. - This is not limited to generators -- the with-statement uses an arbitrary "new-style" iterator (something with a __next__() method taking an optional exception argument). - The new __next__() API can also (nay, *must*, to make all this work reliably) be used to define exception and cleanup semantics for generators, thereby rendering obsolete PEP 325 and the second half of PEP 288.
When a generator is GC'ed (whether by reference counting or by the cyclical garbage collector), its __next__() method is called with a BreakFlow exception instance as argument (or perhaps some other special exception created for the purpose). If the generator catches the exception and yields another value, too bad -- I consider that broken behavior. (The alternative would be to keep calling __next__(BreakFlow()) until it doesn't return a value, but that feels uncomfortable in a finalization context.) - Inside a with-statement, user code raising a Flow exception acts the same as the corresponding statement. This is slightly unfortunate, because it might lead one to assume that the same is true for example in a for-loop or while-loop, but I don't want to make that change. I don't think it's a big problem. Given that 1, 2 and 3 combined make 4 inevitable, I think we might as well give in, and *always* syntactically accept return, break and continue in a with-statement, whether or not it is contained in a loop or function. When the iterator does not handle the Flow exceptions, and there is no outer context in which the statement is valid, the Flow exception is turned into an IllegalFlow exception, which is the run-time equivalent of SyntaxError: 'return' outside function (or 'break' outside loop, etc.). Now there's one more twist, which you may or may not like. Presumably (barring obfuscations or bugs) the handling of BreakFlow and ContinueFlow by an iterator (or generator) is consistent for all uses of that particular iterator. For example synchronized(lock) and transactional(db) do not behave as loops, and forever() does. Ditto for handling ReturnFlow. 
This is why I've been thinking of leaving out the 'with' keyword: in your mind, these calls would become new statement types, even though the compiler sees them all the same: synchronized(lock): BLOCK transactional(db): BLOCK forever(): BLOCK opening(filename) as f: BLOCK It does require the authors of such iterators to pick good names, and it doesn't look as good when the iterator is a method of some object: self.elements[0].locker.synchronized(): BLOCK You proposed this too (and I even commented on it, ages ago in this same endless message :-) and while I'm still on the fence, at least I now have a better motivational argument (i.e., that each iterator becomes a new statement type in your mind). One last thing: if we need a special name for iterators and generators designed for use in a with-statement, how about calling them with-iterators and with-generators. The non-looping kind can be called resource management iterators / generators. I think whatever term we come up with should not be a totally new term but a combination of iterator or generator with some prefix, and it should work both for iterators and for generators. That's all I can muster right now (I should've been in bed hours ago) but I'm feeling pretty good about this. 
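[Summarizer's aside: the Flow-exception machinery described above can be simulated in today's Python. In this sketch, ReturnFlow is Guido's proposed name, while run_with() is a hypothetical driver standing in for the compiler's translation of the with-statement; generator close() is used in place of the proposed __next__(exc) protocol, so this is an approximation of the semantics, not an implementation of them.]

```python
import threading

class ControlFlow(Exception):
    """Base for the proposed flow-control exceptions."""

class ReturnFlow(ControlFlow):
    def __init__(self, value=None):
        super().__init__(value)
        self.value = value

def synchronized(lock):
    # A non-looping "with-iterator": acquire before the block runs,
    # release afterwards, even if the block escapes via ReturnFlow.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

def run_with(witer, body):
    # Hypothetical stand-in for the translation of "with EXPR: BODY".
    next(witer)                # run the generator up to its yield
    try:
        body()                 # execute the block
    except ReturnFlow as exc:
        witer.close()          # the generator's finally clause runs here
        return exc.value       # model "return VALUE" escaping the block
    witer.close()
    return None

lock = threading.Lock()

def block():
    assert lock.locked()       # we are inside the synchronized region
    raise ReturnFlow(42)       # models "return 42" inside the block

print(run_with(synchronized(lock), block))  # -> 42
print(lock.locked())                        # -> False (lock released)
```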
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rodsenra at gpr.com.br Tue Apr 26 14:11:59 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Tue Apr 26 14:11:31 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <877e9a1705042520151e493f3f@mail.gmail.com> References: <fb6fbf560504251234553e881f@mail.gmail.com> <ca471dc205042513162c5cff33@mail.gmail.com> <fb6fbf560504251501111fa7a5@mail.gmail.com> <ca471dc20504251511689cffd3@mail.gmail.com> <877e9a1705042520151e493f3f@mail.gmail.com> Message-ID: <20050426091159.74a83735@localhost.localdomain> [ Michael Walter ]: > A couple of examples out of my tired head (solely from a user perspective) :-) > > Embedding domain specific language (ex.: state machine): > ... > > Embedding domain specific language (ex.: markup language): > ... > > Embedding domain-specific language (ex.: badly-designed database table): > ... > > ..., which might actually prove someone's point that the > language designer shouldn't allow people to do such things. The whole macros issue comes down to a tradeoff between power+expressiveness and readability. IMVHO, macros are readability assassins. The power (for any developer) to introduce new syntax is *not* a desirable feature, but something to be avoided. And that alone should be a stronger argument than a hundred use cases. cheers, Senra -- Rodrigo Senra -- MSc Computer Engineer rodsenra(at)gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br/ Personal Blog http://rodsenra.blogspot.com/ From greg.ewing at canterbury.ac.nz Tue Apr 26 14:36:26 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 14:45:31 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> <426DDFF6.3060808@ocf.berkeley.edu> Message-ID: <426E35CA.60004@canterbury.ac.nz> Brett C.
wrote: > It executes the body, calling next() on the argument > name on each time through until the iteration stops. But that's no good, because (1) it mentions next(), which should be an implementation detail, and (2) it talks about iteration, when most of the time the high-level intent has nothing to do with iteration. In other words, this is too low a level of explanation. Greg From michael.walter at gmail.com Tue Apr 26 14:51:22 2005 From: michael.walter at gmail.com (Michael Walter) Date: Tue Apr 26 14:51:26 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050426091159.74a83735@localhost.localdomain> References: <fb6fbf560504251234553e881f@mail.gmail.com> <ca471dc205042513162c5cff33@mail.gmail.com> <fb6fbf560504251501111fa7a5@mail.gmail.com> <ca471dc20504251511689cffd3@mail.gmail.com> <877e9a1705042520151e493f3f@mail.gmail.com> <20050426091159.74a83735@localhost.localdomain> Message-ID: <877e9a17050426055138d243af@mail.gmail.com> On 4/26/05, Rodrigo Dias Arruda Senra <rodsenra@gpr.com.br> wrote: > IMVHO, macros are readability assassins. The power (for any developer) > to introduce new syntax is *not* a desirable feature, but something > to be avoided. And that alone should be a stronger argument than > a hundred use cases. Personally, I believe that EDSLs can improve usability of a library. I've been following this list for quite a while, and trying to see what lengths (hacks) people go (use) to implement "sexy" syntax can give you quite an idea that custom syntax matters. And surely all of these tricks (hacks) are way harder to use than an EDSL would be.
Regards, Michael From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 26 14:49:41 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 26 14:53:47 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <d4ld6r$ms5$1@sea.gmane.org> Guido van Rossum wrote: > [Greg Ewing] >> I like the general shape of this, but I have one or two >> reservations about the details. > > That summarizes the feedback so far pretty well. I think we're on to > something. And I'm not too proud to say that Ruby has led the way here > to some extent (even if Python's implementation would be fundamentally > different, since it's based on generators, which has some different > possibilities and precludes some Ruby patterns). Five random thoughts: 1. So if break and continue are allowed in with statements only when there is an enclosing loop, it would be an inconsistency; consider for item in seq: with gen(): continue when the generator gen catches the ContinueFlow and does with it what it wants. It is then slightly unfair not to allow with x: continue Anyway, I would consider both counterintuitive. So what about making ReturnFlow, BreakFlow and ContinueFlow "private" exceptions that cannot be caught in user code and instead introducing a new statement that allows passing data to the generator? 2. In the process of handling this, would it be reasonable to (re)introduce a combined try-except-finally statement with defined syntax (all except before finally) and behavior (finally is always executed)? 5. What about the intended usage of 'with' as in Visual B.. NO, NO, NOT THE WHIP!
(not that you couldn't emulate this with a clever "generator": def short(x): yield x with short(my.long["object"].reference()) as _: _.spam = _.ham = _.eggs() yours, Reinhold -- Mail address is perfectly valid! From ncoghlan at gmail.com Tue Apr 26 15:03:33 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 15:03:42 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <426E3C25.4090204@gmail.com> Guido van Rossum wrote: [snip] > - I think there's a better word than Flow, but I'll keep using it > until we find something better. How about simply reusing Iteration (ala StopIteration)? Pass in 'ContinueIteration' for 'continue' Pass in 'BreakIteration' for 'break' Pass in 'AbortIteration' for 'return' and finalisation. And advise strongly *against* intercepting AbortIteration with anything other than a finally block. > - The new __next__() API can also (nay, *must*, to make all this work > reliably) be used to define exception and cleanup semantics for > generators, thereby rendering obsolete PEP 325 and the second half > of PEP 288. When a generator is GC'ed (whether by reference > counting or by the cyclical garbage collector), its __next__() > method is called with a BreakFlow exception instance as argument (or > perhaps some other special exception created for the purpose). If > the generator catches the exception and yields another value, too > bad -- I consider that broken behavior. (The alternative would be > to keep calling __next__(BreakFlow()) until it doesn't return a > value, but that feels uncomfortable in a finalization context.)
As suggested above, perhaps the exception used here should be the exception that is raised when a 'return' statement is encountered inside the block, rather than the more-likely-to-be-messed-with 'break' statement. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From greg.ewing at canterbury.ac.nz Tue Apr 26 14:58:41 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 15:07:45 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <426E3B01.1010007@canterbury.ac.nz> Guido van Rossum wrote: > [Greg Ewing] >>* It seems to me that this same exception-handling mechanism >>would be just as useful in a regular for-loop, and that, once >>it becomes possible to put 'yield' in a try-statement, people >>are going to *expect* it to work in for-loops as well. > > (You can already put a yield inside a try-except, just not inside a > try-finally.) Well, my point still stands. People are going to write try-finally around their yields and expect the natural thing to happen when their generator is used in a for-loop. > There would still be the difference that a for-loop invokes iter() and > a with-block doesn't. > > Also, for-loops that don't exhaust the iterator leave it available for > later use. Hmmm. But are these big enough differences to justify having a whole new control structure? Whither TOOWTDI? > """ > The statement: > > for VAR in EXPR: > BLOCK > > does the same thing as: > > with iter(EXPR) as VAR: # Note the iter() call > BLOCK > > except that: > > - you can leave out the "as VAR" part from the with-statement; > - they work differently when an exception happens inside BLOCK; > - break and continue don't always work the same way. 
> > The only time you should write a with-statement is when the > documentation for the function you are calling says you should. > """ Surely you jest. Any newbie reading this is going to think he hasn't a hope in hell of ever understanding what is going on here, and give up on Python in disgust. >>I'm seriously worried by the >>possibility that a return statement could do something other >>than return from the function it's written in. > Let me explain the use cases that led me to throwing that in Yes, I can see that it's going to be necessary to treat return as an exception, and accept the possibility that it will be abused. I'd still much prefer people refrain from abusing it that way, though. Using "return" to spell "send value back to yield statement" would be extremely obfuscatory. > (BTW ReturnFlow etc. aren't great > names. Suggestions?) I'd suggest just calling them Break, Continue and Return. > synchronized(lock): > BLOCK > > transactional(db): > BLOCK > > forever(): > BLOCK > > opening(filename) as f: > BLOCK Hey, I like that last one! Well done! > One last thing: if we need a special name for iterators and generators > designed for use in a with-statement, how about calling them > with-iterators and with-generators. Except that if it's no longer a "with" statement, this doesn't make so much sense... 
Greg From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 26 15:13:58 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 26 15:18:41 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E3C25.4090204@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3C25.4090204@gmail.com> Message-ID: <d4lekc$tnq$1@sea.gmane.org> Nick Coghlan wrote: > Guido van Rossum wrote: > [snip] >> - I think there's a better word than Flow, but I'll keep using it >> until we find something better. > > How about simply reusing Iteration (ala StopIteration)? > > Pass in 'ContinueIteration' for 'continue' > Pass in 'BreakIteration' for 'break' > Pass in 'AbortIteration' for 'return' and finalisation. > > And advise strongly *against* intercepting AbortIteration with anything other > than a finally block. Hmmm... another idea: If break, continue and return keep exactly the current semantics (break or continue the innermost for/while-loop), do we need different exceptions at all? AFAICS AbortIteration (+1 on the name) would be sufficient for all three interrupting statements, and this would prevent misuse too, I think. yours, Reinhold -- Mail address is perfectly valid! From caglar at uludag.org.tr Tue Apr 26 15:29:29 2005 From: caglar at uludag.org.tr (S.Çağlar Onur) Date: Tue Apr 26 15:29:36 2005 Subject: [Python-Dev] Removing --with-wctype-functions support Message-ID: <1114522169.5352.9.camel@poseidon.cekirdek.int> Hi; I just subscribed to this list, so I don't know whether this was discussed before. If so, sorry. I want to know the status of http://mail.python.org/pipermail/python-dev/2004-December/050193.html this thread. Will Python remove wctype functions support from its core? If it will, what about locale-dependent case conversion functions?
Without this support Python behaves wrong in the tr_TR.UTF-8 locale. As a side effect, this problem was reported to Gentoo Linux to add wctype support to the current Python ebuild ( http://bugs.gentoo.org/show_bug.cgi?id=69322 ), but they don't want to add any removed feature to their ebuilds, and also not break anything because portage is built on Python. So :), what will the fate of wctype be? Yours -- S.Çağlar Onur <caglar@uludag.org.tr> http://cekirdek.uludag.org.tr/~caglar/ Linux is like living in a teepee. No Windows, no Gates and an Apache in house! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050426/4f66994d/attachment.pgp From ncoghlan at gmail.com Tue Apr 26 15:44:30 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 15:44:39 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d4lekc$tnq$1@sea.gmane.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3C25.4090204@gmail.com> <d4lekc$tnq$1@sea.gmane.org> Message-ID: <426E45BE.2020009@gmail.com> Reinhold Birkenfeld wrote: > Nick Coghlan wrote: > >>Guido van Rossum wrote: >>[snip] >> >>>- I think there's a better word than Flow, but I'll keep using it >>> until we find something better. >> >>How about simply reusing Iteration (ala StopIteration)? >> >> Pass in 'ContinueIteration' for 'continue' >> Pass in 'BreakIteration' for 'break' >> Pass in 'AbortIteration' for 'return' and finalisation. >> >>And advise strongly *against* intercepting AbortIteration with anything other >>than a finally block. > > > Hmmm...
another idea: If break, continue and return keep exactly the current > semantics (break or continue the innermost for/while-loop), do we need > different exceptions at all? AFAICS AbortIteration (+1 on the name) would be > sufficient for all three interrupting statements, and this would prevent > misuse too, I think. No, the iterator should be able to keep state around in the case of BreakIteration and ContinueIteration, whereas AbortIteration should shut the whole thing down. In particular "VAR = yield None" is likely to become syntactic sugar for: try: yield None except ContinueIteration, exc: VAR = exc.value We definitely don't want that construct swallowing AbortIteration. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Tue Apr 26 16:13:07 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 16:13:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042416572da9db71@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> Message-ID: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Whew! This is a bit long... On 25 Apr 2005, at 00:57, Guido van Rossum wrote: > After reading a lot of contributions (though perhaps not all -- this > thread seems to bifurcate every time someone has a new idea :-) I haven't read all the posts around the subject, I'll have to admit. I've read the one I'm replying to and its followups pretty carefully, though. > I'm back to liking yield for the PEP 310 use case. I think maybe it was > Doug Landauer's post mentioning Beta, plus scanning some more examples > of using yield in Ruby. Jim Jewett's post on defmacro also helped, as > did Nick Coghlan's post explaining why he prefers 'with' for PEP 310 > and a bare expression for the 'with' feature from Pascal (and other > languages :-).
The history of iterators and generators could be summarized by saying that an API was invented, then it turned out that in practice one way of implementing them -- generators -- was almost universally useful. This proposal seems a bit like an effort to make generators good at doing something that they aren't really intended -- or dare I say suited? -- for. The tail wagging the dog so to speak. > It seems that the same argument that explains why generators are so > good for defining iterators, also applies to the PEP 310 use case: > it's just much more natural to write > > def with_file(filename): > f = open(filename) > try: > yield f > finally: > f.close() This is a syntax error today, of course. When does the finally: clause execute with your proposal? [I work this one out below :)] > than having to write a class with __entry__ and __exit__ and > __except__ methods (I've lost track of the exact proposal at this > point). > At the same time, having to use it as follows: > > for f in with_file(filename): > for line in f: > print process(line) > > is really ugly, This is a non-starter, I hope. I really meant what I said in PEP 310 about loops being loops. > so we need new syntax, which also helps with keeping > 'for' semantically backwards compatible. So let's use 'with', and then > the using code becomes again this: > > with f = with_file(filename): > for line in f: > print process(line) > > Now let me propose a strawman for the translation of the latter into > existing semantics. 
Let's take the generic case: > > with VAR = EXPR: > BODY > > This would translate to the following code: > > it = EXPR > err = None > while True: > try: > if err is None: > VAR = it.next() > else: > VAR = it.next_ex(err) > except StopIteration: > break > try: > err = None > BODY > except Exception, err: # Pretend "except Exception:" == > "except:" > if not hasattr(it, "next_ex"): > raise > > (The variables 'it' and 'err' are not user-visible variables, they are > internal to the translation.) > > This looks slightly awkward because of backward compatibility; what I > really want is just this: > > it = EXPR > err = None > while True: > try: > VAR = it.next(err) > except StopIteration: > break > try: > err = None > BODY > except Exception, err: # Pretend "except Exception:" == > "except:" > pass > > but for backwards compatibility with the existing argument-less next() > API More than that: if I'm implementing an iterator for, uh, iterating, why would one dream of needing to handle an 'err' argument in the next() method? > I'm introducing a new iterator API next_ex() which takes an > exception argument. If that argument is None, it should behave just > like next(). Otherwise, if the iterator is a generator, this will > raised that exception in the generator's frame (at the point of the > suspended yield). If the iterator is something else, the something > else is free to do whatever it likes; if it doesn't want to do > anything, it can just re-raise the exception. Ah, this answers my 'when does finally' execute question above. > Finally, I think it would be cool if the generator could trap > occurrences of break, continue and return occurring in BODY. We could > introduce a new class of exceptions for these, named ControlFlow, and > (only in the body of a with statement), break would raise BreakFlow, > continue would raise ContinueFlow, and return EXPR would raise > ReturnFlow(EXPR) (EXPR defaulting to None of course). Well, this is quite a big thing. 
> So a block could return a value to the generator using a return > statement; the generator can catch this by catching ReturnFlow. > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > With a little extra magic we could also get the behavior that if the > generator doesn't handle ControlFlow exceptions but re-raises them, > they would affect the code containing the with statement; this means > that the generator can decide whether return, break and continue are > handled locally or passed through to the containing block. > > Note that EXPR doesn't have to return a generator; it could be any > object that implements next() and next_ex(). (We could also require > next_ex() or even next() with an argument; perhaps this is better.) My main objection to all this is that it conflates iteration and a more general kind of execution control (I guess iteration is a kind of execution control, but I contend that it's a sufficiently common case to get special treatment and also that names like 'for' and 'next' are only applicable to iteration). So, here's a counterproposal! with expr as var: ... code ... is roughly: def _(var): ... code ... __private = expr __private(_) (var optional as in other proposals). so one might write: def auto_closing(f): def inner(block): try: block(f) finally: f.close() return inner and have with auto_closing(open("/tmp/foo")) as f: f.write('bob') The translation above is only approximate because you'd want to make assignments in '...code...' affect the scope they're written in, and also one might want to allow breaks and continues to be handled as in the end of your proposal. And grudgingly, I guess you'd need to make returns behave like that anyway. Has something like this been argued out somewhere in this thread?
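[Summarizer's aside: Hudson's thunk-based translation runs as ordinary Python today. In this sketch, auto_closing is his example manager and _block stands in for the compiler turning the with-body into a function; the names are illustrative, not part of any proposal in the thread.]

```python
import io

def auto_closing(f):
    # The manager: run the block (a thunk) with f, then close f.
    def inner(block):
        try:
            block(f)
        finally:
            f.close()
    return inner

# "with auto_closing(buf) as f: f.write('bob')" would expand to
# roughly the following: the block body becomes a function.
buf = io.StringIO()
written = []

def _block(f):
    f.write('bob')
    written.append(f.getvalue())  # record before the manager closes f

auto_closing(buf)(_block)
print(written)     # -> ['bob']
print(buf.closed)  # -> True
```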
As another example, here's how you'd implement something very like a for loop: def as_for_loop(thing): it = iter(thing) def inner(thunk): while 1: try: v = it.next() except StopIteration: break try: thunk(v) except Continue: continue except Break: break so for x in s: and with as_for_loop(s) as x: are now equivalent (I hope :). Cheers, mwh From mwh at python.net Tue Apr 26 16:26:27 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 16:26:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <67909725b5458513bfe4eb9da573af93@python.net> On 26 Apr 2005, at 15:13, Michael Hudson wrote: > So, here's a counterproposal! And a correction! > with expr as var: > ... code ... > > is roughly: def _(var): ... code ... try: expr(_) except Return, e: return e.value Cheers, mwh From ncoghlan at gmail.com Tue Apr 26 17:14:51 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 17:14:59 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <426E5AEB.3030707@gmail.com> Michael Hudson wrote: > This is a non-starter, I hope. I really meant what I said in PEP 310 > about loops being loops. The more I play with this, the more I want the 'with' construct to NOT be a loop construct. 
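[As an aside: Michael's as_for_loop sketch above really does behave like a for loop once the Break and Continue exception classes it presupposes are filled in. A runnable present-day rendering:]

```python
class Break(Exception):
    """Raised by the thunk to request a 'break' of the driving loop."""

class Continue(Exception):
    """Raised by the thunk to request a 'continue'."""

def as_for_loop(thing):
    it = iter(thing)
    def inner(thunk):
        while 1:
            try:
                v = next(it)        # it.next() in the Python of the time
            except StopIteration:
                break
            try:
                thunk(v)
            except Continue:
                continue
            except Break:
                break
    return inner

# Equivalent of a for loop over [1, 2, 3, 4, 5] that skips 2, stops at 4.
seen = []
def body(x):
    if x == 2:
        raise Continue
    if x == 4:
        raise Break
    seen.append(x)
as_for_loop([1, 2, 3, 4, 5])(body)
# seen == [1, 3]
```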
The main reason is that it would be really nice to be able to write and use a multipart code template as: def template(): # pre_part_1 yield None # post_part_1 yield None # pre_part_2 yield None # post_part_2 yield None # pre_part_3 yield None # post_part_3 def user(): block = template() with block: # do_part_1 with block: # do_part_2 with block: # do_part_3 If 'with' is a looping construct, the above won't work, since the first usage will drain the template. Accordingly, I would like to suggest that 'with' revert to something resembling the PEP 310 definition: resource = EXPR if hasattr(resource, "__enter__"): VAR = resource.__enter__() else: VAR = None try: try: BODY except: raise # Force realisation of sys.exc_info() for use in __exit__() finally: if hasattr(resource, "__exit__"): VAR = resource.__exit__() else: VAR = None Generator objects could implement this protocol, with the following behaviour: def __enter__(): try: return self.next() except StopIteration: raise RuntimeError("Generator exhausted, unable to enter with block") def __exit__(): try: return self.next() except StopIteration: return None def __except__(*exc_info): pass def __no_except__(): pass Note that the code template can deal with exceptions quite happily by utilising sys.exc_info(), and that the result of the call to __enter__ is available *inside* the with block, while the result of the call to __exit__ is available *after* the block (useful for multi-part blocks). If I want to drain the template, then I can use a 'for' loop (albeit without the cleanup guarantees). 
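[A small wrapper class makes the multi-part template idea concrete. Today's with statement is used below purely to stand in for the proposed one, and this __exit__ never suppresses exceptions, which is an assumption: the message leaves the __except__/__no_except__ details open.]

```python
class GenResource:
    """Wrap a generator in the sketched __enter__/__exit__ protocol."""
    def __init__(self, gen):
        self.gen = gen
    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError(
                "Generator exhausted, unable to enter with block")
    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            next(self.gen)
        except StopIteration:
            pass
        return False                # never swallow exceptions here

trace = []
def template():
    trace.append("pre_part_1");  yield None
    trace.append("post_part_1"); yield None
    trace.append("pre_part_2");  yield None
    trace.append("post_part_2")

block = GenResource(template())
with block:                         # runs pre_part_1 / post_part_1
    trace.append("do_part_1")
with block:                         # runs pre_part_2 / post_part_2
    trace.append("do_part_2")
# trace interleaves template parts and body parts, in order
```

Because each __enter__/__exit__ pair advances the generator only one step, the same template object can be entered several times without being drained.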
Taking this route would mean that: * PEP 310 and the question of passing values or exceptions into iterators would again become orthogonal * Resources written using generator syntax aren't cluttered with the repetitive try/finally code PEP 310 is trying to eliminate * 'for' remains TOOW to write an iterative loop * it is possible to execute _different_ suites between each yield in the template block, rather than being constrained to a single suite as in the looping case. * no implications for the semantics of 'return', 'break', 'continue' * 'yield' would not be usable inside a with block, unless the AbortIteration concept was adopted for forcible generator termination. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From jimjjewett at gmail.com Tue Apr 26 17:26:30 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 17:26:32 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse Message-ID: <fb6fbf5605042608267ed17786@mail.gmail.com> Michael Hudson: > This proposal seems a bit like an effort to make generators good at > doing something that they aren't really intended -- or dare I say > suited? -- for. I think it is more an effort to use the right keyword, which has unfortunately already been claimed by generators (and linked to iterators). yield is a sensible way for code to say "your turn, but come back later". But at the moment, it means "I am producing an intermediate value", and the way to call that function is to treat it as an iterator (which seems to imply looping over a closed set, so don't send in more information after the initial setup). Should we accept that "yield" is already used up, or should we shoehorn the concepts until they're "close enough"? > So, here's a counterproposal! > with expr as var: > ... code ... > is roughly: > def _(var): > ... code ... > __private = expr > __private(_) ... 
> The need for approximation in the above translation is necessary > because you'd want to make assignments in '...code...' affect the scope > they're written in, To me, this seems like the core requirement. I see three sensible paths: (1) Do nothing. (2) Add a way to say "Make this function I'm calling use *my* locals and globals." This seems to meet all the agreed-upon-as-good use cases, but there is disagreement over how to sensibly write it. The calling function is the place that could get surprised, but people who want thunks seem to want the specialness in the called function. (3) Add macros. We still have to figure out how to limit their obfuscation. Attempts to detail that goal seem to get sidetracked. -jJ From tjreedy at udel.edu Tue Apr 26 18:00:30 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 26 18:05:05 2005 Subject: [Python-Dev] Re: Re: Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426CB7C2.8030508@gmail.com> <d4iv44$9gn$1@sea.gmane.org> <426DC75A.1010005@canterbury.ac.nz> Message-ID: <d4lo91$9g6$1@sea.gmane.org> "Greg Ewing" <greg.ewing@canterbury.ac.nz> wrote in message news:426DC75A.1010005@canterbury.ac.nz... > Terry Reedy wrote: The part you quoted was by Nick Coghlan, not me, as indicated by the >> (now >>>) instead of > (which would now be >>) in front of the lines. >>>Not supporting iterables makes it harder to write a class which is ... From ark-mlist at att.net Tue Apr 26 18:09:58 2005 From: ark-mlist at att.net (Andrew Koenig) Date: Tue Apr 26 18:09:49 2005 Subject: [Python-Dev] defmacro In-Reply-To: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> > This doesn't feel right to me. By that argument, people would want > to "improve" > > (mapcar (lambda (x) (car x)) list-of-lists) > > to > > (mapcar list-of-lists (x) (car x)) > > Have you ever heard someone complain about that lambda, though? Welllll.... 
Shouldn't you have written (mapcar car list-of-lists) or am I missing something painfully obvious? From facundobatista at gmail.com Tue Apr 26 18:22:15 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 26 18:22:18 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <ca471dc20504250957753a7445@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> Message-ID: <e04bdf31050426092230502ab1@mail.gmail.com> On 4/25/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > > I was in my second class of the Python workshop I'm giving here in one > > Argentine University, and I was explaining how to think using > > name/object and not variable/value. > > > > Using id() for being pedagogic about the objects, the kids saw that > > id(3) was always the same, but id([]) not. I explained to them that > > Python, in some circumstances, caches the object, and I kept them > > happy enough. > > > > But I really don't know what objects and in which circumstances. > > Aargh! Bad explanation. Or at least you're missing something: Not really. It's easier for me to show that id(3) is always the same and id([]) not, and let the kids see that's not so easy and you'll have to look deeper if you want to know better. If I did id(3) and id(500), then the difference would look more subtle, and I would have had to explain it at greater length. Remember, it was the second day (2 hours per day). > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). These are exactly my doubts, ;). . 
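[For reference, the caching and interning under discussion are easy to demonstrate. All of this is CPython implementation detail, not a language guarantee, and the exact cached range varies between versions; present-day Python shown.]

```python
import sys

# int(...) builds values at run time, defeating the compile-time constant
# sharing that would otherwise confuse the demonstration.
a = int("7")
b = int("7")
print(a is b)        # True in CPython: small ints come from a cache

c = int("500")
d = int("500")
print(c is d)        # typically False: larger ints are fresh objects

# Strings built at run time are not interned automatically, but
# intern() (sys.intern in Python 3) maps equal strings to one object:
s = "".join(["he", "llo"])
print(sys.intern(s) is sys.intern("hello"))   # True
```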
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From facundobatista at gmail.com Tue Apr 26 18:24:11 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 26 18:24:13 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <426DC58D.2010102@canterbury.ac.nz> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> <426DC58D.2010102@canterbury.ac.nz> Message-ID: <e04bdf3105042609241b161d84@mail.gmail.com> On 4/26/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Also, string literals that resemble Python identifiers > are often interned, although this is not guaranteed. > And this only applies to literals, not strings constructed > dynamically by the program (unless you explicitly apply > intern() to them). This simplifies the whole thing. If the issue arises again, my speech will be: "Don't worry about that, Python worries for you". :D And if *someone* in particular is still interested in it (I'm pretty sure the whole class won't), I'll explain it to him better, and with more time. Thank you! . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From gvanrossum at gmail.com Tue Apr 26 18:57:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 18:57:04 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042608267ed17786@mail.gmail.com> References: <fb6fbf5605042608267ed17786@mail.gmail.com> Message-ID: <ca471dc20504260957523dce36@mail.gmail.com> > (2) Add a way to say "Make this function I'm calling use *my* locals > and globals." This seems to meet all the agreed-upon-as-good use > cases, but there is disagreement over how to sensibly write it. The > calling function is the place that could get surprised, but people > who want thunks seem to want the specialness in the called function. 
I think there are several problems with this. First, it looks difficult to provide semantics that cover all the corners for the blending of two namespaces. What happens to names that have a different meaning in each scope? (E.g. 'self' when calling a method of another object; or any other name clash.) Are the globals also blended? How? Second, this construct would have to make sense for all callables; you seem to want to apply it to functions (and I suppose methods, whether bound or not), but it makes no sense when the callable is implemented as a C function, or is a class, or an object with a __call__ method. Third, I expect that if we solve the first two problems, we'll still find that for an efficient implementation we need to modify the bytecode of the called function. If you really want to pursue this idea beyond complaining "nobody listens to me" (which isn't true BTW), I suggest that you try to define *exactly* how you think it should work. Try to make sure that it can be used in a "statement context" as well as in an "expression context". You don't need to come up with a working implementation, but you should be able to convince me (or Raymond H :-) that it *can* be implemented, and that the performance will be reasonable, and that it won't affect performance when not used, etc. If you think that's beyond you, then perhaps you should accept "no" as the only answer you're gonna get. Because I personally strongly suspect that it won't work, so the burden of "proof", so to speak, is on you. > (3) Add macros. We still have to figure out how to limit their obfuscation. > Attempts to detail that goal seem to get sidetracked. No, the problem is not how to limit the obfuscation. The problem is the same as for (2), only more so: nobody has given even a *remotely* plausible mechanism for how exactly you would get code executed at compile time. You might want to look at Boo, a Python-inspired language that translates to C#. 
They have something they call syntactic macros: http://boo.codehaus.org/Syntactic+Macros . -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at strakt.com Tue Apr 26 19:12:22 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue Apr 26 19:12:27 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <426E7676.2040501@strakt.com> Michael Hudson wrote: > The history of iterators and generators could be summarized by saying > that an API was invented, then it turned out that in practice one way > of implementing them -- generators -- was almost universally useful. > > This proposal seems a bit like an effort to make generators good at > doing something that they aren't really intended -- or dare I say > suited? -- for. The tail wagging the dog so to speak. > It is fun because the two of us sort of already had this discussion in compressed form a long time ago: http://groups-beta.google.com/groups?q=with+generators+pedronis&hl=en not that I was really convinced about my idea at the time which was very embryonic, and in fact I'm a bit skeptical right now about how much bending or not of generators makes sense, especially from a learnability point of view. 
From mwh at python.net Tue Apr 26 19:48:26 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 19:48:28 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E7676.2040501@strakt.com> (Samuele Pedroni's message of "Tue, 26 Apr 2005 19:12:22 +0200") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E7676.2040501@strakt.com> Message-ID: <2mk6mp2zv9.fsf@starship.python.net> Samuele Pedroni <pedronis@strakt.com> writes: > Michael Hudson wrote: > >> The history of iterators and generators could be summarized by >> saying that an API was invented, then it turned out that in practice >> one way of implementing them -- generators -- was almost universally >> useful. >> >> This proposal seems a bit like an effort to make generators good at >> doing something that they aren't really intended -- or dare I say >> suited? -- for. The tail wagging the dog so to speak. >> > it is fun because the two of us sort of already had this discussion in > compressed form a long time ago: Oh yes. That was the discussion that led to PEP 310 being written. > http://groups-beta.google.com/groups?q=with+generators+pedronis&hl=en At least I'm consistent :) > not that I was really convinced about my idea at the time which was > very embryonic, and in fact I'm a bit skeptical right now about how > much bending or not of generators makes sense, especially from a > learnability point of view. As am I, obviously. Cheers, mwh -- Arrrrgh, the braindamage! It's not unlike the massively non-brilliant decision to use the period in abbreviations as well as a sentence terminator. Had these people no imagination at _all_? 
-- Erik Naggum, comp.lang.lisp From rrr at ronadam.com Tue Apr 26 20:18:17 2005 From: rrr at ronadam.com (ron adam) Date: Tue Apr 26 20:18:40 2005 Subject: [Python-Dev] Re: anonymous blocks Message-ID: <426E85E9.4060606@ronadam.com> Hi, this is my first post here and I've been following this very interesting discussion as it has developed. A really short intro about me, I was trained as a computer tech in the early 80's... ie. learned transistors, gates, logic etc... And so my focus tends to be from that of a troubleshooter. I'm medically retired now (not a subject for here) and am looking for something meaningful and rewarding that I can contribute to with my free time. I will not post often at first as I am still getting up to speed with CVS and how Python's core works. Hopefully I'm not lagging this discussion too far or adding unneeded noise to it. :-) >> So maybe the 'with' keyword should be dropped (again!) in >> favour of >> >> with_opened(pathname) as f: >> ... >> > > But that doesn't look so great for the case where there's no variable > to be assigned to -- I wasn't totally clear about it, but I meant the > syntax to be > > with [VAR =] EXPR: BLOCK > > where VAR would have the same syntax as the left hand side of an > assignment (or the variable in a for-statement). > I keep wanting to read it as: with OBJECT [from EXPR]: BLOCK >> 2) I'm not sure about the '='. It makes it look rather deceptively >> like an ordinary assignment, and I'm sure many people are going >> to wonder what the difference is between >> >> with f = opened(pathname): >> do_stuff_to(f) >> >> and simply >> >> f = opened(pathname) >> do_stuff_to(f) >> >> or even just unconsciously read the first as the second without >> noticing that anything special is going on. Especially if they're >> coming from a language like Pascal which has a much less magical >> form of with-statement. >> Below is what gives me the clearest picture so far. To me there is nothing 'anonymous' going on here. 
Which is good I think. :-) After playing around with Guido's example a bit, it looks to me like the role of a 'with' block is to define the life of a resource object. So "with OBJECT: BLOCK" seems to me to be the simplest and most natural way to express this. def with_file(filename, mode): """ Create a file resource """ f = open(filename, mode) try: yield f # use yield here finally: # Do at exit of 'with <resource>: <block>' f.close() # Get a resource/generator object and use it. f_resource = with_file('resource.py', 'r') with f_resource: f = f_resource.next() # get values from yields for line in f: print line, # Generator resource with yield loop. def with_file(filename): """ Create a file line resource """ f = open(filename, 'r') try: for line in f: yield line finally: f.close() # print lines in this file. f_resource = with_file('resource.py') with f_resource: while 1: line = f_resource.next() if line == "": break print line, The life of an object used with a 'with' block is shorter than that of the function it is called from, but if the function is short, the life could be the same as the function. Then the 'with' block could be optional if the resource object's __exit__ method is called when the function exits, but that may require some way to tag a resource as being different from other classes and generators to keep from evaluating __exit__ methods of other objects. As far as looping behaviors go, I prefer the loop to be explicitly defined in the resource or the body of the 'with', because it looks to be more flexible. Ron_Adam # "The right question is a good start to finding the correct answer." 
From aahz at pythoncraft.com Tue Apr 26 21:02:56 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 26 21:03:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <20050426190256.GA14052@panix.com> On Tue, Apr 26, 2005, Guido van Rossum wrote: > > Now there's one more twist, which you may or may not like. Presumably > (barring obfuscations or bugs) the handling of BreakFlow and > ContinueFlow by an iterator (or generator) is consistent for all uses > of that particular iterator. For example synchronized(lock) and > transactional(db) do not behave as loops, and forever() does. Ditto > for handling ReturnFlow. This is why I've been thinking of leaving > out the 'with' keyword: in your mind, these calls would become new > statement types, even though the compiler sees them all the same: > > synchronized(lock): > BLOCK > > transactional(db): > BLOCK > > forever(): > BLOCK > > opening(filename) as f: > BLOCK That's precisely why I think we should keep the ``with``: the point of Python is to have a restricted syntax and requiring a prefix for these constructs makes it easier to read the code. You'll soon start to gloss over the ``with`` but it will be there as a marker for your subconscious. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
From nicksjacobson at hotmail.com Tue Apr 26 21:57:17 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 21:57:20 2005 Subject: [Python-Dev] atexit missing an unregister method Message-ID: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> I was looking at the atexit module the other day; it seems like an elegant way to ensure that resources are cleaned up (that the garbage collector doesn't take care of). But while you can mark functions to be called with the 'register' method, there's no 'unregister' method to remove them from the stack of functions to be called. Nor is there any way to view this stack and e.g. call 'del' on a registered function. This would be useful in the following scenario, in which x and y are resources that need to be cleaned up, even in the event of a program exit: import atexit def free_resource(resource): ... atexit.register(free_resource, x) atexit.register(free_resource, y) # do operations with x and y, potentially causing the program to exit ... # if nothing caused the program to unexpectedly quit, close the resources free_resource(x) free_resource(y) #unregister the functions, so that you don't try to free the resources twice! atexit.unregisterall() Alternatively, it would be great if there were a way to view the stack of registered functions, and delete them from there. --Nick Jacobson _________________________________________________________________ Don’t just search. Find. Check out the new MSN Search! 
http://search.msn.click-url.com/go/onm00200636ave/direct/01/ From gvanrossum at gmail.com Tue Apr 26 21:59:55 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 22:00:00 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <ca471dc20504261259440098e0@mail.gmail.com> On 4/26/05, Nick Jacobson <nicksjacobson@hotmail.com> wrote: > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions to > be called. Nor is there any way to view this stack and e.g. call 'del' on a > registered function. > > This would be useful in the following scenario, in which x and y are > resources that need to be cleaned up, even in the event of a program exit: > > import atexit > > def free_resource(resource): > ... > > atexit.register(free_resource, x) > atexit.register(free_resource, y) > # do operations with x and y, potentially causing the program to exit > ... > # if nothing caused the program to unexpectedly quit, close the resources > free_resource(x) > free_resource(y) > #unregister the functions, so that you don't try to free the resources > twice! > atexit.unregisterall() > > Alternatively, it would be great if there were a way to view the stack of > registered functions, and delete them from there. Methinks that the resource cleanup routines ought to be written so as to be reentrant. That shouldn't be too hard (you can always maintain a global flag that means "already called"). 
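[The "already called" flag Guido mentions takes only a few lines. A sketch, with invented names, that makes any cleanup callable safely reentrant:]

```python
import atexit

def make_cleanup(resource, free):
    """Wrap free(resource) so that calling it twice is harmless."""
    done = [False]                  # the "already called" flag
    def cleanup():
        if not done[0]:
            done[0] = True
            free(resource)
    return cleanup

freed = []
cleanup_x = make_cleanup("x", freed.append)
atexit.register(cleanup_x)          # will run at exit...
cleanup_x()                         # ...but an early explicit call is fine
cleanup_x()                         # and a second call is a no-op
# freed == ["x"]: the resource was released exactly once
```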
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Apr 26 22:06:50 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Apr 26 22:06:54 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <17006.40794.279653.631294@montanaro.dyndns.org> Nick> But while you can mark functions to be called with the 'register' Nick> method, there's no 'unregister' method to remove them from the Nick> stack of functions to be called. Nor is there any way to view Nick> this stack and e.g. call 'del' on a registered function. Nick> This would be useful in the following scenario, in which x and y Nick> are resources that need to be cleaned up, even in the event of a Nick> program exit: Nick> import atexit Nick> def free_resource(resource): Nick> ... Nick> atexit.register(free_resource, x) Nick> atexit.register(free_resource, y) Nick> # do operations with x and y, potentially causing the program to exit Nick> ... Nick> # if nothing caused the program to unexpectedly quit, close the resources Nick> free_resource(x) Nick> free_resource(y) Nick> #unregister the functions, so that you don't try to free the resources Nick> twice! Nick> atexit.unregisterall() This seems like a poor argument for unregistering exit handlers. If you've registered an exit handler, why then explicitly do what you've already asked the system to do? Also, your proposed unregisterall() function would be dangerous. As an application writer you don't know what other parts of the system (libraries you use, for example) might have registered exit functions. 
Skip From aahz at pythoncraft.com Tue Apr 26 22:18:30 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 26 22:18:33 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <20050426201830.GA5253@panix.com> On Tue, Apr 26, 2005, Nick Jacobson wrote: > > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions > to be called. Nor is there any way to view this stack and e.g. call 'del' > on a registered function. > > This would be useful in the following scenario, in which x and y are > resources that need to be cleaned up, even in the event of a program exit: > > import atexit > > def free_resource(resource): > ... > > atexit.register(free_resource, x) > atexit.register(free_resource, y) This seems like the wrong way. Why not do this: class ResourceCleanup: def register(self, resource, func): ... def unregister(self, resource): ... def __call__(self): ... handler = ResourceCleanup) atexit.register(handler) handler.register(x, free_resource) do(x) handler.unregister(x) Probably further discussion should go to comp.lang.python -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
From martin at v.loewis.de Tue Apr 26 22:24:37 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue Apr 26 22:24:40 2005 Subject: [Python-Dev] Removing --with-wctype-functions support In-Reply-To: <1114522169.5352.9.camel@poseidon.cekirdek.int> References: <1114522169.5352.9.camel@poseidon.cekirdek.int> Message-ID: <426EA385.90609@v.loewis.de> S.Çağlar Onur wrote: > I want to know status of > http://mail.python.org/pipermail/python-dev/2004-December/050193.html > this thread. The status is that they are still there. > Will python remove wctype functions support from its core? I don't know what MAL's plans are these days, but it is likely that he will remove the functions from the places where they are used at the moment. > If it will, what about locale-dependent case conversion functions? > > Without this support python behaves wrong in tr_TR.UTF-8 locale. I can sympathise with the problem. IMO, the right solution is to provide them through the locale module. That would have the advantage that the choice of locale-awareness of Unicode case conversions (etc.) is a per-script decision, rather than an interpreter build-time decision. Patches in this direction (adding the functions to _localemodule.c) are welcome, independent of whether they are removed from the methods on Unicode objects. Such functions should probably polymorphically operate both on byte strings and Unicode strings, allowing to deprecate the locale-specific methods on strings as well. 
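[The tr_TR complaint is about the dotted/dotless i: Unicode's default case mappings, which str.upper()/str.lower() always apply, are locale-independent, so Turkish text comes out wrong. A toy illustration in present-day Python; turkish_upper is a deliberately naive stand-in for the locale-module function being proposed, not a real API:]

```python
# The built-in methods use the locale-independent default mappings:
assert "i".upper() == "I"       # Turkish rules want İ (U+0130) here
assert "I".lower() == "i"       # Turkish rules want ı (U+0131) here

# A locale-aware conversion would have to special-case the two i's
# (this toy version ignores pre-existing capital I's, among other things):
def turkish_upper(s):
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()

print(turkish_upper("izmir"))   # İZMİR, with dotted capital I's
```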
Regards, Martin From python at rcn.com Mon Apr 25 22:25:50 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 26 22:26:26 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <001401c549d5$06b34200$7c29a044@oemcomputer> [Nick Jacobson] > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions > to > be called. . . . > Alternatively, it would be great if there were a way to view the stack of > registered functions, and delete them from there. Please file a feature request on SourceForge. Will mull it over for a while. My first impression is that try/finally is a better tool for the scenario you outlined. The issue with unregister() is that the order of clean-up calls is potentially significant. If the same function is listed more than once, there would be no clear-cut way to know which should be removed when unregister() is called. Likewise, I suspect that exposing the stack will create more pitfalls and risks than it could provide in benefits. Dealing with a stack of functions is likely to be clumsy at best. Raymond Hettinger From mal at egenix.com Tue Apr 26 22:32:13 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 26 22:32:17 2005 Subject: [Python-Dev] Removing --with-wctype-functions support In-Reply-To: <426EA385.90609@v.loewis.de> References: <1114522169.5352.9.camel@poseidon.cekirdek.int> <426EA385.90609@v.loewis.de> Message-ID: <426EA54D.2060604@egenix.com> Martin v. Löwis wrote: > S.Çağlar Onur wrote: > >>I want to know status of >>http://mail.python.org/pipermail/python-dev/2004-December/050193.html >>this thread. > > > The status is that they are still there. 
Due to lack of time on my part. >>Will python remove wctype functions support from its core? > > > I don't know what MAL's plans are these days, but it is likely that > he will remove the functions from the places where they are used > at the moment. Right. I haven't heard any complaints, so that's the plan. >>If it will, what about locale-dependent case conversation functions? >> >>Without this support python behaves wrong in tr_TR.UTF-8 locale. Could you be more specific about the problem ? It's probably best to open a bug report in SourceForge. > I can sympathise with the problem. IMO, the right solution is to provide > them throught the locale module. That would have the advantage that > the choice of locale-awareness of Unicode case conversions (etc.) is > a per-script decision, rather than an interpreter built-time decision. > > Patches in this direction (adding the functions to _localemodule.c) > are welcome, independent of whether they are removed from the methods > on Unicode objects. Such functions should probably polymorphically > operate both on byte strings and Unicode strings, allowing to deprecate > the locale-specific methods on strings as well. +1, though we will only want to deprecate the "locale dependency", not the methods themselves ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 26 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From nicksjacobson at hotmail.com Tue Apr 26 22:40:12 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 22:40:33 2005 Subject: [Python-Dev] Re: atexit missing an unregister method Message-ID: <BAY17-F150A28EE34D5041AD5AB3BA4210@phx.gbl> << This seems like a poor argument for unregistering exit handlers. If you've registered an exit handler, why then explicitly do what you've already asked the system to do? >> 1. To free up memory for the rest of the program. 2. If the following block is in a loop, and you need to allocate & then deallocate resources multiple times:

<<
atexit.register(free_resource, x)
atexit.register(free_resource, y)

# do operations with x and y, potentially causing the program to exit
...

# if nothing caused the program to unexpectedly quit, close the resources
free_resource(x)
free_resource(y)
>>

<< Also, your proposed unregisterall() function would be dangerous. As an application writer you don't know what other parts of the system (libraries you use, for example) might have registered exit functions. Skip >> That's true...it would probably be better to expose the stack of registered functions. That way you can manually unregister functions you've registered. --Nick Jacobson _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today - it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From nicksjacobson at hotmail.com Tue Apr 26 22:50:28 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 22:50:31 2005 Subject: [Python-Dev] Re: atexit missing an unregister method Message-ID: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Raymond Hettinger wrote: << Will mull it over for a while. My first impression is that try/finally is a better tool for the scenario you outlined. >> You're right. try/finally takes care of my sample scenario. There may still be a case to be made for atexit.unregister(), though.
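For reference, the try/finally version of the loop scenario above might look like this — a minimal sketch where free_resource and the resource values are hypothetical stand-ins, not part of the atexit API:

```python
cleaned_up = []

def free_resource(res):
    # hypothetical cleanup helper, standing in for the free_resource()
    # in the example above; here it just records what was freed
    cleaned_up.append(res)

def use_resources():
    x, y = "x-res", "y-res"
    try:
        pass  # do operations with x and y; an exception here still cleans up
    finally:
        # runs on normal completion *and* on exceptions, so nothing
        # ever needs to be registered or unregistered with atexit
        free_resource(x)
        free_resource(y)

for _ in range(3):  # safe to repeat, unlike piling up atexit registrations
    use_resources()
```

Because the cleanup is scoped to the try block rather than to interpreter exit, repeating it in a loop never accumulates stale handlers.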
--Nick Jacobson _________________________________________________________________ On the road to retirement? Check out MSN Life Events for advice on how to get there! http://lifeevents.msn.com/category.aspx?cid=Retirement From paul at pfdubois.com Tue Apr 26 22:53:05 2005 From: paul at pfdubois.com (Paul Dubois) Date: Tue Apr 26 22:53:09 2005 Subject: [Python-Dev] python.org crashing Mozilla? Message-ID: <426EAA31.1050304@pfdubois.com> Three different computers running Linux / Mozilla are crashing Mozilla when directed to python.org. A Netscape works ok. Are we hacked or are we showing off? From modelnine at ceosg.de Wed Apr 27 00:58:21 2005 From: modelnine at ceosg.de (Heiko Wundram) Date: Tue Apr 26 22:58:48 2005 Subject: [Python-Dev] python.org crashing Mozilla? In-Reply-To: <426EAA31.1050304@pfdubois.com> References: <426EAA31.1050304@pfdubois.com> Message-ID: <200504270058.24606.modelnine@ceosg.de> On Tuesday, 26 April 2005 22:53, Paul Dubois wrote: > Three different computers running Linux / Mozilla are crashing Mozilla > when directed to python.org. A Netscape works ok. Are we hacked or are > we showing off? Firefox on Gentoo works okay...? -- --- Heiko. listening to: Incubus - Megalomaniac see you at: http://www.stud.mh-hannover.de/~hwundram/wordpress/ From fdrake at acm.org Tue Apr 26 23:00:49 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Apr 26 23:01:06 2005 Subject: [Python-Dev] python.org crashing Mozilla?
In-Reply-To: <426EAA31.1050304@pfdubois.com> References: <426EAA31.1050304@pfdubois.com> Message-ID: <200504261700.49587.fdrake@acm.org> On Tuesday 26 April 2005 16:53, Paul Dubois wrote: > Three different computers running Linux / Mozilla are crashing Mozilla > when directed to python.org. A Netscape works ok. Are we hacked or are > we showing off? Paul, My Firefox 1.0.2 is fine. What version(s) of Mozilla, and what host platforms, would be helpful. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> From Ugo_DiGirolamo at invision.iip.com Tue Apr 26 23:15:38 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Tue Apr 26 23:14:54 2005 Subject: [Python-Dev] Problem with embedded python Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> I have the following code, that seems to make sense to me. However, it crashes about 1/3 of the times. My platform is Python 2.4.1 on WXP (I tried the release version from the msi and the debug version built by me, both downloaded today to have the latest version). The crash happens while the main thread is in Py_Finalize. I traced the crash to _Py_ForgetReference(op) in object.c at line 1847, where I have op->_ob_prev == NULL. What am I doing wrong? I'm definitely not too sure about the way I'm handling the GIL. 
Thanks in adv for any suggestion/ comment Cheers and ciao Ugo

////////////////////////// TestPyThreads.py //////////////////////////
#include <windows.h>
#include "Python.h"

int main()
{
    PyEval_InitThreads();
    Py_Initialize();
    PyGILState_STATE main_restore_state = PyGILState_UNLOCKED;
    PyGILState_Release(main_restore_state);

    // start the thread
    {
        PyGILState_STATE state = PyGILState_Ensure();
        int trash = PyRun_SimpleString(
            "import thread\n"
            "import time\n"
            "def foo():\n"
            "    f = open('pippo.out', 'w', 0)\n"
            "    i = 0;\n"
            "    while 1:\n"
            "        f.write('%d\\n'%i)\n"
            "        time.sleep(0.01)\n"
            "        i += 1\n"
            "t = thread.start_new_thread(foo, ())\n"
        );
        PyGILState_Release(state);
    }

    // wait 300 ms
    Sleep(300);

    PyGILState_Ensure();
    Py_Finalize();
    return 0;
}

From python at rcn.com Mon Apr 25 23:20:35 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 26 23:20:49 2005 Subject: [Python-Dev] Re: atexit missing an unregister method In-Reply-To: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Message-ID: <003d01c549dc$9f2ebbc0$7c29a044@oemcomputer> [Raymond Hettinger] > << Will mull it over for a while. My first impression is that try/finally > is a better tool for the scenario you outlined. >> [Nick Jacobson] > You're right. try/finally takes care of my sample scenario. There may > still be a case to be made for atexit.unregister(), though. Now is the time to move the discussion to SF feature requests or to comp.lang.python. If you devote time to "making a case", then also devote equal effort to researching the hazards and API issues. "Potentially useful" is usually trumped by "potentially harmful". Also, if the API is awkward or error-prone, that is a bad sign. Specifically, consider whether exposing the data structure opens the possibility of accidentally violating invariants assumed by other calls of atexit().
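The ambiguity Raymond describes is easy to demonstrate with a plain list standing in for atexit's internal stack (this is an illustration, not the actual atexit implementation, and register/close here are local stand-ins):

```python
registry = []  # stand-in for atexit's internal stack of (func, args) entries

def register(func, *args):
    # mimics atexit.register: the same function may be added repeatedly
    registry.append((func, args))

def close(name):
    print("closing", name)

register(close, "db")
register(close, "socket")

# A hypothetical unregister(close) could only match on the function object,
# and *both* entries match -- there is no clear-cut entry to remove:
matches = [entry for entry in registry if entry[0] is close]
```

Since `close` appears twice with different arguments, removal by function alone is underspecified; an API would have to match on (func, args) pairs or return handles, which is exactly the extra surface area being questioned.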
With respect to the API, consider whether you could explain to a newbie (who has just finished the tutorial) how to access the structure, lookup a target function, and make appropriate modifications without breaking anything else. Raymond From ncoghlan at gmail.com Tue Apr 26 23:21:06 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 23:21:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com> References: <ca471dc205042416572da9db71@mail.gmail.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com> Message-ID: <426EB0C2.8020006@gmail.com> Phillip J. Eby wrote: > At 09:12 PM 4/24/05 -0600, Steven Bethard wrote: > >> I guess it would be helpful to see example where the looping >> with-block is useful. > > > Automatically retry an operation a set number of times before hard failure: > > with auto_retry(times=3): > do_something_that_might_fail() > > Process each row of a database query, skipping and logging those that > cause a processing error: > > with x,y,z = log_errors(db_query()): > do_something(x,y,z) > > You'll notice, by the way, that some of these "runtime macros" may be > stackable in the expression. 
These are also possible by combining a normal for loop with a non-looping with (but otherwise using Guido's exception injection semantics):

def auto_retry(attempts):
    success = [False]
    failures = [0]
    error = [None]  # renamed: 'except' is a keyword, so it can't be a variable name
    def block():
        try:
            yield None
        except:
            failures[0] += 1
        else:
            success[0] = True
    while not success[0] and failures[0] < attempts:
        yield block()
    if not success[0]:
        raise Exception # You'd actually propagate the last inner failure

for attempt in auto_retry(3):
    with attempt:
        do_something_that_might_fail()

The non-looping version of with seems to give the best of both worlds - multipart operation can be handled by multiple with statements, and repeated use of the same suite can be handled by nesting the with block inside iteration over an appropriate generator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From martin at v.loewis.de Tue Apr 26 23:29:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 26 23:29:04 2005 Subject: [Python-Dev] Re: atexit missing an unregister method In-Reply-To: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> References: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Message-ID: <426EB29D.30403@v.loewis.de> Nick Jacobson wrote: > You're right. try/finally takes care of my sample scenario. There may > still be a case to be made for atexit.unregister(), though. No. Anybody in need of such a feature can easily unregister it.
allregistrations = []

def _run():
    for fn in allregistrations:
        fn()

atexit.register(_run)

def register(fn):
    allregistrations.append(fn)

def unregister(fn):
    allregistrations.remove(fn)

Regards, Martin From jimjjewett at gmail.com Tue Apr 26 23:30:30 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 23:30:34 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse Message-ID: <fb6fbf5605042614308198cb6@mail.gmail.com> >> (2) Add a way to say "Make this function I'm calling use *my* locals >> and globals." This seems to meet all the agreed-upon-as-good use >> cases, but there is disagreement over how to sensibly write it. The >> calling function is the place that could get surprised, but people >> who want thunks seem to want the specialness in the called function. > I think there are several problems with this. First, it looks > difficult to provide semantics that cover all the corners for the > blending of two namespaces. What happens to names that have a > different meaning in each scope? Programming error. Same name ==> same object. If a function is using one of _your_ names for something incompatible, then don't call that function with collapsed scope. The same "problem" happens with globals today. Code in module X can break if module Y replaces (not shadows, replaces) a builtin with an incompatible object. Except ... > (E.g. 'self' when calling a method of > another object; or any other name clash.) The first argument of a method *might* be a special case. It seems wrong to unbind a bound method. On the other hand, resource managers may well want to use unbound methods for the called code. > Are the globals also blended? How? Yes. The callee does not even get to see its normal namespace. Therefore, the callee does not get to use its normal name resolution. If the name normally resolves in locals (often inlined to a tuple, today), it looks in the shared scope, which is "owned" by the caller.
This is different from a free variable only because the callee can write to this dictionary. If the name is free in that shared scope, (which implies that the callee does not bind it, else it would be added to the shared scope) then the callee looks up the caller's nested stack and then to the caller's globals, and then the caller's builtins. > Second, this construct only makes sense for all callables; Agreed. But using it on a non-function may cause surprising results especially if bound methods are not special-cased. The same is true of decorators, which is why we have (at least initially) "function decorators" instead of "callable decorators". > it makes no sense when the callable is implemented as > a C function, Or rather, it can't be implemented, as the compiler may well have optimized the variable names right out. Stack frame transitions between C and python are already special. > or is a class, or an object with a __call__ method. These are just calls to __init__ (or __new__) or __call__. These may be foolish things to call (particularly if the first argument to a method isn't special-cased), but ... it isn't a problem if the class is written appropriately. If the class is not written appropriately, then don't call it with collapsed scope. > Third, I expect that if we solve the first two > problems, we'll still find that for an efficient implementation we > need to modify the bytecode of the called function. Absolutely. Even giving up the XXX_FAST optimizations would still require new bytecode to not assume them. (Deoptimizing *all* functions, in *all* contexts, is not a sensible tradeoff.) Eventually, an optimizing compiler could do the right thing, but ... that isn't the point. For a given simple algorithm, interpreted python is generally slower than compiled C, but we write in python anyhow -- it is fast enough, and has other advantages. The same is true of anything that lets me not cut-and-paste.
> Try to make sure that it can be used in a "statement context" > as well as in an "expression context". I'm not sure I understand this. The preferred way would be to just stick the keyword before the call. Using 'collapse', it would look like:

def foo(b):
    c=a

def bar():
    a="a1"
    collapse foo("b1")
    print b, c # prints "b1", "a1"
    a="a2"
    foo("b2") # Not collapsed this time
    print b, c # still prints "b1", "a1"

but I suppose you could treat it like the 'global' keyword

def bar():
    a="a1"
    collapse foo # forces foo to always collapse when called within bar
    foo("b1")
    print b, c # prints "b1", "a1"
    a="a2"
    foo("b2") # still collapsed
    print b, c # now prints "b2", "a2"

>> [Alternative 3 ... bigger than merely collapsing scope] >> (3) Add macros. We still have to figure out how to limit their obfuscation. >> Attempts to detail that goal seem to get sidetracked. > No, the problem is not how to limit the obfuscation. The problem is > the same as for (2), only more so: nobody has given even a *remotely* > plausible mechanism for how exactly you would get code executed at > compile time. macros can (and *possibly* should) be evaluated at run-time. Compile time should be possible (there is an interpreter running) and faster, but ... is certainly not required. Even if the macros just rerun the same boilerplate code less efficiently, it is still good to have that boilerplate defined once, instead of cutting and pasting. Or, at least, it is better *if* that once doesn't become unreadable in the process. -jJ From martin at v.loewis.de Tue Apr 26 23:31:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 26 23:31:05 2005 Subject: [Python-Dev] Problem with embedded python In-Reply-To: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> References: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> Message-ID: <426EB315.4040907@v.loewis.de> Ugo Di Girolamo wrote: > What am I doing wrong?
This is not the forum to ask this question, please use python-list@python.org instead. Regards, Martin From Ugo_DiGirolamo at invision.iip.com Tue Apr 26 23:34:49 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Tue Apr 26 23:33:54 2005 Subject: [Python-Dev] Problem with embedded python Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A6801@invnmail.invision.iip.com> Sorry. will do. Ugo -----Original Message----- From: "Martin v. L?wis" [mailto:martin@v.loewis.de] Sent: Tuesday, April 26, 2005 2:31 PM To: Ugo Di Girolamo Cc: python-dev@python.org Subject: Re: [Python-Dev] Problem with embedded python Ugo Di Girolamo wrote: > What am I doing wrong? This is not the forum to ask this question, please use python-list@python.org instead. Regards, Martin From p.f.moore at gmail.com Tue Apr 26 23:40:27 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue Apr 26 23:40:29 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042614308198cb6@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> Message-ID: <79990c6b05042614406bd8f95@mail.gmail.com> On 4/26/05, Jim Jewett <jimjjewett@gmail.com> wrote: > I'm not sure I understand this. The preferred way would be > to just stick the keyword before the call. Using 'collapse', it > would look like: > > def foo(b): > c=a > def bar(): > a="a1" > collapse foo("b1") > print b, c # prints "b1", "a1" > a="a2" > foo("b2") # Not collapsed this time > print b, c # still prints "b1", "a1" *YUK* I spent a long time staring at this and wondering "where did b come from?" You'd have to come up with a very compelling use case to get me to like this. Paul. 
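For comparison, the closest spelling in today's Python is to pass the shared namespace around explicitly — a hedged sketch, with `ns` being an ad-hoc dict rather than any proposed language mechanism — which at least makes it obvious where `b` comes from:

```python
def foo(ns, b):
    # writes into the caller's explicitly passed namespace,
    # instead of implicitly sharing the caller's locals
    ns["b"] = b
    ns["c"] = ns["a"]

def bar():
    ns = {"a": "a1"}
    foo(ns, "b1")              # the "collapsed" call, made explicit
    return ns["b"], ns["c"]    # -> ("b1", "a1")
```

Here `bar()` returns `("b1", "a1")`, matching the `# prints` comments in the collapse example, but the data flow between caller and callee is visible in the signature.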
From gvanrossum at gmail.com Wed Apr 27 00:02:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 00:02:06 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042614308198cb6@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> Message-ID: <ca471dc20504261502560c128f@mail.gmail.com> [Jim Jewett] > >> (2) Add a way to say "Make this function I'm calling use *my* locals > >> and globals." This seems to meet all the agreed-upon-as-good use > >> cases, but there is disagreement over how to sensibly write it. The > >> calling function is the place that could get surprised, but people > >> who want thunks seem to want the specialness in the called function. [Guido] > > I think there are several problems with this. First, it looks > > difficult to provide semantics that cover all the corners for the > > blending of two namespaces. What happens to names that have a > > different meaning in each scope? [Jim] > Programming error. Same name ==> same object. Sounds like a recipe for bugs to me. At the very least it is a total breach of abstraction, which is the fundamental basis of the relationship between caller and callee in normal circumstances. The more I understand your proposal the less I like it. > If a function is using one of _your_ names for something incompatible, > then don't call that function with collapsed scope. The same "problem" > happens with globals today. Code in module X can break if module Y > replaces (not shadows, replaces) a builtin with an incompatible object. > > Except ... > > (E.g. 'self' when calling a method of > > another object; or any other name clash.) > > The first argument of a method *might* be a special case. It seems > wrong to unbind a bound method. On the other hand, resource > managers may well want to use unbound methods for the called > code. Well, what would you pass in as the first argument then? > > Are the globals also blended? How? > > Yes. 
The callee does not even get to see its normal namespace. > Therefore, the callee does not get to use its normal name resolution. Another breach of abstraction: if a callee wants to use an imported module, the import should be present in the caller, not in the callee. This seems to me to repeat all the mistakes of the dynamic scoping of early Lisps (including GNU Emacs Lisp I believe). It really strikes me as an endless source of errors that these blended-scope callees (in your proposal) are ordinary functions/methods, which means that they can *also* be called without blending scopes. Having special syntax to define a callee intended for scope-blending seems much more appropriate (even if there's also special syntax at the call site). > If the name normally resolves in locals (often inlined to a tuple, today), > it looks in the shared scope, which is "owned" by the caller. This is > different from a free variable only because the callee can write to this > dictionary. Aha! This suggests that a blend-callee needs to use different bytecode to avoid doing lookups in the tuple of optimized locals, since the indices assigned to locals in the callee and the caller won't match up except by miracle. > If the name is free in that shared scope, (which implies that the > callee does not bind it, else it would be added to the shared scope) > then the callee looks up the caller's nested stack and then to the > caller's globals, and then the caller's builtins. > > > Second, this construct only makes sense for all callables; (I meant this to read "does not make sense for all callables".) > Agreed. (And I presume you read it that way. :-) > But using it on a non-function may cause surprising results > especially if bound methods are not special-cased. > > The same is true of decorators, which is why we have (at least > initially) "function decorators" instead of "callable decorators". Not true. 
It is possible today to write decorators that accept things other than functions -- in fact, this is often necessary if you want to write decorators that combine properly with other decorators that don't return function objects (such as staticmethod and classmethod). > > it makes no sense when the callable is implemented as > > a C function, > > Or rather, it can't be implemented, as the compiler may well > have optimized the variables names right out. Stack frame > transitions between C and python are already special. Understatement of the year. There just is no similarity between C and Python stack frames. How much do you really know about Python's internals??? > > or is a class, or an object with a __call__ method. > > These are just calls to __init__ (or __new__) or __call__. No they're not. Calling a class *first* creates an instance (calling __new__ if it exists) and *then* calls __init__ (if it exists). > These may be foolish things to call (particularly if the first > argument to a method isn't special-cased), but ... it isn't > a problem if the class is written appropriately. If the class > is not written appropriately, then don't call it with collapsed > scope. That's easy for you to say. Since the failure behavior is so messy I'd rather not get started. > > Third, I expect that if we solve the first two > > problems, we'll still find that for an efficient implementation we > > need to modify the bytecode of the called function. > > Absolutely. Even giving up the XXX_FAST optimizations would > still require new bytecode to not assume them. (Deoptimizing > *all* functions, in *all* contexts, is not a sensible tradeoff.) So you actually *agree* that blended-scope functions should be marked as such at the callee definition, not just at the call site. Or how else would you do this? > Eventually, an optimizing compiler could do the right thing, but ... > that isn't the point. 
> > For a given simple algorithm, interpreted python is generally slower > than compiled C, but we write in python anyhow -- it is fast enough, > and has other advantages. The same is true of anything that lets > me not cut-and-paste. Whatever. Any new feature that causes a measurable slowdown for code that does *not* need the feature has a REALLY hard time getting accepted, by me as well as by the Python community. Slow Python down enough, and your target audience reduces to a small bunch of folks who are programming for their own education. > > Try to make sure that it can be used in a "statement context" > > as well as in an "expression context". > > I'm not sure I understand this. The preferred way would be > to just stick the keyword before the call. Using 'collapse', it > would look like:

> def foo(b):
>     c=a
> def bar():
>     a="a1"
>     collapse foo("b1")
>     print b, c # prints "b1", "a1"
>     a="a2"
>     foo("b2") # Not collapsed this time
>     print b, c # still prints "b1", "a1"

I'm trying to sensitize you to potential uses like this:

def bar():
    a = "a1"
    print collapse foo("b1")

> but I suppose you could treat it like the 'global' keyword

>> def bar():
>>     a="a1"
>>     collapse foo # forces foo to always collapse when called within bar
>>     foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # still collapsed
>>     print b, c # now prints "b2", "a2"

Would make more sense if the collapse keyword was at the module level. > >> [Alternative 3 ... bigger than merely collapsing scope] > >> (3) Add macros. We still have to figure out how to limit their obfuscation. > >> Attempts to detail that goal seem to get sidetracked. > > > No, the problem is not how to limit the obfuscation. The problem is > > the same as for (2), only more so: nobody has given even a *remotely* > > plausible mechanism for how exactly you would get code executed at > > compile time. > > macros can (and *possibly* should) be evaluated at run-time.
We must still have very different views on what a macro is. After a macro is run, there is new syntax that needs to be parsed and compiled to bytecode. While Python frequently switches between compile time and run time, anything that requires invoking the compiler each time a macro is used will be so slow that nobody will want to use it. (Python's compiler is very slow, and it's even slower in alternate implementations like Jython and IronPython.) > Compile time should be possible (there is an interpreter running) and > faster, but ... is certainly not required. OK, now you *must* look at the Boo solution. http://boo.codehaus.org/Syntactic+Macros > Even if the macros just rerun the same boilerplate code less efficiently, > it is still good to have that boilerplate defined once, instead of cutting > and pasting. Or, at least, it is better *if* that once doesn't become > unreadable in the process. I am unable to assess the value of this mechanism unless you make a concrete proposal. You seem to have something in mind but you're not doing a good job getting it into mine... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 00:03:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 00:04:03 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <79990c6b05042614406bd8f95@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> <79990c6b05042614406bd8f95@mail.gmail.com> Message-ID: <ca471dc20504261503767cd117@mail.gmail.com> [Paul Moore] > *YUK* I spent a long time staring at this and wondering "where did b come from?" > > You'd have to come up with a very compelling use case to get me to like this. I couldn't have said it better. I said it longer though.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Apr 27 00:18:45 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed Apr 27 00:20:27 2005 Subject: [Python-Dev] Re: a few SF bugs which can (probably) be closed References: <Pine.LNX.4.58.0504222034070.772@bagira> Message-ID: <d4mee6$brf$1@sea.gmane.org> "Ilya Sandler" <ilya@bluefir.net> wrote in message news:Pine.LNX.4.58.0504222034070.772@bagira... > Here a few sourceforge bugs which can probably be closed: > > [ 1168983 ] : ftplib.py string index out of range > Original poster reports that the problem disappeared after a patch > committed by Raymond Not clear to me if this is really finished or not. Leaving for Raymond or ... . Closed 3 below. > [ 1178863 ] Variable.__init__ uses self.set(), blocking specialization > seems like a dup of 1178872 Closed latter. > [ 415492 ] Compiler generates relative filenames > seems to have been fixed at some point. I could not reproduce it with > python2.4 > > [ 751612 ] smtplib crashes Windows Kernal. > Seems like an obvious Windows bug (not python's bug) and seems to be > unreproducible Terry J. Reedy From jimjjewett at gmail.com Wed Apr 27 01:42:46 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed Apr 27 01:42:49 2005 Subject: [Python-Dev] scope-collapse Message-ID: <fb6fbf56050426164269d95a90@mail.gmail.com> [Jim Jewett] >> >> (2) Add a way to say "Make this function I'm calling use *my* locals >> >> and globals." This seems to meet all the agreed-upon-as-good use >> >> cases, but there is disagreement over how to sensibly write it. [Guido] >> > What happens to names that have a >> > different meaning in each scope? [Jim] >> Programming error. Same name ==> same object. > Sounds like a recipe for bugs to me. At the very least it is a total > breach of abstraction, which is the fundamental basis of the > relationship between caller and callee in normal circumstances. Yes. 
Collapsing scope is not a good idea in general. But there is no way to avoid (some version of) it under the thunk or even the resource-manager suggestions. I interpret that to mean that either The code *can* get ugly and you rely on conventions. or These constructs should not be added to the language. The pretend-it-is-a-generator proposals try to specify that only certain names will be shared, in only certain ways. That might work (practicality beats purity) but I suspect it will evolve into a wart. It won't be quite strong enough to solve the problem completely (particularly with multiple blocks), but it will be strong enough to obfuscate when mishandled. >> Yes. The callee does not even get to see its normal namespace. >> Therefore, the callee does not get to use its normal name resolution. > This seems to me to repeat all the mistakes of the dynamic scoping > of early Lisps (including GNU Emacs Lisp I believe). With one exception -- the caller must state explicitly that the collapse is happening, and even then, it only goes down one level at a time. Still an ugly tool, but at least not an ugly surprise. > It really strikes me as an endless source of errors that these > blended-scope callees (in your proposal) are ordinary > functions/methods, which means that they can *also* be called without > blending scopes. Having special syntax to define a callee intended for > scope-blending seems much more appropriate (even if there's also > special syntax at the call site). This might well be a good restriction. The number of times it causes annoyance (why do I have to code this twice?) should be outweighed by the number of times it saves a surprise (oops -- those functions both defined the same keyword argument). >> If the name normally resolves in locals (often inlined to a tuple, today), >> it looks in the shared scope, which is "owned" by the caller. This is >> different from a free variable only because the callee can write to this >> dictionary. > Aha!
This suggests that a blend-callee needs to use different bytecode > to avoid doing lookups in the tuple of optimized locals Yes. I believe the translation is mechanical, so that the compiler could choose (or create) the right version based on the caller, but ... I agree that making them a separate kind of callable would simplify things. > (I meant this to read "does not make sense for all callables".) > (And I presume you read it that way. :-) nah... I think it takes special justification to do use anything but duck typing in python. If functions can intersperse with boilerplate, than other callables (and even other suites, such as class definitions) should be able to do the same. But I also agree that it makes sense to wait until it can be done sensibly. Just as @decorator only applies to functions (even if the specific decorator could accept something else), this interspersing should probably not apply to non-functions until the "but I can't use a function" use cases are clear. >> For a given simple algorithm, interpeted python is generally slower >> than compiled C, but we write in python anyhow -- it is fast enough, >> and has other advantages. The same is true of anything that lets >> me not cut-and-paste. > Whatever. Any new feature that causes a measurable slowdown for code > that does *not* need the feature has a REALLY hard time getting > accepted, Agreed. But the scope-collapse penalty is restricted to the caller (which ordered the collapse) and the immediate callees during the collapsed call. >> I'm not sure I understand this. The preferred way would be >> to just stick the keyword before the call. Using 'collapse', it >> would look like: # (Added comment to make the ugliess potential more explicit) >> def foo(b): # Yes, parameters are in the namespace. 
>>     c=a
>> def bar():
>>     a="a1"
>>     collapse foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # Not collapsed this time
>>     print b, c # still prints "b1", "a1"

> I'm trying to sensitize you to potential uses like this:
> def bar():
>     a = "a1"
>     print collapse foo("b1")

In this case, it would print None, as foo didn't bother to return anything.

> but I suppose you could treat it like the 'global' keyword

>> def bar():
>>     a="a1"
>>     collapse foo # forces foo to always collapse when called within bar
>>     foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # still collapsed
>>     print b, c # now prints "b2", "a2"

> Would make more sense if the collapse keyword was at the module level.

??? Are you suggesting that everything defined in the module must live in a single namespace, just because the collapse was wanted in one place?

-jJ

From charles.hartman at conncoll.edu Mon Apr 25 13:53:58 2005
From: charles.hartman at conncoll.edu (Charles Hartman)
Date: Wed Apr 27 01:49:08 2005
Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug
In-Reply-To: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com>
References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com>
Message-ID: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu>

> Someone should think about rewriting the zipfile module to be less
> hideous, include a repair feature, and be up to date with the latest
> specifications <http://www.pkware.com/company/standards/appnote/>.

-- and allow *deleting* a file from a zipfile. As far as I can tell, you now can't (except by rewriting everything but that to a new zipfile and renaming). Somewhere I saw a patch request for this, but it was languishing, a year or more old. Or am I just totally missing something?
Charles Hartman

From greg.ewing at canterbury.ac.nz Wed Apr 27 01:53:07 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 01:53:27 2005
Subject: [Python-Dev] atexit missing an unregister method
In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl>
References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl>
Message-ID: <426ED463.6040800@canterbury.ac.nz>

Nick Jacobson wrote:
> But while you can mark functions to be called with the 'register'
> method, there's no 'unregister' method to remove them from the stack of
> functions to be called.

You can always build your own mechanism for managing cleanup functions however you want, and register a single atexit() handler to invoke it. I don't think there's any need to mess with the way atexit() currently works.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From fumanchu at amor.org Wed Apr 27 02:04:54 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Wed Apr 27 02:03:17 2005
Subject: [Python-Dev] Re: scope-collapse (was: anonymous blocks)
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>

[Jim Jewett]
> (2) Add a way to say "Make this function I'm calling
> use *my* locals and globals." This seems to meet all
> the agreed-upon-as-good use cases, but there is disagreement
> over how to sensibly write it. The calling function is
> the place that could get surprised, but people who want
> thunks seem to want the specialness in the
> called function.

[Guido]
> I think there are several problems with this. First, it looks
> difficult to provide semantics that cover all the corners for the
> blending of two namespaces. What happens to names that have a
> different meaning in each scope?

[Jim]
> Programming error. Same name ==> same object.
[Guido] > Sounds like a recipe for bugs to me. At the very least it is a total > breach of abstraction, which is the fundamental basis of the > relationship between caller and callee in normal circumstances. The > more I understand your proposal the less I like it. [Jim] > If a function is using one of _your_ names for something > incompatible, then don't call that function with collapsed > scope. The same "problem" happens with globals today. > Code in module X can break if module Y replaces (not shadows, > replaces) a builtin with an incompatible object. > > Except ... > (E.g. 'self' when calling a method of > another object; or any other name clash.) > > The first argument of a method *might* be a special case. It seems > wrong to unbind a bound method. On the other hand, resource > managers may well want to use unbound methods for the called > code. Urg. Please, no. If you're going to blend scopes, the callee should have nothing passed to it. Why would you possibly want it when you already have access to both scopes which are to be blended? [Guido] > Are the globals also blended? How? [Jim] > Yes. The callee does not even get to see its normal namespace. > Therefore, the callee does not get to use its normal name > resolution. [Guido] > Another breach of abstraction: if a callee wants to use an imported > module, the import should be present in the caller, not in the callee. Yes, although if a callee wants to use a module that has not been imported by the caller, it should be able to do so with a new import statement (which then affects the namespace of the caller). [Guido again] > It really strikes me as an endless source of errors that these > blended-scope callees (in your proposal) are ordinary > functions/methods, which means that they can *also* be called without > blending scopes. Having special syntax to define a callee intended for > scope-blending seems much more appropriate (even if there's also > special syntax at the call site). Agreed. 
They shouldn't be ordinary functions at all, in my mind. That means one can also mark the actual call on the callee side, instead of the caller side; in other words, you wouldn't need a "collapse" keyword at all if you formed the callee with a "defmacro" or other (better ;) keyword. I guess if y'all find it surprising, you could keep "collapse".

[Jim]
> If the name normally resolves in locals (often inlined to a tuple, today),
> it looks in the shared scope, which is "owned" by the caller. This is
> different from a free variable only because the callee can write to this
> dictionary.

[Guido]
> Aha! This suggests that a blend-callee needs to use different bytecode
> to avoid doing lookups in the tuple of optimized locals, since the
> indices assigned to locals in the callee and the caller won't match up
> except by miracle.

[Guido]
> Third, I expect that if we solve the first two
> problems, we'll still find that for an efficient implementation we
> need to modify the bytecode of the called function.

[Jim]
> Absolutely. Even giving up the XXX_FAST optimizations would
> still require new bytecode to not assume them. (Deoptimizing
> *all* functions, in *all* contexts, is not a sensible tradeoff.)

I'm afraid I'm only familiar with CPython, but wouldn't callee locals just map to XXX_FAST indices via the caller's co_names tuple? Remapping jump targets, on the other hand, would be something to quickly ban. You shouldn't be able to write trash like:

defmacro keepgoing:
    else:
        continue

[Guido]
> Try to make sure that it can be used in a "statement context"
> as well as in an "expression context". ...
> I'm trying to sensitize you to potential uses like this:
>
> def foo(b):
>     c=a
> def bar():
>     a = "a1"
>     print collapse foo("b1")

If the callees aren't real functions and don't get passed anything, the "sensible" approach would be to disallow expression-context use of them.
Rewrite the above to:

defcallee foo:
    c = a

def bar():
    a = "a1"
    collapse foo
    print c

Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org

From greg.ewing at canterbury.ac.nz Wed Apr 27 02:08:15 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 02:08:30 2005
Subject: [Python-Dev] defmacro
In-Reply-To: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <426ED7EF.1090508@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> This doesn't feel right to me. By that argument, people would want
> to "improve"
>
> (mapcar (lambda (x) (car x)) list-of-lists)
>
> to
>
> (mapcar list-of-lists (x) (car x))

I didn't claim that people would feel compelled to eliminate all uses of lambda; only that, in those cases where they *do* feel so compelled, they might not if lambda weren't such a long word. I was just trying to understand why Smalltalkers seem to get on fine without macros, whereas Lispers feel they are needed. I think Smalltalk's lightweight block-passing syntax has a lot to do with it.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From jimjjewett at gmail.com Wed Apr 27 02:12:19 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed Apr 27 02:12:21 2005
Subject: [Python-Dev] defmacro (was: Anonymous blocks)
Message-ID: <fb6fbf5605042617126c217664@mail.gmail.com>

>> >> (3) Add macros. We still have to figure out how to limit their obfuscation.
>> > nobody has given even a *remotely*
>> > plausible mechanism for how exactly you would get code executed at
>> > compile time.
>> macros can (and *possibly* should) be evaluated at run-time.
> We must still have very different views on what a macro is.

In a compiled language, it is (necessarily) compiled. In an interpreted language, it doesn't have to be.

> After a macro is run, there is new syntax that needs to be parsed
> and compiled to bytecode. ... anything that requires invoking the
> compiler each time a macro is used will be so slow that nobody will
> want to use it.

I had been thinking that the typical use would be during function (or class) definition. The overhead would be similar to that of decorators, and confined mostly to module loading. I do see your point that putting a macro call inside a function could be slow -- but I'm not sure that is a reason to forbid it.

>> Even if the macros just rerun the same boilerplate code less efficiently,
>> it is still good to have that boilerplate defined once, instead of cutting
>> and pasting. Or, at least, it is better *if* that once doesn't become
>> unreadable in the process.

> I am unable to assess the value of this mechanism unless you make a
> concrete proposal. You seem to have something in mind but you're not
> doing a good job getting it into mine...

I'm not confident that macros are even a good idea; I just don't want a series of half-macros. That said, here is a strawman.

defmacro boiler1(name, rejects):
    def %(name) (*args):
        for a in args:
            if a in %(rejects):
                print "Don't send me %s" % a
...
boiler1(novowels, "aeiouy")
boiler2(nokey5, "jkl")

I'm pretty sure that a real version should accept suites instead of just arguments, and the variable portion might even be limited to suites (as in the thunk discussion). It might even be reasonable to mark macro calls as different from function calls.

template novowels from boiler1("aeiou"):
    <suite>

but I can't help thinking that multiple suites should be possible, and then they should be named, and ... that spurred at least one objection.
http://mail.python.org/pipermail/python-dev/2005-April/052949.html -jJ From greg.ewing at canterbury.ac.nz Wed Apr 27 02:13:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:13:24 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426D358C.70509@ieee.org> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> Message-ID: <426ED912.40603@canterbury.ac.nz> Shane Holloway (IEEE) wrote: > So, the question comes back to what are blocks in the language > extensibility case? To me, they would be something very like a code > object returned from the compile method. To this we would need to > attach the globals and locals where the block was from. Then we could > use the normal exec statement to invoke the block whenever needed. There's no need for all that. They're just callable objects. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Apr 27 02:18:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:18:21 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <fb6fbf5605042617126c217664@mail.gmail.com> References: <fb6fbf5605042617126c217664@mail.gmail.com> Message-ID: <426EDA3E.7090208@canterbury.ac.nz> Jim Jewett wrote: > I had been thinking that the typical use would be during function (or > class) definition. The overhead would be similar to that of decorators, > and confined mostly to module loading. But that's too late, unless you want to resort to bytecode hacking. By the time the module is loaded, its source code has long since been compiled into code objects. 
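For reference, the half of Jim's strawman that runs at module-load time can already be approximated in plain Python by invoking the compiler explicitly with exec -- roughly what such a "macro" would desugar to. A hedged sketch (the helper name and template are illustrative, not from the thread; written in modern print-function syntax):

```python
def make_rejector(name, rejects):
    # Build the function's source from a template, compile it once
    # at definition time, and pull the result out of a scratch namespace.
    src = "\n".join([
        "def %s(*args):" % name,
        "    out = []",
        "    for a in args:",
        "        if a in %r:" % rejects,
        "            out.append(\"Don't send me \" + a)",
        "    return out",
    ])
    namespace = {}
    exec(src, namespace)
    return namespace[name]

novowels = make_rejector("novowels", "aeiouy")
print(novowels("x", "a"))   # ["Don't send me a"]
```

The compiler runs once per definition, which matches Jim's "overhead similar to decorators, confined mostly to module loading" -- the contentious case is only a macro call inside a hot loop.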
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Wed Apr 27 02:18:48 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 02:18:50 2005 Subject: [Python-Dev] scope-collapse In-Reply-To: <fb6fbf56050426164269d95a90@mail.gmail.com> References: <fb6fbf56050426164269d95a90@mail.gmail.com> Message-ID: <ca471dc2050426171860ff388f@mail.gmail.com> [Jim jewett] > The pretend-it-is-a-generator proposals try to specify that only > certain names will be shared, in only certain ways. Huh? I don't see it this way. There is *no* sharing between the frame of the generator and the frame of the block. The block is a permanent part of the frame surrounding the with-statement, so all names are shared there. > > Would make more sense if the collapse keyword was at the module level. > > ??? Are you suggesting that everything defined in the module must live > in a single namespace, just because the collapse was wanted in one place? No, I was just proposing putting 'collapse foo' in the module, which would mean that (a) the definition of foo is intended to be a macro, and (b) all uses of foo are intended to call that macro. But I still think this whole proposal is built on quicksand, so don't take that suggestion too seriously. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 02:24:44 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 02:24:47 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> Message-ID: <ca471dc2050426172429fb52b@mail.gmail.com> > > Someone should think about rewriting the zipfile module to be less > > hideous, include a repair feature, and be up to date with the latest > > specifications <http://www.pkware.com/company/standards/appnote/>. > > -- and allow *deleting* a file from a zipfile. As far as I can tell, > you now can't (except by rewriting everything but that to a new zipfile > and renaming). Somewhere I saw a patch request for this, but it was > languishing, a year or more old. Or am I just totally missing > something? Please don't propose a grand rewrite (even it's only a single module). Given that the API is mostly sensible, please propose gradual refactoring of the implementation, perhaps some new API methods, and so on. Don't throw away the work that went into making it work in the first place! http://www.joelonsoftware.com/articles/fog0000000069.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Apr 27 02:27:17 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:27:32 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> References: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> Message-ID: <426EDC65.3070603@canterbury.ac.nz> flaig@sanctacaris.net wrote: > Actually I was thinking of something related the other day: > Wouldn't it be nice to be able to define/overload not only > operators but also control structures? 
That triggered off something in my mind that's somewhat different from what you went on to talk about. So far we've been talking about ways of defining new syntax. But operator overloading isn't creating new syntax, it's giving a new meaning to existing syntax. So the statement equivalent of that would be defining new meanings for *existing* control structures! For example, when you write

    while expr:
        ...

it gets turned into

    expr.__while__(thunk)

etc. No, I'm not really serious about this -- it was just a wild thought!

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From jimjjewett at gmail.com Wed Apr 27 02:30:48 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed Apr 27 02:30:51 2005
Subject: [Python-Dev] Re: scope-collapse (was: anonymous blocks)
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>
References: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>
Message-ID: <fb6fbf5605042617301cf399e2@mail.gmail.com>

On 4/26/05, Robert Brewer <fumanchu@amor.org> wrote:
> [Jim]
> > Absolutely. Even giving up the XXX_FAST optimizations would
> > still require new bytecode to not assume them.
> I'm afraid I'm only familiar with CPython, but wouldn't callee locals
> just map to XXX_FAST indices via the caller's co_names tuple?

Only if all names are in the caller's tuple. In your example at http://mail.python.org/pipermail/python-dev/2005-April/052924.html two of the callees wanted a shared old_children, but that name didn't appear in the caller, so I wouldn't expect the compiler to make room for it in the tuple.
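Jim's point is easy to check in CPython by inspecting the code objects directly: fast-local slots are assigned per code object at compile time, so a name the caller never binds has no slot for a collapsed callee's XXX_FAST opcodes to map onto (a small illustrative sketch, not from the thread):

```python
def caller():
    old = 1             # 'old' gets a fast-local slot in caller
    return old

def callee():
    old_children = []   # this name is bound only in callee
    return old_children

# co_varnames lists each code object's fast locals; 'old_children'
# simply has no index in the caller's tuple.
print('old' in caller.__code__.co_varnames)            # True
print('old_children' in caller.__code__.co_varnames)   # False
print('old_children' in callee.__code__.co_varnames)   # True
```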
-jJ

From jcarlson at uci.edu Wed Apr 27 02:42:48 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 02:43:25 2005
Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse
In-Reply-To: <ca471dc20504261502560c128f@mail.gmail.com>
References: <fb6fbf5605042614308198cb6@mail.gmail.com> <ca471dc20504261502560c128f@mail.gmail.com>
Message-ID: <20050426165507.6401.JCARLSON@uci.edu>

[Guido]
> OK, now you *must* look at the Boo solution.
> http://boo.codehaus.org/Syntactic+Macros

That is an interesting solution; requiring macro writers to actually write an AST modifier seems pretty reasonable to me. Whether we want macros or not... <shrug>

- Josiah

From sabbey at u.washington.edu Wed Apr 27 02:45:01 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Wed Apr 27 02:45:05 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426E5AEB.3030707@gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com>
Message-ID: <Pine.A41.4.61b.0504261605490.142804@dante68.u.washington.edu>

Nick Coghlan wrote:
> Accordingly, I would like to suggest that 'with' revert to something
> resembling the PEP 310 definition:
>
>     resource = EXPR
>     if hasattr(resource, "__enter__"):
>         VAR = resource.__enter__()
>     else:
>         VAR = None
>     try:
>         try:
>             BODY
>         except:
>             raise # Force realisation of sys.exc_info() for use in __exit__()
>     finally:
>         if hasattr(resource, "__exit__"):
>             VAR = resource.__exit__()
>         else:
>             VAR = None
>
> Generator objects could implement this protocol, with the following
> behaviour:
>
>     def __enter__():
>         try:
>             return self.next()
>         except StopIteration:
>             raise RuntimeError("Generator exhausted, unable to enter with block")
>
>     def __exit__():
>         try:
>             return self.next()
>         except StopIteration:
>             return None
>
>     def __except__(*exc_info):
>         pass
>
>     def __no_except__():
>         pass

One peculiarity of this is that every other
'yield' would not be allowed in the 'try' block of a try/finally statement (TBOATFS). Specifically, a 'yield' reached through the call to __exit__ would not be allowed in the TBOATFS. It gets even more complicated when one considers that 'next' may be called inside BODY. In such a case, it would not be sufficient to just disallow every other 'yield' in the TBOATFS. It seems like 'next' would need some hidden parameter that indicates whether 'yield' should be allowed in the TBOATFS. (I assume that if a TBOATFS contains an invalid 'yield', then an exception will be raised immediately before its 'try' block is executed. Or would the exception be raised upon reaching the 'yield'?)

> These are also possible by combining a normal for loop with a non-looping
> with (but otherwise using Guido's exception injection semantics):
>
>     def auto_retry(attempts):
>         success = [False]
>         failures = [0]
>         except = [None]
>
>         def block():
>             try:
>                 yield None
>             except:
>                 failures[0] += 1
>             else:
>                 success[0] = True
>
>         while not success[0] and failures[0] < attempts:
>             yield block()
>         if not success[0]:
>             raise Exception # You'd actually propagate the last inner failure
>
>     for attempt in auto_retry(3):
>         with attempt:
>             do_something_that_might_fail()

I think your example above is a good reason to *allow* 'with' to loop. Writing 'auto_retry' with a looping 'with' would be pretty straightforward and intuitive. But the above, non-looping 'with' example requires two fairly advanced techniques (inner functions, variables-as-arrays trick) that would probably be lost on some python users (and make life more difficult for the rest). But I do see the appeal to having a non-looping 'with'. In many (most?) uses of generators, 'for' and looping 'with' could be used interchangeably. This seems ugly -- more than one way to do it and all that.
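For comparison, the behaviour the generator-based auto_retry versions are emulating can be written today with plain control flow; this is the baseline any 'with'-based variant has to beat for readability (a sketch, not from the thread; assumes at least one attempt):

```python
def auto_retry(attempts, func, *args):
    # Call func up to 'attempts' times; re-raise the last failure
    # if every attempt fails.
    last_exc = None
    for _ in range(attempts):
        try:
            return func(*args)
        except Exception as exc:
            last_exc = exc
    raise last_exc

calls = []
def flaky():
    calls.append(None)
    if len(calls) < 3:
        raise ValueError("transient")
    return "ok"

print(auto_retry(3, flaky))   # "ok", on the third attempt
```

The arrays-as-variables trick and the inner block() function disappear entirely; what the 'with'-based spellings buy is keeping the retried suite inline at the call site instead of hoisting it into a function.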
-Brian

From greg.ewing at canterbury.ac.nz Wed Apr 27 02:53:04 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 02:53:21 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426E5AEB.3030707@gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com>
Message-ID: <426EE270.1080303@canterbury.ac.nz>

Nick Coghlan wrote:
> def template():
>     # pre_part_1
>     yield None
>     # post_part_1
>     yield None
>     # pre_part_2
>     yield None
>     # post_part_2
>     yield None
>     # pre_part_3
>     yield None
>     # post_part_3
>
> def user():
>     block = template()
>     with block:
>         # do_part_1
>     with block:
>         # do_part_2
>     with block:
>         # do_part_3

That's an interesting idea, but do you have any use cases in mind? I worry that it will be too restrictive to be really useful. Without the ability for the iterator to control which blocks get executed and when, you wouldn't be able to implement something like a case statement, for example.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From bob at redivi.com Wed Apr 27 03:00:43 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Apr 27 03:00:46 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <ca471dc2050426172429fb52b@mail.gmail.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> Message-ID: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> On Apr 26, 2005, at 8:24 PM, Guido van Rossum wrote: >>> Someone should think about rewriting the zipfile module to be less >>> hideous, include a repair feature, and be up to date with the latest >>> specifications <http://www.pkware.com/company/standards/appnote/>. >> >> -- and allow *deleting* a file from a zipfile. As far as I can tell, >> you now can't (except by rewriting everything but that to a new >> zipfile >> and renaming). Somewhere I saw a patch request for this, but it was >> languishing, a year or more old. Or am I just totally missing >> something? > > Please don't propose a grand rewrite (even it's only a single module). > Given that the API is mostly sensible, please propose gradual > refactoring of the implementation, perhaps some new API methods, and > so on. Don't throw away the work that went into making it work in the > first place! Well, I didn't necessarily mean it should be thrown away and started from scratch -- however, once you get all the ugly out of it, there's not much left! Obviously there's something wrong with the way it's written if it took years and *several passes* to correctly identify and fix a simple format character case bug. Most of this can be blamed on the struct module, which is more obscure and error-prone than writing the same code in C. One of the most useful things that could happen to the zipfile module would be a stream interface for both reading and writing. 
Right now it's slow and memory hungry when dealing with large chunks. The use case that led me to fix this bug is a tool that archives video to zip files of targa sequences with a reference QuickTime movie... so I end up with thousands of bite-sized chunks.

This >2GB bug really caused me some grief in that I didn't test with such large sequences because I didn't have any. I didn't end up finding out about it until months later because the client *ignored* the exceptions raised by the GUI and came back to me with broken zip files. Fortunately the TOC in a zip file can be reconstructed from an otherwise pristine stream. Of course, I had to rewrite half of the zipfile module to come up with such a recovery program, because it's not designed well enough to let me build such a tool on top of it.

Another "bug" I ran into was that it has some crazy default for the ZipInfo record: it assumes the platform ("create_system") is Windows regardless of where you are! This caused some really subtle and annoying issues with some unzip tools (of course, on everyone's machines except mine). Fortunately someone was able to figure out why and send me a patch, but it was completely unexpected and I didn't see such craziness documented anywhere. If it weren't for this patch, it'd either still be broken, or I'd have switched to some other way of creating archives!

The zipfile module is good enough to create input files for zipimport, which is well tested and generally works -- barring the fact that zipimport has quite a few rough edges of its own. I certainly wouldn't recommend it for any heavy duty tasks in its current state.
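The recovery trick mentioned above -- reconstructing the TOC from an otherwise pristine stream -- amounts to walking the local file headers sequentially instead of trusting the central directory at the end. A minimal sketch of that idea (illustrative, not Bob's actual tool; assumes stored entries with sizes in the local header, which is what ZipFile.writestr emits by default):

```python
import io
import struct
import zipfile

# ZIP local file header: PK\x03\x04, version, flags, method, time, date,
# crc32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct('<4sHHHHHIIIHH')

def scan_names(data):
    """List member names by walking local headers, ignoring the TOC."""
    names, pos = [], 0
    while data[pos:pos + 4] == b'PK\x03\x04':
        fields = LOCAL_HEADER.unpack_from(data, pos)
        csize, name_len, extra_len = fields[7], fields[9], fields[10]
        start = pos + LOCAL_HEADER.size
        names.append(data[start:start + name_len].decode('utf-8'))
        pos = start + name_len + extra_len + csize
    return names

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('a.txt', 'hello')
    zf.writestr('b.txt', 'world')
print(scan_names(buf.getvalue()))   # ['a.txt', 'b.txt']
```

The scan stops naturally at the first central-directory record (signature PK\x01\x02), so it works even when the archive was truncated before the TOC was written.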
-bob From alan.mcintyre at esrgtech.com Wed Apr 27 03:48:06 2005 From: alan.mcintyre at esrgtech.com (Alan McIntyre) Date: Wed Apr 27 03:48:10 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <426EEF56.8090905@esrgtech.com> Bob Ippolito wrote: > One of the most useful things that could happen to the zipfile module > would be a stream interface for both reading and writing. Right now > it's slow and memory hungry when dealing with large chunks. The use > case that lead me to fix this bug is a tool that archives video to zip > files of targa sequences with a reference QuickTime movie.. so I end > up with thousands of bite sized chunks. While it's probably not an improvement on the order of magnitude you're looking for, there's a patch (1121142) that lets you read large items out of a zip archive via a file-like object. I'm occasionally running into the 2GB problem myself, so if any changes are made to get around that I can at least help out by testing it against some "real-life" data sets. Alan From greg.ewing at canterbury.ac.nz Wed Apr 27 05:30:17 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 05:30:34 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <ca471dc20504261503767cd117@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> <79990c6b05042614406bd8f95@mail.gmail.com> <ca471dc20504261503767cd117@mail.gmail.com> Message-ID: <426F0749.60402@canterbury.ac.nz> I don't think this proposal has any chance as long as it's dynamically scoped. It mightn't be so bad if it were lexically scoped, i.e. a special way of defining a function so that it shares the lexically enclosing scope. 
This would be implementable, since the compiler has all the necessary information about both scopes available. Although it might be better to have some sort of "outer" declaration for rebinding in the enclosing scope, instead of doing it on a whole-function basis. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Apr 27 05:34:37 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 05:34:54 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <e04bdf31050426092230502ab1@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> <e04bdf31050426092230502ab1@mail.gmail.com> Message-ID: <426F084D.4000503@canterbury.ac.nz> Facundo Batista wrote: >>Aargh! Bad explanation. Or at least you're missing something: > > Not really. It's easier for me to show that id(3) is always the same > and id([]) not, and let the kids see that's not so easy and you'll > have to look deeper if you want to know better. I think Guido was saying that it's important for them to know that mutable objects are never in danger of being shared, so you should at least tell them that much. Otherwise they may end up worrying unnecessarily that two of their lists might get shared somehow behind their back. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From stephen at xemacs.org Wed Apr 27 05:58:56 2005 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed Apr 27 05:59:01 2005 Subject: [Python-Dev] defmacro In-Reply-To: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> (Andrew Koenig's message of "Tue, 26 Apr 2005 12:09:58 -0400") References: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> Message-ID: <87ekcwna4f.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Andrew" == Andrew Koenig <ark-mlist@att.net> writes: Andrew> Welllll.... Shouldn't you have written Andrew> (mapcar car list-of-lists) Andrew> or am I missing something painfully obvious? Greg should have written (with-file "foo/blarg" 'do-something-with) too. I guess I should have used do-something-with, too. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From shane at hathawaymix.org Wed Apr 27 06:15:16 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed Apr 27 06:15:19 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <426F11D4.1080707@hathawaymix.org> Bob Ippolito wrote: > The zipfile module is good enough to create input files for zipimport.. > which is well tested and generally works -- barring the fact that > zipimport has quite a few rough edges of its own. I certainly wouldn't > recommend it for any heavy duty tasks in its current state. That's interesting because Java seems to suffer from similar problems. In the early days of Java, although a jar file was a zip file, Java wouldn't read jar files created by the standard zip utilities I used. I think the distinction was that the jar utility stored the files uncompressed. 
Java is fixed now, but I think it illustrates that zip files are non-trivial. BTW, I don't think the jar utility can delete files from a zip file either. ;-) Shane From gvanrossum at gmail.com Wed Apr 27 06:19:47 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 06:19:55 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <ca471dc205042621193ddcdaad@mail.gmail.com> > > Please don't propose a grand rewrite (even it's only a single module). > > Given that the API is mostly sensible, please propose gradual > > refactoring of the implementation, perhaps some new API methods, and > > so on. Don't throw away the work that went into making it work in the > > first place! > > Well, I didn't necessarily mean it should be thrown away and started > from scratch Well, you *did* say "rewrite". :-) > -- however, once you get all the ugly out of it, there's > not much left! Obviously there's something wrong with the way it's > written if it took years and *several passes* to correctly identify and > fix a simple format character case bug. Most of this can be blamed on > the struct module, which is more obscure and error-prone than writing > the same code in C. I think the reason is different -- it just hasn't had all that much use beyond the one use case for which it was written (zipping up the Python library). Also, don't underestimate the baroqueness of the zip spec. > One of the most useful things that could happen to the zipfile module > would be a stream interface for both reading and writing. Right now > it's slow and memory hungry when dealing with large chunks. 
The use > case that led me to fix this bug is a tool that archives video to zip > files of targa sequences with a reference QuickTime movie.. so I end up > with thousands of bite-sized chunks. Sounds like a use case nobody else has tried yet. > This >2GB bug really caused me some grief in that I didn't test with > such large sequences because I didn't have any. I didn't end up > finding out about it until months later because the client *ignored* the > exceptions raised by the GUI and came back to me with broken zip files. > Fortunately the TOC in a zip file can be reconstructed from an > otherwise pristine stream. Of course, I had to rewrite half of the > zipfile module to come up with such a recovery program, because it's > not designed well enough to let me build such a tool on top of it. Given more typical use cases for zip files (sending around collections of source files) I'm not surprised that a bug that only occurs for files >2GB remained hidden for so long. I don't remember if you have Python CVS permissions, but you sound like you really know the module as well as the zip file spec, so I'm hoping that you'll find the time to do some reconstructive surgery on the zip module for Python 2.5, without breaking the existing APIs. I like the idea you have for a stream API; I recall that the one time I had to use it I was surprised that the API dealt with files as string buffers exclusively. > Another "bug" I ran into was that it has some crazy default for the > ZipInfo record: it assumes the platform ("create_system") is Windows > regardless of where you are! I vaguely recall that the initial author was a Windows-head; perhaps he didn't realize how useful the module would be on other platforms, or that it would make any difference at all. > This caused some really subtle and
Fortunately someone was able to figure out why > and send me a patch, but it was completely unexpected and I didn't see > such craziness documented anywhere. If it weren't for this patch, it'd > either still be broken, or I'd have switched to some other way of > creating archives! > > The zipfile module is good enough to create input files for zipimport.. > which is well tested and generally works -- barring the fact that > zipimport has quite a few rough edges of its own. I certainly wouldn't > recommend it for any heavy duty tasks in its current state. So, please fix it! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Apr 27 06:31:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 06:32:10 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <ca471dc2050426023629559cab@mail.gmail.com> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> <ca471dc2050426023629559cab@mail.gmail.com> Message-ID: <426F15B1.3020403@canterbury.ac.nz> Guido van Rossum wrote: > I do that all the time without .pth files -- I just put all the common > modules in a package and place the package in the directory containing > the "main" .py files. That's fine as long as you're willing to put all the main .py files together in one directory, with everything else below it, but sometimes it's not convenient to do that. I had a use for this the other night, involving two applications which each consisted of multiple .py files (belonging only to that application) plus some shared ones. I wanted to have a directory for each application containing all the files private to that application. > it's too easy to forget about the .pth file and be > confused when it points to the wrong place. I don't think I would be confused by that. I would consider the .pth file to be a part of the source code of the application, to be maintained along with it. 
If I got an ImportError for one of the shared modules, checking the .pth file would be a natural thing to do -- just as would checking the sys.path munging code if it were being done that way. And a .pth file would be much easier to maintain than the hairy-looking code you need to write to munge sys.path in an equivalent way. > That's also the reason why > I don't use symlinks or $PYTHONPATH for this purpose. Another reason for avoiding that is portability. My first attempt at solving the aforementioned problem used symlinks. Trouble is, it also had to work on Windows running under Virtual PC mounting the source directory from the host system as a file share, and it turns out that reading a unix symlink from the Windows end just returns the contents of the link. Aaarrghh! Braindamage! -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Wed Apr 27 06:47:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 06:47:17 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E3B01.1010007@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> Message-ID: <ca471dc205042621472b1f6edf@mail.gmail.com> > > [Greg Ewing] > >>* It seems to me that this same exception-handling mechanism > >>would be just as useful in a regular for-loop, and that, once > >>it becomes possible to put 'yield' in a try-statement, people > >>are going to *expect* it to work in for-loops as well. [Guido] > > (You can already put a yield inside a try-except, just not inside a > > try-finally.) [Greg] > Well, my point still stands. 
People are going to write
> try-finally around their yields and expect the natural
> thing to happen when their generator is used in a
> for-loop.

Well, the new finalization semantics should take care of that when their generator is finalized -- its __next__() will be called with some exception. But as long as you hang on to the generator, it will not be finalized, which is distinctly different from the desired with-statement semantics.

> > There would still be the difference that a for-loop invokes iter()
> > and a with-block doesn't.
> >
> > Also, for-loops that don't exhaust the iterator leave it
> > available for later use.
>
> Hmmm. But are these big enough differences to justify
> having a whole new control structure? Whither TOOWTDI?

Indeed, but apart from declaring that henceforth the with-statement (by whatever name) is the recommended looping construct and a for-statement is just a backwards compatibility macro, I just don't see how we can implement the necessary immediate cleanup semantics of a with-statement. In order to serve as a resource cleanup statement it *must* have stronger cleanup guarantees than the for-statement can give (if only for backwards compatibility reasons).

> > """
> > The statement:
> >
> >     for VAR in EXPR:
> >         BLOCK
> >
> > does the same thing as:
> >
> >     with iter(EXPR) as VAR:  # Note the iter() call
> >         BLOCK
> >
> > except that:
> >
> > - you can leave out the "as VAR" part from the with-statement;
> > - they work differently when an exception happens inside BLOCK;
> > - break and continue don't always work the same way.
> >
> > The only time you should write a with-statement is when the
> > documentation for the function you are calling says you should.
> > """
>
> Surely you jest. Any newbie reading this is going to think
> he hasn't a hope in hell of ever understanding what is going
> on here, and give up on Python in disgust.

And surely you exaggerate. How about this then:

The with-statement is similar to the for-loop.
Until you've learned about the differences in detail, the only time you should write a with-statement is when the documentation for the function you are calling says you should. > >>I'm seriously worried by the > >>possibility that a return statement could do something other > >>than return from the function it's written in. > > > Let me explain the use cases that led me to throwing that in > > Yes, I can see that it's going to be necessary to treat > return as an exception, and accept the possibility that > it will be abused. I'd still much prefer people refrain > from abusing it that way, though. Using "return" to spell > "send value back to yield statement" would be extremely > obfuscatory. That depends on where you're coming from. To Ruby users it will look completely natural because that's what Ruby uses. (In fact it'll be a while before they appreciate the deep differences between yield in Python and in Ruby.) But I accept that in Python we might want to use a different keyword to pass a value to the generator. I think using 'continue' should work; continue with a value has no precedent in Python, and continue without a value happens to have exactly the right semantics anyway. > > (BTW ReturnFlow etc. aren't great > > names. Suggestions?) > > I'd suggest just calling them Break, Continue and Return. Too close to break, continue and return IMO. > > One last thing: if we need a special name for iterators and > > generators designed for use in a with-statement, how about calling > > them with-iterators and with-generators. > > Except that if it's no longer a "with" statement, this > doesn't make so much sense... Then of course we'll call it after whatever the new statement is going to be called. If we end up calling it the foible-statement, they will be foible-iterators and foible-generators. Anyway, I think I'll need to start writing a PEP. I'll ask the PEP editor for a number. 
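[Editorial aside: the "continue EXPR" idea Guido describes for passing a value back into a generator eventually landed, in different clothing, as generator.send() in PEP 342. A sketch of the round trip it enables, in today's Python:]

```python
# "continue EXPR" as proposed here became generator.send(); the sent
# value appears inside the generator as the result of the yield expression.
def accumulator():
    total = 0
    while True:
        value = yield total   # receives whatever the caller send()s
        if value is None:
            break
        total += value

acc = accumulator()
print(next(acc))      # prime the generator up to the first yield -> 0
print(acc.send(10))   # -> 10
print(acc.send(5))    # -> 15
```

The priming call (the first next()) advances the generator to its first yield; only then can values be sent in, which mirrors the protocol sketched in PEP 340's discussion.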
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 09:30:22 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 09:30:33 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042621472b1f6edf@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> Message-ID: <ca471dc20504270030405f922f@mail.gmail.com> I've written a PEP about this topic. It's PEP 340: Anonymous Block Statements (http://python.org/peps/pep-0340.html). Some highlights:

- temporarily sidestepping the syntax by proposing 'block' instead of 'with'
- __next__() argument simplified to StopIteration or ContinueIteration instance
- use "continue EXPR" to pass a value to the generator
- generator exception handling explained

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason at diamond.name Wed Apr 27 11:44:22 2005 From: jason at diamond.name (Jason Diamond) Date: Wed Apr 27 11:42:41 2005 Subject: [Python-Dev] Another Anonymous Block Proposal Message-ID: <426F5EF6.9050400@diamond.name> Hi. I hope you don't mind another proposal. Please feel free to tear it apart. A limitation of both Ruby's block syntax and the new PEP 340 syntax is the fact that they don't allow you to pass in more than a single anonymous block parameter. If Python's going to add anonymous blocks, shouldn't it do it better than Ruby? What follows is a proposal for a syntax that allows passing multiple, anonymous callable objects into another callable. No new protocols are introduced and none of it is tied to iterators/generators which makes it much simpler to understand (and hopefully simpler to implement).
This is long and the initial syntax isn't ideal so please bear with me as I move towards what I'd like to see. The Python grammar would get one new production:

    do_statement ::= "do" call ":" NEWLINE
                     ( "with" funcname "(" [parameter_list] ")" ":" suite )*

Here's an example using this new "do" statement:

    do process_file(path):
        with process(file):
            for line in file:
                print line

That would translate into:

    def __process(file):
        for line in file:
            print line
    process_file(path, process=__process)

Notice that the name after each "with" keyword is the name of a parameter to the function being called. This will be what allows multiple block parameters. The implementation of `process_file` could look something like:

    def process_file(path, process):
        try:
            f = file(path)
            process(f)
        finally:
            if f:
                f.close()

There's no magic in `process_file`. It's just a function that receives a callable named `process` as a parameter and it calls that callable with one parameter. There's no magic in the post-translated code, either, except for the temporary `__process` definition which shouldn't be user-visible. The magic comes when the pre-translated code gets each "with" block turned into a hidden, local def and passed in as a parameter to `process_file`. This syntax allows for multiple blocks:

    do process_file(path):
        with process(file):
            for line in file:
                print line
        with success():
            print 'file processed successfully!'
        with error(exc):
            print 'an exception was raised during processing:', exc

That's three separate anonymous block parameters with varying number of parameters in each one. This is what `process_file` might look like now:

    def process_file(path, process, success=None, error=None):
        try:
            try:
                f = file(path)
                process(f)
                if success:
                    success()
            except:
                if error:
                    error(sys.exc_info())
                raise
        finally:
            if f:
                f.close()

I'm sure that being able to pass in multiple, anonymous blocks will be a huge advantage.
Here's an example of how Twisted might be able to use multiple block parameters:

    d = do Deferred():
        with callback(data):
            ...
        with errback(failure):
            ...

(After typing that in, I realized the do_statement production needs an optional assignment part.) There's nothing requiring that anonymous blocks be used for looping. They're strictly parameters which need to be callable. They can, of course, be called from within a loop:

    def process_lines(path, process):
        try:
            f = file(path)
            for line in f:
                process(line)
        finally:
            if f:
                f.close()

    do process_lines(path):
        with process(line):
            print line

Admittedly, this syntax is pretty bulky. The "do" keyword is necessary to indicate to the parser that this isn't a normal call--this call has anonymous block parameters. Having to prefix each one of these parameters with "with" is just following the example of "if/elif/else" blocks. An alternative might be to use indentation the way that class statements "contain" def statements:

    do_statement ::= "do" call ":" NEWLINE
                     INDENT
                     ( funcname "(" [parameter_list] ")" ":" suite )*
                     DEDENT

That would turn our last example into this:

    do process_lines(path):
        process(line):
            print line

The example with the `success` and `error` parameters would look like this:

    do process_file(path):
        process(file):
            for line in file:
                print line
        success():
            print 'file processed successfully!'
        error(exc):
            print 'an exception was raised during processing:', exc

To me, that makes it much easier to see that the three anonymous block statements are part of the "do" statement. It would be ideal if we could even lose the "do" keyword. I think that might make the grammar ambiguous, though. If it was possible, we could do this:

    process_file(path):
        process(file):
            for line in file:
                print line
        success():
            print 'file processed successfully!'
        error(exc):
            print 'an exception was raised during processing:', exc

Now the only difference between a normal call and a call with anonymous block parameters would be the presence of the trailing colon.
I could live with the "do" keyword if this can't be done, however. The only disadvantage to this syntax that I can see is that the simple case of opening a file and processing it is slightly more verbose than it is in Ruby. This is Ruby:

    File.open_and_process("testfile", "r") do |file|
      while line = file.gets
        puts line
      end
    end

This would be the Python equivalent:

    do open_and_process("testfile", "r"):
        process(file):
            for line in file:
                print line

It's one extra line in Python (I'm not counting lines that contain nothing but "end" in Ruby) because we have to specify the name of the block parameter. The extra flexibility that the proposed syntax has (being able to pass in multiple blocks) is worth this extra line, in my opinion. If we wanted to optimize even further for this case, however, we could allow for an alternate form of the "do" statement that lets you only specify one anonymous block parameter. Maybe it would look like this:

    do open_and_process("testfile", "r") process(file):
        for line in file:
            print line

I don't really think this is necessary. I don't mind being verbose if it makes things clearer and simpler. Here's another idea: use "def" instead of "with". They'd have to be indented to avoid ambiguity, though:

    do process_file(path):
        def process(file):
            for line in file:
                print line
        def success():
            print 'file processed successfully!'
        def error(exc):
            print 'an exception was raised during processing:', exc

The presence of the familiar def keyword should help people understand what's happening here. Note that I didn't include an example but there's no reason why an anonymous block parameter couldn't return a value which could be used in the function calling the block. Please, be gentle.
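[Editorial aside: since Jason's "do" statement is explicitly sugar for passing locally defined functions by parameter name, the desugared form already runs today without new syntax. A self-contained sketch of that translation in modern Python (the temp-file plumbing and the `events` list are added here for illustration; they are not part of the proposal):]

```python
import sys
import tempfile

# Desugared form of the proposal: each "with" block becomes a local
# function passed to process_file under its parameter name.
def process_file(path, process, success=None, error=None):
    f = None
    try:
        try:
            f = open(path)
            process(f)
            if success:
                success()
        except Exception:
            if error:
                error(sys.exc_info())
            raise
    finally:
        if f:
            f.close()

events = []

def __process(f):            # the hidden def from the translation
    for line in f:
        events.append(line.rstrip())

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\n")

process_file(tmp.name, process=__process,
             success=lambda: events.append("success"))
print(events)  # -> ['alpha', 'beta', 'success']
```

Everything the "do" statement would add is syntactic: the callables, the keyword-argument wiring, and the calling convention are ordinary Python.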
-- Jason From jason at diamond.name Wed Apr 27 12:06:21 2005 From: jason at diamond.name (Jason Diamond) Date: Wed Apr 27 12:06:28 2005 Subject: [Python-Dev] Another Anonymous Block Proposal In-Reply-To: <20050427055231.R7719@familjen.svensson.org> References: <426F5EF6.9050400@diamond.name> <20050427055231.R7719@familjen.svensson.org> Message-ID: <426F641D.1010802@diamond.name> Paul Svensson wrote: > You're not mentioning scopes of local variables, which seems to be > the issue where most of the previous proposals lose their balance > between hairy and pointless... My syntax is just sugar for nested defs. I assumed the scopes of local variables would be identical when using either syntax. Do you have any pointers that go into the issues I'm probably missing? Thanks. -- Jason From fredrik at pythonware.com Wed Apr 27 12:36:47 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Apr 27 12:38:10 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <d4nplu$sgh$1@sea.gmane.org> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html).
> > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained +1 (most excellent) </F> From ncoghlan at gmail.com Wed Apr 27 13:22:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 27 13:23:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <426F7612.6090707@gmail.com> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). > > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained > I'm still trying to build a case for a non-looping block statement, but the proposed enhancements to generators look great. Any further suggestions I make regarding a PEP 310 style block statement will account for those generator changes. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From stephen at xemacs.org Wed Apr 27 13:28:55 2005 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed Apr 27 13:29:00 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426ED7EF.1090508@canterbury.ac.nz> (Greg Ewing's message of "Wed, 27 Apr 2005 12:08:15 +1200") References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> <426ED7EF.1090508@canterbury.ac.nz> Message-ID: <877jiolaq0.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> I didn't claim that people would feel compelled to eliminate Greg> all uses of lambda; only that, in those cases where they Greg> *do* feel so compelled, they might not if lambda weren't Greg> such a long word. Sure, I understood that. It's just that my feeling is that lambda can't "just quote a suite", it brings lots of other semantic baggage with it. Anyway, with dynamic scope, we can eliminate lambda, can't we? Just pass the suites as quoted lists of forms, compute the macro expansion, and eval it. So it seems to me that the central issue us scoping, not preventing evaluation of the suites. In Lisp, macros are a way of temporarily enabling certain amounts of dynamic scoping for all variables, without declaring them "special". It is very convenient that they don't evaluate their arguments, but that is syntactic sugar, AFAICT. In other words, it's the same idea as the "collapse" keyword that was proposed, but with different rules about what gets collapsed, when. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From jim at zope.com Wed Apr 27 13:42:07 2005 From: jim at zope.com (Jim Fulton) Date: Wed Apr 27 13:42:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <426F7A8F.8090109@zope.com> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). > > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained This looks pretty cool. Some observations: 1. It looks to me like a bare return or a return with an EXPR3 that happens to evaluate to None inside a block simply exits the block, rather than exiting a surrounding function. Did I miss something, or is this a bug? 2. I assume it would be a hack to try to use block statements to implement something like interfaces or classes, because doing so would require significant local-variable manipulation. I'm guessing that either implementing interfaces (or implementing a class statement in which the class was created before execution of a suite) is not a use case for this PEP. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ncoghlan at gmail.com Wed Apr 27 13:44:49 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 27 13:44:55 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426EE270.1080303@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com> <426EE270.1080303@canterbury.ac.nz> Message-ID: <426F7B31.2040109@gmail.com> Greg Ewing wrote:
> Nick Coghlan wrote:
>
>> def template():
>>     # pre_part_1
>>     yield None
>>     # post_part_1
>>     yield None
>>     # pre_part_2
>>     yield None
>>     # post_part_2
>>     yield None
>>     # pre_part_3
>>     yield None
>>     # post_part_3
>>
>> def user():
>>     block = template()
>>     with block:
>>         # do_part_1
>>     with block:
>>         # do_part_2
>>     with block:
>>         # do_part_3
>
> That's an interesting idea, but do you have any use cases
> in mind?

I was trying to address a use case which looked something like:

    do_begin()
    # code
    if some_condition:
        do_pre()
        # more code
        do_post()
    do_end()

It's actually doable with a non-looping block statement, but I have yet to come up with a version which isn't as ugly as hell.

> I worry that it will be too restrictive to be really useful.
> Without the ability for the iterator to control which blocks
> get executed and when, you wouldn't be able to implement
> something like a case statement, for example.

We can't write a case statement with a looping block statement either, since we're restricted to executing the same suite whenever we encounter a yield expression. At least the non-looping version offers some hope, since each yield can result in the execution of different code. For me, the main sticking point is that we *already* have a looping construct to drain an iterator - a 'for' loop.
The more different the block statement's semantics are from a regular loop, the more powerful I think the combination will be. Whereas if the block statement is just a for loop with slightly tweaked exception handling semantics, then the potential combinations will be far less interesting. My current thinking is that we would be better served by a block construct that guaranteed it would call __next__() on entry and on exit, but did not drain the generator (e.g. by supplying appropriate __enter__() and __exit__() methods on generators for a PEP 310 style block statement, or __enter__(), __except__() and __no_except__() for the enhanced version posted elsewhere in this rambling discussion). However, I'm currently scattering my thoughts across half-a-dozen different conversation threads. So I'm going to stop doing that, and try to put it all into one coherent post :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From duncan.booth at suttoncourtenay.org.uk Wed Apr 27 14:22:20 2005 From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth) Date: Wed Apr 27 14:22:27 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> Message-ID: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: > Guido van Rossum wrote: >> I've written a PEP about this topic. It's PEP 340: Anonymous Block >> Statements (http://python.org/peps/pep-0340.html). >> > Some observations: > > 1. It looks to me like a bare return or a return with an EXPR3 that > happens > to evaluate to None inside a block simply exits the block, rather > than exiting a surrounding function. Did I miss something, or is > this a bug? 
> No, the return sets a flag and raises StopIteration which should make the
> iterator also raise StopIteration at which point the real return happens.

If the iterator fails to re-raise the StopIteration exception (the spec only says it should, not that it must) I think the return would be ignored but a subsequent exception would then get converted into a return value. I think the flag needs reset to avoid this case. Also, I wonder whether other exceptions from next() shouldn't be handled a bit differently. If BLOCK1 throws an exception, and this causes the iterator to also throw an exception then one exception will be lost. I think it would be better to propagate the original exception rather than the second exception. So something like (added lines to handle both of the above):

      itr = EXPR1
      exc = arg = None
      ret = False
      while True:
          try:
              VAR1 = next(itr, arg)
          except StopIteration:
              if exc is not None:
                  if ret:
                      return exc
                  else:
                      raise exc  # XXX See below
              break
    +     except:
    +         if ret or exc is None:
    +             raise
    +         raise exc  # XXX See below
    +     ret = False
          try:
              exc = arg = None
              BLOCK1
          except Exception, exc:
              arg = StopIteration()

From ncoghlan at iinet.net.au Wed Apr 27 15:27:35 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Apr 27 15:33:11 2005 Subject: [Python-Dev] Integrating PEP 310 with PEP 340 Message-ID: <426F9347.6000505@iinet.net.au> This is my attempt at a coherent combination of what I like about both proposals (as opposed to my assortment of half-baked attempts scattered through the existing discussion). PEP 340 has many ideas I like:

- enhanced yield statements and yield expressions
- enhanced continue and break
- generator finalisation
- 'next' builtin and associated __next__() slot
- changes to 'for' loop

One restriction I don't like is the limitation to ContinueIteration and StopIteration as arguments to next().
The proposed semantics and conventions for ContinueIteration and StopIteration are fine, but I would like to be able to pass _any_ exception in to the generator, allowing the generator to decide if a given exception justifies halting the iteration. The _major_ part I don't like is that the block statement's semantics are too similar to those of a 'for' loop. I would like to see a new construct that can do things a for loop can't do, and which can be used in _conjunction_ with a for loop, to provide greater power than either construct on their own. PEP 310 forms the basis for a block construct that I _do_ like. The question then becomes whether or not generators can be used to write useful PEP 310 style block managers (I think they can, in a style very similar to that of the looping block construct from PEP 340). Block statement syntax from PEP 340:

    block EXPR1 [as VAR1]:
        BLOCK1

Proposed semantics (based on PEP 310, with some ideas stolen from PEP 340):

    blk_mgr = EXPR1
    VAR1 = blk_mgr.__enter__()
    try:
        try:
            BLOCK1
        except Exception, exc:
            blk_mgr.__except__(exc)
        else:
            blk_mgr.__no_except__()
    finally:
        blk_mgr.__exit__()

'blk_mgr' is a hidden variable (as per PEP 340). Note that nothing special happens to 'break', 'return' or 'continue' statements with this proposal.
Generator methods to support the block manager protocol used by the block statement:

    def __enter__(self):
        try:
            return next(self)
        except StopIteration:
            raise RuntimeError("Generator exhausted before block statement")

    def __except__(self, exc):
        try:
            next(self, exc)
        except StopIteration:
            pass

    def __else__(self):
        try:
            next(self)
        except StopIteration:
            pass

    def __exit__(self):
        pass

Writing simple block managers with this proposal (these should be identical to the equivalent PEP 340 block managers):

    def opening(name):
        opened = open(name)
        try:
            yield opened
        finally:
            opened.close()

    def logging(logger, name):
        logger.enter_scope(name)
        try:
            try:
                yield
            except Exception, exc:
                logger.log_exception(exc)
        finally:
            logger.exit_scope()

    def transacting(ts):
        ts.begin()
        try:
            yield
        except:
            ts.abort()
        else:
            ts.commit()

Using simple block managers with this proposal (again, identical to PEP 340):

    block opening(name) as f:
        pass

    block logging(logger, name):
        pass

    block transacting(ts):
        pass

Obviously, the more interesting block managers are those like auto_retry (which is a loop, and hence an excellent match for PEP 340), and using a single generator in multiple block statements (which PEP 340 doesn't allow at all). I'll try to get to those tomorrow (and if I can't find any good use cases for the latter trick, then this idea can be summarily discarded in favour of PEP 340).

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From jim at zope.com Wed Apr 27 15:44:03 2005 From: jim at zope.com (Jim Fulton) Date: Wed Apr 27 15:44:08 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> Message-ID: <426F9723.4080604@zope.com> Duncan Booth wrote: > Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: > > >>Guido van Rossum wrote: >> >>>I've written a PEP about this topic. It's PEP 340: Anonymous Block >>>Statements (http://python.org/peps/pep-0340.html). >>> >> >>Some observations: >> >>1. It looks to me like a bare return or a return with an EXPR3 that >>happens >> to evaluate to None inside a block simply exits the block, rather >> than exiting a surrounding function. Did I miss something, or is >> this a bug? >> > > > No, the return sets a flag and raises StopIteration which should make the > iterator also raise StopIteration at which point the real return happens. Only if exc is not None The only return in the pseudocode is inside "if exc is not None". Is there another return that's not shown? ;) I agree that we leave the block, but it doesn't look like we leave the surrounding scope. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pedronis at strakt.com Wed Apr 27 15:48:24 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Wed Apr 27 15:48:36 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426F9723.4080604@zope.com> References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com> Message-ID: <426F9828.6060102@strakt.com> Jim Fulton wrote: > Duncan Booth wrote: > >> Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: >> >> >>> Guido van Rossum wrote: >>> >>>> I've written a PEP about this topic. It's PEP 340: Anonymous Block >>>> Statements (http://python.org/peps/pep-0340.html). >>>> >>> >>> Some observations: >>> >>> 1. It looks to me like a bare return or a return with an EXPR3 that >>> happens to evaluate to None inside a block simply exits the >>> block, rather >>> than exiting a surrounding function. Did I miss something, or is >>> this a bug? >>> >> >> >> No, the return sets a flag and raises StopIteration which should make >> the iterator also raise StopIteration at which point the real return >> happens. > > > Only if exc is not None > > The only return in the pseudocode is inside "if exc is not None". > Is there another return that's not shown? ;) > > I agree that we leave the block, but it doesn't look like we > leave the surrounding scope. that we are having this discussion at all seems a signal that the semantics are likely too subtle. 
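The subtlety being debated here, a bare 'return' inside a generator surfacing to the caller as StopIteration, can be checked directly with a plain generator. This is a minimal illustration of that one behaviour only, not the full PEP 340 block expansion:

```python
# A bare return in a generator body does not hand a value to anyone;
# it simply ends the iteration, which the caller observes as StopIteration.
def gen():
    yield 1
    return        # ends the generator here
    yield 2       # never reached

g = gen()
first = next(g)   # -> 1
try:
    next(g)
    ended = False
except StopIteration:
    ended = True  # the 'return' surfaced at this call
print(first, ended)
```

The block-statement expansion under discussion has to re-raise that StopIteration correctly for the enclosing function's return to happen, which is exactly the bookkeeping the pseudo-code above argues about.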
From duncan.booth at suttoncourtenay.org.uk Wed Apr 27 16:19:35 2005
From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth)
Date: Wed Apr 27 16:19:40 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com>
Message-ID: <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>

Jim Fulton <jim@zope.com> wrote in news:426F9723.4080604@zope.com:

>> No, the return sets a flag and raises StopIteration which should make
>> the iterator also raise StopIteration at which point the real return
>> happens.
>
> Only if exc is not None
>
> The only return in the pseudocode is inside "if exc is not None".
> Is there another return that's not shown? ;)
>

Ah yes, I see now what you mean. I would think that the relevant pseudo-code should look more like:

    except StopIteration:
        if ret:
            return exc
        if exc is not None:
            raise exc # XXX See below
        break

From pje at telecommunity.com Wed Apr 27 17:01:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Apr 27 16:57:40 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>

At 12:30 AM 4/27/05 -0700, Guido van Rossum wrote:
>I've written a PEP about this topic. It's PEP 340: Anonymous Block
>Statements (http://python.org/peps/pep-0340.html).
>
>Some highlights:
>
>- temporarily sidestepping the syntax by proposing 'block' instead of 'with'
>- __next__() argument simplified to StopIteration or ContinueIteration instance
>- use "continue EXPR" to pass a value to the generator
>- generator exception handling explained

Very nice.
It's not clear from the text, btw, if normal exceptions can be passed into __next__, and if so, whether they can include a traceback. If they *can*, then generators can also be considered co-routines now, in which case it might make sense to call blocks "coroutine blocks", because they're basically a way to interleave a block of code with the execution of a specified coroutine. From pje at telecommunity.com Wed Apr 27 17:10:41 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Apr 27 17:06:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <426DB7C8.5020708@canterbury.ac.nz> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20050427110325.02471bf0@mail.telecommunity.com> At 04:37 AM 4/26/05 -0700, Guido van Rossum wrote: >*Fourth*, and this is what makes Greg and me uncomfortable at the same >time as making Phillip and other event-handling folks drool: from the >previous three points it follows that an iterator may *intercept* any >or all of ReturnFlow, BreakFlow and ContinueFlow, and use them to >implement whatever cool or confusing magic they want. Actually, this isn't my interest at all. It's the part where you can pass values or exceptions *in* to a generator with *less* magic than is currently required. This interest is unrelated to anonymous blocks in any case; it's about being able to simulate lightweight pseudo-threads ala Stackless, for use with Twisted. I can do this now of course, but "yield expressions" as described in PEP 340 would eliminate the need for the awkward syntax and frame hackery I currently use. 
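What Phillip describes here, resuming a paused generator from outside with a value instead of frame hackery, is exactly what generator.send() later provided (PEP 342, Python 2.5). A small sketch of the pseudo-thread idea in today's Python; the names worker/run and the request strings are invented for illustration:

```python
# Each 'yield' hands a request object to the scheduler and pauses the
# pseudo-thread; the scheduler resumes it by sending the response back in
# as the value of the yield expression.
def worker():
    a = yield 'fetch-a'          # pause; resumes with the scheduler's reply
    b = yield 'fetch-b'
    yield ('done', a, b)

def run(gen, respond):
    # Drive one pseudo-thread to completion, answering each request.
    request = next(gen)
    while not (isinstance(request, tuple) and request[0] == 'done'):
        request = gen.send(respond(request))
    return request

final = run(worker(), lambda req: req.upper())
print(final)                     # ('done', 'FETCH-A', 'FETCH-B')
```

A real scheduler would interleave many such generators (and dispatch requests to Twisted-style deferred operations), but the send/resume mechanics are the whole trick.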
From fredrik at pythonware.com Wed Apr 27 17:32:16 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 27 17:36:38 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <426DB7C8.5020708@canterbury.ac.nz> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <5.1.1.6.0.20050427110325.02471bf0@mail.telecommunity.com>
Message-ID: <d4oavq$qef$1@sea.gmane.org>

Phillip J. Eby wrote:

> This interest is unrelated to anonymous blocks in any case; it's about
> being able to simulate lightweight pseudo-threads ala Stackless, for use
> with Twisted. I can do this now of course, but "yield expressions" as
> described in PEP 340 would eliminate the need for the awkward syntax and
> frame hackery I currently use.

since when does

    def mythread(self):
        ...
        yield request
        print self.response
        ...

qualify as frame hackery?

</F>

From jcarlson at uci.edu Wed Apr 27 18:25:13 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 18:27:20 2005
Subject: [Python-Dev] Another Anonymous Block Proposal
In-Reply-To: <426F641D.1010802@diamond.name>
References: <20050427055231.R7719@familjen.svensson.org> <426F641D.1010802@diamond.name>
Message-ID: <20050427091934.640A.JCARLSON@uci.edu>

Jason Diamond <jason@diamond.name> wrote:
>
> Paul Svensson wrote:
>
> > You're not mentioning scopes of local variables, which seems to be
> > the issue where most of the previous proposals lose their balance
> > between hairy and pointless...
>
> My syntax is just sugar for nested defs. I assumed the scopes of local
> variables would be identical when using either syntax.
>
> Do you have any pointers to that go into the issues I'm probably missing?

We already have nested defs in Python, no need for a new syntax there.
The trick is that people would like to be able to execute the body of a def (or at least portions) in the namespace of where it is lexically defined (seemingly making block syntaxes less appealing), and even some who want to execute the body of the def in the namespace where the function is evaluated (which has been discussed as being almost not possible, if not entirely impossible).

- Josiah

From jcarlson at uci.edu Wed Apr 27 18:44:12 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 18:45:03 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <20050427093635.640D.JCARLSON@uci.edu>

Guido van Rossum <gvanrossum@gmail.com> wrote:
>
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
>
> Some highlights:
>
> - temporarily sidestepping the syntax by proposing 'block' instead of 'with'
> - __next__() argument simplified to StopIteration or ContinueIteration instance
> - use "continue EXPR" to pass a value to the generator
> - generator exception handling explained

Your code for the translation of a standard for loop is flawed. From the PEP:

    for VAR1 in EXPR1:
        BLOCK1
    else:
        BLOCK2

will be translated as follows:

    itr = iter(EXPR1)
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        finally:
            break
        arg = None
        BLOCK1
    else:
        BLOCK2

Note that in the translated version, BLOCK2 can only ever execute if next raises a StopIteration in the call, and BLOCK1 will never be executed because of the 'break' in the finally clause.

Unless it is too early for me, I believe what you wanted is...
    itr = iter(EXPR1)
    arg = None
    while True:
        VAR1 = next(itr, arg)
        arg = None
        BLOCK1
    else:
        BLOCK2

- Josiah

From gvanrossum at gmail.com Wed Apr 27 18:55:14 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 18:55:19 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>
References: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com> <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>
Message-ID: <ca471dc205042709555b24f522@mail.gmail.com>

> I would think that the relevant pseudo-code should look more like:
>
>     except StopIteration:
>         if ret:
>             return exc
>         if exc is not None:
>             raise exc # XXX See below
>         break

Thanks! This was a bug in the PEP due to a last-minute change in how I wanted to handle return; I've fixed it as you show (also renaming 'exc' to 'var' since it doesn't always hold an exception).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Wed Apr 27 19:35:20 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed Apr 27 19:35:24 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <d11dcfba05042710355eba8d39@mail.gmail.com>

On 4/27/05, Guido van Rossum <gvanrossum@gmail.com> wrote:
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
So block-statements would be very much like for-loops, except:

(1) iter() is not called on the expression

(2) the fact that break, continue, return or a raised Exception occurred can all be intercepted by the block-iterator/generator, though break, return and a raised Exception all look the same to the block-iterator/generator (they are signaled with a StopIteration)

(3) the while loop can only be broken out of by next() raising a StopIteration, so all well-behaved iterators will be exhausted when the block-statement is exited

Hope I got that mostly right.

I know this is looking a little far ahead, but is the intention that even in Python 3.0 for-loops and block-statements will still be separate statements? It seems like there's a pretty large section of overlap. Playing with for-loop semantics right now isn't possible due to backwards compatibility, but when that limitation is removed in Python 3.0, are we hoping that these two similar structures will be expressed in a single statement?

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From gvanrossum at gmail.com Wed Apr 27 19:42:04 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 19:42:11 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050427093635.640D.JCARLSON@uci.edu>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050427093635.640D.JCARLSON@uci.edu>
Message-ID: <ca471dc205042710424c7c5006@mail.gmail.com>

> Your code for the translation of a standard for loop is flawed.
> From the PEP:
>
>     for VAR1 in EXPR1:
>         BLOCK1
>     else:
>         BLOCK2
>
> will be translated as follows:
>
>     itr = iter(EXPR1)
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         finally:
>             break
>         arg = None
>         BLOCK1
>     else:
>         BLOCK2
>
> Note that in the translated version, BLOCK2 can only ever execute if
> next raises a StopIteration in the call, and BLOCK1 will never be
> executed because of the 'break' in the finally clause.

Ouch. Another bug in the PEP. It was late. ;-)

The "finally:" should have been "except StopIteration:" I've updated the PEP online.

> Unless it is too early for me, I believe what you wanted is...
>
>     itr = iter(EXPR1)
>     arg = None
>     while True:
>         VAR1 = next(itr, arg)
>         arg = None
>         BLOCK1
>     else:
>         BLOCK2

No, this would just propagate the StopIteration when next() raises it. StopIteration is not caught implicitly except around the next() call made by the for-loop control code.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From Benjamin.Schollnick at xerox.com Wed Apr 27 16:58:26 2005
From: Benjamin.Schollnick at xerox.com (Schollnick, Benjamin)
Date: Wed Apr 27 20:22:38 2005
Subject: [Python-Dev] ZipFile revision....
Message-ID: <266589E1B9392B4C9195CC25A07C73B9AA5033@usa0300ms04.na.xerox.net>

Folks,

There's been a lot of talk lately about changes to the ZipFile module... Along with people stating that there are few "real life" applications for it....

Here's a small "gift"... A "Quick" Backup utility for your files....

Example:

    c:\develope\backup\backup.py --source c:\install_software --target c:\backups\ --label installers
    c:\develope\backup\backup.py --source c:\develope --target c:\backups\ --label development -z .pyc
    c:\develope\backup\backup.py --source "C:\Program Files\Microsoft SQL Server\MSSQL\Data" --target c:\backups\ --label sql

It's evolved a bit, but still could use some work.... It's currently only tested in a windows environment... So don't expect Mac OS X resource forks to be preserved.....
But it creates and verifies 1Gb+ zip files.... If you wish to use this to help benchmark, test, etc, any changes to the ZipFile module please feel free to... - Benjamin """Backup Creator Utility This utility will backup the tree of files that you indicate, into a archive of your choice. """ # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # __version__ = '0.95' # Human Readable Version number version_info = (0,9,5) # Easier format version data for comparisons # i.e. if version_info > (1,2,5) # # if __version__ > '1.00' is a little more contrived. __author__ = 'Benjamin A. Schollnick' __date__ = '2004-12-28' # yyyy-mm-dd __email__ = 'Benjamin.Schollnick@xerox.com' __module_name__ = "Archive Backup Tool" __short_cright__= "" import bas_init import os import os.path import sys import time import zipfile ####################################################################### class zip_file_engine: """The archive backup tool uses pregenerated classes to allow multiple styles of archives to be created. This is the wrapper around the Python ZIPFILE module. """ def __init__ ( self ): """ Inputs -- None Outputs -- None """ self.Backup_File = None self.Backup_Open = False self.Backup_ReadOnly = None self.Backup_FileName = None def close_Backup (self ): """This will close the current Archive file, and reset the internal structures to a clean state. Inputs -- None Outputs -- None """ if self.Backup_Open <> False: self.Backup_File.close () self.Backup_File = None self.Backup_Open = False self.Backup_ReadOnly = None self.Backup_FileName = None def open_Backup ( self, readonly = False, filename = r"./temp.zip"): """This will open a archive file. Currently appending is not formally supported... The Read Only / Read/Write status is set via the readonly flag. Inputs -- Readonly: True = Read/Write False = Read Only Filename contains the full file/pathname of the zip file. 
Outputs -- None """ if self.Backup_Open == True: self.close_Backup () self.Backup_Filename = filename if readonly == False: self.Backup_File = zipfile.ZipFile ( filename, "r", zipfile.ZIP_DEFLATED ) self.Backup_Open = True self.Backup_ReadOnly = True self.Backup_FileName = filename else: self.Backup_File = zipfile.ZipFile ( filename, "w", zipfile.ZIP_DEFLATED ) self.Backup_Open = True self.Backup_ReadOnly = False self.Backup_FileName = filename def Verify_ZipFile ( self, FileName ): """Will create a temporary Zip File object, and verify the Zip file at <filename> location. Inputs - FileName - The filename of the ZIP file to verify. Outputs - True - File Intact CRCs match Anything else, File Corrupted. String Contains the 1st corrupted file. """ temporary_Backup_File = zip_file_engine ( ) temporary_Backup_File.open_Backup ( False, FileName) test_results = temporary_Backup_File.Backup_File.testzip () temporary_Backup_File.close_Backup() return test_results def Verify_Backup (self, FileName ): """ Generic Wrapper around the Verify_ZipFile object. """ return self.Verify_ZipFile ( FileName ) def add_file_to_Backup ( self, filename, archived_filename): """Add a file to the writable Zip file. inputs - filename = Filename of the file to be added archived_filename = the Filename stored in the archive. Outputs - None - Zip file is in Read Only Mode True - File has been added to the Zip File. -1 - Or the Zipfile engine is not initialized. """ if self.Backup_ReadOnly: # Archive is read only! return None elif self.Backup_ReadOnly == False: # Archive is Read Write Mode if self.Backup_File <> None: # Zip File Engine is initialized self.Backup_File.write (filename, archived_filename) # Return Success return True else: # Zip File Engine is *NOT* initialized. return -1 ######################################################################## # No_Archive = 1 ZIP = 2 class backup_system: """Main Class for the Backup Engine. Inputs - default_source = The pathname for the source files. 
default_target = The pathname for the archive to be written to. default_tag = the prepended text tag for the archive file. Outputs - None """ def __init__ ( self, default_source = None, default_target = None, default_exclude = "", default_extensions = "", default_tag = None, prepend = False, quiet = False): """The initialization routines for the Backup Engine. Inputs - default_source = The pathname for the source files. default_target = The pathname for the archive to be written to. default_exclude = default_tag = the text tag for the archive file. prepend = Deterimines the placement of the default tag. True - The tag is prepended to the filename False - The Tag is appended to the filename. The default is to append the default_tag. (False) quiet = Forcibly prevent any output from the directory walk. Outputs - None; Internal Values are initialized for the core engine. """ self.directory_to_backup = default_source self.backup_storage_location = default_target self.base_filename = default_tag self.archive_filename = None self.force_quiet = quiet # self.exlude_files_dir = default_exlude self.exclude_files_dir = default_exclude.upper().strip().split(",") self.exclude_exts = default_extensions.upper().strip().split(",") if default_tag == None: self.archive_filename_template = "%m_%d_%Y__%H_%M_%S" else: if (prepend==True): self.archive_filename_template = self.base_filename + "_%m_%d_%Y__%H_%M_%S" elif (prepend==None) or (prepend==False): self.archive_filename_template = "%m_%d_%Y__%H_%M_%S_" + self.base_filename self.archive_filename_extension = ".zip" self.archive_engine_to_use = None def create_archive_filename ( self ): """This sets the archive filename in the object. This is set, to prevent timing issues internally. Inputs - None Outputs - None; Internally creates the archives filename from the backup_storage_location, and the archive_filename_template. 
""" self.archive_filename = self.backup_storage_location + os.sep + time.strftime (self.archive_filename_template, time.localtime() ) + self.archive_filename_extension def Verify_Backup ( self ): """ Wrapper around the archive_engines verify routines. This will automatically start the verification process, and return the results. Inputs - None Outputs - True - File Intact CRCs match Anything else, File Corrupted. String Contains details from the archiver engine. """ return self.archive_engine_to_use.Verify_Backup ( self.archive_filename ) def start_archive_engine ( self, Backup_Type): """ Initializae the derived archive_engine, depending on the Backup_Type. Inputs - Backup_Type 1 - None 2 - Zip Outputs - None """ if Backup_Type == 2: self.archive_engine_to_use = zip_file_engine () self.create_archive_filename () self.archive_engine_to_use.open_Backup ( readonly = True, filename = self.archive_filename ) def close_archive_file ( self ): """Stop and Close the Archive File. This does terminate the Archive Engine. But does not terminate the Backup_Engine. Inputs - None Outputs - None; Internally resets the archive engine to a closed state. """ self.archive_engine_to_use.close_Backup () def walk_directory_tree ( self, notify_directory = None, notify_file = None ): """Walk the source directory tree, and add each file/directory into the archive file. Inputs - notify_directory (Pointer) - see below notify_file (Pointer) - see below Outputs - None If you wish to have a console output for the walk function, you can have that via the notify_directory and notify_file functions.... Create two stubs and pass them to the walk routine. The routines only have a single input, either a directory name, or a filename, depending on the function. 
def notify_dir ( directory_name ): print print "Processing Directory - %s " % directory_name def notify_file ( file_name ): print "\t\tFile - %s " % file_name Backup_Engine.walk_directory_tree ( notify_directory = notify_dir, notify_file = notify_file ) """ selfexclude = os.path.normpath(self.archive_filename) if self.force_quiet: notify_directory = None notify_file = None for root, dirs, files in os.walk( self.directory_to_backup ): if notify_directory <> None: notify_directory ( root ) for file in files: if notify_file <> None: notify_file ( file ) # # Add the file to the backup zip file. # if os.path.normpath(file) <> selfexclude: # self.archive_engine_to_use.add_file_to_Backup ( root + os.sep + file, root + os.sep + file) # else: # print "Skipping, it is backup file - %s" % file exclude_file = False if os.path.normpath(file) == selfexclude: exclude_file = True if file.strip().upper() in self.exclude_files_dir: exclude_file = True if root <> '.': root_segment = root.strip().upper().split(os.sep) for x in root_segment: if x in self.exclude_files_dir: exclude_file = True #self.exclude_exts for x in self.exclude_exts: #print "X: ", x.strip(), " - ", os.path.splitext( file )[1].upper().strip() #print x.strip() == os.path.splitext( file )[1].upper().strip() if x.strip() == os.path.splitext( file )[1].upper().strip(): exclude_file = True if not exclude_file: self.archive_engine_to_use.add_file_to_Backup ( root + os.sep + file, root + os.sep + file) # else: # print "Skipping - %s" % file exclude_file = False def notify_dir ( directory_name ): print print "Processing Directory - %s " % directory_name def notify_file ( file_name ): print "\t\tFile - %s " % file_name def Backup_Directories_App (): """Example Application that will backup the Directories as stated in the command line. Inputs - None Outputs - None; FileSystem, Archive file. 
""" initialization_data = bas_init.initialization_wrapper () initialization_data.cmd_line_interface.add_option ("-s", "--source", action="store", type="string", dest="source", help="Directory Tree to Read From", default=".") initialization_data.cmd_line_interface.add_option ("-t", "--target", action="store", type="string", dest="target", help="Directory to write the backup to", default=".") initialization_data.cmd_line_interface.add_option ("-l", "--label", action="store", type="string", dest="label", help="What to Label the Backup File as", default="backup") initialization_data.cmd_line_interface.add_option ("-p", "--pre", action="store_true", dest="prelabel", help="If used, the label is prepended to the filename. Otherwise it is appended.") initialization_data.cmd_line_interface.add_option ("-q", "--quiet", action="store_true", dest="quiet", help="Force File & Directory printing to be turned off.") initialization_data.cmd_line_interface.add_option ("-x", "--exclude", action="store", type="string", dest="exclude", help="List of files/directories to exclude", default="") initialization_data.cmd_line_interface.add_option ("-z", "--extensions", action="store", type="string", dest="exclude_exts", help="List of file extensions to exclude", default="") initialization_data.run_cmd_line_parse ( ) print "Initialization Successful..." print Backup_Engine = backup_system ( initialization_data.cmd_line_options.source, initialization_data.cmd_line_options.target, initialization_data.cmd_line_options.exclude, initialization_data.cmd_line_options.exclude_exts, initialization_data.cmd_line_options.label, initialization_data.cmd_line_options.prelabel, initialization_data.cmd_line_options.quiet) Backup_Engine.start_archive_engine ( ZIP ) print "Backup Archive - %s " % Backup_Engine.archive_filename Backup_Engine.walk_directory_tree ( notify_directory = notify_dir, notify_file = notify_file ) Backup_Engine.close_archive_file () print "Verifying the Archive File...." 
    test_result = Backup_Engine.Verify_Backup ( )
    if test_result:
        print
        print "The Backup has failed!"
        print
        print "This file, %s, is bad." % test_result
    else:
        print
        print "The Backup has been verified!"
        print
        print "Backup is successful."

    print
    print "Backup Application has completed."

if __name__ == "__main__":      # If run from the Command line
    Backup_Directories_App ()   # run the unit test.
    itr = iter(EXPR1)
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            BLOCK2
            break
        arg = None
        BLOCK1

- Josiah

From bac at OCF.Berkeley.EDU Wed Apr 27 22:20:34 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Apr 27 22:20:44 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <426FF412.7010709@ocf.berkeley.edu>

Guido van Rossum wrote:
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
>
> Some highlights:
>
> - temporarily sidestepping the syntax by proposing 'block' instead of 'with'
> - __next__() argument simplified to StopIteration or ContinueIteration instance
> - use "continue EXPR" to pass a value to the generator
> - generator exception handling explained

I am at least +0 on all of this now, with a slow warming up to +1 (but then it might just be the cold talking =).

I still prefer the idea of arguments to __next__() be raised if they are exceptions and otherwise just be returned through the yield expression. But I do realize this is easily solved with a helper function now::

    def raise_or_yield(val):
        """Return the argument if not an exception, otherwise raise it.

        Meant to have a yield expression as an argument.

        Worries about Iteration subclasses are invalid since they will
        have been handled by the __next__() method on the generator
        already.
        """
        if isinstance(val, Exception):
            raise val
        else:
            return val

My objections that I had earlier to 'continue' and 'break' being somewhat magical in block statements have subsided. It all seems reasonable now within the context of a block statement.
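Brett's raise_or_yield helper can be exercised with the yield expressions this thread eventually produced (PEP 342's send()). The driver below pushes either a plain value or an exception instance into a generator; the consumer/log names are invented for the sketch:

```python
def raise_or_yield(val):
    """Return the argument if it is not an exception, otherwise raise it."""
    if isinstance(val, Exception):
        raise val
    return val

def consumer(log):
    # Values sent in come back as the result of the yield expression;
    # exception instances are raised at the yield by the helper instead.
    while True:
        try:
            item = raise_or_yield((yield))
        except KeyError:
            log.append('caught KeyError')
        else:
            log.append('got %r' % (item,))

log = []
gen = consumer(log)
next(gen)                      # prime the generator to its first yield
gen.send(42)
gen.send(KeyError('missing'))  # raised inside the generator, caught there
gen.send('hello')
print(log)                     # ['got 42', 'caught KeyError', "got 'hello'"]
```

The one cost Brett notes remains visible here: a genuine exception instance can never be delivered as an ordinary value through this helper, which is the ambiguity the later send()/throw() split removed.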
And while the thought is in my head, I think block statements should be viewed
less as a tweaked version of a 'for' loop and more as an extension to
generators that happens to be very handy for resource management (while
allowing iterators to come over and play on the new swing set as well). I
think if you take that view then the argument that they are too similar to
'for' loops loses some luster (although I doubt Nick is going to buy this =).

Basically block statements are providing a simplified, syntactically
supported way to control a generator externally from itself (or at least
this is the impression I am getting). I just had a flash of worry about how
this would work when code containing a block statement is abstracted into a
function, but then I realized you just push more code into the generator and
handle it there, with the block statement just driving the generator. Seems
like this might provide that last key piece for generators to finally
provide the cool flow control that we all know they are capable of but that
used to require extra work.

-Brett

From gvanrossum at gmail.com Wed Apr 27 22:27:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 22:27:36 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com>
	<ca471dc205042416572da9db71@mail.gmail.com>
	<426DB7C8.5020708@canterbury.ac.nz>
	<ca471dc2050426043713116248@mail.gmail.com>
	<426E3B01.1010007@canterbury.ac.nz>
	<ca471dc205042621472b1f6edf@mail.gmail.com>
	<ca471dc20504270030405f922f@mail.gmail.com>
	<5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>
Message-ID: <ca471dc205042713277846852d@mail.gmail.com>

[Phillip Eby]
> Very nice. It's not clear from the text, btw, if normal exceptions can be
> passed into __next__, and if so, whether they can include a traceback.
If
> they *can*, then generators can also be considered co-routines now, in
> which case it might make sense to call blocks "coroutine blocks", because
> they're basically a way to interleave a block of code with the execution of
> a specified coroutine.

The PEP is clear on this: __next__() only takes Iteration instances,
i.e., StopIteration and ContinueIteration. (But see below.)

I'm not sure what the relevance of including a stack trace would be,
and why that feature would be necessary to call them coroutines.

But... Maybe it would be nice if generators could also be used to
implement exception handling patterns, rather than just resource
release patterns. IOW, maybe this should work:

  def safeLoop(seq):
      for var in seq:
          try:
              yield var
          except Exception, err:
              print "ignored", var, ":", err.__class__.__name__

  block safeLoop([10, 5, 0, 20]) as x:
      print 1.0/x

This should print

  0.1
  0.2
  ignored 0 : ZeroDivisionError
  0.05

I've been thinking of alternative signatures for the __next__() method
to handle this. We have the following use cases:

1. plain old next()
2. passing a value from continue EXPR
3. forcing a break due to a break statement
4. forcing a break due to a return statement
5. passing an exception EXC

Cases 3 and 4 are really the same; I don't think the generator needs to
know the difference between a break and a return statement. And these can
be mapped to case 5 with EXC being StopIteration().

Now the simplest API would be this: if the argument to __next__() is an
exception instance (let's say we're talking Python 3000, where all
exceptions are subclasses of Exception), it is raised when yield resumes;
otherwise it is the return value from yield (may be None). This is
somewhat unsatisfactory because it means that you can't pass an exception
instance as a value.
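[With hindsight: the "simplest API" described above is essentially what PEP 342 later shipped as the generator throw() method. The safeLoop example can be driven with it in modern Python; the sketch below is illustrative, and the names safe_loop and drive are made up, not part of any proposal in this thread.]

```python
def safe_loop(seq):
    # Python 3 spelling of safeLoop: swallow errors thrown back in by
    # the consumer and keep iterating.
    for var in seq:
        try:
            yield var
        except Exception as err:
            print("ignored", var, ":", type(err).__name__)

def drive(seq):
    # Drive safe_loop by hand: when the body fails, throw the error into
    # the generator at its paused yield, then continue with the value
    # that throw() returns.
    gen = safe_loop(seq)
    results = []
    try:
        x = next(gen)
        while True:
            try:
                results.append(1.0 / x)
            except ZeroDivisionError as err:
                x = gen.throw(err)   # raised at the yield inside safe_loop
            else:
                x = next(gen)
    except StopIteration:
        pass
    return results

print(drive([10, 5, 0, 20]))  # [0.1, 0.2, 0.05], with the "ignored" line printed in between
```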
I don't know how much of a problem this will be in practice; I could see it causing unpleasant surprises when someone designs an API around this that takes an arbitrary object, when someone tries to pass an exception instance. Fixing such a thing could be expensive (you'd have to change the API to pass the object wrapped in a list or something). An alternative that solves this would be to give __next__() a second argument, which is a bool that should be true when the first argument is an exception that should be raised. What do people think? I'll add this to the PEP as an alternative for now. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 22:47:32 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 22:47:40 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <005301c54b59$170763e0$6402a8c0@arkdesktop> References: <426F9828.6060102@strakt.com> <005301c54b59$170763e0$6402a8c0@arkdesktop> Message-ID: <ca471dc205042713475c552de6@mail.gmail.com> > I feel like we're quietly, delicately tiptoeing toward continuations... No way we aren't. We're not really adding anything to the existing generator machinery (the exception/value passing is a trivial modification) and that is only capable of 80% of coroutines (but it's the 80% you need most :-). As long as I am BDFL Python is unlikely to get continuations -- my head explodes each time someone tries to explain them to me. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From david.ascher at gmail.com Wed Apr 27 22:53:59 2005 From: david.ascher at gmail.com (David Ascher) Date: Wed Apr 27 22:54:03 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713475c552de6@mail.gmail.com> References: <426F9828.6060102@strakt.com> <005301c54b59$170763e0$6402a8c0@arkdesktop> <ca471dc205042713475c552de6@mail.gmail.com> Message-ID: <dd28fc2f05042713536f191eda@mail.gmail.com> On 4/27/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > As long as I am BDFL Python is unlikely to get continuations -- my > head explodes each time someone tries to explain them to me. You just need a safety valve installed. It's outpatient surgery, don't worry. --david From pje at telecommunity.com Wed Apr 27 22:59:46 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Apr 27 22:56:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713277846852d@mail.gmail.com> References: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> At 01:27 PM 4/27/05 -0700, Guido van Rossum wrote: >[Phillip Eby] > > Very nice. It's not clear from the text, btw, if normal exceptions can be > > passed into __next__, and if so, whether they can include a traceback. If > > they *can*, then generators can also be considered co-routines now, in > > which case it might make sense to call blocks "coroutine blocks", because > > they're basically a way to interleave a block of code with the execution of > > a specified coroutine. 
>
>The PEP is clear on this: __next__() only takes Iteration instances,
>i.e., StopIteration and ContinueIteration. (But see below.)
>
>I'm not sure what the relevance of including a stack trace would be,
>and why that feature would be necessary to call them coroutines.

Well, you need that feature in order to retain traceback information when
you're simulating threads with a stack of generators. Although you can't
return from a generator inside a nested generator, you can simulate this by
keeping a stack of generators and having a wrapper that passes control
between generators, such that:

    def somegen():
        result = yield othergen()

causes the wrapper to push othergen() on the generator stack and execute
it. If othergen() raises an error, the wrapper resumes somegen() and
passes in the error. If you can only specify the value but not the
traceback, you lose the information about where the error occurred in
othergen().

So, the feature is necessary for anything other than "simple" (i.e.
single-frame) coroutines, at least if you want to retain any possibility of
debugging. :)

>But... Maybe it would be nice if generators could also be used to
>implement exception handling patterns, rather than just resource
>release patterns. IOW, maybe this should work:
>
>   def safeLoop(seq):
>       for var in seq:
>           try:
>               yield var
>           except Exception, err:
>               print "ignored", var, ":", err.__class__.__name__
>
>   block safeLoop([10, 5, 0, 20]) as x:
>       print 1.0/x

Yes, it would be nice. Also, you may have just come up with an even better
word for what these things should be called... patterns. Perhaps they
could be called "pattern blocks" or "patterned blocks". Pattern sounds so
much more hip and politically correct than "macro" or even "code block". :)

>An alternative that solves this would be to give __next__() a second
>argument, which is a bool that should be true when the first argument
>is an exception that should be raised. What do people think?
I think it'd be simpler just to have two methods, conceptually
"resume(value=None)" and "error(value,tb=None)", whatever the actual method
names are.

From mcherm at mcherm.com Wed Apr 27 23:09:59 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Apr 27 23:10:01 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>

Guido writes:
> You mean like this?
>
>     if x > 0:
>         ...normal case...
>     elif y > 0:
>         ....abnormal case...
>     else:
>         ...edge case...
>
> You have guts to call that bad style! :-)

Well, maybe, but this:

    if x == 1:
        do_number_1()
    elif x == 2:
        do_number_2()
    elif x == 3:
        do_number_3()
    elif y == 4:
        do_number_4()
    elif x == 5:
        do_number_5()
    else:
        raise ValueError

is clearly bad style. (Even knowing what I did here, how long does it take
you to find the problem? Hint: line 7.)

I've seen Jim's recipe in the cookbook, and as I said there, I'm impressed
by the clever implementation, but I think it's unwise. PEP 275 proposes an
O(1) solution... either by compiler optimization of certain if-elif-else
structures, or via a new syntax with 'switch' and 'case' keywords. (I
prefer the keywords version myself... that optimization seems awfully
messy, and wouldn't help with the problem above.) Jim's recipe fixes the
problem given above, but it's an O(n) solution, and to me the words
'switch' and 'case' just *scream* "O(1)". But perhaps it's worthwhile,
just because it avoids repeating "x ==".

Really, this seems like a direct analog of another frequently-heard Python
gripe: the lack of a conditional expression. After all, the problems with
these two code snippets:

    if x == 1:        |    if condition_1:
        do_1()        |        y = 1
    elif x == 2:      |    elif condition_2:
        do_2()        |        y = 2
    elif x == 3:      |    elif condition_3:
        do_3()        |        y = 3
    else:             |    else:
        default()     |        y = 4

is the repetition of "x ==" and of "y =".
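[The O(1) behaviour associated here with 'switch'/'case' is traditionally recovered in Python with a dictionary of callables; a minimal sketch, where the do_* handlers are hypothetical:]

```python
def do_1(): return "one"
def do_2(): return "two"
def do_3(): return "three"

# One hash lookup replaces the whole if/elif chain -- and there is no
# way to accidentally test the wrong variable in one branch.
dispatch = {1: do_1, 2: do_2, 3: do_3}

def handle(x):
    try:
        return dispatch[x]()
    except KeyError:
        raise ValueError(x)

print(handle(2))  # two
```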
As my earlier example demonstrates, a structure like this in which the "x ==" or the "y =" VARIES has a totally different *meaning* to the programmer than one in which the "x ==" or "y =" is the same for every single branch. But let's not start discussing conditional expressions now, because there's already more traffic on the list than I can read. -- Michael Chermside From gvanrossum at gmail.com Wed Apr 27 23:50:10 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 23:50:19 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> Message-ID: <ca471dc2050427145022e8985f@mail.gmail.com> [Guido] > >I'm not sure what the relevance of including a stack trace would be, > >and why that feature would be necessary to call them coroutines. [Phillip] > Well, you need that feature in order to retain traceback information when > you're simulating threads with a stack of generators. Although you can't > return from a generator inside a nested generator, you can simulate this by > keeping a stack of generators and having a wrapper that passes control > between generators, such that: > > def somegen(): > result = yield othergen() > > causes the wrapper to push othergen() on the generator stack and execute > it. If othergen() raises an error, the wrapper resumes somegen() and > passes in the error. If you can only specify the value but not the > traceback, you lose the information about where the error occurred in > othergen(). 
> > So, the feature is necessary for anything other than "simple" (i.e. > single-frame) coroutines, at least if you want to retain any possibility of > debugging. :) OK. I think you must be describing continuations there, because my brain just exploded. :-) In Python 3000 I want to make the traceback a standard attribute of Exception instances; would that suffice? I really don't want to pass the whole (type, value, traceback) triple that currently represents an exception through __next__(). > Yes, it would be nice. Also, you may have just come up with an even better > word for what these things should be called... patterns. Perhaps they > could be called "pattern blocks" or "patterned blocks". Pattern sounds so > much more hip and politically correct than "macro" or even "code block". :) Yes, but the word has a much loftier meaning. I could get used to template blocks though (template being a specific pattern, and this whole thing being a non-OO version of the Template Method Pattern from the GoF book). > >An alternative that solves this would be to give __next__() a second > >argument, which is a bool that should be true when the first argument > >is an exception that should be raised. What do people think? > > I think it'd be simpler just to have two methods, conceptually > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > names are. Part of me likes this suggestion, but part of me worries that it complicates the iterator API too much. Your resume() would be __next__(), but that means your error() would become __error__(). This is more along the lines of PEP 288 and PEP 325 (and even PEP 310), but we have a twist here in that it is totally acceptable (see my example) for __error__() to return the next value or raise StopIteration. IOW the return behavior of __error__() is the same as that of __next__(). Fredrik, what does your intuition tell you? 
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From shane at hathawaymix.org Wed Apr 27 23:54:31 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Wed Apr 27 23:52:39 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>
References: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>
Message-ID: <42700A17.9020905@hathawaymix.org>

Michael Chermside wrote:
>     if x == 1:        |    if condition_1:
>         do_1()        |        y = 1
>     elif x == 2:      |    elif condition_2:
>         do_2()        |        y = 2
>     elif x == 3:      |    elif condition_3:
>         do_3()        |        y = 3
>     else:             |    else:
>         default()     |        y = 4

This inspired a twisted thought: if you just redefine truth, you don't
have to repeat the variable. <0.9 wink>

    True = x
    if 1:
        do_1()
    elif 2:
        do_2()
    elif 3:
        do_3()
    else:
        default()

Shane

From gvanrossum at gmail.com Wed Apr 27 23:57:01 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 23:57:12 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1>
References: <ca471dc20504270030405f922f@mail.gmail.com>
	<426F7A8F.8090109@zope.com>
	<n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1>
Message-ID: <ca471dc205042714575e98d89c@mail.gmail.com>

> If the iterator fails to re-raise the StopIteration exception (the spec
> only says it should, not that it must) I think the return would be ignored
> but a subsequent exception would then get converted into a return value. I
> think the flag needs to be reset to avoid this case.

Good catch. I've fixed this in the PEP.

> Also, I wonder whether other exceptions from next() shouldn't be handled a
> bit differently. If BLOCK1 throws an exception, and this causes the
> iterator to also throw an exception, then one exception will be lost. I
> think it would be better to propagate the original exception rather than
> the second exception.

I don't think so. It's similar to this case:

    try:
        raise Foo
    except:
        raise Bar

Here, Foo is also lost.
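[With hindsight: PEP 3134 later softened this, so the first exception is not completely lost; it survives as the __context__ attribute of the second. A short sketch:]

```python
def chained():
    try:
        raise KeyError("Foo")
    except KeyError:
        # Raising a new exception here implicitly chains: the KeyError
        # becomes the ValueError's __context__ instead of vanishing.
        raise ValueError("Bar")

try:
    chained()
except ValueError as err:
    print(type(err.__context__).__name__)  # KeyError
```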
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 23:59:42 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 23:59:50 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426F7A8F.8090109@zope.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> Message-ID: <ca471dc205042714596c053236@mail.gmail.com> [Jim Fulton] > 2. I assume it would be a hack to try to use block statements to implement > something like interfaces or classes, because doing so would require > significant local-variable manipulation. I'm guessing that > either implementing interfaces (or implementing a class statement > in which the class was created before execution of a suite) > is not a use case for this PEP. I would like to get back to the discussion about interfaces and signature type declarations at some point, and a syntax dedicated to declaring interfaces is high on my wish list. In the mean time, if you need interfaces today, I think using metaclasses would be easier than using a block-statement (if it were even possible using the latter without passing locals() to the generator). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Thu Apr 28 00:00:40 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 28 00:00:48 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426FF412.7010709@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> Message-ID: <42700B88.40703@gmail.com> Brett C. wrote: > And while the thought is in my head, I think block statements should be viewed > less as a tweaked version of a 'for' loop and more as an extension to > generators that happens to be very handy for resource management (while > allowing iterators to come over and play on the new swing set as well). I > think if you take that view then the argument that they are too similar to > 'for' loops loses some luster (although I doubt Nick is going to be buy this =) . I'm surprisingly close to agreeing with you, actually. I've worked out that it isn't the looping that I object to, it's the inability to get out of the loop without exhausting the entire iterator. I need to think about some ideas involving iterator factories, then my objections may disappear. Cheers, Nick. 
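[With hindsight: the non-looping, generator-driven resource block being discussed here is roughly what eventually shipped as PEP 343's with statement. A sketch in today's Python, where opened() is a made-up manager, not anything proposed in this thread:]

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opened(path):
    """Resource block: run the body with an open file, then close it."""
    f = open(path, "w")
    try:
        yield f          # the body of the with-block runs here, exactly once
    finally:
        f.close()        # runs on normal exit, break, return, or exception

path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with opened(path) as f:
    f.write("hello")
print(f.closed)  # True
```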
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Thu Apr 28 00:07:54 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 28 00:08:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713277846852d@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> Message-ID: <42700D3A.5020208@gmail.com> Guido van Rossum wrote: > An alternative that solves this would be to give __next__() a second > argument, which is a bool that should be true when the first argument > is an exception that should be raised. What do people think? > > I'll add this to the PEP as an alternative for now. An optional third argument (raise=False) seems a lot friendlier (and more flexible) than a typecheck. Yet another alternative would be for the default behaviour to be to raise Exceptions, and continue with anything else, and have the third argument be "raise_exc=True" and set it to False to pass an exception in without raising it. Cheers, Nick. 
-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From gvanrossum at gmail.com Thu Apr 28 00:16:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 28 00:16:22 2005
Subject: [Python-Dev] Integrating PEP 310 with PEP 340
In-Reply-To: <426F9347.6000505@iinet.net.au>
References: <426F9347.6000505@iinet.net.au>
Message-ID: <ca471dc205042715165dede48d@mail.gmail.com>

[Nick Coghlan]
> This is my attempt at a coherent combination of what I like about both proposals
> (as opposed to my assortment of half-baked attempts scattered through the
> existing discussion).
>
> PEP 340 has many ideas I like:
> - enhanced yield statements and yield expressions
> - enhanced continue and break
> - generator finalisation
> - 'next' builtin and associated __next__() slot
> - changes to 'for' loop
>
> One restriction I don't like is the limitation to ContinueIteration and
> StopIteration as arguments to next(). The proposed semantics and conventions for
> ContinueIteration and StopIteration are fine, but I would like to be able to
> pass _any_ exception in to the generator, allowing the generator to decide if a
> given exception justifies halting the iteration.

I'm close to dropping this if we can agree on the API for passing
exceptions into __next__(); see the section "Alternative __next__() and
Generator Exception Handling" that I just added to the PEP.

> The _major_ part I don't like is that the block statement's semantics are too
> similar to those of a 'for' loop. I would like to see a new construct that can
> do things a for loop can't do, and which can be used in _conjunction_ with a for
> loop, to provide greater power than either construct on their own.

While both 'block' and 'for' are looping constructs, their handling of the
iterator upon premature exit is entirely different, and it's hard to
reconcile these two before Python 3000.
> PEP 310 forms the basis for a block construct that I _do_ like. The question > then becomes whether or not generators can be used to write useful PEP 310 style > block managers (I think they can, in a style very similar to that of the looping > block construct from PEP 340). I've read through your example, and I'm not clear why you think this is better. It's a much more complex API with less power. What's your use case? Why should 'block' be disallowed from looping? TOOWTDI or do you have something better? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 00:22:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 00:29:37 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42700D3A.5020208@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> Message-ID: <ca471dc20504271522e79ce4a@mail.gmail.com> [Guido] > > An alternative that solves this would be to give __next__() a second > > argument, which is a bool that should be true when the first argument > > is an exception that should be raised. What do people think? > > > > I'll add this to the PEP as an alternative for now. [Nick] > An optional third argument (raise=False) seems a lot friendlier (and more > flexible) than a typecheck. I think I agree, especially since Phillip's alternative (a different method) is even worse IMO. 
> Yet another alternative would be for the default behaviour to be to raise > Exceptions, and continue with anything else, and have the third argument be > "raise_exc=True" and set it to False to pass an exception in without raising it. You've lost me there. If you care about this, can you write it up in more detail (with code samples or whatever)? Or we can agree on a 2nd arg to __next__() (and a 3rd one to next()). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Apr 28 00:38:53 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Apr 28 00:35:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427145022e8985f@mail.gmail.com> References: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> At 02:50 PM 4/27/05 -0700, Guido van Rossum wrote: >[Guido] > > >I'm not sure what the relevance of including a stack trace would be, > > >and why that feature would be necessary to call them coroutines. > >[Phillip] > > Well, you need that feature in order to retain traceback information when > > you're simulating threads with a stack of generators. 
Although you can't
> > return from a generator inside a nested generator, you can simulate this by
> > keeping a stack of generators and having a wrapper that passes control
> > between generators, such that:
> >
> >     def somegen():
> >         result = yield othergen()
> >
> > causes the wrapper to push othergen() on the generator stack and execute
> > it. If othergen() raises an error, the wrapper resumes somegen() and
> > passes in the error. If you can only specify the value but not the
> > traceback, you lose the information about where the error occurred in
> > othergen().
> >
> > So, the feature is necessary for anything other than "simple" (i.e.
> > single-frame) coroutines, at least if you want to retain any possibility of
> > debugging. :)
>
>OK. I think you must be describing continuations there, because my
>brain just exploded. :-)

Probably my attempt at a *brief* explanation backfired. No, they're not
continuations or anything nearly that complicated. I'm "just" simulating
threads using generators that yield a nested generator when they need to do
something that might block waiting for I/O. The pseudothread object pushes
the yielded generator-iterator and resumes it. If that generator-iterator
raises an error, the pseudothread catches it, pops the previous
generator-iterator, and passes the error into it, traceback and all.

The net result is that as long as you use a "yield expression" for any
function/method call that might do blocking I/O, and those functions or
methods are written as generators, you get the benefits of Twisted (async
I/O without threading headaches) without having to "twist" your code into
the callback-registration patterns of Twisted. And, by passing in errors
with tracebacks, the normal process of exception call-stack unwinding
combined with pseudothread stack popping results in a traceback that looks
just as if you had called the functions or methods normally, rather than
via the pseudothreading mechanism.
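[A minimal sketch of the pseudothread wrapper described above, written with the send()/throw() generator methods that PEP 342 later added; this is illustrative, not the actual implementation, and all names are hypothetical:]

```python
def run(task):
    # Trampoline: yielding a generator "calls" it; returning from a
    # generator pops back to its caller; an exception is thrown into
    # the caller, so the traceback threads through the whole stack.
    stack = [task]
    value = None
    error = None
    while stack:
        gen = stack[-1]
        try:
            if error is not None:
                err, error = error, None
                yielded = gen.throw(err)
            else:
                yielded = gen.send(value)
        except StopIteration as stop:
            stack.pop()
            value = stop.value        # the generator's return value
        except Exception as err:
            stack.pop()
            if not stack:
                raise                 # nobody left to handle it
            error = err               # pass the failure up one frame
        else:
            if hasattr(yielded, "send"):
                stack.append(yielded) # nested "call": enter the sub-generator
                value = None
            else:
                value = yielded       # plain value: echo it back
    return value

def inner():
    return 42                         # generators may return values since 3.3
    yield                             # unreachable; marks this as a generator

def outer():
    result = yield inner()            # looks like a call, runs via the trampoline
    return result * 2

print(run(outer()))  # 84
```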
Without that, you would only get the error context of 'async_readline()', because the traceback wouldn't be able to show who *called* async_readline. >In Python 3000 I want to make the traceback a standard attribute of >Exception instances; would that suffice? If you're planning to make 'raise' reraise it, such that 'raise exc' is equivalent to 'raise type(exc), exc, exc.traceback'. Is that what you mean? (i.e., just making it easier to pass the darn things around) If so, then I could probably do what I need as long as there exist no error types whose instances disallow setting a 'traceback' attribute on them after the fact. Of course, if Exception provides a slot (or dictionary) for this, then it shouldn't be a problem. Of course, it seems to me that you also have the problem of adding to the traceback when the same error is reraised... All in all it seems more complex than just allowing an exception and a traceback to be passed. >I really don't want to pass >the whole (type, value, traceback) triple that currently represents an >exception through __next__(). The point of passing it in is so that the traceback can be preserved without special action in the body of generators the exception is passing through. I could be wrong, but it seems to me you need this even for PEP 340, if you're going to support error management templates, and want tracebacks to include the line in the block where the error originated. Just reraising the error inside the generator doesn't seem like it would be enough. > > >An alternative that solves this would be to give __next__() a second > > >argument, which is a bool that should be true when the first argument > > >is an exception that should be raised. What do people think? > > > > I think it'd be simpler just to have two methods, conceptually > > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > > names are. 
> >Part of me likes this suggestion, but part of me worries that it >complicates the iterator API too much. I was thinking that maybe these would be a "coroutine API" or "generator API" instead. That is, something not usable except with generator-iterators and with *new* objects written to conform to it. I don't really see a lot of value in making template blocks work with existing iterators. For that matter, I don't see a lot of value in hand-writing new objects with resume/error, instead of just using a generator. So, I guess I'm thinking you'd have something like tp_block_resume and tp_block_error type slots, and generators' tp_iter_next would just be the same as tp_block_resume(None). But maybe this is the part you're thinking is complicated. :) From tcdelaney at optusnet.com.au Thu Apr 28 00:34:59 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 00:39:19 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <00c501c54b79$58655890$0201a8c0@ryoko> Guido van Rossum wrote: > - temporarily sidestepping the syntax by proposing 'block' instead of > 'with' > - __next__() argument simplified to StopIteration or > ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained +1 A minor sticking point - I don't like that the generator has to re-raise any ``StopIteration`` passed in. Would it be possible to have the semantics be: If a generator is resumed with ``StopIteration``, the exception is raised at the resumption point (and stored for later use). When the generator exits normally (i.e. 
``return`` or falls off the end) it re-raises the stored exception (if any) or raises a new ``StopIteration`` exception. So a generator would become effectively::

    try:
        stopexc = None
        exc = None
        BLOCK1
    finally:
        if exc is not None:
            raise exc
        if stopexc is not None:
            raise stopexc
        raise StopIteration

where within BLOCK1:

``raise <exception>`` is equivalent to::

    exc = <exception>
    return

The start of an ``except`` clause sets ``exc`` to None (if the clause is executed, of course).

Calling ``__next__(exception)`` with ``StopIteration`` is equivalent to::

    stopexc = exception
    (raise exception at resumption point)

Calling ``__next__(exception)`` with ``ContinueIteration`` is equivalent to::

    (resume execution with exception.value)

Calling ``__next__(exception)`` with any other value just raises that value at the resumption point - this allows for calling with arbitrary exceptions. Also, within a for-loop or block-statement, we could have ``raise <exception>`` be equivalent to::

    arg = <exception>
    continue

This also takes care of Brett's concern about distinguishing between exceptions and values passed to the generator. Anything except StopIteration or ContinueIteration will be presumed to be an exception and will be raised. Anything passed via ContinueIteration is a value.
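Tim's dispatch rules above can be sketched against the generator methods Python eventually grew (send()/throw(), standardized later by PEP 342). ``ContinueIteration`` here is the hypothetical value-carrier from the PEP 340 draft, not a real builtin, and ``resume`` is an illustrative name, not proposed API:

```python
class ContinueIteration(Exception):
    """Hypothetical PEP 340 carrier: wraps a value passed back in."""
    def __init__(self, value=None):
        Exception.__init__(self)
        self.value = value

def resume(gen, exception=None):
    # None: plain resumption; ContinueIteration: deliver its value at
    # the suspended yield; anything else: raise it at the resumption
    # point.  This mirrors Tim's three cases.
    if exception is None:
        return gen.send(None)
    if isinstance(exception, ContinueIteration):
        return gen.send(exception.value)
    return gen.throw(exception)

def doubler():
    got = yield "ready"
    yield got * 2

it = doubler()
print(resume(it))                         # ready
print(resume(it, ContinueIteration(21)))  # 42
```

Anything that is neither ``None`` nor a ``ContinueIteration`` is presumed to be an exception and is raised inside the generator, matching the last rule above.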
Tim Delaney From gvanrossum at gmail.com Thu Apr 28 00:58:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 00:58:17 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> Message-ID: <ca471dc205042715585917829f@mail.gmail.com> [Phillip] > Probably my attempt at a *brief* explanation backfired. No, they're not > continuations or anything nearly that complicated. I'm "just" simulating > threads using generators that yield a nested generator when they need to do > something that might block waiting for I/O. The pseudothread object pushes > the yielded generator-iterator and resumes it. If that generator-iterator > raises an error, the pseudothread catches it, pops the previous > generator-iterator, and passes the error into it, traceback and all. > > The net result is that as long as you use a "yield expression" for any > function/method call that might do blocking I/O, and those functions or > methods are written as generators, you get the benefits of Twisted (async > I/O without threading headaches) without having to "twist" your code into > the callback-registration patterns of Twisted. And, by passing in errors > with tracebacks, the normal process of exception call-stack unwinding > combined with pseudothread stack popping results in a traceback that looks > just as if you had called the functions or methods normally, rather than > via the pseudothreading mechanism. 
Without that, you would only get the > error context of 'async_readline()', because the traceback wouldn't be able > to show who *called* async_readline. OK, I sort of get it, at a very high-level, although I still feel this is wildly out of my league. I guess I should try it first. ;-) > >In Python 3000 I want to make the traceback a standard attribute of > >Exception instances; would that suffice? > > If you're planning to make 'raise' reraise it, such that 'raise exc' is > equivalent to 'raise type(exc), exc, exc.traceback'. Is that what you > mean? (i.e., just making it easier to pass the darn things around) > > If so, then I could probably do what I need as long as there exist no error > types whose instances disallow setting a 'traceback' attribute on them > after the fact. Of course, if Exception provides a slot (or dictionary) > for this, then it shouldn't be a problem. Right, this would be a standard part of the Exception base class, just like in Java. > Of course, it seems to me that you also have the problem of adding to the > traceback when the same error is reraised... I think when it is re-raised, no traceback entry should be added; the place that re-raises it should not show up in the traceback, only the place that raised it in the first place. To me that's the essence of re-raising (and I think that's how it works when you use raise without arguments). > All in all it seems more complex than just allowing an exception and a > traceback to be passed. Making the traceback a standard attribute of the exception sounds simpler; having to keep track of two separate arguments that are as closely related as an exception and the corresponding traceback is more complex IMO. The only reason why it isn't done that way in current Python is that it couldn't be done that way back when exceptions were strings. > >I really don't want to pass > >the whole (type, value, traceback) triple that currently represents an > >exception through __next__(). 
> > The point of passing it in is so that the traceback can be preserved > without special action in the body of generators the exception is passing > through. > > I could be wrong, but it seems to me you need this even for PEP 340, if > you're going to support error management templates, and want tracebacks to > include the line in the block where the error originated. Just reraising > the error inside the generator doesn't seem like it would be enough. *** I have to think about this more... *** > > > I think it'd be simpler just to have two methods, conceptually > > > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > > > names are. > > > >Part of me likes this suggestion, but part of me worries that it > >complicates the iterator API too much. > > I was thinking that maybe these would be a "coroutine API" or "generator > API" instead. That is, something not usable except with > generator-iterators and with *new* objects written to conform to it. I > don't really see a lot of value in making template blocks work with > existing iterators. (You mean existing non-generator iterators, right? existing *generators* will work just fine -- the exception will pass right through them and that's exactly the right default semantics. Existing non-generator iterators are indeed a different case, and this is actually an argument for having a separate API: if the __error__() method doesn't exist, the exception is just re-raised rather than bothering the iterator. OK, I think I'm sold. > For that matter, I don't see a lot of value in > hand-writing new objects with resume/error, instead of just using a generator. Not a lot, but I expect that there may be a few, like an optimized version of lock synchronization. > So, I guess I'm thinking you'd have something like tp_block_resume and > tp_block_error type slots, and generators' tp_iter_next would just be the > same as tp_block_resume(None). > > But maybe this is the part you're thinking is complicated. 
:) No, this is where I feel right at home. ;-) I hadn't thought much about the C-level slots yet, but this is a reasonable proposal. Time to update the PEP; I'm pretty much settled on these semantics now... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 01:01:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 01:02:02 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <00c501c54b79$58655890$0201a8c0@ryoko> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> Message-ID: <ca471dc205042716017a85d241@mail.gmail.com> > A minor sticking point - I don't like that the generator has to re-raise any > ``StopIteration`` passed in. Would it be possible to have the semantics be: > > If a generator is resumed with ``StopIteration``, the exception is raised > at the resumption point (and stored for later use). When the generator > exits normally (i.e. ``return`` or falls off the end) it re-raises the > stored exception (if any) or raises a new ``StopIteration`` exception. I don't like the idea of storing exceptions. Let's just say that we don't care whether it re-raises the very same StopIteration exception that was passed in or a different one -- it's all moot anyway because the StopIteration instance is thrown away by the caller of next(). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tcdelaney at optusnet.com.au Thu Apr 28 01:03:37 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 01:03:47 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com><ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> Message-ID: <00e001c54b7d$5888c240$0201a8c0@ryoko> Tim Delaney wrote: > Also, within a for-loop or block-statement, we could have ``raise > <exception>`` be equivalent to:: > > arg = <exception> > continue For this to work, builtin next() would need to be a bit smarter ... specifically, for an old-style iterator, any non-Iteration exception would need to be re-raised there. Tim Delaney From tcdelaney at optusnet.com.au Thu Apr 28 01:07:00 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 01:07:10 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> <ca471dc205042716017a85d241@mail.gmail.com> Message-ID: <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> Guido van Rossum wrote: >> A minor sticking point - I don't like that the generator has to >> re-raise any ``StopIteration`` passed in. Would it be possible to >> have the semantics be: >> >> If a generator is resumed with ``StopIteration``, the exception >> is raised at the resumption point (and stored for later use). >> When the generator exits normally (i.e. 
``return`` or falls off >> the end) it re-raises the stored exception (if any) or raises a >> new ``StopIteration`` exception. > > I don't like the idea of storing exceptions. Let's just say that we > don't care whether it re-raises the very same StopIteration exception > that was passed in or a different one -- it's all moot anyway because > the StopIteration instance is thrown away by the caller of next(). OK - so what is the point of the sentence::

    The generator should re-raise this exception; it should not yield
    another value.

when discussing StopIteration? Tim Delaney From gvanrossum at gmail.com Thu Apr 28 01:17:32 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 01:17:35 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> <ca471dc205042716017a85d241@mail.gmail.com> <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> Message-ID: <ca471dc205042716173c992c2c@mail.gmail.com> > OK - so what is the point of the sentence:: > > The generator should re-raise this exception; it should not yield > another value. > > when discussing StopIteration? It forbids returning a value, since that would mean the generator could "refuse" a break or return statement, which is a little bit too weird (returning a value instead would turn these into continue statements). I'll change this to clarify that I don't care about the identity of the StopIteration instance. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Apr 28 01:56:33 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Apr 28 01:56:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42700B88.40703@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> <42700B88.40703@gmail.com> Message-ID: <427026B1.4020002@ocf.berkeley.edu> Nick Coghlan wrote: > Brett C. wrote: > >> And while the thought is in my head, I think block statements should >> be viewed >> less as a tweaked version of a 'for' loop and more as an extension to >> generators that happens to be very handy for resource management (while >> allowing iterators to come over and play on the new swing set as >> well). I >> think if you take that view then the argument that they are too >> similar to >> 'for' loops loses some luster (although I doubt Nick is going to >> buy this =) . > > > I'm surprisingly close to agreeing with you, actually. I've worked out > that it isn't the looping that I object to, it's the inability to get > out of the loop without exhausting the entire iterator. > 'break' isn't enough for you as laid out by the proposal? The raising of StopIteration, which is what 'break' does according to the standard, should be enough to stop the loop without exhausting things. Same way you stop a 'for' loop from executing entirely. -Brett From pje at telecommunity.com Thu Apr 28 02:01:38 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Apr 28 01:58:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042715585917829f@mail.gmail.com> References: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> At 03:58 PM 4/27/05 -0700, Guido van Rossum wrote: >OK, I sort of get it, at a very high-level, although I still feel this >is wildly out of my league. > >I guess I should try it first. ;-) It's not unlike David Mertz' articles on implementing coroutines and multitasking using generators, except that I'm adding more "debugging sugar", if you will, by making the tracebacks look normal. It's just that the *how* requires me to pass the traceback into the generator. At the moment, I accomplish that by doing a 3-argument raise inside of 'events.resume()', but it would be really nice to be able to get rid of 'events.resume()' in a future version of Python. > > Of course, it seems to me that you also have the problem of adding to the > > traceback when the same error is reraised... > >I think when it is re-raised, no traceback entry should be added; the >place that re-raises it should not show up in the traceback, only the >place that raised it in the first place. To me that's the essence of >re-raising (and I think that's how it works when you use raise without >arguments). I think maybe I misspoke. I mean adding to the traceback *so* that when the same error is reraised, the intervening frames are included, rather than lost. 
In other words, IIRC, the traceback chain is normally increased by one entry for each frame the exception escapes. However, if you start hiding that inside of the exception instance, you'll have to modify it instead of just modifying the threadstate. Does that make sense, or am I missing something? > > For that matter, I don't see a lot of value in > > hand-writing new objects with resume/error, instead of just using a > generator. > >Not a lot, but I expect that there may be a few, like an optimized >version of lock synchronization. My point was mainly that we can err on the side of caller convenience rather than callee convenience, if there are fewer implementations. So, e.g. multiple methods aren't a big deal if it makes the 'block' implementation simpler, if only generators and a handful of special template objects are going need to implement the block API. > > So, I guess I'm thinking you'd have something like tp_block_resume and > > tp_block_error type slots, and generators' tp_iter_next would just be the > > same as tp_block_resume(None). > >I hadn't thought much about the C-level slots yet, but this is a >reasonable proposal. Note that it also doesn't require a 'next()' builtin, or a next vs. __next__ distinction, if you don't try to overload iteration and templating. The fact that a generator can be used for templating, doesn't have to imply that any iterator should be usable as a template, or that the iteration protocol is involved in any way. You could just have __resume__/__error__ matching the tp_block_* slots. This also has the benefit of making the delineation between template blocks and for loops more concrete. For example, this: block open("filename") as f: ... could be an immediate TypeError (due to the lack of a __resume__) instead of biting you later on in the block when you try to do something with f, or because the block is repeating for each line of the file, etc. 
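The pseudothread scheme Phillip describes -- a stack of generators in which errors are thrown back into the caller's frame so the traceback spans the whole logical call chain -- can be sketched with the send()/throw() generator methods that later arrived via PEP 342. The names and structure below are illustrative only, not Phillip's actual implementation:

```python
def pseudothread(root):
    """Drive a stack of generators.  A generator "calls" another by
    yielding it; errors are thrown into the caller's frame so the
    traceback shows the entire logical call chain."""
    stack, value, error = [root], None, None
    while stack:
        gen = stack[-1]
        try:
            if error is not None:
                pending, error = error, None  # consume the pending error
                result = gen.throw(pending)   # re-raise at caller's yield
            else:
                result = gen.send(value)      # send(None) also starts a gen
        except StopIteration as stop:
            stack.pop()
            value = stop.value                # generator's return value
        except BaseException as exc:
            stack.pop()
            if not stack:
                raise                         # bubbled past the root
            error = exc                       # pass upward, traceback intact
        else:
            if hasattr(result, "send"):       # yielded a generator:
                stack.append(result)          # "call" it
                value = None
            else:
                value = result                # plain value: echo it back
    return value

def child():
    return 42
    yield                                     # makes this a generator

def parent():
    v = yield child()                         # "call" child, get its result
    return v + 1
```

``pseudothread(parent())`` evaluates to 43; had ``child`` raised instead, the error would propagate out with both frames in its traceback, which is exactly the debugging property Phillip is after.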
From nas at arctrix.com Thu Apr 28 02:02:23 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu Apr 28 02:02:27 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <20050428000223.GA8869@mems-exchange.org> On Wed, Apr 27, 2005 at 12:30:22AM -0700, Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). [Note: most of these comments are based on version 1.2 of the PEP] It seems like what you are proposing is a limited form of coroutines. Just as Python's generators are limited (yield can only jump up one stack frame), these coroutines have a similar limitation. Someone mentioned that we are edging closer to continuations. I think that may be a good thing. One big difference between what you propose and general continuations is in finalization semantics. I don't think anyone has figured out a way for try/finally to work with continuations. The fact that try/finally can be used inside generators is a significant feature of this PEP, IMO. Regarding the syntax, I actually quite like the 'block' keyword. It doesn't seem so surprising that the block may be a loop. Allowing 'continue' to have an optional value is elegant syntax. I'm a little bit concerned about what happens if the iterator does not expect a value. If I understand the PEP, it is silently ignored. That seems like it could hide bugs. OTOH, it doesn't seem any worse then a caller not expecting a return value. It's interesting that there is such similarity between 'for' and 'block'. Why is it that block does not call iter() on EXPR1? 
I guess the fact that 'break' and 'return' work differently is a more significant difference. After thinking about this more, I wonder if iterators meant for 'for' loops and iterators meant for 'block' statements are really very different things. It seems like a block-iterator really needs to handle yield-expressions. I wonder if generators that contain a yield-expression should properly be called coroutines. Practically, I suspect it would just cause confusion. Perhaps passing an Iteration instance to next() should not be treated the same as passing None. It seems like that would make implementing the iterator easier. Why not treat Iteration like any normal value? Then only None, StopIteration, and ContinueIteration would be special. Argh, it took me so long to write this that you are already up to version 1.6 of the PEP. Time to start a new message. :-) Neil From bac at OCF.Berkeley.EDU Thu Apr 28 02:18:19 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 28 02:18:47 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504271522e79ce4a@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> <ca471dc20504271522e79ce4a@mail.gmail.com> Message-ID: <42702BCB.9000500@ocf.berkeley.edu> Guido van Rossum wrote: > [Guido] > >>>An alternative that solves this would be to give __next__() a second >>>argument, which is a bool that should be true when the first argument >>>is an exception that should be raised. What do people think? >>> >>>I'll add this to the PEP as an alternative for now.
> > > [Nick] > >>An optional third argument (raise=False) seems a lot friendlier (and more >>flexible) than a typecheck. > > > I think I agree, especially since Phillip's alternative (a different > method) is even worse IMO. > The extra argument works for me as well. > >>Yet another alternative would be for the default behaviour to be to raise >>Exceptions, and continue with anything else, and have the third argument be >>"raise_exc=True" and set it to False to pass an exception in without raising it. > > > You've lost me there. If you care about this, can you write it up in > more detail (with code samples or whatever)? Or we can agree on a 2nd > arg to __next__() (and a 3rd one to next()). > Channeling Nick, I think he is saying that the raising argument should be made True by default and be named 'raise_exc'. -Brett From gvanrossum at gmail.com Thu Apr 28 02:19:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 02:19:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> Message-ID: <ca471dc2050427171967cec0ac@mail.gmail.com> [Phillip] > It's not unlike David Mertz' articles on implementing coroutines and > multitasking using generators, except that I'm adding more "debugging > sugar", if you will, by making the tracebacks look normal. It's just that > the *how* requires me to pass the traceback into the generator. 
At the > moment, I accomplish that by doing a 3-argument raise inside of > 'events.resume()', but it would be really nice to be able to get rid of > 'events.resume()' in a future version of Python. I'm not familiar with Mertz' articles and frankly I still fear it's head-explosive material. ;-) > I think maybe I misspoke. I mean adding to the traceback *so* that when > the same error is reraised, the intervening frames are included, rather > than lost. > > In other words, IIRC, the traceback chain is normally increased by one > entry for each frame the exception escapes. However, if you start hiding > that inside of the exception instance, you'll have to modify it instead of > just modifying the threadstate. Does that make sense, or am I missing > something? Adding to the traceback chain already in the exception object is totally kosher, if that's where the traceback is kept. > My point was mainly that we can err on the side of caller convenience > rather than callee convenience, if there are fewer implementations. So, > e.g. multiple methods aren't a big deal if it makes the 'block' > implementation simpler, if only generators and a handful of special > template objects are going need to implement the block API. Well, the way my translation is currently written, writing next(itr, arg, exc) is a lot more convenient for the caller than having to write::

    # if exc is True, arg is an exception; otherwise arg is a value
    if exc:
        err = getattr(itr, "__error__", None)
        if err is not None:
            VAR1 = err(arg)
        else:
            raise arg
    else:
        VAR1 = next(itr, arg)

but since this will actually be code generated by the bytecode compiler, I think callee convenience is more important. And the ability to default __error__ to raise the exception makes a lot of sense.
And we could wrap all this inside the next() built-in -- even if the actual object should have separate __next__() and __error__() methods, the user-facing built-in next() function might take an extra flag to indicate that the argument is an exception, and to handle it appropriately (as shown above). > > > So, I guess I'm thinking you'd have something like tp_block_resume and > > > tp_block_error type slots, and generators' tp_iter_next would just be the > > > same as tp_block_resume(None). > > > >I hadn't thought much about the C-level slots yet, but this is a > >reasonable proposal. > > Note that it also doesn't require a 'next()' builtin, or a next vs. > __next__ distinction, if you don't try to overload iteration and > templating. The fact that a generator can be used for templating, doesn't > have to imply that any iterator should be usable as a template, or that the > iteration protocol is involved in any way. You could just have > __resume__/__error__ matching the tp_block_* slots. > > This also has the benefit of making the delineation between template blocks > and for loops more concrete. For example, this:

>     block open("filename") as f:
>         ...

> could be an immediate TypeError (due to the lack of a __resume__) instead > of biting you later on in the block when you try to do something with f, or > because the block is repeating for each line of the file, etc. I'm not convinced of that, especially since all *generators* will automatically be usable as templates, whether or not they were intended as such. And why *shouldn't* you be allowed to use a block for looping, if you like the exit behavior (guaranteeing that the iterator is exhausted when you leave the block in any way)?
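Guido's user-facing next() that routes exceptions through __error__ maps almost directly onto the throw() method generators ended up with. A sketch under that assumption (``block_next`` and its flag protocol are hypothetical names for illustration -- PEP 340 was never adopted in this form):

```python
def block_next(itr, arg=None, is_exc=False):
    # When is_exc is true, hand the exception to the iterator if it can
    # accept one (throw() standing in for the proposed __error__);
    # otherwise raise it on the iterator's behalf.
    if is_exc:
        thrower = getattr(itr, "throw", None)
        if thrower is None:
            raise arg
        return thrower(arg)
    # Plain resumption: send a value into a generator, or just advance
    # an ordinary iterator.
    if hasattr(itr, "send"):
        return itr.send(arg)
    return next(itr)

def managed(items):
    # A resource-management template in the style under discussion; a
    # list stands in for a real resource to keep the sketch runnable.
    resource = list(items)
    try:
        yield resource
    finally:
        resource.clear()          # "close" on any exit
```

Driving it by hand: ``block_next(it)`` enters the template, and ``block_next(it, ValueError(), is_exc=True)`` simulates an exception escaping the block; the ``finally`` clause runs either way, while a plain iterator without ``throw`` simply has the exception re-raised for it.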
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 02:43:19 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 02:43:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050428000223.GA8869@mems-exchange.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> Message-ID: <ca471dc2050427174368f4ee3d@mail.gmail.com> > It seems like what you are proposing is a limited form of > coroutines. Well, I thought that's already what generators were -- IMO there isn't much news there. We're providing a more convenient way to pass a value back, but that's always been possible (see Fredrik's examples). > Allowing 'continue' to have an optional value is elegant syntax. > I'm a little bit concerned about what happens if the iterator does > not expect a value. If I understand the PEP, it is silently > ignored. That seems like it could hide bugs. OTOH, it doesn't seem > any worse then a caller not expecting a return value. Exactly. > It's interesting that there is such similarity between 'for' and > 'block'. Why is it that block does not call iter() on EXPR1? I > guess the fact that 'break' and 'return' work differently is a more > significant difference. Well, perhaps block *should* call iter()? I'd like to hear votes about this. In most cases that would make a block-statement entirely equivalent to a for-loop, the exception being only when there's an exception or when breaking out of an iterator with resource management.
I initially decided it should not call iter() so as to emphasize that this isn't supposed to be used for looping over sequences -- EXPR1 is really expected to be a resource management generator (or iterator). > After thinking about this more, I wonder if iterators meant for > 'for' loops and iterators meant for 'block' statements are really > very different things. It seems like a block-iterator really needs > to handle yield-expressions. But who knows, they might be useful for for-loops as well. After all, passing values back to the generator has been on some people's wish list for a long time. > I wonder if generators that contain a yield-expression should > properly be called coroutines. Practically, I suspect it would just > cause confusion. I have to admit that I haven't looked carefully for use cases for this! I just looked at a few Ruby examples and realized that it would be a fairly simple extension of generators. You can call such generators coroutines, but they are still generators. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Thu Apr 28 02:48:52 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu Apr 28 02:48:55 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042715585917829f@mail.gmail.com> References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> Message-ID: <20050428004851.GB8869@mems-exchange.org> On Wed, Apr 27, 2005 at 03:58:14PM -0700, Guido van Rossum wrote: > Time to update the PEP; I'm pretty much settled on these semantics > now... 
[I'm trying to do a bit of Guido channeling here. I fear I may not be entirely successful.] The __error__ method seems to simplify things a lot. The purpose of the __error__ method is to notify the iterator that the loop has been exited in some unusual way (i.e. not via a StopIteration raised by the iterator itself). The translation of a block-statement could become:

    itr = EXPR1
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            break
        try:
            arg = None
            BLOCK1
        except Exception, exc:
            err = getattr(itr, '__error__', None)
            if err is None:
                raise exc
            err(exc)

The translation of "continue EXPR2" would become:

    arg = EXPR2
    continue

The translation of "break" inside a block-statement would become:

    err = getattr(itr, '__error__', None)
    if err is not None:
        err(StopIteration())
    break

The translation of "return EXPR3" inside a block-statement would become:

    err = getattr(itr, '__error__', None)
    if err is not None:
        err(StopIteration())
    return EXPR3

For generators, calling __error__ with a StopIteration instance would execute any 'finally' block. Any other argument to __error__ would get re-raised by the generator instance. You could then write:

    def opened(filename):
        fp = open(filename)
        try:
            yield fp
        finally:
            fp.close()

and use it like this:

    block opened(filename) as fp:
        ....

The main difference between 'for' and 'block' is that more iteration may happen after breaking or returning out of a 'for' loop. An iterator used in a block statement is always used up before the block is exited. Maybe __error__ should be called __break__ instead. StopIteration is not really an error. If it is called something like __break__, does it really need to accept an argument? Offhand I can't think of what an iterator might do with an exception. Neil From bac at OCF.Berkeley.EDU Thu Apr 28 02:52:13 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Apr 28 02:52:20 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427174368f4ee3d@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> <ca471dc2050427174368f4ee3d@mail.gmail.com> Message-ID: <427033BD.4060600@ocf.berkeley.edu> Guido van Rossum wrote: [SNIP] >>It's interesting that there is such similarity between 'for' and >>'block'. Why is it that block does not call iter() on EXPR1? I >>guess that fact that 'break' and 'return' work differently is a more >>significant difference. > > > Well, perhaps block *should* call iter()? I'd like to hear votes about > this. In most cases that would make a block-statement entirely > equivalent to a for-loop, the exception being only when there's an > exception or when breaking out of an iterator with resource > management. > I am -0 on changing it to call iter(). I do like the distinction from a 'for' loop and leaving an emphasis for template blocks (or blocks, or whatever hip term you crazy kids are using for these things at the moment) to use generators. As I said before, I am viewing these blocks as a construct for external control of generators, not as a snazzy 'for' loop. -Brett From pje at telecommunity.com Thu Apr 28 03:00:01 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Apr 28 02:56:26 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427171967cec0ac@mail.gmail.com> References: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> At 05:19 PM 4/27/05 -0700, Guido van Rossum wrote: >[Phillip] > > This also has the benefit of making the delineation between template blocks > > and for loops more concrete. For example, this: > > > > block open("filename") as f: > > ... > > > > could be an immediate TypeError (due to the lack of a __resume__) instead > > of biting you later on in the block when you try to do something with f, or > > because the block is repeating for each line of the file, etc. > >I'm not convinced of that, especially since all *generators* will >automatically be usable as templates, whether or not they were >intended as such. And why *shouldn't* you be allowed to use a block >for looping, if you like the exit behavior (guaranteeing that the >iterator is exhausted when you leave the block in any way)? It doesn't guarantee that, does it? (Re-reads PEP.) Aha, for *generators* it does, because it says passing StopIteration in, stops execution of the generator. But it doesn't say anything about whether iterators in general are allowed to be resumed afterward, just that they should not yield a value in response to the __next__, IIUC. 
As currently written, it sounds like existing non-generator iterators would not be forced to an exhausted state. As for the generator-vs-template distinction, I'd almost say that argues in favor of requiring some small extra distinction to make a generator template-safe, rather than in favor of making all iterators template-promiscuous, as it were. Perhaps a '@block_template' decorator on the generator? This would have the advantage of documenting the fact that the generator was written with that purpose in mind. It seems to me that using a template block to loop over a normal iterator is a TOOWTDI violation, but perhaps you're seeing something deeper here...? From bac at OCF.Berkeley.EDU Thu Apr 28 03:03:55 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 28 03:04:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050428004851.GB8869@mems-exchange.org> References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org> Message-ID: <4270367B.7000606@ocf.berkeley.edu> Neil Schemenauer wrote: > On Wed, Apr 27, 2005 at 03:58:14PM -0700, Guido van Rossum wrote: > >>Time to update the PEP; I'm pretty much settled on these semantics >>now... > > > [I'm trying to do a bit of Guido channeling here. I fear I may not > be entirely successful.] > > The the __error__ method seems to simplify things a lot. The > purpose of the __error__ method is to notify the iterator that the > loop has been exited in some unusual way (i.e. not via a > StopIteration raised by the iterator itself). 
>
> The translation of a block-statement could become:
>
>     itr = EXPR1
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             break
>         try:
>             arg = None
>             BLOCK1
>         except Exception, exc:
>             err = getattr(itr, '__error__', None)
>             if err is None:
>                 raise exc
>             err(exc)
>
> The translation of "continue EXPR2" would become:
>
>     arg = EXPR2
>     continue
>
> The translation of "break" inside a block-statement would become:
>
>     err = getattr(itr, '__error__', None)
>     if err is not None:
>         err(StopIteration())
>     break
>
> The translation of "return EXPR3" inside a block-statement would become:
>
>     err = getattr(itr, '__error__', None)
>     if err is not None:
>         err(StopIteration())
>     return EXPR3
>
> For generators, calling __error__ with a StopIteration instance
> would execute any 'finally' block. Any other argument to __error__
> would get re-raised by the generator instance.
>
> You could then write:
>
>     def opened(filename):
>         fp = open(filename)
>         try:
>             yield fp
>         finally:
>             fp.close()
>
> and use it like this:
>
>     block opened(filename) as fp:
>         ....

Seems great to me. Clean separation between when the block wants things
to keep going if it can and when it wants to let the generator know it's
all done.

> The main difference between 'for' and 'block' is that more iteration
> may happen after breaking or returning out of a 'for' loop. An
> iterator used in a block statement is always used up before the
> block is exited.

This constant use of the phrase "used up" for these blocks is bugging me
slightly. It isn't like the passed-in generator is having next() called
on it until it stops; it is just finishing up (or cleaning up, choose
your favorite term). It may have had more iterations to go, but the
block signaled it was done and thus the generator got its chance to
finish up and pick up after itself.

> Maybe __error__ should be called __break__ instead.

I like that.

> StopIteration
If it is called something like __break__, > does it really need to accept an argument? Of hand I can't think of > what an iterator might do with an exception. > Could just make the default value be StopIteration. Is there really a perk to __break__ only raising StopIteration and not accepting an argument? The real question of whether people would use the ability of raising other exceptions passed in from the block. If you view yield expressions as method calls, then being able to call __break__ with other exceptions makes sense since you might code up try/except statements within the generator and that will care about what kind of exception gets raised. -Brett From pje at telecommunity.com Thu Apr 28 03:12:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Apr 28 03:08:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427174368f4ee3d@mail.gmail.com> References: <20050428000223.GA8869@mems-exchange.org> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> Message-ID: <5.1.1.6.0.20050427210148.03bf0ae0@mail.telecommunity.com> At 05:43 PM 4/27/05 -0700, Guido van Rossum wrote: >Well, perhaps block *should* call iter()? I'd like to hear votes about >this. In most cases that would make a block-statement entirely >equivalent to a for-loop, the exception being only when there's an >exception or when breaking out of an iterator with resource >management. > >I initially decided it should not call iter() so as to emphasize that >this isn't supposed to be used for looping over sequences -- EXPR1 is >really expected to be a resource management generator (or iterator). 
Which is why I vote for not calling iter(), and further, that blocks not
use the iteration protocol, but rather use a new "block template"
protocol. And finally, that a decorator be used to convert a generator
function to a "template function" (i.e., a function that returns a block
template). I think it's less confusing to have two completely distinct
concepts than to have two things that are very similar, yet different in
a blurry kind of way.

If you want to use a block on an iterator, you can always explicitly do
something like this:

    @blocktemplate
    def iterate(iterable):
        for value in iterable:
            yield value

    block iterate([1,2,3]) as x:
        print x

> > I wonder if generators that contain a yield-expression should
> > properly be called coroutines. Practically, I suspect it would just
> > cause confusion.
>
>I have to admit that I haven't looked carefully for use cases for
>this!

Anything that wants to do co-operative multitasking, basically.

From steven.bethard at gmail.com Thu Apr 28 06:37:36 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 28 06:37:39 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050428004851.GB8869@mems-exchange.org>
References: <ca471dc2050426043713116248@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org>
Message-ID: <d11dcfba050427213772840fcb@mail.gmail.com>

Neil Schemenauer wrote:
> For generators, calling __error__ with a StopIteration instance
> would execute any 'finally' block. Any other argument to __error__
> would get re-raised by the generator instance.

This is only one case, right?
Any exception (including StopIteration) passed to a generator's __error__ method will just be re-raised at the point of the last yield, right? Or is there a need to special-case StopIteration? STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Thu Apr 28 06:59:27 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 28 06:59:30 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> <ca471dc2050427171967cec0ac@mail.gmail.com> <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> Message-ID: <d11dcfba05042721597ca99716@mail.gmail.com> Phillip J. Eby wrote: > At 05:19 PM 4/27/05 -0700, Guido van Rossum wrote: > >I'm not convinced of that, especially since all *generators* will > >automatically be usable as templates, whether or not they were > >intended as such. And why *shouldn't* you be allowed to use a block > >for looping, if you like the exit behavior (guaranteeing that the > >iterator is exhausted when you leave the block in any way)? > > It doesn't guarantee that, does it? (Re-reads PEP.) Aha, for *generators* > it does, because it says passing StopIteration in, stops execution of the > generator. But it doesn't say anything about whether iterators in general > are allowed to be resumed afterward, just that they should not yield a > value in response to the __next__, IIUC. As currently written, it sounds > like existing non-generator iterators would not be forced to an exhausted > state. 
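The "exhausted state" convention under discussion -- a well-behaved iterator, once it has raised StopIteration, keeps raising it on every later call -- can be sketched concretely. The class below is purely illustrative (the name `Exhaustible` and the dual `next`/`__next__` spellings are not from the thread):

```python
class Exhaustible(object):
    """Iterator that stays exhausted once StopIteration has been raised."""
    def __init__(self, items):
        self._items = list(items)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        # Once done, every later call must keep raising StopIteration.
        if self._done or not self._items:
            self._done = True
            raise StopIteration
        return self._items.pop(0)

    next = __next__  # Python 2 spelling of the same method

it = Exhaustible([1, 2])
assert [x for x in it] == [1, 2]

# Resuming after exhaustion must fail, every time.
for _ in range(3):
    try:
        it.next()
    except StopIteration:
        pass
    else:
        raise AssertionError("iterator resumed after exhaustion")
```

A non-generator iterator is free to violate this (that is Phillip's point above); nothing in the iteration protocol machinery enforces it.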
I wonder if something can be done like what was done for (dare I say
it?) "old-style" iterators:

    "The intention of the protocol is that once an iterator's next()
    method raises StopIteration, it will continue to do so on subsequent
    calls. Implementations that do not obey this property are deemed
    broken. (This constraint was added in Python 2.3; in Python 2.2,
    various iterators are broken according to this rule.)"[1]

This would mean that if next(itr, ...) raised StopIteration, then
next(itr, ...) should continue to raise StopIteration on subsequent
calls. I don't know how this is done in the current implementation.
Would it be hard to do so for the proposed block-statements? If nothing
else, we might at least clearly document what well-behaved iterators
should do...

STeVe

[1] http://docs.python.org/lib/typeiter.html

--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz Thu Apr 28 08:26:39 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu Apr 28 08:35:45 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org>
Message-ID: <4270821F.5000404@canterbury.ac.nz>

Neil Schemenauer wrote:

> The translation of a block-statement could become:
>
>     itr = EXPR1
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             break
>         try:
>             arg = None
>             BLOCK1
>         except Exception, exc:
>             err = getattr(itr, '__error__', None)
>             if err is None:
>                 raise exc
>             err(exc)

That can't be
right. When __error__ is called, if the iterator catches the exception and goes on to do another yield, the yielded value needs to be assigned to VAR1 and the block executed again. It looks like your version will ignore the value from the second yield and only execute the block again on the third yield. So something like Guido's safe_loop() would miss every other yield. I think Guido was right in the first place, and __error__ really is just a minor variation on __next__ that shouldn't have a separate entry point. Greg From greg.ewing at canterbury.ac.nz Thu Apr 28 08:33:20 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 28 08:42:26 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> Message-ID: <427083B0.6040204@canterbury.ac.nz> Guido van Rossum wrote: > And surely you exaggerate. How about this then: > > The with-statement is similar to the for-loop. Until you've > learned about the differences in detail, the only time you should > write a with-statement is when the documentation for the function > you are calling says you should. I think perhaps I'm not expressing myself very well. What I'm after is a high-level explanation that actually tells people something useful, and *doesn't* cop out by just saying "you're not experienced enough to understand this yet". If such an explanation can't be found, I strongly suspect that this doesn't correspond to a cohesive enough concept to be made into a built-in language feature. If you can't give a short, understandable explanation of it, then it's probably a bad idea. 
Greg

From python-dev at zesty.ca Thu Apr 28 09:01:35 2005
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Thu Apr 28 09:01:45 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <427083B0.6040204@canterbury.ac.nz>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz>
Message-ID: <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org>

On Thu, 28 Apr 2005, Greg Ewing wrote:
> If such an explanation can't be found, I strongly suspect
> that this doesn't correspond to a cohesive enough concept
> to be made into a built-in language feature. If you can't
> give a short, understandable explanation of it, then it's
> probably a bad idea.

In general, i agree with the sentiment of this -- though it's also okay
if there is a way to break the concept down into concepts that *are*
simple enough to have short, understandable explanations.

-- ?!ng

From stephen at xemacs.org Thu Apr 28 10:42:53 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu Apr 28 10:42:58 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <ca471dc2050426022458a4ad@mail.gmail.com> (Guido van Rossum's message of "Tue, 26 Apr 2005 02:24:53 -0700")
References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com>
Message-ID: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Guido" == Guido van Rossum <gvanrossum@gmail.com> writes:

    Guido> You mean like this?

    if x > 0:
        ...normal case...
    elif y > 0:
        ...abnormal case...
    else:
        ...edge case...

The salient example!
If it's no accident that those conditions are mutually exclusive and exhaustive, doesn't that code require at least a comment saying so, and maybe even an assertion to that effect? Where you can use a switch, it gives both, and throws in economy in both source and object code as a bonus. Not a compelling argument---your example shows switches are not universally applicable---but it's a pretty good deal. Guido> You have guts to call that bad style! :-) Exaggeration in defense of elegance is no vice.<wink> -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From pedronis at strakt.com Thu Apr 28 10:51:32 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Thu Apr 28 10:49:45 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427083B0.6040204@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> Message-ID: <4270A414.2000104@strakt.com> Greg Ewing wrote: > Guido van Rossum wrote: > >> And surely you exaggerate. How about this then: >> >> The with-statement is similar to the for-loop. Until you've >> learned about the differences in detail, the only time you should >> write a with-statement is when the documentation for the function >> you are calling says you should. > > > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". 
>

This makes sense to me, also because a new control statement will
usually not be as hidden as metaclasses and some other possibly obscure
corners can be. OTOH I have the impression that the new toy is too shiny
to have a lucid discussion of whether it could have sharp edges or
produce dizziness for the inexperienced.

From greg.ewing at canterbury.ac.nz Thu Apr 28 10:42:26 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu Apr 28 10:51:32 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com>
Message-ID: <4270A1F2.1030401@canterbury.ac.nz>

Elegant as the idea behind PEP 340 is, I can't shake the feeling that
it's an abuse of generators. It seems to go to a lot of trouble and
complication so you can write a generator and pretend it's a function
taking a block argument.

I'd like to reconsider a thunk implementation. It would be a lot
simpler, doing just what is required without any jiggery pokery with
exceptions and break/continue/return statements. It would be easy to
explain what it does and why it's useful.

Are there any objective reasons to prefer a generator implementation
over a thunk implementation? If for-loops had been implemented with
thunks, we might never have created generators. But generators have
turned out to be more powerful, because you can have more than one of
them on the go at once. Is there a use for that capability here?

I can think of one possible use. Suppose you want to acquire multiple
resources; one way would be to nest block-statements, like

    block opening(file1) as f:
        block opening(file2) as g:
            ...

If you have a lot of resources to acquire, the nesting could get very deep.
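Spelled out with the tools available today, the nested acquisition above corresponds to nested try/finally. In this sketch a toy `Resource` class stands in for `opening()` (the class is illustrative only, chosen so the open/close ordering can be recorded and checked):

```python
class Resource(object):
    """Toy stand-in for a managed resource; records open/close order."""
    log = []

    def __init__(self, name):
        self.name = name
        Resource.log.append('open ' + name)

    def close(self):
        Resource.log.append('close ' + self.name)

# What the two nested block-statements spell out by hand:
r1 = Resource('file1')
try:
    r2 = Resource('file2')
    try:
        pass  # ... body using r1 and r2 ...
    finally:
        r2.close()
finally:
    r1.close()

# Resources are released in reverse order of acquisition.
assert Resource.log == ['open file1', 'open file2',
                        'close file2', 'close file1']
```

Each additional resource adds another level of indentation, which is exactly the deep-nesting problem Greg describes.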
But with the generator implementation, you could do something like

    block iterzip(opening(file1), opening(file2)) as f, g:
        ...

provided iterzip were modified to broadcast __next__ arguments to its
elements appropriately. You couldn't do this sort of thing with a thunk
implementation.

On the other hand, a thunk implementation has the potential to easily
handle multiple block arguments, if a suitable syntax could ever be
devised. It's hard to see how that could be done in a general way with
the generator implementation.

[BTW, I've just discovered we're not the only people with numbered
things called PEPs. I typed "PEP 340" into Google and got "PEP 340:
Prevention and Care of Athletic Injuries"!]

Greg

From ncoghlan at gmail.com Thu Apr 28 14:08:10 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 02:10:18 2005
Subject: [Python-Dev] Integrating PEP 310 with PEP 340
In-Reply-To: <ca471dc205042715165dede48d@mail.gmail.com>
References: <426F9347.6000505@iinet.net.au> <ca471dc205042715165dede48d@mail.gmail.com>
Message-ID: <4270D22A.9020907@gmail.com>

Guido van Rossum wrote:
>>PEP 310 forms the basis for a block construct that I _do_ like. The question
>>then becomes whether or not generators can be used to write useful PEP 310 style
>>block managers (I think they can, in a style very similar to that of the looping
>>block construct from PEP 340).
>
> I've read through your example, and I'm not clear why you think this
> is better. It's a much more complex API with less power. What's your
> use case? Why should 'block' be disallowed from looping? TOOWTDI or do
> you have something better?

I'm no longer clear on why I thought what I suggested would be better
either. Can I use the 'it was late' excuse? :)

Actually, the real reason is that I hadn't figured out what was really
possible with PEP 340. The cases that I thought PEP 310 would handle
better, I've since worked out how to do using the PEP 340 mechanism, and
PEP 340 handles them _far_ more elegantly.
With PEP 340, multi-stage constructs can be handled by using one
generator as an argument to the block, and something else (such as a
class or another generator) to maintain state between the blocks. The
looping nature is a big win, because it lets execution of a contained
block be prevented entirely.

My favourite discovery is that PEP 340 can be used to write a switch
statement like this:

    block switch(value) as sw:
        block sw.case(1):
            # Handle case 1
        block sw.case(2):
            # Handle case 2
        block sw.default():
            # Handle default case

Given the following definitions:

    class _switch(object):
        def __init__(self, switch_var):
            self.switch_var = switch_var
            self.run_default = True

        def case(self, case_value):
            self.run_default = False
            if self.switch_var == case_value:
                yield

        def default(self):
            if self.run_default:
                yield

    def switch(switch_var):
        yield _switch(switch_var)

With the keyword-less syntax previously mentioned, such a 'custom
structure' could look like:

    switch(value) as sw:
        sw.case(1):
            # Handle case 1
        sw.case(2):
            # Handle case 2
        sw.default():
            # Handle default case

(Actually doing a switch using blocks like this would be *insane* for
performance reasons, but it is still rather cool that it is possible)

With an appropriate utility block manager PEP 340 can also be used to
abstract multi-stage operations.
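The switch sketch above can be exercised without the block statement itself: a generator's body only runs when it is iterated, so a plain for-loop over `sw.case(...)` executes its suite zero or one times, which is the essence of the trick. The code below is a slightly adjusted variant of the sketch (here `run_default` is cleared only when a case actually matches, so the default suite still fires when nothing matches; the `dispatch` helper and direct `_switch(value)` construction are illustrative, not part of the proposal):

```python
class _switch(object):
    """Variant of the switch sketch: run_default cleared only on a match."""
    def __init__(self, switch_var):
        self.switch_var = switch_var
        self.run_default = True

    def case(self, case_value):
        if self.switch_var == case_value:
            self.run_default = False
            yield

    def default(self):
        if self.run_default:
            yield

def dispatch(value):
    # Emulate 'block sw.case(N): ...' with for-loops: each suite
    # runs zero or one times, driven by whether the generator yields.
    sw = _switch(value)
    results = []
    for _ in sw.case(1):
        results.append('one')
    for _ in sw.case(2):
        results.append('two')
    for _ in sw.default():
        results.append('default')
    return results

assert dispatch(2) == ['two']
assert dispatch(99) == ['default']
```

Because `case()` is a generator function, merely calling it runs nothing; the match test only executes when the loop advances the generator, which is what makes the lazy, in-order case checking work.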
I haven't got a real use case for this as yet, but the potential is
definitely there:

    def next_stage(itr):
        """Execute a single stage of a multi-stage block manager"""
        arg = None
        next_item = next(itr)
        while True:
            if next_item is StopIteration:
                raise StopIteration
            try:
                arg = yield next_item
            except:
                if not hasattr(itr, "__error__"):
                    raise
                next_item = itr.__error__(sys.exc_info()[1])
            else:
                next_item = next(itr, arg)

    def multi_stage():
        """Code template accepting multiple suites"""
        # Pre stage 1
        result_1 = yield
        # Post stage 1
        yield StopIteration
        result_2 = 0
        if result_1:
            # Pre stage 2
            result_2 = yield
            # Post stage 2
            yield StopIteration
        for i in range(result_2):
            # Pre stage 3
            result_3 = yield
            # Post stage 3
            yield StopIteration
        # Pre stage 4
        result_4 = yield
        # Post stage 4

    def use_multistage():
        blk = multi_stage()
        block next_stage(blk):
            # Stage 1
            continue val_1
        block next_stage(blk):
            # Stage 2
            continue val_2
        block next_stage(blk):
            # Stage 3
            continue val_3
        block next_stage(blk):
            # Stage 4
            continue val_4

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From steven.bethard at gmail.com Thu Apr 28 19:23:22 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d11dcfba05042810164214e9d0@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <d11dcfba05042810164214e9d0@mail.gmail.com> Message-ID: <d11dcfba05042810232a6f87a7@mail.gmail.com> On 4/28/05, Steven Bethard <steven.bethard@gmail.com> wrote: > however, the iterable object is notified whenever a 'continue', > 'break', or 'return' statement is executed inside the block-statement. This should read: however, the iterable object is notified whenever a 'continue', 'break' or 'return' statement is executed *or an exception is raised* inside the block-statement. Sorry! STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From gvanrossum at gmail.com Thu Apr 28 16:44:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:32:20 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com> <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <ca471dc2050428074475d7d6b0@mail.gmail.com> > Exaggeration in defense of elegance is no vice.<wink> Maybe not, but it still sounds like BS to me. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at zope.com Thu Apr 28 13:50:58 2005 From: jim at zope.com (Jim Fulton) Date: Fri Apr 29 02:32:24 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <4270CE22.7030406@zope.com> Greg Ewing wrote: > Elegant as the idea behind PEP 340 is, I can't shake > the feeling that it's an abuse of generators. It seems > to go to a lot of trouble and complication so you > can write a generator and pretend it's a function > taking a block argument. > > I'd like to reconsider a thunk implementation. It > would be a lot simpler, doing just what is required > without any jiggery pokery with exceptions and > break/continue/return statements. It would be easy > to explain what it does and why it's useful. "Simple is better than Complex." Is there a thunk PEP? Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From steven.bethard at gmail.com Thu Apr 28 19:16:15 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:41 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427083B0.6040204@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> Message-ID: <d11dcfba05042810164214e9d0@mail.gmail.com> On 4/28/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Guido van Rossum wrote: > > And surely you exaggerate. How about this then: > > > > The with-statement is similar to the for-loop. Until you've > > learned about the differences in detail, the only time you should > > write a with-statement is when the documentation for the function > > you are calling says you should. > > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". How about: """ A block-statement is much like a for-loop, and is also used to iterate over the elements of an iterable object. In a block-statement however, the iterable object is notified whenever a 'continue', 'break', or 'return' statement is executed inside the block-statement. Most iterable objects do not need to be notified of such statement executions, so for most iteration over iterable objects, you should use a for-loop. Functions that return iterable objects that should be used in a block-statement will be documented as such. 
""" If you need more information, you could also include something like: """ When generator objects are used in a block-statement, they are guaranteed to be "exhausted" at the end of the block-statement. That is, any additional call to next() with the generator object will produce a StopIteration. """ STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From Ugo_DiGirolamo at invision.iip.com Thu Apr 28 21:29:10 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Fri Apr 29 02:32:45 2005 Subject: [Python-Dev] Problem with embedded python - bug? Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A680F@invnmail.invision.iip.com> I have been having a few more discussions around about this, and I'm starting to think that this is a bug. My take is that, when I call Py_Finalize, the python thread should be shut down gracefully, closing the file and everything. Maybe I'm missing a call to something (?PyEval_FinalizeThreads?) but the docs seem to say that just PyFinalize should be called. The open file seems to be the issue, since if I remove all the references to the file I cannot get the program to crash. I can reproduce the same behavior on two different wxp systems, under python 2.4 and 2.4.1. Ugo -----Original Message----- From: Ugo Di Girolamo Sent: Tuesday, April 26, 2005 2:16 PM To: 'python-dev@python.org' Subject: Problem with embedded python I have the following code, that seems to make sense to me. However, it crashes about 1/3 of the times. My platform is Python 2.4.1 on WXP (I tried the release version from the msi and the debug version built by me, both downloaded today to have the latest version). The crash happens while the main thread is in Py_Finalize. I traced the crash to _Py_ForgetReference(op) in object.c at line 1847, where I have op->_ob_prev == NULL. What am I doing wrong? I'm definitely not too sure about the way I'm handling the GIL. 
Thanks in adv for any suggestion/ comment Cheers and ciao Ugo ////////////////////////// TestPyThreads.py ////////////////////////// #include <windows.h> #include "Python.h" int main() { PyEval_InitThreads(); Py_Initialize(); PyGILState_STATE main_restore_state = PyGILState_UNLOCKED; PyGILState_Release(main_restore_state); // start the thread { PyGILState_STATE state = PyGILState_Ensure(); int trash = PyRun_SimpleString( "import thread\n" "import time\n" "def foo():\n" " f = open('pippo.out', 'w', 0)\n" " i = 0;\n" " while 1:\n" " f.write('%d\\n'%i)\n" " time.sleep(0.01)\n" " i += 1\n" "t = thread.start_new_thread(foo, ())\n" ); PyGILState_Release(state); } // wait 300 ms Sleep(300); PyGILState_Ensure(); Py_Finalize(); return 0; } From ncoghlan at gmail.com Thu Apr 28 14:07:55 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 02:32:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42702BCB.9000500@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> <ca471dc20504271522e79ce4a@mail.gmail.com> <42702BCB.9000500@ocf.berkeley.edu> Message-ID: <4270D21B.9040401@gmail.com> Brett C. wrote: > Guido van Rossum wrote: >>>Yet another alternative would be for the default behaviour to be to raise >>>Exceptions, and continue with anything else, and have the third argument be >>>"raise_exc=True" and set it to False to pass an exception in without raising it. >> >> >>You've lost me there. If you care about this, can you write it up in >>more detail (with code samples or whatever)? Or we can agree on a 2nd >>arg to __next__() (and a 3rd one to next()). 
> > Channeling Nick, I think he is saying that the raising argument should be made > True by default and be named 'raise_exc'. Pretty close, although I'd say 'could' rather than 'should', as it was an idle thought, rather than something I actually consider a good idea. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From steven.bethard at gmail.com Thu Apr 28 18:21:59 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:56 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4270821F.5000404@canterbury.ac.nz> References: <ca471dc2050426043713116248@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org> <4270821F.5000404@canterbury.ac.nz> Message-ID: <d11dcfba050428092150ddabac@mail.gmail.com> On 4/28/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Neil Schemenauer wrote: > > > The translation of a block-statement could become: > > > > itr = EXPR1 > > arg = None > > while True: > > try: > > VAR1 = next(itr, arg) > > except StopIteration: > > break > > try: > > arg = None > > BLOCK1 > > except Exception, exc: > > err = getattr(itr, '__error__', None) > > if err is None: > > raise exc > > err(exc) > > That can't be right. When __error__ is called, if the iterator > catches the exception and goes on to do another yield, the > yielded value needs to be assigned to VAR1 and the block > executed again. It looks like your version will ignore the > value from the second yield and only execute the block again > on the third yield. 
Could you do something like: itr = EXPR1 arg = None next_func = next while True: try: VAR1 = next_func(itr, arg) except StopIteration: break try: arg = None next_func = next BLOCK1 except Exception, arg: try: next_func = type(itr).__error__ except AttributeError: raise arg ? STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From michael.walter at gmail.com Thu Apr 28 14:46:22 2005 From: michael.walter at gmail.com (Michael Walter) Date: Fri Apr 29 02:47:23 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com> <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <877e9a1705042805464161c3a4@mail.gmail.com> On 4/28/05, Stephen J. Turnbull <stephen@xemacs.org> wrote: > >>>>> "Guido" == Guido van Rossum <gvanrossum@gmail.com> writes: > > Guido> You mean like this? > > if x > 0: > ...normal case... > elif y > 0: > ....abnormal case... > else: > ...edge case... > > The salient example! If it's no accident that those conditions are > mutually exclusive and exhaustive, doesn't that code require at least > a comment saying so, and maybe even an assertion to that effect? I usually do: if ...: return ... if ...: return ... assert ... return ... Michael From gargamel.su at gmail.com Thu Apr 28 23:47:18 2005 From: gargamel.su at gmail.com (Jing Su) Date: Fri Apr 29 02:47:26 2005 Subject: [Python-Dev] noob question regarding the interpreter Message-ID: <f11de95f050428144735fe364e@mail.gmail.com> Hello, I know this is a n00b question, so I apologize ahead of time. I've been taking a look at the python interpreter, trying to understand how it works on the compiled byte-codes. 
Looking through the sources of the 2.4.1 stable version, it looks like Python/ceval.c is the module that does the main dispatch. However, it looks like a switched interpreter. I just find this surprising because python seems to run pretty fast, and a switched interpreter is usually painfully slow. Is there work to change python into a direct-threaded or even JIT'ed interpreter? Has there been previous discussion on this topic? I'd greatly appreciate any pointers to discussions on this topic. Thus far my google-fu has not turned up fruitful hits. Thanks in advance for any help! -Jing From jimjjewett at gmail.com Thu Apr 28 23:53:36 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 02:47:46 2005 Subject: [Python-Dev] anonymous blocks as scope-collapse: detailed proposal Message-ID: <fb6fbf5605042814537f6d2e37@mail.gmail.com> Based on Guido's opinion that caller and callee should both be marked, I have used keywords 'include' and 'chunk'. I therefore call them "Chunks" and "Includers". Examples are based on (1) The common case of a simple resource manager. e.g. http://mail.python.org/pipermail/python-dev/2005-April/052751.html (2) Robert Brewer's Object Relational Mapper http://mail.python.org/pipermail/python-dev/2005-April/052924.html which uses several communicating Chunks in the same Includer, and benefits from Includer inheritance. Note that several cooperating Chunks may use the same name (e.g. old_children) to refer to the same object, even though that object is never mentioned by the Includer. It is possible for the same code object to be both a Chunk and an Includer. Its own included sub-Chunks also share the top Includer's namespace. Chunks and Includers must both be written in pure python, because C frames cannot be easily manipulated. 
They can of course call or be called (as a unit) by extension modules. I have assumed that Chunks should not take arguments. While arguments are useful ("Which pattern should I match against on this inclusion?"), the same functionality *can* be had by binding a known name in the Includer. When that starts to get awkward, it is a sign that you should be using separate namespaces (and callbacks, or value objects). "self" and "cls" are just random names to a Chunk, though using them for any but the conventional meaning will be as foolhardy as it is in a method. Chunks are limited to statement context, as they do not return a value. Includers must provide a namespace. Therefore a single inclusion will turn the entire nearest enclosing namespace into an Includer. ? Should this be limited to nearest enclosing function or method? I can't think of a good use case for including directly from class definition or module toplevel, except registration. And even then, a metaclass might be better. Includers may only be used in a statement context, as the Chunks must be specified in a following suite. (It would be possible to skip the suite if all Chunk names are already bound, but I'm not sure that is a good habit to encourage -- so initially forbid it.) Chunks are defined without a (), in analogy to parentless classes. They are included (called) with a (), so that they can remain first class objects. Example Usage ============= def withfile(filename, mode='r'): """Close the file as soon we're done. This frees up file handles sooner. This is particularly important under Jython, or if you are using files in cyclic structures.""" openfile = open(filename, mode) try: include fileproc() # keyword 'include' prevents XXX_FAST optimization finally: openfile.close() chunk nullreader: # callee Chunk defined for reuse for line in openfile: pass withfile("testr.txt"): # Is this creation of a new block-starter a problem? 
fileproc=nullreader # Using an external Chunk object withfile("testw.txt", "w"): chunk fileproc: # Providing an "inline" Chunk openfile.write("Line 1") # If callers must be supported in expression context #fileproc=nullreader #withfile("tests.txt") # Resolve Chunk name from caller's default # binding, which in this case defaults back # to the current globals. # Is this just asking for trouble? class ORM(object): chunk nullchunk: # The extra processing is not always needed. pass begin=pre=post=end=nullchunk # Default to no extra processing def __set__(self, unit, value): include self.begin() if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: include self.pre() unit._properties[self.key] = value include self.post() include self.end() class TriggerORM(ORM): chunk pre: include super(self,TriggerORM).pre() # self was bound by __set__ old_children = self.children() # inject new variable chunk post: include super(self,TriggerORM).post() for child in self.children(): if child not in old_children: # will see pre's binding notify_somebody("New child %s" % child) As Robert Brewer said, > The above is quite ugly written with callbacks (due to > excessive argument passing), and is currently fragile > when overriding __set__ (due to duplicated code). How to Implement ---------------- The Includer cannot know which variables a Chunk will use (or inject), so the namespace must remain a dictionary. This precludes use of the XXX_FAST bytecodes. But as Robert pointed out, avoiding another frame creation/destruction will compensate somewhat. Two new bytecodes will be needed to handle the jump and return to a different bytecode string without setting up or tearing down a new frame. Position in the Includer bytecode will need to be kept in a stack, though it might make sense to use a frame variable instead of the execution stack. 
With those two exceptions, the Includer and Chunk are both composed entirely of valid statements that can already be compiled to ordinary bytecode. -jJ From ncoghlan at gmail.com Thu Apr 28 14:08:08 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 02:49:16 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427026B1.4020002@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> <42700B88.40703@gmail.com> <427026B1.4020002@ocf.berkeley.edu> Message-ID: <4270D228.8020607@gmail.com> Brett C. wrote: >>I'm surprisingly close to agreeing with you, actually. I've worked out >>that it isn't the looping that I object to, it's the inability to get >>out of the loop without exhausting the entire iterator. > 'break' isn't' enough for you as laid out by the proposal? The raising of > StopIteration, which is what 'break' does according to the standard, should be > enough to stop the loop without exhausting things. Same way you stop a 'for' > loop from executing entirely. The StopIteration exception effectively exhausted the generator, though. However, I've figured out how to deal with that, and my reservations about PEP 340 are basically gone. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Thu Apr 28 13:26:00 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 02:49:22 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> (Greg Ewing's message of "Thu, 28 Apr 2005 20:42:26 +1200") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <2mfyxb16t3.fsf@starship.python.net> Greg Ewing <greg.ewing@canterbury.ac.nz> writes: > Are there any objective reasons to prefer a generator > implementation over a thunk implementation? I, too, would like to see an answer to this question. I'd like to see an answer in the PEP, too. Cheers, mwh -- All obscurity will buy you is time enough to contract venereal diseases. -- Tim Peters, python-dev From gvanrossum at gmail.com Fri Apr 29 00:15:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:00 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <ca471dc205042815157cf20297@mail.gmail.com> [Greg Ewing] > Elegant as the idea behind PEP 340 is, I can't shake > the feeling that it's an abuse of generators. It seems > to go to a lot of trouble and complication so you > can write a generator and pretend it's a function > taking a block argument. Maybe. You're not the first one saying this and I'm not saying "no" outright, but I'd like to defend the PEP. There are a number of separate ideas that all contribute to PEP 340. 
One is turning generators into more general coroutines: continue EXPR passes the expression to the iterator's next() method (renamed to __next__() to work around a compatibility issue and because it should have been called that in the first place), and in a generator this value can be received as the return value of yield. Incidentally this makes the generator *syntax* more similar to Ruby (even though Ruby uses thunks, and consequently uses return instead of continue to pass a value back). I'd like to have this even if I don't get the block statement. The second is a solution for generator resource cleanup. There are already two PEPs proposing a solution (288 and 325) so I have to assume this addresses real pain! The only new twist offered by PEP 340 is a unification of the next() API and the resource cleanup API: neither PEP 288 nor PEP 325 seems to specify rigorously what should happen if the generator executes another yield in response to a throw() or close() call (or whether that should even be allowed); PEP 340 takes the stance that it *is* allowed and should return a value from whatever call sent the exception. This feels "right", especially together with the previous feature: if yield can return a value as if it were a function call, it should also be allowed to raise an exception, and catch or propagate it with impunity. Even without a block-statement, these two changes make yield look a lot like invoking a thunk -- but it's more efficient, since calling yield doesn't create a frame. The main advantage of thunks that I can see is that you can save the thunk for later, like a callback for a button widget (the thunk then becomes a closure). You can't use a yield-based block for that (except in Ruby, which uses yield syntax with a thunk-based implementation). But I have to say that I almost see this as an advantage: I think I'd be slightly uncomfortable seeing a block and not knowing whether it will be executed in the normal control flow or later. 
Defining an explicit nested function for that purpose doesn't have this problem for me, because I already know that the 'def' keyword means its body is executed later. The other problem with thunks is that once we think of them as the anonymous functions they are, we're pretty much forced to say that a return statement in a thunk returns from the thunk rather than from the containing function. Doing it any other way would cause major weirdness when the thunk were to survive its containing function as a closure (perhaps continuations would help, but I'm not about to go there :-). But then an IMO important use case for the resource cleanup template pattern is lost. I routinely write code like this: def findSomething(self, key, default=None): self.lock.acquire() try: for item in self.elements: if item.matches(key): return item return default finally: self.lock.release() and I'd be bummed if I couldn't write this as def findSomething(self, key, default=None): block synchronized(self.lock): for item in self.elements: if item.matches(key): return item return default This particular example can be rewritten using a break: def findSomething(self, key, default=None): block synchronized(self.lock): for item in self.elements: if item.matches(key): break else: item = default return item but it looks forced and the transformation isn't always that easy; you'd be forced to rewrite your code in a single-return style which feels too restrictive. > I'd like to reconsider a thunk implementation. It > would be a lot simpler, doing just what is required > without any jiggery pokery with exceptions and > break/continue/return statements. It would be easy > to explain what it does and why it's useful. I don't know. In order to obtain the required local variable sharing between the thunk and the containing function I believe that every local variable used or set in the thunk would have to become a 'cell' (our mechanism for sharing variables between nested scopes). 
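(The cell mechanism mentioned here is observable in current Python; the sketch below uses the nonlocal keyword, which arrived later in Python 3, to get exactly the kind of assigned-to sharing under discussion. It is offered as a retrospective illustration, not as part of the original message:)

```python
def make_counter():
    count = 0            # shared via a cell, not an ordinary local slot
    def bump():
        nonlocal count   # without this, assignment would make count local to bump()
        count += 1
        return count
    return bump

c = make_counter()
c()
c()
# the shared variable lives in a cell attached to the closure
print(c.__closure__[0].cell_contents)  # -> 2
```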
Cells slow down access somewhat compared to regular local variables. Perhaps not entirely coincidentally, the last example above (findSomething() rewritten to avoid a return inside the block) shows that, unlike for regular nested functions, we'll want variables *assigned to* by the thunk also to be shared with the containing function, even if they are not assigned to outside the thunk. I swear I didn't create the example for this purpose -- it just happened. > Are there any objective reasons to prefer a generator > implementation over a thunk implementation? If > for-loops had been implemented with thunks, we might > never have created generators. But generators have > turned out to be more powerful, because you can > have more than one of them on the go at once. Is > there a use for that capability here? I think the async event folks like to use this (see the Mertz references in PEP 288). > I can think of one possible use. Suppose you want > to acquire multiple resources; one way would be to > nest block-statements, like > > block opening(file1) as f: > block opening(file2) as g: > ... > > If you have a lot of resources to acquire, the nesting > could get very deep. But with the generator implementation, > you could do something like > > block iterzip(opening(file1), opening(file2)) as f, g: > ... > > provided iterzip were modified to broadcast __next__ > arguments to its elements appropriately. You couldn't > do this sort of thing with a thunk implementation. > > On the other hand, a thunk implementation has the > potential to easily handle multiple block arguments, if > a suitable syntax could ever be devised. It's hard > to see how that could be done in a general way with > the generator implementation. Right, but the use cases for multiple blocks seem elusive. If you really want to have multiple blocks with yield, I suppose we could use "yield/n" to yield to the n'th block argument, or perhaps yield>>n. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 00:51:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:23 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> Message-ID: <ca471dc2050428155127cec9a5@mail.gmail.com> [Greg Ewing] > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". > > If such an explanation can't be found, I strongly suspect > that this doesn't correspond to a cohesive enough concept > to be made into a built-in language feature. If you can't > give a short, understandable explanation of it, then it's > probably a bad idea. [Ping] > In general, i agree with the sentiment of this -- though it's > also okay if there is a way to break the concept down into > concepts that *are* simple enough to have short, understandable > explanations. I don't know. What exactly is the audience supposed to be of this high-level statement? It would be pretty darn impossible to explain even the for-statement to people who are new to programming, let alone generators. And yet explaining the block-statement *must* involve a reference to generators. I'm guessing most introductions to Python, even for experienced programmers, put generators off until the "advanced" section, because this is pretty wild if you're not used to a language that has something similar. 
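(For readers coming to the thread cold, the "wild" feature in question is suspendable execution; a minimal generator, for reference:)

```python
def countdown(n):
    # execution pauses at each yield and resumes on the next request
    while n > 0:
        yield n
        n -= 1

print(list(countdown(3)))  # -> [3, 2, 1]
```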
(I wonder how you'd explain Python generators to an experienced Ruby programmer -- their mind has been manipulated to the point where they'd be unable to understand Python's yield no matter how hard they tried. :-) If I weren't limited to newbies (either to Python or to programming in general) but simply had to explain it to Python programmers pre-Python-2.5, I would probably start with a typical example of the try/finally idiom for acquiring and releasing a lock, then explain how for software engineering reasons you'd want to templatize that, and show the solution with a generator and block-statement. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 00:55:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement Message-ID: <ca471dc205042815557616722b@mail.gmail.com> How about, instead of trying to emphasize how different a block-statement is from a for-loop, we emphasize their similarity? A regular old loop over a sequence or iterable is written as: for VAR in EXPR: BLOCK A variation on this with somewhat different semantics swaps the keywords: in EXPR for VAR: BLOCK If you don't need the variable, you can leave the "for VAR" part out: in EXPR: BLOCK Too cute? :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Fri Apr 29 04:24:02 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Fri Apr 29 04:24:05 2005 Subject: [Python-Dev] noob question regarding the interpreter In-Reply-To: <f11de95f050428144735fe364e@mail.gmail.com> References: <f11de95f050428144735fe364e@mail.gmail.com> Message-ID: <20050429022401.GA13119@mems-exchange.org> On Thu, Apr 28, 2005 at 05:47:18PM -0400, Jing Su wrote: > Is there work to change python into a direct-threaded or even JIT'ed > interpreter? People have experimented with making the ceval loop use direct threading. 
If I recall correctly, the resulting speedup was not significant. I suspect the reason is that most of Python's opcodes do a significant amount of work. There's probably more to be gained by moving to a register based VM. Also, I think direct threading is hard to do portably. If you are interested in JIT, take a look at Psyco. Neil From nas at arctrix.com Fri Apr 29 04:35:39 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Fri Apr 29 04:35:42 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <20050429023539.GB13119@mems-exchange.org> On Thu, Apr 28, 2005 at 03:55:03PM -0700, Guido van Rossum wrote: > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK Looks weird to my eyes. On a related note, I was thinking about the extra cleanup 'block' provides. If the 'file' object would provide a suitable iterator, you could write: block open(filename) as line: ... and have the file closed at the end of the block. It does not read so well though. In a way, it seems to make more sense if 'block' called iter() on the expression and 'for' did not. block would guarantee to clean up iterators that it created. 'for' does not but implicitly creates them. Neil From nidoizo at yahoo.com Fri Apr 29 05:21:12 2005 From: nidoizo at yahoo.com (Nicolas Fleury) Date: Fri Apr 29 05:18:54 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <d4s8nh$fk8$1@sea.gmane.org> Guido van Rossum wrote: > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK > > If you don't need the variable, you can leave the "for VAR" part out: > > in EXPR: > BLOCK > > Too cute? :-) > I don't think it reads well. 
I would prefer something that would be understandable for a newbie's eyes, even if it fits more with common usage than with the real semantics behind it. For example a Boost-like keyword like: scoped EXPR as VAR: BLOCK scoped EXPR: BLOCK We may argue that it doesn't mean a lot, but at least if a newbie sees the following code, he would easily guess what it does: scoped synchronized(mutex): scoped opening(filename) as file: ... When compared with: in synchronized(mutex): in opening(filename) for file: ... As a C++ programmer, I still dream I could also do: scoped synchronized(mutex) scoped opening(filename) as file ... which would define a block until the end of the current block... Regards, Nicolas From gvanrossum at gmail.com Fri Apr 29 05:39:23 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 05:39:25 2005 Subject: [Python-Dev] noob question regarding the interpreter In-Reply-To: <f11de95f050428144735fe364e@mail.gmail.com> References: <f11de95f050428144735fe364e@mail.gmail.com> Message-ID: <ca471dc20504282039aeee008@mail.gmail.com> > However, it looks like a switched interpreter. I just > find this surprising because python seems to run pretty fast, and a switched > interpreter is usually painfully slow. This just proves how worthless a generalization that is. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From shane at hathawaymix.org Fri Apr 29 05:56:42 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri Apr 29 05:56:44 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050428155127cec9a5@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> Message-ID: <4271B07A.4010501@hathawaymix.org> Guido van Rossum wrote: > I don't know. What exactly is the audience supposed to be of this > high-level statement? It would be pretty darn impossible to explain > even the for-statement to people who are new to programming, let alone > generators. And yet explaining the block-statement *must* involve a > reference to generators. I'm guessing most introductions to Python, > even for experienced programmers, put generators off until the > "advanced" section, because this is pretty wild if you're not used to > a language that has something similar. (I wonder how you'd explain > Python generators to an experienced Ruby programmer -- their mind has > been manipulated to the point where they'd be unable to understand > Python's yield no matter how hard they tried. :-) I think this concept can be explained clearly. I'd like to try explaining PEP 340 to someone new to Python but not new to programming. I'll use the term "block iterator" to refer to the new type of iterator. This is according to my limited understanding. "Good programmers move commonly used code into reusable functions. Sometimes, however, patterns arise in the structure of the functions rather than the actual sequence of statements. 
For example, many functions acquire a lock, execute some code specific to that function, and unconditionally release the lock. Repeating the locking code in every function that uses it is error prone and makes refactoring difficult. "Block statements provide a mechanism for encapsulating patterns of structure. Code inside the block statement runs under the control of an object called a block iterator. Simple block iterators execute code before and after the code inside the block statement. Block iterators also have the opportunity to execute the controlled code more than once (or not at all), catch exceptions, or receive data from the body of the block statement. "A convenient way to write block iterators is to write a generator. A generator looks a lot like a Python function, but instead of returning a value immediately, generators pause their execution at "yield" statements. When a generator is used as a block iterator, the yield statement tells the Python interpreter to suspend the block iterator, execute the block statement body, and resume the block iterator when the body has executed. "The Python interpreter behaves as follows when it encounters a block statement based on a generator. First, the interpreter instantiates the generator and begins executing it. The generator does setup work appropriate to the pattern it encapsulates, such as acquiring a lock, opening a file, starting a database transaction, or starting a loop. Then the generator yields execution to the body of the block statement using a yield statement. When the block statement body completes, raises an uncaught exception, or sends data back to the generator using a continue statement, the generator resumes. At this point, the generator can either clean up and stop or yield again, causing the block statement body to execute again. When the generator finishes, the interpreter leaves the block statement." Is it understandable so far? 
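[A minimal runnable sketch of the mechanism Shane describes, emulated in present-day Python. PEP 340's block statement was never implemented as specified, so the names `run_block` and `locking` below are illustrative only; note also that `yield` inside `try/finally` was not yet legal in Python 2.4 -- allowing it is part of what this proposal required.]

```python
from threading import Lock

def run_block(block_iter, body):
    """Drive a one-yield generator the way a block statement would:
    run its setup, execute the body, then resume it for cleanup."""
    it = iter(block_iter)
    next(it)                  # run setup code up to the first yield
    try:
        body()                # the block-statement body
    finally:
        try:
            next(it)          # resume the generator so cleanup runs
        except StopIteration:
            pass

def locking(lock):
    lock.acquire()            # setup: acquire the lock
    try:
        yield                 # suspend; the body executes here
    finally:
        lock.release()        # cleanup: always release

lock = Lock()
held = []
run_block(locking(lock), lambda: held.append(lock.locked()))
print(held[0])        # True: the lock was held while the body ran
print(lock.locked())  # False: released afterwards
```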
Shane

From greg.ewing at canterbury.ac.nz Fri Apr 29 06:05:32 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 29 06:05:50 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com>
Message-ID: <4271B28C.9030504@canterbury.ac.nz>

Guido van Rossum wrote:
> The main advantage of thunks that I can see is that you can save the
> thunk for later, like a callback for a button widget (the thunk then
> becomes a closure).

Or pass it on to another function. This is something we haven't considered -- what if one resource-acquisition generator (RAG?) wants to delegate to another RAG?

With normal generators, one can always use the pattern

    for x in sub_generator(some_args):
        yield x

But that clearly isn't going to work if the generators involved are RAGs, because the exceptions passed in are going to be raised at the point of the yield in the outer RAG, and the inner RAG isn't going to get finalized (assuming the for-loop doesn't participate in the finalization protocol).

To get the finalization right, the inner generator needs to be invoked as a RAG, too:

    block sub_generator(some_args):
        yield

But PEP 340 doesn't say what happens when the block contains a yield.

A thunk implementation wouldn't have any problem with this, since the thunk can be passed down any number of levels before being called, and any exceptions raised in it will be propagated back up through all of them.
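[The delegation problem Greg raises can be seen directly in present-day Python: with plain for-loop delegation, an exception thrown into the outer generator is raised at the outer yield, so the inner generator never gets a chance to react. `yield from`, added much later by PEP 380, forwards the throw inward. A small sketch:]

```python
def inner():
    try:
        yield "value"
    except ValueError:
        yield "handled by inner"

def outer_for():
    for x in inner():     # inner is invisible to throw()
        yield x

def outer_delegating():
    yield from inner()    # throw() is forwarded to inner

g = outer_for()
assert next(g) == "value"
try:
    g.throw(ValueError)   # raised at `yield x`; inner never sees it
except ValueError:
    print("for-loop delegation: ValueError escaped")

g = outer_delegating()
assert next(g) == "value"
print(g.throw(ValueError))   # inner handles it: prints "handled by inner"
```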
> The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Urg, you're right. Unless return is turned into an exception in that case. And then I suppose break and return (and yield?) will have to follow suit. I'm just trying to think how Smalltalk handles this, since it must have a similar problem, but I can't remember the details. > every > local variable used or set in the thunk would have to become a 'cell' > . Cells > slow down access somewhat compared to regular local variables. True, but is the difference all that great? It's just one more C-level indirection, isn't it? > we'll want variables > *assigned to* by the thunk also to be shared with the containing > function, Agreed. We'd need to add a STORE_CELL bytecode or something for this. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 29 06:13:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:13:42 2005 Subject: [Python-Dev] Integrating PEP 310 with PEP 340 In-Reply-To: <4270D22A.9020907@gmail.com> References: <426F9347.6000505@iinet.net.au> <ca471dc205042715165dede48d@mail.gmail.com> <4270D22A.9020907@gmail.com> Message-ID: <4271B459.2010208@canterbury.ac.nz> Nick Coghlan wrote: > With an appropriate utility block manager I've just thought of another potential name for them: Block Utilization and Management Function (BUMF) :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From sabbey at u.washington.edu Fri Apr 29 06:15:14 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Fri Apr 29 06:15:17 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> Message-ID: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Guido van Rossum wrote: > Even without a block-statement, these two changes make yield look a > lot like invoking a thunk -- but it's more efficient, since calling > yield doesn't create a frame. I like PEP 340 a lot, probably as much or more than any thunk ideas I've seen. But I want to defend thunks here a little. It is possible to implement thunks without them creating their own frame. They can reuse the frame of the surrounding function. So a new frame does not need to be created when the thunk is called, and, much like with a yield statement, the frame is not taken down when the thunk completes running. The implementation just needs to take care to save and restore members of the frame that get clobbered when the thunk is running. Cells would of course not be required if the thunk does not create its own frame. > The main advantage of thunks that I can see is that you can save the > thunk for later, like a callback for a button widget (the thunk then > becomes a closure). You can't use a yield-based block for that (except > in Ruby, which uses yield syntax with a thunk-based implementation). 
> But I have to say that I almost see this as an advantage: I think I'd > be slightly uncomfortable seeing a block and not knowing whether it > will be executed in the normal control flow or later. Defining an > explicit nested function for that purpose doesn't have this problem > for me, because I already know that the 'def' keyword means its body > is executed later. I would also be uncomfortable if the thunk could be called at a later time. This can be disallowed by raising an exception if such an attempt is made. Such a restriction would not be completely arbitrary. One consequence of having the thunk borrow its surrounding function's frame is that it does not make much sense, implementationally speaking, to allow the thunk to be called at a later time (although I do realize that "it's difficult to implement" is not a good argument for anything). > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). If it is accepted that the thunk won't be callable at a later time, then I think it would seem normal that a return statement would return from the surrounding function. 
-Brian From greg.ewing at canterbury.ac.nz Fri Apr 29 06:17:54 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:18:21 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429023539.GB13119@mems-exchange.org> References: <ca471dc205042815557616722b@mail.gmail.com> <20050429023539.GB13119@mems-exchange.org> Message-ID: <4271B572.2030906@canterbury.ac.nz> Neil Schemenauer wrote: >>A variation on this with somewhat different semantics swaps the keywords: >> >> in EXPR for VAR: >> BLOCK > > Looks weird to my eyes. Probably makes perfect sense if you're Dutch, though. :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From rmunn at pobox.com Fri Apr 29 06:28:44 2005 From: rmunn at pobox.com (Robin Munn) Date: Fri Apr 29 06:28:50 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <4271B7FC.1070801@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Guido van Rossum wrote: | How about, instead of trying to emphasize how different a | block-statement is from a for-loop, we emphasize their similarity? | | A regular old loop over a sequence or iterable is written as: | | for VAR in EXPR: | BLOCK | | A variation on this with somewhat different semantics swaps the keywords: | | in EXPR for VAR: | BLOCK | | If you don't need the variable, you can leave the "for VAR" part out: | | in EXPR: | BLOCK | | Too cute? :-) Far too close to the "for" loop, IMHO. I read that, I'd have to remind myself every time, "now, which one is it that can receive values passed back in: for ... in, or in ... for?" I'm definitely -1 on that one: too confusing. 
Another possibility just occurred to me. How about "using"? ~ using EXPR as VAR: ~ BLOCK Reads similarly to "with", but leaves the "with" keyword open for possible use later. Since it seems traditional for one to introduce oneself upon first posting to python-dev, my name is Robin Munn. Yes, my name is just one letter different from Robin Dunn's. It's not like I *intended* to cause confusion... :-) Anyway, I was introduced to Python a few years ago, around version 2.1 or so, and fell in love with the fact that I could read my own code six months later and understand it. I try to help out where I can, but I don't know the guts of the interpreter, so on python-dev I mostly lurk. Robin Munn -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCcbf16OLMk9ZJcBQRAuYpAJ4n24AgsX3SrW0g7jlWJM+HfzHXMwCfTbTq eJ2mLzg1uLZv09KDUemM+WU= =SXux -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Fri Apr 29 06:45:16 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:45:35 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050428155127cec9a5@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> Message-ID: <4271BBDC.9060506@canterbury.ac.nz> Guido van Rossum wrote: > I don't know. What exactly is the audience supposed to be of this > high-level statement? It would be pretty darn impossible to explain > even the for-statement to people who are new to programming, let alone > generators. 
If the use of block-statements becomes common for certain tasks such as opening files, it seems to me that people are going to encounter their use around about the same time they encounter for-statements. We need *something* to tell these people to enable them to understand the code they're reading. Maybe it would be sufficient just to explain the meanings of those particular uses, and leave the full general explanation as an advanced topic. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 29 06:46:56 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:47:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d11dcfba05042810164214e9d0@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <d11dcfba05042810164214e9d0@mail.gmail.com> Message-ID: <4271BC40.4050802@canterbury.ac.nz> Steven Bethard wrote: > """ > A block-statement is much like a for-loop, and is also used to iterate > over the elements of an iterable object. No, no, no. Similarity to a for-loop is the *last* thing we want to emphasise, because the intended use is very different from the intended use of a for-loop. This is going to give people the wrong idea altogether. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Fri Apr 29 06:50:36 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 06:50:40 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4271B28C.9030504@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <4271B28C.9030504@canterbury.ac.nz> Message-ID: <ca471dc20504282150c40d3b2@mail.gmail.com> (BTW, I'm trying to update the PEP with a discussion of thunks.) [Guido] > > The main advantage of thunks that I can see is that you can save the > > thunk for later, like a callback for a button widget (the thunk then > > becomes a closure). [Greg] > Or pass it on to another function. This is something we > haven't considered -- what if one resource-acquision- > generator (RAG?) wants to delegate to another RAG? > > With normal generators, one can always use the pattern > > for x in sub_generator(some_args): > yield x > > But that clearly isn't going to work if the generators > involved are RAGs, because the exceptions passed in > are going to be raised at the point of the yield in > the outer RAG, and the inner RAG isn't going to get > finalized (assuming the for-loop doesn't participate > in the finalization protocol). > > To get the finalization right, the inner generator > needs to be invoked as a RAG, too: > > block sub_generator(some_args): > yield > > But PEP 340 doesn't say what happens when the block > contains a yield. The same as when a for-loop contains a yield. 
The sub_generator is entirely unaware of this yield, since the local control flow doesn't actually leave the block (i.e., it's not like a break, continue or return statement). When the loop that was resumed by the yield calls next(), the block is resumed back after the yield. The generator finalization semantics guarantee (within the limitations of all finalization semantics) that the block will be resumed eventually. I'll add this to the PEP, too. I'd say that a yield in a thunk would be more troublesome: does it turn the thunk into a generator or the containing function? It would have to be the thunk, but then things get weird quickly (the caller of the thunk has to treat the result of the call as an iterator). > A thunk implementation wouldn't have any problem with > this, since the thunk can be passed down any number of > levels before being called, and any exceptions raised > in it will be propagated back up through all of them. > > > The other problem with thunks is that once we think of them as the > > anonymous functions they are, we're pretty much forced to say that a > > return statement in a thunk returns from the thunk rather than from > > the containing function. > > Urg, you're right. Unless return is turned into an > exception in that case. And then I suppose break and > return (and yield?) will have to follow suit. But wasn't that exactly what you were trying to avoid? :-) > I'm just trying to think how Smalltalk handles this, > since it must have a similar problem, but I can't > remember the details. > > > every > > local variable used or set in the thunk would have to become a 'cell' > > . Cells > > slow down access somewhat compared to regular local variables. > > True, but is the difference all that great? It's > just one more C-level indirection, isn't it? Alas not. It becomes a call to PyCell_Set() or PyCell_Get(). > > we'll want variables > > *assigned to* by the thunk also to be shared with the containing > > function, > > Agreed. 
We'd need to add a STORE_CELL bytecode or > something for this. This actually exists -- it is used for when an outer function stores into a local that it shares with an inner function. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Apr 29 06:51:56 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:52:11 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271B07A.4010501@hathawaymix.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org> Message-ID: <4271BD6C.1010001@canterbury.ac.nz> Shane Hathaway wrote: > "Block statements provide a mechanism for encapsulating patterns of > structure. Code inside the block statement runs under the control of an > object called a block iterator. Simple block iterators execute code > before and after the code inside the block statement. Block iterators > also have the opportunity to execute the controlled code more than once > (or not at all), catch exceptions, or receive data from the body of the > block statement. That actually looks pretty reasonable. Hmmm. "Patterns of structure." Maybe we could call it a "struct" statement. struct opening(foo) as f: ... Then we could confuse both C *and* Ruby programmers at the same time! :-) [No, I don't really mean this. I actually prefer "block" to this.] -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Fri Apr 29 07:18:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:19:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271BBDC.9060506@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271BBDC.9060506@canterbury.ac.nz> Message-ID: <ca471dc2050428221873fbf94@mail.gmail.com> > If the use of block-statements becomes common for certain > tasks such as opening files, it seems to me that people are > going to encounter their use around about the same time > they encounter for-statements. We need *something* to > tell these people to enable them to understand the code > they're reading. > > Maybe it would be sufficient just to explain the meanings > of those particular uses, and leave the full general > explanation as an advanced topic. Right. The block statement is a bit like a chameleon: it adapts its meaning to the generator you supply. (Or maybe it's like a sewer: what you get out of it depends on what you put into it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 07:27:26 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:27:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <4271B7FC.1070801@pobox.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com> Message-ID: <ca471dc205042822271a43bc83@mail.gmail.com> > Far too close to the "for" loop, IMHO. 
I read that, I'd have to remind > myself every time, "now, which one is it that can receive values passed > back in: for ... in, or in ... for?" Whoa! Read the PEP closely. Passing a value back to the iterator (using "continue EXPR") is supported both in the for-loop and in the block-statement; it's new syntax so there's no backwards compatibility issue. The real difference is that when a for-loop is exited through a break, return or exception, the iterator is left untouched; but when the same happens in a block-statement, the iterator's __exit__ or __error__ method is called (I haven't decided what to call it). > Another possibility just occurred to me. How about "using"? Blah. I'm beginning to like block just fine. With using, the choice of word for the generator name becomes iffy IMO; and it almost sounds like it's a simple renaming: "using X as Y" could mean "Y = X". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 07:30:20 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:30:23 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <d4s8nh$fk8$1@sea.gmane.org> References: <ca471dc205042815557616722b@mail.gmail.com> <d4s8nh$fk8$1@sea.gmane.org> Message-ID: <ca471dc2050428223023aa80fc@mail.gmail.com> [Nicolas Fleury] > I would prefer something that would be > understandable for a newbie's eyes, even if it fits more with common > usage than with the real semantics behind it. For example a Boost-like > keyword like: > > scoped EXPR as VAR: > BLOCK Definitely not. In too many languages, a "scope" is a new namespace, and that's exactly what a block (by whichever name) is *not*. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Fri Apr 29 09:38:38 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 09:40:31 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271BD6C.1010001@canterbury.ac.nz> References: <4271B07A.4010501@hathawaymix.org> <4271BD6C.1010001@canterbury.ac.nz> Message-ID: <20050429003757.644F.JCARLSON@uci.edu> Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > That actually looks pretty reasonable. > > Hmmm. "Patterns of structure." Maybe we could call it a > "struct" statement. > > struct opening(foo) as f: > ... > > Then we could confuse both C *and* Ruby programmers at > the same time! :-) And Python programmers who already use the struct module! - Josiah From fumanchu at amor.org Fri Apr 29 09:48:01 2005 From: fumanchu at amor.org (Robert Brewer) Date: Fri Apr 29 09:46:29 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771F1E@exchange.hqamor.amorhq.net> > [Greg Ewing] > > Elegant as the idea behind PEP 340 is, I can't shake > > the feeling that it's an abuse of generators. It seems > > to go to a lot of trouble and complication so you > > can write a generator and pretend it's a function > > taking a block argument. [Guido] > Maybe. You're not the first one saying this and I'm not saying "no" > outright, but I'd like to defend the PEP. > > There are a number of separate ideas that all contribute to PEP 340. > One is turning generators into more general coroutines: continue EXPR > passes the expression to the iterator's next() method (renamed to > __next__() to work around a compatibility issue and because it should > have been called that in the first place), and in a generator this > value can be received as the return value of yield. 
Incidentally this > makes the generator *syntax* more similar to Ruby (even though Ruby > uses thunks, and consequently uses return instead of continue to pass > a value back). I'd like to have this even if I don't get the block > statement. Completely agree. Maybe we should have PEP 340 push just that, and make a PEP 341 independently for resource-cleanup (which assumes 340)? > [snip] > > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). > > But then an IMO important use case for the resource cleanup template > pattern is lost. I routinely write code like this: > > def findSomething(self, key, default=None): > self.lock.acquire() > try: > for item in self.elements: > if item.matches(key): > return item > return default > finally: > self.lock.release() > > and I'd be bummed if I couldn't write this as > > def findSomething(self, key, default=None): > block synchronized(self.lock): > for item in self.elements: > if item.matches(key): > return item > return default Okay, you've convinced me. The only way I can think of to get the effect I've been wanting would be to recompile the template function every time that it's executed with a different block. Call it a "Python _re_processor" ;). Although you could memoize the the resultant bytecode, etc., it would still be pretty slow, and you wouldn't be able to alter (rebind) the thunk once you'd entered the caller. Even then, you'd have the cell issues you mentioned, trying to push values from the thunk's original scope. Bah. It's so tempting on the semantic level, but the implementation's a bear. 
Robert Brewer MIS Amor Ministries fumanchu@amor.org From jcarlson at uci.edu Fri Apr 29 09:47:49 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 09:49:32 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc2050428223023aa80fc@mail.gmail.com> References: <d4s8nh$fk8$1@sea.gmane.org> <ca471dc2050428223023aa80fc@mail.gmail.com> Message-ID: <20050429003912.6452.JCARLSON@uci.edu> Guido van Rossum <gvanrossum@gmail.com> wrote: > > [Nicolas Fleury] > > I would prefer something that would be > > understandable for a newbie's eyes, even if it fits more with common > > usage than with the real semantics behind it. For example a Boost-like > > keyword like: > > > > scoped EXPR as VAR: > > BLOCK > > Definitely not. In too many languages, a "scope" is a new namespace, > and that's exactly what a block (by whichever name) is *not*. scopeless, unscoped, Scope(tm) (we would be required to use the unicode trademark symbol, of course)... It's way too long, and is too close to a pre-existing keyword, but I think 'finalized' is descriptive. But... finalize EXPR as VAR: BLOCK That reads nice... Maybe even 'cleanup', or 'finalize_after_iteration_without_iter_call' (abbreviated to 'faiwic', of course). <1.0 wink> All right, it's late enough. Enough 'ideas' from me tonight. - Josiah From ncoghlan at gmail.com Fri Apr 29 10:58:03 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 10:58:09 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <4271F71B.8010000@gmail.com> Guido van Rossum wrote: > How about, instead of trying to emphasize how different a > block-statement is from a for-loop, we emphasize their similarity? 
If you want to emphasise the similarity, the following syntax and explanation is something that occurred to me during lunch today: Python offers two variants on the basic iterative loop. "for NAME from EXPR:" enforces finalisation of the iterator. At loop completion, a well-behaved iterator is always completely exhausted. This form supports block management operations, that ensure timely release of resources such as locks or file handles. If the values being iterated over are not required, then the statement may be simplified to "for EXPR:". "for NAME in EXPR:" skips the finalisation step. At loop completion, a well-behaved iterator may still contain additional values. This form allows an iterator to be consumed in stages. Regardless of whether you like the above or not, I think the PEP's proposed use of 'as' is incorrect - it looks like the variable should be referring to the expression being iterated over, rather than the values returned from the iterator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Fri Apr 29 11:35:27 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 11:35:29 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com> (Guido van Rossum's message of "Thu, 28 Apr 2005 15:15:13 -0700")
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com>
Message-ID: <2mll720vts.fsf@starship.python.net>

Guido van Rossum <gvanrossum@gmail.com> writes:

> [Greg Ewing]
>> Elegant as the idea behind PEP 340 is, I can't shake
>> the feeling that it's an abuse of generators. It seems
>> to go to a lot of trouble and complication so you
>> can write a generator and pretend it's a function
>> taking a block argument.
>
> Maybe. You're not the first one saying this and I'm not saying "no"
> outright, but I'd like to defend the PEP.

This is kind of my point too; I'm not saying that I really prefer the thunk solution, just that I want to see it mentioned. I think the making-generators-more-sexy thing is nice, but I think that's almost orthogonal.

[...]

> Even without a block-statement, these two changes make yield look a
> lot like invoking a thunk -- but it's more efficient, since calling
> yield doesn't create a frame.
>
> The main advantage of thunks that I can see is that you can save the
> thunk for later,

I also find them somewhat easier to understand.

> like a callback for a button widget (the thunk then becomes a
> closure). You can't use a yield-based block for that (except in
> Ruby, which uses yield syntax with a thunk-based implementation).
> But I have to say that I almost see this as an advantage: I think
> I'd be slightly uncomfortable seeing a block and not knowing whether
> it will be executed in the normal control flow or later.
Defining an > explicit nested function for that purpose doesn't have this problem > for me, because I already know that the 'def' keyword means its body > is executed later. > > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). I'm not so sure about this. Did you read this mail: http://mail.python.org/pipermail/python-dev/2005-April/052970.html ? In this proposal, you have to go to some effort to make the thunk survive the block, and I think if weirdness results, that's the programmer's problem. > But then an IMO important use case for the resource cleanup template > pattern is lost. I routinely write code like this: > > def findSomething(self, key, default=None): > self.lock.acquire() > try: > for item in self.elements: > if item.matches(key): > return item > return default > finally: > self.lock.release() > > and I'd be bummed if I couldn't write this as > > def findSomething(self, key, default=None): > block synchronized(self.lock): > for item in self.elements: > if item.matches(key): > return item > return default If you can't write it this way, the thunk proposal is dead. >> I'd like to reconsider a thunk implementation. It >> would be a lot simpler, doing just what is required >> without any jiggery pokery with exceptions and >> break/continue/return statements. It would be easy >> to explain what it does and why it's useful. > > I don't know. 
In order to obtain the required local variable sharing > between the thunk and the containing function I believe that every > local variable used or set in the thunk would have to become a 'cell' > (our mechanism for sharing variables between nested scopes). Yes. > Cells slow down access somewhat compared to regular local variables. So make them faster. I'm not sure I think this is a good argument. You could also do some analysis and treat variables that are only accessed or written in the block as normal locals. This all makes a block-created thunk somewhat different from an anonymous function, to be sure. But the creating syntax is different, so I don't know if I care (hell, the invoking syntax could be made different too, but I really don't think that's a good idea). > Perhaps not entirely coincidentally, the last example above > (findSomething() rewritten to avoid a return inside the block) shows > that, unlike for regular nested functions, we'll want variables > *assigned to* by the thunk also to be shared with the containing > function, even if they are not assigned to outside the thunk. I swear > I didn't create the example for this purpose -- it just happened. Oh, absolutely. >> On the other hand, a thunk implementation has the >> potential to easily handle multiple block arguments, if >> a suitable syntax could ever be devised. It's hard >> to see how that could be done in a general way with >> the generator implementation. > > Right, but the use cases for multiple blocks seem elusive. If you > really want to have multiple blocks with yield, I suppose we could use > "yield/n" to yield to the n'th block argument, or perhaps yield>>n. > :-) Hmm, it's nearly *May* 1... :) Cheers, mwh -- I'm a keen cyclist and I stop at red lights. Those who don't need hitting with a great big slapping machine. 
-- Colin Davidson, cam.misc From mwh at python.net Fri Apr 29 11:37:30 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 11:37:32 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> (Brian Sabbey's message of "Thu, 28 Apr 2005 21:15:14 -0700 (PDT)") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Message-ID: <2mhdhp2aat.fsf@starship.python.net> Brian Sabbey <sabbey@u.washington.edu> writes: > It is possible to implement thunks without them creating their own > frame. They can reuse the frame of the surrounding function. So a new > frame does not need to be created when the thunk is called, and, much > like with a yield statement, the frame is not taken down when the > thunk completes running. The implementation just needs to take care > to save and restore members of the frame that get clobbered when the > thunk is running. Woo. That's cute. Cheers, mwh -- SCSI is not magic. There are fundamental technical reasons why it is necessary to sacrifice a young goat to your SCSI chain now and then. 
-- John Woods

From p.f.moore at gmail.com Fri Apr 29 12:41:19 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri Apr 29 12:41:22 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <4271B07A.4010501@hathawaymix.org>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org>
Message-ID: <79990c6b05042903417313df72@mail.gmail.com>

On 4/29/05, Shane Hathaway <shane@hathawaymix.org> wrote:
> I think this concept can be explained clearly. I'd like to try
> explaining PEP 340 to someone new to Python but not new to programming.
> I'll use the term "block iterator" to refer to the new type of
> iterator. This is according to my limited understanding.
[...]
> Is it understandable so far?

I like it.
Paul.

From pierre.barbier at cirad.fr Fri Apr 29 13:44:46 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Fri Apr 29 13:44:20 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <4271F71B.8010000@gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com>
Message-ID: <42721E2E.8020108@cirad.fr>

Nick Coghlan wrote:
> Python offers two variants on the basic iterative loop.
>
> "for NAME from EXPR:" enforces finalisation of the iterator. At loop
> completion, a well-behaved iterator is always completely exhausted. This
> form supports block management operations that ensure timely release of
> resources such as locks or file handles.
> If the values being iterated over are not required, then the statement
> may be simplified to "for EXPR:".
>
> "for NAME in EXPR:" skips the finalisation step.
At loop completion, a
> well-behaved iterator may still contain additional values. This form
> allows an iterator to be consumed in stages.
>
> Regardless of whether you like the above or not, I think the PEP's
> proposed use of 'as' is incorrect - it looks like the variable should be
> referring to the expression being iterated over, rather than the values
> returned from the iterator.
>
> Cheers,
> Nick.

Well, I would go a step further and keep only the for-loop syntax, mainly because I don't understand why there are two syntaxes for things so close that we can merge them! You can simply state that the for-loop calls the "__error__" method of the object, if available, without invalidating any other property of the new for-loop (i.e. as defined in PEP 340).

One main reason is that a common error could be (using the synchronised iterator introduced in the PEP):

    for l in synchronised(mylock):
        do_something()

It will compile, run, and never raise any error, but the lock will be acquired and never released! Then, I think there is no use case for a generator with __error__ in the for-loop as it is now. So, IMO, it is error-prone and useless to have two different syntaxes for such things.
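The failure mode above is easy to demonstrate with a plain generator. A sketch (assuming a threading.Lock; the point is that the generator's finally clause runs only when the generator object itself is finalised, not when the loop is left - the explicit close() used below to trigger it arrived later, with PEP 342, and stands in here for garbage collection):

```python
import threading

def synchronised(lock):
    # Generator-based lock wrapper, as in the PEP's example.
    lock.acquire()
    try:
        yield lock
    finally:
        lock.release()

mylock = threading.Lock()
gen = synchronised(mylock)
for l in gen:
    break  # leave the loop without exhausting the iterator

assert mylock.locked()      # the finally clause has NOT run yet
gen.close()                 # explicit finalisation runs the finally clause
assert not mylock.locked()  # only now is the lock released
```

Nothing in the loop itself guarantees the release; without a finalising construct it happens whenever the generator object happens to be collected.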
Pierre -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From lcaamano at gmail.com Fri Apr 29 14:45:56 2005 From: lcaamano at gmail.com (Luis P Caamano) Date: Fri Apr 29 14:46:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050429044559.526E31E4006@bag.python.org> References: <20050429044559.526E31E4006@bag.python.org> Message-ID: <c56e219d050429054572444ab6@mail.gmail.com> On 4/29/05, python-dev-request@python.org <python-dev-request@python.org> wrote: > > Message: 2 > Date: Thu, 28 Apr 2005 21:56:42 -0600 > From: Shane Hathaway <shane@hathawaymix.org> > Subject: Re: [Python-Dev] Re: anonymous blocks > To: guido@python.org > Cc: Ka-Ping Yee <python-dev@zesty.ca>, Python Developers List > <python-dev@python.org> > Message-ID: <4271B07A.4010501@hathawaymix.org> > Content-Type: text/plain; charset=ISO-8859-1 > > > I think this concept can be explained clearly. I'd like to try > explaining PEP 340 to someone new to Python but not new to programming. > I'll use the term "block iterator" to refer to the new type of > iterator. This is according to my limited understanding. > > "Good programmers move commonly used code into reusable functions. > Sometimes, however, patterns arise in the structure of the functions > rather than the actual sequence of statements. For example, many > functions acquire a lock, execute some code specific to that function, > and unconditionally release the lock. Repeating the locking code in > every function that uses it is error prone and makes refactoring difficult. > > "Block statements provide a mechanism for encapsulating patterns of > structure. Code inside the block statement runs under the control of an > object called a block iterator. 
Simple block iterators execute code > before and after the code inside the block statement. Block iterators > also have the opportunity to execute the controlled code more than once > (or not at all), catch exceptions, or receive data from the body of the > block statement. > > "A convenient way to write block iterators is to write a generator. A > generator looks a lot like a Python function, but instead of returning a > value immediately, generators pause their execution at "yield" > statements. When a generator is used as a block iterator, the yield > statement tells the Python interpreter to suspend the block iterator, > execute the block statement body, and resume the block iterator when the > body has executed. > > "The Python interpreter behaves as follows when it encounters a block > statement based on a generator. First, the interpreter instantiates the > generator and begins executing it. The generator does setup work > appropriate to the pattern it encapsulates, such as acquiring a lock, > opening a file, starting a database transaction, or starting a loop. > Then the generator yields execution to the body of the block statement > using a yield statement. When the block statement body completes, > raises an uncaught exception, or sends data back to the generator using > a continue statement, the generator resumes. At this point, the > generator can either clean up and stop or yield again, causing the block > statement body to execute again. When the generator finishes, the > interpreter leaves the block statement." > > Is it understandable so far? > I've been skipping most of the anonymous block discussion and thus, I only had a very vague idea of what it was about until I read this explanation. Yes, it is understandable -- assuming it's correct :-) Mind you though, I'm not new to python and I've been writing system software for 20+ years. 
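Shane's setup/suspend/resume description can be approximated today, without any new syntax, by driving a generator by hand. A sketch (run_block and locking are made-up names, not part of the PEP):

```python
import threading

def run_block(block_iter, body):
    # Drive a generator-based "block iterator": run it to its first
    # yield (the setup), call the body with the yielded value, then
    # resume it so the code after the yield (the cleanup) runs.
    for value in block_iter:
        body(value)

def locking(lock):
    lock.acquire()          # setup, before the body runs
    try:
        yield lock          # suspend while the body executes
    finally:
        lock.release()      # cleanup, after the body returns

lock = threading.Lock()
held = []
run_block(locking(lock), lambda l: held.append(l.locked()))
assert held == [True]       # the body ran while the lock was held
assert not lock.locked()    # the cleanup ran when the iterator finished
```

The proposed block statement would, roughly, have the interpreter play the role of run_block, with the indented suite as the body.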
-- Luis P Caamano Atlanta, GA USA From lbruno at republico.estv.ipv.pt Fri Apr 29 15:50:16 2005 From: lbruno at republico.estv.ipv.pt (Luis Bruno) Date: Fri Apr 29 15:48:19 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271B07A.4010501@hathawaymix.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org> Message-ID: <20050429145016.00005d4e@LAB2-14.esi> Hello, Shane Hathaway wrote: > Is it understandable so far? Definitely yes! I had the structure upside-down; your explanation is right on target. Thanks! -- Luis Bruno From shane at hathawaymix.org Fri Apr 29 15:48:39 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri Apr 29 15:48:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <c56e219d050429054572444ab6@mail.gmail.com> References: <20050429044559.526E31E4006@bag.python.org> <c56e219d050429054572444ab6@mail.gmail.com> Message-ID: <42723B37.3050004@hathawaymix.org> Luis P Caamano wrote: > I've been skipping most of the anonymous block discussion and thus, > I only had a very vague idea of what it was about until I read this > explanation. > > Yes, it is understandable -- assuming it's correct :-) To my surprise, the explanation is now in the PEP. (Thanks, Guido!) Shane From jimjjewett at gmail.com Fri Apr 29 16:43:01 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 16:43:05 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement Message-ID: <fb6fbf5605042907431311af71@mail.gmail.com> Nick Coghlan: > Python offers two variants on the basic iterative loop. > "for NAME in EXPR:" skips the finalisation step. 
At loop
> completion, a well-behaved iterator may still contain additional values.
> "for NAME from EXPR:" enforces finalisation of the iterator.
> ... At loop completion, a well-behaved [finalizing] iterator is
> always completely exhausted.

(nitpick): "from" isn't really different from "in". Perhaps

    for NAME inall EXPR:
    for NAME draining EXPR:
    for NAME finalizing EXPR:  # too hard to spell, because of s/z?

(substance): "finalized or not" is a very useful distinction, but I'm not sure it is something the user should have to worry about. Realistically, most of my loops intend to drain the iterator (which the compiler knows because I have no "break"). Regardless of whether I use a break, I still want the iterator cleaned up if it is drained.

The only thing this second loop form does is set a flag saying "No, I won't continue -- and I happen to know that no one else ever will either, even if they do have a reference that prevents garbage collection. I'm *sure* they won't use it." That strikes me as a dangerous thing to get in the habit of saying. Why not just aggressively run the finalization on both forms when the reference count permits?

> This form supports block management operations,

And this seems unrelated to finalization. I understand that as an implementation detail, you need to define the finalizers somehow. But the decision to aggressively finalize (in some manner) and desire to pass a block (that could be for finalization) seem like orthogonal issues.

-jJ

From lcaamano at gmail.com Fri Apr 29 16:43:34 2005
From: lcaamano at gmail.com (Luis P Caamano)
Date: Fri Apr 29 16:43:36 2005
Subject: [Python-Dev] About block statement name alternative
Message-ID: <c56e219d05042907433a52ec34@mail.gmail.com>

How about "bracket" or "bracket_with"?
As in:

    bracket_with synchronized(lock):
        BLOCK

    bracket_with opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()

    bracket_with transactional(db):
        db.store()

    bracket_with auto_retry(3, IOError):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()

    block_with synchronized_opening("/etc/passwd", myLock) as f:
        for line in f:
            print line.rstrip()

    def synchronized_opening(lock, filename, mode="r"):
        bracket_with synchronized(lock):
            bracket_with opening(filename) as f:
                yield f

    bracket_with synchronized_opening("/etc/passwd", myLock) as f:
        for line in f:
            print line.rstrip()

Or for that matter, "block_with", as in:

    block_with transactional(db):
        db.store()

-- Luis P Caamano Atlanta, GA USA

From skip at pobox.com Fri Apr 29 16:48:24 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 29 16:48:28 2005
Subject: [Python-Dev] PEP 340: What is "ret" in block statement semantics?
Message-ID: <17010.18744.51208.918622@montanaro.dyndns.org>

PEP 340 describes the block statement translation as:

    itr = EXPR1
    val = arg = None
    ret = False
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            if ret:
                return val
            if val is not None:
                raise val
            break
        try:
            val = arg = None
            ret = False
            BLOCK1
        except Exception, val:
            arg = StopIteration()

It uses a variable "ret" that is always False. If it does manage to take on a True value, a return statement is executed. How does ret become True? What's the meaning of return in this context? Something seems amiss.

Skip

From skip at pobox.com Fri Apr 29 16:49:09 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 29 16:49:12 2005
Subject: [Python-Dev] PEP 340: What is "ret" in block statement semantics?
Message-ID: <17010.18789.916072.361333@montanaro.dyndns.org>

me> It uses a variable "ret" that is always False.

Gaack. Please ignore.
Skip

From ncoghlan at gmail.com Fri Apr 29 17:00:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 17:00:50 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42721E2E.8020108@cirad.fr>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr>
Message-ID: <42724C1C.4040200@gmail.com>

Pierre Barbier de Reuille wrote:
> One main reason is a common error could be (using the synchronised
> iterator introduced in the PEP):
>
>     for l in synchronised(mylock):
>         do_something()
>
> It will compile, run, never raise any error but the lock will be
> acquired and never released!

It's better than that. With the code above, CPython is actually likely to release the lock when the loop exits. Change the code to the below to ensure the lock doesn't get released:

    sync = synchronised(mylock)
    for l in sync:
        do_something()

> Then, I think there is no use case of a generator with __error__ in the
> for-loop as it is now. So, IMO, it is error-prone and useless to have
> two different syntaxes for such things.

Hmm. This does make PJE's suggestion of requiring a decorator in order to flag generators for finalisation a little more appealing. Existing generators (without the flag) would not be cleaned up, preserving backwards compatibility. Generators with the flag would allow resource clean-up.

In this case of no new statement syntax, it would probably make more sense to refer to iterators that get cleaned up as finalised iterators, and a builtin with the obvious name would be:

    def finalised(obj):
        obj.__finalise__ = True  # The all-important flag!
        return obj

The syntax below would still be horrible:

    for f in opening(filename):
        for line in f:
            # process line

But such ugliness could be fixed by pushing the inner loop inside the block iterator:

    for line in opened(filename):
        # process line

    @finalised
    def opened(filename):
        f = open(filename)
        try:
            for line in f:
                yield line
        finally:
            f.close()

Then, in Py3K, finalisation could simply become the default for loop behaviour. However, the '__finalise__' flag would result in some impressive code bloat, as any for loop would need to expand to:

    itr = iter(EXPR1)
    if getattr(itr, "__finalise__", False):
        # Finalised semantics
        # I'm trying to channel Guido here.
        # This would really look like whatever the PEP 340 block statement
        # semantics end up being
        val = arg = None
        ret = broke = False
        while True:
            try:
                VAR1 = next(itr, arg)
            except StopIteration:
                BLOCK2
                break
            try:
                val = arg = None
                ret = False
                BLOCK1
            except Exception, val:
                itr.__error__(val)
            if ret:
                try:
                    itr.__error__(StopIteration())
                except StopIteration:
                    pass
                return val
    else:
        # Non-finalised semantics
        arg = None
        while True:
            try:
                VAR1 = next(itr, arg)
            except StopIteration:
                BLOCK2
                break
            arg = None
            BLOCK1

The major danger I see is that you could then write a generator containing a yield inside a try/finally, _without_ applying the finalisation decorator. Leading to exactly the problem described above - the lock (or whatever) is never cleaned up, because the generator is not flagged for finalisation. In this scenario, even destruction of the generator object won't help.

Cheers,
Nick.

P.S. I think PEP 340's proposed for loop semantics are currently incorrect, as BLOCK2 is unreachable.
It should look more like the non-finalised semantics above (with BLOCK2 before the break in the except clause)

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

From ncoghlan at gmail.com Fri Apr 29 17:26:13 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 17:26:18 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <fb6fbf5605042907431311af71@mail.gmail.com>
References: <fb6fbf5605042907431311af71@mail.gmail.com>
Message-ID: <42725215.7030206@gmail.com>

Jim Jewett wrote:
> Why not just aggressively run the finalization on both forms when the
> reference count permits?

So the iterator is always finalised if the for loop has the only reference? Two problems I can see there are that naming the target of the for loop would prevent it being finalised, and that this would make life interesting when the Jython or IronPython folks catch up to Python 2.5. . .

The finalised/not finalised aspect definitely seems to be the key behavioural distinction between the two forms, though. And I think there are legitimate use cases for a non-finalised form. Things like:

    for line in f:
        if end_of_header(line):
            break
        # process header line

    for line in f:
        # process body line

With only a finalised form of iteration available, this would need to be rewritten as something like:

    def header(f):
        line = next(f)
        while not end_of_header(line):
            line = next(f, yield line)

    for line in header(f):
        # process header line

    for line in f:
        # process body line

Considering the above, I actually have grave reservations about *ever* making finalisation the default behaviour of for loops - if I break out of a standard for loop before exhausting the iterator, I would expect to be able to resume the iterator afterwards, rather than having it flushed behind my back.

Cheers,
Nick.
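Concretely, the staged-consumption pattern relies on both loops sharing one iterator object. A small sketch (the data and the end_of_header predicate are made up for illustration):

```python
lines = iter(["Host: example", "Accept: */*", "", "body line 1", "body line 2"])

def end_of_header(line):
    return line == ""  # a blank line terminates the header (assumption)

header, body = [], []
for line in lines:
    if end_of_header(line):
        break
    header.append(line)  # process header line

for line in lines:       # the same iterator resumes where it stopped
    body.append(line)    # process body line

assert header == ["Host: example", "Accept: */*"]
assert body == ["body line 1", "body line 2"]
```

If breaking out of the first loop finalised the iterator, the second loop would see nothing - which is exactly the reservation about making finalisation the default.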
-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

From pierre.barbier at cirad.fr Fri Apr 29 17:45:21 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Fri Apr 29 17:44:53 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42724C1C.4040200@gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com>
Message-ID: <42725691.4030308@cirad.fr>

Nick Coghlan wrote:
> Pierre Barbier de Reuille wrote:
>> One main reason is a common error could be (using the synchronised
>> iterator introduced in the PEP):
>>
>>     for l in synchronised(mylock):
>>         do_something()
>>
>> It will compile, run, never raise any error but the lock will be
>> acquired and never released!
>
> It's better than that. With the code above, CPython is actually likely
> to release the lock when the loop exits. Change the code to the below to
> ensure the lock doesn't get released:
>
>     sync = synchronised(mylock)
>     for l in sync:
>         do_something()

Well indeed, but this will be an implementation-dependent behaviour...

>> Then, I think there is no use case of a generator with __error__ in
>> the for-loop as it is now. So, IMO, it is error-prone and useless to
>> have two different syntaxes for such things.
>
> [...]
>
> The major danger I see is that you could then write a generator
> containing a yield inside a try/finally, _without_ applying the
> finalisation decorator. Leading to exactly the problem described above -
> the lock (or whatever) is never cleaned up, because the generator is not
> flagged for finalisation. In this scenario, even destruction of the
> generator object won't help.

Mmmh... why introduce a new flag? Can't you just test for the presence of the "__error__" method? This would solve your problem, wouldn't it?
> > Cheers, > Nick. > > P.S. I think PEP 340's proposed for loop semantics are currently > incorrect, as BLOCK2 is unreachable. It should look more like the > non-finalised semantics above (with BLOCK2 before the break in the > except clause) > -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From aahz at pythoncraft.com Fri Apr 29 18:34:08 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 18:34:10 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Message-ID: <20050429163408.GA14920@panix.com> On Thu, Apr 28, 2005, Brian Sabbey wrote: > > It is possible to implement thunks without them creating their own frame. > They can reuse the frame of the surrounding function. So a new frame does > not need to be created when the thunk is called, and, much like with a > yield statement, the frame is not taken down when the thunk completes > running. The implementation just needs to take care to save and restore > members of the frame that get clobbered when the thunk is running. > > Cells would of course not be required if the thunk does not create its own > frame. Maybe. It's not clear whether your thunks are lexical (I haven't been following the discussion closely). If it's not lexical, how do locals get handled without cells? 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." From aahz at pythoncraft.com Fri Apr 29 18:38:54 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 18:38:58 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <4271F71B.8010000@gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> Message-ID: <20050429163854.GB14920@panix.com> On Fri, Apr 29, 2005, Nick Coghlan wrote: > > If you want to emphasise the similarity, the following syntax and > explanation is something that occurred to me during lunch today: We don't want to emphasize the similarity. > Python offers two variants on the basic iterative loop. > > "for NAME from EXPR:" enforces finalisation of the iterator. At loop > completion, a well-behaved iterator is always completely exhausted. This > form supports block management operations, that ensure timely release of > resources such as locks or file handles. > If the values being iterated over are not required, then the statement > may be simplified to "for EXPR:". > > "for NAME in EXPR:" skips the finalisation step. At loop completion, a > well-behaved iterator may still contain additional values. This form allows > an iterator to be consumed in stages. -1 -- the Zen of Python implies that we should be able to tell which construct we're using at the beginning of the line. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
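[The cell question Aahz raises is the crux of the thunk discussion: today's nested scopes share *reads* through cells, but a plain assignment in an inner function creates a new local rather than rebinding the outer name. A sketch of the current behaviour (Python's later `nonlocal` statement is what eventually addressed the rebinding half):]

```python
def make_counter():
    count = 0
    def read():
        return count     # free variable: read through a shared cell
    def bump():
        count = 100      # assignment makes 'count' local to bump
        return count
    return read, bump

read, bump = make_counter()
assert read() == 0
assert bump() == 100
assert read() == 0                             # outer 'count' never rebound
assert read.__closure__ is not None            # read really uses a cell
assert read.__closure__[0].cell_contents == 0
assert bump.__closure__ is None                # bump closed over nothing
```

A thunk that behaved like a block body would need assignments such as bump's to go to the shared cell instead, which is why Guido notes that every variable used or set in the thunk would have to become a cell.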
From david.ascher at gmail.com Fri Apr 29 18:42:33 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 18:42:35 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <dd28fc2f05042909422742b720@mail.gmail.com> On 4/28/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > How about, instead of trying to emphasize how different a > block-statement is from a for-loop, we emphasize their similarity? > > A regular old loop over a sequence or iterable is written as: > > for VAR in EXPR: > BLOCK > > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK > > If you don't need the variable, you can leave the "for VAR" part out: > > in EXPR: > BLOCK > > Too cute? :-) If you want to truly confuse the Ruby folks, you could go for something like: { EXPR } VAR: BLOCK <wink/> From jcarlson at uci.edu Fri Apr 29 19:02:38 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 19:03:49 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42724C1C.4040200@gmail.com> References: <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com> Message-ID: <20050429095557.6455.JCARLSON@uci.edu> Nick Coghlan <ncoghlan@gmail.com> wrote: > Then, in Py3K, finalisation could simply become the default for loop behaviour. > However, the '__finalise__' flag would result in some impressive code bloat, as > any for loop would need to expand to: > > itr = iter(EXPR1) > if getattr(itr, "__finalise__", False): > # Finalised semantics > # I'm trying to channel Guido here. 
> # This would really look like whatever the PEP 340 block statement > # semantics end up being > val = arg = None > ret = broke = False > while True: > try: > VAR1 = next(itr, arg) > except StopIteration: > BLOCK2 > break > try: > val = arg = None > ret = False > BLOCK1 > except Exception, val: > itr.__error__(val) > if ret: > try: > itr.__error__(StopIteration()) > except StopIteration: > pass > return val The problem is that BLOCK2 is executed within the while loop (the same problem I had with a fix I offered), which may contain a break for breaking out of a higher-level loop construct. Here's one that works as you intended (though perhaps I'm being a bit to paranoid about the __error__ attribute)... val = arg = None ret = ex_block_2 = False while True: try: VAR1 = next(itr, arg) except StopIteration: ex_block_2 = True break try: val = arg = None ret = False BLOCK1 except Exception, val: if hasattr(itr, '__error__): itr.__error__(val) if ret: try: if hasattr(itr, '__error__'): itr.__error__(StopIteration()) except StopIteration: pass return val if ex_block_2: BLOCK2 > P.S. I think PEP 340's proposed for loop semantics are currently incorrect, as > BLOCK2 is unreachable. It should look more like the non-finalised semantics > above (with BLOCK2 before the break in the except clause) Indeed, I also mentioned this on Wednesday. - Josiah From pje at telecommunity.com Fri Apr 29 19:08:04 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Apr 29 19:04:56 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429163854.GB14920@panix.com> References: <4271F71B.8010000@gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> Message-ID: <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> At 09:38 AM 4/29/05 -0700, Aahz wrote: >-1 -- the Zen of Python implies that we should be able to tell which >construct we're using at the beginning of the line. Hm, maybe we should just use "@", then. 
:)  e.g.

    @synchronized(self):
        @with_file("foo") as f:
            # etc.

Although I'd personally prefer a no-keyword approach:

    synchronized(self):
        with_file("foo") as f:
            # etc.

From jcarlson at uci.edu Fri Apr 29 19:08:22 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Apr 29 19:09:50 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42724C1C.4040200@gmail.com>
References: <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com>
Message-ID: <20050429100605.6458.JCARLSON@uci.edu>

Nick Coghlan <ncoghlan@gmail.com> wrote:
>     # Non-finalised semantics
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             BLOCK2
>             break
>         arg = None
>         BLOCK1

And that bad boy should be...

    # Non-finalised semantics
    ex_block_2 = False
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            ex_block_2 = True
            break
        arg = None
        BLOCK1
    if ex_block_2:
        BLOCK2

Josiah Carlson wrote:
> Indeed, I also mentioned this on Wednesday.

Though I was somewhat incorrect, as the code examples I offered express
the actual intent.

 - Josiah

From gvanrossum at gmail.com Fri Apr 29 19:16:12 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 29 19:16:15 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com>
Message-ID: <ca471dc205042910162befaaee@mail.gmail.com>

[Phillip J. Eby]
> Although I'd personally prefer a no-keyword approach:
>
>     synchronized(self):
>         with_file("foo") as f:
>             # etc.

I'd like that too, but it was shot down at least once. Maybe we can
resurrect it?

    opening("foo") as f:
        # etc.

is just a beauty!
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Fri Apr 29 19:17:13 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 19:17:16 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? Message-ID: <fb6fbf560504291017362204a7@mail.gmail.com> Brian Sabbey: > It is possible to implement thunks without them creating their own > frame. They can reuse the frame of the surrounding function ... > The implementation just needs to take care > to save and restore members of the frame that get clobbered when the > thunk is running. Michael Hudson: > Woo. That's cute. It *sounds* horrendous, but is actually pretty reasonable. Conceptually, a thunk replaces a suite in the caller. Most frame members are intended to be shared, and changes should be visible -- so they don't have to (and shouldn't) be restored. The only members that need special attention are (f_code, f_lasti) and possibly (f_blockstack, f_iblock). (f_code, f_lasti) would need to be replaced with a stack of pairs. Finishing a code string would mean popping this stack, rather than popping the whole frame. Since a completed suite leaves the blockstack where it started, (f_blockstack, f_iblock) *can* be ignored, though debugging and CO_MAXBLOCKS both *suggest* replacing the pair with a stack of pairs. -jJ From david.ascher at gmail.com Fri Apr 29 19:23:21 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 19:23:23 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <dd28fc2f05042910232e69c41f@mail.gmail.com> On 4/29/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > [Phillip J. 
Eby] > > Although I'd personally prefer a no-keyword approach: > > > > synchronized(self): > > with_file("foo") as f: > > # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! I agree, but does this then work: x = opening("foo") ...stuff... x as f: # etc ? And if not, why not? And if yes, what happens if "stuff" raises an exception? From david.ascher at gmail.com Fri Apr 29 19:24:58 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 19:25:00 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <dd28fc2f05042910232e69c41f@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <dd28fc2f05042910232e69c41f@mail.gmail.com> Message-ID: <dd28fc2f050429102445d8e435@mail.gmail.com> > I agree, but does this then work: > > x = opening("foo") > ...stuff... > x as f: > # etc > > ? And if not, why not? And if yes, what happens if "stuff" raises an > exception? Forget it -- the above is probably addressed by the PEP and doesn't really depend on whether there's a kw or not. From aahz at pythoncraft.com Fri Apr 29 19:42:32 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 19:42:34 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <20050429174232.GB18361@panix.com> On Fri, Apr 29, 2005, Guido van Rossum wrote: > [Phillip J. 
Eby] >> >> Although I'd personally prefer a no-keyword approach: >> >> synchronized(self): >> with_file("foo") as f: >> # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. I'm still -1 for the same reason I mentioned earlier: function calls spanning multiple lines are moderately common in Python code, and it's hard to distinguish these cases because multi-line calls usually get indented like blocks. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." From jimjjewett at gmail.com Fri Apr 29 19:48:57 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 19:48:59 2005 Subject: [Python-Dev] next(arg) was: Anonymous blocks: Thunks or iterators? Message-ID: <fb6fbf56050429104851d1992@mail.gmail.com> Guido van Rossum: > One [of many separate ideas in PEP 340] is turning generators > into more general coroutines: continue EXPR passes the expression > to the iterator's next() method ... I would have been very happy with that a week ago. Seeing the specific implementation changed my mind. The caller shouldn't know what state the generator is in, so the passed-in-message will be the same regardless of which yield accepts it. Unless I have a single-yield generator, this means I end up writing boilerplate code to accept and process the arg at each yield. I don't want more boilerplate. > Even without a block-statement, these two changes make yield look a > lot like invoking a thunk Though it feels backwards to me; yield is returning control to something that already had to coordinate the thunks itself. -jJ From pje at telecommunity.com Fri Apr 29 19:54:43 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Apr 29 19:51:38 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429174232.GB18361@panix.com> References: <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> At 10:42 AM 4/29/05 -0700, Aahz wrote: >On Fri, Apr 29, 2005, Guido van Rossum wrote: > > [Phillip J. Eby] > >> > >> Although I'd personally prefer a no-keyword approach: > >> > >> synchronized(self): > >> with_file("foo") as f: > >> # etc. > > > > I'd like that too, but it was shot down at least once. Maybe we can > > resurrect it? > > > > opening("foo") as f: > > # etc. > >I'm still -1 for the same reason I mentioned earlier: function calls >spanning multiple lines are moderately common in Python code, and it's >hard to distinguish these cases because multi-line calls usually get >indented like blocks. But the indentation of a multi-line call doesn't start with a colon. Or are you saying you're concerned about things like: opening( blah, blah, foo, wah=flah ) as fidgety, widgety, foo: sping() Which is quite ugly, to be sure, but then I don't see where adding an extra keyword helps. From gvanrossum at gmail.com Fri Apr 29 19:55:28 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 19:55:31 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <2mll720vts.fsf@starship.python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <2mll720vts.fsf@starship.python.net> Message-ID: <ca471dc20504291055f5bb88e@mail.gmail.com> [Michael Hudson] > I think the making-generators-more-sexy thing is nice, but I'm think > that's almost orthogonal. Not entirely. I agree that "continue EXPR" calling next(EXPR) which enables yield-expressions is entirely orthogonal. But there are already two PEPs asking for passing exceptions and/or cleanup into generators and from there it's only a small step to using them as resource allocation/release templates. The "small step" part is important -- given that we're going to do that work on generators anyway, I expect the changes to the compiler and VM to support the block statement are actually *less* than the changes needed to support thunks. No language feature is designed in isolation. > Did you read this mail: > > http://mail.python.org/pipermail/python-dev/2005-April/052970.html > > ? In this proposal, you have to go to some effort to make the thunk > survive the block, and I think if weirdness results, that's the > programmer's problem. It's not a complete proposal though. You say "And grudgingly, I guess you'd need to make returns behave like that anyway" (meaning they should return from the containing function). But you don't give a hint on how that could be made to happen, and I expect that by the time you've figured out a mechanism, thunks aren't all that simple any more. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rrr at ronadam.com Fri Apr 29 19:57:25 2005 From: rrr at ronadam.com (Ron Adam) Date: Fri Apr 29 19:57:59 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <fb6fbf5605042907431311af71@mail.gmail.com> References: <fb6fbf5605042907431311af71@mail.gmail.com> Message-ID: <42727585.3090801@ronadam.com> Jim Jewett wrote: >Nick Coghlan: > > >>Python offers two variants on the basic iterative loop. >> > >> "for NAME in EXPR:" skips the finalisation step. At loop >>completion, a well-behaved iterator may still contain additional values. >> > >> "for NAME from EXPR:" enforces finalisation of the iterator. >>... At loop completion, a well-behaved [finalizing] iterator is >>always completely exhausted. >> > >(nitpick): > "from" isn't really different from "in". Perhaps > > for NAME inall EXPR: > for NAME draining EXPR: > for NAME finalizing EXPR: # too hard to spell, because of s/z? > >(substance): > >"finalized or not" is a very useful distinction, but I'm not sure it >is something the user should have to worry about. Realistically, >most of my loops intend to drain the iterator (which the compiler >knows because I have no "break:". Regardless of whether I >use a break, I still want the iterator cleaned up if it is drained. > >The only thing this second loop form does is set a flag saying > > "No, I won't continue -- and I happen to know that no one else > ever will either, even if they do have a reference that prevents > garbage collection. I'm *sure* they won't use it." > >That strikes me as a dangerous thing to get in the habit of saying. > >Why not just agressively run the finalization on both forms when the >reference count permits? > > >>This form supports block management operations, >> > >And this seems unrelated to finalization. I understand that as an >implementation detail, you need to define the finalizers somehow. 
>But the decision to aggressively finalize (in some manner) and
>desire to pass a block (that could be for finalization) seem like
>orthogonal issues.
>
>-jJ
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>Unsubscribe: http://mail.python.org/mailman/options/python-dev/rrr%40ronadam.com
>
>

How about 'serve' as in a server of items from a service?

    serve NAME from EXPR:
        <block>

I think this is more descriptive of what it does and will make it
easier to explain.  It also implies the correct relationship between
the block, the name, and the expression.

I think 'block' and 'with' are both *way* too general.  The problem I
see with 'block' is that the term is often used as a general term to
describe the body of other statements.... while, for, if, ... etc.

The generator in this case could be called a 'server' which would
distinguish it from a normal generator.  By using 'serve' as a keyword,
you can then refer to the expression as a whole as a 'service' or a
'resource manager'.

And a simple description of it would be....

A SERVE statement serves NAME(s) from a SERVER to the following
statement block.

(Details of how to use SERVE blocks and SERVERS.)

Ron Adam

From gvanrossum at gmail.com Fri Apr 29 20:00:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 29 20:00:08 2005
Subject: [Python-Dev] next(arg) was: Anonymous blocks: Thunks or iterators?
In-Reply-To: <fb6fbf56050429104851d1992@mail.gmail.com>
References: <fb6fbf56050429104851d1992@mail.gmail.com>
Message-ID: <ca471dc2050429110062695af8@mail.gmail.com>

[Guido van Rossum]
> > One [of many separate ideas in PEP 340] is turning generators
> > into more general coroutines: continue EXPR passes the expression
> > to the iterator's next() method ...

[Jim Jewett]
> I would have been very happy with that a week ago.  Seeing the
> specific implementation changed my mind.
> > The caller shouldn't know what state the generator is in, so the > passed-in-message will be the same regardless of which yield > accepts it. Unless I have a single-yield generator, this means > I end up writing boilerplate code to accept and process the arg > at each yield. I don't want more boilerplate. I think your premise is wrong. When necessary (which it usually won't be) the caller can tell the generator's state from the last thing it yielded. Coroutines can easily define a protocol based on this if needed. Anyway, single-yield generators are by far the majority. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 20:01:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 20:01:04 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42727585.3090801@ronadam.com> References: <fb6fbf5605042907431311af71@mail.gmail.com> <42727585.3090801@ronadam.com> Message-ID: <ca471dc2050429110119120892@mail.gmail.com> [Ron Adam] > How about 'serve' as in a server of items from a service? No, please. This has way too strong connotations with network protocols. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Fri Apr 29 20:10:42 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Fri Apr 29 20:10:48 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <fb6fbf560504291017362204a7@mail.gmail.com> References: <fb6fbf560504291017362204a7@mail.gmail.com> Message-ID: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> Jim Jewett wrote: > The only members that need special attention are (f_code, f_lasti) > and possibly (f_blockstack, f_iblock). You don't even need to take care of f_code. The thunk and its surrounding function can share the same code. The thunk gets compiled into the function the same way the body of a for loop would. 
> (f_code, f_lasti) would need to be replaced with a stack of pairs.
> Finishing a code string would mean popping this stack, rather
> than popping the whole frame.

There doesn't need to be a stack; each thunk can store its own f_lasti.
One also needs to store f_back, and, to avoid exception weirdness,
f_exc_XXX.  In this way, calling the thunk is much like resuming a
generator.

-Brian

From mahs at telcopartners.com Fri Apr 29 20:03:33 2005
From: mahs at telcopartners.com (Michael Spencer)
Date: Fri Apr 29 20:17:05 2005
Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f:
Message-ID: <d4tsik$4be$1@sea.gmane.org>

I don't know whether it's true for all the PEP 340 use cases, but all
the current examples would read very naturally if the block-template
could be specified in an extended try statement:

> 1. A template for ensuring that a lock, acquired at the start of a
>    block, is released when the block is left:

    try with_lock(myLock):
        # Code here executes with myLock held.  The lock is
        # guaranteed to be released when the block is left (even
        # if by an uncaught exception).

> 2. A template for opening a file that ensures the file is closed
>    when the block is left:

    try opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()

> 3. A template for committing or rolling back a database
>    transaction:

    try transaction(mydb):

> 4. A template that tries something up to n times:

    try auto_retry(3):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()

> 5. It is possible to nest blocks and combine templates:

    try with_lock(myLock):
        try opening("/etc/passwd") as f:
            for line in f:
                print line.rstrip()

Michael

From jimjjewett at gmail.com Fri Apr 29 20:23:05 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 20:23:08 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
Message-ID: <fb6fbf5605042911233ce849db@mail.gmail.com>

Guido van Rossum:
> -- but it's more efficient, since calling yield doesn't create a frame.

Neither should a thunk.

> The other problem with thunks is that once we think of them as the
> anonymous functions they are, we're pretty much forced to say that a
> return statement in a thunk returns from the thunk rather than from
> the containing function.

Why should a thunk be a function?  We already have first class
functions.  What we're missing is a way to pass around a suite.

    def foo(a):
        if a > 4:
            b = a
            c = process(a)    # thunk line 1
            print a           # thunk line 2
            return            # thunk line 3
        else:
            a.something()

We don't have a good way to package up "c = process(a); print a;
return".  The return should exit the whole function, not just (part
of) the if clause.

Greg:
>> I'd like to reconsider a thunk implementation.  It
>> would be a lot simpler, doing just what is required
>> without any jiggery pokery with exceptions and
>> break/continue/return statements.  It would be easy
>> to explain what it does and why it's useful.

> I don't know. In order to obtain the required local variable sharing
> between the thunk and the containing function I believe that every
> local variable used or set in the thunk would have to become a 'cell'
> (our mechanism for sharing variables between nested scopes).

Cells only work if you have a complete set of names at compile-time.
Your own resource-example added "item" to the namespace inside a
block.  If you don't know which blocks could be used with a pattern,
cells are out.

That said, the compiler code is already two-pass.  Once to find names,
and another time to resolve them.  This just means that for thunks
(and functions that call them) the adjustment will be to LOAD_NAME
instead of getting a LOAD_FAST index.
-jJ

From jjl at pobox.com Fri Apr 29 20:31:17 2005
From: jjl at pobox.com (John J Lee)
Date: Fri Apr 29 20:30:02 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <4271B07A.4010501@hathawaymix.org>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org>
Message-ID: <Pine.WNT.4.58.0504291927120.2748@vernon>

On Thu, 28 Apr 2005, Shane Hathaway wrote:
[...]
> I think this concept can be explained clearly.  I'd like to try
> explaining PEP 340 to someone new to Python but not new to programming.
[...snip explanation...]
> Is it understandable so far?

Yes, excellent.  Speaking as somebody who scanned the PEP and this
thread and only half-understood either, that was quite painless to
read.  Still not sure whether thunks or PEP 340 are better, but I'm at
least confused on a higher level now.

John

From rrr at ronadam.com Fri Apr 29 20:35:35 2005
From: rrr at ronadam.com (Ron Adam)
Date: Fri Apr 29 20:33:38 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <ca471dc2050429110119120892@mail.gmail.com>
References: <fb6fbf5605042907431311af71@mail.gmail.com> <42727585.3090801@ronadam.com> <ca471dc2050429110119120892@mail.gmail.com>
Message-ID: <42727E77.80504@ronadam.com>

Guido van Rossum wrote:
>[Ron Adam]
>
>>How about 'serve' as in a server of items from a service?
>>
>
>No, please. This has way too strong connotations with network protocols.
>
>

Errr...  you're right of course...  :-/  (I was thinking *way* too
narrow.)

I think the context is correct, just need a synonym that isn't already
used.
    provide, provider
    supply, supplier
    dispense, dispenser
    deal, dealer
    deliver, deliverer

or parcel, meter, dish, give, dole, offer, cede...

Maybe borrow from a different language?

Ron Adam

From jimjjewett at gmail.com Fri Apr 29 20:33:54 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 20:33:56 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
In-Reply-To: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu>
References: <fb6fbf560504291017362204a7@mail.gmail.com> <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu>
Message-ID: <fb6fbf5605042911337960ee4d@mail.gmail.com>

On 4/29/05, Brian Sabbey <sabbey@u.washington.edu> wrote:
> Jim Jewett wrote:
> > The only members that need special attention are (f_code, f_lasti)
> > and possibly (f_blockstack, f_iblock).

> You don't even need to take care of f_code.  The thunk and its surrounding
> function can share the same code.  The thunk gets compiled into the
> function the same way the body of a for loop would.

This only works if you already know what the thunk's code will be when
you compile the function.  (Just splicing it in messes up jump
targets.)

> One also needs to store f_back, and, to avoid exception weirdness,
> f_exc_XXX.

f_back lists the previous stack frame (which shouldn't change during a
thunk[1]), and f_exc_XXX is for the most recent exception -- I don't
see any reason to treat thunks differently from loop bodies in that
regard.

[1] If the thunk calls another function (that needs its own frame),
then that is handled the same as any regular function call.
-jJ

From jimjjewett at gmail.com Fri Apr 29 21:01:20 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 21:01:25 2005
Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f:
Message-ID: <fb6fbf5605042912013a7f896@mail.gmail.com>

Michael Spencer:
> I don't know whether it's true for all the PEP 340 use cases, but all the
> current examples would read very naturally if the block-template could be
> specified in an extended try statement:

>> 1. A template for ensuring that a lock, acquired at the start of a
>> block, is released when the block is left:

>     try with_lock(myLock):
>         # Code here executes with myLock held.  The lock is
>         # guaranteed to be released when the block is left (even
>         # if by an uncaught exception).

So we would have try ... finally, try ... except, and try (no close).
It works for me, and should be backwards-compatible.

The cases where it doesn't work as well are:

(1)  You want to insert several different suites.  But the anonymous
yield syntax doesn't work well for that either.  (That is one of the
arguments for thunks instead of generator abuse.)

(2)  You really do want to loop over the suite.  Try doesn't imply a
loop.  But this is a *good* thing.
Resources are not loops, and you can always make the loop explicit as
iteration over the resource:

    def opener(file):
        f = open(file)
        try:
            yield f
        finally:
            f.close()

    try opener(file) as f:
        for line in f:
            process(line)

From aahz at pythoncraft.com Fri Apr 29 21:05:25 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Apr 29 21:05:27 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com>
References: <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com>
Message-ID: <20050429190525.GA2708@panix.com>

On Fri, Apr 29, 2005, Phillip J. Eby wrote:
> At 10:42 AM 4/29/05 -0700, Aahz wrote:
>>On Fri, Apr 29, 2005, Guido van Rossum wrote:
>>> [Phillip J. Eby]
>>>>
>>>> Although I'd personally prefer a no-keyword approach:
>>>>
>>>>     synchronized(self):
>>>>         with_file("foo") as f:
>>>>             # etc.
>>>
>>> I'd like that too, but it was shot down at least once. Maybe we can
>>> resurrect it?
>>>
>>>     opening("foo") as f:
>>>         # etc.
>>
>>I'm still -1 for the same reason I mentioned earlier: function calls
>>spanning multiple lines are moderately common in Python code, and it's
>>hard to distinguish these cases because multi-line calls usually get
>>indented like blocks.
>
> But the indentation of a multi-line call doesn't start with a colon.

Neither does the un-keyworded block.  It starts with a colon on the end
of the previous line.  I thought part of the point of Python was to
minimize reliance on punctuation, especially where it's not clearly
visible?
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It's 106 miles to Chicago.  We have a full tank of gas, a half-pack of
cigarettes, it's dark, and we're wearing sunglasses."  "Hit it."
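[Editor's note: Jim's `opener` template above is already valid generator code; the only piece the thread is debating is the machinery that drives it. The driving loop can be sketched today in ordinary (modern) Python with no new syntax -- the explicit `close()` call on the generator stands in for the finalisation semantics under discussion. The file setup via `tempfile` is only there to make the sketch self-contained.]

```python
import os
import tempfile

def opening(path, mode="r"):
    # Single-yield resource template in the style of opener() above:
    # acquire the resource, yield it once, release it in the finally.
    f = open(path, mode)
    try:
        yield f
    finally:
        f.close()

# Throwaway input file so the example runs as-is.
fd, path = tempfile.mkstemp()
os.write(fd, b"line one\n")
os.close(fd)

# Drive the template with a plain for loop; closing the generator in a
# finally clause guarantees the template's cleanup runs even if the
# body raises -- the behaviour the proposed statement would automate.
gen = opening(path)
try:
    for f in gen:
        text = f.read()
finally:
    gen.close()

os.remove(path)
print(text)
```

Exhausting the loop normally already runs the template's `finally` clause; the extra `gen.close()` only matters when the body exits early.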
From andre.roberge at gmail.com Fri Apr 29 22:10:57 2005
From: andre.roberge at gmail.com (André Roberge)
Date: Fri Apr 29 22:16:04 2005
Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement
In-Reply-To: <4271B7FC.1070801@pobox.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com>
Message-ID: <d4u41i$bpc$1@sea.gmane.org>

Robin Munn wrote:
[snip]
>
> Another possibility just occurred to me.  How about "using"?
>
> ~ using EXPR as VAR:
> ~     BLOCK
>

Examples from PEP 340:
==========
def synchronized(lock):
    ...
using synchronized(myLock):
    ...
=====  (+0)
def opening(filename, mode="r"):
    ...
using opening("/etc/passwd") as f:
    ...
=====  (+1)
def auto_retry(n=3, exc=Exception):
    ...
using auto_retry(3, IOError):
    ...
=====  (+1)
def synchronized_opening(lock, filename, mode="r"):
    ...
using synchronized_opening(myLock, "/etc/passwd") as f:
    ...
=====  (+1)

A.R.

From nidoizo at yahoo.com Fri Apr 29 22:26:27 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Fri Apr 29 22:27:34 2005
Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement
In-Reply-To: <ca471dc2050428223023aa80fc@mail.gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <d4s8nh$fk8$1@sea.gmane.org> <ca471dc2050428223023aa80fc@mail.gmail.com>
Message-ID: <d4u4uq$fd5$1@sea.gmane.org>

Guido van Rossum wrote:
> [Nicolas Fleury]
>> scoped EXPR as VAR:
>>     BLOCK
>
> Definitely not. In too many languages, a "scope" is a new namespace,
> and that's exactly what a block (by whichever name) is *not*.

Humm... what about "context"?

    context EXPR as VAR:
        BLOCK

I may answer the question myself, but is an alternative syntax without
an indentation conceivable?  (yes, even since the implicit block could
be run multiple times).  Because in that case, a keyword like "block"
would not look right.
It seems to me that in most RAII cases, the block could end at the end of the current block and that's fine, and over-indentation can be avoided. However, I realize that the indentation makes more sense in the context of Python and removes some magic that would be natural for a C++ programmer used to presence of stack... Ok, I answer my question, but "context" still sounds nicer to me than "block";) Regards, Nicolas From pje at telecommunity.com Fri Apr 29 23:18:52 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Apr 29 23:15:52 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429190525.GA2708@panix.com> References: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> At 12:05 PM 4/29/05 -0700, Aahz wrote: >On Fri, Apr 29, 2005, Phillip J. Eby wrote: > > At 10:42 AM 4/29/05 -0700, Aahz wrote: > >>On Fri, Apr 29, 2005, Guido van Rossum wrote: > >>> [Phillip J. Eby] > >>>> > >>>> Although I'd personally prefer a no-keyword approach: > >>>> > >>>> synchronized(self): > >>>> with_file("foo") as f: > >>>> # etc. > >>> > >>> I'd like that too, but it was shot down at least once. Maybe we can > >>> resurrect it? > >>> > >>> opening("foo") as f: > >>> # etc. > >> > >>I'm still -1 for the same reason I mentioned earlier: function calls > >>spanning multiple lines are moderately common in Python code, and it's > >>hard to distinguish these cases because multi-line calls usually get > >>indented like blocks. > > > > But the indentation of a multi-line call doesn't start with a colon. > >Neither does the un-keyworded block. 
It starts with a colon on the end >of the previous line. I thought part of the point of Python was to >minimize reliance on punctuation, especially where it's not clearly >visible? Actually, I've just realized that I was misled by your argument into thinking that the possibility of confusing a multi-line call and a block of this sort is a problem. It's not, because template blocks can be viewed as multi-line calls that just happen to include a block of code as one of the arguments. So, mistaking one for the other when you're just skimming the code and not looking at things like "as" or the ":", is really not important. In the second place, the most important cue to understanding the behavior of a template block is the template function itself; the bare syntax gives it the most prominence. Blocks like 'synchronized(self):' should be instantly comprehensible to Java programmers, for example, and 'retry(3):' is also pretty self-explanatory. And so far, template function names and signatures have been quite brief as well. From aahz at pythoncraft.com Sat Apr 30 00:43:00 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Apr 30 00:43:02 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> References: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> Message-ID: <20050429224300.GA9425@panix.com> On Fri, Apr 29, 2005, Phillip J. 
Eby wrote: > > Actually, I've just realized that I was misled by your argument into > thinking that the possibility of confusing a multi-line call and a block of > this sort is a problem. It's not, because template blocks can be viewed as > multi-line calls that just happen to include a block of code as one of the > arguments. So, mistaking one for the other when you're just skimming the > code and not looking at things like "as" or the ":", is really not > important. Maybe. I'm not persuaded, but this inclines me toward agreeing with your position. > In the second place, the most important cue to understanding the behavior > of a template block is the template function itself; the bare syntax gives > it the most prominence. Blocks like 'synchronized(self):' should be > instantly comprehensible to Java programmers, for example, and 'retry(3):' > is also pretty self-explanatory. And so far, template function names and > signatures have been quite brief as well. This works IMO IFF Python is regarded as a language with user-defined syntactical structures. Guido has historically disagreed strongly with that philosophy; until and unless he reverses his opinion, this is precisely why the non-keyword version will continue to receive -1 from me. (As it happens, I agree with Guido, so if Guido wants to change, I'll probably argue until I see good reason. ;-) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
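[Summary note: the templates being debated above, PEP 340's `opening("foo") as f:` and `synchronized(self):`, are ordinary generator functions whose `yield` marks where the user's block would run. A minimal sketch in today's Python, driving the generator by hand since the proposed block syntax does not exist; the filename and helper name are illustrative only:]

```python
import os
import tempfile

def opening(filename, mode="r"):
    # PEP 340-style template: setup before the yield, cleanup after.
    f = open(filename, mode)
    try:
        yield f          # the user's block body would run here, "as f"
    finally:
        f.close()

# Without the proposed syntax, the caller must drive the generator itself.
path = os.path.join(tempfile.gettempdir(), "pep340_demo.txt")
gen = opening(path, "w")
f = next(gen)            # runs the setup; yields the value bound by "as"
f.write("hello")
try:
    next(gen)            # resumes after the yield: runs the finally clause
except StopIteration:
    pass                 # the template is exhausted once cleanup is done
assert f.closed
os.remove(path)
```

The block-statement in the PEP is sugar for exactly this drive-the-generator loop, plus error injection into the template when the block body raises.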
From reinhold-birkenfeld-nospam at wolke7.net Sat Apr 30 00:53:12 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat Apr 30 00:55:45 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042822271a43bc83@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com> <ca471dc205042822271a43bc83@mail.gmail.com> Message-ID: <d4udl0$7j3$1@sea.gmane.org> Guido van Rossum wrote: >> Another possibility just occurred to me. How about "using"? > > Blah. I'm beginning to like block just fine. With using, the choice of > word for the generator name becomes iffy IMO; and it almost sounds > like it's a simple renaming: "using X as Y" could mean "Y = X". FWIW, the first association when seeing "block something:" is with the verb "to block", and not with the noun, which is most displeasing. Reinhold -- Mail address is perfectly valid! From gvanrossum at gmail.com Sat Apr 30 01:02:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Apr 30 01:02:18 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429224300.GA9425@panix.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> <20050429224300.GA9425@panix.com> Message-ID: <ca471dc205042916024c03501a@mail.gmail.com> [Phillip] > > In the second place, the most important cue to understanding the behavior > > of a template block is the template function itself; the bare syntax gives > > it the most prominence. Blocks like 'synchronized(self):' should be > > instantly comprehensible to Java programmers, for example, and 'retry(3):' > > is also pretty self-explanatory. 
And so far, template function names and > > signatures have been quite brief as well. [Aahz] > This works IMO IFF Python is regarded as a language with user-defined > syntactical structures. Guido has historically disagreed strongly with > that philosophy; until and unless he reverses his opinion, this is > precisely why the non-keyword version will continue to receive -1 from > me. (As it happens, I agree with Guido, so if Guido wants to change, > I'll probably argue until I see good reason. ;-) Actually, I think this is a nice way to have my cake and eat it too: on the one hand, there still isn't any user-defined syntax, because the keyword-less block syntax is still fixed by the compiler. On the other hand, people are free to *think* of it as introducing syntax if it helps them understand the code better. Just as you can think of each distinct @decorator as a separate piece of syntax that modifies a function/method definition. And just as you can think of a function call as a user-defined language extension. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rodrigobamboo at gmail.com Sat Apr 30 01:15:12 2005 From: rodrigobamboo at gmail.com (Rodrigo B. de Oliveira) Date: Sat Apr 30 01:15:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <5917478b050429161524eba5a3@mail.gmail.com> On 4/29/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > [Phillip J. Eby] > > Although I'd personally prefer a no-keyword approach: > > > > synchronized(self): > > with_file("foo") as f: > > # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! > Yes. 
I like it. EXPRESSION [as VAR]: BLOCK lock(self._monitor): # typing synchronized freaks me out spam() using(DB.open()) as conn: eggs(conn) From gvanrossum at gmail.com Sat Apr 30 01:19:59 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Apr 30 01:20:02 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <d4tsik$4be$1@sea.gmane.org> References: <d4tsik$4be$1@sea.gmane.org> Message-ID: <ca471dc205042916194f20c27@mail.gmail.com> [Michael Spencer] > I don't know whether it's true for all the PEP 340 use cases, but all the > current examples would read very naturally if the block-template could be > specified in an extended try statement: Sorry, this emphasizes the wrong thing. A try-statement emphasizes that the body may fail (and then provides some cleanup semantics). IMO a block-statement, while it has cleanup semantics, should emphasize that the block executes under some kind of supervision. The more I think about it the more I like having no keyword at all (see other messages). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From s.percivall at chello.se Sat Apr 30 01:44:38 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sat Apr 30 01:44:41 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> References: <fb6fbf560504291017362204a7@mail.gmail.com> <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> Message-ID: <0EBAEB8A-29A5-4B51-9894-F808993EC0A3@chello.se> On 29 apr 2005, at 20.10, Brian Sabbey wrote: > [...] The thunk and its surrounding function can share the same > code. The thunk gets compiled into the function the same way the > body of a for loop would. This seems really, truly, nasty! Wouldn't this require you to check the source code of the function you want to integrate your thunk into to avoid namespace collisions? 
Well, no, not to avoid collisions I guess, if it's truly regarded as part of the function. But this means it would use the function's global namespace, etc. You'd be unable to use anything from the scopes in which the thunk is defined, which makes it really, really ... weird. Or have I not gotten it? //Simon From ncoghlan at gmail.com Sat Apr 30 01:55:22 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 30 01:55:27 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42725691.4030308@cirad.fr> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com> <42725691.4030308@cirad.fr> Message-ID: <4272C96A.5080709@gmail.com> Pierre Barbier de Reuille wrote: > Mmmmh ... why introduce a new flag? Can't you just test the presence of > the "__error__" method? This would lift your problem, wouldn't it? Perhaps - it would require doing something a little tricky with generators to allow the programmer to specify whether the generator should be finalised or not. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From shane.holloway at ieee.org Sat Apr 30 02:52:27 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Sat Apr 30 02:52:54 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <4272D6CB.5070806@ieee.org> Guido van Rossum wrote: > [Phillip J. Eby] > >>Although I'd personally prefer a no-keyword approach: >> >> synchronized(self): >> with_file("foo") as f: >> # etc. 
> > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! +1 Certainly my favorite because it's direct and easy on the eyes. Second would be:: in opening("foo") as f: # etc. because I can see Aahz's point about introducing the block with a keyword instead of relying on the ":" punctuation and subsequent indentation of the block for skimming code. -Shane Holloway From python-dev at zesty.ca Sat Apr 30 03:21:26 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 30 03:21:29 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <ca471dc205042916194f20c27@mail.gmail.com> References: <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> Message-ID: <Pine.LNX.4.58.0504292007510.4786@server1.LFW.org> On Fri, 29 Apr 2005, Guido van Rossum wrote: > The more I think about it the more I like having no keyword at all > (see other messages). I hope you'll reconsider this. I really think introducing a new statement requires a keyword, for pedagogical reasons as well as readability and consistency. Here's my pitch: All the statements in Python are associated with keywords, except for assignment, which is simple and extremely common. I don't think the block statement is simple enough or common enough for that; its semantics are much too significant to be flagged only by a little punctuation mark like a colon. I can empathize with wanting to avoid a keyword in order to avoid an endless debate about what the keyword will be. But that debate can't be avoided anyway -- we still have to agree on what to call this thing when talking about it and teaching it. The keyword gives us a name, a conceptual tag from which to hang our knowledge and discussions. Once we have a keyword, there can be no confusion about what to call the construct. 
And if there is a distinctive keyword, a Python programmer who comes across this unfamiliar construct will be able to ask someone "What does this 'spam' keyword mean?" or can search on Google for "Python spam" to find out what it means. Without a keyword, they're out of luck. Names are power. -- ?!ng From pje at telecommunity.com Sat Apr 30 03:52:07 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Apr 30 03:49:22 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042916024c03501a@mail.gmail.com> References: <20050429224300.GA9425@panix.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> <20050429224300.GA9425@panix.com> Message-ID: <5.1.1.6.0.20050429212046.032abd70@mail.telecommunity.com> At 04:02 PM 4/29/05 -0700, Guido van Rossum wrote: >Actually, I think this is a nice way to have my cake and eat it too: >on the one hand, there still isn't any user-defined syntax, because >the keyword-less block syntax is still fixed by the compiler. On the >other hand, people are free to *think* of it as introducing syntax if >it helps them understand the code better. Just as you can think of >each distinct @decorator as a separate piece of syntax that modifies a >function/method definition. And just as you can think of a function >call as a user-defined language extension. And, amusingly enough, those folks who wanted a decorator suite can now have their wish, e.g.: decorate(classmethod): def something(cls, blah): ... Given a suitable frame-sniffing implementation of 'decorate'. :) By the way, I notice PEP 340 has two outstanding items with my name on them; let me see if I can help eliminate one real quick. 
Tracebacks: it occurs to me that I may have unintentionally given the impression that I need to pass in an arbitrary traceback, when in fact I only need to pass in the current sys.exc_info(). So, if the error call-in doesn't pass in anything but an error flag, and the template iterator is supposed to just read sys.exc_info(), maybe that would be less of an issue? For one thing, it would make handling arbitrary errors in the template block cleaner, because the traceback for unhandled errors in something like this: synchronized(foo): raise Bar would look something like this: File .... line ... of __main__: synchronized(foo): File .... line ... of synchronized: yield File .... line ... of __main__: raise Bar Which, IMO, is the "correct" traceback for this circumstance, although since the first and last frame would actually be the same, you'd probably only get the lower two entries (the yield and the raise), which is OK too I think. Anyway, I mainly just wanted to note that I'd be fine with having a way to say, "Hey, there's an error, handle it" that doesn't allow passing in the exception or traceback, but is just a flag that means "look at Python's error state" instead of passing a value back in. I can do this because when I need to pass in a traceback, it's because I'm trying to pass a terminated coroutine's error into another coroutine. So, the traceback I want to pass in is Python's existing "last error" state anyway. From pje at telecommunity.com Sat Apr 30 03:54:47 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Apr 30 03:52:01 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <Pine.LNX.4.58.0504292007510.4786@server1.LFW.org> References: <ca471dc205042916194f20c27@mail.gmail.com> <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> Message-ID: <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> At 08:21 PM 4/29/05 -0500, Ka-Ping Yee wrote: >All the statements in Python are associated with keywords, except >for assignment, which is simple and extremely common. I don't >think the block statement is simple enough or common enough for >that; its semantics are much too significant to be flagged only >by a little punctuation mark like a colon. Don't forget the 'as' clause. >I can empathize with wanting to avoid a keyword in order to >avoid an endless debate about what the keyword will be. But >that debate can't be avoided anyway -- we still have to agree >on what to call this thing when talking about it and teaching it. A "template invocation", perhaps, for the statement, and a "templated block" for the actual block. The expression part of the statement would be the "template expression" which must result in a "template iterator". >The keyword gives us a name, a conceptual tag from which to hang >our knowledge and discussions. Once we have a keyword, there >can be no confusion about what to call the construct. And if >there is a distinctive keyword, a Python programmer who comes >across this unfamiliar construct will be able to ask someone >"What does this 'spam' keyword mean?" or can search on Google for >"Python spam" to find out what it means. Without a keyword, >they're out of luck. Names are power. help(synchronized) or help(retry) would doubtless display useful information. Conversely, try Googling for Python's "for" or "if" keywords, and see if you get anything useful -- I didn't. 
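[Summary note: Phillip's help() point above is easy to check. A PEP 340 template is just a function (a generator, under the PEP's iterator approach), so its docstring is exactly what help() would page. A small sketch in today's Python; pydoc.render_doc builds the same text interactive help() shows, and the synchronized template follows the PEP's own example:]

```python
import pydoc
import threading

def synchronized(lock):
    """Acquire *lock*, run the block body, then release the lock on the
    way out, even if the body raises."""
    lock.acquire()
    try:
        yield             # the user's block body would run here
    finally:
        lock.release()

# pydoc.render_doc produces the same text help(synchronized) would display,
# so the template's docstring really is one help() call away.
doc = pydoc.render_doc(synchronized)
assert "run the block body" in doc

# The template also works when driven by hand, sans block syntax:
lock = threading.Lock()
gen = synchronized(lock)
next(gen)                 # acquires the lock
assert lock.locked()
try:
    next(gen)             # resumes past the yield, releasing the lock
except StopIteration:
    pass
assert not lock.locked()
```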
From python-dev at zesty.ca Sat Apr 30 09:44:20 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 30 09:44:23 2005 Subject: [Python-Dev] Keyword for block statements In-Reply-To: <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> References: <ca471dc205042916194f20c27@mail.gmail.com> <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> Message-ID: <Pine.LNX.4.58.0504300227440.4786@server1.LFW.org> On Fri, 29 Apr 2005, Phillip J. Eby wrote: > At 08:21 PM 4/29/05 -0500, Ka-Ping Yee wrote: > >All the statements in Python are associated with keywords, except > >for assignment, which is simple and extremely common. I don't > >think the block statement is simple enough or common enough for > >that; its semantics are much too significant to be flagged only > >by a little punctuation mark like a colon. > > Don't forget the 'as' clause. It's optional, and you have to skip an arbitrarily long expression to get to it. > >if there is a distinctive keyword, a Python programmer who comes > >across this unfamiliar construct will be able to ask someone > >"What does this 'spam' keyword mean?" or can search on Google for > >"Python spam" to find out what it means. Without a keyword, > >they're out of luck. Names are power. > > help(synchronized) or help(retry) would doubtless display useful > information. The programmer who writes the function used to introduce a block can hardly be relied upon to explain the language semantics. We don't expect the docstring of every class to repeat an explanation of Python classes, for example. The language reference manual is for that; it's a different level of documentation. > Conversely, try Googling for Python's "for" or "if" keywords, > and see if you get anything useful -- I didn't. 
I tried some of my favourite Python keywords :) and found that the following searches all successfully turn up information on the associated kinds of Python statements in the first couple of hits: python if python else python del python while python assert python yield python break python continue python pass python raise python try python finally python class python for statement python return statement python print statement -- ?!ng From python at rcn.com Sat Apr 30 17:34:22 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Apr 30 17:35:36 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1112972430.19904.9.camel@geddy.wooz.org> Message-ID: <001101c54d9a$22c70da0$8d22c797@oemcomputer> I haven't heard back from Greg Stein, Jim Fulton, or Paul Prescod. If anyone can get in touch with them, that would be great. I suspect that Jim may want to keep the commit privileges active and that Paul and Greg are done with commits for the time being. Raymond Hettinger From aleaxit at yahoo.com Sat Apr 30 20:59:53 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Apr 30 20:59:55 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <001101c54d9a$22c70da0$8d22c797@oemcomputer> References: <001101c54d9a$22c70da0$8d22c797@oemcomputer> Message-ID: <e807dd3a37c599a5629f73c8edae0f8f@yahoo.com> On Apr 30, 2005, at 08:34, Raymond Hettinger wrote: > I haven't heard back from Greg Stein, Jim Fulton, or Paul Prescod. > > If anyone can get in touch with them, that would be great. > I suspect that Jim may want to keep the commit privileges active > and that Paul and Greg are done with commits for the time being. 
Greg (gstein at lyra dot org, also gstein at google dot com), I assume, might also want to keep the commit privileges -- he's now working on the opensource projects at Google, and actively speaking about "Python at Google" (he did so both at Pycon and ACCU/PythonUK), so it seems far from unlikely to me that he might be back to active contributions soon. Anyway, you can ask him directly. Alex From prescod at gmail.com Sat Apr 30 22:38:58 2005 From: prescod at gmail.com (Paul Prescod) Date: Sat Apr 30 22:39:00 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <001101c54d9a$22c70da0$8d22c797@oemcomputer> References: <1112972430.19904.9.camel@geddy.wooz.org> <001101c54d9a$22c70da0$8d22c797@oemcomputer> Message-ID: <1cb725390504301338dccb8c9@mail.gmail.com> I haven't been using Python recently and don't have plans to contribute to its development. Go ahead and drop me from the list. From python at rcn.com Sat Apr 30 23:21:42 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Apr 30 23:22:04 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1cb725390504301338dccb8c9@mail.gmail.com> Message-ID: <000901c54dca$9acbb0a0$8d22c797@oemcomputer> Thanks for the note. Let me know if you need to be switched on again at some point. Raymond Hettinger > -----Original Message----- > From: Paul Prescod [mailto:prescod@gmail.com] > Sent: Saturday, April 30, 2005 4:39 PM > To: Raymond Hettinger > Cc: python-dev@python.org > Subject: Re: [Python-Dev] Developer list update > > I haven't been using Python recently and don't have plans to > contribute to its development. Go ahead and drop me from the list.