From anthony at interlink.com.au Fri Apr 1 05:27:36 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Fri Apr 1 05:30:33 2005 Subject: [Python-Dev] RE: [Python-checkins] python/dist/src/Lib/logging handlers.py, 1.19, 1.19.2.1 In-Reply-To: <003101c53634$a60b87e0$d2bc958d@oemcomputer> References: <003101c53634$a60b87e0$d2bc958d@oemcomputer> Message-ID: <200504011327.37278.anthony@interlink.com.au> On Friday 01 April 2005 07:00, Raymond Hettinger wrote: > > Tag: release24-maint > > handlers.py > > Log Message: > > Added optional encoding argument to File based handlers and improved > > error handling for SysLogHandler > > Are you sure you want to backport an API change and new feature? What Raymond said. Please don't add new features to the maintenance branch. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From bac at OCF.Berkeley.EDU Fri Apr 1 11:39:01 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Apr 1 11:39:06 2005 Subject: [Python-Dev] python-dev Summary for 2005-03-16 through 2005-03-31 [draft] Message-ID: <424D16B5.4090204@ocf.berkeley.edu> OK, so here is my final Summary. Like to send it out some time this weekend so please get corrections in ASAP. -------------------------------- ===================== Summary Announcements ===================== --------------- My last summary --------------- So, after nearly 2.5 years, this is my final python-dev Summary. Steve Bethard, Tim Lesher, and Tony Meyer will be taking over for me starting with the April 1 - April 15 summary (and no, this is not an elaborate April Fool's). I have learned a ton during my time doing the Summaries and I appreciate python-dev allowing me to do them all this time. Hopefully I will be able to contribute more now in a programming capacity thanks to having more free time. -------------------- PyCon was fantastic! -------------------- For those of you who missed PyCon, you missed a great one! It is actually my favorite PyCon to date. 
Already looking forward to next year. -------------------- Python fireside chat -------------------- Scott David Daniels requested a short little blurb from me expounding on my thoughts on Python. Not one to pass on an opportunity to just open myself and possibly shoot myself in the foot, I figured I would take up the idea. So hear we go. First, I suspect Python 3000 stuff will start to make its way into Python. Stuff that doesn't break backwards compatibility will most likely start to be implemented as we head toward the Python 2.9 barrier (Guido has stated several times that there will never be a Python 2.10). Things that are not backwards-compatible will most likely end up being hashed out in various PEPs. All of this will allow the features in Python 3000 to be worked in over time so there is not a huge culture shock. As for things behind the scenes, work on the back-end will move forward. Guido himself has suggested that JIT work should be looked into (according to an interview at http://www.devsource.com/article2/0,1759,1778272,00.asp). I know I plan to fiddle with the back-end to see if the compiler can be made to do more work. Otherwise I expect changes to be made, flame wars to come and go, and for someone else to write the python-dev Summaries. =) ========= Summaries ========= ---------------- Python 2.4.1 out ---------------- Anthony Baxter, on behalf of python-dev, has released `Python 2.4.1`_. .. _Python 2.4.1: http://www.python.org/2.4.1/ Contributing threads: - `RELEASED Python 2.4.1, release candidate 1 `__ - `RELEASED Python 2.4.1, release candidate 2 `__ - `BRANCH FREEZE for 2.4.1 final, 2005-03-30 00:00 UTC `__ - `RELEASED Python 2.4.1 (final) `__ ----------------- AST branch update ----------------- I, along with some other people, sprinted on the AST branch at PyCon. 
This led to a much more fleshed out design document (found in Python/compile.txt in the AST branch), the ability to build on Windows, and applying Nick Coghlan's fix for hex numbers. Nick also did some more patch work and asked how AST work should be tagged. There is now an AST category on SourceForge that people should use to flag things as for the AST. They should also, by default, assign such items to me ("bcannon" on SF). We have also taken to flagging threads on the AST with "[AST]" as the first item in the subject line.

There was also a slight discussion/clarification on the functions named marshal_write_*() that output a byte format for the AST that is supposed to be agnostic of implementation. This will most likely end up being used as the way to pass AST objects back and forth between C and Python code. But with the name collision of the word "marshal" with the actual 'marshal' module, it needs to be changed. I have suggested

- byte_encode
- linear_form
- zephyr_encoding
- flat_form
- flat_prefix
- prefix_form

while Nick Coghlan suggested

- linear_ast
- bytestream_ast

Obviously I prefer "form" and Nick prefers "ast". With Nick's reply being independent of mine it will most likely have "linear" or "byte" in the name.

With the patches for descriptors and generator expressions sitting on SF, syntactic support for all of Python 2.4 should get applied shortly. After that it will come down to bug hunting and such. There is a todo list in the design doc for those interested in helping out.

Contributing threads:

- `Procedure for AST Branch patches `__
- `[AST] A somewhat less trivial patch than the last one. . . `__
- `[AST] question about marshal_write_*() fxns `__

-------------------------------------------------------
Putting docstrings before function declarations is ugly
-------------------------------------------------------

The idea of moving docstrings after a 'def' was proposed, making it like most other practices in other languages.
But very quickly people spoke up against the suggestion. A main argument was people just like the current way much better. I personally like the style so much that even in my C code I put the comment for all functions after the first curly brace, indented to match the flow of code.

There was also an issue of ambiguity. How do you tell where the docstring for a module is when there is a function definition with a comment right after?::

    """Module doc"""

    """Fxn doc"""
    def foo(): pass

There is an ambiguity there thanks to constant string concatenation. In the end no one seemed to like the idea.

Contributing threads:

- `docstring before function declaration `__

-------------------------------------------
PyPI improvements thanks to PyCon sprinting
-------------------------------------------

Thanks to the hard work of Richard Jones, "Fred Drake, Sean Reifschneider, Martin v. Löwis, Mick Twomey, John Camara, Andy Harrington, Andrew Kuchling, David Goodger and Ian Bicking (with Barry Warsaw in a supporting role)" according to Richard, there are a bunch of new features to PyPI_ (pronounced "pippy" to prevent name clashes with PyPy). These improvements include using reST_ for descriptions, a new 'upload' feature for Distutils (requires Python 2.5), ability to sign releases using OpenPGP (requires Python 2.5), metadata fields are now expected to be UTF-8 encoded, interface cleanup, and saner URLs for projects (e.g., http://www.python.org/pypi/roundup/0.8.2).

.. _PyPI: http://www.python.org/pypi/

Contributing threads:

- `New PyPI broken package editing `__
- `Re: python/dist/src/Lib/distutils/command upload.py, 1.3, 1.4 `__

-------------------------------
Decorators for class statements
-------------------------------

The desire to have decorators applied to class statements was brought up once again. Guido quickly responded, though, stating that unless a compelling use case showed them to be much more useful than metaclasses, it just would not happen.
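As an illustration of the metaclass alternative Guido is pointing to, here is a hypothetical plugin registry (the names are invented for this sketch, and it uses the modern ``metaclass=`` syntax rather than the ``__metaclass__`` spelling of the time):

```python
registry = []

class PluginMeta(type):
    """Record every class created with this metaclass."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        registry.append(cls)
        return cls

class Plugin(metaclass=PluginMeta):
    pass

# Subclasses inherit the metaclass, so they are registered too,
# with no per-class decoration needed.
class JSONPlugin(Plugin):
    pass
```

Since the metaclass is inherited, every subclass is picked up automatically, which is one reason it covers many of the use cases proposed for class decorators.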
Contributing threads:

- `@decoration of classes `__

===============
Skipped Threads
===============

+ itertools.walk()
+ Problems with definition of _POSIX_C_SOURCE
+ thread semantics for file objects Assume nothing is thread-safe
+ Draft PEP to make file objects support non-blocking mode.
+ Faster Set.discard() method?
+ __metaclass__ problem
+ Example workaround classes for using Unicode with csv module...
+ Change 'env var BROWSER override' semantics in webbrowser.py
+ bdist_deb checkin comments
+ Python 2.4 | 7.3 The for statement
+ Patch review: all webbrowser.py related patches up to 2005-03-20
+ webbrowser.py: browser >/dev/null 2>&1
+ C API for the bool type?
+ Shorthand for lambda
+ FYI: news items about Burton Report on P-languages
+ using SCons to build Python
+ 64-bit sequence and buffer protocol
+ Pickling instances of nested classes
+ python.org/sf URLs aren't working?

From walter at livinglogic.de Fri Apr 1 13:17:24 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri Apr 1 13:17:29 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424C561F.5060409@strakt.com>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com>
Message-ID: <424D2DC4.4060904@livinglogic.de>

Samuele Pedroni wrote:
>> [...]
>> And having the full name of the class available would certainly help
>> in debugging.
>
> that's probably the only plus point but the names would be confusing wrt
> modules vs. classes.

You'd probably need a different separator in repr.
XIST does this:

>>> from ll.xist.ns import html
>>> html.a.Attrs.href

> My point was that enabling reduce hooks at the metaclass level has
> probably other interesting applications, is far less complicated than
> your proposal to implement, it does not further complicate the notion of
> what happens at class creation time, and indeed avoids the
> implementation costs (for all python impls) of your proposal and still
> allows fairly generic solutions to the problem at hand because the
> solution can be formulated at the metaclass level.

Pickling classes like objects (i.e. by using the pickling methods in their (meta-)classes) solves only the second part of the problem: Finding the nested classes in the module on unpickling. The other problem is to add additional info to the inner class, which gets pickled and makes it findable on unpickling.

> If pickle.py is patched along these lines [*] (strawman impl, not much
> tested but test_pickle.py still passes, needs further work to support
> __reduce_ex__ and cPickle would need similar changes) then this example
> works:
>
> class HierarchMeta(type):
>     """metaclass such that inner classes know their outer class, with
>     pickling support"""
>     def __new__(cls, name, bases, dic):
>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]

I did something similar to this in XIST, but the problem with this approach is that in:

class Foo(Elm):
    pass

class Bar(Elm):
    Baz = Foo

the class Foo will get its _outer_ set to Bar although it shouldn't.

> [...]
>     def __reduce__(cls):
>         if hasattr(cls, '_outer_'):
>             return getattr, (cls._outer_, cls.__name__)
>         else:
>             return cls.__name__

I like this approach: Instead of hardcoding how references to classes are pickled (pickle the __name__), delegate it to the metaclass.

BTW, if classes and functions are picklable, why aren't modules:

>>> import urllib, cPickle
>>> cPickle.dumps(urllib.URLopener)
'curllib\nURLopener\np1\n.'
>>> cPickle.dumps(urllib.splitport)
'curllib\nsplitport\np1\n.'
>>> cPickle.dumps(urllib)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/local/lib/python2.4/copy_reg.py", line 69, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle module objects

We'd just have to pickle the module name.

Bye,
Walter Dörwald

From pedronis at strakt.com Fri Apr 1 15:57:37 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Fri Apr 1 15:57:47 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424D2DC4.4060904@livinglogic.de>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com> <424D2DC4.4060904@livinglogic.de>
Message-ID: <424D5351.1070305@strakt.com>

Walter Dörwald wrote:
> Samuele Pedroni wrote:
>
>>> [...]
>>> And having the full name of the class available would certainly help
>>> in debugging.
>>
>> that's probably the only plus point but the names would be confusing wrt
>> modules vs. classes.
>
> You'd probably need a different separator in repr. XIST does this:
>
> >>> from ll.xist.ns import html
> >>> html.a.Attrs.href
>
>> My point was that enabling reduce hooks at the metaclass level has
>> probably other interesting applications, is far less complicated than
>> your proposal to implement, it does not further complicate the notion of
>> what happens at class creation time, and indeed avoids the
>> implementation costs (for all python impls) of your proposal and still
>> allows fairly generic solutions to the problem at hand because the
>> solution can be formulated at the metaclass level.
>
> Pickling classes like objects (i.e. by using the pickling methods in
> their (meta-)classes) solves only the second part of the problem:
> Finding the nested classes in the module on unpickling. The other
> problem is to add additional info to the inner class, which gets
> pickled and makes it findable on unpickling.
>
>> If pickle.py is patched along these lines [*] (strawman impl, not much
>> tested but test_pickle.py still passes, needs further work to support
>> __reduce_ex__ and cPickle would need similar changes) then this
>> example works:
>>
>> class HierarchMeta(type):
>>     """metaclass such that inner classes know their outer class, with
>>     pickling support"""
>>     def __new__(cls, name, bases, dic):
>>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
>
> I did something similar to this in XIST, but the problem with this
> approach is that in:
>
> class Foo(Elm):
>     pass
>
> class Bar(Elm):
>     Baz = Foo
>
> the class Foo will get its _outer_ set to Bar although it shouldn't.

this should approximate that behavior better: [not tested]

import sys

....
    def __new__(cls, name, bases, dic):
        sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
        newtype = type.__new__(cls, name, bases, dic)
        for x in sub:
            if (not hasattr(x, '_outer_') and
                getattr(sys.modules.get(x.__module__), x.__name__, None) is not x):
                x._outer_ = newtype
        return newtype
.....

we don't set _outer_ if a way to pickle the class is already there

From tjreedy at udel.edu Fri Apr 1 18:21:40 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Apr 1 18:23:55 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31[draft]
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID:

>This led to a much more fleshed out design document
> (found in Python/compile.txt in the AST branch),

The directory URL

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/?only_with_tag=ast-branch

or even the file URL

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.10&only_with_tag=ast-branch&view=auto

would be helpful to people not fully familiar with the depository and the required prefix to 'Python' (versus 'python').
I initially found the two-year-old

http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/sandbox/ast/

>The idea of moving docstrings after a 'def' was proposed

/after/before/

From Scott.Daniels at Acm.Org Fri Apr 1 19:01:08 2005
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Fri Apr 1 19:02:35 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31 [draft]
In-Reply-To: <424D16B5.4090204@ocf.berkeley.edu>
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID:

Brett C. wrote:
> ... I figured I would take up the idea. So hear
                                            ^^ here ^^
> we go.

From walter at livinglogic.de Fri Apr 1 21:29:44 2005
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri Apr 1 21:29:47 2005
Subject: [Python-Dev] Pickling instances of nested classes
In-Reply-To: <424D5351.1070305@strakt.com>
References: <1752.84.56.104.245.1112132476.squirrel@isar.livinglogic.de> <4249E904.30808@strakt.com> <424C0CF4.6040607@livinglogic.de> <424C561F.5060409@strakt.com> <424D2DC4.4060904@livinglogic.de> <424D5351.1070305@strakt.com>
Message-ID: <424DA128.2060809@livinglogic.de>

Samuele Pedroni wrote:
> [...]
>
> this should approximate that behavior better: [not tested]
>
> import sys
>
> ....
>     def __new__(cls, name, bases, dic):
>         sub = [x for x in dic.values() if isinstance(x,HierarchMeta)]
>         newtype = type.__new__(cls, name, bases, dic)
>         for x in sub:
>             if not hasattr(x, '_outer_') and
>                 getattr(sys.modules.get(x.__module__), x.__name__, None) is not x:
>                 x._outer_ = newtype
>         return newtype
>
> .....
>
> we don't set _outer_ if a way to pickle the class is already there

This doesn't fix

class Foo:
    class Bar:
        pass

class Baz:
    Bar = Foo.Bar

but this should be a simple fix.
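For reference, the lookup problem this thread wrestles with is exactly what later Python versions solved by pickling classes under their qualified name (pickle protocol 4 and ``__qualname__``); a minimal sketch in modern Python, not the 2005 behavior being patched above:

```python
import pickle

class Outer:
    class Inner:
        pass

# Protocol 4 records the qualified name "Outer.Inner", so the nested
# class can be located again on unpickling without any _outer_
# bookkeeping on the class objects themselves.
data = pickle.dumps(Outer.Inner, protocol=4)
restored = pickle.loads(data)
```

Unpickling walks the dotted qualname from the module, which is essentially the ``getattr(cls._outer_, cls.__name__)`` idea generalized.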
Bye,
Walter Dörwald

From ejones at uwaterloo.ca Fri Apr 1 21:36:07 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Fri Apr 1 21:35:31 2005
Subject: [Python-Dev] Unicode byte order mark decoding
Message-ID: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca>

I recently rediscovered this strange behaviour in Python's Unicode handling. I *think* it is a bug, but before I go and try to hack together a patch, I figure I should run it by the experts here on Python-Dev. If you understand Unicode, please let me know if there are problems with making these minor changes.

>>> import codecs
>>> codecs.BOM_UTF8.decode( "utf8" )
u'\ufeff'
>>> codecs.BOM_UTF16.decode( "utf16" )
u''

Why does the UTF-16 decoder discard the BOM, while the UTF-8 decoder turns it into a character? The UTF-16 decoder contains logic to correctly handle the BOM. It even handles byte swapping, if necessary. I propose that the UTF-8 decoder should have the same logic: it should remove the BOM if it is detected at the beginning of a string. This will remove a bit of manual work for Python programs that deal with UTF-8 files created on Windows, which frequently have the BOM at the beginning. The Unicode standard is unclear about how it should be handled (version 4, section 15.9):

> Although there are never any questions of byte order with UTF-8 text,
> this sequence can serve as signature for UTF-8 encoded text where the
> character set is unmarked. [...] Systems that use the byte order mark
> must recognize when an initial U+FEFF signals the byte order. In those
> cases, it is not part of the textual content and should be removed
> before processing, because otherwise it may be mistaken for a
> legitimate zero width no-break space.

At the very least, it would be nice to add a note about this to the documentation, and possibly add this example function that implements the "UTF-8 or ASCII?"
logic:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else: return s.decode( "ascii" )

As a second issue, the UTF-16LE and UTF-16BE decoders almost do the right thing: They turn the BOM into a character, just like the Unicode specification says they should.

>>> codecs.BOM_UTF16_LE.decode( "utf-16le" )
u'\ufeff'
>>> codecs.BOM_UTF16_BE.decode( "utf-16be" )
u'\ufeff'

However, they also *incorrectly* handle the reversed byte order mark:

>>> codecs.BOM_UTF16_BE.decode( "utf-16le" )
u'\ufffe'

This is *not* a valid Unicode character. The Unicode specification (version 4, section 15.8) says the following about non-characters:

> Applications are free to use any of these noncharacter code points
> internally but should never attempt to exchange them. If a
> noncharacter is received in open interchange, an application is not
> required to interpret it in any way. It is good practice, however, to
> recognize it as a noncharacter and to take appropriate action, such as
> removing it from the text. Note that Unicode conformance freely allows
> the removal of these characters. (See C10 in Section 3.2, Conformance
> Requirements.)

My interpretation of the specification means that Python should silently remove the character, resulting in a zero length Unicode string. Similarly, both of the following lines should also result in a zero length Unicode string:

>>> '\xff\xfe\xfe\xff'.decode( "utf16" )
u'\ufffe'
>>> '\xff\xfe\xff\xff'.decode( "utf16" )
u'\uffff'

Thanks for your feedback,

Evan Jones

From mal at egenix.com Fri Apr 1 22:19:40 2005
From: mal at egenix.com (M.-A.
Lemburg) Date: Fri Apr 1 22:19:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> Message-ID: <424DACDC.4080601@egenix.com> Evan Jones wrote: > I recently rediscovered this strange behaviour in Python's Unicode > handling. I *think* it is a bug, but before I go and try to hack > together a patch, I figure I should run it by the experts here on > Python-Dev. If you understand Unicode, please let me know if there are > problems with making these minor changes. > > >>>> import codecs >>>> codecs.BOM_UTF8.decode( "utf8" ) > u'\ufeff' >>>> codecs.BOM_UTF16.decode( "utf16" ) > u'' > > Why does the UTF-16 decoder discard the BOM, while the UTF-8 decoder > turns it into a character? The BOM (byte order mark) was a non-standard Microsoft invention to detect Unicode text data as such (MS always uses UTF-16-LE for Unicode text files). It is not needed for the UTF-8 because that format doesn't rely on the byte order and the BOM character at the beginning of a stream is a legitimate ZWNBSP (zero width non breakable space) code point. The "utf-16" codec detects and removes the mark, while the two others "utf-16-le" (little endian byte order) and "utf-16-be" (big endian byte order) don't. > The UTF-16 decoder contains logic to > correctly handle the BOM. It even handles byte swapping, if necessary. I > propose that the UTF-8 decoder should have the same logic: it should > remove the BOM if it is detected at the beginning of a string. -1; there's no standard for UTF-8 BOMs - adding it to the codecs module was probably a mistake to begin with. You usually only get UTF-8 files with BOM marks as the result of recoding UTF-16 files into UTF-8. > This will > remove a bit of manual work for Python programs that deal with UTF-8 > files created on Windows, which frequently have the BOM at the > beginning. 
> The Unicode standard is unclear about how it should be
> handled (version 4, section 15.9):
>
>> Although there are never any questions of byte order with UTF-8 text,
>> this sequence can serve as signature for UTF-8 encoded text where the
>> character set is unmarked. [...] Systems that use the byte order mark
>> must recognize when an initial U+FEFF signals the byte order. In those
>> cases, it is not part of the textual content and should be removed
>> before processing, because otherwise it may be mistaken for a
>> legitimate zero width no-break space.
>
> At the very least, it would be nice to add a note about this to the
> documentation, and possibly add this example function that implements
> the "UTF-8 or ASCII?" logic:
>
> def autodecode( s ):
>     if s.startswith( codecs.BOM_UTF8 ):
>         # The byte string s is UTF-8
>         out = s.decode( "utf8" )
>         return out[1:]
>     else: return s.decode( "ascii" )

Well, I'd say that's a very English way of dealing with encoded text ;-)

BTW, how do you know that s came from the start of a file and not from slicing some already loaded file somewhere in the middle ?

> As a second issue, the UTF-16LE and UTF-16BE decoders almost do the
> right thing: They turn the BOM into a character, just like the Unicode
> specification says they should.
>
>>>> codecs.BOM_UTF16_LE.decode( "utf-16le" )
> u'\ufeff'
>>>> codecs.BOM_UTF16_BE.decode( "utf-16be" )
> u'\ufeff'
>
> However, they also *incorrectly* handle the reversed byte order mark:
>
>>>> codecs.BOM_UTF16_BE.decode( "utf-16le" )
> u'\ufffe'
>
> This is *not* a valid Unicode character. The Unicode specification
> (version 4, section 15.8) says the following about non-characters:
>
>> Applications are free to use any of these noncharacter code points
>> internally but should never attempt to exchange them. If a
>> noncharacter is received in open interchange, an application is not
>> required to interpret it in any way.
It is good practice, however, to >> recognize it as a noncharacter and to take appropriate action, such as >> removing it from the text. Note that Unicode conformance freely allows >> the removal of these characters. (See C10 in Section3.2, Conformance >> Requirements.) > > > My interpretation of the specification means that Python should silently > remove the character, resulting in a zero length Unicode string. > Similarly, both of the following lines should also result in a zero > length Unicode string: > >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > u'\ufffe' >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > u'\uffff' Hmm, wouldn't it be better to raise an error ? After all, a reversed BOM mark in the stream looks a lot like you're trying to decode a UTF-16 stream assuming the wrong byte order ?! Other than that: +1 on fixing this case. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 01 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From bac at OCF.Berkeley.EDU Fri Apr 1 22:52:47 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Fri Apr 1 22:52:53 2005
Subject: [Python-Dev] Re: python-dev Summary for 2005-03-16 through 2005-03-31[draft]
In-Reply-To:
References: <424D16B5.4090204@ocf.berkeley.edu>
Message-ID: <424DB49F.4060607@ocf.berkeley.edu>

Terry Reedy wrote:
>>This led to a much more fleshed out design document
>>(found in Python/compile.txt in the AST branch),
>
> The directory URL
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/?only_with_tag=ast-branch
>
> or even the file URL
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/Attic/compile.txt?rev=1.1.2.10&only_with_tag=ast-branch&view=auto
>
> would be helpful to people not fully familiar with the depository and the
> required prefix to 'Python' (versus 'python'). I initially found the
> two-year-old
>
> http://cvs.sourceforge.net/viewcvs.py/python/python/nondist/sandbox/ast/

Yeah, that has become a popular suggestion. It has been fixed. Just didn't think about it. One of those instances where I have been neck-deep in python-dev for so long I forgot that not everyone has a CVS checkout. =)

>>The idea of moving docstrings after a 'def' was proposed
>
> /after/before/

Fixed. Thanks, Terry.

-Brett

From ejones at uwaterloo.ca Sat Apr 2 05:04:11 2005
From: ejones at uwaterloo.ca (Evan Jones)
Date: Sat Apr 2 05:03:36 2005
Subject: [Python-Dev] Unicode byte order mark decoding
In-Reply-To: <424DACDC.4080601@egenix.com>
References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com>
Message-ID:

On Apr 1, 2005, at 15:19, M.-A. Lemburg wrote:
> The BOM (byte order mark) was a non-standard Microsoft invention
> to detect Unicode text data as such (MS always uses UTF-16-LE for
> Unicode text files).

Well, its origins do not really matter since at this point the BOM is firmly encoded in the Unicode standard. It seems to me that it is in everyone's best interest to support it.
> It is not needed for the UTF-8 because that format doesn't rely on
> the byte order and the BOM character at the beginning of a stream is
> a legitimate ZWNBSP (zero width non breakable space) code point.

You are correct: it is a legitimate character. However, its use as a ZWNBSP character has been deprecated:

> The overloading of semantics for this code point has caused problems
> for programs and protocols. The new character U+2060 WORD JOINER has
> the same semantics in all cases as U+FEFF, except that it cannot be
> used as a signature. Implementers are strongly encouraged to use word
> joiner in those circumstances whenever word joining semantics is
> intended.

Also, the Unicode specification is ambiguous on what an implementation should do about a leading ZWNBSP that is encoded in UTF-16. Like I mentioned, if you look at the Unicode standard, version 4, section 15.9, it says:

> 2. Unmarked Character Set. In some circumstances, the character set
> information for a stream of coded characters (such as a file) is not
> available. The only information available is that the stream contains
> text, but the precise character set is not known.

This seems to indicate that it is permitted to strip the BOM from the beginning of UTF-8 text.

> -1; there's no standard for UTF-8 BOMs - adding it to the
> codecs module was probably a mistake to begin with. You usually
> only get UTF-8 files with BOM marks as the result of recoding
> UTF-16 files into UTF-8.

This is clearly incorrect. The UTF-8 BOM is specified in the Unicode standard version 4, section 15.9:

> In UTF-8, the BOM corresponds to the byte sequence <EF BB BF>.

I normally find files with UTF-8 BOMs from many Windows applications when you save a text file as UTF8. I think that Notepad or WordPad does this, for example. I think UltraEdit also does the same thing. I know that Scintilla definitely does.
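The byte sequence in question is easy to check directly, and Python later added a "utf-8-sig" codec (in 2.5, after this discussion) that strips exactly this signature on decode; a small sketch in modern Python:

```python
import codecs

# The UTF-8 signature is the three bytes EF BB BF.
raw = codecs.BOM_UTF8 + "hello".encode("utf-8")

# Plain "utf-8" keeps the BOM as a leading U+FEFF character,
# while "utf-8-sig" removes it during decoding.
kept = raw.decode("utf-8")
stripped = raw.decode("utf-8-sig")
```

The "utf-8-sig" codec also leaves BOM-less input untouched, so it behaves like the autodecode idea without the manual check.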
>> At the very least, it would be nice to add a note about this to the
>> documentation, and possibly add this example function that implements
>> the "UTF-8 or ASCII?" logic.
>
> Well, I'd say that's a very English way of dealing with encoded
> text ;-)

Please note I am saying only that something like this may want to be considered for addition to the documentation, and not to the Python standard library. This example function more closely replicates the logic that is used by those Windows applications when opening ".txt" files. It uses the default locale if there is no BOM:

def autodecode( s ):
    if s.startswith( codecs.BOM_UTF8 ):
        # The byte string s is UTF-8
        out = s.decode( "utf8" )
        return out[1:]
    else: return s.decode()

> BTW, how do you know that s came from the start of a file
> and not from slicing some already loaded file somewhere
> in the middle ?

Well, the same argument could be applied to the UTF-16 decoder: how does it know that the string came from the start of a file, and not from slicing some already loaded file? The standard states that:

> In the UTF-16 encoding scheme, U+FEFF at the very beginning of a file
> or stream explicitly signals the byte order.

So it is perfectly permissible to perform this type of processing if you consider a string to be equivalent to a stream.

>> My interpretation of the specification means that Python should
>> silently remove the character, resulting in a zero length Unicode string.
>
> Hmm, wouldn't it be better to raise an error ? After all,
> a reversed BOM mark in the stream looks a lot like you're
> trying to decode a UTF-16 stream assuming the wrong
> byte order ?!

Well, either one is possible, however the Unicode standard suggests, but does not require, silently removing them:

> It is good practice, however, to recognize it as a noncharacter and to
> take appropriate action, such as removing it from the text. Note that
> Unicode conformance freely allows the removal of these characters.
I would prefer silently ignoring them from the str.decode() function, since I believe in "be strict in what you emit, but liberal in what you accept." I think that this only applies to str.decode(). Any other attempt to create non-characters, such as unichr( 0xffff ), *should* raise an exception because clearly the programmer is making a mistake. > Other than that: +1 on fixing this case. Cool! Evan Jones From irmen at xs4all.nl Sat Apr 2 17:24:31 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sat Apr 2 17:24:34 2005 Subject: [Python-Dev] New bug, directly assigned, okay? Message-ID: <424EB92F.5080308@xs4all.nl> I just added a new bug on SF (1175396) and because I think that it is related to other bugs that were assigned to Walter Doerwald, I assigned this new bug directly to Walter too. Is that good practice or does someone else usually assign SF bugs to people? --Irmen From ncoghlan at iinet.net.au Sat Apr 2 17:57:14 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Sat Apr 2 17:57:21 2005 Subject: [Python-Dev] New bug, directly assigned, okay? In-Reply-To: <424EB92F.5080308@xs4all.nl> References: <424EB92F.5080308@xs4all.nl> Message-ID: <424EC0DA.1020307@iinet.net.au> Irmen de Jong wrote: > I just added a new bug on SF (1175396) and because I think > that it is related to other bugs that were assigned to > Walter Doerwald, I assigned this new bug directly to Walter too. > > Is that good practice or does someone else usually assign SF bugs to people? I've certainly done that a few times myself - I figure that even if I get it wrong, the recipient will either pass it on to a more appropriate person, or simply revert it back to unassigned. I usually try to put in a comment to say *why* I've assigned it the way I have, though. Picking an assignee at random should probably be discouraged, but if there is someone that makes sense, then I don't see a problem with asking them to look at it directly. Cheers, Nick. 
-- Nick Coghlan | ncoghlan@email.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From irmen at xs4all.nl Sat Apr 2 18:04:21 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Sat Apr 2 18:04:23 2005 Subject: [Python-Dev] New bug, directly assigned, okay? In-Reply-To: <424EC0DA.1020307@iinet.net.au> References: <424EB92F.5080308@xs4all.nl> <424EC0DA.1020307@iinet.net.au> Message-ID: <424EC285.8020703@xs4all.nl> Nick Coghlan wrote: > Irmen de Jong wrote: > >> I just added a new bug on SF (1175396) and because I think >> that it is related to other bugs that were assigned to >> Walter Doerwald, I assigned this new bug directly to Walter too. >> >> Is that good practice or does someone else usually assign SF bugs to >> people? > > > I've certainly done that a few times myself - I figure that even if I > get it wrong, the recipient will either pass it on to a more appropriate > person, or simply revert it back to unassigned. Ah, okay. > I usually try to put in a comment to say *why* I've assigned it the way > I have, though. Picking an assignee at random should probably be > discouraged, but if there is someone that makes sense, then I don't see > a problem with asking them to look at it directly. Yep, that's what I've done. In my bug report (about codecs.readline) I referenced the two other bugs related to it (those were assigned to Walter). Thanks, Irmen. From ottrey at py.redsoft.be Sat Apr 2 09:22:41 2005 From: ottrey at py.redsoft.be (ottrey@py.redsoft.be) Date: Sat Apr 2 21:15:12 2005 Subject: [Python-Dev] hierarchicial named groups extension to the re library Message-ID: I've written an extension to the re library, to provide a more complete matching of hierarchical named groups in regular expressions. 
I've set up a sourceforge project for it:
http://pyre2.sourceforge.net/

re2 extracts a hierarchy of named groups matches from a string, rather
than the flat, incomplete dictionary that the standard re module
returns.

(ie. the re library only returns the ~last~ match for named groups -
not a list of ~all~ the matches for the named groups. And the hierarchy
of those named groups is non-existent in the flat dictionary of matches
that results. )

eg.

>>> import re
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat1=re.compile(regex)
>>> m=pat1.match(buf)
>>> m.groupdict()
{'verse': '10 lords a-leaping', 'number': '10', 'activity': 'lords a-leaping'}

>>> import re2
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'verse': [{'number': '12', 'activity': 'drummers drumming'},
{'number': '11', 'activity': 'pipers piping'},
{'number': '10', 'activity': 'lords a-leaping'}]}

(See http://pyre2.sourceforge.net/ for more details.)

I am wondering what would be the best direction to take this project in.

Firstly is it, (or can it be made) useful enough to be included in the
python stdlib? (ie. Should I bother writing a PEP for it.)

And if so, would it be best to merge its functionality in with the re
library, or to leave it as a separate module?

And, also are there any suggestions/criticisms on the library itself?
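The "last match only" behaviour described above can be approximated with the stdlib alone by iterating the inner sub-pattern with finditer — a minimal sketch (a workaround, not what re2 does internally):

```python
import re

# Run finditer with just the inner "verse" sub-pattern instead of
# anchoring the whole line; each match carries its own groupdict,
# so no match is lost.
buf = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
verse = re.compile(r'(?P<number>\d+) (?P<activity>[^,]+)')
verses = [m.groupdict() for m in verse.finditer(buf)]
print(verses)
# -> [{'number': '12', 'activity': 'drummers drumming'}, ...]
```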
From nidoizo at yahoo.com Sat Apr 2 23:01:40 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sat Apr 2 23:00:44 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To:
References:
Message-ID:

ottrey@py.redsoft.be wrote:
>>>>import re2
>>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>>>pat2=re2.compile(regex)
>>>>x=pat2.extract(buf)
>>>>x
>
> {'verse': [{'number': '12', 'activity': 'drummers
> drumming'}, {'number': '11', 'activity': 'pipers
> piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}

Is a dictionary the right container or should another class be used?
Because in the example the content of the "verse" group is lost,
excluding its sub-groups. Something like a hierarchic MatchObject could
provide access to both information, the sub-groups and the group itself.
Also, should it be limited to named groups?

> I am wondering what would be the best direction to take this project in.
>
> Firstly is it, (or can it be made) useful enough to be included in the
> python stdlib? (ie. Should I bother writing a PEP for it.)
>
> And if so, would it be best to merge its functionality in with the re
> library, or to leave it as a separate module?
>
> And, also are there any suggestions/criticisms on the library itself?

I find the feature very interesting, but being used to live without it,
I have difficulty evaluating its usefulness. However, it reminds me how
much at first I found strange that only the last match was kept, so I
think, FWIW, that on a purist point of view the functionality would make
sense in the stdlib in some way or another.
Regards,
Nicolas

From jcarlson at uci.edu Sun Apr 3 01:01:57 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun Apr 3 01:11:45 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To:
References:
Message-ID: <20050402134150.7215.JCARLSON@uci.edu>

Nicolas Fleury wrote:
>
> ottrey@py.redsoft.be wrote:
> >>>>import re2
> >>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
> >>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
> >>>>pat2=re2.compile(regex)
> >>>>x=pat2.extract(buf)

If one wanted to match the API of the re module, one should use
pat2.findall(buf), which would return a list of 'hierarchical match
objects', though with the above, one should really return a list of
'verse' items (the way the regular expression is written).

> >>>>x
> >
> > {'verse': [{'number': '12', 'activity': 'drummers
> > drumming'}, {'number': '11', 'activity': 'pipers
> > piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>
> Is a dictionary the good container or should another class be used?
> Because in the example the content of the "verse" group is lost,
> excluding its sub-groups. Something like a hierarchic MatchObject could
> provide access to both information, the sub-groups and the group itself.

Its contents are not lost, look at the overall dictionary...  In any
case, I think one can do better than a dictionary.

>>> x=pat2.match(buf) #or x=pat2.findall(buf)[0]
>>> x
'12 drummers drumming,'
>>> dir(x)
['verse']
>>> x.verse
'12 drummers drumming,'
>>> dir(x.verse)
['number', 'activity']
>>> x.verse.number
'12'
>>> x.verse.activity
'drummers drumming'

...would get my vote (or using obj.group(i) semantics I discuss below).
I notice that this is basically what the re2 module already does (having
read the web page), though rather than...

>>> pat2.extract(buf).verse[1].activity
'pipers piping'

I would prefer...
>>> pat2.findall(buf)[1].verse.activity
'pipers piping'

For .verse[1] or .verse[2] to make sense, it implies that the pattern is
something like...

((?P<verse>... )(?P<verse>...))

... which it isn't.  I understand that the decision was probably made to
make it similar to the case of...

((?P<foo>... (?P<goo>...)+))

... where multiple matches for goo would require x.foo.goo[i].

> Also, should it be limited to named groups?

Probably not.  I would suggest using matchobj.group(i) semantics to
match the standard re module semantics, though only allow returning
items in the current level of the hierarchy.  That is, one could use
x.verse.group(1) and get back '12', but x.group(1) would return '12
pipers piping'

> > I am wondering what would be the best direction to take this project in.
> >
> > Firstly is it, (or can it be made) useful enough to be included in the
> > python stdlib? (ie. Should I bother writing a PEP for it.)
> >
> > And if so, would it be best to merge its functionality in with the re
> > library, or to leave it as a separate module?
> >
> > And, also are there any suggestions/criticisms on the library itself?
>
> I find the feature very interesting, but being used to live without it,
> I have difficulty evaluating its usefulness. However, it reminds me how
> much at first I found strange that only the last match was kept, so I
> think, FWIW, that on a purist point of view the functionality would make
> sense in the stdlib in some way or another.

re2 can be used as a limited structural parser.  This makes the re
module useful for more things than it is currently.  The question of it
being in the standard library, however, I think should be made based on
the criteria used previously (whatever they were).
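The attribute-style hierarchy sketched above could be modeled roughly like this (a hypothetical illustration — the `HMatch` class and its layout are assumptions for the sake of discussion, not re2's actual implementation):

```python
class HMatch(str):
    """Hypothetical sketch: a matched string that also exposes its
    named sub-matches as attributes, so x.verse.number works while
    x itself still behaves as the matched text."""
    def __new__(cls, text, **children):
        obj = super().__new__(cls, text)
        for name, child in children.items():
            setattr(obj, name, child)
        return obj

x = HMatch('12 drummers drumming,',
           verse=HMatch('12 drummers drumming',
                        number=HMatch('12'),
                        activity=HMatch('drummers drumming')))
print(x.verse.number)      # -> 12
print(x.verse.activity)    # -> drummers drumming
```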
- Josiah

From nidoizo at yahoo.com Sun Apr 3 02:16:44 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Sun Apr 3 02:14:49 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <20050402134150.7215.JCARLSON@uci.edu>
References: <20050402134150.7215.JCARLSON@uci.edu>
Message-ID:

Josiah Carlson wrote:
> Nicolas Fleury wrote:
>>ottrey@py.redsoft.be wrote:
>>
>>>>>>import re2
>>>>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>>>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>>>>>pat2=re2.compile(regex)
>>>>>>x=pat2.extract(buf)
>
> If one wanted to match the API of the re module, one should use
> pat2.findall(buf), which would return a list of 'hierarchical match
> objects', though with the above, one should really return a list of
> 'verse' items (the way the regular expression is written).

As far as I can understand, the two are orthogonal.  findall is used to
match the regular expression multiple times; in that case the regular
expression is still matched only once.

>>>{'verse': [{'number': '12', 'activity': 'drummers
>>>drumming'}, {'number': '11', 'activity': 'pipers
>>>piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>>
>>Is a dictionary the good container or should another class be used?
>>Because in the example the content of the "verse" group is lost,
>>excluding its sub-groups. Something like a hierarchic MatchObject could
>>provide access to both information, the sub-groups and the group itself.
>
> Its contents are not lost, look at the overall dictionary...  In any
> case, I think one can do better than a dictionary.

In that specific example, I meant that the space between "10" and "lords
a-leaping" was not stored in the dictionary, unless you talk about the
dictionary from re instead of re2.  Your proposal fixes that, by making
the entire content of the parent group (verse) accessible.
>>>>x=pat2.match(buf) #or x=pat2.findall(buf)[0]
>>>>x
>
> '12 drummers drumming,'
>
>>>>dir(x)
>
> ['verse']
>
>>>>x.verse
>
> '12 drummers drumming,'
>

It is very easy to use, but I doubt it is a good idea as a return value
for match (maybe a match object could have a function to return this
easy-to-use object).  It would mean that the names of the groups are
limited by the interface of the match object returned (what would happen
if a group is named "start", "end" or simply "group"?).  Another
solution is to use x["verse"] instead (or continue to use a "group"
method).

>> Also, should it be limited to named groups?
>
> Probably not.  I would suggest using matchobj.group(i) semantics to
> match the standard re module semantics, though only allow returning
> items in the current level of the hierarchy.  That is, one could use
> x.verse.group(1) and get back '12', but x.group(1) would return '12
> pipers piping'

Totally agree that matchobj.group interface should be matched.  Should
group return another match object?  Or maybe another function to get
match objects of groups?  Something like:

x.groupobj("verse").group("number")

or

str(x["verse"]["number"])

Regards,
Nicolas

From martin at v.loewis.de Sun Apr 3 08:48:16 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Apr 3 08:48:20 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <20050402134150.7215.JCARLSON@uci.edu>
References: <20050402134150.7215.JCARLSON@uci.edu>
Message-ID: <424F91B0.2050307@v.loewis.de>

Josiah Carlson wrote:
> re2 can be used as a limited structural parser.  This makes the re
> module useful for more things than it is currently.  The question of it
> being in the standard library, however, I think should be made based on
> the criteria used previously (whatever they were).

In general, if developers can readily agree that a functionality should
be added (i.e. it is "obvious" for some reason), it is added right away.
Otherwise, a PEP should be written, and reviewed by the community.

In the specific case, Chris Ottrey submitted a link to his project to
the SF patches tracker, asking for inclusion.  I felt that there was
likely no immediate agreement, and suggested that he ask on python-dev,
and write a PEP.

If this kind of functionality would fall on immediate rejection for
some reason, even writing the PEP might be pointless.  If the
functionality is generally considered useful, a PEP can be written, and
then implemented according to the PEP procedures (i.e. collect
feedback, discuss alternatives, ask for BDFL pronouncement).

I personally think that the proposed functionality should *not* live
in a separate module, but somehow be integrated into SRE.  Whether or
not the proposed functionality is useful in the first place, I don't
know.  I never have nested named groups in my regular expressions.

Regards,
Martin

From ottrey at py.redsoft.be Sun Apr 3 09:24:49 2005
From: ottrey at py.redsoft.be (ottrey@py.redsoft.be)
Date: Sun Apr 3 09:25:01 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
Message-ID: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>

Nicolas Fleury wrote:
>
> ottrey at py.redsoft.be wrote:
> >>>>import re2
> >>>>buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
> >>>>regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
> >>>>pat2=re2.compile(regex)
> >>>>x=pat2.extract(buf)
> >>>>x
> >
> > {'verse': [{'number': '12', 'activity': 'drummers
> > drumming'}, {'number': '11', 'activity': 'pipers
> > piping'}, {'number': '10', 'activity': 'lords a-leaping'}]}
>
> Is a dictionary the good container or should another class be used?
> Because in the example the content of the "verse" group is lost,
> excluding its sub-groups. Something like a hierarchic MatchObject could
> provide access to both information, the sub-groups and the group itself.

Yes, very good point.
Actually it ~is~ a container (that uses dict as its base class).
(I probably should add the following lines to the example.)

>>> type(x)

>>> x._value
'12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> x.verse[0]._value
'12 drummers drumming'

Josiah Carlson <jcarlson@uci.edu> wrote:
> If one wanted to match the API of the re module, one should use
> pat2.findall(buf), which would return a list of 'hierarchical match
> objects'

Well, that would be something I'd want to discuss here, as I'm not sure
if I actually ~want~ to match the API of the re module.

> Also, should it be limited to named groups?

I have given that some thought as well.

Internally un-named groups are recursively given the names _group0,
_group1 etc as they are found.  And then those groups are recursively
matched.  And in the final step the resulting _Match object is
compressed and those un-named groups are discarded.

IMO, if you don't bother to name a group then you probably aren't going
to be interested in it anyway - so why keep a reference to it?

eg. If you only wanted to extract the numbers from those verses...

>>> regex='^(((?P<number>\d+) ([^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'number': ['12', '11', '10']}

Before the compression stage the _Match object actually looked like
this:

{'_group0': {'_value': '12 drummers drumming, 11 pipers piping, 10
lords a-leaping', '_group0': [{'_value': '12 drummers drumming, ',
'_group1': ', ', '_group0': {'_value': '12 drummers drumming',
'_group1': 'drummers drumming', 'number': '12'}}, {'_value': '11
pipers piping, ', '_group1': ', ', '_group0': {'_value': '11 pipers
piping', '_group1': 'pipers piping', 'number': '11'}}, {'_value': '10
lords a-leaping', '_group0': {'_value': '10 lords a-leaping',
'_group1': 'lords a-leaping', 'number': '10'}}]}}

But the compression algorithm collected the named groups and brought
them to the surface, to return the much nicer looking:

{'number': ['12', '11', '10']}

NB. There are also a few other tricks up the sleeve of re2. eg.
It allows for named groups to be repeated in different branches of a
named group hierarchy, without the name redefinition error that the re
library will complain about. eg.

>>> pat1=re2.compile( '(?P<parents>(?P<mother>(?P<name>[\w ]+)),(?P<father>(?P<name>[\w ]+)))' )
>>> pat1.extract('Mum,Dad')
{'parents': {'father': {'name': 'Dad'}, 'mother': {'name': 'Mum'}}}

> I find the feature very interesting, but being used to live without it,
> I have difficulty evaluating its usefulness.

Yes - this is a good point too, because it ~is~ different from the re
library.  re2 aims to do all that searching, grouping, iterating and
collecting and constructing work for you.

> However, it reminds me how much at first I found strange that only the
> last match was kept, so I think, FWIW, that on a purist point of view
> the functionality would make sense in the stdlib in some way or another.

Actually that "last match only" confusion was part of the motivation
for writing it in the first place.

> For .verse[1] or .verse[2] to make sense, it implies that the pattern is
> something like...
> ((?P<verse>... )(?P<verse>...))
> ... which it isn't.

Good pickup!  You've seen through my smoke and mirrors. ;-)

That list of verses was actually created in the compression stage.
(The stage that I failed to mention in my first post.)

ie. The regex was:

((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*

Which returns an un-named list of verse groups.  Something like:

{'_group0': [
  {'verse': {'number': '12', 'activity': 'drummers drumming'}},
  {'verse': {'number': '11', 'activity': 'pipers piping'}},
  {'verse': {'number': '10', 'activity': 'lords a-leaping'}}]}

But the compression algorithm discarded that '_group0' key and brought
the 'verse' groups to the surface, then grouped them together in one
'verse' list. ie. to make:

{'verse': [{'number': '12', 'activity': 'drummers drumming'},
{'number': '11', 'activity': 'pipers piping'},
{'number': '10', 'activity': 'lords a-leaping'}]}

> > Also, should it be limited to named groups?
>
> Probably not.
> I would suggest using matchobj.group(i) semantics to
> match the standard re module semantics, though only allow returning
> items in the current level of the hierarchy.  That is, one could use
> x.verse.group(1) and get back '12', but x.group(1) would return '12
> pipers piping'

Actually, I ~would~ like to limit it to just named groups.
I reckon, if you're not going to bother naming a group, then why would
you have any interest in it?
I guess it's up for discussion how confusing this "new" way of thinking
could be and what drawbacks it might have.

Regards.

Chris.

From pierre.barbier at cirad.fr Sun Apr 3 11:13:51 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Sun Apr 3 11:12:48 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
References: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
Message-ID: <424FB3CF.7020102@cirad.fr>

ottrey@py.redsoft.be a écrit :
> Nicolas Fleury wrote:
>
> [...]
>
> Actually, I ~would~ like to limit it to just named groups.
> I reckon, if you're not going to bother naming a group, then why would
> you have any interest in it?
> I guess it's up for discussion how confusing this "new" way of thinking
> could be and what drawbacks it might have.

I would find it interesting to match every group without naming them!
For example, if the position in the father group is meaningful enough,
why bother with names?  If you just allow the user to skip the
compression stage it will do the trick!

That leads me to a question: would it be possible to use, as names for
unnamed groups, integers instead of strings?  That way, you could
access unnamed groups by their rank in their father group, for example.
A small example of what I would want:

>>> buf="123 234 345, 123 256, and 123 289"
>>> regex=r'^(( *\d+)+,)+ *(?P<logic>[^ ]+)(( *\d+)+).*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{ 0: {'_value': "123 234 345,", 0: "123", 1: " 234", 2: " 345"},
  1: {'_value': " 123 256,", 0: " 123", 1: " 256"},
  'logic': {'_value': 'and'},
  3: {'_value': " 123 289", 1: " 123", 2: " 289"} }

Pierre

> Regards.
>
> Chris.
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/pierre.barbier%40cirad.fr

--
Pierre Barbier de Reuille

INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP
Botanique et Bio-informatique de l'Architecture des Plantes
TA40/PSII, Boulevard de la Lironde
34398 MONTPELLIER CEDEX 5, France

tel : (33) 4 67 61 65 77    fax : (33) 4 67 61 56 68

From pje at telecommunity.com Sun Apr 3 16:30:21 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Sun Apr 3 16:26:59 2005
Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library
In-Reply-To: <424F91B0.2050307@v.loewis.de>
References: <20050402134150.7215.JCARLSON@uci.edu>
 <20050402134150.7215.JCARLSON@uci.edu>
Message-ID: <5.1.1.6.0.20050403102549.0350bec0@mail.telecommunity.com>

At 08:48 AM 4/3/05 +0200, Martin v. Löwis wrote:
>I personally think that the proposed functionality should *not* live
>in a separate module, but somehow be integrated into SRE.

+1.

> Whether or
>not the proposed functionality is useful in the first place, I don't
>know. I never have nested named groups in my regular expressions.

Neither have I, but only because it doesn't do what re2 does.  :)

I'd like to suggest that the addition also allow you to match a group
by a named reference, thus allowing a complete grammar to be formed.
Of course, I don't know if the underlying regular expression engine
could actually do that, but it would be nice if it could, since it
would allow simple grammars to be more easily parsed without recourse
to a more complex parsing module.

From mwh at python.net Sun Apr 3 17:14:16 2005
From: mwh at python.net (Michael Hudson)
Date: Sun Apr 3 17:14:18 2005
Subject: [Python-Dev] longobject.c & ob_size
Message-ID: <2mk6njdh9z.fsf@starship.python.net>

Asking mostly out of curiosity, how hard would it be to have longs
store their sign bit somewhere less aggravating?  It seems to me that
the top bit of ob_digit[0] is always 0, for example, and while I'm sure
this would result in no less convolution in longobject.c, it'd be
considerably more localized convolution.

Cheers,
mwh

--
CDATA is not an integration strategy.
-- from Twisted.Quotes

From martin at v.loewis.de Sun Apr 3 18:03:29 2005
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun Apr 3 18:03:32 2005
Subject: [Python-Dev] longobject.c & ob_size
In-Reply-To: <2mk6njdh9z.fsf@starship.python.net>
References: <2mk6njdh9z.fsf@starship.python.net>
Message-ID: <425013D1.6090302@v.loewis.de>

Michael Hudson wrote:
> Asking mostly out of curiosity, how hard would it be to have longs
> store their sign bit somewhere less aggravating?  It seems to me that
> the top bit of ob_digit[0] is always 0, for example, and while I'm sure
> this would result in no less convolution in longobject.c, it'd be
> considerably more localized convolution.

I think the amount of special-casing that you need would remain the
same - i.e. you would have to mask out the sign before performing the
algorithms, then bring it back in.  Masking out the bit from digit[0]
might slow down the algorithms somewhat, because you would probably
mask it out from every digit, not only digit[0] (or else test for
digit[0], which test would then be performed for all digits).  You
would also have to keep the special case for 0L, which has ob_size==0
(i.e.
doesn't have digit[0]). That said, I think the change could be implemented within a few hours, taking a day to make the testsuite run again; depending on the review process, you might need two releases to fix the bugs (but then, it is also reasonable to expect to get it right the first time). Regards, Martin From gustavo at niemeyer.net Mon Apr 4 02:16:19 2005 From: gustavo at niemeyer.net (Gustavo Niemeyer) Date: Mon Apr 4 02:16:43 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library In-Reply-To: <424F91B0.2050307@v.loewis.de> References: <20050402134150.7215.JCARLSON@uci.edu> <424F91B0.2050307@v.loewis.de> Message-ID: <20050404001619.GA11017@burma.localdomain> Greetings, > If this kind of functionality would fall on immediate rejection for > some reason, even writing the PEP might be pointless. If the [...] In my opinion the functionality is useful. > I personally think that the proposed functionality should *not* live > in a separate module, but somehow be integrated into SRE. Whether or [...] Agreed. I propose to integrate this functionality into the SRE syntax, so that this special kind of group may be used when explicitly wanted. This would avoid backward compatibility problems, would give each regular expression a single meaning, and would allow interleaving hierarchical/non-hierarchical groups. I offer myself to integrate the change once we decide on the right way to implement it, and achieve consensus on its adoption. 
Best regards,

--
Gustavo Niemeyer
http://niemeyer.net

From gustavo at niemeyer.net Mon Apr 4 03:17:17 2005
From: gustavo at niemeyer.net (Gustavo Niemeyer)
Date: Mon Apr 4 03:17:43 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
References: <7jV2C9bu.1112513089.4609730.ottrey@py.redsoft.be>
Message-ID: <20050404011717.GA11463@burma.localdomain>

Greetings Chris,

> Well, that would be something I'd want to discuss here. As I'm not
> sure if I actually ~want~ to match the API of the re module.

If this feature is considered a good addition for the standard
library, integrating it into re would be an interesting option.
But given what you say above, I'm not sure if *you* want to
make it a part of re itself.

[...]
> IMO, if you don't bother to name a group then you probably aren't going
> to be interested in it anyway - so why keep a reference to it?

That's not true.  There's a lot of code out there using unnamed groups
genuinely.  The syntax (?: ) is used when the group content is not
considered useful.
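A minimal stdlib illustration of that point, with the non-capturing form written out as (?:...):

```python
import re

# (?:...) groups the digits for the pattern's structure but does not
# capture them; only the parenthesized \w+ group is kept.
m = re.match(r'(?:\d+) (\w+)', '12 drummers')
print(m.groups())   # -> ('drummers',)  -- the (?:...) part captured nothing
```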
> > I find the feature very interesting, but being used to live without it, > > I have difficulty evaluating its usefulness. > > Yes - this is a good point too, because it ~is~ different from the re > library. re2 aims to do all that searching, grouping, iterating and > collecting and constructing work for you. [...] > Actually, I ~would~ like to limit it to just named groups. > I reckon, if you're not going to bother naming a group, then why would > you have any interest in it. > I guess its up for discussion how confusing this "new" way of thinking > could be and what drawbacks it might have. Your target seems to be a new kind of regular expressions indeed. In that case, I'm not sure if "re2" is the right name for it, given that you haven't written an improved SRE, but a completely new kind of regular expression matching which depends on SRE itself rather than extending it on a compatible way. While I would like to see *some* kind of successive matching implemented in SRE (besides the Scanner which is already available), I'm not in favor of that specific implementation. I'm open to discuss that further. -- Gustavo Niemeyer http://niemeyer.net From arigo at tunes.org Mon Apr 4 08:10:43 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Apr 4 08:17:23 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2mk6njdh9z.fsf@starship.python.net> References: <2mk6njdh9z.fsf@starship.python.net> Message-ID: <20050404061043.GA2960@vicky.ecs.soton.ac.uk> Hi Michael, On Sun, Apr 03, 2005 at 04:14:16PM +0100, Michael Hudson wrote: > Asking mostly for curiousity, how hard would it be to have longs store > their sign bit somewhere less aggravating? As I guess your goal is to get rid of all the "if (size < 0) size = -size" in object.c and friends, I should point out that longobject.c has set out an example that might have been followed by C extension writers. 
Maybe it is too late to say now that ob_size cannot be negative
any more :-(

Armin

From ottrey at py.redsoft.be Mon Apr 4 08:27:46 2005
From: ottrey at py.redsoft.be (ottrey@py.redsoft.be)
Date: Mon Apr 4 08:27:57 2005
Subject: [Python-Dev] hierarchicial named groups extension to the re library
In-Reply-To: <20050404011717.GA11463@burma.localdomain>
Message-ID:

Hi Gustavo!

On 4/4/2005, "Gustavo Niemeyer" <gustavo@niemeyer.net> wrote:
>> Well, that would be something I'd want to discuss here. As I'm not
>> sure if I actually ~want~ to match the API of the re module.
>
>If this feature is considered a good addition for the standard
>library, integrating it into re would be an interesting option.
>But given what you say above, I'm not sure if *you* want to
>make it a part of re itself.
>

After taking in the great comments made in this discussion, I'm now
thinking that it ~would~ be best to try and integrate the new
functionality with the existing re library (matching the current API),
as there is (at least some) re2 functionality that I think could fit
neatly into the existing re API.

As, like you say:
>Martin v. Löwis wrote:
>In general, if developers can readily agree that a functionality should
>be added (i.e. it is "obvious" for some reason), it is added right away.
>Otherwise, a PEP should be written, and reviewed by the community

I'd like to call the current functionality a "work in progress".
ie. I'd like to work on it more, taking on board the comments made here.

I'd also like to take this discussion off the python-dev list now and
shift it to pyre2.  (possibly to come back with a more polished
proposal.)

We've set up a development wiki here:
http://py.redsoft.be/pyre2/wiki/
(feel free to add any more suggestions.)

And there is also a mailing list, if anyone is interested and would
like to subscribe:
http://lists.sourceforge.net/lists/listinfo/pyre2-devel

Regards.

Chris.

From olsongt at verizon.net Mon Apr 4 23:51:34 2005
From: olsongt at verizon.net (Grant Olson)
Date: Mon Apr 4 23:54:18 2005
Subject: [Python-Dev] Mail.python.org
Message-ID: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net>

Not a big deal, but I noticed that https://mail.python.org/ is live
and shows a generic "Welcome to your new home in cyberspace!" message.
One of the webmasters may want to automatically redirect to
http://mail.python.org.

-Grant

From stephen at xemacs.org Tue Apr 5 08:25:09 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue Apr 5 08:25:20 2005
Subject: [Python-Dev] Unicode byte order mark decoding
In-Reply-To: <424DACDC.4080601@egenix.com> (M.'s message of "Fri, 01 Apr
 2005 22:19:40 +0200")
References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca>
 <424DACDC.4080601@egenix.com>
Message-ID: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "MAL" == M  writes:

    MAL> The BOM (byte order mark) was a non-standard Microsoft
    MAL> invention to detect Unicode text data as such (MS always uses
    MAL> UTF-16-LE for Unicode text files).

The Japanese "memopado" (Notepad) uses UTF-8 signatures; it even adds
them to existing UTF-8 files lacking them.
MAL> -1; there's no standard for UTF-8 BOMs - adding it to the MAL> codecs module was probably a mistake to begin with. You MAL> usually only get UTF-8 files with BOM marks as the result of MAL> recoding UTF-16 files into UTF-8. There is a standard for UTF-8 _signatures_, however. I don't have the most recent version of the ISO-10646 standard, but Amendment 2 (which defined UTF-8 for ISO-10646) specifically added the UTF-8 signature to Annex F of that standard. Evan quotes Version 4 of the Unicode standard, which explicitly defines the UTF-8 signature. So there is a standard for the UTF-8 signature, and I know of applications which produce it. While I agree with you that Python's codecs shouldn't produce it (by default), providing an option to strip is a good idea. However, this option should be part of the initialization of an IO stream which produces Unicodes, _not_ an operation on arbitrary internal strings (whether raw or Unicode). MAL> BTW, how do you know that s came from the start of a file and MAL> not from slicing some already loaded file somewhere in the MAL> middle ? The programmer or the application might, but Python's codecs don't. The point is that this is also true of rawstrings that happen to contain UTF-16 or UTF-32 data. The UTF-16 ("auto-endian") codec shouldn't strip leading BOMs either, unless it has been told it has the beginning of the string. MAL> Evan Jones wrote: >> This is *not* a valid Unicode character. The Unicode >> specification (version 4, section 15.8) says the following >> about non-characters: >> >>> Applications are free to use any of these noncharacter code >>> points internally but should never attempt to exchange >>> them. If a noncharacter is received in open interchange, an >>> application is not required to interpret it in any way. It is >>> good practice, however, to recognize it as a noncharacter and >>> to take appropriate action, such as removing it from the >>> text. 
Note that Unicode conformance freely allows the removal >>> of these characters. (See C10 in Section3.2, Conformance >>> Requirements.) >> >> My interpretation of the specification means that Python should The specification _permits_ silent removal; it does not recommend. >> silently remove the character, resulting in a zero length >> Unicode string. Similarly, both of the following lines should >> also result in a zero length Unicode string: >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > u'\ufffe' >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > u'\uffff' I strongly disagree; these decisions should be left to a higher layer. In the case of specified UTFs, the codecs should simply invert the UTF to Python's internal encoding. MAL> Hmm, wouldn't it be better to raise an error ? After all, a MAL> reversed BOM mark in the stream looks a lot like you're MAL> trying to decode a UTF-16 stream assuming the wrong byte MAL> order ?! +1 on (optionally) raising an error. -1 on removing it or anything like that, unless under control of the application (ie, the program written in Python, not Python itself). It's far too easy for software to generate broken Unicode streams[1], and the choice of how to deal with those should be with the application, not with the implementation language. Footnotes: [1] An egregious example was the Outlook Express distributed with early Win2k betas, which produced MIME bodies with apparent Content-Type: text/html; charset=utf-16, but the HTML tags and newlines were 7-bit ASCII! -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
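The decoding behaviour under debate here can be checked concretely. This is an illustration using a modern Python, where the "utf-8-sig" codec that grew out of this thread (added in Python 2.5) sits alongside plain "utf-8":

```python
# Plain "utf-8" preserves a leading signature as the character U+FEFF;
# "utf-8-sig" strips it on decoding and prepends it on encoding.
data = b"\xef\xbb\xbfab"  # UTF-8 signature (EF BB BF) followed by "ab"

assert data.decode("utf-8") == "\ufeffab"   # BOM survives as a character
assert data.decode("utf-8-sig") == "ab"     # signature stripped
assert "ab".encode("utf-8-sig") == data     # signature written once

# The auto-endian "utf-16" codec likewise consumes a leading BOM:
assert b"\xff\xfe\x61\x00".decode("utf-16") == "a"
```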
From martin at v.loewis.de Tue Apr 5 10:03:15 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 10:03:19 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42524643.3070604@v.loewis.de> Stephen J. Turnbull wrote: > So there is a standard for the UTF-8 signature, and I know of > applications which produce it. While I agree with you that Python's > codecs shouldn't produce it (by default), providing an option to strip > is a good idea. I would personally like to see an "utf-8-bom" codec (perhaps better named "utf-8-sig", which strips the BOM on reading (if present) and generates it on writing. > However, this option should be part of the initialization of an IO > stream which produces Unicodes, _not_ an operation on arbitrary > internal strings (whether raw or Unicode). With the UTF-8-SIG codec, it would apply to all operation modes of the codec, whether stream-based or from strings. Whether or not to use the codec would be the application's choice. Regards, Martin From mal at egenix.com Tue Apr 5 12:19:49 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 5 12:19:52 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42524643.3070604@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> Message-ID: <42526645.3010600@egenix.com> Martin v. L?wis wrote: > Stephen J. Turnbull wrote: > >> So there is a standard for the UTF-8 signature, and I know of >> applications which produce it. While I agree with you that Python's >> codecs shouldn't produce it (by default), providing an option to strip >> is a good idea. 
> > I would personally like to see an "utf-8-bom" codec (perhaps better > named "utf-8-sig", which strips the BOM on reading (if present) > and generates it on writing. +1. >> However, this option should be part of the initialization of an IO >> stream which produces Unicodes, _not_ an operation on arbitrary >> internal strings (whether raw or Unicode). > > > With the UTF-8-SIG codec, it would apply to all operation modes of > the codec, whether stream-based or from strings. Whether or not to > use the codec would be the application's choice. I'd suggest to use the same mode of operation as we have in the UTF-16 codec: it removes the BOM mark on the first call to the StreamReader .decode() method and writes a BOM mark on the first call to .encode() on a StreamWriter. Note that the UTF-16 codec is strict w/r to the presence of the BOM mark: you get a UnicodeError if a stream does not start with a BOM mark. For the UTF-8-SIG codec, this should probably be relaxed to not require the BOM. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From walter at livinglogic.de Tue Apr 5 12:31:06 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 12:31:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42526645.3010600@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> Message-ID: <425268EA.7070703@livinglogic.de> M.-A. Lemburg wrote: >> [...] 
>>With the UTF-8-SIG codec, it would apply to all operation modes of >>the codec, whether stream-based or from strings. Whether or not to >>use the codec would be the application's choice. > > I'd suggest to use the same mode of operation as we have in > the UTF-16 codec: it removes the BOM mark on the first call > to the StreamReader .decode() method and writes a BOM mark > on the first call to .encode() on a StreamWriter. > > Note that the UTF-16 codec is strict w/r to the presence > of the BOM mark: you get a UnicodeError if a stream does > not start with a BOM mark. For the UTF-8-SIG codec, this > should probably be relaxed to not require the BOM. I've started writing such a codec. Making the BOM optional on decoding definitely simplifies the implementation. Bye, Walter D?rwald From mal at egenix.com Tue Apr 5 12:34:53 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 5 12:34:58 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <425269CD.3090009@egenix.com> Stephen J. Turnbull wrote: >>>>>>"MAL" == M writes: > > > MAL> The BOM (byte order mark) was a non-standard Microsoft > MAL> invention to detect Unicode text data as such (MS always uses > MAL> UTF-16-LE for Unicode text files). > > The Japanese "memopado" (Notepad) uses UTF-8 signatures; it even adds > them to existing UTF-8 files lacking them. Is that a MS application ? AFAIK, notepad, wordpad and MS Office always use UTF-16-LE + BOM when saving text as "Unicode text". > MAL> -1; there's no standard for UTF-8 BOMs - adding it to the > MAL> codecs module was probably a mistake to begin with. You > MAL> usually only get UTF-8 files with BOM marks as the result of > MAL> recoding UTF-16 files into UTF-8. > > There is a standard for UTF-8 _signatures_, however. 
I don't have the > most recent version of the ISO-10646 standard, but Amendment 2 (which > defined UTF-8 for ISO-10646) specifically added the UTF-8 signature to > Annex F of that standard. Evan quotes Version 4 of the Unicode > standard, which explicitly defines the UTF-8 signature. Ok, as signature the BOM does make some sense - whether to strip signatures from a document is a good idea or not is a different matter, though. Here's the Unicode Cons. FAQ on the subject: http://www.unicode.org/faq/utf_bom.html#22 They also explicitly warn about adding BOMs to UTF-8 data since it can break applications and protocols that do not expect such a signature. > So there is a standard for the UTF-8 signature, and I know of > applications which produce it. While I agree with you that Python's > codecs shouldn't produce it (by default), providing an option to strip > is a good idea. > > However, this option should be part of the initialization of an IO > stream which produces Unicodes, _not_ an operation on arbitrary > internal strings (whether raw or Unicode). Right. > MAL> BTW, how do you know that s came from the start of a file and > MAL> not from slicing some already loaded file somewhere in the > MAL> middle ? > > The programmer or the application might, but Python's codecs don't. > The point is that this is also true of rawstrings that happen to > contain UTF-16 or UTF-32 data. The UTF-16 ("auto-endian") codec > shouldn't strip leading BOMs either, unless it has been told it has > the beginning of the string. The UTF-16 stream codecs implement this logic. The UTF-16 encode and decode functions will however always strip the BOM mark from the beginning of a string. If the application doesn't want this stripping to happen, it should use the UTF-16-LE or -BE codec resp. > MAL> Evan Jones wrote: > > >> This is *not* a valid Unicode character. 
The Unicode > >> specification (version 4, section 15.8) says the following > >> about non-characters: > >> > >>> Applications are free to use any of these noncharacter code > >>> points internally but should never attempt to exchange > >>> them. If a noncharacter is received in open interchange, an > >>> application is not required to interpret it in any way. It is > >>> good practice, however, to recognize it as a noncharacter and > >>> to take appropriate action, such as removing it from the > >>> text. Note that Unicode conformance freely allows the removal > >>> of these characters. (See C10 in Section3.2, Conformance > >>> Requirements.) > >> > >> My interpretation of the specification means that Python should > > The specification _permits_ silent removal; it does not recommend. > > >> silently remove the character, resulting in a zero length > >> Unicode string. Similarly, both of the following lines should > >> also result in a zero length Unicode string: > > >>>> '\xff\xfe\xfe\xff'.decode( "utf16" ) > > u'\ufffe' > >>>> '\xff\xfe\xff\xff'.decode( "utf16" ) > > u'\uffff' > > I strongly disagree; these decisions should be left to a higher layer. > In the case of specified UTFs, the codecs should simply invert the UTF > to Python's internal encoding. > > MAL> Hmm, wouldn't it be better to raise an error ? After all, a > MAL> reversed BOM mark in the stream looks a lot like you're > MAL> trying to decode a UTF-16 stream assuming the wrong byte > MAL> order ?! > > +1 on (optionally) raising an error. The advantage of raising an error is that the application can deal with the situation in whatever way seems fit (by registering a special error handler or by simply using "ignore" or "replace"). I agree that much of this lies outside the scope of codecs and should be handled at an application or protocol level. > -1 on removing it or anything > like that, unless under control of the application (ie, the program > written in Python, not Python itself). 
It's far too easy for software > to generate broken Unicode streams[1], and the choice of how to deal > with those should be with the application, not with the implementation > language. > > Footnotes: > [1] An egregious example was the Outlook Express distributed with > early Win2k betas, which produced MIME bodies with apparent > Content-Type: text/html; charset=utf-16, but the HTML tags and > newlines were 7-bit ASCII! > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 05 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From stephen at xemacs.org Tue Apr 5 14:03:19 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 5 14:03:25 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42524643.3070604@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 05 Apr 2005 10:03:15 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> Message-ID: <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v L?wis writes: Martin> Stephen J. Turnbull wrote: >> However, this option should be part of the initialization of an >> IO stream which produces Unicodes, _not_ an operation on >> arbitrary internal strings (whether raw or Unicode). Martin> With the UTF-8-SIG codec, it would apply to all operation Martin> modes of the codec, whether stream-based or from strings. I had in mind the ability to treat a string as a stream. Martin> Whether or not to use the codec would be the application's Martin> choice. 
What I think should be provided is a stateful object encapsulating the codec. Ie, to avoid the need to write out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From stephen at xemacs.org Tue Apr 5 15:04:34 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 5 15:04:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <425269CD.3090009@egenix.com> (M.'s message of "Tue, 05 Apr 2005 12:34:53 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <425269CD.3090009@egenix.com> Message-ID: <87zmwdcr31.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>>>"MAL" == M writes: MAL> Stephen J. Turnbull wrote: >> The Japanese "memopado" (Notepad) uses UTF-8 signatures; it >> even adds them to existing UTF-8 files lacking them. MAL> Is that a MS application ? AFAIK, notepad, wordpad and MS MAL> Office always use UTF-16-LE + BOM when saving text as "Unicode MAL> text". Yes, it is an MS application. I'll have to borrow somebody's box to check, but IIRC UTF-8 is the native "text" encoding for Japanese now. (Japanized applications generally behave differently from everything else, as there are so many "standards" for encoding Japanese.) M> The UTF-16 stream codecs implement this logic. M> The UTF-16 encode and decode functions will however always M> strip the BOM mark from the beginning of a string. M> If the application doesn't want this stripping to happen, it M> should use the UTF-16-LE or -BE codec resp. That sounds like it would work fine almost all the time. If it doesn't it's straightforward to work around, and certainly would be more convenient for the non-standards-geek programmer. 
-- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From skip at pobox.com Tue Apr 5 15:57:16 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Apr 5 15:57:19 2005 Subject: [Python-Dev] Mail.python.org In-Reply-To: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net> References: <0IEF00B0UZIC91I3@vms048.mailsrvcs.net> Message-ID: <16978.39228.310785.460397@montanaro.dyndns.org> Grant> Not a big deal, but I noticed that https://mail.python.org/ is Grant> live and shows a generic "Welcome to your new home in Grant> cyberspace!" message. One of the webmasters may want to Grant> automatically redirect to http://mail.python.org. Thanks, I forwarded this along to the folks who can deal with this. Skip From martin at v.loewis.de Tue Apr 5 20:44:47 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 20:44:49 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4252DC9F.50000@v.loewis.de> Stephen J. Turnbull wrote: > Martin> With the UTF-8-SIG codec, it would apply to all operation > Martin> modes of the codec, whether stream-based or from strings. > > I had in mind the ability to treat a string as a stream. Hmm. A string is not a stream, but it could be the contents of a stream. A typical application of codecs goes like this: data = stream.read() [analyze data, e.g. by checking whether there is encoding= in an <?xml ...?> declaration] So people do use the "decode-it-all" mode, where no sequential access is necessary - yet the beginning of the string is still the beginning of what once was a stream. This case must be supported. > Martin> Whether or not to use the codec would be the application's > Martin> choice.
> > What I think should be provided is a stateful object encapsulating the > codec. Ie, to avoid the need to write > > out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") No. People who want streaming should use cStringIO, i.e. >>> s=cStringIO.StringIO() >>> s1=codecs.getwriter("utf-8")(s) >>> s1.write(u"Hallo") >>> s.getvalue() 'Hallo' Regards, Martin From walter at livinglogic.de Tue Apr 5 21:33:00 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 21:33:04 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <425268EA.7070703@livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> Message-ID: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Walter D?rwald sagte: > M.-A. Lemburg wrote: > >>> [...] >>>With the UTF-8-SIG codec, it would apply to all operation >>> modes of the codec, whether stream-based or from strings. Whether >>>or not to use the codec would be the application's choice. >> >> I'd suggest to use the same mode of operation as we have in >> the UTF-16 codec: it removes the BOM mark on the first call >> to the StreamReader .decode() method and writes a BOM mark >> on the first call to .encode() on a StreamWriter. >> >> Note that the UTF-16 codec is strict w/r to the presence >> of the BOM mark: you get a UnicodeError if a stream does >> not start with a BOM mark. For the UTF-8-SIG codec, this >> should probably be relaxed to not require the BOM. > > I've started writing such a codec. Making the BOM optional > on decoding definitely simplifies the implementation. 
OK, here is the patch: http://www.python.org/sf/1177307 The stateful decoder has a little problem: At least three bytes have to be available from the stream until the StreamReader decides whether these bytes are a BOM that has to be skipped. This means that if the file only contains "ab", the user will never see these two characters. A solution for this would be to add an argument named final to the decode and read methods that tells the decoder that the stream has ended and the remaining buffered bytes have to be handled now. Bye, Walter Dörwald From ejones at uwaterloo.ca Tue Apr 5 21:53:05 2005 From: ejones at uwaterloo.ca (Evan Jones) Date: Tue Apr 5 21:52:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: On Apr 5, 2005, at 15:33, Walter Dörwald wrote: > The stateful decoder has a little problem: At least three bytes > have to be available from the stream until the StreamReader > decides whether these bytes are a BOM that has to be skipped. > This means that if the file only contains "ab", the user will > never see these two characters. Shouldn't the decoder be capable of doing a partial match and quitting early? After all, "ab" is encoded in UTF8 as <61> <62> but the BOM is <ef> <bb> <bf>. If it did this type of partial matching, this issue would be avoided except in rare situations. > A solution for this would be to add an argument named final to > the decode and read methods that tells the decoder that the > stream has ended and the remaining buffered bytes have to be > handled now. This functionality is provided by a flush() method on similar objects, such as the zlib compression objects.
Evan Jones From fdrake at acm.org Tue Apr 5 21:56:46 2005 From: fdrake at acm.org (Fred Drake) Date: Tue Apr 5 21:57:35 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <200504051556.46947.fdrake@acm.org> On Tuesday 05 April 2005 15:53, Evan Jones wrote: > This functionality is provided by a flush() method on similar objects, > such as the zlib compression objects. Or by close() on other objects (htmllib, HTMLParser, the SAX incremental parser, etc.). Too bad there's more than one way to do it. :-( -Fred -- Fred L. Drake, Jr. From martin at v.loewis.de Tue Apr 5 22:05:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 22:05:16 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <4252EF7A.8080804@v.loewis.de> Walter D?rwald wrote: > The stateful decoder has a little problem: At least three bytes > have to be available from the stream until the StreamReader > decides whether these bytes are a BOM that has to be skipped. > This means that if the file only contains "ab", the user will > never see these two characters. This can be improved, of course: If the first byte is "a", it most definitely is *not* an UTF-8 signature. So we only need a second byte for the characters between U+F000 and U+FFFF, and a third byte only for the characters U+FEC0...U+FEFF. 
But with the first byte being \xef, we need three bytes *anyway*, so we can always decide with the first byte only whether we need to wait for three bytes. > A solution for this would be to add an argument named final to > the decode and read methods that tells the decoder that the > stream has ended and the remaining buffered bytes have to be > handled now. Shouldn't an empty read from the underlying stream be taken as an EOF? Regards, Martin From tim.peters at gmail.com Tue Apr 5 22:11:14 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 5 22:11:17 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2mk6njdh9z.fsf@starship.python.net> References: <2mk6njdh9z.fsf@starship.python.net> Message-ID: <1f7befae05040513113c825c92@mail.gmail.com> [Michael Hudson] > Asking mostly for curiousity, how hard would it be to have longs store > their sign bit somewhere less aggravating? Depends on where that is. > It seems to me that the top bit of ob_digit[0] is always 0, for example, Yes, the top bit of ob_digit[i], for all relevant i, is 0 on all platforms now. > and I'm sure this would result no less convolution in longobject.c it'd be > considerably more localized convolution. I'd much rather give struct _longobject a distinct sign member (say, 0 == zero, -1 = non-zero negative, 1 == non-zero positive). That would simplify code. It would cost no extra bytes for some longs, and 8 extra bytes for others (since obmalloc rounds up to a multiple of 8); I don't care about that (e.g., I never use millions of longs simultaneously, but often use a few dozen very big longs simultaneously; the memory difference is in the noise then). Note that longintrepr.h isn't included by Python.h. Only longobject.h is, and longobject.h doesn't reveal the internal structure of longs. IOW, changing the internal layout of longs shouldn't even hurt binary compatibility. The ob_size member of PyObject_VAR_HEAD would also be redeclared as size_t in an ideal world. 
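The layout Tim argues for — digits that never use their top bit, plus a separate sign field — can be mimicked in pure Python. A toy sketch only, not CPython's actual code; the 15-bit digit size follows the longintrepr.h of that era, and the (sign, digits) names are invented here for illustration:

```python
# Sign-magnitude representation: digits are unsigned, least-significant
# first, and the sign lives in its own field (0 zero, -1 neg, 1 pos)
# instead of being folded into ob_size.
SHIFT = 15            # CPython longs of this era used 15-bit digits
BASE = 1 << SHIFT

def to_long(n):
    """Return (sign, digits) with least-significant digit first."""
    sign = (n > 0) - (n < 0)
    digits = []
    n = abs(n)
    while n:
        digits.append(n % BASE)   # top bit of each digit stays 0
        n //= BASE
    return sign, digits

def from_long(sign, digits):
    magnitude = sum(d << (SHIFT * i) for i, d in enumerate(digits))
    return sign * magnitude

assert from_long(*to_long(-123456789)) == -123456789
assert to_long(0) == (0, [])   # zero needs no digits at all
```

With the sign kept separately, negation and comparison touch only one field, which is the simplification Tim describes.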
From walter at livinglogic.de Tue Apr 5 22:37:24 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 22:37:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252EF7A.8080804@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> Message-ID: <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> Martin v. L?wis sagte: > Walter D?rwald wrote: >> The stateful decoder has a little problem: At least three bytes >> have to be available from the stream until the StreamReader >> decides whether these bytes are a BOM that has to be skipped. >> This means that if the file only contains "ab", the user will >> never see these two characters. > > This can be improved, of course: If the first byte is "a", > it most definitely is *not* an UTF-8 signature. > > So we only need a second byte for the characters between U+F000 > and U+FFFF, and a third byte only for the characters > U+FEC0...U+FEFF. But with the first byte being \xef, we need > three bytes *anyway*, so we can always decide with the first > byte only whether we need to wait for three bytes. OK, I've updated the patch so that the first bytes will only be kept in the buffer if they are a prefix of the BOM. >> A solution for this would be to add an argument named final to >> the decode and read methods that tells the decoder that the >> stream has ended and the remaining buffered bytes have to be >> handled now. > > Shouldn't an empty read from the underlying stream be taken > as an EOF? There are situations where the byte stream might be temporarily exhausted, e.g. 
an XML parser that tries to support the IncrementalParser interface, or when you want to decode encoded data piecewise, because you want to give a progress report. Bye, Walter D?rwald From walter at livinglogic.de Tue Apr 5 22:43:03 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Tue Apr 5 22:43:06 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> Message-ID: <1811.84.56.104.122.1112733783.squirrel@isar.livinglogic.de> Evan Jones sagte: > On Apr 5, 2005, at 15:33, Walter D?rwald wrote: >> The stateful decoder has a little problem: At least three bytes >> have to be available from the stream until the StreamReader >> decides whether these bytes are a BOM that has to be skipped. >> This means that if the file only contains "ab", the user will >> never see these two characters. > > Shouldn't the decoder be capable of doing a partial match and quitting early? After all, "ab" is encoded in UTF8 as <61> > <62> but the BOM is . If it did this type of partial matching, this issue would be avoided except in rare > situations. > >> A solution for this would be to add an argument named final to >> the decode and read methods that tells the decoder that the >> stream has ended and the remaining buffered bytes have to be >> handled now. > > This functionality is provided by a flush() method on similar objects, such as the zlib compression objects. Theoretically the name is unimportant, but read(..., final=True) or flush() or close() should subject the pending bytes to normal error handling and must return the result of decoding these pending bytes just like the other methods do. 
This would mean that we would have to implement a decodeclose(), a readclose() and a readlineclose(). IMHO it would be best to add this argument to decode, read and readline directly. But I'm not sure what this would mean for iterating through a StreamReader. Bye, Walter Dörwald From martin at v.loewis.de Tue Apr 5 22:52:14 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 5 22:52:18 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> Message-ID: <4252FA7E.3090206@v.loewis.de> Walter Dörwald wrote: > There are situations where the byte stream might be temporarily > exhausted, e.g. an XML parser that tries to support the > IncrementalParser interface, or when you want to decode > encoded data piecewise, because you want to give a progress > report. Yes, but these are not file-like objects. In the IncrementalParser, it is *not* the case that a read operation returns an empty string. Instead, the application repeatedly feeds data explicitly. For a file-like object, returning "" indicates EOF. Regards, Martin From raymond.hettinger at verizon.net Tue Apr 5 12:47:07 2005 From: raymond.hettinger at verizon.net (Raymond Hettinger) Date: Wed Apr 6 00:47:20 2005 Subject: [Python-Dev] Developer list update Message-ID: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> FYI, I'm starting a project to see what has become of some of the inactive developers. Essentially, it involves sending them a note to see if they still have use for their checkin permissions.
If not, then we can make the change and improve security a bit. Also, to help with institutional memory, I started a log of changes to developer permissions. The goal is to remember who was given access, by whom, and why (some folks are given access for a one-shot project for example). The file is at Misc/developers. The first entry is for Nick Coghlan who was just granted tracker permissions so he can help manage outstanding bugs and patches. Raymond Hettinger From fdrake at acm.org Wed Apr 6 01:06:34 2005 From: fdrake at acm.org (Fred Drake) Date: Wed Apr 6 01:07:13 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> Message-ID: <200504051906.34590.fdrake@acm.org> On Tuesday 05 April 2005 06:47, Raymond Hettinger wrote: > Also, to help with institutional memory, I started a log of changes to > developer permissions. The goal is to remember who was given access, by > whom, and why (some folks are given access for a one-shot project for > example). The file is at Misc/developers. Thanks, Raymond! Would anyone here object to renaming the file to developers.txt, though? -Fred -- Fred L. Drake, Jr. From barry at python.org Wed Apr 6 01:20:36 2005 From: barry at python.org (Barry Warsaw) Date: Wed Apr 6 01:20:41 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504051906.34590.fdrake@acm.org> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> <200504051906.34590.fdrake@acm.org> Message-ID: <1112743236.18820.178.camel@geddy.wooz.org> On Tue, 2005-04-05 at 19:06, Fred Drake wrote: > Would anyone here object to renaming the file to developers.txt, though? +1, please! -Barry 
From stephen at xemacs.org Wed Apr 6 02:32:01 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed Apr 6 02:32:07 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252DC9F.50000@v.loewis.de> (Martin v. =?iso-8859-1?q?L=F6wis's?= message of "Tue, 05 Apr 2005 20:44:47 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> Message-ID: <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> So people do use the "decode-it-all" mode, where no Martin> sequential access is necessary - yet the beginning of the Martin> string is still the beginning of what once was a Martin> stream. This case must be supported. Of course it must be supported. My point is that many strings (in my applications, all but those strings that result from slurping in a file or process output in one go -- example, not a statistically valid sample!) are not the beginning of "what once was a stream". It is error-prone (not to mention unaesthetic) to not make that distinction. "Explicit is better than implicit." Martin> Whether or not to use the codec would be the application's Martin> choice. >> What I think should be provided is a stateful object >> encapsulating the codec. Ie, to avoid the need to write >> out = chunk[0].encode("utf-8-sig") + chunk[1].encode("utf-8") Martin> No. People who want streaming should use cStringIO, i.e. >>> s=cStringIO.StringIO() >>> s1=codecs.getwriter("utf-8")(s) >>> s1.write(u"Hallo") >>> s.getvalue() 'Hallo' Yes! 
Exactly (except in reverse, we want to _read_ from the slurped stream-as-string, not write to one)! ... and there's no need for a utf-8-sig codec for strings, since you can support the usage in exactly this way. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From tim.peters at gmail.com Wed Apr 6 03:00:43 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Apr 6 03:00:47 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1112743236.18820.178.camel@geddy.wooz.org> References: <000101c539cc$d0419e00$e7bd2c81@oemcomputer> <200504051906.34590.fdrake@acm.org> <1112743236.18820.178.camel@geddy.wooz.org> Message-ID: <1f7befae0504051800452bcca7@mail.gmail.com> [Fred Drake] >> Would anyone here object to renaming the file to developers.txt, though? [Barry Warsaw] > +1, please! I voted with my DOS box. From alex.nanou at gmail.com Wed Apr 6 03:29:44 2005 From: alex.nanou at gmail.com (Alex A. Naanou) Date: Wed Apr 6 03:29:47 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... Message-ID: <36f889220504051829266cea1e@mail.gmail.com> Hi! here is a simple piece of code
---cut---
class Dict(dict):
    def __init__(self, dct={}):
        self._dict = dct
    def __getitem__(self, name):
        return self._dict[name]
    def __setitem__(self, name, value):
        self._dict[name] = value
    def __delitem__(self, name):
        del self._dict[name]
    def __contains__(self, name):
        return name in self._dict
    def __iter__(self):
        return iter(self._dict)

class A(object):
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        o.__dict__ = Dict()
        return o

a = A()
a.xxx = 123
print a.__dict__._dict
a.__dict__._dict['yyy'] = 321
print a.yyy

--uncut--
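[Editor's sketch, not part of the original message: a Python 3 rendition of the snippet above (with the `_dct`/`_dict` typo resolved) makes the reported inconsistency concrete. Instance attribute access goes through the concrete dict C API on `__dict__`, so the overridden mapping methods are silently bypassed. The class names mirror Alex's example; the demonstration itself is hypothetical.]

```python
# Dict overrides the mapping protocol but redirects storage to self._dict.
class Dict(dict):
    def __init__(self, dct=None):
        self._dict = {} if dct is None else dct

    def __getitem__(self, name):
        return self._dict[name]

    def __setitem__(self, name, value):
        self._dict[name] = value


class A:
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        o.__dict__ = Dict()   # accepted: Dict is a dict subclass
        return o


a = A()
a.xxx = 123                                # does NOT call Dict.__setitem__
print(a.__dict__._dict)                    # {}  -- the override never ran
print(dict.__getitem__(a.__dict__, "xxx")) # 123 -- stored in the base dict
a.__dict__._dict["yyy"] = 321
print(hasattr(a, "yyy"))                   # False -- lookup bypasses __getitem__
```

Running this shows the attribute machinery talking to the `dict` base storage directly, which is the `PyDict_*` fast path Brett describes in his reply.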
Here there are two problems, the first is minor, and it is that anything assigned to the __dict__ attribute is checked to be a descendant of the dict class (mixing this in does not seem to work)... and the second problem is a real annoyance, it is that the mapping protocol supported by the Dict object in the example above is not used by the attribute access mechanics (the same thing that once happened in exec)... P.S. (IMHO) the type check here is not that necessary (at least in its current state), as what we need to assert is not the relation to the dict class but the support of the mapping protocol.... thanks. -- Alex. From bac at OCF.Berkeley.EDU Wed Apr 6 04:46:07 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 6 04:46:18 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <36f889220504051829266cea1e@mail.gmail.com> References: <36f889220504051829266cea1e@mail.gmail.com> Message-ID: <42534D6F.40200@ocf.berkeley.edu> Alex A. Naanou wrote: > Hi! > > here is a simple piece of code >
> ---cut---
> class Dict(dict):
>     def __init__(self, dct={}):
>         self._dict = dct
>     def __getitem__(self, name):
>         return self._dict[name]
>     def __setitem__(self, name, value):
>         self._dict[name] = value
>     def __delitem__(self, name):
>         del self._dict[name]
>     def __contains__(self, name):
>         return name in self._dict
>     def __iter__(self):
>         return iter(self._dict)
> 
> class A(object):
>     def __new__(cls, *p, **n):
>         o = object.__new__(cls)
>         o.__dict__ = Dict()
>         return o
> 
> a = A()
> a.xxx = 123
> print a.__dict__._dict
> a.__dict__._dict['yyy'] = 321
> print a.yyy
> 
> --uncut--
> 
> > Here there are two problems, the first is minor, and it is that > anything assigned to the __dict__ attribute is checked to be a > descendant of the dict class (mixing this in does not seem to work)... > and the second problem is a real annoyance, it is that the mapping > protocol supported by the Dict object in the example above is not used > by the attribute access mechanics (the same thing that once happened > in exec)... > Actually, overriding __getattribute__() does work; __getattr__() and __getitem__() don't. This was brought up last month at some point without any resolution (I think Steve Bethard pointed it out). > P.S. (IMHO) the type check here is not that necessary (at least in its > current state), as what we need to assert is not the relation to the > dict class but the support of the mapping protocol.... > Semantically necessary, no. But simplicity- and performance-wise, maybe. If you grep around in Objects/classobject.c, for instance, you will see PyClassObject.cl_dict is accessed using PyDict_GetItem() and I spotted at least one use of PyDict_DelItem(). To use the mapping protocol would require changing all of these to PyObject_GetItem() and such, which would be a performance penalty compared to PyDict_GetItem(). So the question is whether the flexibility is worth it. -Brett From martin at v.loewis.de Wed Apr 6 08:06:08 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 6 08:06:12 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <42537C50.8000608@v.loewis.de> Stephen J. Turnbull wrote: > Of course it must be supported. 
My point is that many strings (in my > applications, all but those strings that result from slurping in a > file or process output in one go -- example, not a statistically valid > sample!) are not the beginning of "what once was a stream". It is > error-prone (not to mention unaesthetic) to not make that distinction. > > "Explicit is better than implicit." I can't put these two paragraphs together. If you think that explicit is better than implicit, why do you not want to make different calls for the first chunk of a stream, and the subsequent chunks? > >>> s=cStringIO.StringIO() > >>> s1=codecs.getwriter("utf-8")(s) > >>> s1.write(u"Hallo") > >>> s.getvalue() > 'Hallo' > > Yes! Exactly (except in reverse, we want to _read_ from the slurped > stream-as-string, not write to one)! ... and there's no need for a > utf-8-sig codec for strings, since you can support the usage in > exactly this way. However, if there is an utf-8-sig codec for streams, there is currently no way of *preventing* this codec to also be available for strings. The very same code is used for streams and for strings, and automatically so. Regards, Martin From walter at livinglogic.de Wed Apr 6 10:32:59 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Apr 6 10:33:02 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4252FA7E.3090206@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <425268EA.7070703@livinglogic.de> <1569.84.56.104.122.1112729580.squirrel@isar.livinglogic.de> <4252EF7A.8080804@v.loewis.de> <1809.84.56.104.122.1112733444.squirrel@isar.livinglogic.de> <4252FA7E.3090206@v.loewis.de> Message-ID: <1520.84.56.99.39.1112776379.squirrel@isar.livinglogic.de> Martin v. 
Löwis sagte: > Walter Dörwald wrote: >> There are situations where the byte stream might be temporarily >> exhausted, e.g. an XML parser that tries to support the >> IncrementalParser interface, or when you want to decode >> encoded data piecewise, because you want to give a progress >> report. > > Yes, but these are not file-like objects. True, on the outside there are no file-like objects. But the IncrementalParser gets passed the XML bytes in chunks, so it has to use a stateful decoder for decoding. Unfortunately this means that it has to use a stream API. (See http://www.python.org/sf/1101097 for a patch that somewhat fixes that.) (Another option would be to completely ignore the stateful API and handcraft stateful decoding (or only support stateless decoding), like most XML parsers for Python do now.) > In the IncrementalParser, > it is *not* the case that a read operation returns an empty > string. Instead, the application repeatedly feeds data explicitly. That's true, but the parser has to wrap this data into an object that can be passed to the StreamReader constructor. (See the Queue class in Lib/test/test_codecs.py for an example.) > For a file-like object, returning "" indicates EOF. Not necessarily. In the example above the IncrementalParser gets fed a chunk of data, it stuffs this data into the Queue, so that the StreamReader can decode it. Once the data from the Queue is exhausted, there won't be any further data until the user calls feed() on the IncrementalParser again. Bye, Walter Dörwald From stephen at xemacs.org Wed Apr 6 11:31:21 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed Apr 6 11:31:27 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42537C50.8000608@v.loewis.de> (Martin v. 
=?iso-8859-1?q?L=F6wis's?= message of "Wed, 06 Apr 2005 08:06:08 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> Message-ID: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Martin" == Martin v Löwis writes: Martin> I can't put these two paragraphs together. If you think Martin> that explicit is better than implicit, why do you not want Martin> to make different calls for the first chunk of a stream, Martin> and the subsequent chunks? Because the signature/BOM is not a chunk, it's a header. Handling the signature/BOM is part of stream initialization, not translation, to my mind. The point is that explicitly using a stream shows that initialization (and finalization) matter. The default can be BOM or not, as a pragmatic matter. But then the stream data itself can be treated homogeneously, as implied by the notion of stream. I think it probably also would solve Walter's conundrum about buffering the signature/BOM if responsibility for that were moved out of the codecs and into the objects where signatures make sense. I don't know whether that's really feasible in the short run---I suspect there may be a lot of stream-like modules that would need to be updated---but it would be saner in the long run. >> Yes! Exactly (except in reverse, we want to _read_ from the >> slurped stream-as-string, not write to one)! ... and there's >> no need for a utf-8-sig codec for strings, since you can >> support the usage in exactly this way. Martin> However, if there is an utf-8-sig codec for streams, there Martin> is currently no way of *preventing* this codec to also be Martin> available for strings. The very same code is used for Martin> streams and for strings, and automatically so. 
And of course it should be. But if it's not possible to move the -sig facility out of the codecs into the streams, that would be a shame. I think we should encourage people to use streams where initialization or finalization semantics are non-trivial, as they are with signatures. But as long as both utf-8-we-dont-need-no-steenkin-sigs-in-strings and utf-8-sig are available, I can program as I want to (and refer those whose strings get cratered by stray BOMs to you). -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From walter at livinglogic.de Wed Apr 6 13:48:48 2005 From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=) Date: Wed Apr 6 13:48:51 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <4253CCA0.9020008@livinglogic.de> Stephen J. Turnbull wrote: >>>>> "Martin" == Martin v Löwis writes: > > Martin> I can't put these two paragraphs together. If you think > Martin> that explicit is better than implicit, why do you not want > Martin> to make different calls for the first chunk of a stream, > Martin> and the subsequent chunks? > > Because the signature/BOM is not a chunk, it's a header. Handling the > signature/BOM is part of stream initialization, not translation, to my > mind. > > The point is that explicitly using a stream shows that initialization
The default can be BOM or not, as a > pragmatic matter. But then the stream data itself can be treated > homogeneously, as implied by the notion of stream. > > I think it probably also would solve Walter's conundrum about > buffering the signature/BOM if responsibility for that were moved out > of the codecs and into the objects where signatures make sense. Not really. In every encoding where a sequence of more than one byte maps to one Unicode character, you will always need some kind of buffering. If we remove the handling of initial BOMs from the codecs (except for UTF-16 where it is required), this wouldn't change any buffering requirements. > I don't know whether that's really feasible in the short run---I > suspect there may be a lot of stream-like modules that would need to > be updated---but it would be saner in the long run. I'm not exactly sure what you're proposing here. That all codecs (even UTF-16) pass the BOM through and some other infrastructure is responsible for dropping it? > [...] Bye, Walter Dörwald From mwh at python.net Wed Apr 6 11:37:22 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 6 14:02:12 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <1f7befae05040513113c825c92@mail.gmail.com> (Tim Peters's message of "Tue, 5 Apr 2005 16:11:14 -0400") References: <2mk6njdh9z.fsf@starship.python.net> <1f7befae05040513113c825c92@mail.gmail.com> Message-ID: <2m4qek9rfx.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] >> Asking mostly for curiousity, how hard would it be to have longs store >> their sign bit somewhere less aggravating? > > Depends on where that is. > >> It seems to me that the top bit of ob_digit[0] is always 0, for example, > > Yes, the top bit of ob_digit[i], for all relevant i, is 0 on all > platforms now. > >> and I'm sure this would result in no less convolution in longobject.c; it'd be >> considerably more localized convolution. 
> > I'd much rather give struct _longobject a distinct sign member (say, 0 > == zero, -1 = non-zero negative, 1 == non-zero positive). Well, that would indeed be simpler. > That would simplify code. It would cost no extra bytes for some > longs, and 8 extra bytes for others (since obmalloc rounds up to a > multiple of 8); I don't care about that (e.g., I never use millions > of longs simultaneously, but often use a few dozen very big longs > simultaneously; the memory difference is in the noise then). > > Note that longintrepr.h isn't included by Python.h. Only longobject.h > is, and longobject.h doesn't reveal the internal structure of longs. > IOW, changing the internal layout of longs shouldn't even hurt binary > compatibility. Bonus. > The ob_size member of PyObject_VAR_HEAD would also be redeclared as > size_t in an ideal world. As nature intended. I might do a patch, at some point... Cheers, mwh -- Indeed, when I design my killer language, the identifiers "foo" and "bar" will be reserved words, never used, and not even mentioned in the reference manual. Any program using one will simply dump core without comment. Multitudes will rejoice. -- Tim Peters, 29 Apr 1998 From tim.peters at gmail.com Wed Apr 6 14:50:46 2005 From: tim.peters at gmail.com (Tim Peters) Date: Wed Apr 6 14:50:50 2005 Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules mathmodule.c, 2.74, 2.75 In-Reply-To: References: Message-ID: <1f7befae050406055026a1e00c@mail.gmail.com> [mwh@users.sourceforge.net] > Modified Files: > mathmodule.c > Log Message: > Add a comment explaining the import of longintrepr.h. > > Index: mathmodule.c ... > #include "Python.h" > -#include "longintrepr.h" > +#include "longintrepr.h" // just for SHIFT The intent is fine, but please use a standard C (not C++) comment. That is, /*...*/, not //. 
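[Editor's sketch, not part of the original message: the explicit-sign representation Tim proposes can be modelled in pure Python as a sign member in {-1, 0, 1} plus unsigned base-2**SHIFT magnitude digits. SHIFT matches the digit size in longintrepr.h of the era; the function names here are hypothetical.]

```python
SHIFT = 15          # bits per digit in CPython's longs at the time
BASE = 1 << SHIFT

def to_sign_digits(n):
    """Split an int into (sign, magnitude digits), least significant first."""
    sign = (n > 0) - (n < 0)      # 0 == zero, -1 == negative, 1 == positive
    n = abs(n)
    digits = []
    while n:
        digits.append(n & (BASE - 1))   # low SHIFT bits become one digit
        n >>= SHIFT
    return sign, digits

def from_sign_digits(sign, digits):
    """Inverse: rebuild the int from the explicit-sign representation."""
    n = 0
    for d in reversed(digits):
        n = (n << SHIFT) | d
    return sign * n

print(to_sign_digits(-70000))   # (-1, [4464, 2])
```

Because the digits carry only the magnitude, nothing about the sign has to be smuggled into ob_size, which is what would let it become an unsigned size_t as suggested above.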
From mwh at python.net Wed Apr 6 15:57:24 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 6 15:57:26 2005 Subject: [Python-Dev] longobject.c & ob_size In-Reply-To: <2m4qek9rfx.fsf@starship.python.net> (Michael Hudson's message of "Wed, 06 Apr 2005 10:37:22 +0100") References: <2mk6njdh9z.fsf@starship.python.net> <1f7befae05040513113c825c92@mail.gmail.com> <2m4qek9rfx.fsf@starship.python.net> Message-ID: <2mwtrg80u3.fsf@starship.python.net> Michael Hudson writes: > Tim Peters writes: > >> [Michael Hudson] >>> Asking mostly for curiousity, how hard would it be to have longs store >>> their sign bit somewhere less aggravating? >> >> Depends on where that is. [...] >> I'd much rather give struct _longobject a distinct sign member (say, 0 >> == zero, -1 = non-zero negative, 1 == non-zero positive). I ended up doing -1 non-zero negative, 1 zero and positive, but I don't know if this is really clearer than what you suggest overall. I suspect it's a wash. [...] > I might do a patch, at some point... http://python.org/sf/1177779 Assigned to you, but unassign if you don't have time (testing the patch is probably more worthwhile than reading it!). Cheers, mwh -- Linux: Horse. Like a wild horse, fun to ride. Also prone to throwing you and stamping you into the ground because it doesn't like your socks. -- Jim's pedigree of operating systems, asr From steven.bethard at gmail.com Wed Apr 6 16:43:44 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Wed Apr 6 16:43:46 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <42534D6F.40200@ocf.berkeley.edu> References: <36f889220504051829266cea1e@mail.gmail.com> <42534D6F.40200@ocf.berkeley.edu> Message-ID: On Apr 5, 2005 8:46 PM, Brett C. wrote: > Alex A. 
Naanou wrote: > > Here there are two problems, the first is minor, and it is that > > anything assigned to the __dict__ attribute is checked to be a > > descendant of the dict class (mixing this in does not seem to work)... > > and the second problem is a real annoyance, it is that the mapping > > protocol supported by the Dict object in the example above is not used > > by the attribute access mechanics (the same thing that once happened > > in exec)... > > Actually, overriding __getattribute__() does work; __getattr__() and > __getitem__() doesn't. This was brought up last month at some point without > any resolve (I think Steve Bethard pointed it out). Yeah, here's the link: http://mail.python.org/pipermail/python-dev/2005-March/051837.html I've pointed out three possible "solutions" there, but they all have some significant drawbacks. I took the complete silence on the topic as an indication that none of the options were acceptable. STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From ncoghlan at gmail.com Wed Apr 6 13:31:44 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 6 17:17:20 2005 Subject: [Python-Dev] inconsistency when swapping obj.__dict__ with a dict-like object... In-Reply-To: <36f889220504051829266cea1e@mail.gmail.com> References: <36f889220504051829266cea1e@mail.gmail.com> Message-ID: <4253C8A0.2050509@gmail.com> > P.S. (IMHO) the type check here is not that necessary (at least in its > current state), as what we need to assert is not the relation to the > dict class but the support of the mapping protocol.... The type-check is basically correct - as you have discovered, type & object use the PyDict_* API internally (for speed reasons, as I understand it), so supporting the mapping API is not really sufficient for something assigned to __dict__. 
Changing this for exec is one thing, as speed of access to the locals dict isn't likely to have a major impact on the overall performance of such code, but I would expect changing class dictionary access code in a similar way would have a major (detrimental) performance impact. Depending on the use case, it is possible to work around the problem by defining __dict__, __getattribute__, __setattr__ and __delattr__ in the class. Defining __dict__ sidesteps the type error, and defining the other three methods then lets you get around the fact that the standard C-level dict pointer is no longer being updated, as well as making sure the general mapping API is used, rather than the concrete PyDict_* API. This is kinda ugly, but it works as long as any C code using the class __dict__ goes via the attribute access machinery and doesn't try to get the dictionary automatically supplied by Python by digging directly into the type structure.

=====================
from UserDict import DictMixin

class Dict(DictMixin):
    def __init__(self, dct=None):
        if dct is None:
            dct = {}
        self._dict = dct
    def __getitem__(self, name):
        return self._dict[name]
    def __setitem__(self, name, value):
        self._dict[name] = value
    def __delitem__(self, name):
        del self._dict[name]
    def keys(self):
        return self._dict.keys()

class A(object):
    def __new__(cls, *p, **n):
        o = object.__new__(cls)
        super(A, o).__setattr__('__dict__', Dict())
        return o
    __dict__ = None
    def __getattr__(self, attr):
        try:
            return self.__dict__[attr]
        except KeyError:
            raise AttributeError("%s" % attr)
    def __setattr__(self, attr, value):
        if attr in self.__dict__ or not hasattr(self, attr):
            self.__dict__[attr] = value
        else:
            super(A, self).__setattr__(attr, value)
    def __delattr__(self, attr):
        if attr in self.__dict__:
            del self.__dict__[attr]
        else:
            super(A, self).__delattr__(attr)

Py> a = A()
Py> a.__dict__._dict
{}
Py> a.xxx = 123
Py> a.__dict__._dict
{'xxx': 123}
Py> a.__dict__._dict['yyy'] = 321
Py> a.yyy
321
Py> a.__dict__._dict
{'xxx': 123, 'yyy': 321}
Py> del a.xxx
Py> a.__dict__._dict
{'yyy': 321}
Py> del a.xxx
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 21, in __delattr__
AttributeError: xxx
Py> a.__dict__ = {}
Py> a.yyy
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 11, in __getattr__
AttributeError: yyy

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From martin at v.loewis.de Wed Apr 6 22:22:19 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 6 22:22:22 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <425444FB.6040407@v.loewis.de> Stephen J. Turnbull wrote: > Because the signature/BOM is not a chunk, it's a header. Handling the > signature/BOM is part of stream initialization, not translation, to my > mind. I'm sorry, but I'm losing track as to what precisely you are trying to say. You seem to be using a mental model that is entirely different from mine. > The point is that explicitly using a stream shows that initialization > (and finalization) matter. The default can be BOM or not, as a > pragmatic matter. But then the stream data itself can be treated > homogeneously, as implied by the notion of stream. But what follows from that point? So it shows some kind of matter... what does that mean for actual changes to Python API? 
> I think it probably also would solve Walter's conundrum about > buffering the signature/BOM if responsibility for that were moved out > of the codecs and into the objects where signatures make sense. > > I don't know whether that's really feasible in the short run---I > suspect there may be a lot of stream-like modules that would need to > be updated---but it would be saner in the long run. What is "that" which might be really feasible? To "solve Walter's conundrum"? That "signatures make sense"? So I can't really respond to your message in a meaningful way; I just let it rest... Regards, Martin From kbk at shore.net Thu Apr 7 04:12:08 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu Apr 7 04:12:23 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504070212.j372C8Nw030750@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches : 308 open (+11) / 2819 closed ( +7) / 3127 total (+18)
Bugs    : 882 open (+11) / 4913 closed (+13) / 5795 total (+24)
RFE     : 176 open ( +1) / 151 closed ( +1) / 327 total ( +2)

New / Reopened Patches
______________________

improvement of the script adaptation for the win32 platform (2005-03-30) http://python.org/sf/1173134 opened by Vivian De Smedt
unicodedata docstrings (2005-03-30) CLOSED http://python.org/sf/1173245 opened by Jeremy Yallop
__slots__ for subclasses of variable length types (2005-03-30) http://python.org/sf/1173475 opened by Michael Hudson
Python crashes in pyexpat.c if malformed XML is parsed (2005-03-31) http://python.org/sf/1173998 opened by pdecat
hierarchical regular expression (2005-04-01) CLOSED http://python.org/sf/1174589 opened by Chris Ottrey
site enhancements (2005-04-01) http://python.org/sf/1174614 opened by Bob Ippolito
Export more libreadline API functions (2005-04-01) http://python.org/sf/1175004 opened by Bruce Edge
Export more libreadline API functions (2005-04-01) CLOSED http://python.org/sf/1175048 opened by Bruce Edge
Patch for whitespace enforcement (2005-04-01) CLOSED http://python.org/sf/1175070 opened by Guido van Rossum
Allow weak referencing of classic classes (2005-04-03) http://python.org/sf/1175850 opened by Greg Chapman
threading.Condition.wait() return value indicates timeout (2005-04-03) http://python.org/sf/1175933 opened by Martin Blais
Make subprocess.Popen support file-like objects (win) (2005-04-03) http://python.org/sf/1175984 opened by Nicolas Fleury
Implemented new 'class foo():pass' syntax (2005-04-03) http://python.org/sf/1176019 opened by logistix
locale._build_localename treatment for utf8 (2005-04-05) http://python.org/sf/1176504 opened by Hye-Shik Chang
Clarify unicode.(en|de)code.() docstrings (2005-04-04) CLOSED http://python.org/sf/1176578 opened by Brett Cannon
UTF-8-Sig codec (2005-04-05) http://python.org/sf/1177307 opened by Walter Dörwald
Complex commented (2005-04-06) http://python.org/sf/1177597 opened by engelbert gruber
explicit sign variable for longs (2005-04-06) http://python.org/sf/1177779 opened by Michael Hudson

Patches Closed
______________

unicodedata docstrings (2005-03-30) http://python.org/sf/1173245 closed by perky
hierarchical regular expression (2005-04-01) http://python.org/sf/1174589 closed by loewis
Export more libreadline API functions (2005-04-01) http://python.org/sf/1175048 closed by loewis
Patch for whitespace enforcement (2005-04-01) http://python.org/sf/1175070 closed by gvanrossum
ast for decorators (2005-03-21) http://python.org/sf/1167709 closed by nascheme
[ast branch] unicode literal fixes (2005-03-25) http://python.org/sf/1170272 closed by nascheme
Clarify unicode.(en|de)code.() docstrings (2005-04-04) http://python.org/sf/1176578 closed by bcannon

New / Reopened Bugs
___________________

very minor doc bug in 'listsort.txt' (2005-03-30) CLOSED http://python.org/sf/1173407 opened by gyrof
quit should quit (2005-03-30) CLOSED http://python.org/sf/1173637 opened by Matt Chaput
multiple broken links in profiler docs (2005-03-30) http://python.org/sf/1173773 opened by Ilya Sandler
Reading /dev/zero causes SystemError (2005-04-01) http://python.org/sf/1174606 opened by Adam Olsen
subclassing ModuleType and another built-in type (2005-04-01) http://python.org/sf/1174712 opened by Armin Rigo
PYTHONPATH is not working (2005-04-01) CLOSED http://python.org/sf/1174795 opened by Alexander Belchenko
property example code error (2005-04-01) http://python.org/sf/1175022 opened by John Ridley
import statement likely to crash if module launches threads (2005-04-01) http://python.org/sf/1175194 opened by Jeff Stearns
python hangs if import statement launches threads (2005-04-01) CLOSED http://python.org/sf/1175202 opened by Jeff Stearns
codecs.readline sometimes removes newline chars (2005-04-02) CLOSED http://python.org/sf/1175396 opened by Irmen de Jong
poorly named variable in urllib2.py (2005-04-03) http://python.org/sf/1175848 opened by Roy Smith
StringIO and cStringIO don't provide 'name' attribute (2005-04-03) http://python.org/sf/1175967 opened by logistix
compiler module didn't get updated for "class foo():pass" (2005-04-03) http://python.org/sf/1176012 opened by logistix
Python garbage collector isn't detecting deadlocks (2005-04-04) CLOSED http://python.org/sf/1176467 opened by Nathan Marushak
Readline segfault (2005-04-05) http://python.org/sf/1176893 opened by Walter Dörwald
[PyPI] Password reset problem. (2005-04-05) CLOSED http://python.org/sf/1177077 opened by Darek Suchojad
random.py/os.urandom robustness (2005-04-05) http://python.org/sf/1177468 opened by Fazal Majid
error locale.getlocale() with LANGUAGE=eu_ES (2005-04-06) CLOSED http://python.org/sf/1177674 opened by Zunbeltz Izaola
Exec Inside A Function (2005-04-06) http://python.org/sf/1177811 opened by Andrew Wilkinson
(?(id)yes|no) only works when referencing the first group (2005-04-06) http://python.org/sf/1177831 opened by André Malo
Iterator on Fileobject gives no MemoryError (2005-04-06) http://python.org/sf/1177964 opened by Folke Lemaitre
cgitb.py support for frozen images (2005-04-06) http://python.org/sf/1178136 opened by Barry Alan Scott
urllib.py overwrite HTTPError code with 200 (2005-04-06) http://python.org/sf/1178141 opened by Barry Alan Scott
urllib2.py assumes 206 is an error (2005-04-06) http://python.org/sf/1178145 opened by Barry Alan Scott
cgitb.py report wrong line number (2005-04-07) http://python.org/sf/1178148 opened by Barry Alan Scott

Bugs Closed
___________

The readline module can cause python to segfault (2005-03-19) http://python.org/sf/1166660 closed by mwh
very minor doc bug in 'listsort.txt' (2005-03-30) http://python.org/sf/1173407 closed by rhettinger
Property access with decorator makes interpreter crash (2005-03-17) http://python.org/sf/1165306 closed by mwh
"cmp" should be "key" in sort doc (2005-03-29) http://python.org/sf/1172581 closed by rhettinger
why should URL be required for all packages (2005-03-25) http://python.org/sf/1170424 closed by loewis
Possible windows+python bug (2005-03-22) http://python.org/sf/1168427 closed by holo9
quit should quit (2005-03-30) http://python.org/sf/1173637 closed by loewis
PYTHONPATH is not working (2005-04-01) http://python.org/sf/1174795 closed by bcannon
python hangs if import statement launches threads (2005-04-02) http://python.org/sf/1175202 closed by loewis
codecs.readline sometimes removes newline chars (2005-04-02) http://python.org/sf/1175396 closed by doerwalter
Python garbage collector isn't detecting deadlocks (2005-04-04) http://python.org/sf/1176467 closed by nascheme
[PyPI] Password reset problem. 
(2005-04-05) http://python.org/sf/1177077 closed by jafo Minor error in section 3.2 (2005-03-11) http://python.org/sf/1161595 closed by jyby error locale.getlocale() with LANGUAGE=eu_ES (2005-04-06) http://python.org/sf/1177674 closed by perky New / Reopened RFE __________________ add "reload" function (2005-04-03) http://python.org/sf/1175686 opened by paul rubin Add a settimeout to ftplib.FTP object (2005-04-06) http://python.org/sf/1177998 opened by Juan Antonio Vali?o Garc?a RFE Closed __________ file() on a file (2005-03-03) http://python.org/sf/1155485 closed by loewis From nbastin at opnet.com Thu Apr 7 05:09:24 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 05:09:47 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <42526645.3010600@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> Message-ID: <2019f504df72a18fb04061248e3f55d8@opnet.com> On Apr 5, 2005, at 6:19 AM, M.-A. Lemburg wrote: > Note that the UTF-16 codec is strict w/r to the presence > of the BOM mark: you get a UnicodeError if a stream does > not start with a BOM mark. For the UTF-8-SIG codec, this > should probably be relaxed to not require the BOM. I've actually been confused about this point for quite some time now, but never had a chance to bring it up. I do not understand why UnicodeError should be raised if there is no BOM. I know that PEP-100 says: 'utf-16': 16-bit variable length encoding (little/big endian) and: Note: 'utf-16' should be implemented by using and requiring byte order marks (BOM) for file input/output. But this appears to be in error, at least in the current unicode standard. 'utf-16', as defined by the unicode standard, is big-endian in the absence of a BOM: --- 3.10.D42: UTF-16 encoding scheme: ... * The UTF-16 encoding scheme may or may not begin with a BOM. 
However, when there is no BOM, and in the absence of a higher-level protocol, the byte order of the UTF-16 encoding scheme is big-endian. --- The current implementation of the utf-16 codecs makes for some irritating gymnastics to write the BOM into the file before reading it if it contains no BOM, which seems quite like a bug in the codec. I allow for the possibility that this was ambiguous in the standard when the PEP was written, but it is certainly not ambiguous now. -- Nick From stephen at xemacs.org Thu Apr 7 06:20:53 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu Apr 7 06:21:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4253CCA0.9020008@livinglogic.de> (Walter =?iso-8859-1?q?D=F6rwald's?= message of "Wed, 06 Apr 2005 13:48:48 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <87is31e8hk.fsf@tleepslib.sk.tsukuba.ac.jp> <4252DC9F.50000@v.loewis.de> <87fyy4d9tq.fsf@tleepslib.sk.tsukuba.ac.jp> <42537C50.8000608@v.loewis.de> <874qekb6ae.fsf@tleepslib.sk.tsukuba.ac.jp> <4253CCA0.9020008@livinglogic.de> Message-ID: <87br8r9pzu.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Walter" == Walter D?rwald writes: Walter> Not really. In every encoding where a sequence of more Walter> than one byte maps to one Unicode character, you will Walter> always need some kind of buffering. If we remove the Walter> handling of initial BOMs from the codecs (except for Walter> UTF-16 where it is required), this wouldn't change any Walter> buffering requirements. Sure. My point is that codecs should be stateful only to the extent needed to assemble semantically meaningful units (ie, multioctet coded characters). In particular, they should not need to know about location at the beginning, middle, or end of some stream---because in the context of operating on a string they _can't_. 
>> I don't know whether that's really feasible in the short
>> run---I suspect there may be a lot of stream-like modules that
>> would need to be updated---but it would be a saner in the long
>> run.

Walter> I'm not exactly sure, what you're proposing here. That all
Walter> codecs (even UTF-16) pass the BOM through and some other
Walter> infrastructure is responsible for dropping it?

Not exactly. I think that at the lowest level codecs should not implement complex mode-switching internally, but rather explicitly abdicate responsibility to a more appropriate codec. For example, autodetecting UTF-16 on input would be implemented by a Python program that does something like

    data = stream.read()
    for detector in ["utf-16-signature", "utf-16-statistical"]:
        # for the UTF-16 detectors, OUT will always be u"" or None
        out, data, codec = data.decode(detector)
        if codec:
            break
    while codec:
        more_out, data, codec = data.decode(codec)
        out = out + more_out
    if data:
        # a real program would complain about it
        pass
    process(out)

where decode("utf-16-signature") would be implemented

    def utf_16_signature_internal(data):
        if data[0:2] == "\xfe\xff":
            return (u"", data[2:], "utf-16-be")
        elif data[0:2] == "\xff\xfe":
            return (u"", data[2:], "utf-16-le")
        else:
            # note: data is undisturbed if the detector fails
            return (None, data, None)

The main point is that the detector is just a codec that stops when it figures out what the next codec should be, touches only data that would be incorrect to pass to the next codec, and leaves the data alone if detection fails. utf-16-signature only handles the BOM (if present), and does not handle arbitrary "chunks" of data. Instead, it passes on the rest of the data (including the first chunk) to be handled by the appropriate utf-16-?e codec. I think that the temptation to encapsulate this logic in a utf-16 codec that "simplifies" things by calling the appropriate utf-16-?e codec itself should be deprecated, but YMMV.
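The detector sketch above can be made runnable as a plain function (the name `utf16_signature_detect` and the tuple-returning convention are taken from the pseudocode, not from any real codec API — `bytes.decode` does not actually return a 3-tuple, so the detector is written as an ordinary function):

```python
def utf16_signature_detect(data):
    """Sketch of the 'utf-16-signature' detector: consume only the BOM
    (if any) and name the codec that should handle the remaining bytes.
    Returns (decoded_prefix, remaining_bytes, next_codec); the prefix is
    '' on success and None when detection fails."""
    if data[:2] == b'\xfe\xff':
        return '', data[2:], 'utf-16-be'
    elif data[:2] == b'\xff\xfe':
        return '', data[2:], 'utf-16-le'
    # detection failed: the data is passed back undisturbed
    return None, data, None

out, rest, codec = utf16_signature_detect(b'\xff\xfeh\x00i\x00')
if codec:
    out += rest.decode(codec)   # out is now 'hi'
```

Written this way, the detector touches only the two BOM bytes and hands everything else — including the first chunk — to the concrete `utf-16-?e` codec, which is exactly the hand-off being argued for.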
What I would really like is for the above style to be easier to achieve than it currently is. BTW, I appreciate your patience in exploring this; after Martin's remark about different mental models I have to suspect this approach is just somehow un-Pythonic, but fleshing it out this way I can see how it will be useful in the context of a different project. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From anthony at interlink.com.au Thu Apr 7 09:27:02 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 7 09:27:21 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the =?iso-8859-1?q?re=09library?= In-Reply-To: <424F91B0.2050307@v.loewis.de> References: <20050402134150.7215.JCARLSON@uci.edu> <424F91B0.2050307@v.loewis.de> Message-ID: <200504071727.03601.anthony@interlink.com.au> On Sunday 03 April 2005 16:48, Martin v. L?wis wrote: > If this kind of functionality would fall on immediate rejection for > some reason, even writing the PEP might be pointless. Note that even if something is rejected, the PEP itself is useful - it collects knowledge in a format that's far more accessible than searching the mailing list archives. (note that I'm not talking about this particular case, but about PEPs in general - I have no opinion on the current proposal, because I'm not a heavy user of REs) -- Anthony Baxter It's never too late to have a happy childhood. From mal at egenix.com Thu Apr 7 11:07:58 2005 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu Apr 7 11:08:02 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <2019f504df72a18fb04061248e3f55d8@opnet.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> Message-ID: <4254F86E.4000203@egenix.com> Nicholas Bastin wrote:
> On Apr 5, 2005, at 6:19 AM, M.-A. Lemburg wrote:
>> Note that the UTF-16 codec is strict w/r to the presence
>> of the BOM mark: you get a UnicodeError if a stream does
>> not start with a BOM mark. For the UTF-8-SIG codec, this
>> should probably be relaxed to not require the BOM.
>
> I've actually been confused about this point for quite some time now,
> but never had a chance to bring it up. I do not understand why
> UnicodeError should be raised if there is no BOM. I know that PEP-100
> says:
>
> 'utf-16': 16-bit variable length encoding (little/big endian)
>
> and:
>
> Note: 'utf-16' should be implemented by using and requiring byte order
> marks (BOM) for file input/output.
>
> But this appears to be in error, at least in the current unicode
> standard. 'utf-16', as defined by the unicode standard, is big-endian
> in the absence of a BOM:
>
> ---
> 3.10.D42: UTF-16 encoding scheme:
> ...
> * The UTF-16 encoding scheme may or may not begin with a BOM. However,
> when there is no BOM, and in the absence of a higher-level protocol, the
> byte order of the UTF-16 encoding scheme is big-endian.
> ---

The problem is "in the absence of a higher level protocol": the codec doesn't know anything about a protocol - it's the application using the codec that knows which protocol gets used. It's a lot safer to require the BOM for UTF-16 streams and raise an exception to have the application decide whether to use UTF-16-BE or the by far more common UTF-16-LE.
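The risk Marc-Andre describes — garbage entering the application undetected — is easy to demonstrate: BOM-less bytes decode without error under either byte order, but only one result is the intended text (a minimal check using Python's standard UTF-16 codecs; modern codec spellings assumed):

```python
data = b'\x00h\x00i'  # BOM-less UTF-16 for 'hi', big-endian byte order

as_be = data.decode('utf-16-be')  # 'hi' -- the intended text
as_le = data.decode('utf-16-le')  # '\u6800\u6900' -- valid CJK codepoints,
                                  # complete garbage, and no exception raised
```

Because the wrong guess still yields perfectly valid code points, nothing downstream would flag the mistake — which is the argument for raising up front and making the application choose.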
Unlike for the UTF-8 codec, the BOM for UTF-16 is a configuration parameter, not merely a signature. In terms of history, I don't recall whether your quote was already in the standard at the time I wrote the PEP. You are the first to have reported a problem with the current implementation (which has been around since 2000), so I believe that application writers are more comfortable with the way the UTF-16 codec is currently implemented. Explicit is better than implicit :-) > The current implementation of the utf-16 codecs makes for some > irritating gymnastics to write the BOM into the file before reading it > if it contains no BOM, which seems quite like a bug in the codec. The codec writes a BOM in the first call to .write() - it doesn't write a BOM before reading from the file. > I allow for the possibility that this was ambiguous in the standard when > the PEP was written, but it is certainly not ambiguous now. See above. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Thu Apr 7 10:41:03 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 11:49:31 2005 Subject: [Python-Dev] threading (GilState) question Message-ID: <2mmzsb7zds.fsf@starship.python.net> I recently redid how the readline module handled threads around callbacks into Python (the previous code was insane). This resulted in the following bug report: http://www.python.org/sf/1176893 Which is correctly assigned to me as it's clearly a result of my recent checkin. However, I think my code is correct and the fault lies elsewhere. 
Basically, if you call PyGilState_Release before PyEval_InitThreads you crash, because PyEval_ReleaseThread gets called while interpreter_lock is NULL. This is very simple to make go away -- the problem is that there are several ways! Point the first is that I really think this is a bug in the GilState APIs: the readline API isn't inherently multi-threaded and so it would be insane to call PyEval_InitThreads() in initreadline, yet it has to cope with being called in a multithreaded situation. If you can't use the GilState APIs in this situation, what are they for? Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release(). Non-invasive, but bleh. Option 2) Call PyEval_SaveThread() instead of PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which is PyEval_SaveThread()s "mate") and I guess you can distill this long mail into the question "why doesn't PyGilState_Release do this already?" Option 3) Make PyEval_ReleaseThread() not crash when interpreter_lock == NULL. Easy, but it's actually documented that you can't do this. Opinions? Am I placing too much trust into PyGilState_Release()s existing choice of function? Cheers, mwh [1] The issue of having almost-but-not-quite identical variations of API functions -- here PyEval_AcquireThread/PyEval_ReleaseThread vs. PyEval_RestoreThread/PyEval_SaveThread -- is something I can rant about at length, if anyone is interested :) -- I located the link but haven't bothered to re-read the article, preferring to post nonsense to usenet before checking my facts. 
-- Ben Wolfson, comp.lang.python From ncoghlan at gmail.com Thu Apr 7 13:21:00 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 7 13:21:06 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mmzsb7zds.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> Message-ID: <4255179C.4090608@gmail.com> Michael Hudson wrote: > Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release(). > Non-invasive, but bleh. Tim rejected this option back when PyEval_ThreadsInitialized() was added to the API [1]. Gustavo was having a similar problem with pygtk, and the end result was to add the ThreadsInitialized API so that pygtk could make its own check without slowing down the default case in the core. > Option 2) Call PyEval_SaveThread() instead of > PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my > favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which > is PyEval_SaveThread()s "mate") and I guess you can distill this long > mail into the question "why doesn't PyGilState_Release do this > already?" See above. Although I'm now wondering about the opposite question: Why doesn't PyGilState_Ensure use PyEval_AcquireThread? Cheers, Nick. 
[1] http://sourceforge.net/tracker/?func=detail&aid=1044089&group_id=5470&atid=305470
[2] http://mail.python.org/pipermail/python-dev/2004-August/047870.html

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net

From mwh at python.net Thu Apr 7 14:27:16 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 14:27:19 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <4255179C.4090608@gmail.com> (Nick Coghlan's message of "Thu, 07 Apr 2005 21:21:00 +1000") References: <2mmzsb7zds.fsf@starship.python.net> <4255179C.4090608@gmail.com> Message-ID: <2mis2y93h7.fsf@starship.python.net> Nick Coghlan writes:

> Michael Hudson wrote:
>> Option 1) Call PyEval_ThreadsInitialized() in PyGilState_Release().
>> Non-invasive, but bleh.
>
> Tim rejected this option back when PyEval_ThreadsInitialized() was
> added to the API [1].

Well, not really. The patch that was rejected was much larger than any proposal of mine. My option 1) is this:

    --- pystate.c	09 Feb 2005 10:56:18 +0000	2.39
    +++ pystate.c	07 Apr 2005 13:19:55 +0100
    @@ -502,7 +502,8 @@
     		PyThread_delete_key_value(autoTLSkey);
     	}
     	/* Release the lock if necessary */
    -	else if (oldstate == PyGILState_UNLOCKED)
    -		PyEval_ReleaseThread(tcur);
    +	else if (oldstate == PyGILState_UNLOCKED
    +		 && PyEval_ThreadsInitialized())
    +		PyEval_ReleaseThread(tcur);
     }
     #endif /* WITH_THREAD */

> Gustavo was having a similar problem with pygtk, and the end result
> was to add the ThreadsInitialized API so that pygtk could make its
> own check without slowing down the default case in the core.

Well, Gustavo seemed to be complaining about the cost of the locking. I'm complaining about crashes.

>> Option 2) Call PyEval_SaveThread() instead of
>> PyEval_ReleaseThread()[1] in PyGilState_Release(). This is my
>> favourite option (PyGilState_Ensure() calls PyEval_RestoreThread which
>> is PyEval_SaveThread()s "mate") and I guess you can distill this long
>> mail into the question "why doesn't PyGilState_Release do this
>> already?"

This option corresponds to this patch:

    --- pystate.c	09 Feb 2005 10:56:18 +0000	2.39
    +++ pystate.c	07 Apr 2005 13:24:33 +0100
    @@ -503,6 +503,6 @@
     	}
     	/* Release the lock if necessary */
     	else if (oldstate == PyGILState_UNLOCKED)
    -		PyEval_ReleaseThread(tcur);
    +		PyEval_SaveThread();
     }
     #endif /* WITH_THREAD */

> See above. Although I'm now wondering about the opposite question: Why
> doesn't PyGilState_Ensure use PyEval_AcquireThread?

Well, that would make more sense than what we have now. OTOH, I'd *much* rather make the PyGilState functions more tolerant -- I thought being vaguely easy to use was part of their point. I fail to believe the patch associated with option 2) has any detectable performance cost.

Cheers, mwh

-- People think I'm a nice guy, and the fact is that I'm a scheming, conniving bastard who doesn't care for any hurt feelings or lost hours of work if it just results in what I consider to be a better system. -- Linus Torvalds

From nbastin at opnet.com Thu Apr 7 16:19:37 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 16:19:58 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4254F86E.4000203@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> Message-ID: <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> On Apr 7, 2005, at 5:07 AM, M.-A. Lemburg wrote:
>>> The current implementation of the utf-16 codecs makes for some
>>> irritating gymnastics to write the BOM into the file before reading it
>>> if it contains no BOM, which seems quite like a bug in the codec.
> > The codec writes a BOM in the first call to .write() - it > doesn't write a BOM before reading from the file. Yes, see, I read a *lot* of UTF-16 that comes from other sources. It's not a matter of writing with python and reading with python. -- Nick From tim.peters at gmail.com Thu Apr 7 17:21:39 2005 From: tim.peters at gmail.com (Tim Peters) Date: Thu Apr 7 17:21:44 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mmzsb7zds.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> Message-ID: <1f7befae050407082140a591fd@mail.gmail.com> [Michael Hudson] > ... > Point the first is that I really think this is a bug in the GilState > APIs: the readline API isn't inherently multi-threaded and so it would > be insane to call PyEval_InitThreads() in initreadline, yet it has to > cope with being called in a multithreaded situation. If you can't use > the GilState APIs in this situation, what are they for? That's explained in the PEP -- of course : http://www.python.org/peps/pep-0311.html Under "Limitations and Exclusions" it specifically disowns responsibility for worrying about whether Py_Initialize() and PyEval_InitThreads() have been called: This API will not perform automatic initialization of Python, or initialize Python for multi-threaded operation. Extension authors must continue to call Py_Initialize(), and for multi-threaded applications, PyEval_InitThreads(). The reason for this is that the first thread to call PyEval_InitThreads() is nominated as the "main thread" by Python, and so forcing the extension author to specify the main thread (by forcing her to make this first call) removes ambiguity. As Py_Initialize() must be called before PyEval_InitThreads(), and as both of these functions currently support being called multiple times, the burden this places on extension authors is considered reasonable. 
That doesn't mean there isn't a clever way to get the same effect anyway, but I don't have time to think about it, and reassigned the bug report to Mark (who may or may not have time). From mal at egenix.com Thu Apr 7 17:35:38 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Thu Apr 7 17:35:41 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> Message-ID: <4255534A.2090505@egenix.com> Nicholas Bastin wrote:
> On Apr 7, 2005, at 5:07 AM, M.-A. Lemburg wrote:
>>> The current implementation of the utf-16 codecs makes for some
>>> irritating gymnastics to write the BOM into the file before reading it
>>> if it contains no BOM, which seems quite like a bug in the codec.
>>
>> The codec writes a BOM in the first call to .write() - it
>> doesn't write a BOM before reading from the file.
>
> Yes, see, I read a *lot* of UTF-16 that comes from other sources. It's
> not a matter of writing with Python and reading with Python.

Ok, but I don't really follow you here: you are suggesting to relax the current UTF-16 behavior and to start defaulting to UTF-16-BE if no BOM is present - that's most likely going to cause more problems than it seems to solve: namely complete garbage if the data turns out to be UTF-16-LE encoded and, what's worse, enters the application undetected. If you do have UTF-16 without a BOM mark it's much better to let a short function analyze the text by reading the first few bytes of the file and then make an educated guess based on the findings. You can then process the file using one of the other codecs UTF-16-LE or -BE.
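The "short function" Marc-Andre suggests could look like the following sketch (the BOM check plus a NUL-byte heuristic for ASCII-heavy text; the function name and the heuristic are illustrative assumptions, not an existing API, and Python 3 `bytes` semantics are assumed):

```python
def sniff_utf16(data):
    """Guess the byte order of UTF-16 data: trust a BOM when present;
    otherwise look at where the NUL bytes of ASCII-range characters
    fall (NULs mostly in even positions => high byte first => BE)."""
    if data[:2] == b'\xfe\xff':
        return 'utf-16-be'
    if data[:2] == b'\xff\xfe':
        return 'utf-16-le'
    sample = data[:64]
    if sample[0::2].count(0) > sample[1::2].count(0):
        return 'utf-16-be'
    return 'utf-16-le'

raw = b'\x00h\x00i'                  # BOM-less, big-endian bytes
text = raw.decode(sniff_utf16(raw))  # 'hi'
```

The heuristic only works for text with plenty of ASCII-range characters, which is exactly the "educated guess" caveat: the application, not the codec, decides whether such guessing is acceptable.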
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Thu Apr 7 18:00:12 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 7 18:29:59 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <1f7befae050407082140a591fd@mail.gmail.com> (Tim Peters's message of "Thu, 7 Apr 2005 11:21:39 -0400") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> Message-ID: <2m7jje8tmb.fsf@starship.python.net> Tim Peters writes: > [Michael Hudson] >> ... >> Point the first is that I really think this is a bug in the GilState >> APIs: the readline API isn't inherently multi-threaded and so it would >> be insane to call PyEval_InitThreads() in initreadline, yet it has to >> cope with being called in a multithreaded situation. If you can't use >> the GilState APIs in this situation, what are they for? > > That's explained in the PEP -- of course : > > http://www.python.org/peps/pep-0311.html Gnarr. Of course, I read this passage. I think it's missing a use case. > Under "Limitations and Exclusions" it specifically disowns > responsibility for worrying about whether Py_Initialize() and > PyEval_InitThreads() have been called: > [snip quote] This suggests that I should call PyEval_InitThreads() in initreadline(), which seems daft. > That doesn't mean there isn't a clever way to get the same effect > anyway, Pah. There's a very simple way (see my reply to Nick). It even works in the case that PyEval_InitThreads() is called in between the call to PyGilState_Ensure() and PyGilState_Release(). 
> but I don't have time to think about it, and reassigned the bug > report to Mark (who may or may not have time). He gets a week :) Cheers, mwh -- Or here's an even simpler indicator of how much C++ sucks: Print out the C++ Public Review Document. Have someone hold it about three feet above your head and then drop it. Thus you will be enlightened. -- Thant Tessman From nbastin at opnet.com Thu Apr 7 22:27:03 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 7 22:27:55 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255534A.2090505@egenix.com> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: On Apr 7, 2005, at 11:35 AM, M.-A. Lemburg wrote: > Ok, but I don't really follow you here: you are suggesting to > relax the current UTF-16 behavior and to start defaulting to > UTF-16-BE if no BOM is present - that's most likely going to > cause more problems that it seems to solve: namely complete > garbage if the data turns out to be UTF-16-LE encoded and, > what's worse, enters the application undetected. The crux of my argument is that the spec declares that UTF-16 without a BOM is BE. If the file is encoded in UTF-16LE and it doesn't have a BOM, it doesn't deserve to be processed correctly. That being said, treating it as UTF-16BE if it's LE will result in a lot of invalid code points, so it shouldn't be non-obvious that something has gone wrong. > If you do have UTF-16 without a BOM mark it's much better > to let a short function analyze the text by reading for first > few bytes of the file and then make an educated guess based > on the findings. You can then process the file using one > of the other codecs UTF-16-LE or -BE. 
This is about what we do now - we catch UnicodeError and then add a BOM to the file, and read it again. We know our files are UTF-16BE if they don't have a BOM, as the files are written by code which observes the spec. We can't use UTF-16BE all the time, because sometimes they're UTF-16LE, and in those cases the BOM is set. It would be nice if you could optionally specify that the codec would assume UTF-16BE if no BOM was present, and not raise UnicodeError in that case, which would preserve the current behaviour as well as allow users' to ask for behaviour which conforms to the standard. I'm not saying that you can't work around the issue now, what I'm saying is that you shouldn't *have* to - I think there is a reasonable expectation that the UTF-16 codec conforms to the spec, and if you wanted it to do something else, it is those users who should be forced to come up with a workaround. -- Nick From walter at livinglogic.de Thu Apr 7 23:32:28 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Apr 7 23:32:31 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> Nicholas Bastin sagte: > On Apr 7, 2005, at 11:35 AM, M.-A. Lemburg wrote: > > [...] >> If you do have UTF-16 without a BOM mark it's much better >> to let a short function analyze the text by reading for first >> few bytes of the file and then make an educated guess based >> on the findings. You can then process the file using one >> of the other codecs UTF-16-LE or -BE. 
> > This is about what we do now - we catch UnicodeError and > then add a BOM to the file, and read it again. We know > our files are UTF-16BE if they don't have a BOM, as the > files are written by code which observes the spec. > We can't use UTF-16BE all the time, because sometimes > they're UTF-16LE, and in those cases the BOM is set. > > It would be nice if you could optionally specify that the > codec would assume UTF-16BE if no BOM was present, > and not raise UnicodeError in that case, which would > preserve the current behaviour as well as allow users' > to ask for behaviour which conforms to the standard. It should be feasible to implement your own codec for that based on Lib/encodings/utf_16.py. Simply replace the line in StreamReader.decode(): raise UnicodeError,"UTF-16 stream does not start with BOM" with: self.decode = codecs.utf_16_be_decode and you should be done. > [...] Bye, Walter D?rwald From martin at v.loewis.de Thu Apr 7 23:38:39 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu Apr 7 23:38:42 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> Message-ID: <4255A85F.1080307@v.loewis.de> Nicholas Bastin wrote: > It would be nice if you could optionally specify that the codec would > assume UTF-16BE if no BOM was present, and not raise UnicodeError in > that case, which would preserve the current behaviour as well as allow > users' to ask for behaviour which conforms to the standard. Alternatively, the UTF-16BE codec could support the BOM, and do UTF-16LE if the "other" BOM is found. This would also support your usecase, and in a better way. 
The Unicode assertion that UTF-16 is BE by default is void these days - there is *always* a higher layer protocol, and it more often than not specifies (perhaps not in English words, but only in the source code of the generator) that the default should be LE. Regards, Martin

From walter at livinglogic.de Thu Apr 7 23:47:07 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Thu Apr 7 23:47:10 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <1318.84.56.111.122.1112909548.squirrel@isar.livinglogic.de> Message-ID: <1329.84.56.111.122.1112910427.squirrel@isar.livinglogic.de>

Walter Dörwald wrote: > Nicholas Bastin wrote: > > It should be feasible to implement your own codec for that > based on Lib/encodings/utf_16.py. Simply replace the line > in StreamReader.decode(): > raise UnicodeError,"UTF-16 stream does not start with BOM" > with: > self.decode = codecs.utf_16_be_decode > and you should be done.

Oops, this only works if you have a big endian system. Otherwise you have to redecode the input with:

    codecs.utf_16_ex_decode(input, errors, 1, False)

Bye, Walter Dörwald

From mal at egenix.com Fri Apr 8 00:12:53 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Fri Apr 8 00:12:57 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255A85F.1080307@v.loewis.de> References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <4255A85F.1080307@v.loewis.de> Message-ID: <4255B065.9040101@egenix.com>

Martin v. Löwis wrote: > Nicholas Bastin wrote: > >>It would be nice if you could optionally specify that the codec would >>assume UTF-16BE if no BOM was present, and not raise UnicodeError in >>that case, which would preserve the current behaviour as well as allow >>users to ask for behaviour which conforms to the standard. > > > Alternatively, the UTF-16BE codec could support the BOM, and do > UTF-16LE if the "other" BOM is found.

That would violate the Unicode standard - the BOM character for UTF-16-LE and -BE must be interpreted as ZWNBSP.

> This would also support your use case, and in a better way. The > Unicode assertion that UTF-16 is BE by default is void these > days - there is *always* a higher layer protocol, and it more > often than not specifies (perhaps not in English words, but > only in the source code of the generator) that the default should > be LE.

I've checked the various versions of the Unicode standard docs: it seems that the quote you have was silently introduced between 3.0 and 4.0. Python currently uses version 3.2.0 of the standard and I don't think enough people are aware of the change in the standard to make a case for dropping the exception raising in the case of the UTF-16 codec finding a stream without a BOM mark. By the time we switch to 4.1 or later, we can then make the change in the native UTF-16 codec as you requested.
Personally, I think that the Unicode consortium should not have introduced a default for the UTF-16 encoding byte order. Using big endian as default in a world where most Unicode data is created on little endian machines is not very realistic either. Note that the UTF-16 codec starts reading data in the machine's native byte order and then learns a possibly different byte order by looking for BOMs. Implementing a codec which implements the 4.0 behavior is easy, though. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 07 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From stephen at xemacs.org Fri Apr 8 04:22:50 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri Apr 8 04:23:11 2005 Subject: [Python-Dev] Unicode byte order mark decoding In-Reply-To: <4255B065.9040101@egenix.com> (M.'s message of "Fri, 08 Apr 2005 00:12:53 +0200") References: <2fc759a7383fa61335e3e8e28fe880b9@uwaterloo.ca> <424DACDC.4080601@egenix.com> <87psx9eo56.fsf@tleepslib.sk.tsukuba.ac.jp> <42524643.3070604@v.loewis.de> <42526645.3010600@egenix.com> <2019f504df72a18fb04061248e3f55d8@opnet.com> <4254F86E.4000203@egenix.com> <6355f4b2429cfb6fa42cff5670a49ea3@opnet.com> <4255534A.2090505@egenix.com> <4255A85F.1080307@v.loewis.de> <4255B065.9040101@egenix.com> Message-ID: <87vf6y6m85.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "MvL" == "Martin v. Löwis" writes: MvL> This would also support your use case, and in a better way.
MvL> The Unicode assertion that UTF-16 is BE by default is void MvL> these days - there is *always* a higher layer protocol, and MvL> it more often than not specifies (perhaps not in English MvL> words, but only in the source code of the generator) that the MvL> default should be LE. That is _not_ a protocol. A protocol is a published specification, not merely a frequent accident of implementation. Anyway, both ISO 10646 and the Unicode standard consider that "internal use" and there is no requirement at all placed on those data. And such generators typically take great advantage of that freedom---have you looked in a .doc file recently? Have you noticed how many different options (previous implementations) of .doc are offered in the Import menu? >>>>> "MAL" == "M.-A. Lemburg" writes: MAL> I've checked the various versions of the Unicode standard MAL> docs: it seems that the quote you have was silently MAL> introduced between 3.0 and 4.0. Probably because ISO 10646 was _always_ BE until the standards were unified. But note that ISO 10646 standardizes only use as a communications medium. Neither ISO 10646 nor Unicode makes any specification about internal usage. Conformance in internal processing is a matter of the programmer's convenience in producing conforming output. MAL> Python currently uses version 3.2.0 of the standard and I MAL> don't think enough people are aware of the change in the MAL> standard There's only one (corporate) person that matters: Microsoft. MAL> By the time we switch to 4.1 or later, we can then make the MAL> change in the native UTF-16 codec as you requested. While in principle I sympathize with Nick, pragmatically Microsoft is unlikely to conform. They will take the position that files created by Windows are "internal" to the Windows environment, except where explicitly intended for exchange with arbitrary platforms, and only then will they conform. As Martin points out, that is what really matters for these defaults.
I think you should look to see what Microsoft does. MAL> Personally, I think that the Unicode consortium should not MAL> have introduced a default for the UTF-16 encoding byte MAL> order. Using big endian as default in a world where most MAL> Unicode data is created on little endian machines is not very MAL> realistic either. It's not a default for the UTF-16 encoding byte order. It's a default for the UTF-16 encoding byte order _when UTF-16 is a communications medium_. Given that the generic network byte order is bigendian, I think it would be insane to specify littleendian as Unicode's default. With Unicode same as network, you specify UTF-16 strings internally as an array of uint16_t, and when you put them on the wire (including saving them to a file that might be put on the wire as octet-stream) you apply htons(3) to it. On reading, you apply ntohs(3) to it. The source code is portable, the file is portable. How can you beat that? -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From python at rcn.com Thu Apr 7 16:58:11 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 8 04:58:25 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504051906.34590.fdrake@acm.org> Message-ID: <000301c53b82$3834d160$4baf958d@oemcomputer> Does anyone know what has become of the following developers and perhaps have their current email addresses? Are any of these folks still active in Python development? 
Ben Gertzfield Charles G Waldman Eric Price Finn Bock Ken Manheimer Moshe Zadka Raymond Hettinger From aleaxit at yahoo.com Fri Apr 8 05:50:53 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Fri Apr 8 05:50:59 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: On Apr 7, 2005, at 07:58, Raymond Hettinger wrote: > Does anyone know what has become of the following developers and > perhaps > have their current email addresses? Are any of these folks still > active > in Python development? > > Ben Gertzfield > Charles G Waldman > Eric Price > Finn Bock > Ken Manheimer > Moshe Zadka Moshe was at Pycon (sorry I didn't think of introducing you to each other!) so I do assume he's still active. Alex From greg.ewing at canterbury.ac.nz Fri Apr 8 07:03:42 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 8 07:03:59 2005 Subject: [Python-Dev] New style classes and operator methods Message-ID: <425610AE.5070605@canterbury.ac.nz> I think I've found a small flaw in the implementation of binary operator methods for new-style Python classes. If the left and right operands are of the same class, and the class implements a right operand method but not a left operand method, the right operand method is not called. Instead, two attempts are made to call the left operand method. I'm surmising this is because both calls are funnelled through the same C-level method, which is using the types of the operands to decide whether to call the left or right Python methods. I suppose this isn't really a serious problem, since it's easily worked around by always defining at least a left operand method. But I thought I'd point it out anyway. 
The following example illustrates the problem:

    class NewStyleSpam(object):
        def __add__(self, other):
            print "NewStyleSpam.__add__", self, other
            return NotImplemented
        def __radd__(self, other):
            print "NewStyleSpam.__radd__", self, other
            return 42

    x1 = NewStyleSpam()
    x2 = NewStyleSpam()
    print x1 + x2

which produces:

    NewStyleSpam.__add__ <__main__.NewStyleSpam object at 0x4019062c> <__main__.NewStyleSpam object at 0x4019056c>
    NewStyleSpam.__add__ <__main__.NewStyleSpam object at 0x4019062c> <__main__.NewStyleSpam object at 0x4019056c>
    Traceback (most recent call last):
      File "/home/cosc/staff/research/greg/tmp/foo.py", line 27, in ?
        print x1 + x2
    TypeError: unsupported operand type(s) for +: 'NewStyleSpam' and 'NewStyleSpam'

Old-style classes, on the other hand, work as expected:

    class OldStyleSpam:
        def __add__(self, other):
            print "OldStyleSpam.__add__", self, other
            return NotImplemented
        def __radd__(self, other):
            print "OldStyleSpam.__radd__", self, other
            return 42

    y1 = OldStyleSpam()
    y2 = OldStyleSpam()
    print y1 + y2

produces:

    OldStyleSpam.__add__ <__main__.OldStyleSpam instance at 0x4019054c> <__main__.OldStyleSpam instance at 0x401901ec>
    OldStyleSpam.__radd__ <__main__.OldStyleSpam instance at 0x401901ec> <__main__.OldStyleSpam instance at 0x4019054c>
    42

-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From fdrake at acm.org Fri Apr 8 15:31:38 2005 From: fdrake at acm.org (Fred Drake) Date: Fri Apr 8 15:32:02 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <200504080931.38652.fdrake@acm.org> On Thursday 07 April 2005 10:58, Raymond Hettinger wrote: > Eric Price Eric Price was an intern at CNRI; I think it's safe to remove him from the list, as I've not seen anything from him in a *long* time. -Fred -- Fred L. Drake, Jr. From jhylton at gmail.com Fri Apr 8 15:53:07 2005 From: jhylton at gmail.com (Jeremy Hylton) Date: Fri Apr 8 15:53:09 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504080931.38652.fdrake@acm.org> References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> Message-ID: On Apr 8, 2005 9:31 AM, Fred Drake wrote: > On Thursday 07 April 2005 10:58, Raymond Hettinger wrote: > > Eric Price > > Eric Price was an intern at CNRI; I think it's safe to remove him from the > list, as I've not seen anything from him in a *long* time. Eric Price did some of the work on the decimal package, which was only two summers ago. He wasn't an intern at CNRI. Jeremy From eyal.lotem at gmail.com Fri Apr 8 16:01:02 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Fri Apr 8 16:01:05 2005 Subject: [Python-Dev] Security capabilities in Python Message-ID: I would like to experiment with security based on Python references as security capabilities. Unfortunately, there are several problems that make Python references invalid as capabilities: * There is no way to create secure proxies because there are no private attributes. * Lots of Python objects are reachable unnecessarily, breaking the principle of least privilege (i.e: object.__subclasses__() etc.)
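Eyal's second point is easy to demonstrate: any code that can name `object` can walk the type hierarchy to every class in the interpreter, so by default nothing is unreachable. A quick illustration (the class name here is hypothetical, not part of his proposal):

```python
class PrivateThing(object):
    """A class its author might wish were unreachable from outside."""
    secret = "s3kr1t"

# Any code at all can rediscover it through object's subclass list,
# since object.__subclasses__() enumerates every direct subclass:
found = [cls for cls in object.__subclasses__()
         if cls.__name__ == 'PrivateThing']
assert found and found[0].secret == "s3kr1t"
```

This is exactly the kind of ambient authority a capability system has to cut off, since holding a reference to `object` is unavoidable.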
I was wondering if any such effort has already begun or if there are other considerations making Python unusable as a capability platform? (Please cc the reply to my email) From fdrake at acm.org Fri Apr 8 16:02:18 2005 From: fdrake at acm.org (Fred Drake) Date: Fri Apr 8 16:02:25 2005 Subject: [Python-Dev] Developer list update In-Reply-To: References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> Message-ID: <200504081002.18073.fdrake@acm.org> On Friday 08 April 2005 09:53, Jeremy Hylton wrote: > Eric Price did some of the work on the decimal package, which was only > two summers ago. He wasn't an intern at CNRI. A different Eric Price, then. Mea culpa. (Or am I misremembering the intern's name? Hmm.) -Fred -- Fred L. Drake, Jr. From jim at zope.com Fri Apr 8 16:45:22 2005 From: jim at zope.com (Jim Fulton) Date: Fri Apr 8 16:45:31 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: <42569902.9030307@zope.com> You might take a look at zope.security: http://svn.zope.org/Zope3/trunk/src/zope/security/ It isn't a capability-based system, but it does address similar problems and might have some useful ideas. See the README.txt and untrustedinterpreter.txt. Jim Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. > > Unfortunately, there are several problems that make Python references > invalid as capabilities: > > * There is no way to create secure proxies because there are no > private attributes. > * Lots of Python objects are reachable unnecessarily, breaking the > principle of least privilege (i.e: object.__subclasses__() etc.) > > I was wondering if any such effort has already begun or if there are > other considerations making Python unusable as a capability platform?
> > (Please cc the reply to my email) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jim%40zope.com -- Jim Fulton mailto:jim@zope.com Python Powered! CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From barry at python.org Fri Apr 8 16:58:28 2005 From: barry at python.org (Barry Warsaw) Date: Fri Apr 8 16:58:34 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <1112972308.19892.6.camel@geddy.wooz.org> On Thu, 2005-04-07 at 10:58, Raymond Hettinger wrote: > Ben Gertzfield Ben did a lot of work on the i18n parts of the email package. I haven't heard from him in quite a while. > Ken Manheimer Ken's still around. I'll send you his current email address in a separate (pvt) message. -Barry From tim.peters at gmail.com Fri Apr 8 19:01:33 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 19:01:37 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <200504051906.34590.fdrake@acm.org> <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <1f7befae05040810013a338acc@mail.gmail.com> [Raymond Hettinger] > Does anyone know what has become of the following developers and perhaps > have their current email addresses?
How about we exploit that if someone is a Python developer on SF, they necessarily have an SF email address ($(SFNAME)@users.sourceforge.net, like I'm tim_one@users.sourceforge.net)? Then, IMO, if someone with SF commit privs can't be reached via their SF address, they shouldn't have SF commit privs. From tim.peters at gmail.com Fri Apr 8 19:05:41 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 19:05:47 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <200504081002.18073.fdrake@acm.org> References: <000301c53b82$3834d160$4baf958d@oemcomputer> <200504080931.38652.fdrake@acm.org> <200504081002.18073.fdrake@acm.org> Message-ID: <1f7befae05040810052e3d40df@mail.gmail.com> [Jeremy] >> Eric Price did some of the work on the decimal package, which was only >> two summers ago. He wasn't an intern at CNRI. [Fred] > A different Eric Price, then. Mea culpa. > > (Or am I misremembering the intern's name? Hmm.) Yes, Eric Price was "the PythonLabs intern", for the brief time that lasted. I'll add info about him to developers.txt. He was given SF developer status specifically to work on the decimal module, which then lived in the Python sandbox. There isn't a reason for him to remain a developer. From python at rcn.com Fri Apr 8 09:02:42 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 8 21:02:56 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1f7befae05040810013a338acc@mail.gmail.com> Message-ID: <000101c53c08$f5b4d100$d122a044@oemcomputer> > [Raymond Hettinger] > > Does anyone know what has become of the following developers and perhaps > > have their current email addresses? [Tim Peters] > How about we exploit that if someone is a Python developer on SF, they > necessarily have an SF email address ($(SFNAME)@users.sourceforge.net, > like I'm tim_one@users.sourceforge.net)? I used those addresses and sent notes to everyone who hasn't made a recent checkin. 
For the most part, we've gotten lots of cheerful responses (with one notable exception) indicating a continuing use for the checkin privs. A few people no longer have a use for the access and I'm recording those as we go. > Then, IMO, if someone with SF commit privs can't be reached via their > SF address, they shouldn't have SF commit privs. I'm taking a lighter approach and making every effort to get in contact. If they respond, I'll ask them to update their SF address. Raymond From tim.peters at gmail.com Fri Apr 8 21:54:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 21:55:05 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c53c08$f5b4d100$d122a044@oemcomputer> References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae05040812546478a677@mail.gmail.com> ... [Uncle "Bad Cop" Timmy] >> Then, IMO, if someone with SF commit privs can't be reached via their >> SF address, they shouldn't have SF commit privs. [Raymond "Good Cop" Hettinger] > I'm taking a lighter approach and making every effort to get in contact. > If they respond, I'll ask them to update their SF address. Of course! I would too, if I were you. But given that I'm still me, the annotated attributions above should clarify the role I'm playing here <0.9 wink>. From tim.peters at gmail.com Fri Apr 8 21:54:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Fri Apr 8 21:57:30 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000101c53c08$f5b4d100$d122a044@oemcomputer> References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae05040812546478a677@mail.gmail.com> ... [Uncle "Bad Cop" Timmy] >> Then, IMO, if someone with SF commit privs can't be reached via their >> SF address, they shouldn't have SF commit privs. [Raymond "Good Cop" Hettinger] > I'm taking a lighter approach and making every effort to get in contact. 
> If they respond, I'll ask them to update their SF address. Of course! I would too, if I were you. But given that I'm still me, the annotated attributions above should clarify the role I'm playing here <0.9 wink>.

From tjreedy at udel.edu Fri Apr 8 22:26:36 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Fri Apr 8 22:28:21 2005 Subject: [Python-Dev] Re: Security capabilities in Python References: Message-ID: "Eyal Lotem" wrote in message news:b64f365b0504080701206af8d3@mail.gmail.com... >I would like to experiment with security based on Python references as > security capabilities. I am pretty sure that there was a prolonged discussion on Python, security, and capability on this list a year or two ago. Perhaps you can find it in the summary archives or the archives themselves. tjr

From skip at pobox.com Fri Apr 8 22:30:09 2005 From: skip at pobox.com (Skip Montanaro) Date: Fri Apr 8 22:30:19 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <000301c53b82$3834d160$4baf958d@oemcomputer> References: <200504051906.34590.fdrake@acm.org> <000301c53b82$3834d160$4baf958d@oemcomputer> Message-ID: <16982.59857.373473.929701@montanaro.dyndns.org> Raymond> Does anyone know what has become of ... Raymond> Charles G Waldman I'd scratch Charles from the list. I work at the same company he did. Nobody here has been in touch with him for over a year. Several of us have tried to get ahold of him but to no avail. Skip

From greg at electricrain.com Fri Apr 8 23:42:18 2005 From: greg at electricrain.com (Gregory P.
-- Scott David Daniels Scott.Daniels@Acm.Org From fredrik at pythonware.com Sat Apr 9 01:32:21 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 01:32:33 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: Message-ID: Scott David Daniels wrote: > What should marshal / unmarshal do with floating point NaNs (the case we > are worrying about is Infinity) ? The current behavior is not perfect. > > Michael Spencer chased down a supposed "Idle" problem to (on Win2k): > marshal.dumps(1e10000) == 'f\x061.#INF' > marshal.loads('f\x061.#INF') == 1.0 > > Should loads raise an exception? > Somehow, I thing 1.0 is not the best possible representation for +Inf. looks like marshal uses atof to parse the string, without bothering to check for trailing junk... it should probably use a strtod instead, and raise an exception if there's enough junk left at the end (see PyFloat_ FromString for sample code). fwiw, here's what I get on a linux box: >>> import marshal >>> marshal.dumps(1e10000) 'f\x03inf' >>> marshal.loads(_) inf and yes, someone should fix the NaN mess, but I guess everyone's too busy removing unworthy developers from sourceforge to bother working on stuff that's actually useful for real Python users... From tim.peters at gmail.com Sat Apr 9 01:38:24 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Apr 9 01:38:28 2005 Subject: [Python-Dev] marshal / unmarshal In-Reply-To: References: Message-ID: <1f7befae0504081638145d3b4c@mail.gmail.com> [Scott David Daniels] > What should marshal / unmarshal do with floating point NaNs (the case we > are worrying about is Infinity) ? The current behavior is not perfect. All Python behavior in the presence of a NaN, infinity, or signed zero is a platform-dependent accident. This is because C89 has no such concepts, and Python is written to the C89 standard. 
It's not easy to fix across all platforms (because there is no portable way to do so in standard C), although it may be reasonably easy to fix if all anyone cares about is gcc and MSVC (every platform C compiler has its own set of gimmicks for "dealing with" these things). If marshal could reliably detect a NaN, then of course unmarshal should reliably reproduce the NaN -- provided the platform on which it's unpacked supports NaNs. > Should loads raise an exception? Never for a quiet NaN, unless the platform doesn't support NaNs. It's harder to know what to with a signaling NaN, because Python doesn't have any of 754's trap-enable or exception status flags either (the new ``decimal`` module does, but none of that is integrated with the _rest_ of Python yet). Should note that what the fp literal 1e10000 does across boxes is also an accident -- Python defers to the platform C libraries for string<->float conversions. From fredrik at pythonware.com Sat Apr 9 01:37:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 01:43:17 2005 Subject: [Python-Dev] Re: hierarchicial named groups extension to the re library References: Message-ID: wrote: > (ie. the re library only returns the ~last~ match for named groups - not > a list of ~all~ the matches for the named groups. And the hierarchy of those named groups is non-existant in the flat dictionary of matches > that results. ) are you 100% sure that this can be implemented on top of other RE engines (CPython isn't the only Python implementation out there). (generally speaking, trying to turn an RE engine into a parser is a lousy idea. the library would benefit more from a simple parser toolkit than it benefits from more non-standard and highly specialized RE hacks...) 
From tim.peters at gmail.com Sat Apr 9 02:06:06 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sat Apr 9 02:06:10 2005 Subject: [Python-Dev] Re: Developer list update In-Reply-To: References: <1f7befae05040810013a338acc@mail.gmail.com> <000101c53c08$f5b4d100$d122a044@oemcomputer> Message-ID: <1f7befae0504081706160c3fa@mail.gmail.com> [Raymond Hettinger wrote: >> I used those addresses and sent notes to everyone who hasn't made a >> recent checkin. [Fredrik Lundh] > where recent obviously was defined as "after 2.4" for checkins, and "last week" > for tracker activities. Raymond didn't mention tracker activity above, and that's a different issue -- it's possible now to separate commit privileges from tracker privileges on SourceForge. Like it or not (I think I can guess which), every person with commit privs implies at least one box that can become a security hole, and at least 5 people who in fact never commit anymore were agreeable to giving up SF developer privs. > python-dev was a lot more fun in the old days. Ya, but you were too -- and so was I. I expect these all go together, given that (the collective) we _are_ python-dev. So what have you been up to lately? Skip it unless the answer's fun . From tjreedy at udel.edu Sat Apr 9 03:59:32 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Sat Apr 9 03:59:40 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: "Tim Peters" wrote in message news:1f7befae0504081638145d3b4c@mail.gmail.com... > All Python behavior in the presence of a NaN, infinity, or signed zero > is a platform-dependent accident. The particular issue here is not platform dependence as such but within-platform usage dependence, as in the same code giving radically different answers in a standard interactive console window and an idle window, or when you run it the first time (from xx.py) versus subsequent times (from xx.pyc) until you edit the file again. 
(I verified this on 2.2, but MSpencer claimed to have tested on 2.4). Having the value of an expression such as '100 < 1e1000' flip back and forth between True and False from run to run *is* distressing for some people ;-). I know that this has come up before as 'wont fix' bug, but it might be better to have invalid floats like 1e1000, etc, not compile and raise an exception (at least on Windows) instead of breaking the reasonable expectation that unmarshal(marshal(codeob)) == codeob. That would force people (at least on Windows) to do something more more within-platform deterministic. >If marshal could reliably detect a NaN, then of course unmarshal >should reliably reproduce the NaN -- provided the platform on which >it's unpacked supports NaNs Windows seems to support +- INF just fine, doing arithmetic and comparisons 'correctly'. So it seems that detection or reproduction is the problem. Terry J. Reedy From Scott.Daniels at Acm.Org Sat Apr 9 04:20:30 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat Apr 9 04:20:51 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: Terry Reedy wrote: > "Tim Peters" wrote in message > news:1f7befae0504081638145d3b4c@mail.gmail.com... > >>All Python behavior in the presence of a NaN, infinity, or signed zero >>is a platform-dependent accident. > > > The particular issue here is not platform dependence as such but > within-platform usage dependence, as in the same code giving radically > different answers in a standard interactive console window and an idle > window, or when you run it the first time (from xx.py) versus subsequent > times (from xx.pyc) until you edit the file again. (I verified this on 2.2, > but MSpencer claimed to have tested on 2.4). Having the value of an > expression such as '100 < 1e1000' flip back and forth between True and > False from run to run *is* distressing for some people ;-). 
> > I know that this has come up before as 'wont fix' bug, but it might be > better to have invalid floats like 1e1000, etc, not compile and raise an > exception (at least on Windows) instead of breaking the reasonable > expectation that unmarshal(marshal(codeob)) == codeob. That would force > people (at least on Windows) to do something more more within-platform > deterministic. > > >>If marshal could reliably detect a NaN, then of course unmarshal >>should reliably reproduce the NaN -- provided the platform on which >>it's unpacked supports NaNs > > > Windows seems to support +- INF just fine, doing arithmetic and comparisons > 'correctly'. So it seems that detection or reproduction is the problem. > > Terry J. Reedy > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > I can write the Windows-dependent detect code if that is what is wanted. I just want to know what the consensus is on the "should." If we cause exceptions, should they be one encode or decode or both? If not, do we replicate all NaNs, Infs of both signs, Indeterminates?.... --Scott David Daniels Scott.Daniels@Acm.Org From python-dev at zesty.ca Sat Apr 9 07:13:40 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 07:13:45 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: On Fri, 8 Apr 2005, Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. This is an interesting and worthwhile thought. Several people (including myself) have talked about the possibility of doing this in the past. I believe the two problems you mention can be addressed without modifying the Python core. > * There is no way to create secure proxies because there are no > private attributes. 
Attributes are not private, but local variables are. If you use lexical scoping to restrict variable access (as one would in Scheme, E, etc.) you can create secure proxies. See below.

> * Lots of Python objects are reachable unnecessarily breaking the
> principle of least privelege (i.e: object.__subclasses__() etc.)

True. However, Python's restricted execution mode prevents access to these attributes, allowing you to enforce encapsulation. (At least, that is part of the intent of restricted execution mode, though currently we do not make official guarantees about it.) Replacing __builtins__ activates restricted execution mode.

Here is a simple facet function.

    def facet(target, allowed_attrs):
        class Facet:
            def __repr__(self):
                return '' % (allowed_attrs, target)
            def __getattr__(self, name):
                if name in allowed_attrs:
                    return getattr(target, name)
                raise NameError(name)
        return Facet()

    def restrict():
        global __builtins__
        __builtins__ = __builtins__.__dict__.copy()

    # Here's an example.

    list = [1, 2, 3]
    immutable_facet = facet(list, ['__getitem__', '__len__', '__iter__'])

    # Here's another example.

    class Counter:
        def __init__(self):
            self.n = 0

        def increment(self):
            self.n += 1

        def value(self):
            return self.n

    counter = Counter()
    readonly_facet = facet(counter, ['value'])

If i've done this correctly, it should be impossible to alter the contents of the list or the counter, given only the immutable_facet or the readonly_facet, after restrict() has been called. (Try it out and let me know if you can poke holes in it...)

The upshot of all this is that i think you can do secure programming in Python if you just use a different style. Unfortunately, this style is incompatible with the way classes are usually written in Python, which means you can't safely use much of the standard library, but i believe the language itself is not fatally flawed.
-- ?!ng From fredrik at pythonware.com Sat Apr 9 07:32:37 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 07:32:37 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: Tim Peters wrote: > All Python behavior in the presence of a NaN, infinity, or signed zero > is a platform-dependent accident. This is because C89 has no such > concepts, and Python is written to the C89 standard. It's not easy to > fix across all platforms (because there is no portable way to do so in > standard C), although it may be reasonably easy to fix if all anyone > cares about is gcc and MSVC which probably represents very close to 100% of all python interpreter instances out there. making floats behave the same on standard builds for windows, mac os x, and linux would be a great step forward. +1.0 from me. >> Should loads raise an exception? > > Never for a quiet NaN, unless the platform doesn't support NaNs. It's > harder to know what to with a signaling NaN, because Python doesn't > have any of 754's trap-enable or exception status flags either (the > new ``decimal`` module does, but none of that is integrated with the > _rest_ of Python yet). > > Should note that what the fp literal 1e10000 does across boxes is also > an accident -- Python defers to the platform C libraries for > string<->float conversions. yeah, but the problem here is that MSVC cannot read its own NaN:s; float() checks for that, but loads doesn't. compare and contrast: >>> float(str(1e10000)) Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): 1.#INF >>> import marshal >>> marshal.loads(marshal.dumps(1e10000)) 1.0 on the other hand, >>> marshal.loads("\f\x01x") Traceback (most recent call last): File "", line 1, in ? 
ValueError: bad marshal data adding basic error checking shouldn't be very hard (you could probably call the string->float converter in the float object module, and just map any exceptions to "bad marshal data") From steve at holdenweb.com Sat Apr 9 08:37:15 2005 From: steve at holdenweb.com (Steve Holden) Date: Sat Apr 9 08:37:36 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: Message-ID: <4257781B.4050704@holdenweb.com> Fredrik Lundh wrote: [...] > > and yes, someone should fix the NaN mess, but I guess everyone's too > busy removing unworthy developers from sourceforge to bother working > on stuff that's actually useful for real Python users... > That's not at all true. Some of us are busy giving up commit privileges in order to avoid the impression that we might one day work on stuff that actually useful to real Python users. Except, possibly, conferences. The effbot is at least averagely cantankerous this month :-) unworthi-ly y'rs - steve -- Steve Holden +1 703 861 4237 +1 800 494 3119 Holden Web LLC http://www.holdenweb.com/ Python Web Programming http://pydish.holdenweb.com/ From exarkun at divmod.com Sat Apr 9 11:02:23 2005 From: exarkun at divmod.com (Jp Calderone) Date: Sat Apr 9 11:02:32 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: Message-ID: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> On Sat, 9 Apr 2005 00:13:40 -0500 (CDT), Ka-Ping Yee wrote: >On Fri, 8 Apr 2005, Eyal Lotem wrote: > > I would like to experiment with security based on Python references as > > security capabilities. > > This is an interesting and worthwhile thought. Several people > (including myself) have talked about the possibility of doing > this in the past. I believe the two problems you mention can be > addressed without modifying the Python core. > > > * There is no way to create secure proxies because there are no > > private attributes. > > Attributes are not private, but local variables are. 
If you use > lexical scoping to restrict variable access (as one would in > Scheme, E, etc.) you can create secure proxies. See below. > > > * Lots of Python objects are reachable unnecessarily breaking the > > principle of least privelege (i.e: object.__subclasses__() etc.) > > True. However, Python's restricted execution mode prevents access > to these attributes, allowing you to enforce encapsulation. (At > least, that is part of the intent of restricted execution mode, > though currently we do not make official guarantees about it.) > Replacing __builtins__ activates restricted execution mode. > > Here is a simple facet function. > > def facet(target, allowed_attrs): > class Facet: > def __repr__(self): > return '' % (allowed_attrs, target) > def __getattr__(self, name): > if name in allowed_attrs: > return getattr(target, name) > raise NameError(name) > return Facet() > > def restrict(): > global __builtins__ > __builtins__ = __builtins__.__dict__.copy() > > # Here's an example. > > list = [1, 2, 3] > immutable_facet = facet(list, ['__getitem__', '__len__', '__iter__']) > > # Here's another example. > > class Counter: > def __init__(self): > self.n = 0 > > def increment(self): > self.n += 1 > > def value(self): > return self.n > > counter = Counter() > readonly_facet = facet(counter, ['value']) > > If i've done this correctly, it should be impossible to alter the > contents of the list or the counter, given only the immutable_facet > or the readonly_facet, after restrict() has been called. > > (Try it out and let me know if you can poke holes in it...) > > The upshot of all this is that i think you can do secure programming > in Python if you just use a different style. Unfortunately, this > style is incompatible with the way classes are usually written in > Python, which means you can't safely use much of the standard library, > but i believe the language itself is not fatally flawed. > Does using the gc module to bypass this security count? 
If so: exarkun@boson:~$ python -i facet.py >>> import gc >>> c = readonly_facet.__getattr__.func_closure[1] >>> r = gc.get_referents(c)[0] >>> r.n = 'hax0r3d' >>> readonly_facet.value() 'hax0r3d' >>> This is the easiest way of which I know to bypass the use of cells as a security mechanism. I believe there are other more involved (and fragile, probably) ways, though. Jp From martin at v.loewis.de Sat Apr 9 13:12:33 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Apr 9 13:12:37 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> Message-ID: <4257B8A1.6000902@v.loewis.de> Terry Reedy wrote: > The particular issue here is not platform dependence as such but > within-platform usage dependence, as in the same code giving radically > different answers in a standard interactive console window and an idle > window, or when you run it the first time (from xx.py) versus subsequent > times (from xx.pyc) until you edit the file again. Yet, this *still* is a platform dependence. Python makes no guarantee that 1e1000 is a supported float literal on any platform, and indeed, on your platform, 1e1000 is not supported on your platform. Furthermore, Python makes no guarantee that it will report when an unsupported float-literal is found, so you just get different behaviour, by accident. This, in turn, is a violation of the principle "errors should never pass silently". Alas, nobody found the time to detect the error, yet. Just don't do that, then. Regards, Martin From skip at pobox.com Sat Apr 9 14:30:04 2005 From: skip at pobox.com (Skip Montanaro) Date: Sat Apr 9 14:30:07 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <4257B8A1.6000902@v.loewis.de> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> Message-ID: <16983.51916.407019.489590@montanaro.dyndns.org> Martin> Yet, this *still* is a platform dependence. 
Python makes no Martin> guarantee that 1e1000 is a supported float literal on any Martin> platform, and indeed, on your platform, 1e1000 is not supported Martin> on your platform. Are float("inf") and float("nan") supported everywhere? I don't have ready access to a Windows machine, but on the couple Linux and MacOS machines at-hand they are. As a starting point can it be agreed on whether they should be supported? (There is a unique IEEE-754 representation for both values, right? Should we try and support any other floating point format?) If so, the float("1e10000") == float("inf") in all cases, right? If not, then Python's lexer should be trained to know what out-of-range floats are and complain when it encounters them. In either case, we should then know how to fix marshal.loads (and probably pickle.loads). That seems like it would be a start in the right direction. Skip From fredrik at pythonware.com Sat Apr 9 14:53:15 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 14:54:00 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Skip Montanaro wrote: > Are float("inf") and float("nan") supported everywhere? nope. >>> float("inf") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): inf >>> float("nan") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): nan >>> 1e10000 1.#INF >>> float("1.#INF") Traceback (most recent call last): File "", line 1, in ? ValueError: invalid literal for float(): 1.#INF > As a starting point can it be agreed on whether they should be supported? that would be nice. > In either case, we should then know how to fix marshal.loads (and probably > pickle.loads). pickle doesn't have the INF=>1.0 bug: >>> import pickle >>> pickle.loads(pickle.dumps(1e10000)) ... 
ValueError: invalid literal for float(): 1.#INF >>> import cPickle >>> cPickle.loads(cPickle.dumps(1e10000)) ... ValueError: could not convert string to float >>> import marshal >>> marshal.loads(marshal.dumps(1e10000)) 1.0 From martin at v.loewis.de Sat Apr 9 15:32:06 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat Apr 9 15:32:09 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <16983.51916.407019.489590@montanaro.dyndns.org> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: <4257D956.80402@v.loewis.de> Skip Montanaro wrote: > Martin> Yet, this *still* is a platform dependence. Python makes no > Martin> guarantee that 1e1000 is a supported float literal on any > Martin> platform, and indeed, on your platform, 1e1000 is not supported > Martin> on your platform. > > Are float("inf") and float("nan") supported everywhere? I would not expect that, but Tim will correct me if I'm wrong. > As a starting point can it be agreed on whether they > should be supported? (There is a unique IEEE-754 representation for both > values, right? Perhaps yes for inf, but I think maybe no for nan. There are multiple IEEE-754 representations for NaN. However, I understand all NaN are meant to compare unequal - even if they use the same representation. > If so, the float("1e10000") == float("inf") in all cases, right? Currently, not necessarily: if a large-enough exponent is supported (which might be the case with a IEEE "long double", dunno), 1e10000 would be a regular value. > That seems like it would be a start in the right direction. Pieces of it would be a start in the right direction. 
Regards, Martin From fredrik at pythonware.com Sat Apr 9 18:36:48 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 18:36:57 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: > pickle doesn't have the INF=>1.0 bug: > >>>> import pickle >>>> pickle.loads(pickle.dumps(1e10000)) > ... > ValueError: invalid literal for float(): 1.#INF > >>>> import cPickle >>>> cPickle.loads(cPickle.dumps(1e10000)) > ... > ValueError: could not convert string to float > >>>> import marshal >>>> marshal.loads(marshal.dumps(1e10000)) > 1.0 should I check in a fix for this? the code in PyFloat_FromString contains lots of trickery to deal with more or less broken literals, and more or less broken C libraries. unfortunately, and unlike most other functions with similar names, PyFloat_FromString takes a Python object, not a char pointer. would it be a good idea to add a variant that takes a char*? if so, should PyFloat_FromString use the new function, or are we avoiding that kind of refactoring for speed reasons these days? any opinions? From fredrik at pythonware.com Sat Apr 9 19:43:26 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sat Apr 9 19:44:00 2005 Subject: [Python-Dev] Re: Security capabilities in Python References: Message-ID: Ka-Ping wrote: > counter = Counter() > readonly_facet = facet(counter, ['value']) > > If i've done this correctly, it should be impossible to alter the > contents of the list or the counter, given only the immutable_facet > or the readonly_facet, after restrict() has been called. I'm probably missing something, but a straightforward reflection approach seems to work on my machine: >>> restrict() >>> readonly_facet = facet(counter, ['value']) >>> print readonly_facet.value() 0 >>> readonly_facet.value.im_self.n = "oops!" >>> print readonly_facet.value() oops! 
>>> class mycounter: ... def value(self): return "muhaha!" ... >>> readonly_facet.value.im_self.__class__ = mycounter >>> print readonly_facet.value() muhaha! ... >>> readonly_facet.value.im_func.func_globals["readonly_facet"] = myinstance ... and so on does that restrict() function really do the right thing, or is my python install broken? From mwh at python.net Sat Apr 9 20:13:04 2005 From: mwh at python.net (Michael Hudson) Date: Sat Apr 9 20:13:05 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> (Jp Calderone's message of "Sat, 09 Apr 2005 09:02:23 GMT") References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <2mhdif7r9r.fsf@starship.python.net> Jp Calderone writes: > Does using the gc module to bypass this security count? If so: > > exarkun@boson:~$ python -i facet.py > >>> import gc > >>> c = readonly_facet.__getattr__.func_closure[1] > >>> r = gc.get_referents(c)[0] > >>> r.n = 'hax0r3d' > >>> readonly_facet.value() > 'hax0r3d' > >>> > > This is the easiest way of which I know to bypass the use of cells > as a security mechanism. I believe there are other more involved > (and fragile, probably) ways, though. The funniest I know is part of PyPy: def extract_cell_content(c): """Get the value contained in a CPython 'cell', as read through the func_closure of a function object.""" # yuk! this is all I could come up with that works in Python 2.2 too class X(object): def __eq__(self, other): self.other = other x = X() x_cell, = (lambda: x).func_closure x_cell == c return x.other It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this process became impossible. Cheers, mwh -- Java sucks. [...] Java on TV set top boxes will suck so hard it might well inhale people from off their sofa until their heads get wedged in the card slots. 
--- Jon Rabone, ucam.chat From mwh at python.net Sat Apr 9 20:15:46 2005 From: mwh at python.net (Michael Hudson) Date: Sat Apr 9 22:17:09 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <20050408214218.GE24751@zot.electricrain.com> (Gregory P. Smith's message of "Fri, 8 Apr 2005 14:42:18 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> Message-ID: <2md5t37r59.fsf@starship.python.net> "Gregory P. Smith" writes: >> > Under "Limitations and Exclusions" it specifically disowns >> > responsibility for worrying about whether Py_Initialize() and >> > PyEval_InitThreads() have been called: >> > >> [snip quote] >> >> This suggests that I should call PyEval_InitThreads() in >> initreadline(), which seems daft. > > fwiw, Modules/_bsddb.c does exactly that. Interesting. The problem with readline.c doing this is that it gets implicitly imported by the interpreter -- although only for interactive sessions. Maybe that's not that big a deal. I'd still prefer to change the functions (would updating the PEP be in order here? Obviously, I'd update the api documentation). Cheers, mwh -- It's relatively seldom that desire for sex is involved in technology procurement decisions. -- ESR at EuroPython 2002 From bob at redivi.com Sat Apr 9 22:54:30 2005 From: bob at redivi.com (Bob Ippolito) Date: Sat Apr 9 22:54:35 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2md5t37r59.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> Message-ID: <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> On Apr 9, 2005, at 11:15 AM, Michael Hudson wrote: > "Gregory P. 
Smith" writes: > >>>> Under "Limitations and Exclusions" it specifically disowns >>>> responsibility for worrying about whether Py_Initialize() and >>>> PyEval_InitThreads() have been called: >>>> >>> [snip quote] >>> >>> This suggests that I should call PyEval_InitThreads() in >>> initreadline(), which seems daft. >> >> fwiw, Modules/_bsddb.c does exactly that. > > Interesting. The problem with readline.c doing this is that it gets > implicitly imported by the interpreter -- although only for > interactive sessions. Maybe that's not that big a deal. I'd still > prefer to change the functions (would updating the PEP be in order > here? Obviously, I'd update the api documentation). Is there a good reason to *not* call PyEval_InitThreads when using a threaded Python? Sounds like it would just be easier to implicitly call it during Py_Initialize some day. -bob From python-dev at zesty.ca Sat Apr 9 22:56:46 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 22:56:55 2005 Subject: [Python-Dev] Re: Security capabilities in Python In-Reply-To: References: Message-ID: On Sat, 9 Apr 2005, Fredrik Lundh wrote: > Ka-Ping wrote: > > counter = Counter() > > readonly_facet = facet(counter, ['value']) > > > > If i've done this correctly, it should be impossible to alter the > > contents of the list or the counter, given only the immutable_facet > > or the readonly_facet, after restrict() has been called. > > I'm probably missing something, but a straightforward reflection > approach seems to work on my machine: That's funny. After i called restrict() Python didn't let me get im_self. >>> restrict() >>> readonly_facet.value > >>> readonly_facet.value.im_self Traceback (most recent call last): File "", line 1, in ? RuntimeError: restricted attribute >>> It doesn't matter if i make the facet before or after restrict(). >>> restrict() >>> rf2 = facet(counter, ['value']) >>> rf2.value > >>> rf2.value.im_self Traceback (most recent call last): File "", line 1, in ? 
RuntimeError: restricted attribute >>> I'm using Python 2.3 (#1, Sep 13 2003, 00:49:11) [GCC 3.3 20030304 (Apple Computer, Inc. build 1495)] on darwin -- ?!ng From foom at fuhm.net Sat Apr 9 22:58:54 2005 From: foom at fuhm.net (James Y Knight) Date: Sat Apr 9 22:59:06 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <2mhdif7r9r.fsf@starship.python.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> Message-ID: <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> On Apr 9, 2005, at 2:13 PM, Michael Hudson wrote: > The funniest I know is part of PyPy: > > def extract_cell_content(c): > """Get the value contained in a CPython 'cell', as read through > the func_closure of a function object.""" > # yuk! this is all I could come up with that works in Python 2.2 > too > class X(object): > def __eq__(self, other): > self.other = other > x = X() > x_cell, = (lambda: x).func_closure > x_cell == c > return x.other > > It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this > process became impossible. It would be quite fortunate if you didn't have to do all that, and cell just had a "value" attribute, though. James From python-dev at zesty.ca Sat Apr 9 23:37:34 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 23:37:41 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: On Sat, 9 Apr 2005, Jp Calderone wrote: > Does using the gc module to bypass this security count? If so: > > exarkun@boson:~$ python -i facet.py > >>> import gc > >>> c = readonly_facet.__getattr__.func_closure[1] > >>> r = gc.get_referents(c)[0] > >>> r.n = 'hax0r3d' > >>> readonly_facet.value() > 'hax0r3d' > >>> You can't get func_closure in restricted mode. (Or at least, i can't, using the Python included with Mac OS 10.3.8.) 
>>> restrict() >>> readonly_facet.__getattr__.func_closure Traceback (most recent call last): File "", line 1, in ? RuntimeError: restricted attribute >>> Even though this particular example doesn't work in restricted mode, it's true that the gc module violates capability discipline, and you would have to forbid its import. In any real use case, you would have to restrict imports anyway to prevent access to sys.modules or loading of arbitrary binaries. For a version that restricts imports, see: http://zesty.ca/python/facet.py Let me know if you figure out how to defeat that. (This is a fun exercise, but with a potential purpose -- it would be nice to have a coherent story on this for Python 3000, or maybe even Python 2.x.) -- ?!ng From python-dev at zesty.ca Sat Apr 9 23:46:11 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 9 23:46:16 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <2mhdif7r9r.fsf@starship.python.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> Message-ID: On Sat, 9 Apr 2005, Michael Hudson wrote: > The funniest I know is part of PyPy: > > def extract_cell_content(c): > """Get the value contained in a CPython 'cell', as read through > the func_closure of a function object.""" > # yuk! this is all I could come up with that works in Python 2.2 too > class X(object): > def __eq__(self, other): > self.other = other > x = X() > x_cell, = (lambda: x).func_closure > x_cell == c > return x.other That's pretty amazing. > It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this > process became impossible. Not a problem. func_closure is already a restricted attribute. IMHO, the clean way to do this is to provide a built-in function to get the cell content in a more direct and reliable way, and then put that in a separate module with other interpreter hacks. 
That both makes it easier to do stuff like this, and easier to prevent it simply by forbidding import of that module. -- ?!ng From pedronis at strakt.com Sat Apr 9 23:50:48 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Sat Apr 9 23:49:19 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <42584E38.6060203@strakt.com> Ka-Ping Yee wrote: > On Sat, 9 Apr 2005, Jp Calderone wrote: > >> Does using the gc module to bypass this security count? If so: >> >> exarkun@boson:~$ python -i facet.py >> >>> import gc >> >>> c = readonly_facet.__getattr__.func_closure[1] >> >>> r = gc.get_referents(c)[0] >> >>> r.n = 'hax0r3d' >> >>> readonly_facet.value() >> 'hax0r3d' >> >>> > > > You can't get func_closure in restricted mode. (Or at least, i can't, > using the Python included with Mac OS 10.3.8.) > > >>> restrict() > >>> readonly_facet.__getattr__.func_closure > Traceback (most recent call last): > File "", line 1, in ? > RuntimeError: restricted attribute > >>> > > Even though this particular example doesn't work in restricted mode, > it's true that the gc module violates capability discipline, and you > would have to forbid its import. In any real use case, you would have > to restrict imports anyway to prevent access to sys.modules or loading > of arbitrary binaries. > > For a version that restricts imports, see: > > http://zesty.ca/python/facet.py > > Let me know if you figure out how to defeat that. you should probably search the list and look at my old attacks against restricted execution, there's reason why is not much supported anymore. One can still try to use it but needs to be extremely careful or use C defined proxies... etc. > > (This is a fun exercise, but with a potential purpose -- it would be > nice to have a coherent story on this for Python 3000, or maybe even > Python 2.x.) 
> > > -- ?!ng > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com From foom at fuhm.net Sun Apr 10 00:02:22 2005 From: foom at fuhm.net (James Y Knight) Date: Sun Apr 10 00:02:36 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> Message-ID: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> On Apr 9, 2005, at 5:37 PM, Ka-Ping Yee wrote: > Let me know if you figure out how to defeat that. You can protect against this, too, but it does show that it's *really* hard to get restricting code right...I'm of the opinion that it's not really worth it -- you should just use OS protections. untrusted_module.py: class foostr(str): def __eq__(self, other): return True def have_at_it(immutable_facet, readonly_facet): getattr(immutable_facet, foostr('append'))(5) print immutable_facet James From python-dev at zesty.ca Sun Apr 10 00:34:12 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sun Apr 10 00:34:20 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: On Sat, 9 Apr 2005, James Y Knight wrote: > You can protect against this, too, but it does show that it's *really* > hard to get restricting code right... Good point. If you can't trust ==, then you're hosed. > I'm of the opinion that it's not > really worth it -- you should just use OS protections. This i disagree with, however. OS protections are a few orders of magnitude more heavyweight and vastly more error-prone than using a language with simple, clear semantics. Predictable code behaviour is good. 
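[James's str-subclass attack from this exchange can be reproduced verbatim today. The facet is redefined below so the snippet stands alone; the `__hash__` assignment is an addition needed on Python 3, where defining `__eq__` discards the inherited hash.]

```python
def facet(target, allowed_attrs):
    # Minimal facet, as in Ka-Ping's earlier message in this thread.
    class Facet:
        def __getattr__(self, name):
            if name in allowed_attrs:
                return getattr(target, name)
            raise AttributeError(name)
    return Facet()

class foostr(str):
    # Claims equality with everything, so the "name in allowed_attrs"
    # membership test succeeds for any attribute name.
    def __eq__(self, other):
        return True
    __hash__ = str.__hash__  # Python 3: __eq__ would otherwise reset __hash__

lst = [1, 2, 3]
immutable_facet = facet(lst, ['__getitem__', '__len__', '__iter__'])
getattr(immutable_facet, foostr('append'))(5)  # slips past the check
print(lst)  # the "immutable" list has been mutated
```

The lesson matches James's point: any guard built on `==` (and `in` is built on `==`) can be subverted by an object that lies about equality.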
-- ?!ng From Scott.Daniels at Acm.Org Sun Apr 10 16:43:22 2005 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun Apr 10 16:44:23 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Fredrik Lundh wrote: >>pickle doesn't have the INF=>1.0 bug: >>>>>import pickle >>>>>pickle.loads(pickle.dumps(1e10000)) >>... >>ValueError: invalid literal for float(): 1.#INF >>>>>import cPickle >>>>>cPickle.loads(cPickle.dumps(1e10000)) >>... >>ValueError: could not convert string to float >>>>>import marshal >>>>>marshal.loads(marshal.dumps(1e10000)) >>1.0 > should I check in a fix for this? > > the code in PyFloat_FromString contains lots of trickery to deal with more or less > broken literals, and more or less broken C libraries. > > unfortunately, and unlike most other functions with similar names, PyFloat_FromString > takes a Python object, not a char pointer. would it be a good idea to add a variant > that takes a char*? if so, should PyFloat_FromString use the new function, or are we > avoiding that kind of refactoring for speed reasons these days? > > any opinions? > > From yesterday's sprint, we found a smallest-change style fix. At the least a change like this will catch the unpacking: in marshal.c (around line 500) in function r_object: ... 
case TYPE_FLOAT: { char buf[256]; + char *endptr; double dx; n = r_byte(p); if (n == EOF || r_string(buf, (int)n, p) != n) { PyErr_SetString(PyExc_EOFError, "EOF read where object expected"); return NULL; } buf[n] = '\0'; PyFPE_START_PROTECT("atof", return 0) - dx = PyOS_ascii_atof(buf); + dx = PyOS_ascii_strtod(buf, &endptr); PyFPE_END_PROTECT(dx) + if (buf + n != endptr) { + PyErr_SetString(PyExc_ValueError, + "not all marshalled float text read"); + return NULL; + } return PyFloat_FromDouble(dx); } -- Scott David Daniels Scott.Daniels@Acm.Org From mwh at python.net Sun Apr 10 17:22:15 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 17:22:18 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> (Bob Ippolito's message of "Sat, 9 Apr 2005 13:54:30 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> Message-ID: <2m8y3q7j2w.fsf@starship.python.net> Bob Ippolito writes: > Is there a good reason to *not* call PyEval_InitThreads when using a > threaded Python? Well, it depends how expensive one's OS's locking primitives are, I think. There were some numbers posted to the twisted list recently that showed it didn't make a whole lot of difference on some platform or other... I don't have the knowledge or the courage to make that call. > Sounds like it would just be easier to implicitly call it during > Py_Initialize some day. That might indeed be simpler. Cheers, mwh -- The gripping hand is really that there are morons everywhere, it's just that the Americon morons are funnier than average.
-- Pim van Riezen, alt.sysadmin.recovery From fredrik at pythonware.com Sun Apr 10 17:29:24 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Apr 10 17:29:39 2005 Subject: [Python-Dev] Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com><4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: Scott David Daniels wrote: > From yesterday's sprint sprint? I was beginning to wonder why nobody cared about this; guess I missed the announcement ;-) > At the least a change like this will catch the unpacking: > in marshal.c (around line 500) in function r_object: > PyFPE_START_PROTECT("atof", return 0) > - dx = PyOS_ascii_atof(buf); > + dx = PyOS_ascii_strtod(buf, &endptr); > PyFPE_END_PROTECT(dx) the PROTECT contents should probably match the function you're using. > + if (buf + n != endptr) { > + PyErr_SetString(PyExc_ValueError, > + "not all marshalled float text read"); > + return NULL; this will fix the problem, sure. I still think it would be cleaner to reuse the float() semantics, since marshal.dumps uses repr(). to do that, you should use the code in floatobject.c (it wraps strtod in additional logic designed to take care of various platform quirks). but nevermind, you have a patch and I don't. if nobody objects, go ahead and check it in. From mwh at python.net Sun Apr 10 17:34:17 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 17:34:20 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> (James Y.
Knight's message of "Sat, 9 Apr 2005 16:58:54 -0400") References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <2mhdif7r9r.fsf@starship.python.net> <0f238c16eb17a9e9085625e4281b24ad@fuhm.net> Message-ID: <2m4qee7iiu.fsf@starship.python.net> James Y Knight writes: > On Apr 9, 2005, at 2:13 PM, Michael Hudson wrote: > >> The funniest I know is part of PyPy: >> >> def extract_cell_content(c): >> """Get the value contained in a CPython 'cell', as read through >> the func_closure of a function object.""" >> # yuk! this is all I could come up with that works in Python 2.2 >> too >> class X(object): >> def __eq__(self, other): >> self.other = other >> x = X() >> x_cell, = (lambda: x).func_closure >> x_cell == c >> return x.other >> >> It would be unfortunate for PyPy (and IMHO, very un-pythonic) if this >> process became impossible. > > It would be quite fortunate if you didn't have to do all that, and > cell just had a "value" attribute, though. Indeed. The 2.2 compatibility issue remains, though. Cheers, mwh -- Presumably pronging in the wrong place zogs it. -- Aldabra Stoddart, ucam.chat From eyal.lotem at gmail.com Sun Apr 10 18:08:01 2005 From: eyal.lotem at gmail.com (Eyal Lotem) Date: Sun Apr 10 18:08:05 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: It may be really hard to get it right, unless we are overlooking some simple solution. I disagree that we should "just use OS protections". The reason I am interested in Pythonic protection is because it is so much more powerful than OS protections. The capability model is much more powerful than the ACL model used by all OS's these days, and allows for interesting security concepts. What about implementing the facet in C? This could avoid the class of problems you have just mentioned. 
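[Editorial aside on the cell-extraction exchange above: James got his wish in later Pythons, where cell objects expose a cell_contents attribute, so the rigged-__eq__ contortion is only needed on 2.2-era interpreters. A sketch in modern spelling (func_closure became __closure__); it relies on cells comparing by their contents, which is what makes the original trick work:]

```python
def make_closure():
    x = 42
    return lambda: x

cell, = make_closure().__closure__      # func_closure in the 2.x spelling

# The direct route James wanted:
print(cell.cell_contents)               # 42

# The PyPy workaround still works: comparing two cells compares their
# contents, so a rigged __eq__ is handed the other cell's value.
class X:
    def __eq__(self, other):
        self.other = other

def extract_cell_content(c):
    x = X()
    x_cell, = (lambda: x).__closure__
    x_cell == c                         # invokes X.__eq__ with c's contents
    return x.other

print(extract_cell_content(cell))       # 42
```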
On Apr 9, 2005 2:02 PM, James Y Knight wrote: > On Apr 9, 2005, at 5:37 PM, Ka-Ping Yee wrote: > > Let me know if you figure out how to defeat that. > > You can protect against this, too, but it does show that it's *really* > hard to get restricting code right...I'm of the opinion that it's not > really worth it -- you should just use OS protections. > > untrusted_module.py: > > class foostr(str): > def __eq__(self, other): > return True > > def have_at_it(immutable_facet, readonly_facet): > getattr(immutable_facet, foostr('append'))(5) > print immutable_facet > > James -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050410/600d8b8e/attachment.htm From tim.peters at gmail.com Sun Apr 10 19:23:32 2005 From: tim.peters at gmail.com (Tim Peters) Date: Sun Apr 10 19:23:35 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> Message-ID: <1f7befae05041010237d11d7a9@mail.gmail.com> marshal shouldn't be representing doubles as decimal strings to begin with. All code for (de)serialing C doubles should go thru _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and struct (std mode) already do; marshal is the oddball. But as the docs (floatobject.h) for these say: ... * Bug: What this does is undefined if x is a NaN or infinity. * Bug: -0.0 and +0.0 produce the same string. */ PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); ... * Bug: What this does is undefined if the string represents a NaN or * infinity. 
*/ PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le); PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le); From fredrik at pythonware.com Sun Apr 10 20:26:44 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Sun Apr 10 20:27:09 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: Tim Peters wrote: > marshal shouldn't be representing doubles as decimal strings to begin > with. All code for (de)serialing C doubles should go thru > _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and > struct (std mode) already do; marshal is the oddball. is changing the marshal format really the right thing to do at this point? From mwh at python.net Sun Apr 10 20:08:56 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 22:34:00 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041010237d11d7a9@mail.gmail.com> (Tim Peters's message of "Sun, 10 Apr 2005 13:23:32 -0400") References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: <2mmzs65wsn.fsf@starship.python.net> Tim Peters writes: > marshal shouldn't be representing doubles as decimal strings to begin > with. All code for (de)serialing C doubles should go thru > _PyFloat_Pack8() and _PyFloat_Unpack8(). cPickle (proto >= 1) and > struct (std mode) already do; marshal is the oddball. > > But as the docs (floatobject.h) for these say: > > ... > * Bug: What this does is undefined if x is a NaN or infinity. > * Bug: -0.0 and +0.0 produce the same string. > */ > PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); > PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); > ... 
> * Bug: What this does is undefined if the string represents a NaN or > * infinity. > */ > PyAPI_FUNC(double) _PyFloat_Unpack4(const unsigned char *p, int le); > PyAPI_FUNC(double) _PyFloat_Unpack8(const unsigned char *p, int le); OTOH, the implementation has this comment: /*---------------------------------------------------------------------------- * _PyFloat_{Pack,Unpack}{4,8}. See floatobject.h. * * TODO: On platforms that use the standard IEEE-754 single and double * formats natively, these routines could simply copy the bytes. */ Doing that would fix these problems, surely?[1] The question, of course, is how to tell. I suppose one could just do it unconditionally and wait for one of the three remaining VAX users[2] to compile Python 2.5 and then notice. More conservatively, one could just do this on Windows, linux/most architectures and Mac OS X. Cheers, mwh [1] I'm slightly worried about oddball systems that do insane things with the FPU by default -- but don't think the mooted change would make things any worse. [2] Exaggeration, I realize -- but how many non 754 systems are out there? How many will see Python 2.5? -- If you give someone Fortran, he has Fortran. If you give someone Lisp, he has any language he pleases. -- Guy L. Steele Jr, quoted by David Rush in comp.lang.scheme.scsh From skip at pobox.com Sun Apr 10 22:44:52 2005 From: skip at pobox.com (Skip Montanaro) Date: Sun Apr 10 22:44:19 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2mmzs65wsn.fsf@starship.python.net> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> Message-ID: <16985.36932.105169.855614@montanaro.dyndns.org> Michael> I suppose one could just do it unconditionally and wait for one Michael> of the three remaining VAX users[2] to compile Python 2.5 and Michael> then notice.
You forgot the two remaining CRAY users. Since their machines are so much more powerful than VAXen, they have much more influence over Python development. Skip From foom at fuhm.net Sun Apr 10 22:54:08 2005 From: foom at fuhm.net (James Y Knight) Date: Sun Apr 10 22:54:25 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2m8y3q7j2w.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> Message-ID: <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: > Bob Ippolito writes: > >> Is there a good reason to *not* call PyEval_InitThreads when using a >> threaded Python? > > Well, it depends how expensive ones OS's locking primitives are, I > think. There were some numbers posted to the twisted list recently > that showed it didn't make a whole lot of difference on some platform > or other... I don't have the knowledge or the courage to make that > call. > >> Sounds like it would just be easier to implicitly call it during >> Py_Initialize some day. > > That might indeed be simpler. Here's the numbers. It looks like something changed between python 2.2 and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, it doesn't seem to make a whole lot of difference on recent versions of Python. Three test programs: ${PYTHON} -c 'import pystone, time; print pystone.pystones(200000)' ${PYTHON} -c 'import thread, pystone, time; print pystone.pystones(200000)' ${PYTHON} -c 'import thread, pystone, time; thread.start_new_thread(lambda: time.sleep(10000), ()); print pystone.pystones(200000)' All tests run using the same copy of pystone. 
System 1: RH73, dual 3GHz Xeon [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-110)] -------- Python 1.5.2 (#1, Apr 3 2002, 18:16:26) (8.15, 24540) (8.28, 24155) (12.78, 15649) Python 2.2.2 (#1, Jul 23 2003, 13:47:48) (6.32, 31646) (6.27, 31898) (11.1, 18018) Python 2.4.1 (#1, Apr 4 2005, 17:19:27) (4.60, 43478) (4.61, 43384) (4.74, 42194) System 2, FC3/64, dual 2.4GHz athlon 64. [GCC 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)] -------- Python 2.3.4 (#1, Oct 26 2004, 16:45:38) (3.84, 52083) (3.80, 52632) (3.98, 50251) Python 2.4.1 (#1, Apr 10 2005, 15:47:53) (3.09, 64725) (3.08, 64935) (3.26, 61350) Python 2.4.1 (#1, Apr 1 2005, 16:45:07) *compiled in 32 bit mode* (3.35, 59701) (3.42, 58480) (3.57, 56022) From mwh at python.net Sun Apr 10 23:48:59 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 10 23:49:02 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> (James Y. Knight's message of "Sun, 10 Apr 2005 16:54:08 -0400") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> Message-ID: <2mis2u5mlw.fsf@starship.python.net> James Y Knight writes: > On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: > >> Bob Ippolito writes: >> >>> Is there a good reason to *not* call PyEval_InitThreads when using a >>> threaded Python? >> >> Well, it depends how expensive ones OS's locking primitives are, I >> think. There were some numbers posted to the twisted list recently >> that showed it didn't make a whole lot of difference on some platform >> or other... I don't have the knowledge or the courage to make that >> call. >> >>> Sounds like it would just be easier to implicitly call it during >>> Py_Initialize some day. 
>> >> That might indeed be simpler. > > Here's the numbers. It looks like something changed between python 2.2 > and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, > it doesn't seem to make a whole lot of difference on recent versions > of Python. Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 here). It's very much a guess, but could this patch: [ 525532 ] Add support for POSIX semaphores be the one to thank? Cheers, mwh -- Now this is what I don't get. Nobody said absolutely anything bad about anything. Yet it is always possible to just pull random flames out of ones ass. -- http://www.advogato.org/person/vicious/diary.html?start=60 From bob at redivi.com Mon Apr 11 00:26:00 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 00:26:06 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mis2u5mlw.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> Message-ID: On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: > James Y Knight writes: > >> On Apr 10, 2005, at 11:22 AM, Michael Hudson wrote: >> >>> Bob Ippolito writes: >>> >>>> Is there a good reason to *not* call PyEval_InitThreads when using a >>>> threaded Python? >>> >>> Well, it depends how expensive ones OS's locking primitives are, I >>> think. There were some numbers posted to the twisted list recently >>> that showed it didn't make a whole lot of difference on some platform >>> or other... I don't have the knowledge or the courage to make that >>> call. >>> >>>> Sounds like it would just be easier to implicitly call it during >>>> Py_Initialize some day. >>> >>> That might indeed be simpler. 
>> >> Here's the numbers. It looks like something changed between python 2.2 >> and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, >> it doesn't seem to make a whole lot of difference on recent versions >> of Python. > > Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 > here). > > It's very much a guess, but could this patch: > > [ 525532 ] Add support for POSIX semaphores > > be the one to thank? No, Mac OS X doesn't implement POSIX semaphores. -bob From tim.peters at gmail.com Mon Apr 11 00:37:44 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 00:37:48 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2mmzs65wsn.fsf@starship.python.net> References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> Message-ID: <1f7befae05041015372cf17e91@mail.gmail.com> [mwh] > OTOH, the implementation has this comment: > > /*---------------------------------------------------------------------------- > * _PyFloat_{Pack,Unpack}{4,8}. See floatobject.h. > * > * TODO: On platforms that use the standard IEEE-754 single and double > * formats natively, these routines could simply copy the bytes. > */ > > Doing that would fix these problems, surely?[1] The 754 standard doesn't say anything about how the difference between signaling and quiet NaNs is represented. So it's possible that a qNaN on one box would "look like" an sNaN on a different box, and vice versa. But since most people run with all FPU traps disabled, and Python doesn't expose a way to read the FPU status flags, they couldn't tell the difference. Copying bytes works perfectly for all other cases (signed zeroes, non-zero finites, infinities), because their representations are wholly defined, although it's possible that a subnormal on one box will be treated like a zero (with the same sign) on a partially-conforming box. 
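[Editorial aside: Tim's "copying bytes works perfectly" claim is easy to check from present-day Python with struct's fixed-layout "d" code, which produces the little-endian IEEE-754 image that _PyFloat_Pack8 aims for. A sketch, not the marshal patch itself:]

```python
import math
import struct

def pack8(x):
    # Little-endian IEEE-754 double: the 8 raw bytes of the value.
    return struct.pack('<d', x)

def unpack8(b):
    return struct.unpack('<d', b)[0]

# Signed zeroes, finites and infinities round-trip exactly...
for x in (0.0, -0.0, 1.5, -2.75, 1e308, float('inf'), float('-inf')):
    y = unpack8(pack8(x))
    assert y == x
    assert math.copysign(1.0, y) == math.copysign(1.0, x)

# ...and a NaN round-trips bit-for-bit even though it compares unequal.
nan = float('nan')
assert pack8(unpack8(pack8(nan))) == pack8(nan)

print(pack8(1.5).hex())   # 000000000000f83f
```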
> [1] I'm slighyly worried about oddball systems that do insane things > with the FPU by default -- but don't think the mooted change would > make things any worse. Sorry, don't know what that means. > The question, of course, is how to tell. Store a few small doubles at module initialization time and stare at their bits. That's enough to settle whether a 754 format is in use, and, if it is, whether it's big-endian or little-endian. ... > [2] Exaggeration, I realize -- but how many non 754 systems are out > there? How many will see Python 2.5? No idea here. The existing pack routines strive to do a good job of _creating_ an IEEE-754-format representation regardless of platform representation. I assume that code would still be present, so "oddball" platforms would be left no worse off than they are now. From tim.peters at gmail.com Mon Apr 11 00:42:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 00:42:23 2005 Subject: [Python-Dev] Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> Message-ID: <1f7befae0504101542361dd121@mail.gmail.com> [Fredrik Lundh] > is changing the marshal format really the right thing to do at this > point? I don't see anything special about "this point" -- it's just sometime between 2.4.1 and 2.5a0. What do you have in mind? Like pickle formats, I expect a change to marshal would add a new format code, not take away an older code, so older marshal strings could still be read. Etc. 
From mwh at python.net Mon Apr 11 01:08:18 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 01:08:20 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: (Bob Ippolito's message of "Sun, 10 Apr 2005 15:26:00 -0700") References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> Message-ID: <2mekdi5ixp.fsf@starship.python.net> Bob Ippolito writes: > On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: > >> James Y Knight writes: >> >>> Here's the numbers. It looks like something changed between python 2.2 >>> and 2.3 that made calling PyEval_InitThreads a lot less expensive. So, >>> it doesn't seem to make a whole lot of difference on recent versions >>> of Python. >> >> Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have 2.2 >> here). >> >> It's very much a guess, but could this patch: >> >> [ 525532 ] Add support for POSIX semaphores >> >> be the one to thank? > > No, Mac OS X doesn't implement POSIX semaphores. Well, does OS X show the same effect between 2.2 and 2.3? I don't have a 2.2 on OS X any more, I was just talking about James' results on linux. Cheers, mwh -- Slim Shady is fed up with your shit, and he's going to kill you. 
-- Eminem, "Public Service Announcement 2000" From bob at redivi.com Mon Apr 11 01:24:17 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 01:24:22 2005 Subject: [Python-Dev] threading (GilState) question In-Reply-To: <2mekdi5ixp.fsf@starship.python.net> References: <2mmzsb7zds.fsf@starship.python.net> <1f7befae050407082140a591fd@mail.gmail.com> <2m7jje8tmb.fsf@starship.python.net> <20050408214218.GE24751@zot.electricrain.com> <2md5t37r59.fsf@starship.python.net> <3af93ffa0bd325b09c6ca6a607c9528d@redivi.com> <2m8y3q7j2w.fsf@starship.python.net> <9b2720665f8d1d6e90b44ea182c0e42a@fuhm.net> <2mis2u5mlw.fsf@starship.python.net> <2mekdi5ixp.fsf@starship.python.net> Message-ID: <4531465672695d375c721c56af37fe87@redivi.com> On Apr 10, 2005, at 4:08 PM, Michael Hudson wrote: > Bob Ippolito writes: > >> On Apr 10, 2005, at 2:48 PM, Michael Hudson wrote: >> >>> James Y Knight writes: >>> >>>> Here's the numbers. It looks like something changed between python >>>> 2.2 >>>> and 2.3 that made calling PyEval_InitThreads a lot less expensive. >>>> So, >>>> it doesn't seem to make a whole lot of difference on recent versions >>>> of Python. >>> >>> Thanks. I see similar results for 2.3 and 2.4 on OS X (don't have >>> 2.2 >>> here). >>> >>> It's very much a guess, but could this patch: >>> >>> [ 525532 ] Add support for POSIX semaphores >>> >>> be the one to thank? >> >> No, Mac OS X doesn't implement POSIX semaphores. > > Well, does OS X show the same effect between 2.2 and 2.3? I don't > have a 2.2 on OS X any more, I was just talking about James' results > on linux. I don't have 2.2 on OS X any more, either. 
-bob From aleaxit at yahoo.com Mon Apr 11 07:07:00 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Mon Apr 11 07:07:06 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <16985.36932.105169.855614@montanaro.dyndns.org> References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <16985.36932.105169.855614@montanaro.dyndns.org> Message-ID: <538f17711d377e0154e295e4b9f924bc@yahoo.com> On Apr 10, 2005, at 13:44, Skip Montanaro wrote: > > Michael> I suppose one could jsut do it unconditionally and wait > for one > Michael> of the three remaining VAX users[2] to compile Python 2.5 > and > Michael> then notice. > > You forgot the two remaining CRAY users. Since their machines are so > much > more powerful than VAXen, they have much more influence over Python > development. The latest ads I've seen from Cray were touting AMD-64 processors anyway...;-) Alex From fredrik at pythonware.com Mon Apr 11 09:33:09 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Mon Apr 11 09:43:21 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: Tim Peters wrote: > [Fredrik Lundh] >> is changing the marshal format really the right thing to do at this >> point? > > I don't see anything special about "this point" -- it's just sometime > between 2.4.1 and 2.5a0. What do you have in mind? I was under the impression that the marshal format has been stable for quite a long time (people are using it for various RPC protocols, among other things). I might be wrong. 
From bob at redivi.com Mon Apr 11 10:00:50 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 11 10:00:55 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: <93c9ae04e805845e5bdf84671c6802c7@redivi.com> On Apr 11, 2005, at 12:33 AM, Fredrik Lundh wrote: > Tim Peters wrote: >> [Fredrik Lundh] >>> is changing the marshal format really the right thing to do at this >>> point? >> >> I don't see anything special about "this point" -- it's just sometime >> between 2.4.1 and 2.5a0. What do you have in mind? > > I was under the impression that the marshal format has been stable for > quite a long time (people are using it for various RPC protocols, among > other things). I might be wrong. The documentation for marshal explicitly states that you should not use it for such purposes. There's also a version argument to dumps and dump (though the argument list in the dump documentation doesn't say so), where version 0 is pre-2.4, and version 1 is 2.4+. I don't think it's out of the question to add a version 2 for 2.5+ that uses a better serialization for floats (and it should probably add set/frozenset too since those are builtins now). 
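[Editorial aside: the version knob Bob describes is still visible in present-day marshal, and, as it turned out, version 2 did add a binary code for floats while the old decimal-string code stayed readable — exactly the "add a new format code, not take away an older code" path Tim anticipated. A small sketch:]

```python
import marshal

# Version 0 writes floats as repr() text; version 2 and later use the
# 8-byte binary form this thread was asking for.
text_form = marshal.dumps(1.5, 0)
binary_form = marshal.dumps(1.5, 2)
assert text_form != binary_form

# Old codes stayed readable: loads() accepts either serialization.
assert marshal.loads(text_form) == 1.5
assert marshal.loads(binary_form) == 1.5

# The 1e10000 -> 1.0 bug that started the thread is gone.
inf = 1e10000
assert marshal.loads(marshal.dumps(inf)) == inf
```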
-bob From mwh at python.net Mon Apr 11 15:37:58 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 15:38:00 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041015372cf17e91@mail.gmail.com> (Tim Peters's message of "Sun, 10 Apr 2005 18:37:44 -0400") References: <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> Message-ID: <2maco55t8p.fsf@starship.python.net> Tim Peters writes: > The 754 standard doesn't say anything about how the difference between > signaling and quiet NaNs is represented. So it's possible that a qNaN > on one box would "look like" an sNaN on a different box, and vice > versa. But since most people run with all FPU traps disabled, and > Python doesn't expose a way to read the FPU status flags, they > couldn't tell the difference. OK. Do you have any intuition as to whether 754 implementations actually *do* differ on this point? > Copying bytes works perfectly for all other cases (signed zeroes, > non-zero finites, infinities), because their representations are > wholly defined, although it's possible that a subnormal on one box > will be treated like a zero (with the same sign) on a > partially-conforming box. I'd find struggling to care about that pretty hard. >> [1] I'm slighyly worried about oddball systems that do insane things >> with the FPU by default -- but don't think the mooted change would >> make things any worse. > > Sorry, don't know what that means. Neither do I, now. Oh well . >> The question, of course, is how to tell. > > Store a few small doubles at module initialization time and stare at ./configure time, surely? > their bits. That's enough to settle whether a 754 format is in use, > and, if it is, whether it's big-endian or little-endian. Do you have a pointer to code that does this? 
Googling around the subject appears to turn up lots of Python stuff... >> [2] Exaggeration, I realize -- but how many non 754 systems are out >> there? How many will see Python 2.5? > > No idea here. The existing pack routines strive to do a good job of > _creating_ an IEEE-754-format representation regardless of platform > representation. I assume that code would still be present, so > "oddball" platforms would be left no worse off than they are now. Well, yes, given the above. The text this footnote was attached to was asking if just assuming 754 float formats would inconvenience anyone. Cheers, mwh -- I don't have any special knowledge of all this. In fact, I made all the above up, in the hope that it corresponds to reality. -- Mark Carroll, ucam.chat From tim.peters at gmail.com Mon Apr 11 17:27:43 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 17:27:46 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2maco55t8p.fsf@starship.python.net> References: <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> Message-ID: <1f7befae050411082733dca644@mail.gmail.com> [Tim] >> The 754 standard doesn't say anything about how the difference between >> signaling and quiet NaNs is represented. So it's possible that a qNaN >> on one box would "look like" an sNaN on a different box, and vice >> versa. But since most people run with all FPU traps disabled, and >> Python doesn't expose a way to read the FPU status flags, they >> couldn't tell the difference. [mwh] > OK. Do you have any intuition as to whether 754 implementations > actually *do* differ on this point? Not anymore -- hasn't been part of my job, or a hobby, for over a decade. There were differences a decade+ ago. 
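[Editorial aside: one can at least inspect what the local hardware does by pulling the 64 bits of a NaN apart with struct. This sketch assumes the common MSB-of-the-mantissa quiet-bit convention — set for quiet, clear for signaling — which is what mainstream hardware settled on:]

```python
import struct

bits, = struct.unpack('>Q', struct.pack('>d', float('nan')))
exponent = (bits >> 52) & 0x7FF
quiet_bit = (bits >> 51) & 1

assert exponent == 0x7FF    # all exponent bits set: some kind of NaN
assert quiet_bit == 1       # CPython's float('nan') is quiet on x86/ARM
print(hex(bits))            # typically 0x7ff8000000000000
```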
All NaNs have all exponent bits set, and at least one mantissa bit set, and every bit pattern of that form represents a NaN. That's all the standard says. The most popular way to distinguish quiet from signaling NaNs keyed off the most-significant mantissa bit: set for a qNaN, clear for an sNaN. It's possible that all 754 HW does that now. There's at least still that Pentium hardware adds a third not-a-number possibility: in addition to 754's quiet and signaling NaNs, it also has "indeterminate" values. Here w/ native Windows Python 2.4 on a Pentium: >>> inf = 1e300 * 1e300 >>> inf - inf # indeterminate -1.#IND >>> - _ # but the negation of IND is a quiet NaN 1.#QNAN >>> Do the same thing under Cygwin Python on the same box and it prints "NaN" twice. Do people care about this? I don't know. It seems unlikely -- in effect, IND just gives a special string name to a single one of the many bit patterns that represent a quiet NaN. OTOH, Pentium hardware still preserves this distinction, and MS library docs do too. IND isn't part of the 754 standard (although, IIRC, it was part of a pre-standard draft, which Intel implemented and is now stuck with). >> Copying bytes works perfectly for all other cases (signed zeroes, >> non-zero finites, infinities), because their representations are >> wholly defined, although it's possible that a subnormal on one box >> will be treated like a zero (with the same sign) on a >> partially-conforming box. > I'd find struggling to care about that pretty hard. Me too. >>> The question, of course, is how to tell. >> Store a few small doubles at module initialization time and stare at > ./configure time, surely? Unsure. Not all Python platforms _have_ "./configure time". Module initialization code is harder to screw up for that reason (the code is in an obvious place then, self-contained, and doesn't require any relevant knowledge of any platform porter unless/until it breaks). >> their bits. 
That's enough to settle whether a 754 format is in use, >> and, if it is, whether it's big-endian or little-endian. > Do you have a pointer to code that does this? No. Pemberton's enquire.c contains enough code to do it. Given how few distinct architectures still exist, it's probably enough to store just double x = 1.5 and stare at it. >>> [2] Exaggeration, I realize -- but how many non 754 systems are out >>> there? How many will see Python 2.5? >> No idea here. The existing pack routines strive to do a good job of >> _creating_ an IEEE-754-format representation regardless of platform >> representation. I assume that code would still be present, so >> "oddball" platforms would be left no worse off than they are now. > Well, yes, given the above. The text this footnote was attached to > was asking if just assuming 754 float formats would inconvenience > anyone. I think I'm still missing your intent here. If you're asking whether Python can blindly assume that 754 is in use, I'd say that's undesirable but defensible if necessary. From mwh at python.net Mon Apr 11 18:01:49 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 18:01:51 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050411082733dca644@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 11:27:43 -0400") References: <16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> Message-ID: <2msm1x480i.fsf@starship.python.net> Tim Peters writes: > [Tim] >>> The 754 standard doesn't say anything about how the difference between >>> signaling and quiet NaNs is represented. So it's possible that a qNaN >>> on one box would "look like" an sNaN on a different box, and vice >>> versa.
But since most people run with all FPU traps disabled, and >>> Python doesn't expose a way to read the FPU status flags, they >>> couldn't tell the difference. > > [mwh] >> OK. Do you have any intuition as to whether 754 implementations >> actually *do* differ on this point? > > Not anymore -- hasn't been part of my job, or a hobby, for over a > decade. There were differences a decade+ ago. All NaNs have all > exponent bits set, and at least one mantissa bit set, and every bit > pattern of that form represents a NaN. That's all the standard says. > The most popular way to distinguish quiet from signaling NaNs keyed > off the most-significant mantissa bit: set for a qNaN, clear for an > sNaN. It's possible that all 754 HW does that now. [snip details] OK, so the worst that could happen here is that moving marshal data from one box to another could turn one sort of NaN into another? This doesn't seem very bad. [denorms] >> I'd find struggling to care about that pretty hard. > > Me too. Good. >>>> The question, of course, is how to tell. > >>> Store a few small doubles at module initialization time and stare at > >> ./configure time, surely? > > Unsure. Not all Python platforms _have_ "./configure time". But they all have pyconfig.h. > Module initialization code is harder to screw up for that reason > (the code is in an obvious place then, self-contained, and doesn't > require any relevant knowledge of any platform porter unless/until > it breaks). Well, sure, but false negatives here are not a big deal here. >>> their bits. That's enough to settle whether a 754 format is in use, >>> and, if it is, whether it's big-endian or little-endian. > >> Do you have a pointer to code that does this? > > No. Pemberton's enquire.c contains enough code to do it. Yikes! And much else besides. > Given how few distinct architectures still exist, it's probably > enough to store just double x = 1.5 and stare at it. 
Something along these lines: double x = 1.5; is_big_endian_ieee_double = sizeof(double) == 8 && \ memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); ? [me being obscure] > I think I'm still missing your intent here. If you're asking whether > Python can blindly assume that 754 is in use, I'd say that's > undesirable but defensible if necessary. Yes, that's what I was asking, in a rather obscure way. Cheers, mwh -- Strangely enough I saw just such a beast at the grocery store last night. Starbucks sells Javachip. (It's ice cream, but that shouldn't be an obstacle for the Java marketing people.) -- Jeremy Hylton, 29 Apr 1997 From sdementen at hotmail.com Fri Apr 8 11:32:37 2005 From: sdementen at hotmail.com (Sébastien de Menten) Date: Mon Apr 11 20:05:03 2005 Subject: [Python-Dev] args attribute of Exception objects Message-ID: Hi, When I need to make sense of a python exception, I often need to parse the string exception in order to retrieve the data. Example: try: print foo except NameError, e: print e.args symbol = e.args[0][17:-16] ==> ("NameError: name 'foo' is not defined", ) or try: (4).foo except NameError, e: print e.args ==> ("'int' object has no attribute 'foo'",) Moreover, in the documentation about Exception, I read """Warning: Messages to exceptions are not part of the Python API. Their contents may change from one version of Python to the next without warning and should not be relied on by code which will run under multiple versions of the interpreter. """ So even args could not be relied upon ! Two questions: 1) did I miss something in dealing with exceptions ? 2) Could this be changed to .args more in line with: a) first example: e.args = ('foo', "NameError: name 'foo' is not defined") b) second example: e.args = (4, 'foo', "'int' object has no attribute 'foo'",) the message of the string can even be retrieved with str(e) so it is also redundant.
BTW, the Warning in the doc makes it possible to change this :-) To be backward compatible, the error message could also be the first element of the tuple. Seb ps: There may be problems (that I am not aware of) with an exception keeping references to other objects From tim.peters at gmail.com Mon Apr 11 20:28:20 2005 From: tim.peters at gmail.com (Tim Peters) Date: Mon Apr 11 20:28:24 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2msm1x480i.fsf@starship.python.net> References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> Message-ID: <1f7befae05041111284b61992a@mail.gmail.com> ... [mwh] > OK, so the worst that could happen here is that moving marshal data > from one box to another could turn one sort of NaN into another? Right. Assuming source and destination boxes both use 754 format, and the implementation adjusts endianess if necessary. Heh. I have a vague half-memory of _some_ box that stored the two 4-byte "words" in an IEEE double in one order, but the bytes within each word in the opposite order. It's always something ... > This doesn't seem very bad. Not bad at all: But since most people run with all FPU traps disabled, and Python doesn't expose a way to read the FPU status flags, they couldn't tell the difference. >>>> Store a few small doubles at module initialization time and stare at >>> ./configure time, surely? >> Unsure. Not all Python platforms _have_ "./configure time". > But they all have pyconfig.h. Yes, and then a platform porter has to understand what to #define/#undefine, and why. People doing cross-compilation may have an especially confusing time of it. Module initialization code "just works", so I certainly understand why it doesn't appeal to the Unix frame of mind .
>> Module initialization code is harder to screw up for that reason >> (the code is in an obvious place then, self-contained, and doesn't >> require any relevant knowledge of any platform porter unless/until >> it breaks). > Well, sure, but false negatives here are not a big deal here. Sorry, unsure what "false negative" means here. ... > Something along these lines: > > double x = 1.5; > is_big_endian_ieee_double = sizeof(double) == 8 && \ > memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); Right, it's that easy -- at least under MSVC and gcc. From arigo at tunes.org Mon Apr 11 22:47:42 2005 From: arigo at tunes.org (Armin Rigo) Date: Mon Apr 11 22:54:55 2005 Subject: [Python-Dev] New style classes and operator methods In-Reply-To: <425610AE.5070605@canterbury.ac.nz> References: <425610AE.5070605@canterbury.ac.nz> Message-ID: <20050411204742.GA362@vicky.ecs.soton.ac.uk> Hi Greg, On Fri, Apr 08, 2005 at 05:03:42PM +1200, Greg Ewing wrote: > If the left and right operands are of the same class, > and the class implements a right operand method but > not a left operand method, the right operand method > is not called. Instead, two attempts are made to call > the left operand method. This is not a general rule. The rule is that if both elements are of the same class, only the non-reversed method is ever called. The confusing bit is about having it called twice. Funnily enough, this only occurs for some operators (I think only add and mul). The reason is that internally, the C core distinguishes between number adding vs sequence concatenation, and number multiplying vs sequence repetition. So __add__() and __mul__() are called twice: once as a numeric computation and once as a sequence operation... Could be fixed with more strange special cases in abstract.c, but I'm not sure it's worth it.
Armin From martin at v.loewis.de Mon Apr 11 23:35:50 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon Apr 11 23:35:53 2005 Subject: [Python-Dev] Re: Re: Re: marshal / unmarshal In-Reply-To: References: <4257B8A1.6000902@v.loewis.de><16983.51916.407019.489590@montanaro.dyndns.org> <1f7befae05041010237d11d7a9@mail.gmail.com> <1f7befae0504101542361dd121@mail.gmail.com> Message-ID: <425AEDB6.8000009@v.loewis.de> Fredrik Lundh wrote: > I was under the impression that the marshal format has been stable for > quite a long time (people are using it for various RPC protocols, among > other things). I might be wrong. Python 2.4 introduced support for string sharing in marshal files, with an option to suppress sharing if an application needs to suppress it for backwards compatibility. Regards, Martin From mwh at python.net Mon Apr 11 22:08:12 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 11 23:56:53 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae05041111284b61992a@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 14:28:20 -0400") References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> Message-ID: <2moecl3wlv.fsf@starship.python.net> I've just submitted http://python.org/sf/1180995 which adds format codes for binary marshalling of floats if version > 1, but it doesn't quite have the effect I expected (see below): >>> inf = 1e308*1e308 >>> nan = inf/inf >>> marshal.dumps(nan, 2) Traceback (most recent call last): File "", line 1, in ? ValueError: unmarshallable object frexp(nan, &e), it turns out, returns nan, which results in this (to be expected if you read _PyFloat_Pack8 and know that I'm using a new-ish GCC -- it might be different for MSVC 6). 
Also (this is the same thing, really): >>> struct.pack('>d', inf) Traceback (most recent call last): File "", line 1, in ? SystemError: frexp() result out of range Although I was a little surprised by this: >>> struct.pack('d', inf) '\x7f\xf0\x00\x00\x00\x00\x00\x00' (this is a big-endian system). Again, reading the source explains the behaviour. Tim Peters writes: > ... > > [mwh] >> OK, so the worst that could happen here is that moving marshal data >> from one box to another could turn one sort of NaN into another? > > Right. Assuming source and destination boxes both use 754 format, and > the implementation adjusts endianess if necessary. Well, I was assuming marshal would do floats little-endian-wise, as it does for integers. > Heh. I have a vague half-memory of _some_ box that stored the two > 4-byte "words" in an IEEE double in one order, but the bytes within > each word in the opposite order. It's always something ... I recall stories of machines that stored the bytes of long in some crazy order like that. I think Python would already be broken on such a system, but, also, don't care. >>>>> Store a few small doubles at module initialization time and stare at > >>>> ./configure time, surely? > >>> Unsure. Not all Python platforms _have_ "./configure time". > >> But they all have pyconfig.h. > > Yes, and then a platform porter has to understand what to > #define/#undefine, and why. People doing cross-compilation may have > an especially confusing time of it. Well, they can always not #define HAVE_IEEE_DOUBLES and not suffer all that much (this is what I meant by false negatives below). > Module initialization code "just works", so I certainly understand > why it doesn't appeal to the Unix frame of mind . It just strikes me as silly to test at runtime something that is so obviously not going to change between invocations. But it's not a big deal either way. > ...
> >> Something along these lines: >> >> double x = 1.5; >> is_big_endian_ieee_double = sizeof(double) == 8 && \ >> memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); > > Right, it's that easy Cool. > -- at least under MSVC and gcc. Huh? Now it's my turn to be confused (for starters, under MSVC ieee doubles really can be assumed...). Cheers, mwh -- You sound surprised. We're talking about a government department here - they have procedures, not intelligence. -- Ben Hutchings, cam.misc From tim.peters at gmail.com Tue Apr 12 01:39:23 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 12 01:39:26 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2moecl3wlv.fsf@starship.python.net> References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> Message-ID: <1f7befae050411163959e7b90c@mail.gmail.com> [Michael Hudson] > I've just submitted http://python.org/sf/1180995 which adds format > codes for binary marshalling of floats if version > 1, but it doesn't > quite have the effect I expected (see below): > >>> inf = 1e308*1e308 > >>> nan = inf/inf > >>> marshal.dumps(nan, 2) > Traceback (most recent call last): > File "", line 1, in ? > ValueError: unmarshallable object I don't understand. Does "binary marshalling" _not_ mean just copying the bytes on a 754 platform? If so, that won't work. I pointed out the relevant comments before: /* The pack routines write 4 or 8 bytes, starting at p. ... * Bug: What this does is undefined if x is a NaN or infinity. * Bug: -0.0 and +0.0 produce the same string. 
*/ PyAPI_FUNC(int) _PyFloat_Pack4(double x, unsigned char *p, int le); PyAPI_FUNC(int) _PyFloat_Pack8(double x, unsigned char *p, int le); > frexp(nan, &e), it turns out, returns nan, This is an undefined case in C89 (all 754 special values are). > which results in this (to be expected if you read _PyFloat_Pack8 and > know that I'm using a new-ish GCC -- it might be different for MSVC 6). > > Also (this is the same thing, really): Right. So is pickling with proto >= 1. Changing the pack/unpack routines to copy bytes instead (when possible) "fixes" all of these things at one stroke, on boxes where it applies. > >>> struct.pack('>d', inf) > Traceback (most recent call last): > File "", line 1, in ? > SystemError: frexp() result out of range > > Although I was a little surprised by this: > > >>> struct.pack('d', inf) > '\x7f\xf0\x00\x00\x00\x00\x00\x00' > > (this is a big-endian system). Again, reading the source explains the > behaviour. >>> OK, so the worst that could happen here is that moving marshal data >>> from one box to another could turn one sort of NaN into another? >> Right. Assuming source and destination boxes both use 754 format, and >> the implementation adjusts endianess if necessary. > Well, I was assuming marshal would do floats little-endian-wise, as it > does for integers. Then on a big-endian 754 system, loads() will have to reverse the bytes in the little-endian marshal bytestring, and dumps() likewise. That's all "if necessary" meant -- sometimes cast + memcpy isn't enough, and regardless of which direction marshal decides to use. >> Heh. I have a vague half-memory of _some_ box that stored the two >> 4-byte "words" in an IEEE double in one order, but the bytes within >> each word in the opposite order. It's always something ... > I recall stories of machines that stored the bytes of long in some > crazy order like that. I think Python would already be broken on such > a system, but, also, don't care. 
Python does very little that depends on internal native byte order, and C hides it in the absence of casting abuse. Copying internal native bytes across boxes is plain ugly -- can't get more brittle than that. In this case it looks like a good tradeoff, though. > ... > Well, they can always not #define HAVE_IEEE_DOUBLES and not suffer all > that much (this is what I meant by false negatives below). > ... > It just strikes as silly to test at runtime sometime that is so > obviously not going to change between invocations. But it's not a big > deal either way. It isn't to me either. It just strikes me as silly to give porters another thing to wonder about and screw up when it's possible to solve it completely with a few measly runtime cycles . >>> Something along these lines: >>> >>> double x = 1.5; >>> is_big_endian_ieee_double = sizeof(double) == 8 && \ >>> memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8); >> Right, it's that easy > Cool. >> -- at least under MSVC and gcc. > Huh? Now it's my turn to be confused (for starters, under MSVC ieee > doubles really can be assumed...). So you have no argument with the "at least under MSVC" part . There's nothing to worry about here -- I was just tweaking. 
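[The runtime probe Tim describes, plus the 754 bit-level NaN facts from earlier in the thread, can be sketched in a few lines of (much later) Python; the helper names here are invented for illustration, and a real check would of course live in C at module-initialization time.]

```python
import struct

# Big-endian IEEE-754 bytes of 1.5 -- the same constant as the octal
# "\077\370\000..." string in Michael's C sketch (0x3F 0xF8 0x00 ...).
IEEE_BE_1_5 = b"\x3f\xf8\x00\x00\x00\x00\x00\x00"

def probe_double_format():
    """Pack 1.5 in native byte order and 'stare at the bytes'."""
    native = struct.pack("=d", 1.5)  # native order, standard 8-byte size
    if native == IEEE_BE_1_5:
        return "ieee-big-endian"
    if native == IEEE_BE_1_5[::-1]:
        return "ieee-little-endian"
    return "unknown"  # a non-754 (or mixed-endian) oddball box

def looks_like_nan(d):
    """True iff d's bits have all 11 exponent bits set and a non-zero
    52-bit fraction -- which is everything 754 guarantees about NaNs."""
    bits = struct.unpack(">Q", struct.pack(">d", d))[0]
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return exponent == 0x7FF and fraction != 0
```

On today's common boxes the probe reports one of the two IEEE answers; anything else would fall back to the portable pack/unpack path. (A C transcription of the 1.5 comparison wants `memcmp(...) == 0`, since memcmp returns zero on a match.)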
From mwh at python.net Tue Apr 12 09:39:08 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 12 09:39:10 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050411163959e7b90c@mail.gmail.com> (Tim Peters's message of "Mon, 11 Apr 2005 19:39:23 -0400") References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> Message-ID: <2m4qece95v.fsf@starship.python.net> My mail is experiencing random delays of up to a few hours at the moment. I wrote this before I saw your comments on my patch. Tim Peters writes: > [Michael Hudson] >> I've just submitted http://python.org/sf/1180995 which adds format >> codes for binary marshalling of floats if version > 1, but it doesn't >> quite have the effect I expected (see below): > >> >>> inf = 1e308*1e308 >> >>> nan = inf/inf >> >>> marshal.dumps(nan, 2) >> Traceback (most recent call last): >> File "", line 1, in ? >> ValueError: unmarshallable object > > I don't understand. Does "binary marshalling" _not_ mean just copying > the bytes on a 754 platform? No, it means using _PyFloat_Pack8/Unpack8, like the patch description says. Making those functions just fiddle bytes when they can I regard as a separate project (watch a patch manager near you, though). > If so, that won't work. I can tell! >>> Right. Assuming source and destination boxes both use 754 format, and >>> the implementation adjusts endianess if necessary. > >> Well, I was assuming marshal would do floats little-endian-wise, as it >> does for integers. > > Then on a big-endian 754 system, loads() will have to reverse the > bytes in the little-endian marshal bytestring, and dumps() likewise. Really? Even I had worked this out...
>>> Heh. I have a vague half-memory of _some_ box that stored the two >>> 4-byte "words" in an IEEE double in one order, but the bytes within >>> each word in the opposite order. It's always something ... > >> I recall stories of machines that stored the bytes of long in some >> crazy order like that. I think Python would already be broken on such >> a system, but, also, don't care. > > Python does very little that depends on internal native byte order, > and C hides it in the absence of casting abuse. This surely does: PyObject * PyLong_FromLongLong(PY_LONG_LONG ival) { PY_LONG_LONG bytes = ival; int one = 1; return _PyLong_FromByteArray( (unsigned char *)&bytes, SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); } It occurs that in the IEEE case, special values can be detected with reliability -- by picking the exponent field out by force -- and a warning emitted or exception raised. Good idea? Hard to say, to me. Cheers, mwh Oh, by the way: http://python.org/sf/1181301 -- It is time-consuming to produce high-quality software. However, that should not alone be a reason to give up the high standards of Python development. -- Martin von Loewis, python-dev From tim.peters at gmail.com Tue Apr 12 17:16:22 2005 From: tim.peters at gmail.com (Tim Peters) Date: Tue Apr 12 17:16:29 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <2m4qece95v.fsf@starship.python.net> References: <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> Message-ID: <1f7befae050412081676898e80@mail.gmail.com> ... [mwh] >>>> I recall stories of machines that stored the bytes of long in some >>>> crazy order like that.
I think Python would already be broken on such >>> a system, but, also, don't care. [Tim] >> Python does very little that depends on internal native byte order, >> and C hides it in the absence of casting abuse. [mwh] > This surely does: > > PyObject * > PyLong_FromLongLong(PY_LONG_LONG ival) > { > PY_LONG_LONG bytes = ival; > int one = 1; > return _PyLong_FromByteArray( > (unsigned char *)&bytes, > SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); > } Yes, that's "casting abuse". Python does very little of that. If it becomes necessary, it's straightforward but long-winded to rewrite the above in wholly portable C (peel the bytes out of ival, least-significant first, via shifting and masking 8 times; "ival & 0xff" is the least-significant byte regardless of memory storage order; etc). BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and more deeply than does the visible cast there. > It occurs that in the IEEE case, special values can be detected with > reliability -- by picking the exponent field out by force Right, that works for NaNs and infinities; signed zeroes are a bit trickier to detect. > -- and a warning emitted or exception raised. Good idea? Hard to say, to me. It's not possible to _create_ a NaN or infinity from finite operands in 754 without signaling some exceptional condition. Once you have one, though, there's generally nothing exceptional about _using_ it. Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. Using a quiet NaN never signals; using a signaling NaN almost always signals. So packing a nan or inf shouldn't complain. On a 754 box, unpacking one shouldn't complain either. Unpacking a nan or inf on a non-754 box probably should complain, since there's in general nothing it can be unpacked _to_ that makes any sense ("errors should never pass silently").
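[Tim's shifting-and-masking recipe looks like this when transcribed into Python -- a hypothetical helper for illustration; the real rewrite would of course be C.]

```python
import struct

def pack_longlong_le(ival, width=8):
    """Portable little-endian serialization of a signed integer:
    peel bytes out least-significant first via shifting and masking,
    independent of the machine's in-memory byte order."""
    if ival < 0:
        ival += 1 << (8 * width)  # two's-complement encoding for negatives
    out = bytearray()
    for _ in range(width):
        out.append(ival & 0xFF)  # the least-significant byte, always
        ival >>= 8
    return bytes(out)

# Agrees with the casting version's result, but without ever
# looking at native memory layout:
assert pack_longlong_le(-2) == struct.pack("<q", -2)
```

Because `ival & 0xff` always yields the least-significant byte, the output is identical on big-endian, little-endian, or crazier boxes.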
From mwh at python.net Tue Apr 12 17:32:17 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 12 23:20:49 2005 Subject: [Python-Dev] Re: marshal / unmarshal In-Reply-To: <1f7befae050412081676898e80@mail.gmail.com> (Tim Peters's message of "Tue, 12 Apr 2005 11:16:22 -0400") References: <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <2moecl3wlv.fsf@starship.python.net> <1f7befae050411163959e7b90c@mail.gmail.com> <2m4qece95v.fsf@starship.python.net> <1f7befae050412081676898e80@mail.gmail.com> Message-ID: <2mk6n8c8ou.fsf@starship.python.net> Tim Peters writes: > ... > > [mwh] >>>> I recall stories of machines that stored the bytes of long in some >>>> crazy order like that. I think Python would already be broken on such >>>> a system, but, also, don't care. > > [Tim] >>> Python does very little that depends on internal native byte order, >>> and C hides it in the absence of casting abuse. > > [mwh] >> This surely does: >> >> PyObject * >> PyLong_FromLongLong(PY_LONG_LONG ival) >> { >> PY_LONG_LONG bytes = ival; >> int one = 1; >> return _PyLong_FromByteArray( >> (unsigned char *)&bytes, >> SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1); >> } > > Yes, that's "casting abuse'. Python does very little of that. If it > becomes necessary, it's straightforward but long-winded to rewrite the > above in wholly portable C (peel the bytes out of ival, > least-signficant first, via shifting and masking 8 times; "ival & > 0xff" is the least-significant byte regardless of memory storage > order; etc). Not arguing with that. > BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and > more deeply than does the visible cast there. 
I'd like to claim that was part of my point :) There is a certain, small level of assumption in Python that "big-endian or little-endian" is the only question to ask -- and I don't think that's a problem! Even if this isn't a big deal, at least if we choose a more interesting 'probe value' than 1.5, it will just lead to the oddball box degrading to the non-ieee code. >> It occurs that in the IEEE case, special values can be detected with >> reliability -- by picking the exponent field out by force > > Right, that works for NaNs and infinities; signed zeroes are a bit > trickier to detect. Hmm. Don't think they're such a big deal. >> -- and a warning emitted or exception raised. Good idea? Hard to >> say, to me. > > It's not possible to _create_ a NaN or infinity from finite operands > in 754 without signaling some exceptional condition. Once you have > one, though, there's generally nothing exceptional about _using_ it. > Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. > Using a quiet NaN never signals; using a signaling NaN almost always > signals. > > So packing a nan or inf shouldn't complain. On a 754 box, unpacking > one shouldn't complain either. Unpacking a nan or inf on a non-754 > box probably should complain, since there's in general nothing it can > be unpacked _to_ that makes any sense ("errors should never pass > silently"). This sounds like good behaviour to me. I'll try to update the patch soon. Cheers, mwh -- BUGS Never use this function. This function modifies its first argument. The identity of the delimiting character is lost. This function cannot be used on constant strings.
-- the glibc manpage for strtok(3) From python at rcn.com Tue Apr 12 14:03:50 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Apr 13 02:04:39 2005 Subject: [Python-Dev] args attribute of Exception objects In-Reply-To: Message-ID: <000001c53f57$b0fe7d20$c2bd2c81@oemcomputer> [Sébastien de Menten] > 2) Could this be changed to .args more in line with: > a) first example: e.args = ('foo', "NameError: name 'foo' is not > defined") > b) second example: e.args = (4, 'foo', "'int' object has no attribute > 'foo'",) > the message of the string can even be retrieved with str(e) so it is > also > redundant. Something like this ought to be explored at some point. It would certainly improve the exception API to be able to get references to the objects without parsing strings. The balancing forces are backwards compatibility and a need to keep the exception mechanism as lightweight as possible. Please log a feature request on SF. Note that the idea is only for making builtin exceptions more informative. User defined exceptions can already attach arbitrary objects: >>> class Boom(Exception): pass >>> x = 10 >>> if x != 5: raise Boom("Value must be a five", x) Traceback (most recent call last): File "", line 2, in -toplevel- raise Boom("Value must be a five", x) Boom: ('Value must be a five', 10) Raymond Hettinger From prabu333 at hotpop.com Wed Apr 13 08:21:07 2005 From: prabu333 at hotpop.com (Senthil Prabu.S) Date: Wed Apr 13 08:21:24 2005 Subject: [Python-Dev] Python tests fails on HP-UX 11.11 and core dumps Message-ID: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Hello Experts, I tried Python 2.4.1 on an HP-UX 11.11 PA machine. I was able to build Python: gmake passes, but gmake test results in an error. Python reported that test_pty fails when running this test alone. Can anyone help me find out why it core dumps when running the test_subprocess.py test? Also, how can I solve it? Has anyone faced the same problem earlier?
The details are given below; # ../../python test_pty.py Calling master_open() Got master_fd '3', slave_name '/dev/pts/0' Calling slave_open('/dev/pts/0') Got slave_fd '4' Traceback (most recent call last): File "test_pty.py", line 58, in ? test_basic_pty() File "test_pty.py", line 29, in test_basic_pty if not os.isatty(slave_fd): File "test_pty.py", line 50, in handle_sig raise TestFailed, "isatty hung" test.test_support.TestFailed: isatty hung # # ../../python test_subprocess.py test_args_string (__main__.ProcessTestCase) ... ok test_call_kwargs (__main__.ProcessTestCase) ... ok test_call_seq (__main__.ProcessTestCase) ... ok test_call_string (__main__.ProcessTestCase) ... ok test_communicate (__main__.ProcessTestCase) ... ok test_communicate_pipe_buf (__main__.ProcessTestCase) ... ok test_communicate_returns (__main__.ProcessTestCase) ... ok test_cwd (__main__.ProcessTestCase) ... ok test_env (__main__.ProcessTestCase) ... ok test_exceptions (__main__.ProcessTestCase) ... ok test_executable (__main__.ProcessTestCase) ... ok test_invalid_args (__main__.ProcessTestCase) ... ok test_invalid_bufsize (__main__.ProcessTestCase) ... ok test_list2cmdline (__main__.ProcessTestCase) ... ok test_no_leaking (__main__.ProcessTestCase) ... ok test_poll (__main__.ProcessTestCase) ... ok test_preexec (__main__.ProcessTestCase) ... ok test_run_abort (__main__.ProcessTestCase) ... ok test_shell_sequence (__main__.ProcessTestCase) ... ok test_shell_string (__main__.ProcessTestCase) ... ok test_stderr_filedes (__main__.ProcessTestCase) ... ok test_stderr_fileobj (__main__.ProcessTestCase) ... ok test_stderr_none (__main__.ProcessTestCase) ... ok test_stderr_pipe (__main__.ProcessTestCase) ... ok test_stdin_filedes (__main__.ProcessTestCase) ... ok test_stdin_fileobj (__main__.ProcessTestCase) ... ok test_stdin_none (__main__.ProcessTestCase) ... ok test_stdin_pipe (__main__.ProcessTestCase) ... ok test_stdout_filedes (__main__.ProcessTestCase) ... 
ok test_stdout_fileobj (__main__.ProcessTestCase) ... ok this bit of output is from a test of stdout in a different process ... test_stdout_none (__main__.ProcessTestCase) ... ok test_stdout_pipe (__main__.ProcessTestCase) ... ok test_stdout_stderr_file (__main__.ProcessTestCase) ... ok test_stdout_stderr_pipe (__main__.ProcessTestCase) ... ok test_universal_newlines (__main__.ProcessTestCase) ... ok test_universal_newlines_communicate (__main__.ProcessTestCase) ... ok test_wait (__main__.ProcessTestCase) ... ok test_writes_before_communicate (__main__.ProcessTestCase) ... ok ---------------------------------------------------------------------- Ran 38 tests in 8.171s Analysing the core file through GDB; # gdb ../../python core HP gdb 4.5 for PA-RISC 1.1 or 2.0 (narrow), HP-UX 11.00 and target hppa1.1-hp-hpux11.00. Copyright 1986 - 2001 Free Software Foundation, Inc. Hewlett-Packard Wildebeest 4.5 (based on GDB) is covered by the GNU General Public License. Type "show copying" to see the conditions to change it and/or distribute copies. Type "show warranty" for warranty/support. .. Core was generated by `python'. Program terminated with signal 6, Aborted. 
#0 0xc020bad0 in kill+0x10 () from /usr/lib/libc.2 (gdb) bt #0 0xc020bad0 in kill+0x10 () from /usr/lib/libc.2 #1 0xc01a655c in raise+0x24 () from /usr/lib/libc.2 #2 0xc01e69a8 in abort_C+0x160 () from /usr/lib/libc.2 #3 0xc01e6a04 in abort+0x1c () from /usr/lib/libc.2 #4 0xffbe4 in posix_abort (self=0x40029098, noargs=0x0) at ./Modules/posixmodule.c:7158 #5 0xc9b7c in PyEval_EvalFrame (f=0x40028e54) at Python/ceval.c:3531 #6 0xc01a655c in raise+0x24 () from /usr/lib/libc.2 #7 0x475b0 in freechildren (n=0x0) at Parser/node.c:131 (gdb) Build Environment; GCC - gcc version 3.4.3 HP-UX omega B.11.11 U 9000/800 ./configure --prefix=/opt/iexpress/python --disable-ipv6 --with-signal-module --with-threads Earlier, I faced a problem during gmake, and made changes as per the following link; https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1071597&group_id=5470 And I was able to build Python successfully. Also, the overall results of the tests are; 250 tests OK. 1 test failed: test_pty 39 tests skipped: test_aepack test_al test_applesingle test_bsddb test_bsddb185 test_bsddb3 test_bz2 test_cd test_cl test_codecmaps_cn test_codecmaps_hk test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses test_dl test_gdbm test_gl test_imgfile test_largefile test_linuxaudiodev test_locale test_macfs test_macostools test_nis test_normalization test_ossaudiodev test_pep277 test_plistlib test_scriptpackages test_socket_ssl test_socketserver test_sunaudiodev test_tcl test_timeout test_urllib2net test_urllibnet test_winreg test_winsound 2 skips unexpected on hp-ux11: test_tcl test_bz2 gmake: *** [test] Error 1 -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20050413/9ad9e89f/attachment.htm From python-dev at zesty.ca Wed Apr 13 11:03:43 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Wed Apr 13 11:03:56 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: <20050409090223.13806.1208552072.divmod.quotient.54598@ohm> <1f8cfb9a8805dcc73339b4ea0164e63b@fuhm.net> Message-ID: On Sun, 10 Apr 2005, Eyal Lotem wrote: > It may be really hard to get it right, unless we are overlooking some simple > solution. To "get it right", you at least need to know exactly what your operators mean. I messed up because i failed to realize that '==' can be redefined, and 'in' depends on '==' to work properly. > What about implementing the facet in C? This could avoid the class of > problems you have just mentioned. I don't think that's a good solution. A facet is just one basic programming pattern that you can build in a capability system; it would be silly to have to go back to C every time you wanted to build some other construct. A better way would be to start with capabilities that behave simply and correctly; then you can build whatever you want. -- ?!ng From ncoghlan at iinet.net.au Wed Apr 13 11:23:37 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Apr 13 11:27:31 2005 Subject: [Python-Dev] Unified or context diffs? Message-ID: <425CE519.4090700@iinet.net.au> Are context diffs still favoured for patches? The patch submission guidelines [1] still say that, but is it actually true these days? I personally prefer unified diffs, but have been generating context diffs because of what the guidelines say. Brett can probably guess why I'm asking :) Cheers, Nick. 
[1] http://www.python.org/patches/ -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From phil at riverbankcomputing.co.uk Wed Apr 13 11:41:24 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Wed Apr 13 11:41:03 2005 Subject: [Python-Dev] super_getattro() Behaviour Message-ID: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> In PyQt, wrapped types implement lazy access to the type dictionary through tp_getattro. If the normal attribute lookup fails, then private tables are searched and the attribute (if found) is created on the fly and returned. It is also put into the type dictionary so that it is found next time through the normal lookup. This is done to speed up the import of, and reduce the memory consumed by, the qt module, which contains thousands of class methods. This all works fine - except when super is used. The implementation of super_getattro() doesn't use the normal attribute lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO hierarchy itself and searches instance dictionaries explicitly. This means that attributes that have not yet been referenced (ie. not yet been cached in the type dictionary) will not be found. Questions... 1. What is the reason why it doesn't go via tp_getattro? Bug or feature? 2. A possible workaround is to subvert the ma_lookup function of the type dictionary after creating the type to do something similar to what my tp_getattro function is doing. Are there any inherent problems with that? 3. Why, when creating a new type and eventually calling type_new(), is a copy of the dictionary passed in made? Why not take a reference to it? This would allow a dict sub-class to be used as the type dictionary. I could then implement a lazy-dict sub-class with the behaviour I need. 4. Am I missing a more correct/obvious technique? (There is no need to support classic classes.)
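A pure-Python sketch of the situation described above (hypothetical names; modern class syntax for brevity): a metaclass __getattr__ plays the role of the custom tp_getattro fallback. Ordinary class-attribute access triggers the lazy fallback and caches the result in the type dictionary, but super's own MRO walk searches the type dictionaries directly and misses anything not yet cached.

```python
class LazyMeta(type):
    # Stand-in for the wrapped type's tp_getattro fallback: create a
    # missing class attribute on demand and cache it in the type dict.
    def __getattr__(cls, name):
        if name == "greet":
            def greet(self):
                return "made on demand"
            setattr(cls, name, greet)   # cache it, as the qt wrapper does
            return greet
        raise AttributeError(name)

class Base(metaclass=LazyMeta):
    pass

class Derived(Base):
    def via_super(self):
        return super().greet()

d = Derived()
try:
    d.via_super()               # super() walks the MRO type dicts directly
except AttributeError:
    print("super() missed the lazy attribute")

Base.greet                      # a normal lookup creates and caches it...
print(d.via_super())            # ...and now super() finds it: 'made on demand'
```

The first super() call fails even though ordinary attribute access would have succeeded; once a normal lookup has cached the method in the type dictionary, super() finds it too.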
Many thanks, Phil From prabu333 at hotpop.com Wed Apr 13 13:11:54 2005 From: prabu333 at hotpop.com (Senthil Prabu.S) Date: Wed Apr 13 13:12:10 2005 Subject: [Python-Dev] IPV6 with Python- 4.2.1 on HPUX References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Message-ID: <03ee01c54019$9af8bdc0$1f0110ac@sesco> Hi Experts, I am pretty new to Python. I have been trying to compile Python on an HP-UX 11.23 IPF machine. I tried to build with the following configure options. ./configure --prefix=/opt/iexpress/python --enable-ipv6 --with-signal-module --with-threads machine info : HP-UX beta B.11.23 U ia64 gcc : gcc version 3.4.3 While running configure, I faced the following problem: checking ipv6 stack type... ./configure[13033]: /usr/xpg4/bin/grep: not found. unknown Then I checked the config.log to find the entries for IPv6: configure:12811: checking if --enable-ipv6 is specified configure:12822: result: yes configure:12954: checking ipv6 stack type conftest.c:78:22: features.h: No such file or directory conftest.c:78:48: /usr/local/v6/include/sys/v6config.h: No such file or directory configure:13111: result: unknown But configure did not produce any error message. So please advise whether Python supports the IPv6 option on HP-UX, because I know IPv6 differs between Linux and HP-UX. If I need not worry about this and can simply build Python, how do I check whether the IPv6 option works well with my Python? Could anyone please help with how to test the IPv6 functionality? Is there any specific IPv6 test available with Python? I could not find any specific test suite for IPv6 under the test directory. Please share your comments. Thanks in advance, Senthil Prabu.S -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050413/c4903408/attachment.html From python at rcn.com Wed Apr 13 07:28:18 2005 From: python at rcn.com (Raymond Hettinger) Date: Wed Apr 13 19:28:45 2005 Subject: [Python-Dev] Unified or context diffs?
In-Reply-To: <425CE519.4090700@iinet.net.au> Message-ID: <003301c53fe9$9995fa40$3c36c797@oemcomputer> [Nick Coghlan] > Are context diffs still favoured for patches? > > The patch submission guidelines [1] still say that, but is it actually > true > these days? I personally prefer unified diffs, but have been generating > context > diffs because of what the guidelines say. Submit whichever is the most informative. For some changes, it is easier to see the changed lines immediately above and below each other. For others, it helps to be able to see the whole algorithm. Raymond From irmen at xs4all.nl Wed Apr 13 19:38:40 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Wed Apr 13 19:38:44 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <003301c53fe9$9995fa40$3c36c797@oemcomputer> References: <003301c53fe9$9995fa40$3c36c797@oemcomputer> Message-ID: <425D5920.3050402@xs4all.nl> Raymond Hettinger wrote: > [Nick Coghlan] > >>Are context diffs still favoured for patches? >> >>The patch submission guidelines [1] still say that, but is it actually >>true >>these days? I personally prefer unified diffs, but have been > > generating > >>context >>diffs because of what the guidelines say. > > > Submit whichever is the most informative. For some changes, it is > easier to see the changed lines immediately above and below each other. > For others, it helps to be able to see the whole algorithm. And for the 'patch' tool, it doesn't really matter what you use, right? --Irmen From bac at OCF.Berkeley.EDU Wed Apr 13 21:54:08 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 13 21:54:17 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> References: <425CE519.4090700@iinet.net.au> Message-ID: <425D78E0.5070605@ocf.berkeley.edu> Nick Coghlan wrote: > Are context diffs still favoured for patches? > > The patch submission guidelines [1] still say that, but is it actually > true these days? 
I personally prefer unified diffs, but have been > generating context diffs because of what the guidelines say. > I personally like unified diffs a lot more since you can see exactly how a line changed compared to the previous version, but that's me. I just checked the dev FAQ and it consistently says contextual diffs as well. > Brett can probably guess why I'm asking :) > =) > Cheers, > Nick. > > [1] http://www.python.org/patches/ > I didn't even know that page existed! I thought at one point this question came up and the general consensus was that unified diffs were preferred? -Brett From nas at arctrix.com Wed Apr 13 22:09:43 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Wed Apr 13 22:09:48 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425D78E0.5070605@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> Message-ID: <20050413200943.GA24038@mems-exchange.org> On Wed, Apr 13, 2005 at 12:54:08PM -0700, Brett C. wrote: > I thought at one point this question came up and the general > consensus was that unified diffs were preferred? Guido used to prefer context diffs but says he now doesn't mind unified diffs. I think unified diffs are much more common these days so that's probably what most people are used to. As Raymond says, for certain types of changes, context diffs are more readable. Still, I always use unified diffs. Neil From barry at python.org Wed Apr 13 22:49:14 2005 From: barry at python.org (Barry Warsaw) Date: Wed Apr 13 22:49:18 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425D78E0.5070605@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> Message-ID: <1113425354.10345.37.camel@geddy.wooz.org> On Wed, 2005-04-13 at 15:54, Brett C. wrote: > I thought at one point this question came up and the general consensus was that > unified diffs were preferred? 
Back in the day, we preferred context diffs, and I think of the original Python core group, Guido was the last holdout. But IIRC, a few years ago the issue came up again; Guido had changed his mind so we changed syncmail to produce unified diffs. IMO unifieds are preferred when the diffs are for human consumption, but when they're only for machine consumption, anything that the patch program accepts is fine. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050413/c2db5280/attachment.pgp From martin at v.loewis.de Wed Apr 13 23:11:27 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 13 23:11:31 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> References: <425CE519.4090700@iinet.net.au> Message-ID: <425D8AFF.5090508@v.loewis.de> Nick Coghlan wrote: > Are context diffs still favoured for patches? Just for the record: I also prefer unified over context diffs. Regards, Martin From bac at OCF.Berkeley.EDU Wed Apr 13 23:26:12 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 13 23:26:33 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <1113425354.10345.37.camel@geddy.wooz.org> References: <425CE519.4090700@iinet.net.au> <425D78E0.5070605@ocf.berkeley.edu> <1113425354.10345.37.camel@geddy.wooz.org> Message-ID: <425D8E74.3030708@ocf.berkeley.edu> Barry Warsaw wrote: > On Wed, 2005-04-13 at 15:54, Brett C. wrote: > > >>I thought at one point this question came up and the general consensus was that >>unified diffs were preferred? > > > Back in the day, we preferred context diffs, and I think of the original > Python core group, Guido was the last holdout. 
But IIRC, a few years > ago the issue came up again; Guido had changed his mind so we changed > syncmail to produce unified diffs. > Eh. Guido doesn't deal with patches anymore, so his opinion doesn't count. =) > IMO unifieds are preferred when the diffs are for human consumption, but > when they're only for machine consumption, anything that the patch > program accepts is fine. > OK, it seems like everyone who cares enough to speak up has said so far that unified diffs are better, so I will change the docs some time between now and when I keel over dead to have people use unified diffs, assuming some rush of people don't suddenly start saying they prefer contextual diffs. -Brett From martin at v.loewis.de Wed Apr 13 23:31:28 2005 From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed Apr 13 23:31:31 2005 Subject: [Python-Dev] Python tests fails on HP-UX 11.11 and core dumps In-Reply-To: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> Message-ID: <425D8FB0.3030404@v.loewis.de> Senthil Prabu.S wrote: > I tried python -4.2.1 on a HP-UX 11.11 PA machine. I was able to > build Python. Gmake passes, gmake test results in error. Python reported > that test_pty fails, when running this test alone. > > Can anyone help me find why it core dumps when running the > *test_subprocess.py* test. > Also, how can I solve it? Please understand that python-dev is not the place to get free consulting. If you are willing to investigate somewhat further, try to understand the problem, and propose patches, then I would be willing to review the patches, comment on their correctness, and perhaps integrate them into the Python CVS. As it stands, I can personally take no more time to help with HP-UX problems for the near future (say, ten years :-) I do recall that there are serious problems with pseudo-terminals in Python and HP-UX, so yes, we have heard of this before. If I knew a solution, it would already have been applied to Python.
Please understand that this perhaps hostile-sounding response is just my personal view; if somebody else responds more gracefully, just ignore me. Regards, Martin From mwh at python.net Wed Apr 13 12:06:13 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 13 23:32:47 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <425CE519.4090700@iinet.net.au> (Nick Coghlan's message of "Wed, 13 Apr 2005 19:23:37 +1000") References: <425CE519.4090700@iinet.net.au> Message-ID: <2m3btvc7oq.fsf@starship.python.net> Nick Coghlan writes: > Are context diffs still favoured for patches? If you want me to review it, yes, probably, but see below... > The patch submission guidelines [1] still say that, but is it actually > true these days? I personally prefer unified diffs, but have been > generating context diffs because of what the guidelines say. Emacs 21's diff-mode can convert between the two with a keypress. People who continue to abuse themselves by not using Emacs can probably find other tools to do this job. So *I* don't regard this as a big deal. Plain diffs are of course, right out. Cheers, mwh -- It is never worth a first class man's time to express a majority opinion. By definition, there are plenty of others to do that. -- G. H. Hardy From mwh at python.net Wed Apr 13 13:52:32 2005 From: mwh at python.net (Michael Hudson) Date: Wed Apr 13 23:55:59 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> (Phil Thompson's message of "Wed, 13 Apr 2005 10:41:24 +0100 (BST)") References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> Message-ID: <2mu0mac2rj.fsf@starship.python.net> "Phil Thompson" writes: > In PyQt, wrapped types implement lazy access to the type dictionary > through tp_getattro. If the normal attribute lookup fails, then private > tables are searched and the attribute (if found) is created on the fly and > returned. 
It is also put into the type dictionary so that it is found next > time through the normal lookup. This is done to speed up the import of, > and the memory consumed by, the qt module which contains thousands of > class methods. > > This all works fine - except when super is used. > > The implementation of super_getattro() doesn't use the normal attribute > lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO > hierarchy itself and searches instance dictionaries explicitly. This means > that attributes that have not yet been referenced (ie. not yet been cached > in the type dictionary) will not be found. > > Questions... > > 1. What is the reason why it doesn't go via tp_getattro? Because it wouldn't work if it did? I'm not sure what you're suggesting here. > 2. A possible workaround is to subvert the ma_lookup function of the type > dictionary after creating the type to do something similar to what my > tp_getattro function is doing. Eek! > Are there any inherent problems with that? Well, I think the layout of dictionaries is fiercely private. IIRC, the only reason it's in a public header is to allow some optimizations in ceval.c (though this isn't at all obvious from the headers, so maybe I'm mistaken). > 3. Why, when creating a new type and eventually calling type_new() is a > copy of the dictionary passed in made? I think this is to prevent changes to tp_dict behind the type's back. It's important to keep the dict and the slots in sync. > Why not take a reference to it? This would allow a dict sub-class > to be used as the type dictionary. I could then implement a > lazy-dict sub-class with the behaviour I need. Well, not really, because super_getattro uses PyDict_GetItem, which doesn't respect subclasses... > 4. Am I missing a more correct/obvious technique? (There is no need to > support classic classes.) Hum, I can't think of one, I'm afraid.
There has been some vague talk of having a tp_lookup slot in typeobjects, so PyDict_GetItem(t->tp_dict, x); would become t->tp_lookup(x); (well, ish, it might make more sense to only do that if the dict lookup fails). For now, not being lazy seems your only option :-/ (it's what PyObjC does). Cheers, mwh -- Many of the posts you see on Usenet are actually from moths. You can tell which posters they are by their attraction to the flames. -- Internet Oracularity #1279-06 From anthony at interlink.com.au Thu Apr 14 05:14:19 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 14 05:15:12 2005 Subject: [Python-Dev] IPV6 with Python- 4.2.1 on HPUX In-Reply-To: <03ee01c54019$9af8bdc0$1f0110ac@sesco> References: <02ab01c53ff0$fd2ef0a0$1f0110ac@sesco> <03ee01c54019$9af8bdc0$1f0110ac@sesco> Message-ID: <200504141314.21335.anthony@interlink.com.au> On Wednesday 13 April 2005 21:11, Senthil Prabu.S wrote: > Hi Experts, > I am pretty new to Python. I have been trying to compile python > on HP-UX 11.23 IPF machine. I tried to build with following configure > option. > > ./configure --prefix=/opt/iexpress/python --enable-ipv6 > --with-signal-module --with-threads machine info : HP-UX beta B.11.23 U > ia64 > gcc : gcc version 3.4.3 > While configure, I faced the following pbm, Last time I tried, gcc on HPUX/ia64 was completely unable to build a working version of Python - this was not the fault of Python, but simply that gcc on that platform was utterly broken. Please try with the HP compiler instead, see if that is any better. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From anthony at interlink.com.au Thu Apr 14 05:17:54 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Thu Apr 14 05:18:42 2005 Subject: [Python-Dev] Unified or context diffs? 
In-Reply-To: <425D8E74.3030708@ocf.berkeley.edu> References: <425CE519.4090700@iinet.net.au> <1113425354.10345.37.camel@geddy.wooz.org> <425D8E74.3030708@ocf.berkeley.edu> Message-ID: <200504141317.57623.anthony@interlink.com.au> On Thursday 14 April 2005 07:26, Brett C. wrote: > OK, it seems like everyone who cares enough to speak up has said so far > that unified diffs are better I will change the docs some time between now > and when I keel over dead to have people use unified diffs assuming some > rush of people don't suddenly start saying they prefer contextual diffs. Should probably say either context or unified diffs - I'm sure there are vendor-supplied 'diff' programs out there that don't support -u. ed-style patches, of course, are RIGHT OUT. Anthony -- Anthony Baxter It's never too late to have a happy childhood. From bob at redivi.com Thu Apr 14 05:35:49 2005 From: bob at redivi.com (Bob Ippolito) Date: Thu Apr 14 05:35:56 2005 Subject: [Python-Dev] Unified or context diffs? In-Reply-To: <200504141317.57623.anthony@interlink.com.au> References: <425CE519.4090700@iinet.net.au> <1113425354.10345.37.camel@geddy.wooz.org> <425D8E74.3030708@ocf.berkeley.edu> <200504141317.57623.anthony@interlink.com.au> Message-ID: <42a12315867d56b187a49306f578ac52@redivi.com> On Apr 13, 2005, at 11:17 PM, Anthony Baxter wrote: > On Thursday 14 April 2005 07:26, Brett C. wrote: >> OK, it seems like everyone who cares enough to speak up has said so >> far >> that unified diffs are better I will change the docs some time >> between now >> and when I keel over dead to have people use unified diffs assuming >> some >> rush of people don't suddenly start saying they prefer contextual >> diffs.
It might be worth mentioning that if/when subversion is used to replace CVS, unified diffs are going to be the obvious way to do it, because I don't think that subversion supports context diffs without using an external diff command. -bob From phil at riverbankcomputing.co.uk Thu Apr 14 10:24:55 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Thu Apr 14 10:25:11 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <2mu0mac2rj.fsf@starship.python.net> References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> Message-ID: <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> > "Phil Thompson" writes: > >> In PyQt, wrapped types implement lazy access to the type dictionary >> through tp_getattro. If the normal attribute lookup fails, then private >> tables are searched and the attribute (if found) is created on the fly >> and >> returned. It is also put into the type dictionary so that it is found >> next >> time through the normal lookup. This is done to speed up the import of, >> and the memory consumed by, the qt module which contains thousands of >> class methods. >> >> This all works fine - except when super is used. >> >> The implementation of super_getattro() doesn't use the normal attribute >> lookup (ie. doesn't go via tp_getattro). Instead it walks the MRO >> hierarchy itself and searches instance dictionaries explicitly. This >> means >> that attributes that have not yet been referenced (ie. not yet been >> cached >> in the type dictionary) will not be found. >> >> Questions... >> >> 1. What is the reason why it doesn't go via tp_getattro? > > Because it wouldn't work if it did? I'm not sure what you're > suggesting here. I'm asking for an explanation for the current implementation. Why wouldn't it work if it got the attribute via tp_getattro? >> 2. 
A possible workaround is to subvert the ma_lookup function of the >> type >> dictionary after creating the type to do something similar to what my >> tp_getattro function is doing. > > Eek! Agreed. >> Are there any inherent problems with that? > > Well, I think the layout of dictionaries is fiercely private. IIRC, > the only reason it's in a public header is to allow some optimzations > in ceval.c (though this isn't at all obvious from the headers, so > maybe I'm mistaken). Yes, having looked in more detail at the dict implementation I really don't want to go there. >> 3. Why, when creating a new type and eventually calling type_new() is a >> copy of the dictionary passed in made? > > I think this is to prevent changes to tp_dict behind the type's back. > It's important to keep the dict and the slots in sync. > >> Why not take a reference to it? This would allow a dict sub-class >> to be used as the type dictionary. I could then implement a >> lazy-dict sub-class with the behaviour I need. > > Well, not really, because super_getattro uses PyDict_GetItem, which > doesn't respect subclasses... I suppose I was hoping for more C++ like behaviour. >> 4. Am I missing a more correct/obvious technique? (There is no need to >> support classic classes.) > > Hum, I can't think of one, I'm afraid. > > There has been some vague talk of having a tp_lookup slot in > typeobjects, so > > PyDict_GetItem(t->tp_dict, x); > > would become > > t->tp_lookup(x); > > (well, ish, it might make more sense to only do that if the dict > lookup fails). That would be perfect. I can't Google any reference to a discussion - can you point me at something? > For now, not being lazy seems your only option :-/ (it's what PyObjC > does). Not practical I'm afraid. I think I can only document that super doesn't work in this context. 
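The dict-copying behaviour discussed above is easy to observe from pure Python (a sketch with hypothetical names): type() copies the namespace into a plain dict when the type is created, so a lazy dict subclass loses its magic, just as question 3 anticipates.

```python
class LazyDict(dict):
    # A dict subclass that fabricates missing values on demand --
    # the kind of lazy namespace question 3 asks about.
    def __missing__(self, key):
        value = "made for %s" % key
        self[key] = value
        return value

ns = LazyDict(x=1)
T = type("T", (object,), ns)    # type_new() copies ns into a plain dict

print(ns["anything"])           # the subclass itself works: 'made for anything'
print(T.x)                      # copied entries survive: 1
try:
    T.anything                  # class lookup hits the plain-dict copy...
except AttributeError:
    print("lazy hook lost")     # ...so __missing__ never fires
```

This is the same effect at the Python level as super_getattro's use of PyDict_GetItem: lookups on the type dictionary never consult subclass hooks.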
Thanks, Phil From mwh at python.net Thu Apr 14 10:56:43 2005 From: mwh at python.net (Michael Hudson) Date: Thu Apr 14 10:56:46 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> (Phil Thompson's message of "Thu, 14 Apr 2005 09:24:55 +0100 (BST)") References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> Message-ID: <2mwtr5ag8k.fsf@starship.python.net> "Phil Thompson" writes: >>> Questions... >>> >>> 1. What is the reason why it doesn't go via tp_getattro? >> >> Because it wouldn't work if it did? I'm not sure what you're >> suggesting here. > > I'm asking for an explanation for the current implementation. Why wouldn't > it work if it got the attribute via tp_getattro? Well, using type->tp_getattro is just different to looking in tp_dict -- it finds metamethods, for example. Hmm. Well, I'm fairly sure there is a difference, I'm not sure I can explain it right now :( >>> 2. A possible workaround is to subvert the ma_lookup function of the >>> type >>> dictionary after creating the type to do something similar to what my >>> tp_getattro function is doing. [...] > Yes, having looked in more detail at the dict implementation I really > don't want to go there. Good :) >>> 4. Am I missing a more correct/obvious technique? (There is no need to >>> support classic classes.) >> >> Hum, I can't think of one, I'm afraid. >> >> There has been some vague talk of having a tp_lookup slot in >> typeobjects, so >> >> PyDict_GetItem(t->tp_dict, x); >> >> would become >> >> t->tp_lookup(x); >> >> (well, ish, it might make more sense to only do that if the dict >> lookup fails). > > That would be perfect. I can't Google any reference to a discussion - can > you point me at something? 
Well, most of the discussion so far has been in my head :) There was a little talk of it in the thread "can we stop pretending _PyType_Lookup is internal" here and possibly on pyobjc-dev around the same time. I'm not that likely to work on it soon -- I have enough moderately complex patches to core Python I'm persuading people to think about :-/. >> For now, not being lazy seems your only option :-/ (it's what PyObjC >> does). > > Not practical I'm afraid. I think I can only document that super doesn't > work in this context. Oh well. I can't even think of a way to make it fail reliably... Cheers, mwh -- Java sucks. [...] Java on TV set top boxes will suck so hard it might well inhale people from off their sofa until their heads get wedged in the card slots. --- Jon Rabone, ucam.chat From phil at riverbankcomputing.co.uk Thu Apr 14 11:15:34 2005 From: phil at riverbankcomputing.co.uk (Phil Thompson) Date: Thu Apr 14 11:15:11 2005 Subject: [Python-Dev] super_getattro() Behaviour In-Reply-To: <2mwtr5ag8k.fsf@starship.python.net> References: <13256.82.68.80.137.1113385284.squirrel@river-bank.demon.co.uk> <2mu0mac2rj.fsf@starship.python.net> <25259.82.68.80.137.1113467095.squirrel@river-bank.demon.co.uk> <2mwtr5ag8k.fsf@starship.python.net> Message-ID: <27767.82.68.80.137.1113470134.squirrel@river-bank.demon.co.uk> >>>> 4. Am I missing a more correct/obvious technique? (There is no need to >>>> support classic classes.) >>> >>> Hum, I can't think of one, I'm afraid. >>> >>> There has been some vague talk of having a tp_lookup slot in >>> typeobjects, so >>> >>> PyDict_GetItem(t->tp_dict, x); >>> >>> would become >>> >>> t->tp_lookup(x); >>> >>> (well, ish, it might make more sense to only do that if the dict >>> lookup fails). >> >> That would be perfect. I can't Google any reference to a discussion - >> can >> you point me at something? 
> > Well, most of the discussion so far has been in my head :) > > There was a little talk of it in the thread "can we stop pretending > _PyType_Lookup is internal" here and possibly on pyobjc-dev around the > same time. > > I'm not that likely to work on it soon -- I have enough moderately > complex patches to core Python I'm persuading people to think about > :-/. Anything I can do to help push it along? Phil From fredrik at pythonware.com Thu Apr 14 12:41:34 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Apr 14 12:43:25 2005 Subject: [Python-Dev] Re: Unified or context diffs? References: <425CE519.4090700@iinet.net.au><1113425354.10345.37.camel@geddy.wooz.org><425D8E74.3030708@ocf.berkeley.edu><200504141317.57623.anthony@interlink.com.au> <42a12315867d56b187a49306f578ac52@redivi.com> Message-ID: Bob Ippolito wrote: > It might be worth mentioning that if/when subversion is used to replace CVS, unified diffs are > going to be the obvious way to do it, because I don't think that subversion supports context diffs > without using an external diff command. subversion? you meant bazaar-ng, right? From python-dev at zesty.ca Thu Apr 14 13:27:20 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Thu Apr 14 13:27:24 2005 Subject: [Python-Dev] Check out a new way to read threaded conversations. Message-ID: I hope you will not mind too much if I ask a small favor. Sorry for this off-topic post. I am working on a new design for displaying online conversations. (Some of you saw this at PyCon.) I'm conducting a short survey to gather some opinions on the current design. If you have just a few minutes to spare, would you please visit: http://zesty.ca/threadmap/pydev.cgi You'll see a new way of looking at this discussion list that you may find pretty interesting. I look forward to learning what you think of it. I am very grateful for your time and assistance. 
(If you reply to this message, please reply to me only -- I don't want to clutter up python-dev with lots of off-topic messages.)

-- Ping

From drobinow at gmail.com Thu Apr 14 15:08:30 2005
From: drobinow at gmail.com (David Robinow)
Date: Thu Apr 14 15:08:45 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <1f7befae05041111284b61992a@mail.gmail.com>
References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com>
Message-ID: <4eb0089f0504140608493fe329@mail.gmail.com>

On 4/11/05, Tim Peters wrote:
> Heh. I have a vague half-memory of _some_ box that stored the two
> 4-byte "words" in an IEEE double in one order, but the bytes within
> each word in the opposite order. It's always something ...
I believe this was the Floating Instruction Set on the PDP 11/35. The fact that it's still remembered 30 years later shows how unusual it was.

From ldlandis at gmail.com Thu Apr 14 15:23:57 2005
From: ldlandis at gmail.com (LD "Gus" Landis)
Date: Thu Apr 14 15:30:55 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <16983.51916.407019.489590@montanaro.dyndns.org>
References: <1f7befae0504081638145d3b4c@mail.gmail.com> <4257B8A1.6000902@v.loewis.de> <16983.51916.407019.489590@montanaro.dyndns.org>
Message-ID: 

Hi,

For AIX:

Python 2.2 (#1, Feb 17 2003, 21:43:03) [C] on aix4
Type "help", "copyright", "credits" or "license" for more information.
>>> import marshal
>>> marshal.dumps(1e10000)
'f\x03INF'
>>> marshal.loads(marshal.dumps(1e10000))
INF
>>> float("INF")
INF
>>> float("NaN")
NaNQ
>>>

On 4/9/05, Skip Montanaro wrote:
> > Martin> Yet, this *still* is a platform dependence.
Python makes no > Martin> guarantee that 1e1000 is a supported float literal on any > Martin> platform, and indeed, on your platform, 1e1000 is not supported > Martin> on your platform. > > Are float("inf") and float("nan") supported everywhere? > -- LD Landis - N0YRQ - from the St Paul side of Minneapolis From nbastin at opnet.com Thu Apr 14 19:57:43 2005 From: nbastin at opnet.com (Nicholas Bastin) Date: Thu Apr 14 20:18:28 2005 Subject: [Python-Dev] PyCallable_Check redeclaration Message-ID: <43ed01d7a659a9a79793ddfcff0e957e@opnet.com> Why is PyCallable_Check declared in both object.h and abstract.h? It appears that it's been this way for quite some time (exists in both 2.3.4 and 2.4.1). -- Nick From kbk at shore.net Thu Apr 14 20:43:28 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Thu Apr 14 20:43:56 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504141843.j3EIhSNw007456@bayview.thirdcreek.com> Patch / Bug Summary ___________________ Patches : 314 open ( +6) / 2824 closed ( +5) / 3138 total (+11) Bugs : 898 open (+16) / 4921 closed ( +8) / 5819 total (+24) RFE : 177 open ( +1) / 151 closed ( +0) / 328 total ( +1) New / Reopened Patches ______________________ typos in rpc.py (2005-04-09) CLOSED http://python.org/sf/1179503 opened by engelbert gruber [AST] Fix for core in test_grammar.py (2005-04-08) http://python.org/sf/1179513 opened by logistix no html file for modulefinder (2005-04-10) http://python.org/sf/1180012 opened by George Yoshida fix typos in Library Reference (2005-04-10) http://python.org/sf/1180062 opened by George Yoshida great improvement for locale.py formatting functions (2005-04-10) http://python.org/sf/1180296 opened by Georg Brandl clarify behavior of StringIO objects when preinitialized (2005-04-10) CLOSED http://python.org/sf/1180305 opened by Georg Brandl st_gen and st_birthtime support for FreeBSD (2005-04-11) http://python.org/sf/1180695 opened by Antti Louko binary formats for marshalling floats 
(2005-04-11) http://python.org/sf/1180995 opened by Michael Hudson make float packing copy bytes when they can (2005-04-12) http://python.org/sf/1181301 opened by Michael Hudson range() in for loops, again (2005-04-12) http://python.org/sf/1181334 opened by Armin Rigo HMAC hexdigest and general review (2005-04-13) http://python.org/sf/1182394 opened by Shane Holloway Patches Closed ______________ Complex commented (2005-04-06) http://python.org/sf/1177597 closed by loewis typos in rpc.py (2005-04-08) http://python.org/sf/1179503 closed by rhettinger clarify behavior of StringIO objects when preinitialized (2005-04-10) http://python.org/sf/1180305 closed by rhettinger Improved output for unittest failUnlessEqual (2003-04-22) http://python.org/sf/725569 closed by purcell [AST] Generator expressions (2005-03-21) http://python.org/sf/1167628 closed by bcannon New / Reopened Bugs ___________________ 256 should read 255 in operator module docs (2005-04-06) CLOSED http://python.org/sf/1178255 opened by Dan Everhart operator.isMappingType and isSequenceType on instances (2005-04-06) CLOSED http://python.org/sf/1178269 opened by Dan Everhart Erroneous line number error in Py2.4.1 (2005-04-07) http://python.org/sf/1178484 opened by Timo Linna configure: refuses setgroups (2005-04-07) http://python.org/sf/1178510 opened by zosh 2.4.1 breaks pyTTS (2005-04-07) http://python.org/sf/1178624 opened by Dieter Deyke Variable.__init__ uses self.set(), blocking specialization (2005-04-07) http://python.org/sf/1178863 opened by Emil Variable.__init__ uses self.set(), blocking specialization (2005-04-07) http://python.org/sf/1178872 opened by Emil IDLE bug - changing shortcuts (2005-04-08) http://python.org/sf/1179168 opened by Przemysław Gocyła can't import thru cygwin symlink (2005-04-08) http://python.org/sf/1179412 opened by steveward Missing def'n of equality for set elements (2005-04-09) CLOSED http://python.org/sf/1179957 opened by Skip Montanaro codecs.readline sometimes 
removes newline chars (2005-04-02) http://python.org/sf/1175396 reopened by doerwalter locale.format question (2005-04-10) CLOSED http://python.org/sf/1180002 opened by Andrew Ma test_posix fails on cygwin (2005-04-10) http://python.org/sf/1180147 opened by Henrik Wist subprocess.Popen fails with closed stdout (2005-04-10) http://python.org/sf/1180160 opened by neuhauser broken pyc files (2005-04-10) http://python.org/sf/1180193 opened by Armin Rigo Python keeps file references after calling close methode (2005-04-10) http://python.org/sf/1180237 opened by Eelco expanding platform module and making it work as it should (2005-04-10) http://python.org/sf/1180267 opened by Nikos Kouremenos StringIO's docs should mention overwriting of initial value (2005-04-10) CLOSED http://python.org/sf/1180392 opened by Leif K-Brooks BaseHTTPServer uses deprecated mimetools.Message (2005-04-11) http://python.org/sf/1180470 opened by Paul Jimenez lax error-checking in new-in-2.4 marshal stuff (2005-04-11) http://python.org/sf/1180997 opened by Michael Hudson Bad sys.executable value for bdist_wininst install script (2005-04-12) http://python.org/sf/1181619 opened by follower asyncore.loop() documentation (2005-04-13) http://python.org/sf/1181939 opened by Graham re.escape(s) prints wrong for chr(0) (2005-04-13) http://python.org/sf/1182603 opened by Nick Jacobson dir() does not include _ (2005-04-13) http://python.org/sf/1182614 opened by Nick Jacobson ZipFile __del__/close problem with longint/long files (2005-04-14) http://python.org/sf/1182788 opened by Robert Kiendl Bugs Closed ___________ 256 should read 255 in operator module docs (2005-04-06) http://python.org/sf/1178255 closed by rhettinger operator.isMappingType and isSequenceType on instances (2005-04-06) http://python.org/sf/1178269 closed by rhettinger GNU readline 4.2 prompt issue (2002-12-30) http://python.org/sf/660083 closed by mwh non-ascii readline input crashes python (2004-08-14) http://python.org/sf/1009263 
closed by mwh readline+no threads (2003-09-24) http://python.org/sf/811844 closed by mwh compiler module didn't get updated for "class foo():pass" (2005-04-03) http://python.org/sf/1176012 closed by bcannon Missing def'n of equality for set elements (2005-04-09) http://python.org/sf/1179957 closed by rhettinger locale.format question (2005-04-10) http://python.org/sf/1180002 closed by andrewma StringIO's docs should mention overwriting of initial value/ (2005-04-10) http://python.org/sf/1180392 closed by rhettinger New / Reopened RFE __________________ making builtin exceptions more informative (2005-04-13) http://python.org/sf/1182143 opened by Sebastien de Menten From irmen at xs4all.nl Fri Apr 15 03:17:53 2005 From: irmen at xs4all.nl (Irmen de Jong) Date: Fri Apr 15 03:17:57 2005 Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py Message-ID: <425F1641.9080708@xs4all.nl> Hello, A modification was made in setup.py, cvs rel 1.213 (see diff here: http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/setup.py?r1=1.212&r2=1.213 ) which appears to be wrong. At least, on my system, the spwd module is never built anymore, because the if statement is never true. Actually, the sysconfig doesn't contain *any* of the HAVE_XXXX vars that occur in pyconfig.h (I checked by printing all vars). I don't really understand the distutils magic that is done in setup.py, but it appears to me that either the if statement is wrong (because the vars never exist) or distutils does something wrong by leaving out all HAVE_XXX vars from pyconfig.h. Please advise? I want my spwd module back ;-) --Irmen de Jong PS I checked that pyconfig.h correctly #defines both HAVE_GETSPNAM and HAVE_GETSPENT to 1 on my system (Mandrake linux 10.1), so the rest of the configure script runs fine (it should, I created the original patches for it... 
see SF patch # 579435)

From janssen at parc.com Fri Apr 15 04:28:20 2005
From: janssen at parc.com (Bill Janssen)
Date: Fri Apr 15 04:47:06 2005
Subject: [Python-Dev] Check out a new way to read threaded conversations.
In-Reply-To: Your message of "Thu, 14 Apr 2005 04:27:20 PDT."
Message-ID: <05Apr14.192823pdt."58617"@synergy1.parc.xerox.com>

> http://zesty.ca/threadmap/pydev.cgi

Very reminiscent of Paula Newman's work at PARC several years ago. Check out http://www2.parc.com/istl/groups/hdi/papers/psn_emailvis01.pdf, particularly page 5.

Bill

From barry at python.org Fri Apr 15 05:46:04 2005
From: barry at python.org (Barry Warsaw)
Date: Fri Apr 15 05:46:08 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
Message-ID: <1113536764.23564.310.camel@geddy.wooz.org>

I've noticed an apparent inconsistency in the exception thrown for read-only properties for C extension types vs. Python new-style classes. I'm wondering if this is intentional, a bug, a bug worth fixing, or whether I'm just missing something.

class other(object):
    def __init__(self, value):
        self._value = value
    def _get_value(self):
        return self._value
    value = property(_get_value)

With this class, if you attempt "other(1).value = 7" you will get an AttributeError. However, if you define something similar in C using a tp_getset, where the structure has NULL for the setter, you will get a TypeError (code available upon request).

At best, this is inconsistent. What's the "right" exception to raise? I think the documentation I've seen (e.g. Raymond's How To for Descriptors) describes AttributeError as the thing to raise when trying to set read-only properties.

Thoughts? Should this be fixed (in 2.4?).

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050414/433916e8/attachment-0001.pgp

From martin at v.loewis.de Fri Apr 15 06:59:08 2005
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri Apr 15 06:59:10 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425F1641.9080708@xs4all.nl>
References: <425F1641.9080708@xs4all.nl>
Message-ID: <425F4A1C.9080505@v.loewis.de>

Irmen de Jong wrote:
> Please advise?

setup.py should refer to config_h_vars, which in turn should be set earlier.

Regards,
Martin

From irmen at xs4all.nl Fri Apr 15 19:06:00 2005
From: irmen at xs4all.nl (Irmen de Jong)
Date: Fri Apr 15 19:06:03 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425F4A1C.9080505@v.loewis.de>
References: <425F1641.9080708@xs4all.nl> <425F4A1C.9080505@v.loewis.de>
Message-ID: <425FF478.8070607@xs4all.nl>

Martin v. Löwis wrote:
> Irmen de Jong wrote:
>
>> Please advise?
>
> setup.py should refer to config_h_vars, which in turn should be set
> earlier.
>
> Regards,
> Martin

Ah so the setup.py script is flawed.
However, the sysconfig object doesn't contain a config_h_vars...
So I guess distutils must be patched too?

--Irmen

From gvanrossum at gmail.com Fri Apr 15 22:05:32 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 15 22:05:38 2005
Subject: [Python-Dev] PyCon 2005 keynote on-line
Message-ID: 

http://python.org/doc/essays/ppt/ -- scroll to the end.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Apr 15 22:33:21 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 15 22:33:27 2005
Subject: [Python-Dev] shadow password module (spwd) is never built due to error in setup.py
In-Reply-To: <425FF478.8070607@xs4all.nl>
References: <425F1641.9080708@xs4all.nl> <425F4A1C.9080505@v.loewis.de> <425FF478.8070607@xs4all.nl>
Message-ID: <42602511.2010200@ocf.berkeley.edu>

Irmen de Jong wrote:
> Martin v. Löwis wrote:
>
>> Irmen de Jong wrote:
>>
>>> Please advise?
>>
>> setup.py should refer to config_h_vars, which in turn should be set
>> earlier.
>>
>> Regards,
>> Martin
>
> Ah so the setup.py script is flawed.
> However, the sysconfig object doesn't contain a config_h_vars...
> So I guess distutils must be patched too?

While it probably should be included in distutils.sysconfig, config_h_vars was created later on in setup.py by some code dealing with whether to compile expat. I just moved that up to the top of the function so that it can be used sooner. Fixed in rev. 1.217.

Sorry about the bad check-in that broke the building of it in the first place. =)

-Brett

From Jack.Jansen at cwi.nl Sat Apr 16 00:19:02 2005
From: Jack.Jansen at cwi.nl (Jack Jansen)
Date: Sat Apr 16 00:19:05 2005
Subject: [Python-Dev] Re: marshal / unmarshal
In-Reply-To: <4eb0089f0504140608493fe329@mail.gmail.com>
References: <1f7befae05041010237d11d7a9@mail.gmail.com> <2mmzs65wsn.fsf@starship.python.net> <1f7befae05041015372cf17e91@mail.gmail.com> <2maco55t8p.fsf@starship.python.net> <1f7befae050411082733dca644@mail.gmail.com> <2msm1x480i.fsf@starship.python.net> <1f7befae05041111284b61992a@mail.gmail.com> <4eb0089f0504140608493fe329@mail.gmail.com>
Message-ID: 

On 14-apr-05, at 15:08, David Robinow wrote:
> On 4/11/05, Tim Peters wrote:
>
>> Heh. I have a vague half-memory of _some_ box that stored the two
>> 4-byte "words" in an IEEE double in one order, but the bytes within
>> each word in the opposite order. It's always something ...
> I believe this was the Floating Instruction Set on the PDP 11/35.
> The fact that it's still remembered 30 years later shows how unusual
> it was.

I think it was actually "logical", because all PDP-11s (there were 2 or 3 FPU instruction sets/architectures in the family IIRC) stored 32 bit integers in middle-endian (high-order word first, but low-order byte first). But note that none of the PDP-11 FPUs were IEEE, that was a much later invention. At least, I didn't come across it until much later:-)

-- 
Jack Jansen, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman

From barry at python.org Sun Apr 17 01:24:27 2005
From: barry at python.org (Barry Warsaw)
Date: Sun Apr 17 01:24:31 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113536764.23564.310.camel@geddy.wooz.org>
References: <1113536764.23564.310.camel@geddy.wooz.org>
Message-ID: <1113693867.32074.80.camel@presto.wooz.org>

On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote:
> I've noticed an apparent inconsistency in the exception thrown for
> read-only properties for C extension types vs. Python new-style
> classes.

I haven't seen any follow ups on this, so I've gone ahead and posted a patch, assigning it to Raymond:

http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470

I would have attached a patch to that issue but SF is being finicky.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050416/db516c1d/attachment.pgp

From jack at performancedrivers.com Sun Apr 17 17:53:31 2005
From: jack at performancedrivers.com (Jack Diederich)
Date: Sun Apr 17 17:53:36 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113693867.32074.80.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> Message-ID: <20050417155331.GC25115@performancedrivers.com> On Sat, Apr 16, 2005 at 07:24:27PM -0400, Barry Warsaw wrote: > On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote: > > I've noticed an apparent inconsistency in the exception thrown for > > read-only properties for C extension types vs. Python new-style > > classes. > > I haven't seen any follow ups on this, so I've gone ahead and posted a > patch, assigning it to Raymond: > > http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470 > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits from both TypeError and AttributeError? If anyone currently does catch the error raising only AttributeError will break their code. 2.5 should just raise an AttributeError, of course. If that's acceptable I'll gladly submit a similar patch for mmap.get_byte() PyErr_SetString (PyExc_ValueError, "read byte out of range"); has always irked me (the same thing with mmap[i] is an IndexError). I hadn't thought of a clean way to fix it, but MI on the error might work. -jackdied From barry at python.org Sun Apr 17 17:57:11 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 17:57:14 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417155331.GC25115@performancedrivers.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <1113753431.32079.269.camel@presto.wooz.org> On Sun, 2005-04-17 at 11:53, Jack Diederich wrote: > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits > from both TypeError and AttributeError? If anyone currently does catch the > error raising only AttributeError will break their code. 
2.5 should just > raise an AttributeError, of course. Without introducing a new exception class (which I think is out of the question for anything but 2.5), the only common base is StandardError, which seems too general for this exception. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/9152f245/attachment.pgp From jack at performancedrivers.com Sun Apr 17 18:07:20 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Sun Apr 17 18:07:24 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417155331.GC25115@performancedrivers.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <20050417160720.GD25115@performancedrivers.com> On Sun, Apr 17, 2005 at 11:53:31AM -0400, Jack Diederich wrote: > On Sat, Apr 16, 2005 at 07:24:27PM -0400, Barry Warsaw wrote: > > On Thu, 2005-04-14 at 23:46, Barry Warsaw wrote: > > > I've noticed an apparent inconsistency in the exception thrown for > > > read-only properties for C extension types vs. Python new-style > > > classes. > > > > I haven't seen any follow ups on this, so I've gone ahead and posted a > > patch, assigning it to Raymond: > > > > http://sourceforge.net/tracker/index.php?func=detail&aid=1184449&group_id=5470&atid=105470 > > > In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits > from both TypeError and AttributeError? If anyone currently does catch the > error raising only AttributeError will break their code. 2.5 should just > raise an AttributeError, of course. 
>
> If that's acceptable I'll gladly submit a similar patch for mmap.get_byte()
> PyErr_SetString (PyExc_ValueError, "read byte out of range");
> has always irked me (the same thing with mmap[i] is an IndexError).
> I hadn't thought of a clean way to fix it, but MI on the error might work.
>

I just did a quick grep for raised ValueErrors with "range" in the explanation string and didn't find any general consensus. I dunno what that means, if anything.

wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep ValueError | grep range | wc -l
13
wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep IndexError | grep range | wc -l
31

(long versions below)

-jackdied

wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep -n IndexError | grep range
./Modules/arraymodule.c:599: PyErr_SetString(PyExc_IndexError, "array index out of range");
./Modules/arraymodule.c:997: PyErr_SetString(PyExc_IndexError, "pop index out of range");
./Modules/mmapmodule.c:639: PyErr_SetString(PyExc_IndexError, "mmap index out of range");
./Modules/mmapmodule.c:727: PyErr_SetString(PyExc_IndexError, "mmap index out of range");
./Modules/_heapqmodule.c:19: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:58: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:136: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:173: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:310: PyErr_SetString(PyExc_IndexError, "index out of range");
./Modules/_heapqmodule.c:349: PyErr_SetString(PyExc_IndexError, "index out of range");
./Objects/bufferobject.c:403: PyErr_SetString(PyExc_IndexError, "buffer index out of range");
./Objects/listobject.c:876: PyErr_SetString(PyExc_IndexError, "pop index out of range");
./Objects/rangeobject.c:94: PyErr_SetString(PyExc_IndexError,
./Objects/stringobject.c:1055: PyErr_SetString(PyExc_IndexError, "string index out of range");
./Objects/structseq.c:62: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/tupleobject.c:104: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/tupleobject.c:310: PyErr_SetString(PyExc_IndexError, "tuple index out of range");
./Objects/unicodeobject.c:5164: PyErr_SetString(PyExc_IndexError, "string index out of range");
./Python/exceptions.c:1504:PyDoc_STRVAR(IndexError__doc__, "Sequence index out of range.");
./RISCOS/Modules/drawfmodule.c:534: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/drawfmodule.c:555: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/drawfmodule.c:578: { PyErr_SetString(PyExc_IndexError,"drawf index out of range");
./RISCOS/Modules/swimodule.c:113: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:124: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:136: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:150: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:164: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:225: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:237: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:248: { PyErr_SetString(PyExc_IndexError,"block index out of range");
./RISCOS/Modules/swimodule.c:264: { PyErr_SetString(PyExc_IndexError,"block index out of range");
wopr:~/src/python_head/dist/src# find ./ -name '*.c' | xargs grep -n ValueError | grep range
./Modules/mmapmodule.c:181: PyErr_SetString (PyExc_ValueError, "read byte out of range");
./Modules/mmapmodule.c:301: PyErr_SetString (PyExc_ValueError, "data out of range");
./Modules/mmapmodule.c:524: PyErr_SetString (PyExc_ValueError, "seek out of range");
./Modules/timemodule.c:405: PyErr_SetString(PyExc_ValueError, "month out of range");
./Modules/timemodule.c:409: PyErr_SetString(PyExc_ValueError, "day of month out of range");
./Modules/timemodule.c:413: PyErr_SetString(PyExc_ValueError, "hour out of range");
./Modules/timemodule.c:417: PyErr_SetString(PyExc_ValueError, "minute out of range");
./Modules/timemodule.c:421: PyErr_SetString(PyExc_ValueError, "seconds out of range");
./Modules/timemodule.c:427: PyErr_SetString(PyExc_ValueError, "day of week out of range");
./Modules/timemodule.c:431: PyErr_SetString(PyExc_ValueError, "day of year out of range");
./Objects/rangeobject.c:61: PyErr_SetString(PyExc_ValueError, "xrange() arg 3 must not be zero");
./Objects/rangeobject.c:106: PyErr_SetString(PyExc_ValueError,
./RISCOS/Modules/drawfmodule.c:450: { PyErr_SetString(PyExc_ValueError,"Object out of range");

From aahz at pythoncraft.com Sun Apr 17 18:25:09 2005
From: aahz at pythoncraft.com (Aahz)
Date: Sun Apr 17 18:25:13 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <1113753431.32079.269.camel@presto.wooz.org>
References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113753431.32079.269.camel@presto.wooz.org>
Message-ID: <20050417162509.GA3300@panix.com>

On Sun, Apr 17, 2005, Barry Warsaw wrote:
> On Sun, 2005-04-17 at 11:53, Jack Diederich wrote:
>>
>> In 2.4 & 2.3 does it make sense to raise an exception that multiply
>> inherits from both TypeError and AttributeError? If anyone currently
>> does catch the error raising only AttributeError will break their
>> code. 2.5 should just raise an AttributeError, of course.
>
> Without introducing a new exception class (which I think is out of the
> question for anything but 2.5), the only common base is StandardError,
> which seems too general for this exception.
Why is changing an exception more acceptable than creating a new one? (I don't have a strong opinion either way, but I'd like some reasoning; Jack's approach at least doesn't break code.) Especially if the new exception isn't "public" (in the builtins with other exceptions).

-- 
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR

From gvanrossum at gmail.com Sun Apr 17 20:36:21 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun Apr 17 20:36:26 2005
Subject: [Python-Dev] Inconsistent exception for read-only properties?
In-Reply-To: <20050417155331.GC25115@performancedrivers.com>
References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com>
Message-ID: 

> In 2.4 & 2.3 does it make sense to raise an exception that multiply inherits
> from both TypeError and AttributeError? If anyone currently does catch the
> error raising only AttributeError will break their code. 2.5 should just
> raise an AttributeError, of course.

I think that sets a bad precedent. I understand you want to do this for backwards compatibility, but it's a real ugly thing in the exception inheritance tree and once it's in it's hard to get rid of. It's also introducing a new feature so it's a no-no to do this for 2.3 or 2.4 anyway.

I wonder if long-term, AttributeError shouldn't inherit from TypeError? AttributeError really feels to me like a particular case of the stuff that typically raises TypeError. Unfortunately this is *also* a b/w compatibility problem, since people currently might have code like this:

try:
    ...
except TypeError:
    ...
except AttributeError:
    ...

and the AttributeError branch would become unreachable.
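[Editor's note: the "unreachable branch" point is easy to demonstrate. The sketch below uses an invented HypotheticalAttributeError class to stand in for an AttributeError that inherits from TypeError; this is not Python's real exception hierarchy.]

```python
# Hypothetical sketch: pretend AttributeError were a subclass of TypeError.
class HypotheticalAttributeError(TypeError):
    pass

def handle(exc):
    try:
        raise exc
    except TypeError:
        return "TypeError branch"
    except HypotheticalAttributeError:
        # Never reached: the earlier, more general TypeError clause
        # already matches any instance of the subclass.
        return "AttributeError branch"

print(handle(HypotheticalAttributeError()))  # prints "TypeError branch"
```

Because except clauses are tried in order and match subclasses, existing code that lists TypeError first would silently stop using its AttributeError branch, which is why changing the inheritance reaches much more code than changing the one exception raised for read-only properties.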
Personally, I think it would be fine to just change the TypeError to AttributeError. I expect that very few people would be hurt by that change (they'd be building *way* too much specific arcane knowledge into their program if they had code for which it mattered). So why, given two different backwards incompatible choices, do I prefer changing the exception raised in this specific case over making AttributeError inherit from TypeError? Because the latter change has a much larger scope; it can affect much more code (including code that doesn't have anything to do with the problem we're trying to solve). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From barry at python.org Sun Apr 17 21:09:41 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 21:10:07 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <20050417162509.GA3300@panix.com> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113753431.32079.269.camel@presto.wooz.org> <20050417162509.GA3300@panix.com> Message-ID: <1113764981.32079.284.camel@presto.wooz.org> On Sun, 2005-04-17 at 12:25, Aahz wrote: > Why is changing an exception more acceptable than creating a new one? > (I don't have a strong opinion either way, but I'd like some reasoning; > Jack's approach at least doesn't break code.) Especially if the new > exception isn't "public" (in the builtins with other exceptions). Adding an exception that we have to live with forever (even if it's localized to this one module) seems like it would fall under the new feature rubric, whereas I think the choice of exception was just a bug. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/94ad9087/attachment.pgp From barry at python.org Sun Apr 17 21:17:19 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 21:17:21 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> Message-ID: <1113765439.32081.287.camel@presto.wooz.org> On Sun, 2005-04-17 at 14:36, Guido van Rossum wrote: > Personally, I think it would be fine to just change the TypeError to > AttributeError. I expect that very few people would be hurt by that > change (they'd be building *way* too much specific arcane knowledge > into their program if they had code for which it mattered). Unless there are any objections in the next few days, I will take this as a pronouncement and make the change at least in 2.5 and 2.4. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/6f5d5f24/attachment.pgp From gvanrossum at gmail.com Sun Apr 17 21:44:43 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun Apr 17 21:44:48 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: > > Personally, I think it would be fine to just change the TypeError to > > AttributeError. 
I expect that very few people would be hurt by that > > change (they'd be building *way* too much specific arcane knowledge > > into their program if they had code for which it mattered). > > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. You meant 2.5 only of course. It's still a new feature and as such can't be changed in 2.4. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mwh at python.net Sun Apr 17 22:14:39 2005 From: mwh at python.net (Michael Hudson) Date: Sun Apr 17 22:14:41 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> (Barry Warsaw's message of "Sun, 17 Apr 2005 15:17:19 -0400") References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <2macnx88k0.fsf@starship.python.net> Barry Warsaw writes: > On Sun, 2005-04-17 at 14:36, Guido van Rossum wrote: > >> Personally, I think it would be fine to just change the TypeError to >> AttributeError. I expect that very few people would be hurt by that >> change (they'd be building *way* too much specific arcane knowledge >> into their program if they had code for which it mattered). > > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. I don't think this should be changed in 2.4. Cheers, mwh -- As far as I'm concerned, the meat pie is the ultimate unit of currency. -- from Twisted.Quotes From barry at python.org Sun Apr 17 23:48:35 2005 From: barry at python.org (Barry Warsaw) Date: Sun Apr 17 23:48:37 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? 
In-Reply-To: References: <1113536764.23564.310.camel@geddy.wooz.org> <1113693867.32074.80.camel@presto.wooz.org> <20050417155331.GC25115@performancedrivers.com> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <1113774515.32074.300.camel@presto.wooz.org> On Sun, 2005-04-17 at 15:44, Guido van Rossum wrote: > You meant 2.5 only of course. It's still a new feature and as such > can't be changed in 2.4. Fair enough. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050417/87f9a6d4/attachment.pgp From anthony at interlink.com.au Mon Apr 18 04:07:27 2005 From: anthony at interlink.com.au (Anthony Baxter) Date: Mon Apr 18 04:07:43 2005 Subject: [Python-Dev] Inconsistent exception for read-only properties? In-Reply-To: <1113765439.32081.287.camel@presto.wooz.org> References: <1113536764.23564.310.camel@geddy.wooz.org> <1113765439.32081.287.camel@presto.wooz.org> Message-ID: <200504181207.28667.anthony@interlink.com.au> On Monday 18 April 2005 05:17, Barry Warsaw wrote: > Unless there are any objections in the next few days, I will take this > as a pronouncement and make the change at least in 2.5 and 2.4. God no - this isn't suitable for a bugfix release. It seems fine for 2.5, though. -- Anthony Baxter It's never too late to have a happy childhood. From gvanrossum at gmail.com Mon Apr 18 16:49:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 18 16:49:24 2005 Subject: [Python-Dev] Fwd: CFP: DLS05: ACM Dynamic Languages Symposium In-Reply-To: <6b0be5726fb86107ff97205642357f4f@ulb.ac.be> References: <6b0be5726fb86107ff97205642357f4f@ulb.ac.be> Message-ID: See you all at OOPSLA! 
---------- Forwarded message ---------- From: Roel Wuyts Date: Apr 17, 2005 10:59 PM Subject: CFP: DLS05: ACM Dynamic Languages Symposium To: python-announce-list@python.org CALL FOR PAPERS FOR THE ACM Dynamic Languages Symposium 2005 October 18, 2005 San Diego, California (co-located with OOPSLA'05) URL: http://decomp.ulb.ac.be:8082/events/dls05/ ----------- Abstract ----------- In industry, static languages (such as Java, C++ and C#) are much more widely used than their dynamic counterparts (like CLOS, Python, Self, Perl, php or Smalltalk). So it appears as though dynamic language concepts were forgotten and lost the race. But this is not the case. Java and C#, the latest mainstream static languages, popularized to a certain extent dynamic language features such as garbage collection, portability and (limited forms of) reflection. In the near future, we expect this dynamicity to increase even further. E.g., it is getting clearer year after year that pervasive computing is becoming the rule and that concepts such as meta programming, reflection, mobility, dynamic reconfigurability and distribution are becoming increasingly popular. All of these features are the domain of dynamic languages, and hence it is only logical that more dynamic language concepts have to be taken up by static languages, or that dynamic languages can make a breakthrough. Currently, the dynamic language community is fragmented, split over a multitude of paradigms (from functional over logic to object-oriented), languages and syntaxes. This fragmentation severely hinders research as well as acceptance, and results in either language wars or, even worse, language ignorance. The goal of this symposium is to provide a highly visible, international forum for researchers working on dynamic features and languages. We explicitly invite submissions from all kinds of paradigms (object-oriented, functional, logic, ...), as can be seen from the structure of the program committee. 
Areas of interest include, but are not limited to: - closures - delegation - actors, active objects - constraint systems - mixins and traits - reflection and meta-programming - language symbiosis and multi-paradigm languages - experience reports on successful application of dynamic languages Accepted Papers will be published in the ACM Digital Library. ------------------------------- Submission Guidelines ------------------------------- Papers will need to be submitted using an online tracking system, of which the URL will be given later. All papers must be submitted electronically in PDF format (or PostScript, if you do not have access to PDF-producing programs, but this is not recommended). Submissions, as well as final versions, must be formatted to conform to ACM Proceedings requirements: Nine point font on ten point baseline, two columns per page, each column 3.33 inches wide by 9 inches tall, with a column gutter of 0.33 inches, etc. See the ACM Proceedings Guidelines. You can save preparation time by using one of the templates from that page. Note that MS Word documents must be converted to PDF before being submitted. ---------------------- Important Dates ---------------------- - Deadline for receipt of submissions: June 24th 2005 - Notification of acceptance or rejection: August 5th 2005 - Final version for the proceedings: To be announced later --------------------------- Program Committee --------------------------- - Gilad Bracha - Wolfgang De Meuter - Stephane Ducasse - Gopal Gupta - Robert Hirschfeld - Dan Ingalls - Yukihiro Matsumoto - Mark Miller - Eliot Miranda - Philippe Mougin - Oscar Nierstrasz - Dave Thomas - David Ungar - Guido Van Rossum - Peter Van Roy - Jon L White (G) - Roel Wuyts (Chair) -- Roel Wuyts DeComp roel.wuyts@ulb.ac.be Université
Libre de Bruxelles http://homepages.ulb.ac.be/~rowuyts/ Belgique Vice-President of the European Smalltalk Users Group: www.esug.org -- http://mail.python.org/mailman/listinfo/python-announce-list Support the Python Software Foundation: http://www.python.org/psf/donations.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tlesher at gmail.com Mon Apr 18 19:19:04 2005 From: tlesher at gmail.com (Tim Lesher) Date: Mon Apr 18 19:19:07 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] Message-ID: <9613db60050418101934f0e3e8@mail.gmail.com> Here's the first draft of the python-dev summary for the first half of April. Please send any corrections or suggestions to the summarizers. ====================== Summary Announcements ====================== --------------------------- New python-dev summary team --------------------------- This summary marks the first by the team of Steve Bethard, Tim Lesher, and Tony Meyer. We're trying a collaborative approach to the summaries: each fortnight, we'll be getting together in a virtual smoke-filled back room to divide up the interesting threads. Then we'll stitch together the summaries in roughly the same form as you've seen in the past. We'll mark each editor's entries with his initials. Thanks to Brett Cannon for sixty-one excellent python-dev summaries. Also, thanks for providing scripts to help get the new summaries off the ground! We're looking forward to the contributions you'll make to the Python core, now that the summaries aren't taking up all your time. [TDL] ========= Summaries ========= ---------------------- Right Operator Methods ---------------------- Greg Ewing explored an issue with new-style classes that define only right operator methods (__radd__, __rmul__, etc.). Instances of such a class cannot be added/multiplied/etc. together as Python raises a TypeError. 
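A minimal sketch of the behaviour being described (the class name is made up; the semantics are unchanged in current Pythons):

```python
# A new-style class that defines only the reflected addition method.
class RightOnly(object):
    def __radd__(self, other):
        return "radd called"

a, b = RightOnly(), RightOnly()

# With operands of different types, the reflected method is tried
# after int.__add__ returns NotImplemented:
assert 1 + a == "radd called"

# With two instances of the same class, only the non-reversed
# __add__ is consulted, so the operation fails:
try:
    a + b
except TypeError:
    print("TypeError, as described above")
```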
Armin Rigo explained the rule: if the instances on both sides of an operator are of the same class, only the non-reversed method is ever called. Armin also explained that an __add__ or __mul__ method that returns NotImplemented may be called twice when Python attempts to differentiate between numeric and sequence operations. Contributing threads: - `New style classes and operator methods `__ [SJB] ------------------------------------------ Hierarchical groups in regular expressions ------------------------------------------ Chris Ottrey demoed his `pyre2 project`_ that can extract a hierarchy of strings when nested groups match in a regular expression. The current re module (in the stdlib) only matches the last occurrence of a group in the string, throwing away any preceding matches. People discussed some of pyre2's proposed API, with the main suggestion being to extend the API to support unnamed (positional) groups in addition to named groups. Though a number of people expressed interest in the idea, it was not clear whether the functionality should be included in the standard library. However, most agreed that if it was included, it should be integrated with the existing re module. Gustavo Niemeyer offered to perform this integration if an API could be agreed upon. Further discussion was moved to the pyre2 `development wiki`_ and `mailing list`_. Contributing threads: - `hierarchicial named groups extension to the re library `__ .. _pyre2 project: http://pyre2.sourceforge.net/ .. _development wiki: http://py.redsoft.be/pyre2/wiki/ .. _mailing list: http://lists.sourceforge.net/lists/listinfo/pyre2-devel [SJB] ------------------------------- Security capabilities in Python ------------------------------- The issue of security came up again, and Ka-Ping Yee suggested that in Python's restricted execution mode secure proxies can be created by using lexical scoping. 
He posted `some code`_ for revealing only certain "facets" of an object by using a function to declare a proxy class that used function local variables to build the proxy. Thus to access the attributes used in the proxy class, you need to access things like im_func or func_closure, which are not accessible in restricted execution mode. James Y Knight illustrated how strategic overriding of __eq__ in a subclass of str could allow access to the hidden "facets". Eyal Lotem suggested that such an attack could be countered by implementing "facets" in C, but having to turn to C every time you needed a particular security construct seemed unappealing. Contributing threads: - `Security capabilities in Python `__ .. _some code: http://zesty.ca/python/facet.py [SJB] --------------------------------- Improving GilState API Robustness --------------------------------- Michael Hudson noted that his changes to thread handling in the readline module appeared to trigger `bug 1176893`_ ("Readline segfault"). However, he believed the problem lay in the GilState API, rather than in his changes: PyGilState_Release crashes if PyEval_InitThreads wasn't called, even if the code you're writing doesn't use multiple threads. He proposed several solutions, none of which met with resounding approbation, and Tim Peters noted that `PEP 311`_, Simplified Global Interpreter Lock Acquisition for Extensions, "specifically disowns responsibility for worrying about whether Py_Initialize and PyEval_InitThreads have been called." Bob Ippolito wondered whether just calling PyEval_InitThreads directly in Py_Initialize might be a better idea. No objections were raised, so long as the underlying OS locking mechanisms weren't overly expensive; some initial benchmarks indicated that this approach was viable, at least on Linux and OS X. Contributing threads: - `threading (GilState) question `__ .. _bug 1176893: http://sourceforge.net/tracker/index.php?func=detail&aid=1176893&group_id=5470&atid=105470 .. 
_PEP 311: http://www.python.org/peps/pep-0311.html [TDL] ---------------------------------------- Unicode byte order mark decoding ---------------------------------------- Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft platforms, add it. Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a long discussion on the history of Unicode and Microsoft's influence over its evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the codec. Contributing threads: - `Unicode byte order mark decoding `__ .. _patch: http://sourceforge.net/tracker/index.php?func=detail&aid=1177307&group_id=5470&atid=305470 [TDL] --------------- Developers List --------------- Raymond Hettinger has started a `project to track developers`_ and the (tracker and commit) privileges they have, and who gave them the privileges, and why (for example, was it for a one-shot project). Removing inactive developers should improve clarity, institutional memory, and security, and make everything tidier. Raymond has begun contacting recently inactive developers to check whether they still require the privileges they have. Contributing threads: - `Developer list update `__ .. _project to track developers: http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/dist/src/Misc/developers.txt [TAM] -------------------- Marshalling Infinity -------------------- Scott David Daniels kicked off a very long thread by asking what (un)marshal should do with floating point NaNs. The current behaviour (as with any NaN, infinity, or signed zero) is undefined: a platform-dependent accident, because Python is written to C89, which has no such concepts. 
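For reference, the 8-byte IEEE-754 round-trip under discussion can be sketched with the struct module (math.inf and math.nan are modern spellings, used here purely for illustration; the thread predates them):

```python
import math
import struct

# Pack a double into its 8-byte IEEE-754 representation, as a
# byte-copying _PyFloat_Pack8 would on a little-endian IEEE platform.
packed_inf = struct.pack('<d', math.inf)
packed_nan = struct.pack('<d', math.nan)

# Round-tripping infinity preserves the value exactly.
assert struct.unpack('<d', packed_inf)[0] == math.inf

# NaN never compares equal to itself, so the round trip is checked
# with math.isnan rather than equality.
assert math.isnan(struct.unpack('<d', packed_nan)[0])
```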
Tim Peters pointed out that all code for (de)serializing C doubles should go through _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation suggests that the routines could simply copy bytes on platforms that use the standard IEEE-754 single and double formats natively. Michael Hudson obliged by creating a `patch to implement this`_. The consensus was that the correct behaviour is that packing a NaN or infinity shouldn't cause an exception. When unpacking, an IEEE-754 platform shouldn't cause an exception, but a non-754 platform should, since there's no sensible value that it can be unpacked to, and errors should never pass silently. Contributing threads: - `marshal / unmarshal `__ .. _patch to implement this: http://python.org/sf/1181301 [TAM] --------------------------------- Location of the sign bit in longs --------------------------------- Michael Hudson asked about the possibility of longs storing the sign bit somewhere other than the current location, suggesting the top bit of ob_digit[0]. Tim Peters suggested that it would be better to give struct _longobject a distinct sign member. This simplifies code, costs no extra bytes for some longs, and 8 extra bytes for others, and shouldn't hurt binary compatibility. Michael coughed up a `longobject patch`_, which seems likely to be checked in. Contributing threads: - `marshal / unmarshal `__ .. _longobject patch: http://python.org/sf/1177779 [TAM] ----------------------- Acceptable diff formats ----------------------- Nick Coghlan asked if context diffs are still favoured for patches. Historically, context diffs were preferred, but it appears that unified diffs are today's choice. Raymond Hettinger made the sensible suggestion that whichever is most informative for the particular patch should be used, and Bob Ippolito pointed out that if CVS is replaced with subversion, unified diffs will have better support. 
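The difference between the two formats can be illustrated with the stdlib's own difflib (file names and contents are made up):

```python
import difflib

old = ['a\n', 'b\n']        # original file contents
new = ['a\n', 'c\n']        # edited file contents

# Unified format: one hunk, changed lines marked with -/+ and
# surrounded by shared context lines.
unified = ''.join(difflib.unified_diff(old, new,
                                       'example.py.orig', 'example.py'))

# Context format: separate "before" and "after" hunks, with
# changed lines marked by "!".
context = ''.join(difflib.context_diff(old, new,
                                       'example.py.orig', 'example.py'))

assert '-b' in unified and '+c' in unified
assert '! b' in context and '! c' in context
```

The same change appears once in the unified output but twice (in the old and new hunks) in the context output, which is one reason unified diffs are usually more compact.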
The `patch submission guidelines`_ will be updated at some point to reflect the preference for unified diffs, although if your diff program doesn't support '-u', then context diffs are ok - plain patches are, of course, not. Contributing threads: - `Unified or context diffs? `__ .. _patch submission guidelines: http://www.python.org/patches/ [TAM] =============== Skipped Threads =============== - python-dev Summary for 2005-03-16 through 2005-03-31 [draft] - [Python-checkins] python/dist/src/Lib/logging handlers.py, 1.19, 1.19.2.1 - [Python-checkins] python/dist/src/Modules mathmodule.c, 2.74, 2.75 - Weekly Python Patch/Bug Summary - Mail.python.org - New bug, directly assigned, okay? - inconsistency when swapping obj.__dict__ with a dict-like object... - Pickling instances of nested classes - args attribute of Exception objects -- Tim Lesher From mal at egenix.com Mon Apr 18 19:47:03 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Mon Apr 18 19:47:06 2005 Subject: [Python-Dev] Security capabilities in Python In-Reply-To: References: Message-ID: <4263F297.3020205@egenix.com> Eyal Lotem wrote: > I would like to experiment with security based on Python references as > security capabilities. > > Unfortunately, there are several problems that make Python references > invalid as capabilities: > > * There is no way to create secure proxies because there are no > private attributes. > * Lots of Python objects are reachable unnecessarily breaking the > principle of least privilege (i.e: object.__subclasses__() etc.) > > I was wondering if any such effort has already begun or if there are > other considerations making Python unusable as a capability platform? You might want to have a look at mxProxy objects. These were created to provide secure wrappers around Python objects with a well-defined access mechanism, e.g. 
by defining a list of methods/attributes which can be accessed from the outside or by creating a method which then decides whether access is granted or not: http://www.egenix.com/files/python/mxProxy.html Note that the new-style classes may have introduced some security leaks. If you find any, please let me know. PS: A nice side-effect of these proxy objects is that you can create weak references to all Python objects (not just those that support the protocol). -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 18 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mwh at python.net Mon Apr 18 20:05:52 2005 From: mwh at python.net (Michael Hudson) Date: Mon Apr 18 20:10:26 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> (Tim Lesher's message of "Mon, 18 Apr 2005 13:19:04 -0400") References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <2mzmvw6jun.fsf@starship.python.net> Tim Lesher writes: > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. > > ====================== > Summary Announcements > ====================== > > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. Nice work! 
An update: > --------------------------------- > Improving GilState API Robustness > --------------------------------- > > Michael Hudson noted that his changes to thread handling in the > readline module appeared to trigger `bug 1176893`_ ("Readline > segfault"). However, he believed the problem lay in the GilState API, > rather than in his changes: PyGilState_Release crashes if > PyEval_InitThreads wasn't called, even if the code you're writing > doesn't use multiple threads. > > He proposed several solutions, none of which met with resounding > approbation, Nevertheless, I've checked one of them in :) After reading a fair bit of code, and docs, I went for option 2) in the linked mail. > and Tim Peters noted that `PEP 311`_, Simplified Global Interpreter > Lock Acquisition for Extensions, "specifically disowns > responsibility for worrying about whether Py_Initialize and > PyEval_InitThreads have been called." I think this reading is a bit of a stretch of the wording of the PEP. It also contradicts the documentation ("regardless of the current state of Python"). Finally, the current behaviour has a strong whiff of being accidental. > -------------------- > Marshalling Infinity > -------------------- > > Scott David Daniels kicked off a very long thread by asking what (un)marshal > should do with floating point NaNs. The current behaviour (as with any NaN, > infinity, or signed zero) is undefined: a platform-dependant accident, > because Python is written to C89, which has no such concepts. Tim Peters > pointed out all code for (de)serialing C doubles should go through > _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation > suggests that the routines could simply copy bytes on platforms that use the > standard IEEE-754 single and double formats natively. Michael Hudson > obliged by creating a `patch to implement this`_. I hope to check this in soon. Note that the patch is in two pieces, one to marshal floats in binary format and one ... 
> The consensus was that the correct behaviour is that packing a NaN or > infinity shouldn't cause an exception. When unpacking, an IEEE-754 platform > shouldn't cause an exception, but a non-754 platform should, since there's > no sensible value that it can be unpacked to, and errors should never pass > silently. ... to do this bit. > --------------------------------- > Location of the sign bit in longs > --------------------------------- > > Michael Hudson asked about the possibility of longs storing the sign bit > somewhere other than the current location, suggesting the top bit of > ob_digit[0]. Tim Peters suggested that it would be better to give struct > _longobject a distinct sign member. This simplifies code, costs no extra > bytes for some longs, and 8 extra bytes for others, and shouldn't hurt > binary compatibility. > > Michael coughed up a `longobject patch`_, which seems likely to be checked > in. I'm actually in less of a rush to get this one in :) (Hmm, had a busy couple of weeks, didn't I? :) > Contributing threads: > > - `marshal / unmarshal > `__ ? Cheers, mwh -- we should write an os YES * itamar starts a sourceforge project -- from Twisted.Quotes From aahz at pythoncraft.com Mon Apr 18 20:27:48 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon Apr 18 20:27:51 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <20050418182748.GA4709@panix.com> On Mon, Apr 18, 2005, Tim Lesher wrote: > > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. Good show! One suggestion: might want to order threads in order of relevance to random python-dev readers (the bit that triggered this comment was seeing the unified vs. context diffs thread so far down). 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From bac at OCF.Berkeley.EDU Mon Apr 18 21:02:54 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Apr 18 21:03:10 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <4264045E.9000300@ocf.berkeley.edu> Tim Lesher wrote: > Here's the first draft of the python-dev summary for the first half of > April. Please send any corrections or suggestions to the summarizers. > > ====================== > Summary Announcements > ====================== > > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. We're trying a collaborative approach to the > summaries: each fortnight, we'll be getting together in a virtual > smoke-filled back room to divide up the interesting threads. Then > we'll stitch together the summaries in roughly the same form as you've > seen in the past. We'll mark each editor's entries with his initials. > Woohoo! Once again, thanks for doing this guys. > Thanks to Brett Cannon for sixty-one excellent python-dev summaries. > Also, thanks for providing scripts to help get the new summaries off > the ground! We're looking forward to the contributions you'll make to > the Python core, now that the summaries aren't taking up all your > time. > Gee, no pressure. 
=) [SNIP] > ------------------------------- > Security capabilities in Python > ------------------------------- > > The issue of security came up again, and Ka-Ping Yee suggested that in > Python's restricted execution mode secure proxies can be created by > using lexical scoping. He posted `some code`_ for revealing only > certain "facets" of an object by using a function to declare a proxy > class that used function local variables to build the proxy. Thus to "... that used a function's local variables ..." [SNIP] > > --------------------------------- > Improving GilState API Robustness > --------------------------------- > > Michael Hudson noted that his changes to thread handling in the > readline module appeared to trigger `bug 1176893`_ ("Readline > segfault"). However, he believed the problem lay in the GilState API, > rather than in his changes: PyGilState_Release crashes if > PyEval_InitThreads wasn't called, even if the code you're writing > doesn't use multiple threads. > > He proposed several solutions, none of which met with resounding > approbation, and Tim Peters noted that `PEP 311`_, Simplified Global > Interpreter Lock Acquisition for Extensions, "specifically disowns > responsibility for worrying about whether Py_Initialize and > PyEval_InitThreads have been called." > > Bob Ippolito wondered whether just calling PyEval_InitThreads directly > in Py_Initialize might be a better idea. No objections were raised, > so long as the underlying OS locking mechanisms weren't overly > expensive; some initial benchmarks indicated that this approach was > viable, at least on Linux and OS X. > > Contributing threads: > > - `threading (GilState) question > `__ > > .. _bug 1176893: > http://sourceforge.net/tracker/index.php?func=detail&aid=1176893&group_id=5470&atid=105470 > For any tracker item, the easiest way to do a URL is to use the python.org shortcut: http://www.python.org/sf/##### . So the above would be http://www.python.org/sf/1176893 . > .. 
_PEP 311: http://www.python.org/peps/pep-0311.html > > [TDL] > > ---------------------------------------- > Unicode byte order mark decoding > ---------------------------------------- > > Evan Jones saw that the UTF-16 decoder discards the byte-order mark > (BOM) from Unicode files, while the UTF-8 decoder doesn't. Although > the BOM isn't really required in UTF-8 files, many Unicode-generating > applications, especially on Microsoft platforms, add it. > > Walter D?rwald created a patch_ to add a UTF-8-Sig codec that generates > a BOM on writing and skips it on reading, but after a long discussion > on the history of the Unicode, Microsoft's influence over its "... of Unicode and Microsoft's influence ..." [SNIP] > --------------- > Developers List > --------------- > > Raymond Hettinger has started a `project to track developers`_ and the > (tracker and commit) privileges they have, and who gave them the privileges, > and why (for example, was it for a one-shot project). Removing inactive > developers should improve clarity, institutional memory, security, and makes > everything tidier. Raymond has begun contacting recently inactive > developers to check whether they still require the privileges they have. > > Contributing threads: > > - `Developer list update > `__ > > .. _project to track developers: > http://cvs.sourceforge.net/viewcvs.py/*checkout*/python/python/dist/src/Misc/developers.txt > > [TAM] > > -------------------- > Marshalling Infinity > -------------------- > > Scott David Daniels kicked off a very long thread by asking what (un)marshal > should do with floating point NaNs. The current behaviour (as with any NaN, > infinity, or signed zero) is undefined: a platform-dependant accident, > because Python is written to C89, which has no such concepts. 
Tim Peters > pointed out all code for (de)serialing C doubles should go through > _PyFloat_Pack8()/_PyFloat_Unpack8(), and that the current implementation > suggests that the routines could simply copy bytes on platforms that use the > standard IEEE-754 single and double formats natively. Michael Hudson > obliged by creating a `patch to implement this`_. > > The consensus was that the correct behaviour is that packing a NaN or "... behaviour of packing a NaN ..." [SNIP] Well done guys! Very impressed; succinct, clear, and a ton fewer errors than I used to put into the first draft. =) When you are happy with the draft just email me the plaintext and I will get it up on python.org for you. -Brett From walter at livinglogic.de Mon Apr 18 23:33:58 2005 From: walter at livinglogic.de (=?iso-8859-1?Q?Walter_D=F6rwald?=) Date: Mon Apr 18 23:34:00 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15 [draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> References: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <1260.84.56.100.23.1113860038.squirrel@isar.livinglogic.de> Tim Lesher sagte: > Here's the first draft of the python-dev summary for the first half of April. Please send any corrections or suggestions to > the summarizers. > [...] > ---------------------------------------- > Unicode byte order mark decoding > ---------------------------------------- > > Evan Jones saw that the UTF-16 decoder discards the byte-order mark (BOM) from Unicode files, while the UTF-8 decoder > doesn't. Although the BOM isn't really required in UTF-8 files, many Unicode-generating applications, especially on Microsoft > platforms, add it. 
> > Walter Dörwald created a patch_ to add a UTF-8-Sig codec that generates a BOM on writing and skips it on reading, but after a > long discussion on the history of the Unicode, Microsoft's influence over its > evolution, the consensus was that BOM and signature handling belong at a higher level (for example, a stream API) than the > codec. All codecs provide a stream API, so there is no higher level. Bye, Walter Dörwald From python at rcn.com Mon Apr 18 12:08:21 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 19 00:08:48 2005 Subject: [Python-Dev] python-dev Summary for 2005-04-01 through 2005-04-15[draft] In-Reply-To: <9613db60050418101934f0e3e8@mail.gmail.com> Message-ID: <000201c543fe$8dd18580$e827a044@oemcomputer> > ====================== > Summary Announcements > ====================== Executive summary: Hudson goes wild fixing obscure bugs. > --------------------------- > New python-dev summary team > --------------------------- > > This summary marks the first by the team of Steve Bethard, Tim Lesher, > and Tony Meyer. We're trying a collaborative approach to the > summaries: each fortnight, we'll be getting together in a virtual > smoke-filled back room to divide up the interesting threads. Both your process and results are excellent. Raymond Hettinger From oliphant at ee.byu.edu Tue Apr 19 02:47:31 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 19 02:47:35 2005 Subject: [Python-Dev] Pickling buffer objects. Message-ID: <42645523.3000109@ee.byu.edu> Before submitting a patch to pickle.py and cPickle.c, I'd be interested in knowing how likely it is that a patch allowing Python to pickle the buffer object would be accepted. The problem being solved is that Numeric currently has to copy all of its data into a string before writing it out to a pickle. Yes, I know there are ways to write directly to a file. But, it is desirable to have Numeric arrays interact seamlessly with other pickleable types without a separate stream. 
This is especially utilized for network transport. The patch would simply write the opcode for a Python string to the stream and then write the character-interpreted data (without making an intermediate copy) of the void * pointer of the buffer object. Yes, I know all of the old arguments about the buffer object and that it should be replaced with something better. I've read all the old posts and am quite familiar with the issues about it. But, this can be considered a separate issue. Since the buffer object exists, it ought to be pickleable, and it would make a lot of applications a lot faster. I'm proposing to pickle the buffer object so that it unpickles as a string. Arguably, there should be a separate mutable-byte object opcode so that buffer objects unpickle as mutable-byte buffer objects. If that is more desirable, I'd even offer a patch to do that (though such pickles wouldn't unpickle under earlier versions of Python). I suspect that the buffer object would need to be reworked into something more along the lines of the previously-proposed bytes object before a separate bytecode for pickleable mutable-bytes is accepted, however. -Travis Oliphant From greg.ewing at canterbury.ac.nz Tue Apr 19 06:39:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 19 06:39:27 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42645523.3000109@ee.byu.edu> References: <42645523.3000109@ee.byu.edu> Message-ID: <42648B6A.3070700@canterbury.ac.nz> Travis Oliphant wrote: > > I'm proposing to pickle the buffer object so that it unpickles as a > string. Wouldn't this mean you're only solving half the problem? Unpickling a Numeric array this way would still use an intermediate string. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Tue Apr 19 07:16:16 2005 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue Apr 19 07:16:20 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42648B6A.3070700@canterbury.ac.nz> References: <42645523.3000109@ee.byu.edu> <42648B6A.3070700@canterbury.ac.nz> Message-ID: <42649420.3040501@v.loewis.de> Greg Ewing wrote: > Wouldn't this mean you're only solving half the problem? > Unpickling a Numeric array this way would still use an > intermediate string. Precisely my concern. Martin From prakash.ayyardevar at gmail.com Tue Apr 19 13:39:12 2005 From: prakash.ayyardevar at gmail.com (Prakash A) Date: Tue Apr 19 13:39:22 2005 Subject: [Python-Dev] Python 2.1 in HP-UX Message-ID: <025b01c544d4$6b7a7470$1a0110ac@PRACO> Hello All, I using jython 2.1. For that i need of Python 2.1 ( i am sure about this, pls clarify me if any version of Python can be used with Jython). and i am working HP-UX platform. I need to know that, whether Python can be built in HP-UX, because i seeing some of the mails saying Python 2.1 did not compile in HP-UX and Python can not build with HP-UX. Please tell me, whether Python 2.1 can be built in HP-UX. If yes, please give me the steps to do that. Thanks in Advance, Prakash.A -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050419/5b7cea30/attachment.html From aahz at pythoncraft.com Tue Apr 19 16:35:03 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 19 16:35:25 2005 Subject: [Python-Dev] Python 2.1 in HP-UX In-Reply-To: <025b01c544d4$6b7a7470$1a0110ac@PRACO> References: <025b01c544d4$6b7a7470$1a0110ac@PRACO> Message-ID: <20050419143503.GA24331@panix.com> On Tue, Apr 19, 2005, Prakash A wrote: > > I using jython 2.1.
For that i need of Python 2.1 ( i am sure about > this, pls clarify me if any version of Python can be used with > Jython). and i am working HP-UX platform. I need to know that, > whether Python can be built in HP-UX, because i seeing some of the > mails saying Python 2.1 did not compile in HP-UX and Python can not > build with HP-UX. Please tell me, whether Python 2.1 can be built in > HP-UX. If yes, please give me the steps to do that. python-dev is for development of the Python project. Please use comp.lang.python for other questions. Thank you. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "The joy of coding Python should be in seeing short, concise, readable classes that express a lot of action in a small amount of clear code -- not in reams of trivial code that bores the reader to death." --GvR From flinkkettel at yahoo.com Tue Apr 19 17:34:46 2005 From: flinkkettel at yahoo.com (Ralph Hilton) Date: Tue Apr 19 17:34:49 2005 Subject: [Python-Dev] How do you get yesterday from a time object Message-ID: <20050419153446.30407.qmail@web60116.mail.yahoo.com> i'm a beginning python programmer. I want to get the date for yesterday nowTime = time.localtime(time.time()) print nowTime. oneDay = 60*60*24 # number seconds in a day yday = nowTime - oneDay # <-- generates an error print yday.strftime("%Y-%m-%d") How can I just get yesterday's day? It a simple concept yet it seems to be so hard to figure out. What i'm worried about is if today is say June 1, 2023 what is yesterday? and how do i compute that? Ralph Hilton __________________________________ Do you Yahoo!? Plan great trips with Yahoo! Travel: Now over 17,000 guides!
http://travel.yahoo.com/p-travelguide From simon.brunning at gmail.com Tue Apr 19 17:57:31 2005 From: simon.brunning at gmail.com (Simon Brunning) Date: Tue Apr 19 17:57:33 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <8c7f10c6050419085724d6e8e5@mail.gmail.com> On 4/19/05, Ralph Hilton wrote: > i'm a beginning python programmer. > > I want to get the date for yesterday This is the wrong place for this question. Nip over to http://mail.python.org/mailman/listinfo/python-list, and I'd be more than happy answer it there... -- Cheers, Simon B, simon@brunningonline.net, http://www.brunningonline.net/simon/blog/ From p.f.moore at gmail.com Tue Apr 19 17:59:07 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue Apr 19 17:59:11 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <79990c6b0504190859316a9e69@mail.gmail.com> On 4/19/05, Ralph Hilton wrote: > i'm a beginning python programmer. > > I want to get the date for yesterday > > nowTime = time.localtime(time.time()) > print nowTime. > oneDay = 60*60*24 # number seconds in a day > yday = nowTime - oneDay # <-- generates an error > print yday.strftime("%Y-%m-%d") > > How can I just get yesterday's day? It a simple > concept yet it seems to be so hard to figure out. > > What i'm worried about is if today is say > June 1, 2023 > what is yesterday? and how do i compute that? You don't want the python-dev list for this type of question. Python-dev is for development *of* Python. For usage questions such as this, you would be better asking on python-list (or, equivalently, the Usenet group comp.lang.python). 
To assist with your question, though, I'd suggest you look at the documentation of the datetime module, which allows you to do what you are after (and much more). Regards, Paul From pdecat at gmail.com Tue Apr 19 18:07:17 2005 From: pdecat at gmail.com (Patrick DECAT) Date: Tue Apr 19 18:07:19 2005 Subject: [Python-Dev] How do you get yesterday from a time object In-Reply-To: <20050419153446.30407.qmail@web60116.mail.yahoo.com> References: <20050419153446.30407.qmail@web60116.mail.yahoo.com> Message-ID: <3dd9f8f6050419090737913059@mail.gmail.com> Hi, I believe it's not the appropriate place to ask such questions. You should check the Python users' list ( http://python.org/community/lists.html ) Anyway, here you go : now = time.time() nowTuple = time.localtime(now) yesterdayTuple = time.localtime(now-60*60*24) Regards, Patrick. 2005/4/19, Ralph Hilton : > i'm a beginning python programmer. > > I want to get the date for yesterday > > nowTime = time.localtime(time.time()) > print nowTime. > oneDay = 60*60*24 # number seconds in a day > yday = nowTime - oneDay # <-- generates an error > print yday.strftime("%Y-%m-%d") > > How can I just get yesterday's day? It a simple > concept yet it seems to be so hard to figure out. > > What i'm worried about is if today is say > June 1, 2023 > what is yesterday? and how do i compute that? > > Ralph Hilton > > __________________________________ > Do you Yahoo!? > Plan great trips with Yahoo! Travel: Now over 17,000 guides! 
> http://travel.yahoo.com/p-travelguide > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/pdecat%40gmail.com > From lcaamano at gmail.com Tue Apr 19 19:10:03 2005 From: lcaamano at gmail.com (Luis P Caamano) Date: Tue Apr 19 19:10:08 2005 Subject: [Python-Dev] os.urandom uses closed FD (sf 1177468) Message-ID: We're running into the problem described in bug 1177468, where urandom tries to use a cached file descriptor that was closed by a daemonizing function. A quick fix/workaround is to have os.urandom open /dev/urandom every time it gets called instead of using a cached fd. Would that create any problems other than those related to the additional system call overhead? BTW, I added the traceback we're getting as a comment to the bug. Thanks PS This is with Python 2.4.1 -- Luis P Caamano Atlanta, GA USA -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20050419/3d0e7848/attachment.htm From jjinux at gmail.com Tue Apr 19 20:35:21 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Tue Apr 19 20:35:32 2005 Subject: [Python-Dev] anonymous blocks Message-ID: (I apologize that this is my first post. Please don't flame me into oblivion or think I'm a quack!) Have you guys considered the following syntax for anonymous blocks? I think it's possible to parse given Python's existing syntax: items.doFoo( def (a, b) { return a + b }, def (c, d) { return c + d } ) Notice the trick is that there is no name between the def and the "(", and the ")" is followed by a "{". I understand that there is hesitance to use "{}". However, you can think of this as a Python special case on the same level as using ";" between statements on a single line. From that perspective, it's not inconsistent at all.
Best Regards, -jj From tjreedy at udel.edu Tue Apr 19 20:48:59 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 19 20:50:43 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: "Shannon -jj Behrens" wrote in message news:c41f67b90504191135e85c8b5@mail.gmail.com... >Have you guys considered the following syntax for anonymous blocks? There have probably been about 10 such proposals bandied about over the years, mostly on comp.lang.python, which is the more appropriate place for speculative proposals such as this. >I understand that there is hesitance to use "{}". For some, there is more than 'hesitance'. If you understood why, as has been discussed on c.l.p several times, I doubt you would bother proposing such. I won't repeat them here. I should hope there is a Python FAQ entry on this. Terry J. Reedy From gvanrossum at gmail.com Tue Apr 19 20:55:02 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 20:55:14 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: > (I apologize that this is my first post. Please don't flame me into > oblivion or think I'm a quack!) (Having met JJ I can assure he's not a quack. But don't let that stop the flames. :-) > Have you guys considered the following syntax for anonymous blocks? I > think it's possible to parse given Python's existing syntax: > > items.doFoo( > def (a, b) { > return a + b > }, > def (c, d) { > return c + d > } > ) > > Notice the trick is that there is no name between the def and the "(", > and the ")" is followed by a "{". > > I understand that there is hesitance to use "{}". However, you can > think of this as a Python special case on the same level as using ";" > between statements on a single line. From that perspective, it's not > inconsistent at all. It would be a lot less inconsistent if {...} would be acceptable alternative block syntax everywhere. But what exactly are you trying to accomplish here?
I think that putting the defs *before* the call (and giving the anonymous blocks temporary local names) actually makes the code clearer: def block1(a, b): return a + b def block2(c, d): return c + d items.doFoo(block1, block2) This reflects a style pattern that I've come to appreciate more recently: when breaking a call with a long argument list to fit on your screen, instead of trying to find the optimal break points in the argument list, take one or two of the longest arguments and put them in local variables. Thus, instead of this: self.disentangle(0x40, self.subcalculation("The quick brown fox jumps over the lazy dog"), self.indent+1) I'd recommend this: tri = self.subcalculation("The quick brown fox jumps over the lazy dog") self.disentangle(0x40, tri, self.indent+1) IMO this is clearer, and even shorter! If we apply this to the anonymous block problem, we may end up finding lambda the ultimate compromise -- like a gentleman in the back of my talk last week at baypiggies observed (unfortunately I don't know his name). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Tue Apr 19 21:11:01 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 21:11:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: Shannon -jj Behrens wrote: > Have you guys considered the following syntax for anonymous blocks? I > think it's possible to parse given Python's existing syntax: > > items.doFoo( > def (a, b) { > return a + b > }, > def (c, d) { > return c + d > } > ) > There was a proposal in the last few days on comp.lang.python that allows you to do this in a way that requires less drastic changes to python's syntax. See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) (an earlier, similar proposal is here: http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list %40python.org ).
In short, if doFoo is defined like: def doFoo(func1, func2): pass You would be able to call it like: doFoo(**): def func1(a, b): return a + b def func2(c, d): return c + d That is, a suite can be used to define keyword arguments. -Brian From oliphant at ee.byu.edu Tue Apr 19 21:13:40 2005 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 19 21:13:44 2005 Subject: [Python-Dev] Pickling buffer objects. In-Reply-To: <42648B6A.3070700@canterbury.ac.nz> References: <42645523.3000109@ee.byu.edu> <42648B6A.3070700@canterbury.ac.nz> Message-ID: <42655864.5030903@ee.byu.edu> Greg Ewing wrote: > Travis Oliphant wrote: > >> >> I'm proposing to pickle the buffer object so that it unpickles as a >> string. > > > Wouldn't this mean you're only solving half the problem? > Unpickling a Numeric array this way would still use an > intermediate string. Well, actually, unpickling in the new numeric uses the intermediate string as the memory (yes, I know it's not supposed to be "mutable", but without a mutable bytes object what else are you supposed to do?). Thus, ideally we would have a mutable-bytes object with a separate pickle opcode. Without this, we overuse the string object. But, since the string is only created by the pickle (and nobody else uses it), what's the real harm? So, in reality the previously-mentioned patch together with modifications to Numeric's unpickling code actually solves the whole problem. -Travis From gvanrossum at gmail.com Tue Apr 19 21:24:25 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 21:24:28 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: > See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) > (an earlier, similar proposal is here: > http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list > %40python.org ).
> > In short, if doFoo is defined like: > > def doFoo(func1, func2): > pass > > You would be able to call it like: > > doFoo(**): > def func1(a, b): > return a + b > def func2(c, d): > return c + d > > That is, a suite can be used to define keyword arguments. I'm still not sure how this is particularly solving a pressing problem that isn't solved by putting the function definitions in front of the call. I saw the first version of the proto-PEP and didn't think that the motivating example (keeping the getx/setx methods passed to a property definition out of the class namespace) was all that valuable. Two more issues: (1) It seems that *every* name introduced in the block automatically becomes a keyword argument. This looks like a problem, since you could easily need temporary variables there. (I don't see that a problem with class bodies because the typical use there is only method and property definitions and the occasional instance variable default.) (2) This seems to be attaching a block to a specific function call but there are more general cases: e.g. you might want to assign the return value of doFoo() to a variable, or you might want to pass it as an argument to another call. *If* we're going to create syntax for anonymous blocks, I think the primary use case ought to be cleanup operations to replace try/finally blocks for locking and similar things. 
I'd love to have syntactical support so I can write blahblah(myLock): code code code instead of myLock.acquire() try: code code code finally: myLock.release() -- --Guido van Rossum (home page: http://www.python.org/~guido/) From fredrik at pythonware.com Tue Apr 19 21:28:31 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 21:30:32 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: Brian Sabbey wrote: > In short, if doFoo is defined like: > > def doFoo(func1, func2): > pass > > You would be able to call it like: > > doFoo(**): > def func1(a, b): > return a + b > def func2(c, d): > return c + d > > That is, a suite can be used to define keyword arguments. umm. isn't that just an incredibly obscure way to write def func1(a, b): return a + b def func2(c, d): return c + d doFoo(func1, func2) but with more indentation? From pje at telecommunity.com Tue Apr 19 21:39:56 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Apr 19 21:35:59 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> At 11:55 AM 04/19/2005 -0700, Guido van Rossum wrote: >I'd recommend this: > >tri = self.subcalculation("The quick brown fox jumps over the lazy dog") >self.disentangle(0x40, tri, self.indent+1) > >IMO this is clearer, and even shorter! What was your opinion on "where" as a lambda replacement? i.e. foo = bar(callback1, callback2) where: def callback1(x): print "hello, " def callback2(x): print "world!" I suspect that you like the define-first approach because of your tendency to ask questions first and read later. That is, you want to know what callback1 and callback2 are before you see them passed to something. However, other people seem to like to have the context first, then fill in the details of each callback later. Interestingly, this syntax also works to do decoration, though it's not a syntax that was ever proposed for that. 
e.g.: foo = classmethod(foo) where: def foo(cls,x,y,z): # etc. foo = property(get_foo,set_foo) where: def get_foo(self): # ... def set_foo(self): # ... I don't mind @decorators, of course, but maybe they wouldn't be needed here. >If we apply this to the anonymous block problem, we may end up finding >lambda the ultimate compromise -- like a gentleman in the back of my >talk last week at baypiggies observed (unfortunately I don't know his >name). > >-- >--Guido van Rossum (home page: http://www.python.org/~guido/) >_______________________________________________ >Python-Dev mailing list >Python-Dev@python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: >http://mail.python.org/mailman/options/python-dev/pje%40telecommunity.com From pje at telecommunity.com Tue Apr 19 21:47:57 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Tue Apr 19 21:43:58 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> References: Message-ID: <5.1.1.6.0.20050419154251.00a8c9a0@mail.telecommunity.com> At 03:39 PM 04/19/2005 -0400, Phillip J. Eby wrote: >I suspect that you like the define-first approach because of your tendency >to ask questions first and read later. Oops; I forgot to put the smiley on that. It was supposed to be a humorous reference to a comment Guido made in private e-mail about the Dr. Dobbs article I wrote on decorators. He had said something similar about the way he reads articles, expecting the author to answer all his questions up front. Without that context, the above sentence sounds like some sort of snippy remark that I did not intend it to be. Sorry. 
:( From facundobatista at gmail.com Tue Apr 19 21:49:08 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 19 21:49:11 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: On 4/19/05, Guido van Rossum wrote: > I'm still not sure how this is particularly solving a pressing problem > that isn't solved by putting the function definitions in front of the Well. From what I've read in my short python experience, people want to change the language *not* because they have a problem that can not be solved in a different way, but because they *like* to solve it in a different way. And you, making a stand against this, are a main Python feature. . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From sabbey at u.washington.edu Tue Apr 19 21:55:33 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 21:55:37 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: Guido van Rossum wrote: >> See the thread "pre-PEP: Suite-Based Keywords" (shameless plug) >> (an earlier, similar proposal is here: >> http://groups.google.co.uk/groups?selm=mailman.403.1105274631.22381.python-list >> %40python.org ). >> >> In short, if doFoo is defined like: >> >> def doFoo(func1, func2): >> pass >> >> You would be able to call it like: >> >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > I'm still not sure how this is particularly solving a pressing problem > that isn't solved by putting the function definitions in front of the > call. I saw the first version of the proto-PEP and didn't think that > the motivating example (keeping the getx/setx methods passed to a > property definition out of the class namespace) was all that valuable. OK. I think most people (myself included) who would prefer to define properties (and event handlers, etc.)
in this way are motivated by the perception that the current method is just ugly. I don't know that it solves any pressing problems. > Two more issues: > > (1) It seems that *every* name introduced in the block automatically > becomes a keyword argument. This looks like a problem, since you could > easily need temporary variables there. (I don't see that a problem > with class bodies because the typical use there is only method and > property definitions and the occasional instance variable default.) Combining the suite-based keywords proposal with the earlier, 'where' proposal (linked in my above post), you would be able to name variables individually in the case that temporary variables are needed: f(x=x): x = [i**2 for i in [1,2,3]] > (2) This seems to be attaching a block to a specific function call but > there are more general cases: e.g. you might want to assign the return > value of doFoo() to a variable, or you might want to pass it as an > argument to another call. The 'where' proposal also doesn't have this problem. Any expression is allowed. > *If* we're going to create syntax for anonymous blocks, I think the > primary use case ought to be cleanup operations to replace try/finally > blocks for locking and similar things. I'd love to have syntactical > support so I can write > > blahblah(myLock): > code > code > code > > instead of > > myLock.acquire() > try: > code > code > code > finally: > myLock.release() Well, that was my other proposal, "pre-PEP: Simple Thunks" (there is also an implementation). It didn't seem to go over all that well. I am going to try to rewrite it and give more motivation and explanation (and maybe use 'with' and 'from' instead of 'do' and 'in' as keywords). 
-Brian From gvanrossum at gmail.com Tue Apr 19 22:00:50 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 22:00:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: > What was your opinion on "where" as a lambda replacement? i.e. > > foo = bar(callback1, callback2) where: > def callback1(x): > print "hello, " > def callback2(x): > print "world!" I don't recall seeing this proposed, but I might have -- I thought of pretty much exactly this syntax in the shower a few days ago. Unfortunately it doesn't solve the lock-release use case that is more pressing in my mind. Also, if you want top-down programming (which is a fine coding style!), we already have several ways to do that. > I suspect that you like the define-first approach because of your tendency > to ask questions first and read later. That is, you want to know what > callback1 and callback2 are before you see them passed to > something. However, other people seem to like to have the context first, > then fill in the details of each callback later. I think it all depends, not so much on the personality of the reader, but on the specifics of the program. When callback1 and callback2 are large chunks of code, we probably all agree that it's better to have them out of the way, either way up or way down -- purely because of their size they deserve to be abstracted away when we're reading on how they are being used. A more interesting use case may be when callback1 and callback2 are very *small* amounts of code, since that's the main use case for lambda; there knowing what callback1 and callback2 stand for is probably important. I have to say that as long as it's only a few lines away I don't care much whether the detail is above or below its application, since it will all fit on a single screen and I can look at it all together. 
So then the 'where' syntax isn't particularly attractive because it doesn't solve a problem I'm experiencing. > Interestingly, this syntax also works to do decoration, though it's not a > syntax that was ever proposed for that. e.g.: > > foo = classmethod(foo) where: > def foo(cls,x,y,z): > # etc. This requires you to write foo three times, which defeats at least half of the purpose of decorators. > foo = property(get_foo,set_foo) where: > def get_foo(self): > # ... > def set_foo(self): > # ... > > I don't mind @decorators, of course, but maybe they wouldn't be needed here. As I said before, I'm not sure why keeping get_foo etc. out of the class namespace is such a big deal. In fact, I like having them there (sometimes they can even be handy, e.g. you might be able to pass the unbound get_foo method as a sort key). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Tue Apr 19 22:06:44 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 22:06:49 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: Fredrik Lundh wrote: >> In short, if doFoo is defined like: >> >> def doFoo(func1, func2): >> pass >> >> You would be able to call it like: >> >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > umm. isn't that just an incredibly obscure way to write > > def func1(a, b): > return a + b > def func2(c, d): > return c + d > doFoo(func1, func2) > > but with more indentation? If suites were commonly used as above to define properties, event handlers and other callbacks, then I think most people would be able to comprehend what the first example above is doing much more quickly than the second. So, I don't find it obscure for any reason other than because no one does it. 
Also, the two examples above are not exactly the same since the two functions are defined in a separate namespace in the top example. -Brian From python-kbutler at sabaydi.com Tue Apr 19 22:14:54 2005 From: python-kbutler at sabaydi.com (Kevin J. Butler) Date: Tue Apr 19 22:12:58 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <20050419185608.F19941E4014@bag.python.org> References: <20050419185608.F19941E4014@bag.python.org> Message-ID: <426566BE.60403@sabaydi.com> > > >From: Guido van Rossum > ... >This reflects a style pattern that I've come to appreciate more >recently: when breaking a call with a long argument list to fit on >your screen, instead of trying to find the optimal break points in the >argument list, take one or two of the longest arguments and put them >in local variables. > ... >If we apply this to the anonymous block problem, we may end up finding >lambda the ultimate compromise -- like a gentleman in the back of my >talk last week at baypiggies observed (unfortunately I don't know his >name). > > I like it: Lambda: The Ultimate Compromise (c.f. http://library.readscheme.org/page1.html) kb From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 19 22:14:16 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 19 22:16:11 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: Guido van Rossum wrote: >> What was your opinion on "where" as a lambda replacement? i.e. >> >> foo = bar(callback1, callback2) where: >> def callback1(x): >> print "hello, " >> def callback2(x): >> print "world!" > > I don't recall seeing this proposed, but I might have -- I thought of > pretty much exactly this syntax in the shower a few days ago. Gee, the time machine again! Lots of proposals on c.l.py are based on the introduction of "expression suites", that is, suites embedded in arbitrary expressions.
My opinion is that one will never find a suitable (;-) syntax, there's always the question of where to put the code that follows the suite (and is part of the same statement). yours, Reinhold -- Mail address is perfectly valid! From barry at python.org Tue Apr 19 22:18:41 2005 From: barry at python.org (Barry Warsaw) Date: Tue Apr 19 22:18:47 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <1113941921.14525.39.camel@geddy.wooz.org> On Tue, 2005-04-19 at 15:24, Guido van Rossum wrote: > *If* we're going to create syntax for anonymous blocks, I think the > primary use case ought to be cleanup operations to replace try/finally > blocks for locking and similar things. I'd love to have syntactical > support so I can write > > blahblah(myLock): > code > code > code > > instead of > > myLock.acquire() > try: > code > code > code > finally: > myLock.release() Indeed, it would be very cool to have these kind of (dare I say) block decorators for managing resources. The really nice thing about that is when I have to protect multiple resources in a safe, but clean way inside a single block. Too many nested try/finally's cause you to either get sloppy, or really ugly (or both!). RSMotD (random stupid musing of the day): so I wonder if the decorator syntax couldn't be extended for this kind of thing. @acquire(myLock): code code code -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 307 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050419/c2780ce7/attachment.pgp From eric.nieuwland at xs4all.nl Tue Apr 19 22:20:07 2005 From: eric.nieuwland at xs4all.nl (Eric Nieuwland) Date: Tue Apr 19 22:20:10 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Guido van Rossum wrote: > tri = self.subcalculation("The quick brown fox jumps over the lazy > dog") > self.disentangle(0x40, tri, self.indent+1) > > IMO this is clearer, and even shorter! But it clutters the namespace with objects you don't need. So the complete equivalent would be more close to: tri = self.subcalculation("The quick brown fox jumps over the lazy dog") self.disentangle(0x40, tri, self.indent+1) del tri which seems a bit odd to me. > If we apply this to the anonymous block problem, we may end up finding > lambda the ultimate compromise -- like a gentleman in the back of my > talk last week at baypiggies observed (unfortunately I don't know his > name). It wasn't me ;-) It seems this keeps getting back at you. Wish I had thought of this argument before. --eric From gvanrossum at gmail.com Tue Apr 19 22:27:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 22:27:05 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Message-ID: > > IMO this is clearer, and even shorter! > But it clutters the namespace with objects you don't need. Why do people care about cluttering namespaces so much? I thought thats' what namespaces were for -- to put stuff you want to remember for a bit. A function's local namespace in particular seems a perfectly fine place for temporaries. 
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From glyph at divmod.com  Tue Apr 19 22:27:57 2005
From: glyph at divmod.com (Glyph Lefkowitz)
Date: Tue Apr 19 22:28:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: 
Message-ID: <426569CD.1010701@divmod.com>

Guido van Rossum wrote:

> But what exactly are you trying to accomplish here?  I think that
> putting the defs *before* the call (and giving the anonymous blocks
> temporary local names) actually makes the code clearer:

I'm afraid that 'block1', 'block2', and 'doFoo' aren't really making
anything clear for me - can you show a slightly more concrete example?

> def block1(a, b):
>     return a + b
> def block2(c, d):
>     return c + d
> items.doFoo(block1, block2)

Despite being guilty of propagating this style for years myself, I have
to disagree.  Consider the following network-conversation using Twisted
style (which, I might add, would be generalizable to other Twisted-like
systems if they existed ;-)):

    def strawman(self):
        def sayGoodbye(mingleResult):
            def goAway(goodbyeResult):
                self.loseConnection()
            self.send("goodbye").addCallback(goAway)
        def mingle(helloResult):
            self.send("nice weather we're having").addCallback(sayGoodbye)
        self.send("hello").addCallback(mingle)

On the wire, this would look like:

    > hello
    < (response) hello
    > nice weather we're having
    < (response) nice weather we're having
    > goodbye
    < (response) goodbye
    FIN

Note that the temporal order of events here is _exactly backwards_ to
the order of calls in the code, because we have to name everything
before it can happen.
Now, with anonymous blocks (using my own pet favorite syntax, of
course):

    def tinman(self):
        self.send("hello").addCallback(def (x):
            self.send("nice weather we're having").addCallback(def (y):
                self.send("goodbye").addCallback(def (z):
                    self.loseConnection())))

Now, of course, this is written as network I/O because that is my
bailiwick, but you could imagine an identical example with a nested
chain of dialog boxes in a GUI, or a state machine controlling a robot.

For completeness, the same example _can_ be written in the same order
as events actually occur, but it takes twice the number of lines and
ends up creating a silly number of extra names:

    def lion(self):
        d1 = self.send("hello")
        def d1r(x):
            d2 = self.send("nice weather we're having")
            def d2r(y):
                d3 = self.send("goodbye")
                def d3r(z):
                    self.loseConnection()
                d3.addCallback(d3r)
            d2.addCallback(d2r)
        d1.addCallback(d1r)

but this only works if you have a callback-holding object like
Twisted's Deferred.  If you have to pass a callback function as an
argument, as many APIs require, you really have to define the functions
before they're called.

My point here is not that my proposed syntax is particularly great, but
that anonymous blocks are a real win in terms of both clarity and line
count.  I'm glad Guido is giving them a moment in the limelight :).

Should there be a PEP about this?

From gvanrossum at gmail.com  Tue Apr 19 22:33:15 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Tue Apr 19 22:33:18 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <1113941921.14525.39.camel@geddy.wooz.org>
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: 

> @acquire(myLock):
>     code
>     code
>     code

It would certainly solve the problem of which keyword to use! :-) And
I think the syntax isn't even ambiguous -- the trailing colon
distinguishes this from the function decorator syntax. I guess it
would morph '@xxx' into "user-defined-keyword".

How would acquire be defined?
I guess it could be this, returning a
function that takes a callable as an argument just like other
decorators:

    def acquire(aLock):
        def acquirer(block):
            aLock.acquire()
            try:
                block()
            finally:
                aLock.release()
        return acquirer

and the substitution of

    @EXPR:
        CODE

would become something like

    def __block():
        CODE
    EXPR(__block)

I'm not yet sure whether to love or hate it. :-)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From sabbey at u.washington.edu  Tue Apr 19 22:46:16 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Tue Apr 19 22:46:20 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: 

Guido van Rossum wrote:
>> @acquire(myLock):
>>     code
>>     code
>>     code
>
> It would certainly solve the problem of which keyword to use! :-) And
> I think the syntax isn't even ambiguous -- the trailing colon
> distinguishes this from the function decorator syntax. I guess it
> would morph '@xxx' into "user-defined-keyword".
>
> How would acquire be defined? I guess it could be this, returning a
> function that takes a callable as an argument just like other
> decorators:
>
> def acquire(aLock):
>     def acquirer(block):
>         aLock.acquire()
>         try:
>             block()
>         finally:
>             aLock.release()
>     return acquirer
>
> and the substitution of
>
> @EXPR:
>     CODE
>
> would become something like
>
> def __block():
>     CODE
> EXPR(__block)

Why not have the block automatically be inserted into acquire's argument
list?  It would probably get annoying to have to define inner functions
like that every time one simply wants to use arguments.  For example:

    def acquire(block, aLock):
        aLock.acquire()
        try:
            block()
        finally:
            aLock.release()

    @acquire(myLock):
        code
        code
        code

Of course, augmenting the argument list in that way would be different
than the behavior of decorators as they are now.
-Brian From fredrik at pythonware.com Tue Apr 19 23:06:48 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 23:09:32 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: Message-ID: Brian Sabbey wrote: > If suites were commonly used as above to define properties, event handlers > and other callbacks, then I think most people would be able to comprehend > what the first example above is doing much more quickly than the second. wonderful logic, there. good luck with your future adventures in language design. From fredrik at pythonware.com Tue Apr 19 23:13:14 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Tue Apr 19 23:14:44 2005 Subject: [Python-Dev] Re: anonymous blocks References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl> Message-ID: Guido van Rossum wrote: > This reflects a style pattern that I've come to appreciate more > recently: what took you so long? ;-) > Why do people care about cluttering namespaces so much? I thought > thats' what namespaces were for -- to put stuff you want to remember > for a bit. A function's local namespace in particular seems a > perfectly fine place for temporaries. and by naming stuff, you can often eliminate a comment or three. this is python. names are cheap. From gvanrossum at gmail.com Tue Apr 19 23:28:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 19 23:28:24 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: > Why not have the block automatically be inserted into acquire's argument > list? It would probably get annoying to have to define inner functions > like that every time one simply wants to use arguments. But the number of *uses* would be much larger than the number of "block decorators" you'd be coding. If you find yourself writing new block decorators all the time that's probably a sign you're too much in love with the feature. 
:-)

> For example:
>
> def acquire(block, aLock):
>     aLock.acquire()
>     try:
>         block()
>     finally:
>         aLock.release()
>
> @acquire(myLock):
>     code
>     code
>     code
>
> Of course, augmenting the argument list in that way would be different
> than the behavior of decorators as they are now.

I don't like implicit modifications of argument lists other than by
method calls. It's okay for method calls because in the x.foo(a) <==>
foo(x, a) equivalence, x is really close to the beginning of the
argument list.

And your proposal would preclude parameterless block decorators (or
turn them into an ugly special case), which I think might be quite
useful:

    @forever:
        infinite loop body

    @ignore:
        not executed at all

    @require:
        assertions go here

and so on.

(In essence, we're inventing the opposite of "barewords" in Perl here,
right?)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From tcdelaney at optusnet.com.au  Tue Apr 19 23:47:14 2005
From: tcdelaney at optusnet.com.au (Tim Delaney)
Date: Tue Apr 19 23:47:18 2005
Subject: [Python-Dev] anonymous blocks
References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com>
Message-ID: <006101c54529$59427370$f100a8c0@ryoko>

Guido van Rossum wrote:

> As I said before, I'm not sure why keeping get_foo etc. out of the
> class namespace is such a big deal. In fact, I like having them there
> (sometimes they can even be handy, e.g. you might be able to pass the
> unbound get_foo method as a sort key).

Not to mention that it's possible to override get_foo in subclasses if
done right ...
Two approaches are here: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/408713 Tim Delaney From sabbey at u.washington.edu Tue Apr 19 23:48:01 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Tue Apr 19 23:48:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: Guido van Rossum wrote: >> Why not have the block automatically be inserted into acquire's argument >> list? It would probably get annoying to have to define inner functions >> like that every time one simply wants to use arguments. > > But the number of *uses* would be much larger than the number of > "block decorators" you'd be coding. If you find yourself writing new > block decorators all the time that's probably a sign you're too much > in love with the feature. :-) Ok, but in explanations of how to use such blocks, they appear about equally often. They will therefore seem more difficult to use than they have to. > I don't like implicit modifications of argument lists other than by > method calls. It's okay for method calls because in the x.foo(a) <==> > foo(x, a) equivalence, x is really close to the beginning of the > argument list. There is a rough equivalence: @foo(x): block <==> @foo(block, x) Of course, the syntax does not allow such an equivalence, but conceptually it's there. To improve the appearance of equivalence, the block could be made the last element in the argument list. > And your proposal would preclude parameterless block decorators (or > turn them into an ugly special case), which I think might be quite > useful: > > @forever: > infinite loop body > > @ignore: > not executed at all > > @require: > assertions go here > > and so on. > > (In essence, we're inventing the opposite of "barewords" in Perl here, right?) I don't understand this. Why not: @forever(): infinite loop body etc.? The same is done with methods: x.foo() (or am I missing something?). 
I actually prefer this because using '()' makes it clear that you are
making a call to 'forever'.  Importantly, 'forever' can throw exceptions
at you.  Without the '()' one does not get this reminder.  I also
believe it is more difficult to read without '()'.  The call to the
function is implicit in the fact that it sits next to '@'.

But, again, if such argument list augmentation were done, something
other than '@' would need to be used so as to not conflict with function
decorator behavior.

-Brian

From jcarlson at uci.edu  Wed Apr 20 00:21:27 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 00:23:32 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: <20050419145855.6391.JCARLSON@uci.edu>

[Guido van Rossum]
> @EXPR:
>     CODE
>
> would become something like
>
> def __block():
>     CODE
> EXPR(__block)
>
> I'm not yet sure whether to love or hate it. :-)

Is it preferable for CODE to execute in its own namespace (the above
being a literal translation of the given code), or for it to execute in
the originally defined namespace?  Deferring to Greg Ewing for a moment
[1]:

    They should be lexically scoped, not dynamically scoped.

Wrapped blocks in an old namespace, I believe, is the way to go,
especially for things like...

    @synchronize(fooLock):
        a = foo.method()

I cannot come up with any code for which CODE executing in its own
namespace makes sense.  Can anyone else?
- Josiah [1] http://mail.python.org/pipermail/python-dev/2005-March/052239.html From sabbey at u.washington.edu Wed Apr 20 00:43:45 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Wed Apr 20 00:43:49 2005 Subject: [Python-Dev] Re: Re: anonymous blocks In-Reply-To: References: Message-ID: Fredrik Lundh wrote: > Brian Sabbey wrote: > >> If suites were commonly used as above to define properties, event handlers >> and other callbacks, then I think most people would be able to comprehend >> what the first example above is doing much more quickly than the second. > > wonderful logic, there. good luck with your future adventures in language > design. > > I'm just trying to help python improve. Maybe I'm not doing a very good job, I don't know. Either way, there's no need to be rude. If I've broken some sort of unspoken code of behavior for this list, then maybe it would be easier if you just 'spoke' it (perhaps in a private email or in the description of this list on python.org). I'm not sure what your point is exactly. Are you saying that any language feature that needs to be commonly used to be comprehendible will never be comprehendible because it will never be commonly used? If so, then I do not think you have a valid point. I never claimed that keyword suites *need* to be commonly used to be comprehendible. I only said that if they were commonly used they would be more comprehendible than the alternative. I happen to also believe that seeing them once or twice is enough to make them about equally as comprehendible as the alternative. 
-Brian From bjourne at gmail.com Wed Apr 20 00:57:05 2005 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Wed Apr 20 00:57:15 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <1113941921.14525.39.camel@geddy.wooz.org> References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: <740c3aec0504191557505d6e9f@mail.gmail.com> > RSMotD (random stupid musing of the day): so I wonder if the decorator > syntax couldn't be extended for this kind of thing. > > @acquire(myLock): > code > code > code Would it be useful for anything other than mutex-locking? And wouldn't it be better to make a function of the block wrapped in a block-decorator and then use a normal decorator? -- mvh Bj?rn From bac at OCF.Berkeley.EDU Wed Apr 20 01:17:33 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 01:17:42 2005 Subject: [Python-Dev] Proper place to put extra args for building Message-ID: <4265918D.7040700@ocf.berkeley.edu> I am currently adding some code for a Py_COMPILER_DEBUG build for use on the AST branch. I thought that OPT was the proper variable to put stuff like this into for building (``-DPy_COMPILER_DEBUG``), but that erases ``-g -Wall -Wstrict-prototypes``. Obviously I could just tack all of that into my own thing, but that seems like an unneeded step. >From looking at Makefile.pre.in it seems like CFLAGSFORSHARED is meant for extra arguments to the compiler. Is that right? And I will document this in Misc/Specialbuilds.txt to fix my initial blunderous checkin of specifying OPT (or at least clarifying it). -Brett From pje at telecommunity.com Wed Apr 20 01:23:26 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Wed Apr 20 01:19:31 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> At 01:00 PM 04/19/2005 -0700, Guido van Rossum wrote: > > Interestingly, this syntax also works to do decoration, though it's not a > > syntax that was ever proposed for that. e.g.: > > > > foo = classmethod(foo) where: > > def foo(cls,x,y,z): > > # etc. > >This requires you to write foo three times, which defeats at least >half of the purpose of decorators. Well, you could do 'foo = classmethod(x) where: def x(...)', but that *is* kind of kludgy. I'm just suggesting that if 'where:' had existed before decorators, people might have griped about the three-time typing or kludged around it, but there wouldn't likely have been strong support for creating a syntax "just" for decorators. Indeed, if somebody had proposed this syntax during the decorator debates I would have supported it, but of course Bob Ippolito (whose PyObjC use cases involve really long function names) might have disagreed. > > foo = property(get_foo,set_foo) where: > > def get_foo(self): > > # ... > > def set_foo(self): > > # ... > > > > I don't mind @decorators, of course, but maybe they wouldn't be needed > here. > >As I said before, I'm not sure why keeping get_foo etc. out of the >class namespace is such a big deal. That's a relatively minor thing, compared to being able to logically group them with the property, which I think enhances readability, even more than the sometimes-proposed '@property.getter' and '@property.setter' decorators. Anyway, just to be clear, I don't personally think 'where:' is needed in Python 2.x; lambda and decorators suffice for all but the most Twisted use cases. ;) I was just viewing it as a potential alternative to lambda in Py3K. 
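[Summary note: on Tim Delaney's point that get_foo can be overridable
"if done right" -- one way to get that effect without any new syntax is
to defer the method lookup to access time, so the property consults the
instance rather than capturing the function at class-definition time.
This is just one possible approach, not necessarily either recipe from
the link above:]

```python
class Base:
    def get_foo(self):
        return "base"

    # The lambda looks up get_foo on the instance each time the property
    # is read, so a subclass override takes effect without redefining
    # the property itself.
    foo = property(lambda self: self.get_foo())

class Derived(Base):
    def get_foo(self):
        return "derived"
```

Reading `Base().foo` calls `Base.get_foo`, while `Derived().foo` picks
up the override, because the binding is deferred until the attribute is
actually accessed.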
From jjinux at gmail.com Wed Apr 20 01:42:08 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Wed Apr 20 01:42:11 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> References: <5.1.1.6.0.20050419153025.00a7eec0@mail.telecommunity.com> <5.1.1.6.0.20050419191423.00a628e0@mail.telecommunity.com> Message-ID: I apologize for sparking such debate on this list instead of on c.l.py. By the way, the only reason I brought this up was as a replacement for lambdas in Py3K. Guido, in response to your much earlier comment about supporting "{}" for normal defs as a matter of consistency within my proposal, yes, I agree. Just like ";", you should rarely use them. Best Regards, -jj On 4/19/05, Phillip J. Eby wrote: > At 01:00 PM 04/19/2005 -0700, Guido van Rossum wrote: > > > Interestingly, this syntax also works to do decoration, though it's not a > > > syntax that was ever proposed for that. e.g.: > > > > > > foo = classmethod(foo) where: > > > def foo(cls,x,y,z): > > > # etc. > > > >This requires you to write foo three times, which defeats at least > >half of the purpose of decorators. > > Well, you could do 'foo = classmethod(x) where: def x(...)', but that *is* > kind of kludgy. I'm just suggesting that if 'where:' had existed before > decorators, people might have griped about the three-time typing or kludged > around it, but there wouldn't likely have been strong support for creating > a syntax "just" for decorators. > > Indeed, if somebody had proposed this syntax during the decorator debates I > would have supported it, but of course Bob Ippolito (whose PyObjC use cases > involve really long function names) might have disagreed. > > > > > foo = property(get_foo,set_foo) where: > > > def get_foo(self): > > > # ... > > > def set_foo(self): > > > # ... > > > > > > I don't mind @decorators, of course, but maybe they wouldn't be needed > > here. 
> > > >As I said before, I'm not sure why keeping get_foo etc. out of the > >class namespace is such a big deal. > > That's a relatively minor thing, compared to being able to logically group > them with the property, which I think enhances readability, even more than > the sometimes-proposed '@property.getter' and '@property.setter' decorators. > > Anyway, just to be clear, I don't personally think 'where:' is needed in > Python 2.x; lambda and decorators suffice for all but the most Twisted use > cases. ;) I was just viewing it as a potential alternative to lambda in Py3K. > > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jjinux%40gmail.com > -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From jack at performancedrivers.com Wed Apr 20 03:01:58 2005 From: jack at performancedrivers.com (Jack Diederich) Date: Wed Apr 20 03:02:03 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> Message-ID: <20050420010158.GA18907@performancedrivers.com> On Tue, Apr 19, 2005 at 01:33:15PM -0700, Guido van Rossum wrote: > > @acquire(myLock): > > code > > code > > code > > It would certainly solve the problem of which keyword to use! :-) And > I think the syntax isn't even ambiguous -- the trailing colon > distinguishes this from the function decorator syntax. I guess it > would morph '@xxx' into "user-defined-keyword". > > How would acquire be defined? I guess it could be this, returning a > function that takes a callable as an argument just like other > decorators: [snip] > and the substitution of > > @EXPR: > CODE > > would become something like > > def __block(): > CODE > EXPR(__block) > > I'm not yet sure whether to love or hate it. 
:-) > I don't know what the purpose of these things is, but I do think they should be like something else to avoid learning something new. Okay, I lied, I do know what these are: "namespace decorators" Namespaces are currently modules or classes, and decorators currently apply only to functions. The dissonance is that function bodies are evaluated later and namespaces (modules and classes) are evaluated immediately. I don't know if adding a namespace that is only evaluated later makes sense. It is only an extra case but it is one extra case to remember. At best I have only channeled Guido once, and by accident[1] so I'll stay out of the specifics (for a bit). -jackdied [1] during the decorator syntax bru-ha-ha at a Boston PIG meeting I suggested Guido liked the decorator-before-function because it made more sense in Dutch. I was kidding, but someone who knows a little Dutch (Deibel?) stated this was, in fact, the case. From michael.walter at gmail.com Wed Apr 20 03:55:52 2005 From: michael.walter at gmail.com (Michael Walter) Date: Wed Apr 20 03:55:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <740c3aec0504191557505d6e9f@mail.gmail.com> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> Message-ID: <877e9a170504191855445e0f4d@mail.gmail.com> On 4/19/05, BJ?rn Lindqvist wrote: > > RSMotD (random stupid musing of the day): so I wonder if the decorator > > syntax couldn't be extended for this kind of thing. > > > > @acquire(myLock): > > code > > code > > code > > Would it be useful for anything other than mutex-locking? And wouldn't > it be better to make a function of the block wrapped in a > block-decorator and then use a normal decorator? Yes. Check how blocks in Smalltalk and Ruby are used for starters. 
Regards,
Michael

From aleaxit at yahoo.com  Wed Apr 20 04:07:45 2005
From: aleaxit at yahoo.com (Alex Martelli)
Date: Wed Apr 20 04:07:50 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <740c3aec0504191557505d6e9f@mail.gmail.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<740c3aec0504191557505d6e9f@mail.gmail.com>
Message-ID: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>

On Apr 19, 2005, at 15:57, Björn Lindqvist wrote:

>> RSMotD (random stupid musing of the day): so I wonder if the decorator
>> syntax couldn't be extended for this kind of thing.
>>
>> @acquire(myLock):
>>     code
>>     code
>>     code
>
> Would it be useful for anything other than mutex-locking? And wouldn't

Well, one obvious use might be, say:

    @withfile('foo.bar', 'r'):
        content = thefile.read()

but that would require the decorator and block to be able to interact
in some way, so that inside the block 'thefile' is defined suitably.

> it be better to make a function of the block wrapped in a
> block-decorator and then use a normal decorator?

From a viewpoint of namespaces, I think it would be better to have the
block execute in the same namespace as the code surrounding it, not a
separate one (assigning to 'content' would not work otherwise), so a
nested function would not be all that useful.  The problem might be,
how does the _decorator_ affect that namespace.  Perhaps:

    def withfile(filename, mode='r'):
        openfile = open(filename, mode)
        try:
            block(thefile=openfile)
        finally:
            openfile.close()

i.e., let the block take keyword arguments to tweak its namespace (but
assignments within the block should still affect its _surrounding_
namespace, it seems to me...).
Alex From shane at hathawaymix.org Wed Apr 20 06:31:36 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed Apr 20 06:31:38 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <4265DB28.8050905@hathawaymix.org> Fredrik Lundh wrote: > Brian Sabbey wrote: >> doFoo(**): >> def func1(a, b): >> return a + b >> def func2(c, d): >> return c + d >> >> That is, a suite can be used to define keyword arguments. > > > umm. isn't that just an incredibly obscure way to write > > def func1(a, b): > return a + b > def func2(c, d): > return c + d > doFoo(func1, func2) > > but with more indentation? Brian's suggestion makes the code read more like an outline. In Brian's example, the high-level intent stands out from the details, while in your example, there is no visual cue that distinguishes the details from the intent. Of course, lambdas are even better, when it's possible to use them: doFoo((lambda a, b: a + b), (lambda c, d: c + d)) Shane From jcarlson at uci.edu Wed Apr 20 06:39:32 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed Apr 20 06:42:43 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <877e9a170504191855445e0f4d@mail.gmail.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> Message-ID: <20050419212423.63AD.JCARLSON@uci.edu> Michael Walter wrote: > > On 4/19/05, BJ?rn Lindqvist wrote: > > > RSMotD (random stupid musing of the day): so I wonder if the decorator > > > syntax couldn't be extended for this kind of thing. > > > > > > @acquire(myLock): > > > code > > > code > > > code > > > > Would it be useful for anything other than mutex-locking? And wouldn't > > it be better to make a function of the block wrapped in a > > block-decorator and then use a normal decorator? > > Yes. Check how blocks in Smalltalk and Ruby are used for starters. 
See the previous two discussions on thunks here on python-dev, and
notice how the only problems that seem bettered via blocks/thunks /in
Python/ are those which are of the form...

    #setup
    try:
        block
    finally:
        #finalization

... and depending on the syntax, properties.

I once asked "Any other use cases for one of the most powerful features
of Ruby, in Python?"  I have yet to hear any sort of reasonable
response.

Why am I getting no response to my question?  Either it is because I am
being ignored, or no one has taken the time to translate one of these
'killer features' from Smalltalk or Ruby, or perhaps such translations
show that there is a better way in Python already.

Now, don't get me wrong, I have more than a few examples of the
try/finally block in my code, so I would personally find it useful, but
just because this one pattern is made easier, doesn't mean that it
should see syntax.

 - Josiah

P.S. If I'm sounding like a broken record to you, don't be surprised.
But until my one question is satisfactorily answered, I'll keep poking
at its soft underbelly.

From shane.holloway at ieee.org  Wed Apr 20 07:10:20 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Wed Apr 20 07:10:55 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: 
Message-ID: <4265E43C.4080707@ieee.org>

> *If* we're going to create syntax for anonymous blocks, I think the
> primary use case ought to be cleanup operations to replace try/finally
> blocks for locking and similar things. I'd love to have syntactical
> support so I can write

I heartily agree!  Especially when you have very similar try/finally
code you use in many places, and wish to refactor it into a common
area.
If this is done, you are forced into a callback form like follows::

    def withFile(filename, callback):
        aFile = open(filename, 'r')
        try:
            result = callback(aFile)
        finally:
            aFile.close()
        return result

    class Before:
        def readIt(self, filename):
            def doReading(aFile):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)
            withFile(filename, doReading)

Which is certainly functional.  I actually use the idiom frequently.
However, my opinion is that it does not read smoothly.  This form
requires that I say what I'm doing with something before I know the
context of what that something is.  For me, blocks are not about
shortening the code, but rather clarifying *intent*.

With this proposed change, the code becomes::

    class After:
        def readIt(self, filename):
            withFile(filename):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)

In my opinion, this is much smoother to read.

This particular example brings up the question of how arguments like
"aFile" get passed and named into the block.  I anticipate the need for
a place to put an argument declaration list.  ;) And no, I'm not
particularly fond of Smalltalk's solution with "| aFile |", but that's
just another opinion of aesthetics.

Another set of questions arose for me when Barry started musing over
the combination of blocks and decorators.  What are blocks?  Well,
obviously they are callable.  What do they return?  The local namespace
they created/modified?

How do blocks work with control flow statements like "break",
"continue", "yield", and "return"?  I think these questions have good
answers, we just need to figure out what they are.  Perhaps "break" and
"continue" raise exceptions similar to StopIteration in this case?

As to the control flow questions, I believe those answers depend on how
the block is used.  Perhaps a few different invocation styles are
applicable.
For instance, the method block.suite() could return a tuple such as
(returnedValue, locals()), where block.__call__() would simply return
like any other callable.  It would be good to figure out what the
control flow difference is between::

    def readAndReturn(self, filename):
        withFile(filename):
            a = self.readPartA(aFile)
            b = self.readPartB(aFile)
            c = self.readPartC(aFile)
            return (a, b, c)

and::

    def readAndReturn(self, filename):
        withFile(filename):
            a = self.readPartA(aFile)
            b = self.readPartB(aFile)
            c = self.readPartC(aFile)
        return (a, b, c)

Try it with yield to further vex the puzzle.  ;)

Thanks for your time!
-Shane Holloway

From steven.bethard at gmail.com  Wed Apr 20 07:23:56 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed Apr 20 07:23:59 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<740c3aec0504191557505d6e9f@mail.gmail.com>
	<560ee46ad8e2a82faedba7349b98ab5a@yahoo.com>
Message-ID:

On 4/19/05, Alex Martelli wrote:
> Well, one obvious use might be, say:
>
>     @withfile('foo.bar', 'r'):
>         content = thefile.read()
>
> but that would require the decorator and block to be able to interact
> in some way, so that inside the block 'thefile' is defined suitably.
>
> > it be better to make a function of the block wrapped in a
> > block-decorator and then use a normal decorator?
>
> From a viewpoint of namespaces, I think it would be better to have the
> block execute in the same namespace as the code surrounding it, not a
> separate one (assigning to 'content' would not work otherwise), so a
> nested function would not be all that useful.  The problem might be,
> how does the _decorator_ affect that namespace.
> Perhaps:
>
>     def withfile(filename, mode='r'):
>         openfile = open(filename, mode)
>         try:
>             block(thefile=openfile)
>         finally:
>             openfile.close()
>
> i.e., let the block take keyword arguments to tweak its namespace (but
> assignments within the block should still affect its _surrounding_
> namespace, it seems to me...).

I'm not a big fan of this means of tweaking the block's namespace.  It
means that if you use a "block decorator", you might find that names
have been 'magically' added to your namespace.  This has a bad code
smell of too much implicitness to me...

I believe this was one of the reasons Brian Sabbey's proposal looked
something like:

    do <unpack-list> in <expression>:
        <block>

This way you could write the block above as something like:

    def withfile(filename, mode='r'):
        def _(block):
            openfile = open(filename, mode)
            try:
                block(openfile)
            finally:
                openfile.close()
        return _

    do thefile in withfile('foo.bar', 'r'):
        content = thefile.read()

where 'thefile' is explicitly named in the do/in-statement's unpack
list.  Personally, I found the 'do' and 'in' keywords very confusing,
but I do like the fact that the parameters passed to the thunk/block
are expanded in an explicit unpack list.  Using @, I don't see an easy
way to insert such an unpack list...

Of course, even with the unpack list, you still have to know what kind
of arguments the function calls your block with.  And because these
only appear within the code, e.g.

    block(openfile)

you can't rely on easily accessible things like the function's
signature.  It means that unlike other callables that can basically
document parameters and return type, "block decorators" would have to
document parameters, return type, and the parameters with which they
call the block...

STeVe

--
You can wordify anything if you just verb it.
    --- Bucky Katt, Get Fuzzy

From martin at v.loewis.de  Wed Apr 20 08:27:34 2005
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed Apr 20 08:27:37 2005
Subject: [Python-Dev] Proper place to put extra args for building
In-Reply-To: <4265918D.7040700@ocf.berkeley.edu>
References: <4265918D.7040700@ocf.berkeley.edu>
Message-ID: <4265F656.4020305@v.loewis.de>

Brett C. wrote:
> I am currently adding some code for a Py_COMPILER_DEBUG build for use
> on the AST branch.  I thought that OPT was the proper variable to put
> stuff like this into for building (``-DPy_COMPILER_DEBUG``), but that
> erases ``-g -Wall -Wstrict-prototypes``.  Obviously I could just tack
> all of that into my own thing, but that seems like an unneeded step.

Actually, this step is needed.

> From looking at Makefile.pre.in it seems like CFLAGSFORSHARED is meant
> for extra arguments to the compiler.  Is that right?

No.  This is the set of flags to be passed to the compiler when
compiling with --enable-shared.  It is set in configure.in.

It might be reasonable to add a variable that will just take additional
compiler flags, and never be modified in configure.

Regards,
Martin

From fredrik at pythonware.com  Wed Apr 20 08:47:51 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 20 08:48:16 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID:

Josiah Carlson wrote:

> See the previous two discussions on thunks here on python-dev, and
> notice how the only problems that seem bettered via blocks/thunks /in
> Python/ are those which are of the form...
>
>     #setup
>     try:
>         block
>     finally:
>         #finalization
>
> ... and depending on the syntax, properties.  I once asked "Any other
> use cases for one of the most powerful features of Ruby, in Python?"
> I have yet to hear any sort of reasonable response.
>
> Why am I getting no response to my question?  Either it is because I am
> being ignored, or no one has taken the time to translate one of these
> 'killer features' from Smalltalk or Ruby, or perhaps such translations
> show that there is a better way in Python already.

for my purposes, I've found that the #1 callback killer in contemporary
Python is for-in:s support for the iterator protocol: instead of

    def callback(x):
        code

    dosomething(callback)

or with the "high-level intent"-oriented syntax:

    dosomething(**):
        def libraryspecifiedargumentname(x):
            code

I simply write

    for x in dosomething():
        code

and get shorter code that runs faster.  (see cElementTree's iterparse
for an excellent example.  for typical use cases, it's nearly three
times faster than pyexpat, which is the fastest callback-based XML
parser we have)

unfortunately,

    def do():
        print "setup"
        try:
            yield None
        finally:
            print "tear down"

doesn't quite work (if it did, all you would need is syntactic sugar
for "for dummy in").

PS. a side effect of the for-in pattern is that I'm beginning to feel
that Python might need a nice "switch" statement based on dictionary
lookups, so I can replace multiple callbacks with a single loop body,
without writing too many if/elif clauses.

From fredrik at pythonware.com  Wed Apr 20 08:55:03 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 20 08:55:24 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <4265DB28.8050905@hathawaymix.org>
Message-ID:

Shane Hathaway wrote:

> Brian's suggestion makes the code read more like an outline.  In
> Brian's example, the high-level intent stands out from the details

that assumes that when you call a library function, the high-level
intent of *your* code is obvious from the function name in the library,
and to some extent, by the argument names chosen by the library
implementor.  I'm not so sure that's always a valid assumption.
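[On the "switch" aside in Fredrik's PS: the dictionary-lookup dispatch
he describes can already be spelled as a plain dict of callables.  A
minimal sketch — the handler names and event tuples are illustrative,
not taken from any proposal:]

```python
def handle_start(data):
    # illustrative handler for a 'start' event
    return "start:" + data

def handle_end(data):
    return "end:" + data

def handle_default(data):
    # fallback for event names with no registered handler
    return "unknown:" + data

# the dict plays the role of a switch statement: one lookup replaces
# an if/elif chain over event names
dispatch = {
    "start": handle_start,
    "end": handle_end,
}

def process(events):
    results = []
    for name, data in events:
        # dict.get with a default callable acts as the 'else' clause
        handler = dispatch.get(name, handle_default)
        results.append(handler(data))
    return results
```

The single loop body stays fixed; adding a case is just another dict
entry rather than another elif.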
> while in your example, there is no visual cue that distinguishes the
> details from the intent.

carefully chosen function names (that you chose yourself) plus blank
lines can help with that.

> Of course, lambdas are even better, when it's possible to use them:
>
>     doFoo((lambda a, b: a + b), (lambda c, d: c + d))

that only tells you that you're calling "doFoo", with no clues
whatsoever to what the code in the lambdas is doing.  keyword arguments
are a step up from that, as long as your intent matches the library
writer's intent.

From jjinux at gmail.com  Wed Apr 20 10:01:55 2005
From: jjinux at gmail.com (Shannon -jj Behrens)
Date: Wed Apr 20 10:01:58 2005
Subject: [Python-Dev] Re: anonymous blocks (off topic: match)
In-Reply-To:
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID:

> PS. a side effect of the for-in pattern is that I'm beginning to feel
> that Python might need a nice "switch" statement based on dictionary
> lookups, so I can replace multiple callbacks with a single loop body,
> without writing too many if/elif clauses.

That's funny.  I keep wondering if "match" from the ML world would make
sense in Python.  I keep thinking it'd be a really nice thing to have.

-jj

--
I have decided to switch to Gmail, but messages to my Yahoo account
will still get through.

From p.f.moore at gmail.com  Wed Apr 20 11:43:32 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed Apr 20 11:43:34 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org>
Message-ID: <79990c6b05042002435ce91e79@mail.gmail.com>

On 4/19/05, Brian Sabbey wrote:
> Guido van Rossum wrote:
> >> @acquire(myLock):
> >>     code
> >>     code
> >>     code
> >
> > It would certainly solve the problem of which keyword to use! :-) And
> > I think the syntax isn't even ambiguous -- the trailing colon
> > distinguishes this from the function decorator syntax.
> > I guess it would morph '@xxx' into "user-defined-keyword".

Hmm, this looks to me like a natural extension of decorators.  Whether
that is a good or a bad thing, I'm unable to decide :-)  [I can think
of a number of uses for it, PEP 310-style with-blocks being one, but I
can't decide if "lots of potential uses" is too close to "lots of
potential for abuse" :-)]

> > How would acquire be defined?  I guess it could be this, returning a
> > function that takes a callable as an argument just like other
> > decorators:
> >
> >     def acquire(aLock):
> >         def acquirer(block):
> >             aLock.acquire()
> >             try:
> >                 block()
> >             finally:
> >                 aLock.release()
> >         return acquirer

It really has to be this, IMO, otherwise the parallel with decorators
becomes confusing, rather than helpful.

> > and the substitution of
> >
> >     @EXPR:
> >         CODE
> >
> > would become something like
> >
> >     def __block():
> >         CODE
> >     EXPR(__block)

The question of whether assignments within CODE are executed within a
new namespace, as this implies, or in the surrounding namespace,
remains open.  I can see both as reasonable (new namespace = easier to
describe/understand, more in line with decorators, probably far easier
to implement; surrounding namespace = probably more useful/practical...)

> Why not have the block automatically be inserted into acquire's
> argument list?  It would probably get annoying to have to define inner
> functions like that every time one simply wants to use arguments.

If this syntax is to be considered, in my view it *must* follow
established decorator practice - and that includes the
define-an-inner-function-and-return-it idiom.

> Of course, augmenting the argument list in that way would be different
> than the behavior of decorators as they are now.

Exactly.

Paul.
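[The acquire() definition and the @EXPR: translation quoted above can
be exercised today by writing the __block function out by hand.  A
runnable sketch using threading.Lock — the explicit def stands in for
the anonymous block, and the list is just there to observe the lock
state:]

```python
import threading

def acquire(aLock):
    # returns a function that takes a callable, just like other decorators
    def acquirer(block):
        aLock.acquire()
        try:
            block()
        finally:
            aLock.release()
    return acquirer

myLock = threading.Lock()
observed = []

# manual expansion of the proposed "@acquire(myLock): CODE" form:
# define the block, then apply EXPR to it
def __block():
    # CODE runs while the lock is held
    observed.append(myLock.locked())

acquire(myLock)(__block)
# once the block finishes, the finally clause has released the lock
```

This is exactly the define-an-inner-function-and-return-it shape the
decorator parallel requires; the proposed syntax would only remove the
boilerplate def.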
From flaig at sanctacaris.net Wed Apr 20 13:07:36 2005 From: flaig at sanctacaris.net (flaig@sanctacaris.net) Date: Wed Apr 20 13:07:46 2005 Subject: [Python-Dev] Re: anonymous blocks Message-ID: <200504201107.j3KB7a0G016148@ger5.wwwserver.net> I guess I should begin by introducing myself: My name is Rüdiger Flaig, I live in Heidelberg/Germany (yes indeed, there are not only tourists there) and am a JOAT by profession (Jack Of All Trades). Among other weird things, I am currently teaching immunology and bioinformatics at the once-famous University of Heidelberg. Into this little secluded world of ours, so far dominated by rigid C++ stalwarts, I have successfully introduced Python! I have been lurking on this list for quite a while, interested to watch the further development of the streaked reptile. As students keep on asking me about the differences between languages and the pros and cons, I think I may claim some familiarity with other languages too, especially Python's self-declared antithesis, Ruby. The recent discussion about anonymous blocks immediately brought Ruby to my mind once more, since -- as you will know -- Ruby does have ABs, and rubynos are very proud of them, as they are generally of their more "flexible" program structure. However, I have seen lots of Ruby code and do not really feel that this contributes in any way to the expressiveness of the language. Lambdas are handy for very microscopic matters, but in general I think that one of Python's greatest strengths is the way in which its rather rigid layout combines with the overall approach to force coders to disentangle complex operations. So I cannot really see any benefit in ABs... Just the 0.02 of a serpent lover, but maybe someone's interested in hearing something like an outsider's opinion. Cheers, Rüdiger === Chevalier Dr. Dr. 
Ruediger Marcus Flaig
Institute for Immunology, University of Heidelberg
Im Neuenheimer Feld 305, D-69120 Heidelberg, FRG

"Drain you of your sanity,
 Face the Thing That Should Not Be."

--
This e-mail was sent with http://www.mail-inspector.de
Mail Inspector is a free service of http://www.is-fun.net
The sender of this e-mail had the IP: 129.206.124.135

From Michaels at rd.bbc.co.uk  Wed Apr 20 14:24:05 2005
From: Michaels at rd.bbc.co.uk (Michael Sparks)
Date: Wed Apr 20 14:30:34 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References:
Message-ID: <200504201324.05551.Michaels@rd.bbc.co.uk>

On Tuesday 19 Apr 2005 20:24, Guido van Rossum wrote:
..
> *If* we're going to create syntax for anonymous blocks, I think the
> primary use case ought to be cleanup operations to replace try/finally
> blocks for locking and similar things.  I'd love to have syntactical
> support so I can write
>
>     blahblah(myLock):
>         code
>         code
>         code

I've got a basic parser that I wrote last summer which was an
experiment in a generic "python-esque" parser.  It might be useful for
playing with these things since it accepted the above syntax without
change, among many others, happily.  (Added as a 38th test, and
probably the sixth "language syntax" it understands)  It's also
entirely keyword free, which strikes me as a novelty.

The abstract syntax tree that's generated for it is rather unwieldy and
over the top, but that's as far as the parser goes.
(As I said I was interested in a generic parser, not a language :)

The (entire) grammar resulting was essentially this: (and is LR-parsable)

    program -> block
    block -> BLOCKSTART statement_list BLOCKEND
    statement_list -> statement*
    statement -> (expression | expression ASSIGNMENT expression | ) EOL
    expression -> oldexpression (COMMA expression)*
    oldexpression -> (factor [factorlist] | factor INFIXOPERATOR expression )
    factorlist -> factor* factor
    factor -> ( bracketedexpression | constructorexpression | NUMBER
              | STRING | ID | factor DOT dotexpression | factor trailer
              | factor trailertoo )
    dotexpression -> (ID bracketedexpression | factor )
    bracketedexpression -> BRA [ expression ] KET
    constructorexpression -> BRA3 [ expression ] KET3
    trailer -> BRA2 expression KET2
    trailertoo -> COLON EOL block

The parser.out file for the curious is here:
   * http://www.cerenity.org/SWP/parser.out (31 productions)

The parser uses a slightly modified PLY based parser and might be
useful for playing around with constructs (Might not, but it's the
reason I'm mentioning it).

The approach taken is to treat ":" as always starting a code block to
be passed.  The first token on the line is treated as a function name.
The idea was that "def", "class", "if", etc then become simple function
calls that get various arguments which may include one or more code
blocks.

The parser was also written entirely test first (as an experiment to
see what that's like for writing a parser) and includes a variety of
sample programs that pass.
(39 different program tests)

I've put a tarball here:
   * http://www.cerenity.org/SWP-0.0.0.tar.gz
     (includes the modified version of PLY)
   * Also browsable here: http://www.cerenity.org/SWP/
   * Some fun examples:
     * Python-like: http://www.cerenity.org/SWP/progs/expr_29.p
       (this example is an earlier version of the parser)
     * LOGO like: http://www.cerenity.org/SWP/progs/expr_33.p
     * L-System definition: http://www.cerenity.org/SWP/progs/expr_34.p
     * SML-like: http://www.cerenity.org/SWP/progs/expr_35.p
     * Amiga E/Algol like: http://www.cerenity.org/SWP/progs/expr_37.p

Needs the modified version of PLY installed first, and the tests can be
run using "runtests.sh".

Provided in case people want to play around with something; I'm happy
with the language as it is. :-)

Best Regards,


Michael,
--
Michael Sparks, Senior R&D Engineer, Digital Media Group
Michael.Sparks@rd.bbc.co.uk, British Broadcasting Corporation,
Research and Development, Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

From foom at fuhm.net  Wed Apr 20 16:29:48 2005
From: foom at fuhm.net (James Y Knight)
Date: Wed Apr 20 16:30:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <79990c6b05042002435ce91e79@mail.gmail.com>
References: <1113941921.14525.39.camel@geddy.wooz.org>
	<79990c6b05042002435ce91e79@mail.gmail.com>
Message-ID: <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>

On Apr 20, 2005, at 5:43 AM, Paul Moore wrote:
>>> and the substitution of
>>>
>>>     @EXPR:
>>>         CODE
>>>
>>> would become something like
>>>
>>>     def __block():
>>>         CODE
>>>     EXPR(__block)
>
> The question of whether assignments within CODE are executed within a
> new namespace, as this implies, or in the surrounding namespace,
> remains open.  I can see both as reasonable (new namespace = easier to
> describe/understand, more in line with decorators, probably far easier
> to implement; surrounding namespace = probably more
> useful/practical...)
If it were possible to assign to a variable bound outside your
function, but still in your lexical scope, I think it would fix this
issue.  That's always something I've thought should be possible,
anyways.  I propose to make it possible via a declaration similar to
'global'.  E.g. (stupid example, but it demonstrates the syntax):

    def f():
        count = 0
        def addCount():
            lexical count
            count += 1
        assert count == 0
        addCount()
        assert count == 1

Then, there's two choices for the block decorator: either automatically
mark all variable names in the immediately surrounding scope "lexical",
or don't.  Both of those choices are still consistent with the block
just being a "normal function", which I think is an important attribute.

James

From aahz at pythoncraft.com  Wed Apr 20 17:15:07 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Apr 20 17:15:09 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4265E43C.4080707@ieee.org>
References: <4265E43C.4080707@ieee.org>
Message-ID: <20050420151506.GA1285@panix.com>

On Tue, Apr 19, 2005, Shane Holloway (IEEE) wrote:
>
> I heartily agree!  Especially when you have very similar try/finally
> code you use in many places, and wish to refactor it into a common
> area.  If this is done, you are forced into a callback form like
> follows::
>
>     def withFile(filename, callback):
>         aFile = open(filename, 'r')
>         try:
>             result = callback(aFile)
>         finally:
>             aFile.close()
>         return result
>
>     class Before:
>         def readIt(self, filename):
>             def doReading(aFile):
>                 self.readPartA(aFile)
>                 self.readPartB(aFile)
>                 self.readPartC(aFile)
>             withFile(filename, doReading)
>
> Which is certainly functional.  I actually use the idiom frequently.
> However, my opinion is that it does not read smoothly.  This form
> requires that I say what I'm doing with something before I know the
> context of what that something is.  For me, blocks are not about
> shortening the code, but rather clarifying *intent*.

Hmmmm....
How is this different from defining functions before they're called?
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From aahz at pythoncraft.com  Wed Apr 20 17:18:11 2005
From: aahz at pythoncraft.com (Aahz)
Date: Wed Apr 20 17:18:14 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
References: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
Message-ID: <20050420151811.GB1285@panix.com>

On Wed, Apr 20, 2005, flaig@sanctacaris.net wrote:
>
> As students keep on asking me about the differences between languages
> and the pros and cons, I think I may claim some familiarity with
> other languages too, especially Python's self-declared antithesis,
> Ruby.

That seems a little odd to me.  To the extent that Python has an
antithesis, it would be either C++ or Perl.  Ruby is antithetical to
some of Python's core ideology because it borrows from Perl, but Ruby
is much more similar to Python than Perl is.
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From amk at amk.ca  Wed Apr 20 17:53:14 2005
From: amk at amk.ca (A.M. Kuchling)
Date: Wed Apr 20 17:54:32 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050420151811.GB1285@panix.com>
References: <200504201107.j3KB7a0G016148@ger5.wwwserver.net>
	<20050420151811.GB1285@panix.com>
Message-ID: <20050420155314.GA18070@rogue.amk.ca>

On Wed, Apr 20, 2005 at 08:18:11AM -0700, Aahz wrote:
> antithesis, it would be either C++ or Perl.
> Ruby is antithetical to some
> of Python's core ideology because it borrows from Perl, but Ruby is
> much more similar to Python than Perl is.

I'm not that familiar with the Ruby community; might it be that they
consider Ruby to be Python's antithesis, in that it returns to
bracketing instead of Python's indentation?

--amk

From pedronis at strakt.com  Wed Apr 20 18:23:01 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Wed Apr 20 18:23:08 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To:
References: <740c3aec0504191557505d6e9f@mail.gmail.com>
	<877e9a170504191855445e0f4d@mail.gmail.com>
	<20050419212423.63AD.JCARLSON@uci.edu>
Message-ID: <426681E5.8050203@strakt.com>

> def do():
>     print "setup"
>     try:
>         yield None
>     finally:
>         print "tear down"
>
> doesn't quite work (if it did, all you would need is syntactic sugar
> for "for dummy in").

PEP 325 is about that

From shane.holloway at ieee.org  Wed Apr 20 18:32:17 2005
From: shane.holloway at ieee.org (Shane Holloway (IEEE))
Date: Wed Apr 20 18:32:54 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <20050420151506.GA1285@panix.com>
References: <4265E43C.4080707@ieee.org> <20050420151506.GA1285@panix.com>
Message-ID: <42668411.8090907@ieee.org>

Aahz wrote:
> On Tue, Apr 19, 2005, Shane Holloway (IEEE) wrote:
>> However, my opinion is that it does not read smoothly.  This form
>> requires that I say what I'm doing with something before I know the
>> context of what that something is.  For me, blocks are not about
>> shortening the code, but rather clarifying *intent*.
>
> Hmmmm....  How is this different from defining functions before
> they're called?

It's not.  In a function scope I'd prefer to read top-down.  When I
write classes, I tend to put the public methods at the top.  Utility
methods used by those entry points are placed toward the bottom.  In
this way, I read the context of what I'm doing first, and then the
details of the internal methods as I need to understand them.
Granted I could achieve this effect with::

    class Before:
        def readIt(self, filename):
            def readIt():
                withFile(filename, doReading)

            def doReading(aFile):
                self.readPartA(aFile)
                self.readPartB(aFile)
                self.readPartC(aFile)

            return readIt()

Which is fine with me, but the *intent* is more obfuscated than what
the block construct offers.  And I don't think my crew would appreciate
if I did this very often.  ;)

From jcarlson at uci.edu  Wed Apr 20 18:41:33 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 18:44:08 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426681E5.8050203@strakt.com>
References: <426681E5.8050203@strakt.com>
Message-ID: <20050420094054.63B3.JCARLSON@uci.edu>

Samuele Pedroni wrote:
> > def do():
> >     print "setup"
> >     try:
> >         yield None
> >     finally:
> >         print "tear down"
> >
> > doesn't quite work (if it did, all you would need is syntactic sugar
> > for "for dummy in").
>
> PEP 325 is about that

PEP 288 can be used like that.

- Josiah

From jcarlson at uci.edu  Wed Apr 20 19:19:20 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 20 19:22:10 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To:
References: <20050419212423.63AD.JCARLSON@uci.edu>
Message-ID: <20050420084329.63B0.JCARLSON@uci.edu>

"Fredrik Lundh" wrote:
>
> Josiah Carlson wrote:
>
> > See the previous two discussions on thunks here on python-dev, and
> > notice how the only problems that seem bettered via blocks/thunks
> > /in Python/ are those which are of the form...
> >
> >     #setup
> >     try:
> >         block
> >     finally:
> >         #finalization
> >
> > ... and depending on the syntax, properties.  I once asked "Any
> > other use cases for one of the most powerful features of Ruby, in
> > Python?"  I have yet to hear any sort of reasonable response.
> >
> > Why am I getting no response to my question?
> > Either it is because I am
> > being ignored, or no one has taken the time to translate one of
> > these 'killer features' from Smalltalk or Ruby, or perhaps such
> > translations show that there is a better way in Python already.
>
> for my purposes, I've found that the #1 callback killer in
> contemporary Python is for-in:s support for the iterator protocol:

...

> and get shorter code that runs faster.  (see cElementTree's iterparse
> for an excellent example.  for typical use cases, it's nearly three
> times faster than pyexpat, which is the fastest callback-based XML
> parser we have)

It seems as though you are saying that because callbacks are so slow,
that blocks are a non-starter for you because of how slow it would be
to call them.  I'm thinking that if people get correct code easier,
that speed will not be as much of a concern (that's why I use Python
already).

With that said, both blocks and iterators make /writing/ such things
easier to understand, but neither really makes /reading/ much easier.
Sure, it is far more terse, but that doesn't mean it is easier to read
and understand what is going on.  Which would people prefer?

    @a(l):
        code

or

    l.acquire()
    try:
        code
    finally:
        l.release()

> unfortunately,
>
>     def do():
>         print "setup"
>         try:
>             yield None
>         finally:
>             print "tear down"
>
> doesn't quite work (if it did, all you would need is syntactic sugar
> for "for dummy in").

The use of 'for dummy in ...' would be sufficient to notify everyone.
If 'dummy' is too long, there is always '_'.  This kind of thing solves
the common case of setup/finalization, albeit in a
not-so-obvious-to-an-observer mechanism, which was recently loathed by
a nontrivial number of python-dev posters (me being one).

Looking at it again, a month or so later, I don't know.  It does solve
the problem, but it introduces a semantic where iteration is used for
something that is not really iteration.
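[The quoted do() generator stopped being illegal with Python 2.5, which
permits yield inside try/finally and ships contextlib.contextmanager as
exactly this kind of sugar.  A sketch assuming a 2.5-or-later
interpreter, with the prints replaced by list appends so the ordering
is observable:]

```python
from contextlib import contextmanager

trace = []

@contextmanager
def do():
    trace.append("setup")
    try:
        yield None          # the block body runs at this point
    finally:
        trace.append("tear down")

# the with statement plays the role of the hypothetical
# "syntactic sugar for 'for dummy in'"
with do():
    trace.append("block")
```

The finally clause is guaranteed to run even if the block body raises,
which is the generator-finalization property under discussion.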
Regardless, I believe that solving generator finalization (calling all
enclosing finally blocks in the generator) is a worthwhile problem to
solve.  Whether that be by PEP 325, 288, 325+288, etc., that should be
discussed.  Whether people use it as a pseudo-block, or decide that
blocks are further worthwhile, I suppose we could wait and see.

> PS. a side effect of the for-in pattern is that I'm beginning to feel
> that Python might need a nice "switch" statement based on dictionary
> lookups, so I can replace multiple callbacks with a single loop body,
> without writing too many if/elif clauses.

If I remember correctly, Raymond was working on a peephole optimization
that automatically translated if/elif/else clauses to a dictionary
lookup when the objects were hashable and only the == operator was
used.  I've not heard anything about it in over a month, but then
again, I've not finished the implementation of an alternate import
semantic either.

 - Josiah

From tim.peters at gmail.com  Wed Apr 20 20:10:58 2005
From: tim.peters at gmail.com (Tim Peters)
Date: Wed Apr 20 20:11:03 2005
Subject: [Python-Dev] Newish test failures
Message-ID: <1f7befae0504201110188425c6@mail.gmail.com>

Seeing three seemingly related test failures today, on CVS HEAD:

test_csv
test test_csv failed -- errors occurred; run in verbose mode for details

test_descr
test test_descr crashed -- exceptions.AttributeError: attribute
'__dict__' of 'type' objects is not writable

test_file
test test_file crashed -- exceptions.AttributeError: attribute 'closed'
of 'file' objects is not writable

3 tests failed:
    test_csv test_descr test_file

Drilling into test_csv:

ERROR: test_reader_attrs (test.test_csv.Test_Csv)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Code\python\lib\test\test_csv.py", line 62, in test_reader_attrs
self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable ====================================================================== ERROR: test_writer_attrs (test.test_csv.Test_Csv) ---------------------------------------------------------------------- Traceback (most recent call last): File "C:\Code\python\lib\test\test_csv.py", line 65, in test_writer_attrs self._test_default_attrs(csv.writer, StringIO()) File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises callableObj(*args, **kwargs) AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable From fredrik at pythonware.com Wed Apr 20 20:32:27 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Apr 20 20:34:17 2005 Subject: [Python-Dev] Re: Newish test failures References: <1f7befae0504201110188425c6@mail.gmail.com> Message-ID: > File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs > self.assertRaises(TypeError, delattr, obj.dialect, 'quoting') > File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises > callableObj(*args, **kwargs) > AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable looks like someone didn't run the test suite... From: bwarsaw@users.sourceforge.net Subject: python/dist/src/Objects descrobject.c,2.38,2.39 ... As per discussion on python-dev, descriptors defined in C with a NULL setter now raise AttributeError instead of TypeError, for consistency with their pure-Python equivalent. ... 
From barry at python.org  Wed Apr 20 21:11:30 2005
From: barry at python.org (Barry Warsaw)
Date: Wed Apr 20 21:11:39 2005
Subject: [Python-Dev] Re: Newish test failures
In-Reply-To:
References: <1f7befae0504201110188425c6@mail.gmail.com>
Message-ID: <1114024290.10439.130.camel@geddy.wooz.org>

On Wed, 2005-04-20 at 14:32, Fredrik Lundh wrote:
> >   File "C:\Code\python\lib\test\test_csv.py", line 58, in _test_default_attrs
> >     self.assertRaises(TypeError, delattr, obj.dialect, 'quoting')
> >   File "C:\Code\python\lib\unittest.py", line 320, in failUnlessRaises
> >     callableObj(*args, **kwargs)
> > AttributeError: attribute 'quoting' of '_csv.Dialect' objects is not writable
>
> looks like someone didn't run the test suite...

My bad, I didn't check everything in.  Will do so as soon as SF cvs is
working for me again. :/

-Barry

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 307 bytes
Desc: This is a digitally signed message part
Url : http://mail.python.org/pipermail/python-dev/attachments/20050420/80a608f6/attachment.pgp

From gvanrossum at gmail.com  Wed Apr 20 21:55:47 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 20 21:55:58 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <42668411.8090907@ieee.org>
References: <4265E43C.4080707@ieee.org> <20050420151506.GA1285@panix.com>
	<42668411.8090907@ieee.org>
Message-ID:

[Shane Holloway]
> When I write classes, I tend to put the public methods at the top.
> Utility methods used by those entry points are placed toward the
> bottom.  In this way, I read the context of what I'm doing first, and
> then the details of the internal methods as I need to understand them.
> > Granted I could achieve this effect with:: > > class Before: > def readIt(self, filename): > def readIt(): > withFile(filename, doReading) > > def doReading(aFile): > self.readPartA(aFile) > self.readPartB(aFile) > self.readPartC(aFile) > > return readIt() > > Which is fine with me, but the *intent* is more obfuscated than what the > block construct offers. And I don't think my crew would appreciate if I > did this very often. ;) I typically solve that by making doReading() a method: class Before: def readit(self, filename): withFile(filename, self._doReading) def _doReading(self, aFile): self.readPartA(aFile) self.readPartB(aFile) self.readPartC(aFile) Perhaps not as Pure, but certainly Practical. :-) And you could even use __doReading to make it absolutely clear that doReading is a local artefact, if you care about such things. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Wed Apr 20 22:50:02 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 22:50:18 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4265F656.4020305@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> Message-ID: <4266C07A.9090503@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>I am currently adding some code for a Py_COMPILER_DEBUG build for use on the >>AST branch. I thought that OPT was the proper variable to put stuff like this >>into for building (``-DPy_COMPILER_DEBUG``), but that erases ``-g -Wall >>-Wstrict-prototypes``. Obviously I could just tack all of that into my own >>thing, but that seems like an unneeded step. > > > Actually, this step is needed. > Damn. OK. [SNIP] > It might be reasonable to add a variable that will just take additional > compiler flags, and never be modified in configure.
The other option is to not make configure.in skip injecting arguments when a pydebug build is done based on whether OPT is defined in the environment. So configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. The line for a non-debug build could stay as-is since if people are bothering to tweak those settings for a normal build they are going out of their way to tweak settings. Seems like special-casing this for pydebug builds makes sense since the default values will almost always be desired for a pydebug build. And those rare cases you don't want them you could just edit the generated Makefile by hand. Besides it just makes our lives easier and the special builds even more usual since it is one less thing to have to tweak. Sound reasonable? -Brett From martin at v.loewis.de Wed Apr 20 23:08:57 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed Apr 20 23:09:01 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C07A.9090503@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> Message-ID: <4266C4E9.5060709@v.loewis.de> Brett C. wrote: > The other option is to not make configure.in skip injecting arguments when a > pydebug build is done based on whether OPT is defined in the environment. So > configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. That's a procedural question: do we want to accept environment settings only when running configure, or do we also want to honor environment or make command line settings when make is invoked. IOW, it is ok if export OPT=-O6 ./configure make works. But what about ./configure export OPT=-O6 make or ./configure make OPT=-O6 All three can be only supported for environment variables that are never explicitly set in Makefile, be it explicitly in Makefile.pre.in, or implicitly through configure.
> The line for a non-debug build could stay as-is since if people are bothering > to tweak those settings for a normal build they are going out of their way to > tweak settings. Seems like special-casing this for pydebug builds makes sense > since the default values will almost always be desired for a pydebug build. > And those rare cases you don't want them you could just edit the generated > Makefile by hand. Besides it just makes our lives easier and the special > builds even more usual since it is one less thing to have to tweak. > > Sound reasonable? No. I thought you were talking about extra args, such as -fbrett-cannon. But now you seem to be talking about arguments that replace the ones that configure comes up with. Either of these might be reasonable, but they require different treatment. Replacing configure results is possible already From bac at OCF.Berkeley.EDU Wed Apr 20 23:21:21 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Wed Apr 20 23:21:37 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C4E9.5060709@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> Message-ID: <4266C7D1.700@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>The other option is to not make configure.in skip injecting arguments when a >>pydebug build is done based on whether OPT is defined in the environment. So >>configure.in:670 could change to ``OPT="$OPT -g -Wall -Wstrict-prototypes"``. > > > That's a procedural question: do we want to accept environment settings > only when running configure, or do we also want to honor environment or > make command line settings when make is invoked. IOW, it is ok if > > export OPT=-O6 > ./configure > make > > works.
But what about > > ./configure > export OPT=-O6 > make > > or > > ./configure > make OPT=-O6 > > All three can be only supported for environment variables that are never > explicitly set in Makefile, be it explicitly in Makefile.pre.in, or > implicitly through configure. > Hmm. OK, that is an interesting idea. Would make rebuilding a lot easier if it was just an environment variable that was part of the default OPT value; ``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". I say we go with that. What is a good name, though? PY_OPT? > >>The line for a non-debug build could stay as-is since if people are bothering >>to tweak those settings for a normal build they are going out of their way to >>tweak settings. Seems like special-casing this for pydebug builds makes sense >>since the default values will almost always be desired for a pydebug build. >>And those rare cases you don't want them you could just edit the generated >>Makefile by hand. Besides it just makes our lives easier and the special >>builds even more usual since it is one less thing to have to tweak. >> >>Sound reasonable? > > > No. I thought you were talking about extra args, such as -fbrett-cannon. I am, specifically ``-DPy_COMPILER_DEBUG`` to be tacked on as a flag to gcc. > But now you seem to be talking about arguments that replace the ones > that configure comes up with. Either of these might be reasonable, but > they require different treatment. Replacing configure results is > possible already I am only talking about that because that is how OPT is currently structured; configure.in replaces the defaults with what the user provides if the environment variable is set. This is what I don't want. From mal at egenix.com Wed Apr 20 23:40:25 2005 From: mal at egenix.com (M.-A.
Lemburg) Date: Wed Apr 20 23:40:28 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <740c3aec0504191557505d6e9f@mail.gmail.com><877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> Message-ID: <4266CC49.9080901@egenix.com> Fredrik Lundh wrote: > PS. a side effect of the for-in pattern is that I'm beginning to feel > that Python > might need a nice "switch" statement based on dictionary lookups, so I can > replace multiple callbacks with a single loop body, without writing too > many > if/elif clauses. PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) My use case for switch is that of a parser switching on tokens. mxTextTools applications would greatly benefit from being able to branch on tokens quickly. Currently, there's only callbacks, dict-to-method branching or long if-elif-elif-...-elif-else. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 20 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: From mfb at lotusland.dyndns.org Wed Apr 20 23:59:34 2005 From: mfb at lotusland.dyndns.org (Matthew F. Barnes) Date: Wed Apr 20 23:59:42 2005 Subject: [Python-Dev] Reference counting when entering and exiting scopes Message-ID: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> Someone on python-help suggested that I forward this question to python-dev. I've been studying Python's core compiler and bytecode interpreter as a model for my own interpreted language, and I've come across what appears to be a reference counting problem in the `symtable_exit_scope' function in . At this point I assume that I'm just misunderstanding what's going on. 
So I was hoping to contact one of the core developers before I go filing what could very well be a spurious bug report against Python's core. Here's the function copied from CVS HEAD: static int symtable_exit_scope(struct symtable *st) { int end; if (st->st_pass == 1) symtable_update_free_vars(st); Py_DECREF(st->st_cur); end = PyList_GET_SIZE(st->st_stack) - 1; st->st_cur = (PySymtableEntryObject *)PyList_GET_ITEM(st->st_stack, end); if (PySequence_DelItem(st->st_stack, end) < 0) return -1; return 0; } My issue is with the use of PyList_GET_ITEM to fetch a new value for the current scope. As I understand it, PyList_GET_ITEM does not increment the reference count for the returned value. So in effect we're borrowing the reference to the symtable entry object from the tail of the scope stack. But then we turn around and delete the object from the tail of the scope stack, which DOES decrement the reference count. So `symtable_exit_scope' has a net effect of decrementing the reference count of the new current symtable entry object, when it seems to me like it should stay the same. Shouldn't the reference count be incremented when we assign to "st->st_cur" (either explicitly or by fetching the object using the PySequence API instead of PyList)? Can someone explain the rationale here? --------------------------------------------------------------------------- As an addendum to the previous question, further study of the code has made me believe that there's a reference counting problem in the `symtable_enter_scope' function as well (pasted below from CVS HEAD). Namely, that `prev' should be Py_XDECREF'd at some point in the function (at the end of the first IF block, perhaps?). 
static void symtable_enter_scope(struct symtable *st, char *name, int type, int lineno) { PySymtableEntryObject *prev = NULL; if (st->st_cur) { prev = st->st_cur; if (PyList_Append(st->st_stack, (PyObject *)st->st_cur) < 0) { st->st_errors++; return; } } st->st_cur = (PySymtableEntryObject *) PySymtableEntry_New(st, name, type, lineno); if (st->st_cur == NULL) { st->st_errors++; return; } if (strcmp(name, TOP) == 0) st->st_global = st->st_cur->ste_symbols; if (prev && st->st_pass == 1) { if (PyList_Append(prev->ste_children, (PyObject *)st->st_cur) < 0) st->st_errors++; } } Thanks, Matthew Barnes From jjinux at gmail.com Thu Apr 21 02:59:04 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Thu Apr 21 02:59:07 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <4266CC49.9080901@egenix.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> Message-ID: On 4/20/05, M.-A. Lemburg wrote: > Fredrik Lundh wrote: > > PS. a side effect of the for-in pattern is that I'm beginning to feel > > that Python > > might need a nice "switch" statement based on dictionary lookups, so I can > > replace multiple callbacks with a single loop body, without writing too > > many > > if/elif clauses. > > PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) > > My use case for switch is that of a parser switching on tokens. > > mxTextTools applications would greatly benefit from being able > to branch on tokens quickly. Currently, there's only callbacks, > dict-to-method branching or long if-elif-elif-...-elif-else. I think "match" from Ocaml would be a much nicer addition to Python than "switch" from C. -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From bac at OCF.Berkeley.EDU Thu Apr 21 03:59:34 2005 From: bac at OCF.Berkeley.EDU (Brett C.) 
Date: Thu Apr 21 03:59:45 2005 Subject: [Python-Dev] Reference counting when entering and exiting scopes In-Reply-To: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> Message-ID: <42670906.2000308@ocf.berkeley.edu> Matthew F. Barnes wrote: > Someone on python-help suggested that I forward this question to > python-dev. > > I've been studying Python's core compiler and bytecode interpreter as a > model for my own interpreted language, Might want to take a peek at the AST branch in CVS; that is what the compiler is going to change to as soon as it is complete. > and I've come across what appears > to be a reference counting problem in the `symtable_exit_scope' function > in . > > At this point I assume that I'm just misunderstanding what's going on. So > I was hoping to contact one of the core developers before I go filing what > could very well be a spurious bug report against Python's core. > Spurious bug reports are fine. If they turn out to be spurious, they get closed as such. Either way, time is spent checking the report whether it goes there or here. But at least with a bug report it can be tracked more easily. So for future reference, just go ahead and file the bug report. > Here's the function copied from CVS HEAD: > > static int > symtable_exit_scope(struct symtable *st) > { > int end; > > if (st->st_pass == 1) > symtable_update_free_vars(st); > Py_DECREF(st->st_cur); > end = PyList_GET_SIZE(st->st_stack) - 1; > st->st_cur = (PySymtableEntryObject *)PyList_GET_ITEM(st->st_stack, > end); > if (PySequence_DelItem(st->st_stack, end) < 0) > return -1; > return 0; > } > > My issue is with the use of PyList_GET_ITEM to fetch a new value for the > current scope. As I understand it, PyList_GET_ITEM does not increment the > reference count for the returned value. So in effect we're borrowing the > reference to the symtable entry object from the tail of the scope stack.
> But then we turn around and delete the object from the tail of the scope > stack, which DOES decrement the reference count. > > So `symtable_exit_scope' has a net effect of decrementing the reference > count of the new current symtable entry object, when it seems to me like > it should stay the same. Shouldn't the reference count be incremented > when we assign to "st->st_cur" (either explicitly or by fetching the > object using the PySequence API instead of PyList)? > > Can someone explain the rationale here? > If you look at how symtable_enter_scope() and symtable_exit_scope() work together you will notice there is actually no leak. symtable_enter_scope() appends the existing PySymtableEntryObject on to the symtable stack and then places a new PySymtableEntryObject into st->st_cur. Both at this point have a refcount of one; enough to stay alive. Now look at symtable_exit_scope(). When the current PySymtableEntryObject is no longer needed, it is DECREF'ed, putting it at 0 and thus leading to eventual collection. What is on top of the symtable stack, which has a refcount of 1 still, is then put into st->st_cur. So no leak. Yes, there should be more explicit refcounting to be proper, but the compiler cheats in a couple of places for various reasons. But basically everything is fine since st->st_cur and st->st_stack are only played with refcount-wise by either symtable_enter_scope() or symtable_exit_scope() and they are always called in pairs in the end.
-Brett From greg.ewing at canterbury.ac.nz Thu Apr 21 04:58:07 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 04:58:25 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> Message-ID: <426716BF.6090803@canterbury.ac.nz> Alex Martelli wrote: > > def withfile(filename, mode='r'): > openfile = open(filename, mode) > try: > block(thefile=openfile) > finally: > openfile.close() > > i.e., let the block take keyword arguments to tweak its namespace I don't think I like that idea, because it means that from the point of view of the user of withfile, the name 'thefile' magically appears in the namespace without it being obvious where it comes from. > (but > assignments within the block should still affect its _surrounding_ > namespace, it seems to me...). I agree with that much. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 05:11:38 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 05:11:55 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <20050419212423.63AD.JCARLSON@uci.edu> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> Message-ID: <426719EA.7050605@canterbury.ac.nz> Josiah Carlson wrote: > I once asked "Any other > use cases for one of the most powerful features of Ruby, in Python?" I > have yet to hear any sort of reasonable response. > > Why am I getting no response to my question? 
Either it is because I am > being ignored, or no one has taken the time to translate one of these > 'killer features' from Smalltalk or Ruby, or perhaps such translations > show that there is a better way in Python already. My feeling is that it's the latter. I don't know about Ruby, but in Smalltalk, block-passing is used so heavily because it's the main way of implementing control structures there. While-loops, for-loops, even if-then-else, are not built into the language, but are implemented by methods that take block parameters. In Python, most of these are taken care of by built-in statements, or various uses of iterators and generators. There isn't all that much left that people want to do on a regular basis. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 05:55:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 05:56:01 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> Message-ID: <42672441.7080909@canterbury.ac.nz> Steven Bethard wrote: > Of course, even with the unpack list, you still have to know what kind > of arguments the function calls your block with. And because these > only appear within the code, e.g. > block(openfile) > you can't rely on easily accessible things like the function's > signature. You can't rely on a function's signature alone to tell you much in any case. A distressingly large number of functions found in third-party extension modules have a help() string that just says something like fooble(arg,...) There's really no substitute for a good docstring! 
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Thu Apr 21 06:01:35 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 21 06:01:52 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <4265E43C.4080707@ieee.org> References: <4265E43C.4080707@ieee.org> Message-ID: <4267259F.6050902@canterbury.ac.nz> Shane Holloway (IEEE) wrote: > class After: > def readIt(self, filename): > withFile(filename): > self.readPartA(aFile) > self.readPartB(aFile) > self.readPartC(aFile) > > In my opinion, this is much smoother to read. This particular example > brings up the question of how arguments like "aFile" get passed and > named into the block. I anticipate the need for a place to put an > argument declaration list. ;) My current thought is that it should look like this: with_file(filename) as f: do_something_with(f) The success of this hinges on how many use cases can be arranged so that the word 'as' makes sense in that position. What we need is a corpus of use cases so we can try out different phrasings on them and see what looks the best for the most cases. I also have a thought concerning whether the block argument to the function should come first or last or whatever. My solution is that the function should take exactly *one* argument, which is the block. Any other arguments are dealt with by currying. In other words, with_file above would be defined as def with_file(filename): def func(block): f = open(filename) try: block(f) finally: f.close() return func This would also make implementation much easier. 
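[Summary editor's note: the curried with_file above can be tried today by passing an ordinary function where the proposal would pass a block. A small self-contained sketch -- the temporary-file setup and the read_block helper are invented for the example:]

```python
import os
import tempfile

# The curried form from Greg's message: with_file(filename) returns a
# function that expects the block as its single argument.
def with_file(filename):
    def func(block):
        f = open(filename)
        try:
            block(f)
        finally:
            f.close()
    return func

# Create a throwaway file to read back.
fd, name = tempfile.mkstemp()
os.write(fd, b"part A")
os.close(fd)

collected = []
def read_block(f):           # stands in for the anonymous block
    collected.append(f.read())

with_file(name)(read_block)  # the proposed syntax would hide this call
os.remove(name)
print(collected[0])          # -> part A
```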
The parser isn't going to know that it's dealing with anything other than a normal expression statement until it gets to the 'as' or ':', by which time going back and radically re-interpreting a previous function call could be awkward. This way, the syntax is just expr ['as' assignment_target] ':' suite and the expr is evaluated quite normally. > Another set of question arose for me when Barry started musing over the > combination of blocks and decorators. What are blocks? Well, obviously > they are callable. What do they return? The local namespace they > created/modified? I think the return value of a block should be None. In constructs like with_file, the block is being used for its side effect, not to compute a value for consumption by the block function. I don't see a great need for blocks to be able to return values. > How do blocks work with control flow statements like > "break", "continue", "yield", and "return"? Perhaps > "break" and "continue" raise exceptions similar to StopIteration in this > case? Something like that, yes. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From martin at v.loewis.de Thu Apr 21 06:11:47 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu Apr 21 06:11:52 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4266C7D1.700@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> Message-ID: <42672803.3080208@v.loewis.de> Brett C. wrote: > Hmm. OK, that is an interesting idea. 
Would make rebuilding a lot easier if > it was just an environment variable that was part of the default OPT value; > ``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". > > I say we go with that. What is a good name, though? PY_OPT? I think EXTRA_CFLAGS is common, and it would not specifically be part of OPT, but rather of CFLAGS. > I am only talking about that because that is how OPT is currently structured; > configure.in replaces the defaults with what the user provides if the > environment variable is set. This is what I don't want. The question is whether the user is supposed to provide a value for OPT in the first place. "OPT" is a set of flags that (IMO) should control the optimization level of the compiler, which, in the wider sense, also includes the question whether debug information should be generated. It should be possible to link object files compiled with different OPT settings, so flags that will give binary-incompatible object files should not be in OPT. It might be desirable to allow the user to override OPT, e.g. to specify that the compiler should not use -O3 but, say, -O1. I don't think there is much point in allowing OPT to be extended. But then, it is already possible to override OPT (when invoking make), which might be enough control. Regards, Martin From steven.bethard at gmail.com Thu Apr 21 07:13:02 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 07:13:05 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <42672441.7080909@canterbury.ac.nz> References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Steven Bethard wrote: > > Of course, even with the unpack list, you still have to know what kind > > of arguments the function calls your block with. And because these > > only appear within the code, e.g.
> > block(openfile) > > you can't rely on easily accessible things like the function's > > signature. > > You can't rely on a function's signature alone to tell > you much in any case. A distressingly large number of > functions found in third-party extension modules have > a help() string that just says something like > > fooble(arg,...) > > There's really no substitute for a good docstring! True enough. But the point still stands. Currently, if we describe a function's input (parameters) and output (return value), we can basically fully document the function (given a thorough enough description of course).[1] Functions that accept thunks/blocks require documentation for an additional piece of information that is not part of the input or output of the function: the parameters with which the thunk/block is called. So while: fooble(arg) is pretty nasty, documentation that tells me that 'arg' is a string is probably enough to set me on the right track. But if the documentation tells me that arg is a thunk/block, that's almost certainly not enough to get me going. I also need to know how that thunk/block will be called. True, if arg is not a thunk/block, but another type of callable, I may still need to know how it will be called. But I think with non thunks/blocks, there are a lot of cases where this is not necessary. Consider the variety of decorator recipes.[2] Most don't document what parameters the wrapped function will be called with because they simply pass all arguments on through with *args and **kwargs. Thus the wrapped function will take the same parameters as the original function did. Or if they're different, they're often a relatively simple modification of the original function's parameters, ala classmethod or staticmethod. But thunks/blocks don't work this way. They're not wrapping a function that already takes arguments. They're wrapping a code block that doesn't. 
So they certainly can't omit the parameter description entirely, and they can't even describe it in terms of a modification to an already existing set of parameters. Because the parameters passed from a thunk/block-accepting function to a thunk are generated by the function itself, all the parameter documentation must be contained within the thunk/block-accepting function. It's not like it's the end of the world of course. ;-) I can certainly learn to document my thunks/blocks thoroughly. I just think it's worth noting that there *would* be a learning process because there are additional pieces of information I'm not used to having to document. STeVe [1] I'm ignoring the issue of functions that modify parameters or globals, but this would also be required for thunks/blocks, so I don't think it detracts from the argument. [2] Probably worth noting that a very large portion of the functions I've written that accepted other functions as parameters were decorators. I lean towards a fairly OO style of programming, so I don't pass around a lot of callbacks. Presumably someone who relies heavily on callbacks would be much more used to documenting the parameters with which a function is called. Still, I think there is probably a large enough group that has similar style to mine that my argument is still valid. -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From bac at OCF.Berkeley.EDU Thu Apr 21 08:07:52 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 21 08:08:04 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42672803.3080208@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> Message-ID: <42674338.80009@ocf.berkeley.edu> Martin v. Löwis wrote: > Brett C. wrote: > >>Hmm. OK, that is an interesting idea.
Would make rebuilding a lot easier if >>it was just an environment variable that was part of the default OPT value; >>``OPT="$BUILDFLAGS -g -Wall -Wstrict-prototypes". >> >>I say we go with that. What is a good name, though? PY_OPT? > > > I think EXTRA_CFLAGS is common, and it would not specifically be part of > OPT, but rather of CFLAGS. > Works for me. If no one objects I will check in the change for CFLAGS to make it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting it enough to make sure that it isn't evaluated by configure but left as a string to be evaluated by the shell when the Makefile is running?). > >>I am only talking about that because that is how OPT is currently structured; >>configure.in replaces the defaults with what the user provides if the >>environment variable is set. This is what I don't want. > > > The question is whether the user is supposed to provide a value for OPT > in the first place. "OPT" is a set of flags that (IMO) should control > the optimization level of the compiler, which, in the wider sense, also > includes the question whether debug information should be generated. > It should be possible to link object files compiled with different > OPT settings, so flags that will give binary-incompatible object files > should not be in OPT. > OK, that makes sense to me. > It might be desirable to allow the user to override OPT, e.g. to specify > that the compiler should not use -O3 but, say, -O1. I don't think there > is much point in allowing OPT to be extended. But then, it is already > possible to override OPT (when invoking make), which might be enough > control. > Probably. I think as long as we state somewhere that EXTRA_CFLAGS is the place to put binary-altering flags and to leave OPT for only binary-compatible flags then that should be enough of a separation that most people probably won't touch OPT most of the time since the defaults are good, but can if they want.
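[Summary editor's note: in shell terms, the split being agreed on here looks roughly like the following -- the flag values are illustrative only, not the actual Makefile contents:]

```shell
# OPT keeps the binary-compatible defaults; EXTRA_CFLAGS carries
# binary-altering extras such as -DPy_COMPILER_DEBUG; CFLAGS combines them.
BASECFLAGS="-fno-strict-aliasing"
OPT="-g -Wall -Wstrict-prototypes"
EXTRA_CFLAGS="-DPy_COMPILER_DEBUG"
CFLAGS="$BASECFLAGS $OPT $EXTRA_CFLAGS"
echo "$CFLAGS"
```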
I assume this info should all be spelled out in the README and Misc/Specialbuilds.txt. Anywhere else? -Brett From martin at v.loewis.de Thu Apr 21 08:14:55 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu Apr 21 08:14:59 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42674338.80009@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> Message-ID: <426744DF.2030309@v.loewis.de> Brett C. wrote: > Works for me. If no one objects I will check in the change for CFLAGS to make > it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting it enough to make > sure that it isn't evaluated by configure but left as a string to be evaluated > by the shell when the Makefile is running?). If you put it into Makefile.pre.in, the only way to avoid having configure evaluate it is not to use @FOO@. OTOH, putting a $ in front of it is not good enough for make: $EXTRA_CFLAGS evaluates the variable E, and then appends XTRA_CFLAGS. Regards, Martin From Ben.Young at risk.sungard.com Thu Apr 21 10:15:10 2005 From: Ben.Young at risk.sungard.com (Ben.Young@risk.sungard.com) Date: Thu Apr 21 10:08:48 2005 Subject: Fw: [Python-Dev] anonymous blocks Message-ID: Reply to Michael Sparks ... > That's very bizarre! I've done almost exactly the same thing, though in my case I was playing around with a python-like language. In my language, code uses "ast literals" to allow things like class :foo(bar): suite to be handled programmatically within the language. This can be used to implement anonymous blocks. If anyone is interested I've attached the files. Use them for whatever purpose you want: Run frameparser for an example. Sorry about the code quality. I hadn't intended to release it to anyone yet!
Cheers,
Ben

-------------- next part --------------
A non-text attachment was scrubbed...
Name: frame.zip
Type: application/zip
Size: 6856 bytes
Desc: not available
Url : http://mail.python.org/pipermail/python-dev/attachments/20050421/f232a821/frame.zip

From p.f.moore at gmail.com Thu Apr 21 10:38:52 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu Apr 21 10:38:54 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426681E5.8050203@strakt.com>
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <426681E5.8050203@strakt.com>
Message-ID: <79990c6b050421013879300013@mail.gmail.com>

On 4/20/05, Samuele Pedroni wrote:
> >
> > def do():
> >     print "setup"
> >     try:
> >         yield None
> >     finally:
> >         print "tear down"
> >
> > doesn't quite work (if it did, all you would need is syntactic sugar
> > for "for dummy in").
>
> PEP 325 is about that

And, of course, PEP 310 is all about encapsulating before/after
(acquire/release) actions.

Paul.

From mwh at python.net Thu Apr 21 11:30:58 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 11:31:00 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: (Shannon's message of "Wed, 20 Apr 2005 17:59:04 -0700")
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com>
Message-ID: <2m64yg79yl.fsf@starship.python.net>

Shannon -jj Behrens writes:

> On 4/20/05, M.-A. Lemburg wrote:
>
>> My use case for switch is that of a parser switching on tokens.
>>
>> mxTextTools applications would greatly benefit from being able
>> to branch on tokens quickly. Currently, there's only callbacks,
>> dict-to-method branching or long if-elif-elif-...-elif-else.
>
> I think "match" from OCaml would be a much nicer addition to Python
> than "switch" from C.

Can you post a quick summary of how you think this would work?
Cheers,
mwh

--
We did requirements and task analysis, iterative design, and user
testing. You'd almost think programming languages were an interface
between people and computers. -- Steven Pemberton
(one of the designers of Python's direct ancestor ABC)

From fredrik at pythonware.com Thu Apr 21 12:28:21 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 12:28:49 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <426569CD.1010701@divmod.com>
Message-ID: 

Glyph Lefkowitz wrote:

> Despite being guilty of propagating this style for years myself, I have to
> disagree. Consider the following network-conversation using Twisted style
> (which, I might add, would be generalizable to other Twisted-like systems
> if they existed ;-)):
>
> def strawman(self):
>     def sayGoodbye(mingleResult):
>         def goAway(goodbyeResult):
>             self.loseConnection()
>         self.send("goodbye").addCallback(goAway)
>     def mingle(helloResult):
>         self.send("nice weather we're having").addCallback(sayGoodbye)
>     self.send("hello").addCallback(mingle)

def iterman(self):
    yield "hello"
    yield "nice weather we're having"
    yield "goodbye"

From andrew at indranet.co.nz Thu Apr 21 12:36:56 2005
From: andrew at indranet.co.nz (Andrew McGregor)
Date: Thu Apr 21 12:38:24 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <2m64yg79yl.fsf@starship.python.net>
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net>
Message-ID: 

I can post an alternative, inspired by this bit of Haskell (I've deliberately
left out the Haskell type annotation for this):

zoneOpts argv =
    case getOpt Permute options argv of
        (o,n,[]) -> return (o,n)
        (_,_,errs) -> error errs

which could, in a future Python, look something like:

def zoneOpts(argv):
    case i of getopt(argv, options, longoptions):
        i[2]: raise OptionError(i[2])
        True: return i[:2]

The intent is that within
the case, the bit before each : is a boolean expression, they're evaluated in
order, and the following block is executed for the first one that evaluates to
be True. I know we have exceptions for this specific example, but it's just an
example. I'm also assuming for the time being that getopt returns a 3-tuple
(options, arguments, errors) like the Haskell version does, just for the sake
of argument, and there's an OptionError constructor that will do something
with that error list.

Yes, that is very different semantics from a Haskell case expression, but it
kind of looks like a related idea. A more closely related idea would be to
borrow the Haskell patterns:

def zoneOpts(argv):
    case getopt(argv, options, longoptions):
        (o,n,[]): return o,n
        (_,_,errs): raise OptionError(errs)

where _ matches anything, a presently unbound name is bound for the following
block by mentioning it, a bound name would match whatever value it referred
to, and a literal matches only itself. The first matching block gets executed.

Come to think of it, it should be possible to do both. Not knowing OCaml, I'd
have to presume that 'match' is somewhat similar.

Andrew

On 21/04/2005, at 9:30 PM, Michael Hudson wrote:

> Shannon -jj Behrens writes:
>
>> On 4/20/05, M.-A. Lemburg wrote:
>>
>>> My use case for switch is that of a parser switching on tokens.
>>>
>>> mxTextTools applications would greatly benefit from being able
>>> to branch on tokens quickly. Currently, there's only callbacks,
>>> dict-to-method branching or long if-elif-elif-...-elif-else.
>>
>> I think "match" from OCaml would be a much nicer addition to Python
>> than "switch" from C.
>
> Can you post a quick summary of how you think this would work?
>
> Cheers,
> mwh
>
> --
> We did requirements and task analysis, iterative design, and user
> testing. You'd almost think programming languages were an interface
> between people and computers.
-- Steven Pemberton > (one of the designers of Python's direct ancestor ABC) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/ > andrew%40indranet.co.nz > > From fredrik at pythonware.com Thu Apr 21 12:42:00 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Thu Apr 21 12:42:35 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <20050419212423.63AD.JCARLSON@uci.edu> <20050420084329.63B0.JCARLSON@uci.edu> Message-ID: Josiah Carlson wrote: > > for my purposes, I've found that the #1 callback killer in contemporary Python > > is for-in:s support for the iterator protocol: > ... > > and get shorter code that runs faster. (see cElementTree's iterparse for > > an excellent example. for typical use cases, it's nearly three times faster > > than pyexpat, which is the fastest callback-based XML parser we have) > > It seems as though you are saying that because callbacks are so slow, > that blocks are a non-starter for you because of how slow it would be to > call them. Not really -- I see the for-in loop body as the block. The increased speed is just a bonus. > I'm thinking that if people get correct code easier, that speed will not be as much > of a concern (that's why I use Python already). (Slightly OT, but speed is always a concern. I no longer buy the "it's python, it has to be slow" line of reasoning; when done correctly, Python code is often faster than anything else. cElementTree is one such example; people have reported that cElementTree plus Python code can be a lot faster than dedicated XPath/XSLT engines; the Python bytecode engine is extremely fast, also compared to domain-specific interpreters... And in this case, you get improved usability *and* improved speed at the same time. That's the way it should be.) 
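The callback-vs-iterator contrast described above can be sketched with a toy tokenizer. All names here are invented for illustration; this is not the pyexpat or cElementTree API, just the shape of the two styles:

```python
# Toy contrast between a callback-driven API and an iterator-driven one.

def parse_with_callback(text, on_token):
    # Callback style: the library drives the loop and calls back
    # into user code for every token.
    for token in text.split():
        on_token(token)

def parse_iter(text):
    # Iterator style: the user drives the loop with for-in, and the
    # "callback body" is simply inlined in the loop.
    for token in text.split():
        yield token

# Callback style pushes the handler into a separate function:
seen = []
parse_with_callback("a b c", seen.append)

# Iterator style inlines the handler in the loop body:
inlined = []
for token in parse_iter("a b c"):
    inlined.append(token.upper())
```

The second form is the "loop over the callback source" pattern: the handler code sits right where it runs, with no separate configuration step.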
> With that said, both blocks and iterators make /writing/ such things
> easier to understand, but neither really makes /reading/ much easier.
> Sure, it is far more terse, but that doesn't mean it is easier to read
> and understand what is going on.

Well, I was talking about reading here: with the for-in pattern, you loop over
the "callback source", and the "callback" itself is inlined. You don't have to
think in "here is the callback, here I configure the callback source" terms;
just make a function call and loop over the result.

> Regardless, I believe that solving generator finalization (calling all
> enclosing finally blocks in the generator) is a worthwhile problem to
> solve. Whether that be by PEP 325, 288, 325+288, etc., that should be
> discussed. Whether people use it as a pseudo-block, or decide that
> blocks are further worthwhile, I suppose we could wait and see.

Agreed.

From bob at redivi.com Thu Apr 21 13:04:45 2005
From: bob at redivi.com (Bob Ippolito)
Date: Thu Apr 21 13:04:59 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: <426569CD.1010701@divmod.com>
Message-ID: <951559a5329cda690f153ee8894e0636@redivi.com>

On Apr 21, 2005, at 6:28 AM, Fredrik Lundh wrote:

> Glyph Lefkowitz wrote:
>
>> Despite being guilty of propagating this style for years myself, I
>> have to disagree. Consider the following network-conversation using
>> Twisted style (which, I might add, would be generalizable to other
>> Twisted-like systems if they existed ;-)):
>>
>> def strawman(self):
>>     def sayGoodbye(mingleResult):
>>         def goAway(goodbyeResult):
>>             self.loseConnection()
>>         self.send("goodbye").addCallback(goAway)
>>     def mingle(helloResult):
>>         self.send("nice weather we're having").addCallback(sayGoodbye)
>>     self.send("hello").addCallback(mingle)
>
> def iterman(self):
>     yield "hello"
>     yield "nice weather we're having"
>     yield "goodbye"

Which, more or less, works for a literal translation of the straw-man above.
However, you're missing the point. These deferred operations actually return
results. Generators offer no sane way to pass results back in. If they did,
then this use case could be mostly served by generators.

-bob

From steve at holdenweb.com Thu Apr 21 13:11:56 2005
From: steve at holdenweb.com (Steve Holden)
Date: Thu Apr 21 13:13:02 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: <8f2cb89c88defe7f2c51e0d9bd702ef7@xs4all.nl>
Message-ID: <42678A7C.9020106@holdenweb.com>

Guido van Rossum wrote:
>>> IMO this is clearer, and even shorter!
>>
>> But it clutters the namespace with objects you don't need.
>
> Why do people care about cluttering namespaces so much? I thought
> that's what namespaces were for -- to put stuff you want to remember
> for a bit. A function's local namespace in particular seems a
> perfectly fine place for temporaries.

Indeed. The way people bang on about "cluttering namespaces" you'd be
forgiven for thinking that they are like attics, permanently attached to the
house and liable to become cluttered over years. Most function namespaces are
in fact extremely short-lived, and there is little point worrying about
clutter as long as there's no chance of confusion.

regards
Steve

--
Steve Holden         +1 703 861 4237  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/
Python Web Programming     http://pydish.holdenweb.com/

From mcherm at mcherm.com Thu Apr 21 14:03:45 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu Apr 21 14:03:47 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050421050345.hmlz8a46bbscw844@mcherm.com>

Andrew McGregor writes:
> I can post an alternative, inspired by this bit of Haskell [...]
> The intent is that within the case, the bit before each : is a boolean
> expression, they're evaluated in order, and the following block is
> executed for the first one that evaluates to be True.
If we're going to be evaluating a series of booleans, then the One Proper
Format in Python is:

    if <condition-1>:
        <suite-1>
    elif <condition-2>:
        <suite-2>
    ...
    else:
        <default-suite>

When people speak of introducing a "switch" statement they are speaking of a
construct in which the decision of which branch to take requires time
proportional to something LESS than a linear function of the number of
branches (it's not O(n) in the number of branches).

Now the pattern matching is more interesting, but again, I'd need to see a
proposed syntax for Python before I could begin to consider it. If I
understand it properly, pattern matching in Haskell relies primarily on
Haskell's excellent typing system, which is absent in Python.

-- Michael Chermside

From mfb at lotusland.dyndns.org Thu Apr 21 14:26:09 2005
From: mfb at lotusland.dyndns.org (Matthew F. Barnes)
Date: Thu Apr 21 14:26:17 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To: <42670906.2000308@ocf.berkeley.edu>
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu>
Message-ID: <1114086369.5763.7.camel@workstation>

On Wed, 2005-04-20 at 18:59 -0700, Brett C. wrote:
> So no leak. Yes, there should be more explicit refcounting to be proper, but
> the compiler cheats in a couple of places for various reasons. But basically
> everything is fine since st->st_cur and st->st_stack are only played with
> refcount-wise by either symtable_enter_scope() and symtable_exit_scope() and
> they are always called in pairs in the end.

... except for the "global" scope, for which symtable_exit_scope() is never
called. But the last reference to *that* scope (st->st_cur) gets cleaned up
in PySymtable_Free(). Correct?

So the two things I thought were glitches are actually cancelling each other
out. Very good. Thanks for your help.
Matthew Barnes

From ncoghlan at gmail.com Thu Apr 21 14:39:46 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu Apr 21 14:39:53 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <20050421050345.hmlz8a46bbscw844@mcherm.com>
References: <20050421050345.hmlz8a46bbscw844@mcherm.com>
Message-ID: <42679F12.8080902@gmail.com>

Michael Chermside wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

There's no real need for special syntax in Python - an appropriate tuple
subclass will do the trick quite nicely:

class pattern(tuple):
    ignore = object()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __hash__(self):
        raise NotImplementedError

    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for item, other_item in zip(self, other):
            if item is pattern.ignore:
                continue
            if item != other_item:
                return False
        return True

Py> x = (1, 2, 3)
Py> print x == pattern(1, 2, 3)
True
Py> print x == pattern(1, pattern.ignore, pattern.ignore)
True
Py> print x == pattern(1, pattern.ignore, 3)
True
Py> print x == pattern(2, pattern.ignore, pattern.ignore)
False
Py> print x == pattern(1)
False

It's not usable in a dict-based switch statement, obviously, but it's
perfectly compatible with the current if/elif idiom.

Cheers,
Nick.
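The tuple subclass above runs unchanged on later Pythons (modulo print syntax); a self-contained sketch of it driving the if/elif idiom it is said to be compatible with. The classify function and its "shapes" are invented for illustration:

```python
# Condensed, runnable version of the pattern class, used for dispatch.

class pattern(tuple):
    ignore = object()

    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __hash__(self):
        raise NotImplementedError  # patterns are not meant to be dict keys

    def __eq__(self, other):
        if len(self) != len(other):
            return False
        for item, other_item in zip(self, other):
            if item is pattern.ignore:
                continue
            if item != other_item:
                return False
        return True

def classify(point):
    # Plain tuple == pattern works because Python tries the subclass's
    # __eq__ first when the subclass appears on the right-hand side.
    if point == pattern(0, 0, 0):
        return "origin"
    elif point == pattern(0, pattern.ignore, pattern.ignore):
        return "on the x=0 plane"
    else:
        return "elsewhere"
```

Note the reliance on reflected-operand priority: with a tuple on the left and a tuple subclass on the right, the subclass's `__eq__` is consulted first, which is what makes the wildcard comparison work.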
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From michael.walter at gmail.com Thu Apr 21 14:46:42 2005 From: michael.walter at gmail.com (Michael Walter) Date: Thu Apr 21 14:46:44 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <42679F12.8080902@gmail.com> References: <20050421050345.hmlz8a46bbscw844@mcherm.com> <42679F12.8080902@gmail.com> Message-ID: <877e9a1705042105465df1f925@mail.gmail.com> On 4/21/05, Nick Coghlan wrote: > Michael Chermside wrote: > > Now the pattern matching is more interesting, but again, I'd need to > > see a proposed syntax for Python before I could begin to consider it. > > If I understand it properly, pattern matching in Haskell relies > > primarily on Haskell's excellent typing system, which is absent in > > Python. > > There's no real need for special syntax in Python - an appropriate tuple > subclass will do the trick quite nicely: You are missing the more interesting part of pattern matching, namely that it is used for deconstructing values/binding subvalues. 
Ex.:

case lalala of
    Foo f -> f
    Bar (Baz brzzzzz) _ meep -> (brzzzzz, meep)

or Python-ish:

match doThis() with:
    Foo as f: return f
    (_,* as bar,_): return bar
    Baz(boink as brzzz, meep=10): return brzzz

"* as bar" is Not Very Nice (tm) :/

Michael

From fredrik at pythonware.com Thu Apr 21 15:11:23 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 15:12:51 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <426569CD.1010701@divmod.com> <951559a5329cda690f153ee8894e0636@redivi.com>
Message-ID: 

Bob Ippolito wrote:

>>> def strawman(self):
>>>     def sayGoodbye(mingleResult):
>>>         def goAway(goodbyeResult):
>>>             self.loseConnection()
>>>         self.send("goodbye").addCallback(goAway)
>>>     def mingle(helloResult):
>>>         self.send("nice weather we're having").addCallback(sayGoodbye)
>>>     self.send("hello").addCallback(mingle)
>>
>> def iterman(self):
>>     yield "hello"
>>     yield "nice weather we're having"
>>     yield "goodbye"
>
> Which, more or less, works for a literal translation of the straw-man above.
> However, you're missing the point. These deferred operations actually
> return results. Generators offer no sane way to pass results back in.

that's why you need a context object (=self, in this case).

def iterman(self):
    yield "hello"
    print self.data
    yield "nice weather we're having"
    print self.data
    yield "goodbye"

also see:

http://effbot.org/zone/asyncore-generators.htm

> If they did, then this use case could be mostly served by generators.

exactly.
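The context-object trick can be sketched as a toy event loop: the driver stores each "network reply" on the context object before resuming the generator, so the generator reads results via self between yields. The Conversation class, drive function, and canned replies below are invented for illustration (the effbot page linked above covers the real asyncore version):

```python
class Conversation:
    def __init__(self):
        self.data = None   # last reply, set by the driver
        self.log = []      # replies as seen from inside the generator

    def iterman(self):
        yield "hello"
        self.log.append(self.data)   # reply to "hello" is visible here
        yield "nice weather we're having"
        self.log.append(self.data)   # reply to the small talk
        yield "goodbye"

def drive(conv, replies):
    # Toy event loop: send each outgoing line, then record the canned
    # reply on the context object before resuming the generator.
    sent = []
    for outgoing, reply in zip(conv.iterman(), replies):
        sent.append(outgoing)
        conv.data = reply
    return sent

conv = Conversation()
sent = drive(conv, ["hi there", "indeed", "bye"])
```

The generator never receives values directly; it just reads `self.data`, which the driver mutated while the generator was suspended. (Python 2.5's `generator.send()` later made this possible without the side channel.)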
From pedronis at strakt.com Thu Apr 21 16:02:38 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Thu Apr 21 16:02:50 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2m64yg79yl.fsf@starship.python.net> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> Message-ID: <4267B27E.9030107@strakt.com> Michael Hudson wrote: >Shannon -jj Behrens writes: > > > >>On 4/20/05, M.-A. Lemburg wrote: >> >> >> >>>My use case for switch is that of a parser switching on tokens. >>> >>>mxTextTools applications would greatly benefit from being able >>>to branch on tokens quickly. Currently, there's only callbacks, >>>dict-to-method branching or long if-elif-elif-...-elif-else. >>> >>> >>I think "match" from Ocaml would be a much nicer addition to Python >>than "switch" from C. >> >> > >Can you post a quick summary of how you think this would work? > > > Well, Python lists are used more imperatively and are not made up with cons cells, we have dictionaries which because of ordering issues are not trivial to match, and no general ordered records with labels. We have objects and not algebraic data types. Literature on the topic usually indicates the visitor pattern as the moral equivalent of pattern matching in an OO-context vs. algebraic data types/functional one. I agree with that point of view and Python has idioms for the visitor pattern. Interestingly even in the context of objects one can leverage the infrastructure that is there for generalized copying/pickling to allow generalized pattern matching of nested object data structures. Whether it is practical I don't know. >>> class Pt: ... def __init__(self, x,y): ... self.x = x ... self.y = y ... 
>>> p(lambda _: Pt(1, _()) ).match(Pt(1,3))
(3,)
>>> p(lambda _: Pt(1, Pt(_(),_()))).match(Pt(1,Pt(Pt(5,6),3)))
(<__main__.Pt instance at 0x40200b4c>, 3)

http://codespeak.net/svn/user/pedronis/match.py

is an experiment in that direction (preceding this discussion and inspired
while reading a book that was using OCaml for its examples).

Notice that this is quite grossly subclassing pickling infrastructure (the
innocent bystander should probably not try that); a cleaner approach redoing
that logic with matching in mind is possible and would be preferable.

From gvanrossum at gmail.com Thu Apr 21 16:28:33 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:31:32 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To: <1114086369.5763.7.camel@workstation>
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu> <1114086369.5763.7.camel@workstation>
Message-ID: 

> So the two things I thought were glitches are actually cancelling each
> other out. Very good. Thanks for your help.

Though I wonder why it was written so delicately. Would explicit
INCREF/DECREF really have hurt the performance that much? This is only the
bytecode compiler, which isn't on the critical path.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Thu Apr 21 16:37:50 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:38:22 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: 

[Brian Sabbey]
> >> If suites were commonly used as above to define properties, event handlers
> >> and other callbacks, then I think most people would be able to comprehend
> >> what the first example above is doing much more quickly than the second.

[Fredrik]
> > wonderful logic, there. good luck with your future adventures in language
> > design.

[Brian again]
> I'm just trying to help python improve.
Maybe I'm not doing a very good > job, I don't know. Either way, there's no need to be rude. > > If I've broken some sort of unspoken code of behavior for this list, then > maybe it would be easier if you just 'spoke' it (perhaps in a private > email or in the description of this list on python.org). In his own inimitable way, Fredrik is pointing out that your argument is a tautology (or very close to one): rephrased, it sounds like "if X were commonly used, you'd recognize it easily", which isn't a sufficient argument for anything. While I've used similar arguments occasionally to shut up folks whose only remaining argument against a new feature was "but nobody will understand it the first time they encounter it" (which is true of *everything* you see for the first time), such reasoning isn't strong enough to support favoring one thing over another. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 21 16:49:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 21 16:51:59 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: > So while: > fooble(arg) > is pretty nasty, documentation that tells me that 'arg' is a string is > probably enough to set me on the right track. But if the > documentation tells me that arg is a thunk/block, that's almost > certainly not enough to get me going. I also need to know how that > thunk/block will be called. This argument against thunks sounds bogus to me. The signature of any callable arguments is recursively part of the signature of the function you're documenting. Just like the element type of any sequence arguments is part of the argument type. 
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From gvanrossum at gmail.com Thu Apr 21 16:52:35 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 16:52:43 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4267259F.6050902@canterbury.ac.nz>
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: 

[Greg Ewing]
> My current thought is that it should look like this:
>
>     with_file(filename) as f:
>         do_something_with(f)
>
> The success of this hinges on how many use cases can
> be arranged so that the word 'as' makes sense in that
> position.
[...]
> This way, the syntax is just
>
>     expr ['as' assignment_target] ':' suite
>
> and the expr is evaluated quite normally.

Perhaps it could be even simpler:

    [assignment_target '=']* expr ':' suite

This would just be an extension of the regular assignment statement.

(More in a longer post I'm composing off-line while picking cherries off the
thread.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From pedronis at strakt.com Thu Apr 21 16:58:52 2005
From: pedronis at strakt.com (Samuele Pedroni)
Date: Thu Apr 21 16:59:06 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
In-Reply-To: 
References: <20050419212423.63AD.JCARLSON@uci.edu> <20050420084329.63B0.JCARLSON@uci.edu>
Message-ID: <4267BFAC.9060402@strakt.com>

Fredrik Lundh wrote:
>> Regardless, I believe that solving generator finalization (calling all
>> enclosing finally blocks in the generator) is a worthwhile problem to
>> solve. Whether that be by PEP 325, 288, 325+288, etc., that should be
>> discussed. Whether people use it as a pseudo-block, or decide that
>> blocks are further worthwhile, I suppose we could wait and see.
>
> Agreed.
>
>
>
I agree; in fact, I think that solving that issue is very important before/if
ever introducing a generalized block statement, because otherwise things that
would naturally be expressible with for and generators will use the block
construct, which allows more variety and so possibly less immediate clarity,
just because generators are not good at resource handling.

From mcherm at mcherm.com Thu Apr 21 17:10:30 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Thu Apr 21 17:10:33 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050421081030.yky429mt9jgo4gg0@mcherm.com>

I wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

Nick Coghlan replies:
> There's no real need for special syntax in Python - an appropriate tuple
> subclass will do the trick quite nicely:
[... sample code matching tuples ...]

Aha, but now you've answered my question about syntax, and I can see that
your syntax lacks most of the power of Haskell's pattern matching. First of
all, it can only match tuples ... most things in Python are NOT tuples.
Secondly (as Michael Walter explained) it doesn't allow name binding to parts
of the pattern.

Honestly, while I understand that pattern matching is extremely powerful, I
don't see how to apply it in Python. We have powerful introspective
abilities, which seems to be helpful, but on the other hand we lack types,
which are typically a key feature of such matching. And then there's the fact
that many of the elegant uses of pattern matching use recursion to traverse
data structures... a no-no in a CPython that lacks tail-recursion
elimination.

There is one exception... matching strings.
There we have a powerful means of specifying patterns (regular expressions),
and a multi-way branch based on the content of a string is a common
situation. A new way to write this:

    s = get_some_string_value()
    if s == '':
        continue
    elif re.match('#.*$', s):
        handle_comment()
    elif s == 'DEFINE':
        handle_define()
    elif s == 'UNDEF':
        handle_undefine()
    elif re.match('[A-Za-z][A-Za-z0-9]*$', s):
        handle_identifier()
    else:
        syntax_error()

might be nice, but I can't figure out how to make it work more efficiently
than the simple if-elif-else structure, nor an elegant syntax.

-- Michael Chermside

From fredrik at pythonware.com Thu Apr 21 17:22:29 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Thu Apr 21 17:24:22 2005
Subject: [Python-Dev] Re: Re: switch statement
References: <20050421081030.yky429mt9jgo4gg0@mcherm.com>
Message-ID: 

Michael Chermside wrote:

> There is one exception... matching strings. There we have a powerful
> means of specifying patterns (regular expressions), and a multi-way
> branch based on the content of a string is a common situation. A new
> way to write this:
>
>     s = get_some_string_value()
>     if s == '':
>         continue
>     elif re.match('#.*$', s):
>         handle_comment()
>     elif s == 'DEFINE':
>         handle_define()
>     elif s == 'UNDEF':
>         handle_undefine()
>     elif re.match('[A-Za-z][A-Za-z0-9]*$', s):
>         handle_identifier()
>     else:
>         syntax_error()
>
> might be nice, but I can't figure out how to make it work
> more efficiently than the simple if-elif-else structure, nor an
> elegant syntax.
somewhat related: http://mail.python.org/pipermail/python-dev/2003-April/035075.html From steven.bethard at gmail.com Thu Apr 21 17:27:00 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 17:27:04 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: > > So while: > > fooble(arg) > > is pretty nasty, documentation that tells me that 'arg' is a string is > > probably enough to set me on the right track. But if the > > documentation tells me that arg is a thunk/block, that's almost > > certainly not enough to get me going. I also need to know how that > > thunk/block will be called. > > This argument against thunks sounds bogus to me. The signature of any > callable arguments is recursively part of the signature of the > function you're documenting. Just like the element type of any > sequence arguments is part of the argument type. It wasn't really an argument against thunks. (See the disclaimer I gave at the bottom of my previous email.) Think of it as an early documentation request for the thunks in the language reference -- I'd like to see it remind users of thunks that part of the thunk-accepting function interface is the parameters the thunk will be called with, and that these should be documented. In case my point about the difference between thunks and other callables (specifically decorators) slipped by, consider the documentation for staticmethod, which takes a callable. All the staticmethod documentation says about that callable's parameters is: "A static method does not receive an implicit first argument" Pretty simple I'd say. Or classmethod: "A class method receives the class as implicit first argument, just like an instance method receives the instance." Again, pretty simple. Why are these simple? 
Because decorators generally pass on pretty much the same arguments as the
callables they wrap. My point was just that because thunks don't wrap other
normal callables, they can't make such abbreviations.

STeVe

--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From gvanrossum at gmail.com Thu Apr 21 17:59:43 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 18:07:27 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: 
References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz>
Message-ID: 

> In case my point about the difference between thunks and other
> callables (specifically decorators) slipped by, consider the
> documentation for staticmethod, which takes a callable. All the
> staticmethod documentation says about that callable's parameters is:
> "A static method does not receive an implicit first argument"
> Pretty simple I'd say. Or classmethod:
> "A class method receives the class as implicit first argument,
> just like an instance method receives the instance."
> Again, pretty simple. Why are these simple? Because decorators
> generally pass on pretty much the same arguments as the callables they
> wrap. My point was just that because thunks don't wrap other normal
> callables, they can't make such abbreviations.

You've got the special-casing backwards. It's not thunks that are special,
but staticmethod (and decorators in general) because they take *any*
callable. That's unusual -- most callable arguments have a definite
signature, think of map(), filter(), sort() and Button callbacks.
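A sketch of that point: for most higher-order APIs, the callback's signature is a fixed, documented part of the interface. Button below is an invented stand-in, not any real toolkit class:

```python
class Button:
    """Toy widget. The documented contract: on_click callbacks take
    exactly one argument, the running click count."""

    def __init__(self, on_click):
        self.on_click = on_click   # callable taking (count)
        self.count = 0

    def click(self):
        self.count += 1
        self.on_click(self.count)

# Any one-argument callable satisfies the documented contract:
clicks = []
b = Button(clicks.append)
b.click()
b.click()

# Likewise, map() documents its callable as taking one item:
doubled = list(map(lambda n: n * 2, [1, 2, 3]))
```

Decorators like staticmethod are the odd ones out precisely because they accept a callable of *any* signature and pass it through unchanged.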
--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Thu Apr 21 18:20:50 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 21 18:20:53 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

James Y Knight wrote:
> If it was possible to assign to a variable bound outside
> your function, but still in your lexical scope, I think it would fix
> this issue.  That's always something I've thought should be possible,
> anyways.  I propose to make it possible via a declaration similar to
> 'global'.
>
> E.g. (stupid example, but it demonstrates the syntax):
> def f():
>     count = 0
>     def addCount():
>         lexical count
>         count += 1
>     assert count == 0
>     addCount()
>     assert count == 1

It strikes me that with something like this lexical declaration, we could abuse decorators as per Carl Banks's recipe[1] to get the equivalent of thunks:

    def withfile(filename, mode='r'):
        def _(func):
            f = open(filename, mode)
            try:
                func(f)
            finally:
                f.close()
        return _

and used like:

    line = None

    @withfile("readme.txt")
    def print_readme(fileobj):
        lexical line
        for line in fileobj:
            print line

    print "last line:", line

As the recipe notes, the main difference between print_readme and a real "code block" is that print_readme doesn't have access to the lexical scope.  Something like James's suggestion would solve this problem.  One advantage I see of this route (i.e. using defs + lexical scoping instead of new syntactic support) is that because we're using a normal function, the parameter list is not an issue -- arguments to the "thunk" are bound to names just as they are in any other function.
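For the record, the recipe above works today almost verbatim: `nonlocal` (added to Python 3 by PEP 3104) plays exactly the role James proposed for `lexical`.  A runnable sketch, with `io.StringIO` standing in for the hypothetical readme file:

```python
import io

def withfile(fileobj):
    # Hypothetical helper mirroring Carl Banks's recipe: call the
    # decorated function with fileobj, then close it.  The decorated
    # name ends up rebound to None, as the thread notes.
    def _(func):
        try:
            func(fileobj)
        finally:
            fileobj.close()
    return _

def demo():
    line = None   # the name we want the "block" to rebind
    seen = []

    @withfile(io.StringIO("spam\neggs\n"))
    def print_readme(f):
        nonlocal line   # modern spelling of the proposed `lexical`
        for line in f:
            seen.append(line)

    return line, seen, print_readme

last, seen, leftover = demo()
print(last)      # 'eggs\n' -- the rebinding escaped the "block"
print(leftover)  # None -- the decorator returned nothing
```

The `leftover is None` result is the "big disadvantage" Steven goes on to describe: the decorated name is clobbered rather than bound to a callable.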
The big disadvantage I see is that my normal expectations for decorators are wrong here -- after the decorator is applied print_readme is set to None, not a new callable object. Guess I'm still riding the fence. ;-) STeVe [1]http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/391199 -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Thu Apr 21 18:27:22 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 21 18:27:27 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <740c3aec0504191557505d6e9f@mail.gmail.com> <560ee46ad8e2a82faedba7349b98ab5a@yahoo.com> <42672441.7080909@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: > > In case my point about the difference between thunks and other > > callables (specifically decorators) slipped by, consider the > > documentation for staticmethod, which takes a callable. All the > > staticmethod documentation says about that callable's parameters is: > > "A static method does not receive an implicit first argument" > > Pretty simple I'd say. Or classmethod: > > "A class method receives the class as implicit first argument, > > just like an instance method receives the instance." > > Again, pretty simple. Why are these simple? Because decorators > > generally pass on pretty much the same arguments as the callables they > > wrap. My point was just that because thunks don't wrap other normal > > callables, they can't make such abbreviations. > > You've got the special-casing backwards. It's not thinks that are > special, but staticmethod (and decorators in general) because they > take *any* callable. That's unusual -- most callable arguments have a > definite signature, think of map(), filter(), sort() and Button > callbacks. Yeah, that was why I footnoted that most of my use for callables taking callables was decorators. 
But while I don't use map, filter or Button callbacks, I am guilty of using sort and helping to add a key= argument to min and max, so I guess I can't be too serious about only using decorators. ;-)

STeVe
--
You can wordify anything if you just verb it.

From gvanrossum at gmail.com Thu Apr 21 18:38:03 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 21 18:43:20 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

> It strikes me that with something like this lexical declaration, we
> could abuse decorators as per Carl Banks's recipe[1] to get the
> equivalent of thunks:

"abuse" being the operative word.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Thu Apr 21 19:05:35 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 21 19:05:37 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID:

On 4/21/05, Guido van Rossum wrote:
> > It strikes me that with something like this lexical declaration, we
> > could abuse decorators as per Carl Banks's recipe[1] to get the
> > equivalent of thunks:
>
> "abuse" being the operative word.

Yup.  I was just drawing the parallel between:

    @withfile("readme.txt")
    def thunk(fileobj):
        for line in fileobj:
            print line

and

    @withfile("readme.txt"):
        # called by withfile as thunk(fileobj=)
        for line in fileobj:
            print line

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From mwh at python.net Thu Apr 21 19:10:05 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 19:10:27 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <4267B27E.9030107@strakt.com> (Samuele Pedroni's message of "Thu, 21 Apr 2005 16:02:38 +0200")
References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> <4267B27E.9030107@strakt.com>
Message-ID: <2mu0m05a4y.fsf@starship.python.net>

Samuele Pedroni writes:
> Michael Hudson wrote:

[pattern matching]

>> Can you post a quick summary of how you think this would work?
>
> Well, Python lists are used more imperatively and are not made up
> with cons cells, we have dictionaries which because of ordering
> issues are not trivial to match, and no general ordered records with
> labels.

That's a better way of putting it than "pattern matching and python don't really seem to fit together", for sure :)

(I'd quite like records with labels, tangentially, but am not so wild about ordering)

> We have objects and not algebraic data types.  Literature on the
> topic usually indicates the visitor pattern as the moral equivalent
> of pattern matching in an OO-context vs. algebraic data
> types/functional one.  I agree with that point of view and Python has
> idioms for the visitor pattern.

But the visitor pattern is pretty grim, really.  It would be nice (tm) to have something like:

    match node in:
        Assign(lhs=Var(_), rhs=_):
            # lhs, rhs bound in here
        Assign(lhs=Subscr(_,_), rhs=_):
            # ditto
        Assign(lhs=Slice(*_), rhs=_):
            # ditto
        Assign(lhs=_, rhs=_):
            raise SyntaxError

in Lib/compiler.  Vyper had something like this, I think.

>
> Interestingly even in the context of objects one can leverage the
> infrastructure that is there for generalized copying/pickling to
> allow generalized pattern matching of nested object data
> structures.
> Whether it is practical I don't know.
>
> >>> class Pt:
> ...     def __init__(self, x,y):
> ...         self.x = x
> ...         self.y = y
> ...
> >>> p(lambda _: Pt(1, _()) ).match(Pt(1,3))
> (3,)
> >>> p(lambda _: Pt(1, Pt(_(),_()))).match(Pt(1,Pt(Pt(5,6),3)))
> (<__main__.Pt instance at 0x40200b4c>, 3)
>
> http://codespeak.net/svn/user/pedronis/match.py is an experiment in
> that direction (preceding this discussion
> and inspired while reading a book that was using OCaml for its examples).

Yikes!

> Notice that this is quite grossly subclassing pickling infrastructure
> (the innocent bystander should probably not try that), a cleaner
> approach redoing that logic with matching in mind is possible and
> would be preferable.

Also, the syntax is disgusting.  But that's a separate issue, I guess.

Cheers,
mwh
--
/* I'd just like to take this moment to point out that C has all
   the expressive power of two dixie cups and a string. */
-- Jamie Zawinski from the xkeycaps source

From mwh at python.net Thu Apr 21 19:52:11 2005
From: mwh at python.net (Michael Hudson)
Date: Thu Apr 21 19:52:14 2005
Subject: [Python-Dev] marshal / unmarshal
In-Reply-To: (Scott David Daniels's message of "Fri, 08 Apr 2005 16:15:39 -0700")
References:
Message-ID: <2moec8586s.fsf@starship.python.net>

Scott David Daniels writes:
> What should marshal / unmarshal do with floating point NaNs (the case we
> are worrying about is Infinity) ?  The current behavior is not perfect.

So, after a fair bit of hacking, I think I have most of a solution to this, in two patches:

    make float packing copy bytes when they can
    http://python.org/sf/1181301

    binary formats for marshalling floats
    http://python.org/sf/1180995

I'd like to check them both in pretty soon, but would really appreciate a review, especially of the first one as it's gotten a little hairy, mainly so I could then write some detailed tests.
That said, if there are no objections I'm going to check them in anyway, so if they turn out to suck, it'll be YOUR fault for not reviewing the patches :) Cheers, mwh -- (Of course SML does have its weaknesses, but by comparison, a discussion of C++'s strengths and flaws always sounds like an argument about whether one should face north or east when one is sacrificing one's goat to the rain god.) -- Thant Tessman From jjinux at gmail.com Thu Apr 21 23:10:14 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Thu Apr 21 23:10:17 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2m64yg79yl.fsf@starship.python.net> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> Message-ID: On 4/21/05, Michael Hudson wrote: > Shannon -jj Behrens writes: > > > On 4/20/05, M.-A. Lemburg wrote: > > > >> My use case for switch is that of a parser switching on tokens. > >> > >> mxTextTools applications would greatly benefit from being able > >> to branch on tokens quickly. Currently, there's only callbacks, > >> dict-to-method branching or long if-elif-elif-...-elif-else. > > > > I think "match" from Ocaml would be a much nicer addition to Python > > than "switch" from C. > > Can you post a quick summary of how you think this would work? Sure. Now that I'm actually trying to come up with an example, I'm noticing that Ocaml is very different than Python because Python distinguishes statements and expressions, unlike say, Scheme. Furthermore, it's important to minimize the number of new keywords and avoid excessive punctuation (which Ocaml is full of). 
Hence, I propose something like:

    def handle_token(token):
        match token:
            NUMBER:
                return number / a
            WHITESPACE if token.value == "\n":
                return NEWLINE
            (a, b):
                return a / b
            else:
                return token

Hence, the syntax is something like (in pseudo EBNF):

    'match' expr ':' {match_expression ':' block}* 'else' ':' block
    match_expr ::= lvalue | constant_expression

Semantically, the above example translates into:

    def handle_token(token):
        if token == NUMBER:
            return number / a
        elif token == WHITESPACE and token.value == "\n":
            return NEWLINE
        elif "setting (a, b) = token succeeds":
            return a / b
        else:
            return token

However, unlike the code above, you can more easily and more aggressively optimize.

Best Regards,
-jj
--
I have decided to switch to Gmail, but messages to my Yahoo account will still get through.

From sabbey at u.washington.edu Fri Apr 22 00:21:18 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Fri Apr 22 00:21:24 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4267259F.6050902@canterbury.ac.nz>
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

Greg Ewing wrote:
> I also have a thought concerning whether the block
> argument to the function should come first or last or
> whatever.  My solution is that the function should take
> exactly *one* argument, which is the block.  Any other
> arguments are dealt with by currying.  In other words,
> with_file above would be defined as
>
> def with_file(filename):
>     def func(block):
>         f = open(filename)
>         try:
>             block(f)
>         finally:
>             f.close()
>     return func
>
> This would also make implementation much easier.  The
> parser isn't going to know that it's dealing with anything
> other than a normal expression statement until it gets to
> the 'as' or ':', by which time going back and radically
> re-interpreting a previous function call could be awkward.

I made an example implementation, and this wasn't an issue.
It took some code to stick the thunk into the argument list, but it was pretty straightforward.  The syntax that is actually used by the parser can be the same regardless of whether or not argument list augmentation is done, so the parser will not find one more awkward than the other.

> This way, the syntax is just
>
>     expr ['as' assignment_target] ':' suite
>
> and the expr is evaluated quite normally.

Requiring arguments other than the block to be dealt with by currying can lead to problems.  I won't claim these problems are serious, but they will be annoying.  Say, for example, you create a block-accepting function that takes no arguments.  Naturally, you would define it like this:

    def f(block):
        do_something_with_block

Now, say you want to add to this function an optional argument, so you wrap another function around it like in your 'with_file' example above.  Unfortunately, now you need to go find every call of this function and add empty parentheses.  This is annoying.  Remember the first time you added optional arguments to a function and what a relief it was not to have to go find every call to that function and stick in the extra argument?  Those days are over! (well, in this case anyway.)

Some people, aware of this problem of adding optional arguments, will define *all* of their block-accepting functions so that they are wrapped in another function, even if that function takes no arguments (and wars, annoying ones, will be fought over whether this is the "right" way to do it or not!):

    def f():
        def real_func(block):
            pass
        return real_func

Now the documentation gets confusing.  Just saying that the function doesn't take any non-block arguments isn't enough.  You would need very specific language, which many library authors will not provide.  And there will always be that extra step in thought: do I need the stupid parentheses or not?  There will inevitably be people (including me) who get the parentheses wrong because of absentmindedness or carelessness.
This will be an extra little speed bump. Now, you may say that all these arguments apply to function decorators, so why have none of these problems appeared? The difference is that defining a function takes a long time, so a little speed bump when decorating it isn't a big deal. But blocks can be defined almost instantly. Much of their purpose has to do with making things quicker. Speed bumps are therefore a bigger deal. This will also be an issue for beginners who use python. A beginner won't necessarily have a good understanding of a function that returns a function. But such an understanding would be required simply to *use* block-accepting functions. Otherwise it would be completely mysterious why sometimes one sees this f(a,b,c) as i: pass and sometimes this g as i: pass even though both of these cases just seem to call the function that appears next to 'as' (imagine you don't have the source of 'f' and 'g'). Even worse, imagine finally learning the rule that parentheses are not allowed if there are zero arguments, and then seeing: h() as i: pass Now it would just seem arbitrary whether or not parentheses are required or disallowed. Such an issue may seem trivial to an experienced programmer, but can be very off-putting for a beginner. >> Another set of question arose for me when Barry started musing over the >> combination of blocks and decorators. What are blocks? Well, obviously >> they are callable. What do they return? The local namespace they >> created/modified? > > I think the return value of a block should be None. > In constructs like with_file, the block is being used for > its side effect, not to compute a value for consumption > by the block function. I don't see a great need for blocks > to be able to return values. If you google "filetype:rb yield", you can see many the uses of yield in ruby. By looking for the uses in which yield's return value is used, you can find blocks that return values. 
For example, "t = yield()" or "unless yield()" indicate that a block is returning a value. It is true that most of the time blocks do not return values, but I estimate that maybe 20% of the hits returned by google contain at least one block that does. Of course, this information is alone is not very informative, one would like to understand each case individually. But, as a first guess, it seems that people do find good uses for being able to return a value from a block. Probably 'continue ', which I had proposed earlier, is awful syntax for returning a value from a block. But 'produce ' or some other verb may not be so bad. In cases that the block returns no value, 'continue' could still be used to indicate that control should return to the function that called the block. -Brian From gvanrossum at gmail.com Fri Apr 22 01:40:28 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 22 01:40:41 2005 Subject: [Python-Dev] anonymous blocks Message-ID: I've been thinking about this a lot, but haven't made much progess. Here's a brain dump. 
I've been thinking about integrating PEP 325 (Resource-Release Support for Generators) into the for-loop code, so that you could replace

    the_lock.acquire()
    try:
        BODY
    finally:
        the_lock.release()

with

    for dummy in synchronized(the_lock):
        BODY

or perhaps even (making "for VAR" optional in the for-loop syntax) with

    in synchronized(the_lock):
        BODY

Then synchronized() could be written cleanly as follows:

    def synchronized(lock):
        lock.acquire()
        try:
            yield None
        finally:
            lock.release()

But then every for-loop would have to contain an extra try-finally clause; the translation of

    for VAR in EXPR:
        BODY

would become

    __it = iter(EXPR)
    try:
        while True:
            try:
                VAR = __it.next()
            except StopIteration:
                break
            BODY
    finally:
        if hasattr(__it, "close"):
            __it.close()

which I don't particularly like: most for-loops DON'T need this, since they don't use a generator but some other form of iterator, or even if they use a generator, not all generators have a try/finally loop.  But the bytecode compiler can't know that, so it will always have to generate this code.  It also changes the semantics of using a generator in a for-loop slightly: if you break out of the for-loop before the generator is exhausted you will still get the close() call.

It's also a bit funny to see this approach used with the only other use case for try/finally we've looked at, which requires passing a variable into the block: the "with_file" use case.  We now can write with_file as a nice and clean generator:

    def with_file(filename):
        f = open(filename)
        try:
            yield f
        finally:
            f.close()

but the use looks very odd because it is syntactically a for-loop but there's only one iteration:

    for f in with_file("/etc/passwd"):
        for line in f:
            print line[:line.find(":")]

Seeing this example makes me cringe -- why two nested for loops to loop over the lines of one file???

So I think that this is probably not the right thing to pursue, and we might be better off with something along the lines of PEP 310.
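(With hindsight: the generator sketched above is, give or take spelling, what later shipped as `contextlib.contextmanager` in PEP 343, with a `with` statement rather than a for-loop driving the single yield.  A sketch assuming a modern Python:)

```python
from contextlib import contextmanager
import threading

@contextmanager
def synchronized(lock):
    # Same shape as the generator in the mail: acquire, yield once
    # inside try, release in finally -- driven by `with`, not `for`.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

lock = threading.Lock()
with synchronized(lock):
    assert lock.locked()     # held inside the block
assert not lock.locked()     # released on the way out
```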
The authors of PEP 310 agree; under Open Issues they wrote:

    There are some similarities in concept between 'with ...' blocks
    and generators, which have led to proposals that for loops could
    implement the with block functionality[3].  While neat on some
    levels, we think that for loops should stick to being loops.

(Footnote [3] references the thread that originated PEP 325.)

Perhaps the most important lesson we've learned in this thread is that the 'with' keyword proposed in PEP 310 is redundant -- the syntax could just be

    [VAR '=']* EXPR ':'
        BODY

IOW the regular assignment / expression statement gets an optional colon-plus-suite at the end.

So now let's assume we accept PEP 310 with this change.  Does this leave any use cases for anonymous blocks uncovered?  Ruby's each() pattern is covered by generators; personally I prefer Python's

    for var in seq: ...

over Ruby's much-touted

    seq.each() {|var| ...}

The try/finally use case is covered by PEP 310.  (If you want to combine this with a for-loop in a single operation, you'll need PEP 325.)

The use cases where the block actually returns a value are probably callbacks for things like sort() or map(); I have to admit that I'd rather keep lambda for these (and use named functions for longer blocks) than introduce an anonymous block syntax that can return values!  I also note that if you *already* have a comparison function, Ruby's Array sort method doesn't let you pass it in as a function argument; you have to give it a block that calls the comparison function, because blocks are not the same as callables (and I'm not sure that Ruby even *has* callables -- everything seems to be a block).

My tentative conclusion remains: Python doesn't need Ruby blocks.  Brian Sabbey ought to come up with more examples rather than arguments why his preferred syntax and semantics are best.
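(For comparison, the PEP 310-style protocol discussed above is close to the `__enter__`/`__exit__` protocol that eventually shipped in PEP 343.  A minimal sketch with a made-up `Tracking` class:)

```python
class Tracking:
    # Made-up example class implementing the protocol that eventually
    # shipped (PEP 343): __enter__ supplies the bound value, __exit__
    # receives exception details and runs even on error.
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not swallow exceptions

t = Tracking()
with t as handle:
    assert handle is t
print(t.events)  # ['enter', 'exit']
```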
--Guido van Rossum (home page: http://www.python.org/~guido/)

From python-dev at zesty.ca Fri Apr 22 01:49:36 2005
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Fri Apr 22 01:49:47 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

On Thu, 21 Apr 2005, Guido van Rossum wrote:
> Perhaps it could be even simpler:
>
>     [assignment_target '=']* expr ':' suite
>
> This would just be an extension of the regular assignment statement.

It sounds like you are very close to simply translating

    expression...
    function_call(args):
        suite

into

    expression...
    function_call(args)(suitefunc)

If i understand what you proposed above, you're using assignment as a special case to pass arguments to the inner suite, right?  So:

    inner_args = function_call(outer_args):
        suite

becomes:

    def suitefunc(inner_args):
        suite
    function_call(outer_args)(suitefunc)

?  This could get a little hard to understand if the right-hand side of the assignment is more complex than a single function call.  I think the meaning would be unambiguous, just non-obvious.  The only interpretation i see for this:

    x = spam('foo') + eggs('bar'):
        suite

is this:

    def suitefunc(x):
        suite
    spam('foo') + eggs('bar')(suitefunc)

but that could seem a little too mysterious.  Or you could (in a later compiler pass) forbid more complex expressions on the RHS.

On another note, would there be any difference between

    x = spam():
        suite

and

    x = spam:
        suite

?

--
?!ng

From gvanrossum at gmail.com Fri Apr 22 01:54:20 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 22 01:54:22 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID:

[Ping]
> It sounds like you are very close to simply translating
>
>     expression...
>     function_call(args):
>         suite
>
> into
>
>     expression...
>     function_call(args)(suitefunc)

Actually, I'm abandoning this interpretation; see my separate (long) post.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From bac at OCF.Berkeley.EDU Fri Apr 22 01:55:14 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 01:56:10 2005
Subject: [Python-Dev] Reference counting when entering and exiting scopes
In-Reply-To:
References: <61373.130.76.96.19.1114034374.squirrel@lotusland.dyndns.org> <42670906.2000308@ocf.berkeley.edu> <1114086369.5763.7.camel@workstation>
Message-ID: <42683D62.9080101@ocf.berkeley.edu>

Guido van Rossum wrote:
>> So the two things I thought were glitches are actually cancelling each
>> other out.  Very good.  Thanks for your help.
>
> Though I wonder why it was written so delicately.

Don't know; Jeremy wrote those functions back in 2001 to add nested scopes.  If he remembers he deserves a cookie for having such a good memory.

> Would explicit
> INCREF/DECREF really have hurt the performance that much?  This is only
> the bytecode compiler, which isn't on the critical path.

Probably not.  But at this point I doubt it is worth fixing since the AST branch will replace it eventually (work is on-going, just slow since my thesis is on the home stretch; initial draft is done and now I am editing to hand over for final revision by my advisor).

-Brett

From bac at OCF.Berkeley.EDU Fri Apr 22 02:25:16 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 02:25:25 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To:
References:
Message-ID: <4268446C.6010301@ocf.berkeley.edu>

Guido van Rossum wrote:
> I've been thinking about this a lot, but haven't made much
> progress.  Here's a brain dump.
>
> I've been thinking about integrating PEP 325 (Resource-Release Support
> for Generators) into the for-loop code, so that you could replace

[SNIP - using 'for' syntax to delineate the block and resource]

> So I think that this is probably not the right thing to pursue,

I totally agree with your reasoning on this.

> and we
> might be better off with something along the lines of PEP 310.  The
> authors of PEP 310 agree; under Open Issues they wrote:
>
>     There are some similarities in concept between 'with ...' blocks
>     and generators, which have led to proposals that for loops could
>     implement the with block functionality[3].  While neat on some
>     levels, we think that for loops should stick to being loops.
>
> (Footnote [3] references the thread that originated PEP 325.)
>
> Perhaps the most important lesson we've learned in this thread is that
> the 'with' keyword proposed in PEP 310 is redundant -- the syntax
> could just be
>
>     [VAR '=']* EXPR ':'
>         BODY
>
> IOW the regular assignment / expression statement gets an optional
> colon-plus-suite at the end.

Sure, but is the redundancy *that* bad?  You should be able to pick up visually that something is an anonymous block from the indentation but I don't know how obvious it would be.  Probably, in the end, this minimal syntax would be fine, but it just seems almost too plain in terms of screaming at me that something special is going on there (the '=' in an odd place just doesn't quite cut it for me for my meaning of "special").

> So now let's assume we accept PEP 310 with this change.  Does this
> leave any use cases for anonymous blocks uncovered?  Ruby's each()
> pattern is covered by generators; personally I prefer Python's
>
>     for var in seq: ...
>
> over Ruby's much-touted
>
>     seq.each() {|var| ...}
>
> The try/finally use case is covered by PEP 310.  (If you want to
> combine this with a for-loop in a single operation, you'll need PEP
> 325.)
> > The use cases where the block actually returns a value are probably > callbacks for things like sort() or map(); I have to admit that I'd > rather keep lambda for these (and use named functions for longer > blocks) than introduce an anonymous block syntax that can return > values! I also note that if you *already* have a comparison function, > Ruby's Array sort method doesn't let you pass it in as a function > argument; you have to give it a block that calls the comparison > function, because blocks are not the same as callables (and I'm not > sure that Ruby even *has* callables -- everything seems to be a > block). > > My tentative conclusion remains: Python doesn't need Ruby blocks. > Brian Sabbey ought to come up with more examples rather than arguments > why his preferred syntax and semantics are best. > I think I agree with Samuele that it would be more pertinent to put all of this effort into trying to come up with some way to handle cleanup in a generator. -Brett From gvanrossum at gmail.com Fri Apr 22 02:34:31 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 22 02:34:32 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: <4268446C.6010301@ocf.berkeley.edu> References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: [Brett] > I think I agree with Samuele that it would be more pertinent to put all of this > effort into trying to come up with some way to handle cleanup in a generator. I.e. PEP 325. But (as I explained, and you agree) that still doesn't render PEP 310 unnecessary, because abusing the for-loop for implied cleanup semantics is ugly and expensive, and would change generator semantics; and it bugs me that the finally clause's reachability depends on the destructor executing. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python-dev at zesty.ca Fri Apr 22 02:39:18 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Fri Apr 22 02:39:22 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: On Thu, 21 Apr 2005, Guido van Rossum wrote: > The use cases where the block actually returns a value are probably > callbacks for things like sort() or map(); I have to admit that I'd > rather keep lambda for these (and use named functions for longer > blocks) than introduce an anonymous block syntax that can return > values! It seems to me that, in general, Python likes to use keywords for statements and operators for expressions. Maybe the reason lambda looks like such a wart is that it uses a keyword in the middle of an expression. It also uses the colon *not* to introduce an indented suite, which is a strange thing to the Pythonic eye. This suggests that an operator might fit better. A possible operator for lambda might be ->. sort(items, key=x -> x.lower()) Anyway, just a thought. -- ?!ng From pedronis at strakt.com Fri Apr 22 02:44:17 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Fri Apr 22 02:42:34 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: <426848E1.6050703@strakt.com> Guido van Rossum wrote: > [Brett] > >>I think I agree with Samuele that it would be more pertinent to put all of this >>effort into trying to come up with some way to handle cleanup in a generator. > > > I.e. PEP 325. > > But (as I explained, and you agree) that still doesn't render PEP 310 > unnecessary, because abusing the for-loop for implied cleanup > semantics is ugly and expensive, and would change generator semantics; > and it bugs me that the finally clause's reachability depends on the > destructor executing. 
>

yes, PEP325 would work in combination with PEP310, whether a combined thing (which cannot be the current for as discussed) is desirable is a different issue: these anyway

    f = file(...):
        for line in f:
            ...

vs.

    it = gen():
        for val in it:
            ...

would be analogous in a PEP310+325 world.

From jcarlson at uci.edu Fri Apr 22 02:59:59 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Apr 22 03:02:17 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To:
References: <4268446C.6010301@ocf.berkeley.edu>
Message-ID: <20050421173945.63C3.JCARLSON@uci.edu>

Guido van Rossum wrote:
>
> [Brett]
> > I think I agree with Samuele that it would be more pertinent to put all of this
> > effort into trying to come up with some way to handle cleanup in a generator.
>
> I.e. PEP 325.
>
> But (as I explained, and you agree) that still doesn't render PEP 310
> unnecessary, because abusing the for-loop for implied cleanup
> semantics is ugly and expensive, and would change generator semantics;
> and it bugs me that the finally clause's reachability depends on the
> destructor executing.

Yes and no.  PEP 325 offers a method to generators that handles cleanup if necessary and calls it close().  Obviously calling it close is a mistake.  Actually, calling it anything is a mistake, and trying to combine try/finally handling in generators with __exit__/close (inside or outside of generators) is also a mistake.

Start by saying, "If a non-finalized generator is garbage collected, it will be finalized."  Whether this be by an exception or forcing a return, so be it.

If this were to happen, we have generator finalization handled by the garbage collector, and don't need to translate /any/ for loop.  As long as the garbage collection requirement is documented, we are covered (yay!).

What about ...

    i.__enter__()
    try:
        ...
    finally:
        i.__exit__()

... types of things?  Well, you seem to have offered a syntax ...

    [VAR '=']* EXPR:
        BODY

...
which seems to translate into

    [VAR = ] __var = EXPR
    try:
        BODY
    finally:
        __var.__exit__()

or something like that.  Great!  We've got a syntax for resource
allocation/freeing outside of generators, and a non-syntax for resource
allocation/freeing inside of generators.

 - Josiah

From bob at redivi.com Fri Apr 22 03:47:14 2005
From: bob at redivi.com (Bob Ippolito)
Date: Fri Apr 22 03:47:29 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To: <20050421173945.63C3.JCARLSON@uci.edu>
References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu>
Message-ID: <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>

On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote:
> Guido van Rossum wrote:
>>
>> [Brett]
>>> I think I agree with Samuele that it would be more pertinent to put
>>> all of this
>>> effort into trying to come up with some way to handle cleanup in a
>>> generator.
>>
>> I.e. PEP 325.
>>
>> But (as I explained, and you agree) that still doesn't render PEP 310
>> unnecessary, because abusing the for-loop for implied cleanup
>> semantics is ugly and expensive, and would change generator semantics;
>> and it bugs me that the finally clause's reachability depends on the
>> destructor executing.
>
> Yes and no.  PEP 325 offers a method to generators that handles cleanup
> if necessary and calls it close().  Obviously calling it close is a
> mistake.  Actually, calling it anything is a mistake, and trying to
> combine try/finally handling in generators with __exit__/close (inside
> or outside of generators) is also a mistake.
>
> Start by saying, "If a non-finalized generator is garbage collected, it
> will be finalized."  Whether this be by an exception or forcing a
> return, so be it.
>
> If this were to happen, we have generator finalization handled by the
> garbage collector, and don't need to translate /any/ for loop.
As long
> as the garbage collection requirement is documented, we are covered
> (yay!).

Well, for the CPython implementation, couldn't you get away with using
garbage collection to do everything?  Maybe I'm missing something..

import weakref

class ResourceHandle(object):
    def __init__(self, acquire, release):
        acquire()
        # if I understand correctly, this is safer than __del__
        self.ref = weakref.ref(self, lambda o: release())

class FakeLock(object):
    def acquire(self):
        print "acquired"
    def release(self):
        print "released"

def with_lock(lock):
    r = ResourceHandle(lock.acquire, lock.release)
    yield None
    del r

>>> x = with_lock(FakeLock())
>>> del x
>>> with_lock(FakeLock()).next()
acquired
released
>>> for ignore in with_lock(FakeLock()):
...     print ignore
...
acquired
None
released

I could imagine someone complaining about generators that are never used
missing out on the acquire/release.  That could be solved with a trivial
rewrite:

def with_lock(lock):
    def _with_lock(r):
        yield None
        del r
    return _with_lock(ResourceHandle(lock.acquire, lock.release))

>>> x = with_lock(FakeLock())
acquired
>>> del x
released

Of course, this just exaggerates Guido's "it bugs me that the finally
clause's reachability depends on the destructor executing".. but it does
work, in CPython.  It seems to me that this pattern would be painless
enough to use without a syntax change...

-bob

From aahz at pythoncraft.com Fri Apr 22 03:51:21 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Apr 22 03:51:24 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID: <20050422015121.GB18897@panix.com>

On Thu, Apr 21, 2005, Guido van Rossum wrote:
>
> Perhaps the most important lesson we've learned in this thread is that
> the 'with' keyword proposed in PEP 310 is redundant -- the syntax
> could just be
>
>     [VAR '=']* EXPR ':'
>         BODY
>
> IOW the regular assignment / expression statement gets an optional
> colon-plus-suite at the end.

Yes, it could.
The question then becomes whether it should.  Because it's easy to indent
Python code when you're not using a block (consider function calls with
lots of args), my opinion is that like the "optional" colon after ``for``
and ``if``, the resource block *should* have a keyword.
--
Aahz (aahz@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death."  --GvR

From tjreedy at udel.edu Fri Apr 22 04:30:22 2005
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri Apr 22 04:30:52 2005
Subject: [Python-Dev] Re: anonymous blocks
References: Message-ID:

I do not know that I have ever needed 'anonymous blocks', and I have
therefore not followed this discussion in detail, but I appreciate
Python's beauty and want to see it maintained.  So I have three comments
and yet-another syntax proposal that I do not remember seeing (but could
have missed).

1. Python's integration of for loops, iterators, and generators is, to
me, a gem of programming language design that distinguishes Python from
other languages I have used.  Using them to not iterate but to do
something else may be cute, but in a perverted sort of way.  I would
rather have 'something else' done some other way.

2. General-purpose passable block objects with parameters look a lot
like general-purpose anonymous functions ('full lambdas').  I bet they
would be used as such if at all possible.  This seems to me like the
wrong direction.

3. The specific use-cases for Python not handled better by current
syntax seem to be rather specialized: resource management around a
block.

So I cautiously propose:

    with <keyword> <expression>:
        <codeblock>

with the exact semantics dependent on <keyword>.  In particular:

    with lock somelock:
        codeblock

could abbreviate and mean

    somelock.acquire()
    try:
        codeblock
    finally:
        somelock.release()

(Guido's example).
    with file somefile:
        codeblock

might translate to (the bytecode equivalent of)

    if isinstance(somefile, basestring?):
        somefile = open(somefile, defaults)
    codeblock
    somefile.close()

The compound keywords could be 'underscored' but I presume they could be
parsed as is, much like 'not in'.

Terry J. Reedy

From skip at pobox.com Fri Apr 22 05:16:55 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 22 05:17:13 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID: <17000.27815.106381.198125@montanaro.dyndns.org>

    Guido> or perhaps even (making "for VAR" optional in the for-loop syntax)
    Guido> with
    Guido> in synchronized(the_lock):
    Guido> BODY

This could be a new statement, so the problematic issue of implicit
try/finally in every for statement wouldn't be necessary.  That
complication would only be needed for the above form.  (Of course, if
you've dispensed with this I am very likely missing something
fundamental.)

Skip

From steven.bethard at gmail.com Fri Apr 22 05:55:42 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Fri Apr 22 05:55:46 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: Message-ID:

Ka-Ping Yee wrote:
> It seems to me that, in general, Python likes to use keywords for
> statements and operators for expressions.

Probably worth noting that 'for', 'in' and 'if' in generator expressions
and list comprehensions blur this distinction somewhat...

Steve
--
You can wordify anything if you just verb it.
        --- Bucky Katt, Get Fuzzy

From pje at telecommunity.com Fri Apr 22 06:25:22 2005
From: pje at telecommunity.com (Phillip J.
Eby) Date: Fri Apr 22 06:24:36 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <2mu0m05a4y.fsf@starship.python.net> References: <4267B27E.9030107@strakt.com> <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <2m64yg79yl.fsf@starship.python.net> <4267B27E.9030107@strakt.com> Message-ID: <5.1.1.6.0.20050421150027.021c60b0@mail.telecommunity.com> At 06:10 PM 04/21/2005 +0100, Michael Hudson wrote: >But the visitor pattern is pretty grim, really. It would be nice (tm) >to have something like: > > match node in: > Assign(lhs=Var(_), rhs=_): > # lhs, rhs bound in here > Assign(lhs=Subscr(_,_), rhs=_): > # ditto > Assign(lhs=Slice(*_), rhs=_): > # ditto > Assign(lhs=_, rhs=_): > raise SyntaxError > >in Lib/compiler. FWIW, I do intend to add this sort of thing to PyProtocols' predicate dispatch system. Actually, I can dispatch on rules like the above now, it's just that you have to spell out the cases as e.g.: @do_it.when("isinstance(node, Assign) and isinstance(node.lhs, Subscr)") def do_subscript_assign(node, ...): ... I'd like to create a syntax sugar for pattern matching though, that would let you 1) use a less verbose way of saying the same thing, and 2) let you bind the intermediate values to variables that then become accessible in the function body as locals. Anyway, the main holdup on this is deciding what sort of Python syntax abuse should represent variable bindings. :) Maybe something like this will be suitably horrific: @do_it.when("node in Assign.match(lhs=`lhs` in Subscr,rhs=`rhs`)") def do_subscript_assign((lhs,rhs), node, ...): ... But I think maybe here the cure is worse than the disease. :) Pushed this far, it seems to beg for new syntax to accommodate in-expression variable bindings, something like 'var:=value'. Really, though, the problem is probably just that inline variable binding is downright unpythonic. 
The only time Python does anything vaguely similar is with the 'except type,var:' syntax. From bac at OCF.Berkeley.EDU Fri Apr 22 06:26:00 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Fri Apr 22 06:26:06 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4268446C.6010301@ocf.berkeley.edu> Message-ID: <42687CD8.3000204@ocf.berkeley.edu> Guido van Rossum wrote: > [Brett] > >>I think I agree with Samuele that it would be more pertinent to put all of this >>effort into trying to come up with some way to handle cleanup in a generator. > > > I.e. PEP 325. > > But (as I explained, and you agree) that still doesn't render PEP 310 > unnecessary, because abusing the for-loop for implied cleanup > semantics is ugly and expensive, and would change generator semantics; Right, I'm not saying PEP 310 shouldn't also be considered. It just seems like we are beginning to pile a lot on this discussion by bringing in PEP 310 and PEP 325 in at the same time since, as pointed out, there is no guarantee that anything will be called in a generator and thus making PEP 310 work in generators does not seem guaranteed to solve that problem (although I might have missed something; just started really following the thread today). At this point anonymous blocks just don't seem to be happening, at least not like in Ruby. Fine, I didn't want them anyway. Now we are trying to simplify resource cleanup and handling. What I am trying to say is that generators differ just enough as to possibly warrant a separate discussion from all of this other resource handling "stuff". So I am advocating a more focused generator discussion since resource handling in generators is much more difficult than the general case in non-generator situations. I mean obviously in the general case all of this is handled already in Python today with try/finally. 
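[Note: the destructor-dependent behaviour being debated here can be seen
directly in today's CPython, where PEP 342 later made try/finally legal
around a yield. The logging generator below is purely illustrative, not
code from the thread.]

```python
events = []

def reader():
    events.append("open")       # stands in for acquiring a real resource
    try:
        yield 1
        yield 2
    finally:
        events.append("close")  # cleanup that should always run

g = reader()
next(g)        # advances to the first yield; "open" is recorded
del g          # in CPython the refcount hits zero, the generator is
               # finalized, and the finally clause runs immediately
print(events)  # -> ['open', 'close']
```

In CPython the cleanup is prompt only because of reference counting; on a
tracing collector the finally clause could run arbitrarily late, which is
exactly the objection raised above.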
But with generators you have to jump through some extra hoops to get
similar support (passing in anything that needs to be cleaned up, hoping
that garbage collection will eventually handle things, etc.).

> and it bugs me that the finally clause's reachability depends on the
> destructor executing.

Yeah, I don't like it either.  I would rather see something like:

    def gen():
        FILE = open("stuff.txt", 'rU')
        for line in FILE:
            yield line
        cleanup:
            FILE.close()

and have whatever is in the 'cleanup' block be either accessible from a
method in the generator or have it become the equivalent of a __del__
for the generator, or maybe even both (which would remove contention
that whatever needs to be cleaned up is done too late thanks to gc not
guaranteeing immediate cleanup).  This way you get the guaranteed
cleanup regardless and you don't have to worry about creating everything
outside of the generator, passing it in, and then handling cleanup in a
try/finally that contains the next() calls to the generator (or any
other contortion you might have to go through).

Anyway, my random Python suggestion for the day.

-Brett

From bac at OCF.Berkeley.EDU Fri Apr 22 06:28:42 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 06:28:50 2005
Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization)
In-Reply-To: <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>
References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu> <4052ed44a40fa22f767a33e8d73d85fb@redivi.com>
Message-ID: <42687D7A.5010206@ocf.berkeley.edu>

Bob Ippolito wrote:
>
> On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote:
>
>> Guido van Rossum wrote:
>>
>>>
>>> [Brett]
>>>
>>>> I think I agree with Samuele that it would be more pertinent to put
>>>> all of this
>>>> effort into trying to come up with some way to handle cleanup in a
>>>> generator.
>>>
>>>
>>> I.e. PEP 325.
>>> >>> But (as I explained, and you agree) that still doesn't render PEP 310 >>> unnecessary, because abusing the for-loop for implied cleanup >>> semantics is ugly and expensive, and would change generator semantics; >>> and it bugs me that the finally clause's reachability depends on the >>> destructor executing. >> >> >> Yes and no. PEP 325 offers a method to generators that handles cleanup >> if necessary and calls it close(). Obviously calling it close is a >> mistake. Actually, calling it anything is a mistake, and trying to >> combine try/finally handling in generators with __exit__/close (inside >> or outside of generators) is also a mistake. >> >> >> Start by saying, "If a non-finalized generator is garbage collected, it >> will be finalized." Whether this be by an exception or forcing a return, >> so be it. >> >> If this were to happen, we have generator finalization handled by the >> garbage collector, and don't need to translate /any/ for loop. As long >> as the garbage collection requirement is documented, we are covered >> (yay!). > > > Well, for the CPython implementation, couldn't you get away with using > garbage collection to do everything? Maybe I'm missing something.. > [SNIP] Well, if you are missing something then so am I since your suggestion is basically correct. The only issue is that people will want more immediate execution of the cleanup code which gc cannot guarantee. That's why the ability to call a method with the PEP 325 approach gets rid of that worry. 
-Brett From bob at redivi.com Fri Apr 22 06:49:25 2005 From: bob at redivi.com (Bob Ippolito) Date: Fri Apr 22 06:49:36 2005 Subject: [Python-Dev] anonymous blocks (don't combine them with generator finalization) In-Reply-To: <42687D7A.5010206@ocf.berkeley.edu> References: <4268446C.6010301@ocf.berkeley.edu> <20050421173945.63C3.JCARLSON@uci.edu> <4052ed44a40fa22f767a33e8d73d85fb@redivi.com> <42687D7A.5010206@ocf.berkeley.edu> Message-ID: <69746cadca283f5b9c1e76686d2ccb01@redivi.com> On Apr 22, 2005, at 12:28 AM, Brett C. wrote: > Bob Ippolito wrote: >> >> On Apr 21, 2005, at 8:59 PM, Josiah Carlson wrote: >> >>> Guido van Rossum wrote: >>> >>>> >>>> [Brett] >>>> >>>>> I think I agree with Samuele that it would be more pertinent to put >>>>> all of this >>>>> effort into trying to come up with some way to handle cleanup in a >>>>> generator. >>>> >>>> >>>> I.e. PEP 325. >>>> >>>> But (as I explained, and you agree) that still doesn't render PEP >>>> 310 >>>> unnecessary, because abusing the for-loop for implied cleanup >>>> semantics is ugly and expensive, and would change generator >>>> semantics; >>>> and it bugs me that the finally clause's reachability depends on the >>>> destructor executing. >>> >>> >>> Yes and no. PEP 325 offers a method to generators that handles >>> cleanup >>> if necessary and calls it close(). Obviously calling it close is a >>> mistake. Actually, calling it anything is a mistake, and trying to >>> combine try/finally handling in generators with __exit__/close >>> (inside >>> or outside of generators) is also a mistake. >>> >>> >>> Start by saying, "If a non-finalized generator is garbage collected, >>> it >>> will be finalized." Whether this be by an exception or forcing a >>> return, >>> so be it. >>> >>> If this were to happen, we have generator finalization handled by the >>> garbage collector, and don't need to translate /any/ for loop. As >>> long >>> as the garbage collection requirement is documented, we are covered >>> (yay!). 
>>
>> Well, for the CPython implementation, couldn't you get away with using
>> garbage collection to do everything?  Maybe I'm missing something..
>>
>
> [SNIP]
>
> Well, if you are missing something then so am I since your suggestion
> is basically correct.  The only issue is that people will want more
> immediate execution of the cleanup code which gc cannot guarantee.
> That's why the ability to call a method with the PEP 325 approach gets
> rid of that worry.

Well in CPython, if you are never assigning the generator to any local
or global, then you should be guaranteed that it gets cleaned up at the
right time unless it's alive in a traceback somewhere (maybe you WANT it
to be!) or some insane trace hook keeps too many references to frames
around..  It seems *reasonably* certain that for reasonable uses this
solution WILL clean it up optimistically.

-bob

From bac at OCF.Berkeley.EDU Fri Apr 22 06:56:31 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Fri Apr 22 06:56:38 2005
Subject: [Python-Dev] Proper place to put extra args for building
In-Reply-To: <426744DF.2030309@v.loewis.de>
References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de>
Message-ID: <426883FF.5060009@ocf.berkeley.edu>

Martin v. Löwis wrote:
> Brett C. wrote:
>
>> Works for me.  If no one objects I will check in the change for CFLAGS
>> to make it ``$(BASECFLAGS) $(OPT) "$EXTRA_CFLAGS"`` soon (is quoting
>> it enough to make sure that it isn't evaluated by configure but left
>> as a string to be evaluated by the shell when the Makefile is
>> running?).
>
> If you put it into Makefile.pre.in, the only thing to avoid that
> configure evaluates is is not to use @FOO@.
OTOH, putting a $
> in front of it is not good enough for make: $EXTRA_CFLAGS evaluates
> the variable E, and then appends XTRA_CFLAGS.

Yep, you're right.  I initially thought that the parentheses meant it
was a Makefile-only variable, but it actually goes to the environment
for those unknown values.

Before I check it in, though, should setup.py be tweaked to use it as
well?  I say yes.

-Brett

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:45 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:03 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: <42689781.20505@canterbury.ac.nz>

Ka-Ping Yee wrote:
> Can you explain what you meant by currying here?  I know what
> the word "curry" means, but i am having a hard time seeing how
> it applies to your example.

It's currying in the sense that instead of one function which takes all
the args at once, you have a function that takes some of them (all
except the thunk) and returns another one that takes the rest (the
thunk).

> Could you make up an example that uses more arguments?

    def with_file(filename, mode):
        def func(block):
            f = open(filename, mode)
            try:
                block(f)
            finally:
                f.close()
        return func

Usage example:

    with_file("foo.txt", "w") as f:
        f.write("My hovercraft is full of parrots.")

Does that help?

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:48 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:05 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz>
Message-ID: <42689784.80305@canterbury.ac.nz>

Guido van Rossum wrote:
> Perhaps it could be even simpler:
>
>     [assignment_target '=']* expr ':' suite

I don't like that so much.  It looks like you're assigning the result of
expr to assignment_target, and then doing something else.

> This would just be an extension of the regular assignment statement.

Syntactically, yes, but semantically it's more complicated than just a
"simple extension", to my mind.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+

From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:50 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 22 08:20:09 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: References: <1113941921.14525.39.camel@geddy.wooz.org> <79990c6b05042002435ce91e79@mail.gmail.com> <73ceac6fd1ccfa5a342f39cb57c224d9@fuhm.net>
Message-ID: <42689786.7090400@canterbury.ac.nz>

Steven Bethard wrote:
> line = None
> @withfile("readme.txt")
> def print_readme(fileobj):
>     lexical line
>     for line in fileobj:
>         print line
> print "last line:", line

Since the name of the function isn't important, that could be reduced to

    @withfile("readme.txt")
    def _(fileobj):
        ...

(Disclaimer: This post should not be taken as an endorsement of this
abuse!  I'd still much rather have a proper language feature for it.)
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 22 08:19:53 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 22 08:20:13 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: <4265E43C.4080707@ieee.org> <4267259F.6050902@canterbury.ac.nz> Message-ID: <42689789.5020004@canterbury.ac.nz> Brian Sabbey wrote: > I made an example implementation, and this wasn't an issue. It took > some code to stick the thunk into the argument list, but it was pretty > straightforward. What does your implementation do with something like f() + g(): ... ? (A syntax error, I would hope.) While no doubt it can be done, I still don't like the idea very much. It seems like a violation of modularity in the grammar, so to speak. The syntax effectively allowed for the expression is severely limited by the fact that a block follows it, which is a kind of backward effect that violates the predominantly LL-flavour of the rest of the syntax. There's a backward effect in the semantics, too -- you can't properly understand what the otherwise-normal-looking function call is doing without knowing what comes later. An analogy has been made with the insertion of "self" into the arguments of a method. But that is something quite different. In x.meth(y), the rules are being followed quite consistently: the result of x.meth is being called with y (and only y!) as an argument; the insertion of self happens later. But here, insertion of the thunk would occur *before* any call was made at all, with no clue from looking at the call itself. > Requiring arguments other than the block to be dealt with by currying > can lead to problems. I won't claim these problems are serious, but > they will be annoying. 
You have some valid concerns there.  You've given me something to think
about.

Here's another idea.  Don't write the parameters in the form of a call
at all; instead, do this:

    with_file "foo.txt", "w" as f:
        f.write("Spam!")

This would have the benefit of making it look more like a control
structure and less like a funny kind of call.

I can see some problems with that, though.  Juxtaposing two expressions
doesn't really work, because the result can end up looking like a
function call or indexing operation.  I don't want to put a keyword in
between because that would mess up how it reads.  Nor do I want to put
some arbitrary piece of punctuation in there.  The best I can think of
right now is

    with_file {"foo.txt", "w"} as f:
        f.write("Spam!")

> If you google "filetype:rb yield", you can see many of the uses of
> yield in ruby.

I'm sure that use cases can be found, but the pertinent question is
whether a substantial number of those use cases from Ruby fall into the
class of block-uses which aren't covered by other Python facilities.

Also, I have a gut feeling that it's a bad idea to try to provide for
this.  I think the reason is this: We're trying to create something that
feels like a user-defined control structure with a suite, and there's
currently no concept in Python of a suite returning a value to be
consumed by its containing control structure.  It would be something
new, and it would require some mental gymnastics to understand what it
was doing.  We already have "return" and "yield"; this would be a third
similar-yet-different thing.

If it were considered important enough, it could easily be added later,
without disturbing anything.  But I think it's best left out of an
initial specification.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From facundobatista at gmail.com Fri Apr 22 15:30:11 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Fri Apr 22 15:30:16 2005 Subject: [Python-Dev] Caching objects in memory Message-ID: Is there a document that details which objects are cached in memory (to not create the same object multiple times, for performance)? If not, could please somebody point me out where this is implemented for strings? Thank you! . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From mwh at python.net Fri Apr 22 15:50:57 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 22 15:51:01 2005 Subject: [Python-Dev] Caching objects in memory In-Reply-To: (Facundo Batista's message of "Fri, 22 Apr 2005 10:30:11 -0300") References: Message-ID: <2moec6539a.fsf@starship.python.net> Facundo Batista writes: > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? No. > If not, could please somebody point me out where this is implemented > for strings? In PyString_FromStringAndSize and PyString_FromString, it seems to me. Cheers, mwh -- I also feel it essential to note, [...], that Description Logics, non-Monotonic Logics, Default Logics and Circumscription Logics can all collectively go suck a cow. Thank you. -- http://advogato.org/person/Johnath/diary.html?start=4 From fredrik at pythonware.com Fri Apr 22 15:50:20 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Fri Apr 22 15:57:43 2005 Subject: [Python-Dev] Re: Caching objects in memory References: Message-ID: Facundo Batista wrote: > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? why do you think you need to know? > If not, could please somebody point me out where this is implemented > for strings? Objects/stringobject.c (where else? 
;-)

From theller at python.net Fri Apr 22 16:57:26 2005
From: theller at python.net (Thomas Heller)
Date: Fri Apr 22 16:57:34 2005
Subject: [Python-Dev] Error checking in init functions
Message-ID:

I always wondered why there usually is very sloppy error checking in
init functions.  Usually it goes like this (I removed declarations and
some other lines for clarity):

    PyMODINIT_FUNC
    PyInit_zlib(void)
    {
        m = Py_InitModule4("zlib", zlib_methods,
                           zlib_module_documentation,
                           (PyObject*)NULL, PYTHON_API_VERSION);
        ZlibError = PyErr_NewException("zlib.error", NULL, NULL);
        if (ZlibError != NULL) {
            Py_INCREF(ZlibError);
            PyModule_AddObject(m, "error", ZlibError);
        }
        PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS);
        PyModule_AddIntConstant(m, "DEFLATED", DEFLATED);
        ver = PyString_FromString(ZLIB_VERSION);
        if (ver != NULL)
            PyModule_AddObject(m, "ZLIB_VERSION", ver);
        PyModule_AddStringConstant(m, "__version__", "1.0");
    }

Why isn't the result checked in the PyModule_... functions?  Why is the
failure of PyErr_NewException silently ignored?

The problem is that when one of these things fail (although they are
probably supposed to NOT fail) you end up with a module missing
something, without any error message.

What would be the correct thing to do - I assume something like

    if (PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS)) {
        PyErr_Print();
        return;
    }

Thomas

From mwh at python.net Fri Apr 22 17:05:29 2005
From: mwh at python.net (Michael Hudson)
Date: Fri Apr 22 17:05:31 2005
Subject: [Python-Dev] Error checking in init functions
In-Reply-To: (Thomas Heller's message of "Fri, 22 Apr 2005 16:57:26 +0200")
References: Message-ID: <2mk6mu4zt2.fsf@starship.python.net>

Thomas Heller writes:
> I always wondered why there usually is very sloppy error checking in
> init functions.

Laziness, I presume...

> The problem is that when one of these things fail (although they are
> probably supposed to NOT fail) you end up with a module missing
> something, without any error message.

Err.
There's a call to PyErr_Occurred() after the init function is called, so
you should get an error message.  Carrying on regardless after an error
runs the risk that the exception will be cleared, of course.

> What would be the correct thing to do - I assume something like
>
>     if (PyModule_AddIntConstant(m, "MAX_WBITS", MAX_WBITS)) {
>         PyErr_Print();
>         return;
>     }

Just return, I think.

Cheers,
mwh
--
The meaning of "brunch" is as yet undefined.
        -- Simon Booth, ucam.chat

From jimjjewett at gmail.com Fri Apr 22 17:06:16 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 22 17:06:19 2005
Subject: [Python-Dev] Re: switch statement
Message-ID:

Michael Chermside wrote:
> Now the pattern matching is more interesting, but again, I'd need to
> see a proposed syntax for Python before I could begin to consider it.
> If I understand it properly, pattern matching in Haskell relies
> primarily on Haskell's excellent typing system, which is absent in
> Python.

Why not just use classes?  With either mixins or new-style classes, it
is quite reasonable to use many small classes for fine distinctions.

Change

    if predicate1(obj):
        action1(obj)
    elif predicate2(obj):
        action2(obj)
    ...
    else:
        default(obj)

into either

    try:
        obj.action(locals())
    except AttributeError:
        default(obj, locals())

or

    if hasattr(obj, "action"):
        obj.action(locals())
    else:

And then define an action method (perhaps through inheritance from a
mixin) for any object that should not take the default path.  The
object's own methods will have access to any variables used in the match
and locals will have access to the current scope.

If you have at least one class per "switch", you have a switch
statement.

The down sides are that

(1) Your domain objects will have to conform to at least a weak OO model
(or take the default path)

(2) Logic that should be together will be split up.  Either classes will
be modified externally, or the "switch statement" logic will be broken
up between different classes.
If single-method mixins are used to keep the logic close, then real
objects will have to pick an ancestor for what may seem like arbitrary
reasons.

These objections apply to any matching system based on types; the
difference is that other languages have often already paid the price.
For Python it is an incremental cost incurred by the match system.

-jJ

From ncoghlan at gmail.com Fri Apr 22 18:41:58 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 22 18:42:19 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <17000.27815.106381.198125@montanaro.dyndns.org>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
Message-ID: <42692956.5070305@gmail.com>

Skip Montanaro wrote:
>     Guido> or perhaps even (making "for VAR" optional in the for-loop syntax)
>     Guido> with
>     Guido> in synchronized(the_lock):
>     Guido> BODY
>
> This could be a new statement, so the problematic issue of implicit
> try/finally in every for statement wouldn't be necessary.  That complication
> would only be needed for the above form.

s/in/with/ to get PEP 310.

A parallel which has been bugging me is the existence of the iterator
protocol (__iter__, next()) which you can implement manually if you
want, and the existence of generators, which provide a nice clean way of
writing iterators as functions.  I'm wondering if something similar
can't be found for the __enter__/__exit__ resource protocol.

Guido's recent screed crystallised the idea of writing resources as
two-part generators:

    def my_resource():
        print "Hi!"   # Do entrance code
        yield None    # Go on with the contents of the 'with' block
        print "Bye!"  # Do exit code

Giving the internal generator object an enter method that calls
self.next() (expecting None to be returned), and an exit method that
does the same (but expects StopIteration to be raised) should suffice to
make this possible with a PEP 310 style syntax.
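[Note: Nick's idea is recognisably the ancestor of what became
contextlib.contextmanager. A sketch of the wrapper he describes, using
today's with statement; the Resource class name is invented here, and
unlike the real contextlib no exception is injected into the generator.]

```python
class Resource(object):
    """Drive a two-part generator via the __enter__/__exit__ protocol."""
    def __init__(self, gen):
        self.gen = gen
    def __enter__(self):
        return next(self.gen)  # run the entrance code up to the yield
    def __exit__(self, exc_type, exc, tb):
        try:
            next(self.gen)     # run the exit code ...
        except StopIteration:
            pass               # ... and expect the generator to finish
        return False           # never swallow exceptions

log = []

def my_resource():
    log.append("Hi!")   # Do entrance code
    yield None          # the body of the block runs here
    log.append("Bye!")  # Do exit code

with Resource(my_resource()):
    log.append("body")

print(log)  # -> ['Hi!', 'body', 'Bye!']
```

The enter/exit pair maps one-to-one onto the two halves of the generator,
which is exactly the iterator-protocol/generator parallel drawn above.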
Interestingly, with this approach, "for dummy in my_resource()" would still wrap the block of code in the entrance/exit code (because my_resource *is* a generator), but it wouldn't get the try/finally semantics. An alternative would be to replace the 'yield None' with a 'break' or 'continue', and create an object which supports the resource protocol and NOT the iterator protocol. Something like: def my_resource(): print "Hi!" # Do entrance code continue # Go on with the contents of the 'with' block print "Bye!" # Do exit code (This is currently a SyntaxError, so it isn't ambiguous in any way) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From python at rcn.com Thu Apr 21 19:01:03 2005 From: python at rcn.com (Raymond Hettinger) Date: Fri Apr 22 19:01:12 2005 Subject: [Python-Dev] Caching objects in memory In-Reply-To: Message-ID: <000001c54693$b3bd6d80$ccb72c81@oemcomputer> [Facundo Batista] > Is there a document that details which objects are cached in memory > (to not create the same object multiple times, for performance)? 
The caches get cleaned up before Python exits, so you can find them all
listed together in the code in Python/pythonrun.c:

    /* Sundry finalizers */
    PyMethod_Fini();
    PyFrame_Fini();
    PyCFunction_Fini();
    PyTuple_Fini();
    PyList_Fini();
    PyString_Fini();
    PyInt_Fini();
    PyFloat_Fini();

#ifdef Py_USING_UNICODE
    /* Cleanup Unicode implementation */
    _PyUnicode_Fini();
#endif

Raymond Hettinger

From shane at hathawaymix.org Fri Apr 22 19:11:25 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Fri Apr 22 19:16:30 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <42692956.5070305@gmail.com>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com>
Message-ID: <4269303D.6030205@hathawaymix.org>

Nick Coghlan wrote:
> An alternative would be to replace the 'yield None' with a 'break' or
> 'continue', and create an object which supports the resource protocol
> and NOT the iterator protocol. Something like:
>
> def my_resource():
>     print "Hi!"  # Do entrance code
>     continue     # Go on with the contents of the 'with' block
>     print "Bye!" # Do exit code
>
> (This is currently a SyntaxError, so it isn't ambiguous in any way)

That's a very interesting suggestion. I've been lurking, thinking about a
way to use something like PEP 310 to help manage database transactions.
Here is some typical code that changes something under transaction
control:

    begin_transaction()
    try:
        changestuff()
        changemorestuff()
    except:
        abort_transaction()
        raise
    else:
        commit_transaction()

There's a lot of boilerplate code there.
Using your suggestion, I could write that something like this:

    def transaction():
        begin_transaction()
        try:
            continue
        except:
            abort_transaction()
            raise
        else:
            commit_transaction()

    with transaction():
        changestuff()
        changemorestuff()

Shane

From reinhold-birkenfeld-nospam at wolke7.net Fri Apr 22 19:20:46 2005
From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld)
Date: Fri Apr 22 19:23:31 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <42692956.5070305@gmail.com>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com>
Message-ID: 

Nick Coghlan wrote:
> Interestingly, with this approach, "for dummy in my_resource()" would
> still wrap the block of code in the entrance/exit code (because
> my_resource *is* a generator), but it wouldn't get the try/finally
> semantics.
>
> An alternative would be to replace the 'yield None' with a 'break' or
> 'continue', and create an object which supports the resource protocol
> and NOT the iterator protocol. Something like:
>
> def my_resource():
>     print "Hi!"  # Do entrance code
>     continue     # Go on with the contents of the 'with' block
>     print "Bye!" # Do exit code
>
> (This is currently a SyntaxError, so it isn't ambiguous in any way)

Oh, it is ambiguous, as soon as you insert a for/while statement in your
resource function and want to call continue in there.

Other than that, it's very neat. Maybe "yield" alone (which is always a
SyntaxError) could be used.

Reinhold

-- 
Mail address is perfectly valid!

From jimjjewett at gmail.com Fri Apr 22 23:36:19 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 22 23:36:22 2005
Subject: [Python-Dev] defmacro (was: Anonymous blocks)
Message-ID: 

As best I can tell, the anonymous blocks are used to take care of
boilerplate code without changing the scope -- exactly what macros are
used for. The only difference I see is that in this case, the macros are
limited to entire (possibly compound) statements.
To make this more concrete, Guido: >> in synchronized(the_lock): >> BODY Nick Coghlan: > s/in/with/ to get PEP 310. ... >Guido's recent screed crystallised the idea of writing resources > as two-part generators: ... [Adding Reinhold Birkenfeld's suggestion of a blank yield] > def my_resource(): > print "Hi!" # Do entrance code > yield # Go on with the contents of the 'with' block > print "Bye!" # Do exit code The macro itself looks reasonable -- so long as there is only ever one changing block inside the macro. I'm not sure that is a reasonable restriction, but the alternative is ugly enough that maybe passing around locals() starts to be just as good. What about a block that indicates the enclosed namespaces will collapse a level? defmacro myresource(filename): with myresource("thefile"): def reader(): ... def writer(): ... def fn(): .... Then myresource, reader, writer, and fn would share a namespace without having to manually pass it around. -jJ From martin at v.loewis.de Sat Apr 23 00:14:44 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat Apr 23 00:14:47 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <426883FF.5060009@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> Message-ID: <42697754.1000707@v.loewis.de> Brett C. wrote: > Yep, you're right. I initially thought that the parentheses meant it was a > Makefile-only variable, but it actually goes to the environment for those > unknown values. > > Before I check it in, though, should setup.py be tweaked to use it as well? I > say yes. You means sysconfig.py, right? Probably yes. This is a mess. distutils should just do what Makefile does for builtin modules, i.e. 
use CFLAGS from the Makefile. Instead, it supports CFLAGS as being
additive to the Makefile value CFLAGS, which in turn it just *knows* is
$(BASECFLAGS) $(OPT).

Regards,
Martin

From ncoghlan at gmail.com Sat Apr 23 01:48:40 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat Apr 23 01:49:01 2005
Subject: [Python-Dev] anonymous blocks
In-Reply-To: <4269303D.6030205@hathawaymix.org>
References: <17000.27815.106381.198125@montanaro.dyndns.org>
	<42692956.5070305@gmail.com> <4269303D.6030205@hathawaymix.org>
Message-ID: <42698D58.4090902@gmail.com>

Shane Hathaway wrote:
> There's a lot of boilerplate code there. Using your suggestion, I could
> write that something like this:
>
> def transaction():
>     begin_transaction()
>     try:
>         continue
>     except:
>         abort_transaction()
>         raise
>     else:
>         commit_transaction()
>
> with transaction():
>     changestuff()
>     changemorestuff()

For that to work, the behaviour would need to differ slightly from what I
envisioned (which was that the 'continue' would be behaviourally
equivalent to a 'yield None'). Alternatively, something equivalent to the
above could be written as:

    def transaction():
        begin_transaction()
        continue
        ex = sys.exc_info()
        if ex[0] is not None:
            abort_transaction()
        else:
            commit_transaction()

Note that you could do this with a normal resource, too:

    class transaction(object):
        def __enter__(self):
            begin_transaction()
        def __exit__(self):
            ex = sys.exc_info()
            if ex[0] is not None:
                abort_transaction()
            else:
                commit_transaction()

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From hpk at trillke.net Sat Apr 23 01:51:12 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 01:51:15 2005
Subject: [Python-Dev] PEP 310 and exceptions
Message-ID: <20050422235112.GK22996@solar.trillke.net>

Hi all,

probably unsurprisingly i am still pondering the idea of having an
optional __except__ hook on block handlers.
The PEP says this about this: An extension to the protocol to include an optional __except__ handler, which is called when an exception is raised, and which can handle or re-raise the exception, has been suggested. It is not at all clear that the semantics of this extension can be made precise and understandable. For example, should the equivalent code be try ... except ... else if an exception handler is defined, and try ... finally if not? How can this be determined at compile time, in general? In fact, i think the translation even to python code is not that tricky: x = X(): ... basically translates to: if hasattr(x, '__enter__'): x.__enter__() try: ... except: if hasattr(x, '__except__'): x.__except__(...) else: x.__exit__() else: x.__exit__() this is the original definition from the PEP with the added except clause. Handlers are free to call 'self.__exit__()' from the except clause. I don't think that anything needs to be determined at compile time. (the above can probably be optimized at the bytecode level but that is a side issue). Moreover, i think that there are more than the "transactional" use cases mentioned in the PEP. For example, a handler may want to log exceptions to some tracing utility or it may want to swallow certain exceptions when its block does IO operations that are ok to fail. cheers, holger From jcarlson at uci.edu Sat Apr 23 04:03:20 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sat Apr 23 04:05:00 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <20050422190222.63D2.JCARLSON@uci.edu> hpk@trillke.net (holger krekel) wrote: > basically translates to: > > if hasattr(x, '__enter__'): > x.__enter__() > try: > ... > except: > if hasattr(x, '__except__'): x.__except__(...) > else: x.__exit__() > else: > x.__exit__() Nope... >>> def foo(): ... try: ... print 1 ... return ... except: ... print 2 ... else: ... 
print 3 ... >>> foo() 1 >>> - Josiah From aleaxit at yahoo.com Sat Apr 23 05:15:10 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Apr 23 05:15:15 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <1acac02fe434d6433ad197731c43db1b@yahoo.com> On Apr 22, 2005, at 16:51, holger krekel wrote: > Moreover, i think that there are more than the "transactional" > use cases mentioned in the PEP. For example, a handler > may want to log exceptions to some tracing utility > or it may want to swallow certain exceptions when > its block does IO operations that are ok to fail. I entirely agree! In fact, I was discussing this very issue recently with colleagues at Google, most of them well acquainted with Python but not all of them Python enthusiasts, and I was surprised to see unanimity on how PEP 310 *with* __except__ would be a huge step up in usefulness wrt the simple __enter__/__exit__ model, which is roughly equivalent in power to the C++ approach (destructors of auto variables) whose absence from Python and Java some people were bemoaning (which is how the whole discussion got started...). The use cases appear to be aleph-0 or more...;-). Essentially, think of it of encapsulating into reusable forms many common patterns of try/except use, much like iterators/generators can encapsulate looping and recursive constructs, and a new vista of uses open up... Imagine that in two or three places in your code you see something like... try: ...different blocks here... except FooError, foo: # some FooError cases need whizbang resetting before they propagate if foo.wobble > FOOBAR_RESET_THRESHOLD: whizbang.reset_all() raise With PEP 310 and __except__, this would become: with foohandler: ...whatever block.. in each and every otherwise-duplicated-logic case... now THAT is progress!!! IOW, +1 ... ! 
Alex From ncoghlan at gmail.com Sat Apr 23 05:26:06 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 23 05:26:43 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050422235112.GK22996@solar.trillke.net> References: <20050422235112.GK22996@solar.trillke.net> Message-ID: <4269C04E.5040108@gmail.com> holger krekel wrote: > Moreover, i think that there are more than the "transactional" > use cases mentioned in the PEP. For example, a handler > may want to log exceptions to some tracing utility > or it may want to swallow certain exceptions when > its block does IO operations that are ok to fail. With the current PEP 310 definition, these can be manually handled using sys.exc_info() in the __exit__ method. Cleaning up my earlier transaction handler example: class transaction(object): def __enter__(self): begin_transaction() def __exit__(self): ex = sys.exc_info() if ex[0] is not None: abort_transaction() else: commit_transaction() Alternately, PEP 310 could be defined as equivalent to: if hasattr(x, '__enter__'): x.__enter__() try: try: ... except: if hasattr(x, '__except__'): x.__except__(*sys.exc_info()) else: raise finally: x.__exit__() Then the transaction handler would look like: class transaction(object): def __enter__(self): self.aborted = False begin_transaction() def __except__(self, *exc_info): self.aborted = True abort_transaction() def __exit__(self): if not self.aborted: commit_transaction() Cheers, Nick. 
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Sat Apr 23 05:41:57 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 23 05:42:03 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C04E.5040108@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <4269C405.1050008@gmail.com> Nick Coghlan wrote: > Alternately, PEP 310 could be defined as equivalent to: > > if hasattr(x, '__enter__'): > x.__enter__() > try: > try: > ... > except: > if hasattr(x, '__except__'): > x.__except__(*sys.exc_info()) > else: > raise > finally: > x.__exit__() > In light of Alex's comments, I'd actually like to suggest the below as a potential new definition for PEP 310 (making __exit__ optional, and adding an __else__ handler): if hasattr(x, '__enter__'): x.__enter__() try: try: # Contents of 'with' block except: if hasattr(x, '__except__'): if not x.__except__(*sys.exc_info()): # [1] raise else: raise else: if hasattr(x, '__else__'): x.__else__() finally: if hasattr(x, '__exit__'): x.__exit__() [1] A possible tweak to this line would be to have it swallow the exception by default (by removing the conditional reraise). I'd prefer to make the silencing of the exception explicit, by returning 'True' from the exception handling, and have 'falling off the end' of the exception handler cause the exception to propagate. Whichever way that point goes, this definition would allow PEP 310 to handle Alex's example of factoring out standardised exception handling, as well as the original use case of resource cleanup, and the transaction handling: class transaction(object): def __enter__(self): begin_transaction() def __except__(self, *exc_info): abort_transaction() def __else__(self): commit_transaction() Cheers, Nick. 
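Nick's proposed expansion can be exercised directly by transcribing it into a helper function. Everything here is a sketch: `run_with` and the `trace` list are illustrative, and the begin/abort/commit calls are replaced by trace appends so the sequencing is visible.

```python
import sys

def run_with(x, block):
    # Direct transcription of the expansion proposed above (sketch).
    if hasattr(x, '__enter__'):
        x.__enter__()
    try:
        try:
            block()
        except:
            if hasattr(x, '__except__'):
                if not x.__except__(*sys.exc_info()):  # [1]
                    raise
            else:
                raise
        else:
            if hasattr(x, '__else__'):
                x.__else__()
    finally:
        if hasattr(x, '__exit__'):
            x.__exit__()

trace = []

class transaction(object):
    def __enter__(self):
        trace.append('begin')
    def __except__(self, *exc_info):
        trace.append('abort')
        return True  # swallow the exception, just for this demo
    def __else__(self):
        trace.append('commit')

run_with(transaction(), lambda: trace.append('work'))
run_with(transaction(), lambda: 1 / 0)
print(trace)  # ['begin', 'work', 'commit', 'begin', 'abort']
```

The success path runs `__else__`, the failing path runs `__except__`, and neither handler is needed for the other's job, which is the separation the proposal is arguing for.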
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From firemoth at gmail.com Sat Apr 23 05:42:49 2005 From: firemoth at gmail.com (Timothy Fitz) Date: Sat Apr 23 05:42:52 2005 Subject: [Python-Dev] anonymous blocks In-Reply-To: References: Message-ID: <972ec5bd0504222042700b6f42@mail.gmail.com> On 4/21/05, Guido van Rossum wrote: > for dummy in synchronized(the_lock): > BODY > > or perhaps even (making "for VAR" optional in the for-loop syntax) > with > > in synchronized(the_lock): > BODY > > Then synchronized() could be written cleanly as follows: > > def synchronized(lock): > lock.acquire() > try: > yield None > finally: > lock.release() How is this different from: def synchronized(lock): def synch_fn(block): lock.acquire() try: block() finally: lock.release() return synch_fn @synchronized def foo(): BLOCK True, it's non-obvious that foo is being immediately executed, but regardless I like the way synchronized is defined, and doesn't use yield (which in my opinion is a non-obvious solution) From bac at OCF.Berkeley.EDU Sat Apr 23 06:12:42 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Sat Apr 23 06:12:48 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <42697754.1000707@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> Message-ID: <4269CB3A.40306@ocf.berkeley.edu> Martin v. L?wis wrote: > Brett C. wrote: > >>Yep, you're right. I initially thought that the parentheses meant it was a >>Makefile-only variable, but it actually goes to the environment for those >>unknown values. 
>> >>Before I check it in, though, should setup.py be tweaked to use it as well? I >>say yes. > > > You means sysconfig.py, right? No, I mean Python's setup.py; line 174. > Probably yes. > You mean Distutils' sysconfig, right? I can change that as well if you want. -Brett From ilya at bluefir.net Sat Apr 23 06:23:20 2005 From: ilya at bluefir.net (Ilya Sandler) Date: Sat Apr 23 06:24:08 2005 Subject: [Python-Dev] a few SF bugs which can (probably) be closed Message-ID: Good morning/evening/: Here a few sourceforge bugs which can probably be closed: [ 1168983 ] : ftplib.py string index out of range Original poster reports that the problem disappeared after a patch committed by Raymond [ 1178863 ] Variable.__init__ uses self.set(), blocking specialization seems like a dup of 1178872 [ 415492 ] Compiler generates relative filenames seems to have been fixed at some point. I could not reproduce it with python2.4 [ 751612 ] smtplib crashes Windows Kernal. Seems like an obvious Windows bug (not python's bug) and seems to be unreproducible Ilya From martin at v.loewis.de Sat Apr 23 09:28:25 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat Apr 23 09:28:28 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4269CB3A.40306@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> Message-ID: <4269F919.6070901@v.loewis.de> Brett C. wrote: >>You means sysconfig.py, right? Right. > No, I mean Python's setup.py; line 174. Ah, ok. > You mean Distutils' sysconfig, right? I can change that as well if you want. Please do; otherwise, people might see strange effects. 
Regards,
Martin

From hpk at trillke.net Sat Apr 23 10:10:41 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 10:10:43 2005
Subject: [Python-Dev] PEP 310 and exceptions
In-Reply-To: <20050422190222.63D2.JCARLSON@uci.edu>
References: <20050422235112.GK22996@solar.trillke.net>
	<20050422190222.63D2.JCARLSON@uci.edu>
Message-ID: <20050423081041.GA30548@solar.trillke.net>

On Fri, Apr 22, 2005 at 19:03 -0700, Josiah Carlson wrote:
> hpk@trillke.net (holger krekel) wrote:
> > basically translates to:
> >
> > if hasattr(x, '__enter__'):
> >     x.__enter__()
> > try:
> >     ...
> > except:
> >     if hasattr(x, '__except__'): x.__except__(...)
> >     else: x.__exit__()
> > else:
> >     x.__exit__()
>
> Nope...
>
> >>> def foo():
> ...     try:
> ...         print 1
> ...         return
> ...     except:
> ...         print 2
> ...     else:
> ...         print 3
> ...
> >>> foo()
> 1
> >>>

doh! of course, you are right. So it indeed better translates to a nested
try-finally/try-except when transformed to python code. Nick Coghlan
points at the correct ideas below in this thread.

At the time i was implementing things by modifying ceval.c rather than by
just a compiling addition, i have to admit.
cheers, holger From aahz at pythoncraft.com Sat Apr 23 15:50:02 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Apr 23 15:50:17 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C405.1050008@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> Message-ID: <20050423135002.GA17909@panix.com> On Sat, Apr 23, 2005, Nick Coghlan wrote: > > In light of Alex's comments, I'd actually like to suggest the below as a > potential new definition for PEP 310 (making __exit__ optional, and adding > an __else__ handler): > > if hasattr(x, '__enter__'): > x.__enter__() > try: > try: > # Contents of 'with' block > except: > if hasattr(x, '__except__'): > if not x.__except__(*sys.exc_info()): # [1] > raise > else: > raise > else: > if hasattr(x, '__else__'): > x.__else__() > finally: > if hasattr(x, '__exit__'): > x.__exit__() +1, but prior to reading this post I was thinking along similar lines with your __exit__ named __finally__ and your __else__ named __exit__. My reasoning for that is that most of the time, people want their exit condition aborted if an exception is raised; having the "normal" exit routine called __else__ would be confusing except to people who do lots of exception handling. (I'm a bit sensitive to that right now; this week I wasted an hour because I didn't understand exceptions as well as I thought I did, although it was related more to the precise mechanics of raising and catching exceptions. Perhaps I'll submit a doc bug; I didn't find this explained in _Learning Python_ or Nutshell...) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
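The generator formulation this thread keeps circling eventually shipped in Python 2.5 as `contextlib.contextmanager` (PEP 343). A sketch of the transaction pattern in that style, with appends to an `events` list standing in for a real transaction API:

```python
from contextlib import contextmanager

events = []

@contextmanager
def transaction():
    # begin/commit/abort are stand-ins for a real transaction API
    events.append("begin")
    try:
        yield  # the body of the 'with' block runs here
    except Exception:
        events.append("abort")
        raise
    else:
        events.append("commit")

with transaction():
    events.append("work")

try:
    with transaction():
        raise KeyError("oops")
except KeyError:
    pass

print(events)  # ['begin', 'work', 'commit', 'begin', 'abort']
```

Exceptions raised in the block are thrown into the generator at the `yield`, so the ordinary try/except/else inside the generator plays the role that `__except__` and the "normal exit" handler play in the proposals above.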
From hpk at trillke.net Sat Apr 23 18:06:49 2005
From: hpk at trillke.net (holger krekel)
Date: Sat Apr 23 18:06:52 2005
Subject: __except__ use cases (was: Re: [Python-Dev] PEP 310 and exceptions)
In-Reply-To: <4269C405.1050008@gmail.com>
References: <20050422235112.GK22996@solar.trillke.net>
	<4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com>
Message-ID: <20050423160649.GC30548@solar.trillke.net>

On Sat, Apr 23, 2005 at 13:41 +1000, Nick Coghlan wrote:
> Nick Coghlan wrote:
> In light of Alex's comments, I'd actually like to suggest the below as a
> potential new definition for PEP 310 (making __exit__ optional, and adding
> an __else__ handler):
>
>     if hasattr(x, '__enter__'):
>         x.__enter__()
>     try:
>         try:
>             # Contents of 'with' block
>         except:
>             if hasattr(x, '__except__'):
>                 if not x.__except__(*sys.exc_info()): # [1]
>                     raise

On a side note, I don't see too much point in having __except__ return
something when it is otherwise easy to say:

    def __except__(self, typ, val, tb):
        self.abort_transaction()
        raise typ, val, tb

But actually i'd like to mention some other than transaction-use cases
for __except__, for example with

    class MyObject:
        def __except__(self, typ, val, tb):
            if isinstance(val, KeyboardInterrupt):
                raise
            # process exception and swallow it

you can use it like:

    x = MyObject():
        # do some long running stuff

and MyObject() can handle internal problems appropriately and present
clean Exceptions to the outside without changing the "calling side".

With my implementation i also played with little things like:

    def __getattr__(self, name):
        Key2AttributeError:
            return self._cache[key]
        ...

with an obvious __except__() implementation for Key2AttributeError.

Similar to what Alex points out, i generally think that being able to
define API/object specific exception handling in *one* place is a great
thing.

I am willing to help with the PEP and implementation, btw.
cheers, holger From pje at telecommunity.com Sat Apr 23 19:50:14 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Apr 23 19:46:06 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C405.1050008@gmail.com> References: <4269C04E.5040108@gmail.com> <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <5.1.1.6.0.20050423134807.03ab79d0@mail.telecommunity.com> At 01:41 PM 4/23/05 +1000, Nick Coghlan wrote: >Whichever way that point goes, this definition would allow PEP 310 to >handle Alex's example of factoring out standardised exception handling, as >well as the original use case of resource cleanup, and the transaction >handling: > >class transaction(object): > def __enter__(self): > begin_transaction() > > def __except__(self, *exc_info): > abort_transaction() > > def __else__(self): > commit_transaction() I'd like to suggest '__success__' in place of '__else__' and '__before__'/'__after__' instead of '__enter__'/'__exit__', if you do take this approach, so that what they do is a bit more obvious. From bh at intevation.de Sat Apr 23 19:59:29 2005 From: bh at intevation.de (Bernhard Herzog) Date: Sat Apr 23 19:59:53 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <4269C04E.5040108@gmail.com> (Nick Coghlan's message of "Sat, 23 Apr 2005 13:26:06 +1000") References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: Nick Coghlan writes: > holger krekel wrote: >> Moreover, i think that there are more than the "transactional" >> use cases mentioned in the PEP. For example, a handler may want to >> log exceptions to some tracing utility or it may want to swallow >> certain exceptions when >> its block does IO operations that are ok to fail. > > With the current PEP 310 definition, these can be manually handled using > sys.exc_info() in the __exit__ method. With the proposed implementation of PEP 310 rev. 1.5 it wouldn't work. 
sys.exc_info returns a tuple of Nones unless an except: clause has been entered. Either sys.exc_info() would have to be changed to always return exception information after an exception has been raised or the implementation would have to be changed to do the equivalent of e.g. if hasattr(var, "__enter__"): var.__enter__() try: try: suite except: pass finally: var.__exit__() An empty except: suite suffices. In C that's equivalent to a call to PyErr_NormalizeException AFAICT. Bernhard -- Intevation GmbH http://intevation.de/ Skencil http://skencil.org/ Thuban http://thuban.intevation.org/ From ncoghlan at gmail.com Sun Apr 24 03:22:17 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 03:22:22 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> Message-ID: <426AF4C9.6000008@gmail.com> Bernhard Herzog wrote: > With the proposed implementation of PEP 310 rev. 1.5 it wouldn't work. > sys.exc_info returns a tuple of Nones unless an except: clause has been > entered. Either sys.exc_info() would have to be changed to always > return exception information after an exception has been raised or the > implementation would have to be changed to do the equivalent of e.g. Interesting. Although the 'null' except block should probably be a bare 'raise', rather than a 'pass': Py> try: ... try: ... raise TypeError("I'm an error!") ... except: ... raise ... finally: ... print sys.exc_info() ... (, , ) Traceback (most recent call last): File "", line 3, in ? TypeError: I'm an error! All the more reason to consider switching to a nested try/finally + try/except/else definition for 'with' blocks, I guess. Cheers, Nick. 
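Bernhard's observation still holds in modern Python: `sys.exc_info()` only reports an exception while an except block is actually handling it, and returns a tuple of Nones otherwise. A small sketch:

```python
import sys

def exc_type_while_handling():
    try:
        raise TypeError("I'm an error!")
    except TypeError:
        # Inside the except block, the exception is "being handled"
        # and exc_info() reports it.
        return sys.exc_info()[0]

print(exc_type_while_handling())  # <class 'TypeError'>

# Once handling is over, exc_info() reverts to a tuple of Nones.
outside = sys.exc_info()
print(outside)
```

This is why an expansion that wants exception details in its exit hook has to either enter an except clause first (even a bare re-raise) or pass the exception information along explicitly.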
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Sun Apr 24 03:58:45 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 03:58:50 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <20050423135002.GA17909@panix.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> Message-ID: <426AFD55.4020804@gmail.com> Aahz wrote: > On Sat, Apr 23, 2005, Nick Coghlan wrote: > >>In light of Alex's comments, I'd actually like to suggest the below as a >>potential new definition for PEP 310 (making __exit__ optional, and adding >>an __else__ handler): >> >> if hasattr(x, '__enter__'): >> x.__enter__() >> try: >> try: >> # Contents of 'with' block >> except: >> if hasattr(x, '__except__'): >> if not x.__except__(*sys.exc_info()): # [1] >> raise >> else: >> raise >> else: >> if hasattr(x, '__else__'): >> x.__else__() >> finally: >> if hasattr(x, '__exit__'): >> x.__exit__() > > > +1, but prior to reading this post I was thinking along similar lines > with your __exit__ named __finally__ and your __else__ named __exit__. > My reasoning for that is that most of the time, people want their exit > condition aborted if an exception is raised; having the "normal" exit > routine called __else__ would be confusing except to people who do lots > of exception handling. In the original motivating use cases (file handles, synchronisation objects), the resource release is desired unconditionally. The aim is to achieve something similar to C++ scope-delimited objects (which release their resources unconditionally as the scope is exited). This parallel is also probably the source of the names of the two basic functions ('enter'ing the contained block, 'exit'ing the contained block). 
So, I think try/finally is the right semantics for the basic
__enter__/__exit__ use case (consider that PEP 310 is seen as possibly
worthwhile with *only* these semantics!).

For error logging type use cases, only the exception handling is
required.

The issue of a 'no exception raised' handler only comes up for cases like
transactions, where the commit operation is conditional on no exception
being triggered. I understand you agree that, for those cases, the best
spot to call the handler is an else clause on the inner try/except block.
That way, it is skipped by default if an exception goes off, but the
exception handling method can still invoke the method directly if desired
(e.g. an exception is determined to be 'harmless').

However, I do agree with you that the use of '__else__' as a name is
exposing too much of the underlying implementation (i.e. you need to
understand the implementation for the name to make sense). I think
renaming '__exit__' to '__finally__' would be a similar error, though.

Which means finding a different name for '__else__'. Two possibilities
that occur to me are '__ok__' or '__no_except__'. The latter makes a fair
amount of sense, since I can't think of a way to refer to the thing other
than as a 'no exception' handler.

Cheers,
Nick.

P.S.
I'm ignoring my housemate's suggestion of '__accept__' for the
no-exception handler :)

-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From ncoghlan at gmail.com Sun Apr 24 04:40:04 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun Apr 24 04:40:11 2005
Subject: [Python-Dev] Re: __except__ use cases
In-Reply-To: <20050423160649.GC30548@solar.trillke.net>
References: <20050422235112.GK22996@solar.trillke.net>
	<4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com>
	<20050423160649.GC30548@solar.trillke.net>
Message-ID: <426B0704.9050901@gmail.com>

holger krekel wrote:
> On a side note, I don't see too much point in having __except__
> return something when it is otherwise easy to say:
>
> def __except__(self, typ, val, tb):
>     self.abort_transaction()
>     raise typ, val, tb

It has to do with "Errors should never pass silently, unless explicitly
silenced". Consider:

    def __except__(self, typ, val, tb):
        self.abort_transaction()

With __except__ returning a value, the implicit 'return None' means that
the exception is propagated by default. Without the 'suppress exception'
boolean return value, this naive handler would not only abort the
transaction, but swallow each and every exception that occurred inside
the 'with' block. Another common error with a manual reraise would
involve not including the traceback properly, leading to difficulties
with debugging.

IOW, returning a value from __except__ should make the exception handlers
cleaner, and easier to 'do right' (since reraising simply means returning
a value that evaluates to False, or falling off the end of the function).
Suppressing the exception would require actively adding 'return True' to
the end of the handler.
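The return-value design Nick argues for here is the one PEP 343 ultimately adopted: `__exit__` (which absorbed the role of `__except__`) receives the exception info and suppresses the exception only by explicitly returning a true value. A sketch in modern Python:

```python
class SwallowKeyError:
    # Sketch of the "errors pass only when explicitly silenced" design:
    # a true return value from the exit hook silences the exception,
    # while the implicit None return propagates it.
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        return exc_type is not None and issubclass(exc_type, KeyError)

with SwallowKeyError():
    {}["missing"]      # KeyError raised here is silenced
print("reached")       # prints: reached
```

Any other exception type makes `__exit__` return False and therefore propagates, so a handler that forgets to return anything fails safe, which was exactly the point of the boolean protocol.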
> But actually i'd like to mention some other than
> transaction-use cases for __except__, for example with
>
>     class MyObject:
>         def __except__(self, typ, val, tb):
>             if isinstance(val, KeyboardInterrupt):
>                 raise
>             # process exception and swallow it

s/raise/return True/ for the return value version.

>     def __getattr__(self, name):
>         Key2AttributeError:
>             return self._cache[key]
>     ...
>
> with an obvious __except__() implementation for
> Key2AttributeError.

Seeing this example has convinced me of something. PEP 310 should use the 'with' keyword, and 'expression block' syntax should be used to denote the 'default object' semantics proposed for Python 3K. For example:

    class Key2AttributeError(object):
        def __init__(self, obj, attr):
            self:
                .obj_type = type(obj)
                .attr = attr
        def __except__(self, ex_type, ex_val, ex_tb):
            if isinstance(ex_type, KeyError):
                self:
                    raise AttributeError("%s instance has no attribute %s"
                                         % (.obj_type, .attr))

    # Somewhere else. . .
    def __getattr__(self, name):
        with Key2AttributeError(self, key):
            self:
                return ._cache[key]

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From shane at hathawaymix.org Sun Apr 24 06:07:37 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Sun Apr 24 06:07:40 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426AFD55.4020804@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> <426AFD55.4020804@gmail.com> Message-ID: <426B1B89.6040700@hathawaymix.org> Nick Coghlan wrote:
> Which means finding a different name for '__else__'. Two possibilities
> that occur to me are '__ok__' or '__no_except__'. The latter makes a
> fair amount of sense, since I can't think of a way to refer to the thing
> other than as a 'no exception' handler.
While we're on the subject of block handler method names, do the method names need four underscores? 'enter' and 'exit' look better than '__enter__' and '__exit__'. Shane From ncoghlan at gmail.com Sun Apr 24 08:42:18 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun Apr 24 08:48:19 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426B1B89.6040700@hathawaymix.org> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423135002.GA17909@panix.com> <426AFD55.4020804@gmail.com> <426B1B89.6040700@hathawaymix.org> Message-ID: <426B3FCA.2040605@gmail.com> Shane Hathaway wrote: > Nick Coghlan wrote: > >> Which means finding a different name for '__else__'. Two possibilities that >> occur to me are '__ok__' or '__no_except__'. The latter makes a fair >> amount of sense, since I can't think of a way to refer to the thing other >> than as a 'no exception' handler. > > > While we're on the subject of block handler method names, do the method names > need four underscores? 'enter' and 'exit' look better than '__enter__' and > '__exit__'. It's traditional for slots (or pseudo-slots) to have magic method names. It implies that the methods are expected to be called implicitly via special syntax or builtin functions, rather than explicitly in a normal method call. The only exception I can think of is the 'next' method of the iterator protocol. That method is often called explicitly, so the exception makes sense. For resources, there doesn't seem to be any real reason to call the methods directly - the calls will generally be hidden behind the 'with' block syntax. Hence, magic methods. Cheers, Nick. 
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From arigo at tunes.org Sun Apr 24 15:10:09 2005 From: arigo at tunes.org (Armin Rigo) Date: Sun Apr 24 15:11:45 2005 Subject: [Python-Dev] Error checking in init functions In-Reply-To: References: Message-ID: <20050424131009.GB11964@vicky.ecs.soton.ac.uk> Hi Thomas, On Fri, Apr 22, 2005 at 04:57:26PM +0200, Thomas Heller wrote:

> PyMODINIT_FUNC
> PyInit_zlib(void)
> {
>     m = Py_InitModule4("zlib", zlib_methods,
>                        zlib_module_documentation,
>                        (PyObject*)NULL, PYTHON_API_VERSION);

I've seen a lot of code like this where laziness is actually bugginess. If the Py_InitModule4() fails, you get a NULL in m, and that results in a segfault in most of the cases. Armin From jjl at pobox.com Sun Apr 24 19:12:03 2005 From: jjl at pobox.com (John J Lee) Date: Sun Apr 24 19:11:25 2005 Subject: [Python-Dev] Re: __except__ use cases In-Reply-To: <426B0704.9050901@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423160649.GC30548@solar.trillke.net> <426B0704.9050901@gmail.com> Message-ID: On Sun, 24 Apr 2005, Nick Coghlan wrote: [...]

> Seeing this example has convinced me of something. PEP 310 should use the 'with'
> keyword, and 'expression block' syntax should be used to denote the 'default
> object' semantics proposed for Python 3K. For example:
>
>     class Key2AttributeError(object):
>         def __init__(self, obj, attr):
>             self:
>                 .obj_type = type(obj)
>                 .attr = attr
>         def __except__(self, ex_type, ex_val, ex_tb):
>             if isinstance(ex_type, KeyError):
>                 self:
>                     raise AttributeError("%s instance has no attribute %s"
>                                          % (.obj_type, .attr))
>
>     # Somewhere else. . .
>     def __getattr__(self, name):
>         with Key2AttributeError(self, key):
>             self:
>                 return ._cache[key]

[...] +1 Purely based on my aesthetic reaction, that is.
Never having used other languages with this 'attribute lookup shorthand' feature, that seems to align *much* more with what I expect than the other way around. If 'with' is used in other languages as the keyword for attribute lookup shorthand, though, perhaps it will confuse other people, or at least make them frown :-( John From tdickenson at devmail.geminidataloggers.co.uk Sun Apr 24 19:31:56 2005 From: tdickenson at devmail.geminidataloggers.co.uk (Toby Dickenson) Date: Sun Apr 24 19:31:58 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <426B3FCA.2040605@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <426B1B89.6040700@hathawaymix.org> <426B3FCA.2040605@gmail.com> Message-ID: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
> Shane Hathaway wrote:
> > While we're on the subject of block handler method names, do the method
> > names need four underscores? 'enter' and 'exit' look better than
> > '__enter__' and '__exit__'.

I quite like .acquire() and .release(). There are plenty of classes (and not just in the threading module) which already have methods with those names that could be controlled by a 'with'. Those names also make the most sense in the C++ 'resource acquisition' model. -- Toby Dickenson From jcarlson at uci.edu Sun Apr 24 20:05:22 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun Apr 24 20:06:37 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> References: <426B3FCA.2040605@gmail.com> <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: <20050424110041.63E4.JCARLSON@uci.edu> Toby Dickenson wrote:
> > On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
> > Shane Hathaway wrote:
> > > > While we're on the subject of block handler method names, do the method
> > > names need four underscores?
'enter' and 'exit' look better than
> > > '__enter__' and '__exit__'.
>
> I quite like .acquire() and .release().
>
> There are plenty of classes (and not just in the threading module) which
> already have methods with those names that could be controlled by a 'with'.
>
> Those names also make the most sense in the C++ 'resource acquisition' model.

Perhaps, but names for the equivalent of "acquire resource" and "release resource" are not consistent across modules. Also, re-read Nick Coghlan's email with message id <426B3FCA.2040605@gmail.com>. - Josiah From bac at OCF.Berkeley.EDU Mon Apr 25 00:31:42 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Mon Apr 25 00:31:46 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <4269F919.6070901@v.loewis.de> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> <4269F919.6070901@v.loewis.de> Message-ID: <426C1E4E.4060809@ocf.berkeley.edu> OK, EXTRA_CFLAGS support has been checked into Makefile.pre.in and distutils.sysconfig. Martin, please double-check I tweaked sysconfig the way you wanted. I also wasn't sure of compatibility for Distutils (first time touching it); checked PEP 291 but Distutils wasn't listed. I went ahead and used a genexp; hope that is okay. I also did it through Lib/distutils instead of the separate distutils top directory in CVS. I didn't bother with touching setup.py because I realized that sysconfig should take care of that. If that is wrong let me know and I can check in a change (and if I am right that line dealing with OPT in setup.py could probably go). Here are the revisions.
    Checking in Makefile.pre.in;
    /cvsroot/python/python/dist/src/Makefile.pre.in,v  <--  Makefile.pre.in
    new revision: 1.152; previous revision: 1.151
    done
    Checking in README;
    /cvsroot/python/python/dist/src/README,v  <--  README
    new revision: 1.188; previous revision: 1.187
    done
    Checking in Lib/distutils/sysconfig.py;
    /cvsroot/python/python/dist/src/Lib/distutils/sysconfig.py,v  <--  sysconfig.py
    new revision: 1.64; previous revision: 1.63
    done
    Checking in Misc/SpecialBuilds.txt;
    /cvsroot/python/python/dist/src/Misc/SpecialBuilds.txt,v  <--  SpecialBuilds.txt
    new revision: 1.20; previous revision: 1.19
    done
    Checking in Misc/NEWS;
    /cvsroot/python/python/dist/src/Misc/NEWS,v  <--  NEWS
    new revision: 1.1288; previous revision: 1.1287
    done

-Brett From hpk at trillke.net Mon Apr 25 00:34:20 2005 From: hpk at trillke.net (holger krekel) Date: Mon Apr 25 00:34:23 2005 Subject: [Python-Dev] Re: __except__ use cases In-Reply-To: <426B0704.9050901@gmail.com> References: <20050422235112.GK22996@solar.trillke.net> <4269C04E.5040108@gmail.com> <4269C405.1050008@gmail.com> <20050423160649.GC30548@solar.trillke.net> <426B0704.9050901@gmail.com> Message-ID: <20050424223420.GD30548@solar.trillke.net> Hi Nick, On Sun, Apr 24, 2005 at 12:40 +1000, Nick Coghlan wrote:
> Seeing this example has convinced me of something. PEP 310 should use the
> 'with' keyword, and 'expression block' syntax should be used to denote the
> 'default object' semantics proposed for Python 3K. For example:

While that may be true, i don't care too much about the syntax yet but more about the idea and semantics of an __except__ hook. I simply followed the syntax that Guido currently seems to prefer.
holger From ncoghlan at gmail.com Mon Apr 25 01:37:59 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon Apr 25 01:38:05 2005 Subject: [Python-Dev] PEP 310 and exceptions In-Reply-To: <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> References: <20050422235112.GK22996@solar.trillke.net> <426B1B89.6040700@hathawaymix.org> <426B3FCA.2040605@gmail.com> <200504241831.56361.tdickenson@devmail.geminidataloggers.co.uk> Message-ID: <426C2DD7.7070507@gmail.com> Toby Dickenson wrote:
> On Sunday 24 April 2005 07:42, Nick Coghlan wrote:
>>Shane Hathaway wrote:
>>>While we're on the subject of block handler method names, do the method
>>>names need four underscores? 'enter' and 'exit' look better than
>>>'__enter__' and '__exit__'.
>
> I quite like .acquire() and .release().
>
> There are plenty of classes (and not just in the threading module) which
> already have methods with those names that could be controlled by a 'with'.
>
> Those names also make the most sense in the C++ 'resource acquisition' model.

Such existing pairings can be easily handled with a utility class like the one below. Besides, this part of the naming was considered for the original development of PEP 310 - entering and exiting the block is common to _all_ uses of the syntax, whereas other names are more specific to particular use cases.

    class resource(object):
        def __init__(self, enter, exit):
            self.enter = enter
            self.exit = exit
        def __enter__(self):
            self.enter()
        def __exit__(self):
            self.exit()

    with resource(my_resource.acquire, my_resource.release):
        # Do stuff!

Cheers, Nick.
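[For what it's worth, the adapter above works as-is with today's threading.Lock. Since the 'with' statement does not exist yet, the PEP 310-style methods are driven by hand below, purely for illustration.]

```python
import threading

class resource(object):
    def __init__(self, enter, exit):
        self.enter = enter
        self.exit = exit
    def __enter__(self):
        self.enter()
    def __exit__(self):
        self.exit()

# Drive the protocol manually, as a 'with' statement eventually would.
lock = threading.Lock()
r = resource(lock.acquire, lock.release)
r.__enter__()
assert lock.locked()
r.__exit__()
assert not lock.locked()
```

The same adapter would cover any acquire/release or open/close pairing without renaming the underlying methods.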
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From gvanrossum at gmail.com Mon Apr 25 01:57:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 01:57:18 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: After reading a lot of contributions (though perhaps not all -- this thread seems to bifurcate every time someone has a new idea :-) I'm back to liking yield for the PEP 310 use case. I think maybe it was Doug Landauer's post mentioning Beta, plus scanning some more examples of using yield in Ruby. Jim Jewett's post on defmacro also helped, as did Nick Coghlan's post explaining why he prefers 'with' for PEP 310 and a bare expression for the 'with' feature from Pascal (and other languages :-).

It seems that the same argument that explains why generators are so good for defining iterators, also applies to the PEP 310 use case: it's just much more natural to write

    def with_file(filename):
        f = open(filename)
        try:
            yield f
        finally:
            f.close()

than having to write a class with __entry__ and __exit__ and __except__ methods (I've lost track of the exact proposal at this point). At the same time, having to use it as follows:

    for f in with_file(filename):
        for line in f:
            print process(line)

is really ugly, so we need new syntax, which also helps with keeping 'for' semantically backwards compatible. So let's use 'with', and then the using code becomes again this:

    with f = with_file(filename):
        for line in f:
            print process(line)

Now let me propose a strawman for the translation of the latter into existing semantics.
Let's take the generic case:

    with VAR = EXPR:
        BODY

This would translate to the following code:

    it = EXPR
    err = None
    while True:
        try:
            if err is None:
                VAR = it.next()
            else:
                VAR = it.next_ex(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            if not hasattr(it, "next_ex"):
                raise

(The variables 'it' and 'err' are not user-visible variables, they are internal to the translation.) This looks slightly awkward because of backward compatibility; what I really want is just this:

    it = EXPR
    err = None
    while True:
        try:
            VAR = it.next(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            pass

but for backwards compatibility with the existing argument-less next() API I'm introducing a new iterator API next_ex() which takes an exception argument. If that argument is None, it should behave just like next(). Otherwise, if the iterator is a generator, this will raise that exception in the generator's frame (at the point of the suspended yield). If the iterator is something else, the something else is free to do whatever it likes; if it doesn't want to do anything, it can just re-raise the exception.

Also note that, unlike the for-loop translation, this does *not* invoke iter() on the result of EXPR; that's debatable but given that the most common use case should not be an alternate looping syntax (even though it *is* technically a loop) but a more general "macro statement expansion", I think we can expect EXPR to produce a value that is already an iterator (rather than merely an iterable).

Finally, I think it would be cool if the generator could trap occurrences of break, continue and return occurring in BODY.
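[The strawman expansion can be simulated with today's generators. In this sketch, gen.throw() stands in for the hypothetical next_ex(), and run_with/with_list are illustrative names only, not proposed APIs.]

```python
def run_with(it, body):
    """Mimic the proposed expansion of 'with VAR = EXPR: BODY'."""
    err = None
    while True:
        try:
            if err is None:
                var = next(it)
            else:
                var = it.throw(err)  # stand-in for it.next_ex(err)
        except StopIteration:
            break
        try:
            err = None
            body(var)
        except Exception as e:
            err = e

def with_list(items, log):
    # Plays the role of with_file: set up, yield once, clean up.
    log.append('enter')
    try:
        yield items
    finally:
        log.append('exit')

log = []
run_with(with_list([1, 2, 3], log), lambda seq: log.append(sum(seq)))
assert log == ['enter', 6, 'exit']
```

If the body raises, the exception is thrown back into the generator, whose finally clause still runs, matching the cleanup guarantee of the expansion.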
We could introduce a new class of exceptions for these, named ControlFlow, and (only in the body of a with statement), break would raise BreakFlow, continue would raise ContinueFlow, and return EXPR would raise ReturnFlow(EXPR) (EXPR defaulting to None of course). So a block could return a value to the generator using a return statement; the generator can catch this by catching ReturnFlow. (Syntactic sugar could be "VAR = yield ..." like in Ruby.) With a little extra magic we could also get the behavior that if the generator doesn't handle ControlFlow exceptions but re-raises them, they would affect the code containing the with statement; this means that the generator can decide whether return, break and continue are handled locally or passed through to the containing block. Note that EXPR doesn't have to return a generator; it could be any object that implements next() and next_ex(). (We could also require next_ex() or even next() with an argument; perhaps this is better.) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tcdelaney at optusnet.com.au Mon Apr 25 02:37:44 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Mon Apr 25 02:37:46 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: <004201c5492e$ff05ca60$f100a8c0@ryoko> Guido van Rossum wrote: > but for backwards compatibility with the existing argument-less next() > API I'm introducing a new iterator API next_ex() which takes an > exception argument. If that argument is None, it should behave just > like next(). Otherwise, if the iterator is a generator, this will Might this be a good time to introduce __next__ (having the same signature and semantics as your proposed next_ex) and builtin next(obj, exception=None)? 
    def next(obj, exception=None):
        if hasattr(obj, '__next__'):
            return obj.__next__(exception)
        if exception is not None:
            # Will raise an appropriate exception
            return obj.next(exception)
        return obj.next()

Tim Delaney From bob at redivi.com Mon Apr 25 04:16:28 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 04:16:33 2005 Subject: [Python-Dev] site enhancements (request for review) Message-ID: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> A few weeks ago I put together a patch to site.py for Python 2.5 that solves three major deficiencies:

(1) All site dirs must exist on the filesystem: Since PEP 302 (New Import Hooks) was adopted, this is not necessarily true. sys.meta_path and sys.path_hooks can have valid uses for non-existent paths. Even the standard zipimport hook supports in-zip-file paths (i.e. foo.zip/bar).

(2) The directories added to sys.path by .pth files are not scanned for further .pth files. If they were, you could make life much easier on developers and users of multi-user systems. For example, it would be possible for an administrator to drop in a .pth file into the system-wide site-packages to allow users to have their own local site-packages folder. Currently, you could try this, but it wouldn't work because many packages such as PIL, Numeric, and PyObjC take advantage of .pth files during their installation.

(3) To support the above use case, .pth files should be allowed to use os.path.expanduser(), so you can toss a tilde in front and do the right thing. Currently, the only way to support (2) is to use an ugly "import" pth hook.

So far, it seems that only JvR has reviewed the patch, and recommends applying it. I'd like to apply it, but it should probably have a bit more review first. If no negative comments show up for a week or two, I'll assume that people like it or don't care, and apply. -bob From bac at OCF.Berkeley.EDU Mon Apr 25 04:23:59 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Mon Apr 25 04:24:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <426C54BF.2010906@ocf.berkeley.edu> Guido van Rossum wrote: [SNIP]
> Now let me propose a strawman for the translation of the latter into
> existing semantics. Let's take the generic case:
>
>     with VAR = EXPR:
>         BODY
>
> This would translate to the following code: [SNIP]
>
>     it = EXPR
>     err = None
>     while True:
>         try:
>             VAR = it.next(err)
>         except StopIteration:
>             break
>         try:
>             err = None
>             BODY
>         except Exception, err:  # Pretend "except Exception:" == "except:"
>             pass
>
> but for backwards compatibility with the existing argument-less next()
> API I'm introducing a new iterator API next_ex() which takes an
> exception argument.

Can I suggest the name next_exc() instead? Everything in the sys module uses "exc" as the abbreviation for "exception". I realize you might be suggesting using the "ex" as the suffix because of the use of that as the suffix in the C API for an extended API, but that usage is not prominent in the stdlib. Also, would this change in Python 3000 so that both next_ex() and next() are merged into a single method?

As for an opinion of the need of 'with', I am on the fence, leaning towards liking it. To make sure I am understanding the use case, it is to help encapsulate typical resource management with proper cleanup in another function instead of having to constantly paste boilerplate into your code, right? So the hope is to be able to create factory functions, typically implemented as a generator, that encapsulate the obtaining, temporary lending out, and cleanup of a resource? Is there some other use that I am totally missing that is obvious?

> If that argument is None, it should behave just
> like next(). Otherwise, if the iterator is a generator, this will
> raise that exception in the generator's frame (at the point of the
> suspended yield).
If the iterator is something else, the something
> else is free to do whatever it likes; if it doesn't want to do
> anything, it can just re-raise the exception.
>
> Also note that, unlike the for-loop translation, this does *not*
> invoke iter() on the result of EXPR; that's debatable but given that
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion", I think we can expect EXPR to produce a value
> that is already an iterator (rather than merely an iterable).
>
> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).
>
> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)
>
> With a little extra magic we could also get the behavior that if the
> generator doesn't handle ControlFlow exceptions but re-raises them,
> they would affect the code containing the with statement; this means
> that the generator can decide whether return, break and continue are
> handled locally or passed through to the containing block.

Honestly, I am not very comfortable with this magical meaning of 'break', 'continue', and 'return' in a 'with' block. I realize 'return' already has special meaning in a generator, but I don't think that is really needed either. It leads to this odd dichotomy where a non-exception-related statement directly triggers an exception in other code.
It seems like code doing something behind my back; "remember, it looks like a 'continue', but it really is a method call with a specific exception instance. Surprise!" Personally, what I would rather see is to have next_ex(), for a generator, check if the argument is a subclass of Exception. If it is, raise it as such. If not, have the 'yield' statement return the passed-in argument. This use of it would make sense for using the next_ex() name. Then again I guess having exceptions triggering a method call instead of hitting an 'except' statement is already kind of "surprise" semantics anyway. =) Still, I would like to minimize the surprises that we could spring. And before anyone decries the fact that this might confuse a newbie (which seems to happen with every advanced feature ever dreamed up), remember this will not be meant for a newbie but for someone who has experience in Python and iterators at the minimum, and hopefully with generators. Not exactly meant for someone for whom raw_input() still holds a "wow" factor. =)

> Note that EXPR doesn't have to return a generator; it could be any
> object that implements next() and next_ex(). (We could also require
> next_ex() or even next() with an argument; perhaps this is better.)

Yes, that requirement would be good. Will make sure people don't try to use an iterator with the 'with' statement that has not been designed properly for use within the 'with'. And the precedent of requiring an API is set by 'for' since it needs to be an iterable or define __getitem__() as it is. -Brett From bob at redivi.com Mon Apr 25 04:32:35 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 04:32:41 2005 Subject: [Python-Dev] zipfile still has 2GB boundary bug Message-ID: The "2GB bug" that was supposed to be fixed in was not actually fixed.
The zipinfo offsets in the structures are still signed longs, so the fix allows you to write one file that extends past the 2G boundary, but if any extend past that point you are screwed. I have opened a new bug and patch that should fix this issue . This is a backport candidate to 2.4.2 and 2.3.6 (if that ever happens). On a related note, if anyone else has a bunch of really big and ostensibly broken zip archives created by dumb versions of the zipfile module, I have written a script that can rebuild the central directory in-place. Ping me off-list if you're interested and I'll clean it up. Someone should think about rewriting the zipfile module to be less hideous, include a repair feature, and be up to date with the latest specifications . Additionally, it'd also be useful if someone were to include support for Apple's "extensions" to the zip format (the __MACOSX folder and its contents) that show up when BOM (private framework) is used to create archives (i.e. Finder in Mac OS X 10.3+). I'm not sure if these are documented anywhere, but I can help with reverse engineering if someone is interested in writing the code. On that note, Mac OS X 10.4 (Tiger) is supposed to have new APIs (or changes to existing APIs?) to facilitate resource fork preservation, ACLs, and Spotlight hooks in tar, cp, mv, etc. Someone should spend some time looking at the Darwin 8 sources for these tools (when they're publicly available in the next few weeks) to see what would need to be done in Python to support them in the standard library (the os, tarfile, etc. modules). 
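[As a rough diagnostic for the boundary Bob describes (not the fix in the patch), the sketch below flags members of an existing archive whose recorded header offset would not fit in a signed 32-bit field; oversized_offsets is an illustrative helper name.]

```python
import zipfile

SIGNED_32_MAX = 2**31 - 1  # largest offset a signed 32-bit long can hold

def oversized_offsets(path_or_file):
    """Names of members whose header offset exceeds a signed 32-bit int."""
    zf = zipfile.ZipFile(path_or_file)
    try:
        return [info.filename for info in zf.infolist()
                if info.header_offset > SIGNED_32_MAX]
    finally:
        zf.close()
```

Any archive under 2 GB trivially returns an empty list; the interesting case is an archive whose later members start past the boundary.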
-bob From steven.bethard at gmail.com Mon Apr 25 05:12:47 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon Apr 25 05:12:49 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: Guido van Rossum wrote: [snip illustration of how generators (and other iterators) can be modified to be used in with-blocks]
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion"

I'm sure I could get used to it, but my intuition for

    with f = with_file(filename):
        for line in f:
            print process(line)

is that the f = with_file(filename) executes only once. That is, as you said, I don't expect this to be a looping syntax. Of course, as long as the generators (or other objects) here yield only one value (like with_file does), then the with-block will execute only once. But because the implementation lets you make the with-block loop if you want, it makes me nervous... I guess it would be helpful to see an example where the looping with-block is useful. So far, I think all the examples I've seen have been like with_file, which only executes the block once. Of course, the loop allows you to do anything that you would normally do in a for-loop, but my feeling is that this is probably better done by composing a with-block that executes the block only once with a normal Python for-loop. I'd almost like to see the with-block translated into something like

    it = EXPR
    try:
        VAR = it.next()
    except StopIteration:
        raise WithNotStartedException
    err = None
    try:
        BODY
    except Exception, err:  # Pretend "except Exception:" == "except:"
        pass
    try:
        it.next_ex(err)
    except StopIteration:
        pass
    else:
        raise WithNotEndedException

where there is no looping at all, and the iterator is expected to yield exactly one item and then terminate.
Of course this looks a lot like:

    it = EXPR
    VAR = it.__enter__()
    err = None
    try:
        BODY
    except Exception, err:  # Pretend "except Exception:" == "except:"
        pass
    it.__exit__(err)

So maybe I'm just still stuck on the enter/exit semantics. ;-) STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From pje at telecommunity.com Mon Apr 25 05:32:30 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Mon Apr 25 05:28:26 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote:
>So a block could return a value to the generator using a return
>statement; the generator can catch this by catching ReturnFlow.
>(Syntactic sugar could be "VAR = yield ..." like in Ruby.)

[uncontrolled drooling, followed by much rejoicing] If this were available to generators in general, you could untwist Twisted. I'm basically simulating this sort of exception/value passing in peak.events to do exactly that, except I have to do:

    yield somethingBlocking(); result=events.resume()

where events.resume() magically receives a value or exception from outside the generator and either returns or raises it. If next()-with-argument and next_ex() are available normally on generators, this would allow you to simulate co-routines without the events.resume() magic; the above would simply read:

    result = yield somethingBlocking()

The rest of the peak.events coroutine simulation would remain around to manage the generator stack and scheduling, but the syntax would be cleaner and the operation of it entirely unmagical.
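[The value/exception passing Phillip wants can be sketched with a small generator. The send()/throw() methods that later Python versions grew serve here as stand-ins for the proposed next()-with-argument and next_ex(); task and the yielded strings are purely illustrative.]

```python
def task():
    # A coroutine-style generator: the value of the yield expression
    # is supplied from outside, or an exception is raised at the yield.
    try:
        result = yield 'ready'   # would be: result = yield somethingBlocking()
    except ValueError:
        yield 'recovered'
    else:
        yield 'got %s' % result

g = task()
assert g.send(None) == 'ready'   # run to the first yield
assert g.send(42) == 'got 42'    # resume the yield expression with a value

g2 = task()
assert g2.send(None) == 'ready'
assert g2.throw(ValueError) == 'recovered'  # resume with an exception
```

A scheduler that loops over send()/throw() in this way is all the "trampoline" a Twisted-style framework would need, which is exactly the untwisting Phillip describes.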
From bob at redivi.com Mon Apr 25 05:39:52 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 05:39:57 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> Message-ID: <10a241223d5ad978b6a582adc1cc4954@redivi.com> On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: > At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >> So a block could return a value to the generator using a return >> statement; the generator can catch this by catching ReturnFlow. >> (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > [uncontrolled drooling, followed by much rejoicing] > > If this were available to generators in general, you could untwist > Twisted. I'm basically simulating this sort of exception/value > passing in peak.events to do exactly that, except I have to do: > > yield somethingBlocking(); result=events.resume() > > where events.resume() magically receives a value or exception from > outside the generator and either returns or raises it. If > next()-with-argument and next_ex() are available normally on > generators, this would allow you to simulate co-routines without the > events.resume() magic; the above would simply read: > > result = yield somethingBlocking() > > The rest of the peak.events coroutine simulation would remain around > to manage the generator stack and scheduling, but the syntax would be > cleaner and the operation of it entirely unmagical. Only if "result = yield somethingBlocking()" could also raise an exception. -bob From pje at telecommunity.com Mon Apr 25 05:57:37 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Mon Apr 25 05:53:34 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <10a241223d5ad978b6a582adc1cc4954@redivi.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> At 11:39 PM 4/24/05 -0400, Bob Ippolito wrote: >On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: > >>At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >>>So a block could return a value to the generator using a return >>>statement; the generator can catch this by catching ReturnFlow. >>>(Syntactic sugar could be "VAR = yield ..." like in Ruby.) >> >>[uncontrolled drooling, followed by much rejoicing] >> >>If this were available to generators in general, you could untwist >>Twisted. I'm basically simulating this sort of exception/value passing >>in peak.events to do exactly that, except I have to do: >> >> yield somethingBlocking(); result=events.resume() >> >>where events.resume() magically receives a value or exception from >>outside the generator and either returns or raises it. If >>next()-with-argument and next_ex() are available normally on generators, >>this would allow you to simulate co-routines without the events.resume() >>magic; the above would simply read: >> >> result = yield somethingBlocking() >> >>The rest of the peak.events coroutine simulation would remain around to >>manage the generator stack and scheduling, but the syntax would be >>cleaner and the operation of it entirely unmagical. > >Only if "result = yield somethingBlocking()" could also raise an exception. Read Guido's post again; he proposed that passing a result would occur by raising a ReturnFlow exception! In other words, it's the result passing that's the exceptional exception, while returning an exception is unexceptional. 
:) From bob at redivi.com Mon Apr 25 06:08:02 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 06:08:07 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> References: <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424232631.02ffbcb0@mail.telecommunity.com> <5.1.1.6.0.20050424235458.03099ac0@mail.telecommunity.com> Message-ID: On Apr 24, 2005, at 11:57 PM, Phillip J. Eby wrote: > At 11:39 PM 4/24/05 -0400, Bob Ippolito wrote: > >> On Apr 24, 2005, at 11:32 PM, Phillip J. Eby wrote: >> >>> At 04:57 PM 4/24/05 -0700, Guido van Rossum wrote: >>>> So a block could return a value to the generator using a return >>>> statement; the generator can catch this by catching ReturnFlow. >>>> (Syntactic sugar could be "VAR = yield ..." like in Ruby.) >>> >>> [uncontrolled drooling, followed by much rejoicing] >>> >>> If this were available to generators in general, you could untwist >>> Twisted. I'm basically simulating this sort of exception/value >>> passing in peak.events to do exactly that, except I have to do: >>> >>> yield somethingBlocking(); result=events.resume() >>> >>> where events.resume() magically receives a value or exception from >>> outside the generator and either returns or raises it. If >>> next()-with-argument and next_ex() are available normally on >>> generators, this would allow you to simulate co-routines without the >>> events.resume() magic; the above would simply read: >>> >>> result = yield somethingBlocking() >>> >>> The rest of the peak.events coroutine simulation would remain around >>> to manage the generator stack and scheduling, but the syntax would >>> be cleaner and the operation of it entirely unmagical. >> >> Only if "result = yield somethingBlocking()" could also raise an >> exception. > > Read Guido's post again; he proposed that passing a result would occur > by raising a ReturnFlow exception! 
In other words, it's the result
> passing that's the exceptional exception, while returning an exception
> is unexceptional. :)

Oh, right. Too much cold medicine tonight I guess :) You're right, of
course.

This facility would be VERY nice to ab^Wuse when writing any event
driven software.. not just Twisted.

-bob

From pje at telecommunity.com Mon Apr 25 06:20:00 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Mon Apr 25 06:15:58 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>

At 09:12 PM 4/24/05 -0600, Steven Bethard wrote:
>I guess it would be helpful to see example where the looping
>with-block is useful.

Automatically retry an operation a set number of times before hard
failure:

    with auto_retry(times=3):
        do_something_that_might_fail()

Process each row of a database query, skipping and logging those that
cause a processing error:

    with x,y,z = log_errors(db_query()):
        do_something(x,y,z)

You'll notice, by the way, that some of these "runtime macros" may be
stackable in the expression. I'm somewhat curious what happens to
yields in the body of the macro block, but I assume they'll just do
what would normally occur. Somehow it seems strange, though, to be
yielding to something other than the enclosing 'with' object.

In any case, I'm personally more excited about the part where this
means we get to build co-routines with less magic. The 'with'
statement itself is of interest mainly for acquisition/release and
atomic/rollback scenarios, but being able to do retries or skip items
that cause errors is often handy. Sometimes you have a list of things
(such as event callbacks) where you need to call all of them, even if
one handler fails, but you can't afford to silence the errors either.
Code that deals with that scenario well is a bitch to write, and a
looping 'with' would make it a bit easier to write once and reuse many.
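The retry use case Phillip wants the looping 'with' for can already be approximated with a plain generator and an explicit for/try/except; the looping block would mainly hide this boilerplate. A rough sketch — auto_retry and the failing operation here are invented stand-ins, not an existing API:

```python
# Rough sketch of what a looping "with auto_retry(times=3)" would hide:
# a plain generator plus explicit for/try/except. All names invented.

def auto_retry(times):
    # Yield one attempt number per allowed try.
    for attempt in range(times):
        yield attempt

calls = []

def do_something_that_might_fail():
    # Fails twice, then succeeds, to exercise the retry path.
    calls.append(1)
    if len(calls) < 3:
        raise IOError("transient failure")
    return "ok"

result = None
for attempt in auto_retry(times=3):
    try:
        result = do_something_that_might_fail()
        break  # success: stop retrying
    except IOError:
        if attempt == 2:  # attempts exhausted: hard failure
            raise

print(result, len(calls))  # -> ok 3
```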
From steven.bethard at gmail.com Mon Apr 25 07:51:46 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon Apr 25 07:51:48 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>
References: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com>
Message-ID: 

On 4/24/05, Phillip J. Eby wrote:
> At 09:12 PM 4/24/05 -0600, Steven Bethard wrote:
> >I guess it would be helpful to see example where the looping
> >with-block is useful.
>
> Automatically retry an operation a set number of times before hard failure:
>
>     with auto_retry(times=3):
>         do_something_that_might_fail()
>
> Process each row of a database query, skipping and logging those that cause
> a processing error:
>
>     with x,y,z = log_errors(db_query()):
>         do_something(x,y,z)

Thanks for the examples! If I understand your point here right, the
examples that can't be easily rewritten by composing a single-execution
with-block with a for-loop are examples where the number of iterations
of the for-loop depends on the error handling of the with-block. Could
you rewrite these with PEP 288 as something like:

    gen = auto_retry(times=3)
    for _ in gen:
        try:
            do_something_that_might_fail()
        except Exception, err:  # Pretend "except Exception:" == "except:"
            gen.throw(err)

    gen = log_errors(db_query())
    for x,y,z in gen:
        try:
            do_something(x,y,z)
        except Exception, err:  # Pretend "except Exception:" == "except:"
            gen.throw(err)

Obviously, the code is cleaner using the looping with-block. I'm just
trying to make sure I understand your examples right. So assuming we
had looping with-blocks, what would be the benefit of using a for-loop
instead? Just efficiency? Or is there something that a for-loop could
do that a with-block couldn't?

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy From martin at v.loewis.de Mon Apr 25 09:17:34 2005 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon Apr 25 09:17:39 2005 Subject: [Python-Dev] Proper place to put extra args for building In-Reply-To: <426C1E4E.4060809@ocf.berkeley.edu> References: <4265918D.7040700@ocf.berkeley.edu> <4265F656.4020305@v.loewis.de> <4266C07A.9090503@ocf.berkeley.edu> <4266C4E9.5060709@v.loewis.de> <4266C7D1.700@ocf.berkeley.edu> <42672803.3080208@v.loewis.de> <42674338.80009@ocf.berkeley.edu> <426744DF.2030309@v.loewis.de> <426883FF.5060009@ocf.berkeley.edu> <42697754.1000707@v.loewis.de> <4269CB3A.40306@ocf.berkeley.edu> <4269F919.6070901@v.loewis.de> <426C1E4E.4060809@ocf.berkeley.edu> Message-ID: <426C998E.7070402@v.loewis.de> Brett C. wrote: > OK, EXTRA_CFLAGS support has been checked into Makefile.pre.in and > distutils.sysconfig . Martin, please double-check I tweaked sysconfig the way > you wanted. It is the way I wanted it, but it doesn't work. Just try and use it for some extension modules to see for yourself, I tried with a harmless GCC option (-fgcse). The problem is that distutils only looks at the Makefile, not at the environment variables. So I changed parse_makefile to do what make does: fall back to the environment when no makefile variable is set. This was still not sufficient, since distutils never looks at CFLAGS. So I changed setup.py and sysconfig.py to fetch CFLAGS, and not bother with BASECFLAGS and EXTRA_CFLAGS. 
setup.py 1.218
NEWS 1.1289
sysconfig.py 1.65

Regards,
Martin

From fredrik at pythonware.com Mon Apr 25 09:39:12 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon Apr 25 09:40:18 2005
Subject: [Python-Dev] Re: anonymous blocks
References: 
Message-ID: 

Guido van Rossum wrote:

> At the same time, having to use it as follows:
>
>     for f in with_file(filename):
>         for line in f:
>             print process(line)
>
> is really ugly, so we need new syntax, which also helps with keeping
> 'for' semantically backwards compatible. So let's use 'with', and then
> the using code becomes again this:
>
>     with f = with_file(filename):
>         for line in f:
>             print process(line)

or

    with with_file(filename) as f:
        ...

? (assignment inside block-opening constructs isn't used in Python
today, as far as I can tell...)

> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).
>
> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)

slightly weird, but useful enough to be cool. (maybe "return value" is
enough, though. the others may be slightly too weird... or should that
return perhaps be a "continue value"? you're going back to the top of
the loop, after all).

From mal at egenix.com Mon Apr 25 10:46:28 2005
From: mal at egenix.com (M.-A.
Lemburg) Date: Mon Apr 25 10:46:30 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> Message-ID: <426CAE64.8080404@egenix.com> Shannon -jj Behrens wrote: > On 4/20/05, M.-A. Lemburg wrote: > >>Fredrik Lundh wrote: >> >>>PS. a side effect of the for-in pattern is that I'm beginning to feel >>>that Python >>>might need a nice "switch" statement based on dictionary lookups, so I can >>>replace multiple callbacks with a single loop body, without writing too >>>many >>>if/elif clauses. >> >>PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) >> >>My use case for switch is that of a parser switching on tokens. >> >>mxTextTools applications would greatly benefit from being able >>to branch on tokens quickly. Currently, there's only callbacks, >>dict-to-method branching or long if-elif-elif-...-elif-else. > > I think "match" from Ocaml would be a much nicer addition to Python > than "switch" from C. PEP 275 is about branching based on dictionary lookups which is somewhat different than pattern matching - for which we already have lots and lots of different tools. The motivation behind the switch statement idea is that of interpreting the multi-state outcome of some analysis that you perform on data. The main benefit is avoiding Python function calls which are very slow compared to branching to inlined Python code. Having a simple switch statement would enable writing very fast parsers in Python - you'd let one of the existing tokenizers such as mxTextTools, re or one of the xml libs create the token input data and then work on the result using a switch statement. Instead of having one function call per token, you'd only have a single dict lookup. BTW, has anyone in this thread actually read the PEP 275 ? 
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Apr 25 2005)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! ::::

From ncoghlan at gmail.com Mon Apr 25 11:26:26 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon Apr 25 11:27:35 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <426CB7C2.8030508@gmail.com>

Guido van Rossum wrote:
> It seems that the same argument that explains why generators are so
> good for defining iterators, also applies to the PEP 310 use case:
> it's just much more natural to write
>
>     def with_file(filename):
>         f = open(filename)
>         try:
>             yield f
>         finally:
>             f.close()
>
> than having to write a class with __entry__ and __exit__ and
> __except__ methods (I've lost track of the exact proposal at this
> point).

Indeed - the transaction example is very easy to write this way:

    def transaction():
        begin_transaction()
        try:
            yield None
        except:
            abort_transaction()
            raise
        else:
            commit_transaction()

> Also note that, unlike the for-loop translation, this does *not*
> invoke iter() on the result of EXPR; that's debatable but given that
> the most common use case should not be an alternate looping syntax
> (even though it *is* technically a loop) but a more general "macro
> statement expansion", I think we can expect EXPR to produce a value
> that is already an iterator (rather than merely an iterable).

Not supporting iterables makes it harder to write a class which is
inherently usable in a with block, though. The natural way to make
iterable classes is to use 'yield' in the definition of __iter__ - if
iter() is not called, then that trick can't be used.
> Finally, I think it would be cool if the generator could trap
> occurrences of break, continue and return occurring in BODY. We could
> introduce a new class of exceptions for these, named ControlFlow, and
> (only in the body of a with statement), break would raise BreakFlow,
> continue would raise ContinueFlow, and return EXPR would raise
> ReturnFlow(EXPR) (EXPR defaulting to None of course).

Perhaps 'continue' could be used to pass a value into the iterator,
rather than 'return'? (I believe this has been suggested previously in
the context of for loops.) This would permit 'return' to continue to
mean breaking out of the containing function (as for other loops).

> So a block could return a value to the generator using a return
> statement; the generator can catch this by catching ReturnFlow.
> (Syntactic sugar could be "VAR = yield ..." like in Ruby.)

So, "VAR = yield x" would expand to something like:

    try:
        yield x
    except ReturnFlow, ex:
        VAR = ex.value

?

> With a little extra magic we could also get the behavior that if the
> generator doesn't handle ControlFlow exceptions but re-raises them,
> they would affect the code containing the with statement; this means
> that the generator can decide whether return, break and continue are
> handled locally or passed through to the containing block.

That seems a little bit _too_ magical - it would be nice if break and
continue were defined to be local, and return to be non-local, as for
the existing loop constructs. For other non-local control flow,
application-specific exceptions will still be available.

Regardless, the ControlFlow exceptions do seem like a very practical
way of handling the underlying implementation.

> Note that EXPR doesn't have to return a generator; it could be any
> object that implements next() and next_ex(). (We could also require
> next_ex() or even next() with an argument; perhaps this is better.)

With this restriction (i.e.
requiring next_ex, next_exc, or Terry's suggested __next__), then the
backwards-compatible version would be simply your desired semantics,
plus an attribute check to exclude old-style iterators:

    it = EXPR
    if not hasattr(it, "__next__"):
        raise TypeError("'with' block requires 2nd gen iterator API support")
    err = None
    while True:
        try:
            VAR = it.next(err)
        except StopIteration:
            break
        try:
            err = None
            BODY
        except Exception, err:  # Pretend "except Exception:" == "except:"
            pass

The generator objects created by using yield would supply the new API,
so would be usable immediately inside such 'with' blocks.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
 http://boredomandlaziness.skystorm.net

From skip at pobox.com Mon Apr 25 15:11:16 2005
From: skip at pobox.com (Skip Montanaro)
Date: Mon Apr 25 15:11:21 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: 
References: 
Message-ID: <17004.60532.407476.331271@montanaro.dyndns.org>

    Guido> At the same time, having to use it as follows:

    Guido> for f in with_file(filename):
    Guido>     for line in f:
    Guido>         print process(line)

    Guido> is really ugly, so we need new syntax, which also helps with
    Guido> keeping 'for' semantically backwards compatible. So let's use
    Guido> 'with', and then the using code becomes again this:

    Guido> with f = with_file(filename):
    Guido>     for line in f:
    Guido>         print process(line)

How about deferring major new syntax changes until Py3K when the grammar
and semantic options might be more numerous? Given the constraints of
backwards compatibility, adding more syntax or shoehorning new semantics
into what's an increasingly crowded space seems to always result in an
unsatisfying compromise.

    Guido> Now let me propose a strawman for the translation of the latter
    Guido> into existing semantics. Let's take the generic case:

    Guido> with VAR = EXPR:
    Guido>     BODY

What about a multi-variable case?
Will you have to introduce a new level of indentation for each 'with' var? Skip From bob at redivi.com Mon Apr 25 15:53:13 2005 From: bob at redivi.com (Bob Ippolito) Date: Mon Apr 25 15:53:23 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> References: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> Message-ID: <6a8c0d96b4709c84b223395306646ef0@redivi.com> On Apr 25, 2005, at 7:53 AM, Charles Hartman wrote: >> >> Someone should think about rewriting the zipfile module to be less >> hideous, include a repair feature, and be up to date with the latest >> specifications . > > -- and allow *deleting* a file from a zipfile. As far as I can tell, > you now can't (except by rewriting everything but that to a new > zipfile and renaming). Somewhere I saw a patch request for this, but > it was languishing, a year or more old. Or am I just totally missing > something? No, you're not missing anything. Deleting is hard, I guess. Either you'd have to shuffle the zip file around to reclaim the space, or just leave that spot alone and just remove its entry in the central directory. You'd probably want to look at what other software does to decide which approach to use (by default?). I don't see any markers in the format that would otherwise let you say "this file was deleted". -bob From tjreedy at udel.edu Mon Apr 25 16:14:07 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:15:31 2005 Subject: [Python-Dev] Re: anonymous blocks References: Message-ID: "Fredrik Lundh" wrote in message news:d4i6hg$q88$1@sea.gmane.org... > Guido van Rossum wrote: > >> At the same time, having to use it as follows: >> >> for f in with_file(filename): > < for line in f: >> print process(line) >> >> is really ugly, so we need new syntax, which also helps with keeping >> 'for' semantically backwards compatible. 
So let's use 'with', and then >> the using code becomes again this: >> >> with f = with_file(filename): >> for line in f: >> print process(line) > > or > > with with_file(filename) as f: with as : would parallel the for-statement header and read smoother to me. for as : would not need new keyword, but would require close reading to distinguish 'as' from 'in'. Terry J. Reedy From s.percivall at chello.se Mon Apr 25 16:26:06 2005 From: s.percivall at chello.se (Simon Percivall) Date: Mon Apr 25 16:26:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: References: Message-ID: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> On 25 apr 2005, at 16.14, Terry Reedy wrote: > with as : > > would parallel the for-statement header and read smoother to me. > > for as : > > would not need new keyword, but would require close reading to > distinguish > 'as' from 'in'. But it also moves the value to the right, removing focus. Wouldn't "from" be a good keyword to overload here? "in"/"with"/"for"/"" from : //Simon From rodsenra at gpr.com.br Mon Apr 25 16:38:51 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Mon Apr 25 16:38:28 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> References: <640BE6F5-E393-430A-A9B6-793DB471F28D@chello.se> Message-ID: <20050425113851.10b00c37@localhost.localdomain> [ Simon Percivall ]: > [ Terry Reedy ]: > > with as : > > > > would parallel the for-statement header and read smoother to me. > > > > for as : > > > > would not need new keyword, but would require close reading to > > distinguish > > 'as' from 'in'. > > But it also moves the value to the right, removing focus. Wouldn't > "from" > be a good keyword to overload here? > > "in"/"with"/"for"/"" from : > I do not have strong feelings about this issue, but for completeness sake... 
Mixing both suggestions: from as : That resembles an import statement which some may consider good (syntax/keyword reuse) or very bad (confusion?, value focus). cheers, Senra -- Rodrigo Senra -- MSc Computer Engineer rodsenra(at)gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br/ Personal Blog http://rodsenra.blogspot.com/ From tjreedy at udel.edu Mon Apr 25 16:38:49 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:41:01 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <426CB7C2.8030508@gmail.com> Message-ID: "Nick Coghlan" wrote in message news:426CB7C2.8030508@gmail.com... > Guido van Rossum wrote: > > statement expansion", I think we can expect EXPR to produce a value > > that is already an iterator (rather than merely an interable). > > Not supporting iterables makes it harder to write a class which is > inherently usable in a with block, though. The natural way to make > iterable classes is to use 'yield' in the definition of __iter__ - if > iter() is not called, then that trick can't be used. Would not calling iter() (or .__iter__) explicitly, instead of depending on the implicit call of for loops, suffice to produce the needed iterator? tjr From tjreedy at udel.edu Mon Apr 25 16:41:37 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 16:47:33 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <17004.60532.407476.331271@montanaro.dyndns.org> Message-ID: "Skip Montanaro" wrote in message news:17004.60532.407476.331271@montanaro.dyndns.org... > Guido> with VAR = EXPR: > Guido> BODY > > What about a multi-variable case? Will you have to introduce a new level > of > indentation for each 'with' var? I would expect to see the same structure unpacking as with assignment, for loops, and function calls: with a,b,c = x,y,z and so on. Terry J. 
Reedy From ark-mlist at att.net Mon Apr 25 17:00:16 2005 From: ark-mlist at att.net (Andrew Koenig) Date: Mon Apr 25 17:00:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050425113851.10b00c37@localhost.localdomain> Message-ID: <00b301c549a7$7e8022e0$6402a8c0@arkdesktop> > Mixing both suggestions: > > from as : > > > That resembles an import statement which some > may consider good (syntax/keyword reuse) or > very bad (confusion?, value focus). I have just noticed that this whole notion is fairly similar to the "local" statement in ML, the syntax for which looks like this: local in end The idea is that the first declarations, whatever they are, are processed without putting their names into the surrounding scope, then the second declarations are processed *with* putting their names into the surrounding scope. For example: local fun add(x:int, y:int) = x+y in fun succ(x) = add(x, 1) end This defines succ in the surrounding scope, but not add. So in Python terms, I think this would be local: in: or, for example: local: = value in: blah blah blah From tjreedy at udel.edu Mon Apr 25 17:11:26 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 17:13:21 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <426C54BF.2010906@ocf.berkeley.edu> Message-ID: "Brett C." wrote in message news:426C54BF.2010906@ocf.berkeley.edu... > And before anyone decries the fact that this might confuse a newbie > (which > seems to happen with every advanced feature ever dreamed up), remember > this > will not be meant for a newbie but for someone who has experience in > Python and > iterators at the minimum, and hopefully with generators. Not exactly > meant for > someone for which raw_input() still holds a "wow" factor for. 
=) I have accepted the fact that Python has become a two-level language: basic Python for expressing algorithms + advanced features (metaclasses, decorators, CPython-specific introspection and hacks, and now possibly 'with' or whatever) for solving software engineering issues. Perhaps there should correspondingly be two tutorials. Terry J. Reedy From mcherm at mcherm.com Mon Apr 25 18:42:54 2005 From: mcherm at mcherm.com (Michael Chermside) Date: Mon Apr 25 18:42:57 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Jim Jewett writes: > As best I can tell, the anonymous blocks are used to take > care of boilerplate code without changing the scope -- exactly > what macros are used for. Folks, I think that Jim is onto something here. I've been following this conversation, and it sounds to me as if we are stumbling about in the dark, trying to feel our way toward something very useful and powerful. I think Jim is right, what we're feeling our way toward is macros. The problem, of course, is that Guido (and others!) are on record as being opposed to adding macros to Python. (Even "good" macros... think lisp, not cpp.) I am not quite sure that I am convinced by the argument, but let me see if I can present it: Allowing macros in Python would enable individual programmers or groups to easily invent their own "syntax". Eventually, there would develop a large number of different Python "dialects" (as some claim has happened in the Lisp community) each dependent on macros the others lack. The most important casualty would be Python's great *readability*. (If this is a strawman argument, i.e. if you know of a better reason for keeping macros OUT of Python please speak up. Like I said, I've never been completely convinced of it myself.) 
I think it would be useful if we approached it like this: either what we want is the full power of macros (in which case the syntax we choose should be guided by that choice), or we want LESS than the full power of macros. If we want less, then HOW less? In other words, rather than hearing what we'd like to be able to DO with blocks, I'd like to hear what we want to PROHIBIT DOING with blocks. I think this might be a fruitful way of thinking about the problem which might make it easier to evaluate syntax suggestions. And if the answer is that we want to prohibit nothing, then the right solution is macros. -- Michael Chermside From facundobatista at gmail.com Mon Apr 25 18:46:15 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Mon Apr 25 18:46:17 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: References: Message-ID: On 4/22/05, Fredrik Lundh wrote: > > Is there a document that details which objects are cached in memory > > (to not create the same object multiple times, for performance)? > > why do you think you need to know? I was in my second class of the Python workshop I'm giving here in one Argentine University, and I was explaining how to think using name/object and not variable/value. Using id() for being pedagogic about the objects, the kids saw that id(3) was always the same, but id([]) not. I explained to them that Python, in some circumstances, caches the object, and I kept them happy enough. But I really don't know what objects and in which circumstances. . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From gvanrossum at gmail.com Mon Apr 25 18:52:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 18:52:05 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. > > The problem, of course, is that Guido (and others!) are on record as > being opposed to adding macros to Python. (Even "good" macros... think > lisp, not cpp.) I am not quite sure that I am convinced by the argument, > but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > > (If this is a strawman argument, i.e. if you know of a better reason > for keeping macros OUT of Python please speak up. Like I said, I've > never been completely convinced of it myself.) Nor am I; though I am also not completely unconvinced! The argument as presented here is probably to generic; taken literally, it would argue against having functions and classes as well... My problem with macros is actually more practical: Python's compiler is too dumb. 
I am assuming that we want to be able to import macros from other modules, and I am assuming that macros are expanded by the compiler, not at run time; but the compiler doesn't follow imports (that happens at run time) so there's no mechanism to tell the compiler about the new syntax. And macros that don't introduce new syntax don't seem very interesting (compared to what we can do already). > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? > > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. I'm personally at a loss understanding your question here. Perhaps you could try answering it for yourself? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Mon Apr 25 18:57:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 18:57:11 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: References: Message-ID: > I was in my second class of the Python workshop I'm giving here in one > Argentine University, and I was explaining how to think using > name/object and not variable/value. > > Using id() for being pedagogic about the objects, the kids saw that > id(3) was always the same, but id([]) not. I explained to them that > Python, in some circumstances, caches the object, and I kept them > happy enough. > > But I really don't know what objects and in which circumstances. Aargh! Bad explanation. 
Or at least you're missing something: *mutable* objects (like lists) can *never* be cached, because they have explicit object semantics. For example each time the expression [] is evaluated it *must* produce a fresh list object (though it may be recycled from a GC'ed list object -- or any other GC'ed object, for that matter). But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always). Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at strakt.com Mon Apr 25 19:08:35 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Mon Apr 25 19:08:49 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D2413.60301@strakt.com> Michael Chermside wrote: >Jim Jewett writes: > > >>As best I can tell, the anonymous blocks are used to take >>care of boilerplate code without changing the scope -- exactly >>what macros are used for. >> >> > >Folks, I think that Jim is onto something here. > >I've been following this conversation, and it sounds to me as if we >are stumbling about in the dark, trying to feel our way toward something >very useful and powerful. I think Jim is right, what we're feeling our >way toward is macros. > >The problem, of course, is that Guido (and others!) are on record as >being opposed to adding macros to Python. (Even "good" macros... think >lisp, not cpp.) I am not quite sure that I am convinced by the argument, >but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". 
Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > >(If this is a strawman argument, i.e. if you know of a better reason >for keeping macros OUT of Python please speak up. Like I said, I've >never been completely convinced of it myself.) > > The typical argument in defense of macros is that macros are just like functions: you go to the definition and see what they do. But depending on how much variation they offer over the normal grammar, even eye-parsing them may be difficult. They make it easy to mix code that is evaluated immediately with code that will be evaluated, maybe even repeatedly, later, each macro having its own rules about this. In most cases the only way to discern this and know what is what is indeed looking at the macro definition. You can get flame wars about whether introducing slightly different variations of if is warranted. <.5 wink> My personal impression is that average macro definitions (I'm thinking about Common Lisp or Dylan and similar) are much less readable than average function definitions. Reading On Lisp may give an idea about this. That means that introducing macros in Python, because of the importance that readability has in Python, would need a serious design effort to make the macro definitions themselves readable. I think that's a challenging design problem. Also agree about the technical issues that Guido cited about referencing and when macro definitions enter into effect etc. 
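Samuele's point about macros mixing immediately-evaluated code with deferred code can be felt even with plain functions and lambdas, the closest analogue Python has today. A minimal sketch (the unless helper and the log list are hypothetical, just for illustration):

```python
# Even without macros, a callable that defers part of its argument
# already splits evaluation in two: the condition is evaluated at the
# call site, the body only if and when the callee decides to run it.
def unless(cond, body):
    """Run body() only when cond is false -- a tiny 'control macro'."""
    if not cond:
        return body()

log = []
unless(False, lambda: log.append("ran"))       # body is invoked
unless(True, lambda: log.append("skipped"))    # body is never invoked
print(log)   # ['ran']
```

With real macros every definition gets to invent its own version of this rule, which is exactly the readability cost being described.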
From p.f.moore at gmail.com Mon Apr 25 20:06:24 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Mon Apr 25 20:06:27 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <79990c6b050425110660cc2f3@mail.gmail.com> On 4/25/05, Michael Chermside wrote: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. I think the key difference with macros is that they act at compile time, not at run time. There is no intention here to provide any form of compile-time processing, and that makes all the difference. What I feel is the key concept here is that of "injecting" code into a template form (try...finally, or try..except..else, or whatever) [1]. This is "traditionally" handled by macros, and I see it as a *good* sign, that the discussion has centred around runtime mechanisms rather than compile-time ones. [1] Specifically, cases where functions aren't enough. If I try to characterise precisely what those cases are, all I can come up with is "when the code being injected needs to run in the current scope, not in the scope of a template function". Is that right? Paul. From shane.holloway at ieee.org Mon Apr 25 20:23:08 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Mon Apr 25 20:23:49 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D358C.70509@ieee.org> Michael Chermside wrote: > Jim Jewett writes: > >>As best I can tell, the anonymous blocks are used to take >>care of boilerplate code without changing the scope -- exactly >>what macros are used for. 
> > > Folks, I think that Jim is onto something here. > > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. I am very excited about the discussion of blocks. I think they can potentially address two things that are sticky to express in python right now. The first is to compress the common try/finally use cases around resource usage as with files and database commits. The second is language extensibility, which makes us think of what macros did for Lisp. Language extensibility has two motivations. First and foremost is to allow the programmer to express his or her *intent*. The second motivation is to reuse code and thereby increase productivity. Since methods already allow us to reuse code, our motivation is to increase expressivity. What blocks offer is to make Python's suites something a programmer can work with. Much like metaclasses put control of class details into the programmer's hands, or decorators allow us to modify method semantics. If the uses of decorators tell us anything, I'm pretty sure there are more potential uses of blocks than we could shake many sticks at. ;) So, the question comes back to what are blocks in the language extensibility case? To me, they would be something very like a code object returned from the compile method. To this we would need to attach the globals and locals where the block was from. Then we could use the normal exec statement to invoke the block whenever needed. Perhaps we could add a new mode 'block' to allow the ControlFlow exceptions mentioned elsewhere in the thread. We still need to find a way to pass arguments to the block so we are not tempted to insert them in locals and have them magically appear in the namespace. ;) Personally, I'm rather attached to "as (x, y):" introducing the block. 
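Shane's idea of a block as a code object executed in a chosen namespace can be approximated today with compile() and exec, minus the syntax he proposes. A rough sketch (the block source and names are made up):

```python
# A "block" as a plain code object, run later in a namespace we supply;
# the block's assignments land in that namespace, much as Shane wants.
block = compile("y = x * 2", "<block>", "exec")

namespace = {"x": 21}      # stands in for the captured globals/locals
exec(block, namespace)
print(namespace["y"])      # 42
```

What is missing, and what the whole thread is circling, is a way to write the block inline as an indented suite instead of a string.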
To conclude, I mocked up some potential examples for your entertainment. ;) Thanks for your time and consideration! -Shane Holloway Interfaces:: def interface(interfaceName, *bases, ***aBlockSuite): blockGlobals = aBlockSuite.globals().copy() blockGlobals.update(aBlockSuite.locals()) blockLocals = {} exec aBlock in blockGlobals, blockLocals return iterfaceType(interfaceName, bases, blockLocals) IFoo = interface('IFoo'): def isFoo(self): pass IBar = interface('IBar'): def isBar(self): pass IBaz = interface('IBaz', IFoo, IBar): def isBaz(self): pass Event Suites:: def eventSinksFor(events, ***aBlockSuite): blockGlobals = aBlockSuite.globals().copy() blockGlobals.update(aBlockSuite.locals()) blockLocals = {} exec aBlock in blockGlobals, blockLocals for name, value in blockLocals.iteritems(): if aBlockSuite.locals().get(name) is value: continue if callable(value): events.addEventFor(name, value) def debugScene(scene): eventSinksFor(scene.events): def onMove(pos): print "pos:", pos def onButton(which, state): print "button:", which, state def onKey(which, state): print "key:", which, state From p.f.moore at gmail.com Mon Apr 25 20:28:23 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Mon Apr 25 20:28:25 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <004201c5492e$ff05ca60$f100a8c0@ryoko> References: <004201c5492e$ff05ca60$f100a8c0@ryoko> Message-ID: <79990c6b05042511285237126c@mail.gmail.com> On 4/25/05, Tim Delaney wrote: > Guido van Rossum wrote: > > > but for backwards compatibility with the existing argument-less next() > > API I'm introducing a new iterator API next_ex() which takes an > > exception argument. If that argument is None, it should behave just > > like next(). Otherwise, if the iterator is a generator, this will > > Might this be a good time to introduce __next__ (having the same signature > and semantics as your proposed next_ex) and builtin next(obj, > exception=None)? 
> > def next(obj, exception=None): > > if hasattr(obj, '__next__'): > return obj.__next__(exception) > > if exception is not None: > return obj.next(exception) # Will raise an appropriate exception > > return obj.next() Hmm, it took me a while to get this, but what you're saying is that if you modify Guido's "what I really want" solution to use VAR = next(it, exc) then this builtin next makes "API v2" stuff using __next__ work while remaining backward compatible with old-style "API v1" stuff using 0-arg next() (as long as old-style stuff isn't used in a context where an exception gets passed back in). I'd suggest that the new builtin have a "magic" name (__next__ being the obvious one :-)) to make it clear that it's an internal implementation detail. Paul. PS The first person to replace builtin __next__ in order to implement a "next hook" of some sort, gets shot :-) From aahz at pythoncraft.com Mon Apr 25 20:49:34 2005 From: aahz at pythoncraft.com (Aahz) Date: Mon Apr 25 20:49:38 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <426D358C.70509@ieee.org> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> Message-ID: <20050425184934.GA15135@panix.com> On Mon, Apr 25, 2005, Shane Holloway (IEEE) wrote: > > Interfaces:: > > def interface(interfaceName, *bases, ***aBlockSuite): > blockGlobals = aBlockSuite.globals().copy() > blockGlobals.update(aBlockSuite.locals()) > blockLocals = {} > > exec aBlock in blockGlobals, blockLocals > > return iterfaceType(interfaceName, bases, blockLocals) > > IFoo = interface('IFoo'): > def isFoo(self): pass Where does ``aBlock`` come from? -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
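Tim Delaney's next() shim quoted above can be made runnable. In this sketch it is renamed next_compat to avoid shadowing the real built-in, and the two-argument __next__ reflects the proposal's hypothetical signature, not the protocol Python actually adopted; both iterator classes are made up:

```python
def next_compat(obj, exception=None):
    """Prefer the proposed two-argument __next__, else fall back to the
    classic zero-argument next() protocol."""
    if hasattr(obj, "__next__"):
        return obj.__next__(exception)
    if exception is not None:
        return obj.next(exception)  # old-style API will raise here
    return obj.next()

class OldStyle:
    """An 'API v1' iterator with only a zero-argument next()."""
    def __init__(self):
        self.i = 0
    def next(self):
        self.i += 1
        return self.i

class NewStyle:
    """An 'API v2' iterator accepting the exception argument."""
    def __next__(self, exception=None):
        if exception is not None:
            raise exception
        return 42

it = OldStyle()
print(next_compat(it), next_compat(it))   # 1 2
print(next_compat(NewStyle()))            # 42
```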
From shane at hathawaymix.org Mon Apr 25 21:13:06 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 21:13:12 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <79990c6b050425110660cc2f3@mail.gmail.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <79990c6b050425110660cc2f3@mail.gmail.com> Message-ID: <426D4142.8020703@hathawaymix.org> Paul Moore wrote: > I think the key difference with macros is that they act at compile > time, not at run time. There is no intention here to provide any form > of compile-time processing, and that makes all the difference. > > What I feel is the key concept here is that of "injecting" code into a > template form (try...finally, or try..except..else, or whatever) [1]. > This is "traditionally" handled by macros, and I see it as a *good* > sign, that the discussion has centred around runtime mechanisms rather > than compile-time ones. > > [1] Specifically, cases where functions aren't enough. If I try to > characterise precisely what those cases are, all I can come up with is > "when the code being injected needs to run in the current scope, not > in the scope of a template function". Is that right? That doesn't hold if the code being injected is a single Python expression, since you can put an expression in a lambda and code the template as a function. I would say you need a block template when the code being injected consists of one or more statements that need to run in the current scope. 
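Shane's distinction can be sketched concretely: a single expression is easy to inject into a template through a lambda, while a statement that rebinds names in the caller's scope is not. The with_open_file template and the temp-file handling below are made up for illustration:

```python
import os
import tempfile

def with_open_file(filename, body):
    """Template: open the file, run the injected expression `body` on
    it, and guarantee the close in a finally clause."""
    f = open(filename)
    try:
        return body(f)
    finally:
        f.close()

# A throwaway file so the sketch actually runs.
fd, path = tempfile.mkstemp()
os.write(fd, b"first line\nsecond line\n")
os.close(fd)

# Injecting a single *expression* needs no new syntax at all:
first = with_open_file(path, lambda f: f.readline())
print(first)

# But a *statement* that rebinds a name in the caller's scope cannot be
# wrapped this way: `lambda f: (total = len(f.read()))` is a syntax
# error, which is exactly the gap the blocks discussion is about.
os.remove(path)
```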
Shane From shane at hathawaymix.org Mon Apr 25 21:14:47 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 21:14:53 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426D41A7.1060605@hathawaymix.org> Michael Chermside wrote: > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. One thing we don't need, I believe, is arbitrary transformation of code objects. That's actually already possible, thanks to Python's compiler module, although the method isn't clean yet. Zope uses the compiler module to sandbox partially-trusted Python code. For example, it redirects all print statements and replaces operations that change an attribute with a call to a function that checks access before setting the attribute. 
Also, we don't need any of these macros, AFAICT: http://gauss.gwydiondylan.org/books/drm/drm_86.html Shane From shane.holloway at ieee.org Mon Apr 25 21:20:51 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Mon Apr 25 21:21:37 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425184934.GA15135@panix.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> <20050425184934.GA15135@panix.com> Message-ID: <426D4313.8030308@ieee.org> Aahz wrote: > On Mon, Apr 25, 2005, Shane Holloway (IEEE) wrote: > >>Interfaces:: >> >> def interface(interfaceName, *bases, ***aBlockSuite): >> blockGlobals = aBlockSuite.globals().copy() >> blockGlobals.update(aBlockSuite.locals()) >> blockLocals = {} >> >> exec aBlock in blockGlobals, blockLocals >> >> return iterfaceType(interfaceName, bases, blockLocals) >> >> IFoo = interface('IFoo'): >> def isFoo(self): pass > > > Where does ``aBlock`` come from? Sorry! I renamed ``aBlock`` to ``aBlockSuite``, but missed a few. ;) From jimjjewett at gmail.com Mon Apr 25 21:34:55 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon Apr 25 21:35:01 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: Guido: > My problem with macros is actually more practical: Python's compiler > is too dumb. I am assuming that we want to be able to import macros > from other modules, and I am assuming that macros are expanded by the > compiler, not at run time; but the compiler doesn't follow imports ... Expanding at run-time is less efficient, but it works at least as well semantically. If today's alternative is manual cut-n-paste, I would still rather have the computer do it for me, to avoid accidental forks. It could also be done (though not as cleanly) by making macros act as import hooks. import defmacro # Stop processing until defmacro is loaded. # All future lines will be preprocessed by the # hook collection ... 
from defmacro import foo # installs a foo hook, good for the rest of the file Michael Chermside: >> I think it would be useful if we approached it like this: either what >> we want is the full power of macros (in which case the syntax we choose >> should be guided by that choice), or we want LESS than the full power >> of macros. If we want less, then HOW less? >> In other words, rather than hearing what we'd like to be able to DO >> with blocks, I'd like to hear what we want to PROHIBIT DOING with >> blocks. I think this might be a fruitful way of thinking about the >> problem which might make it easier to evaluate syntax suggestions. And >> if the answer is that we want to prohibit nothing, then the right >> solution is macros. > I'm personally at a loss understanding your question here. Perhaps you > could try answering it for yourself? Why not just introduce macros? If the answer is "We should, it is just hard to code", then use a good syntax for macros. If the answer is "We don't want xx sss (S\ Guido writes: > My problem with macros is actually more practical: Python's compiler > is too dumb. I am assuming that we want to be able to import macros > from other modules, and I am assuming that macros are expanded by the > compiler, not at run time; but the compiler doesn't follow imports > (that happens at run time) so there's no mechanism to tell the > compiler about the new syntax. And macros that don't introduce new > syntax don't seem very interesting (compared to what we can do > already). That's good to hear. It expresses fairly clearly what the challenges are in implementing macros for Python, and expressing the challenges makes it easier to attack the problem. My interest comes because some recent syntax changes (generators, generator expressions) have seemed to me like true language changes, but others (decorators, anonymous-blocks) to me just cry out "this would be easy as a macro!". 
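Decorators, which Michael calls out as the "easy as a macro" case, already give a run-time, function-granularity version of this boilerplate folding. A minimal sketch (the calls list is a stand-in for real logging):

```python
import functools

calls = []  # stand-in for real logging output

def traced(func):
    """Wrap the same before/after boilerplate around any function --
    macro-like reuse, but resolved entirely at run time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("enter %s" % func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            calls.append("exit %s" % func.__name__)
    return wrapper

@traced
def add(a, b):
    return a + b

result = add(2, 3)
print(result, calls)
```

The wrapped body still runs in its own scope, though, which is why decorators fall short of the scope-sharing the block proposals ask for.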
I wrote: > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? > > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. Guido replied: > I'm personally at a loss understanding your question here. Perhaps you > could try answering it for yourself? You guys just think too fast for me. When I started this email, I replied "Fair enough. One possibility is...". But while I was trying to condense my thoughts down from 1.5 pages to something short and coherent (it takes time to write it short) everything I was thinking became obsolete as both Paul Moore and Jim Jewett did exactly the kind of thinking I was hoping to inspire: Paul: > What I feel is the key concept here is that of "injecting" code into a > template form (try...finally, or try..except..else, or whatever) [...] > Specifically, cases where functions aren't enough. If I try to > characterise precisely what those cases are, all I can come up with is > "when the code being injected needs to run in the current scope, not > in the scope of a template function". Is that right? Jim: > Why not just introduce macros? If the answer is "We should, it is just > hard to code", then use a good syntax for macros. If the answer is > "We don't want > xx sss (S\ to ever be meaningful", then we need to figure out exactly what to > prohibit. [...] > Do we want to limit the changing part (the "anonymous block") to > only a single suite? 
That does work well with the "yield" syntax, but it > seems like an arbitrary restriction unless *all* we want are resource > wrappers. > > Or do we really just want a way to say that a function should share its > local namespace with it's caller or callee? In that case, maybe the answer > is a "lexical" or "same_namespace" keyword. My own opinion is that we DO want macros. I prefer a language have a few, powerful constructs rather than lots of specialized ones. (Yet I still believe that "doing different things should look different"... which is why I prefer Python to Lisp.) I think that macros could solve a LOT of problems. There are lots of things one might want to replace within macros, from identifiers to punctuation, but I'd be willing to live with just two of them: expressions, and "series-of-statements" (that's almost the same as a block). There are only two places I'd want to be able to USE a macro: where an expression is called for, and where a series-of-statements is called for. In both cases, I'd be happy with a function-call like syntax for including the macro. Well, that's a lot of "wanting"... now I all I need to do is invent a clever syntax that allows these in an elegant fashion while also solving Guido's point about imports (hint: the answer is that it ALL happens at runtime). I'll go think some while you guys zoom past me again. -- Michael Chermside From gvanrossum at gmail.com Mon Apr 25 22:16:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Mon Apr 25 22:16:18 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: > It could also be done (though not as cleanly) by making macros act as > import hooks. > > import defmacro # Stop processing until defmacro is loaded. > # All future lines will be preprocessed by the > # hook collection > ... > from defmacro import foo # installs a foo hook, good for the rest of the file Brrr. What about imports that aren't at the top level (e.g. 
inside a function)? > Why not just introduce macros? Because I've been using Python for 15 years without needing them? Sorry, but "why not add feature X" is exactly what we're trying to AVOID here. You've got to come up with some really good use cases before we add new features. "I want macros" just doesn't cut it. > If the answer is "We should, it is just > hard to code", then use a good syntax for macros. If the answer is > "We don't want > > xx sss (S\ > to ever be meaningful", then we need to figure out exactly what to > prohibit. Lisp macros are (generally, excluding read macros) limited > to taking and generating complete S-expressions. If that isn't enough > to enforce readability, then limiting blocks to expressions (or even > statements) probably isn't enough in python. I suspect you've derailed here. Or perhaps you should use a better example; I don't understand what the point is of using an example like "xx sss (S\ Do we want to limit the changing part (the "anonymous block") to > only a single suite? That does work well with the "yield" syntax, but it > seems like an arbitrary restriction unless *all* we want are resource > wrappers. Or loops, of course. Perhaps you've missed some context here? Nobody seems to be able to come up with other use cases, that's why "yield" is so attractive. > Or do we really just want a way to say that a function should share its > local namespace with it's caller or callee? In that case, maybe the answer > is a "lexical" or "same_namespace" keyword. Or maybe just a recipe to make > exec or eval do the right thing. > > def myresource(rcname, callback, *args): > rc=open(rcname) > same_namespace callback(*args) > close(rc) > > def process(*args): > ... But should the same_namespace modifier be part of the call site or part of the callee? You seem to be tossing examples around a little easily here. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From fumanchu at amor.org Mon Apr 25 22:30:31 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 22:28:57 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF6@exchange.hqamor.amorhq.net> Michael Chermside wrote: > Jim: > > Why not just introduce macros? If the answer is "We > > should, it is just hard to code", then use a good > > syntax for macros. If the answer is "We don't want > > xx sss (S\ > to ever be meaningful", then we need to figure out exactly what to > > prohibit. > [...] > > Do we want to limit the changing part (the "anonymous block") to > > only a single suite? That does work well with the "yield" > syntax, but it > > seems like an arbitrary restriction unless *all* we want > are resource > > wrappers. > > > > Or do we really just want a way to say that a function > should share its > > local namespace with it's caller or callee? In that case, > maybe the answer > > is a "lexical" or "same_namespace" keyword. > > My own opinion is that we DO want macros. I prefer a language > have a few, > powerful constructs rather than lots of specialized ones. (Yet I still > believe that "doing different things should look > different"... which is > why I prefer Python to Lisp.) I think that macros could solve a LOT of > problems. > > There are lots of things one might want to replace within macros, from > identifiers to punctuation, but I'd be willing to live with > just two of them: expressions, and "series-of-statements" > (that's almost the same as a block). There are only two places > I'd want to be able to USE a macro: where an expression is > called for, and where a series-of-statements is called for. > In both cases, I'd be happy with a function-call > like syntax for including the macro. By "function-call like syntax" you mean something like this? 
def safe_file(filename, body, cleanup): f = open(filename) try: body() finally: f.close() cleanup() ... defmacro body: for line in f: print line[:line.find(":")] defmacro cleanup: print "file closed successfully" safe_file(filename, body, cleanup) If macros were to be evaluated at runtime, I'd certainly want to see them be first-class (meaning able to be referenced and passed around); I don't have much of a need for anonymous macros. Robert Brewer MIS Amor Ministries fumanchu@amor.org From fumanchu at amor.org Mon Apr 25 23:02:55 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 23:01:18 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> Guido van Rossum wrote: > > Why not just introduce macros? > > Because I've been using Python for 15 years without needing them? > Sorry, but "why not add feature X" is exactly what we're trying to > AVOID here. You've got to come up with some really good use cases > before we add new features. "I want macros" just doesn't cut it. I had a use-case recently which could be done using macros. I'll let you all decide whether it would be "better" with macros or not. ;) My poor-man's ORM uses descriptors to handle the properties of domain objects, a lot of which need custom triggers, constraint-checking, notifications, etc. The base class has: def __set__(self, unit, value): if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: unit._properties[self.key] = value At one time, I had something like: def __set__(self, unit, value): if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: if self.pre: self.pre(unit, value) unit._properties[self.key] = value if self.post: self.post(unit, value) ...to run pre- and post-triggers. 
But that became unwieldy recently when one of my post functions depended upon calculations inside the corresponding pre function. So currently, all subclasses just override __set__, which leads to a *lot* of duplication of code. If I could write the base class' __set__ to call "macros" like this: def __set__(self, unit, value): self.begin() if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: self.pre() unit._properties[self.key] = value self.post() self.end() defmacro begin: pass defmacro pre: pass defmacro post: pass defmacro end: pass ...(which would require macro-blocks which were decidedly *not* anonymous) then I could more cleanly write a subclass with additional "macro" methods: defmacro pre: old_children = self.children() defmacro post: for child in self.children: if child not in old_children: notify_somebody("New child %s" % child) Notice that the "old_children" local gets injected into the namespace of __set__ (the caller) when "pre" is executed, and is available inside of "post". The "self" name doesn't need to be rebound, either, since it is also available in __set__'s local scope. We also avoid all of the overhead of separate frames. The above is quite ugly written with callbacks (due to excessive argument passing), and is currently fragile when overriding __set__ (due to duplicated code). I'm sure there are other cases with both 1) a relatively invariant series of statements and 2) complicated extensions of that series. Of course, you can do the above with compile() and exec. Maybe I'm just averse to code within strings. Some ideas. Now tear 'em apart. 
:) Robert Brewer MIS Amor Ministries fumanchu@amor.org From jimjjewett at gmail.com Mon Apr 25 23:04:34 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon Apr 25 23:04:37 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: Michael Chermside: > There are lots of things one might want to replace within macros, from > identifiers to punctuation, but I'd be willing to live with just two of > them: expressions, and "series-of-statements" (that's almost the same as > a block). There are only two places I'd want to be able to USE a macro: > where an expression is called for, and where a series-of-statements is > called for. In both cases, I'd be happy with a function-call like syntax > for including the macro. I have often wanted to replace (parts of) strings, either because I'm writing a wrapper or because I want a non-English version to be loadable without having to wrap strings in my own source code. This is best done as an import hook, but if I had read-write access to (a copy of) the source code, I would use it. I'm not sure I want that door opened, because if I start needing to parse regex substitutions just to get a source code listing ... I won't be happy. I do think macros should be prevented from "changing the level" of the code they replace. Any suites/statements/expressions (including parentheses and strings) that are open before the macro must still be open afterwards, and any opened inside the macro must be closed inside the macro. For example def foo(x): print x macro1(x) print x might print different values for x on the two lines, but I would be less comfortable if it could result in any of the following: def foo(x): print x while True: # An invisible loop, because of print x # Changing the indent level def foo(x): print x return # and you thought it would print twice! (This one is iffy) print x def foo(x) print x [(""" (unclosed string or paren eats up the rest of the file...) print x def foo(x) "Hah! 
my backspaces and rubouts eliminated the print statements!" def foo(x) print x def anotherfunc(x, y, z): print x # Hey, I didn't even mess with the indent! And to be honest, even def foo(x): macro1(x) stmt1() # syntax error, except for the macro, so not ambiguous expanding to def foo(x) while x: print x # macro does not end on same indent level stmt1() is ... not something I want to worry about when I'm reading. Michael Chermside: > (hint: the answer is that it ALL happens at runtime). I have mixed feelings on this. It is more powerful that way, but it also limits future implementations -- and I'm not sure the extra power is entirely a good thing. defmacro(): print x # Hey, x was in global scope at runtime when *I* tested On The Other Hand, this certainly isn't the only piece of python that could *usually* be moved to compile-time, and I suppose it could piggyback on whatever extension is used for speeding up attribute lookup. -jJ From tjreedy at udel.edu Mon Apr 25 23:13:05 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Mon Apr 25 23:15:45 2005 Subject: [Python-Dev] Re: Re: Caching objects in memory References: Message-ID: Guido: But for *immutable* objects (like numbers, strings and tuples) the implementation is free to use caching. In practice, I believe ints between -5 and 100 are cached, and 1-character strings are often cached (but not always). Hope this helps! I would think this is in the docs somewhere but probably not in a place where one would ever think to look... ----------- I am sure that the fact that immutables *may* be cached is in the ref manual, but I have been under the impression that the private, *mutable* specifics for CPython are intentionally omitted so that people will not think of them as either fixed or as part of the language/library. I have previously suggested that there be a separate doc for CPython implementation details like this that some people want but which are not part of the language or library definition. Terry J. 
Reedy From shane at hathawaymix.org Mon Apr 25 23:29:01 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Mon Apr 25 23:29:10 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF7@exchange.hqamor.amorhq.net> Message-ID: <426D611D.9060100@hathawaymix.org> Robert Brewer wrote: > So currently, all subclasses just override __set__, which leads to a > *lot* of duplication of code. If I could write the base class' __set__ > to call "macros" like this: > > def __set__(self, unit, value): > self.begin() > if self.coerce: > value = self.coerce(unit, value) > oldvalue = unit._properties[self.key] > if oldvalue != value: > self.pre() > unit._properties[self.key] = value > self.post() > self.end() > > defmacro begin: > pass > > defmacro pre: > pass > > defmacro post: > pass > > defmacro end: > pass

Here is a way to write that using anonymous blocks:

    def __set__(self, unit, value):
        with self.setting(unit, value):
            if self.coerce:
                value = self.coerce(unit, value)
            oldvalue = unit._properties[self.key]
            if oldvalue != value:
                with self.changing(oldvalue, value):
                    unit._properties[self.key] = value

    def setting(self, unit, value):
        # begin code goes here
        yield None
        # end code goes here

    def changing(self, oldvalue, newvalue):
        # pre code goes here
        yield None
        # post code goes here

> ...(which would require macro-blocks which were decidedly *not* > anonymous) then I could more cleanly write a subclass with additional > "macro" methods: > > defmacro pre: > old_children = self.children() > > defmacro post: > for child in self.children: > if child not in old_children: > notify_somebody("New child %s" % child)

    def changing(self, oldvalue, newvalue):
        old_children = self.children()
        yield None
        for child in self.children:
            if child not in old_children:
                notify_somebody("New child %s" % child)

Which do you prefer? I like fewer methods.
;-) Shane From fumanchu at amor.org Mon Apr 25 23:40:12 2005 From: fumanchu at amor.org (Robert Brewer) Date: Mon Apr 25 23:38:34 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Shane Hathaway wrote: > Robert Brewer wrote: > > So currently, all subclasses just override __set__, which leads to a > > *lot* of duplication of code. If I could write the base > class' __set__ > > to call "macros" like this: > > > > def __set__(self, unit, value): > > self.begin() > > if self.coerce: > > value = self.coerce(unit, value) > > oldvalue = unit._properties[self.key] > > if oldvalue != value: > > self.pre() > > unit._properties[self.key] = value > > self.post() > > self.end() > > > > defmacro begin: > > pass > > > > defmacro pre: > > pass > > > > defmacro post: > > pass > > > > defmacro end: > > pass > > Here is a way to write that using anonymous blocks: > > def __set__(self, unit, value): > with self.setting(unit, value): > if self.coerce: > value = self.coerce(unit, value) > oldvalue = unit._properties[self.key] > if oldvalue != value: > with self.changing(oldvalue, value): > unit._properties[self.key] = value > > def setting(self, unit, value): > # begin code goes here > yield None > # end code goes here > > def changing(self, oldvalue, newvalue): > # pre code goes here > yield None > # post code goes here > ... > Which do you prefer? I like fewer methods. ;-) I still prefer more methods, because my actual use-cases are more complicated. Your solution would work for the specific case I gave, but try factoring in: * A subclass which needs to share locals between begin and post, instead of pre and post. or * A set of 10 subclasses which need the same begin() but different end() code. Yielding seems both too restrictive and too inside-out to be readable, IMO. 
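Shane's pattern above, a generator whose code before the yield is the "begin" half and whose code after it is the "end" half, can be exercised today with a small hand-rolled driver in place of the proposed with-statement. This is only a sketch: run_with(), demo_setting(), and events are invented names, not anything proposed in the thread.

```python
# A generator yields once; everything before the yield is the "begin"
# half, everything after it is the "end" half.  run_with() is a made-up
# driver standing in for the proposed with-statement.
def run_with(gen, body):
    next(gen)             # run the begin half, up to the yield
    try:
        body()            # the anonymous block
    finally:
        try:
            next(gen)     # run the end half, after the yield
        except StopIteration:
            pass          # generator finished normally

events = []

def demo_setting(unit, value):
    events.append("begin")
    yield
    events.append("end")

run_with(demo_setting("unit", 42), lambda: events.append("block"))
print(events)  # ['begin', 'block', 'end']
```

The finally clause is what makes the "end" half run even if the block raises, which is the whole point of the begin/end pairing being debated.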
Robert Brewer MIS Amor Ministries fumanchu@amor.org From jjinux at gmail.com Mon Apr 25 23:52:01 2005 From: jjinux at gmail.com (Shannon -jj Behrens) Date: Mon Apr 25 23:52:04 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <426CAE64.8080404@egenix.com> References: <740c3aec0504191557505d6e9f@mail.gmail.com> <877e9a170504191855445e0f4d@mail.gmail.com> <20050419212423.63AD.JCARLSON@uci.edu> <4266CC49.9080901@egenix.com> <426CAE64.8080404@egenix.com> Message-ID: On 4/25/05, M.-A. Lemburg wrote: > Shannon -jj Behrens wrote: > > On 4/20/05, M.-A. Lemburg wrote: > > > >>Fredrik Lundh wrote: > >> > >>>PS. a side effect of the for-in pattern is that I'm beginning to feel > >>>that Python > >>>might need a nice "switch" statement based on dictionary lookups, so I can > >>>replace multiple callbacks with a single loop body, without writing too > >>>many > >>>if/elif clauses. > >> > >>PEP 275 anyone ? (http://www.python.org/peps/pep-0275.html) > >> > >>My use case for switch is that of a parser switching on tokens. > >> > >>mxTextTools applications would greatly benefit from being able > >>to branch on tokens quickly. Currently, there's only callbacks, > >>dict-to-method branching or long if-elif-elif-...-elif-else. > > > > I think "match" from Ocaml would be a much nicer addition to Python > > than "switch" from C. > > PEP 275 is about branching based on dictionary lookups which > is somewhat different than pattern matching - for which we > already have lots and lots of different tools. > > The motivation behind the switch statement idea is that of > interpreting the multi-state outcome of some analysis that > you perform on data. The main benefit is avoiding Python > function calls which are very slow compared to branching to > inlined Python code. 
> > Having a simple switch statement > would enable writing very fast parsers in Python - > you'd let one of the existing tokenizers such as mxTextTools, > re or one of the xml libs create the token input data > and then work on the result using a switch statement. > > Instead of having one function call per token, you'd > only have a single dict lookup. > > BTW, has anyone in this thread actually read the PEP 275 ? I'll admit that I haven't because dict-based lookups aren't as interesting to me as an Ocaml-style match statement. Furthermore, the argument "Instead of having one function call per token, you'd only have a single dict lookup" isn't very compelling to me personally, because I don't have such a performance problem in my applications, which isn't to say that it isn't important or that you don't have a valid point. Best Regards, -jj -- I have decided to switch to Gmail, but messages to my Yahoo account will still get through. From jimjjewett at gmail.com Tue Apr 26 00:01:06 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 00:01:09 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: On 4/25/05, Guido van Rossum wrote: > > It could also be done (though not as cleanly) by making macros act as > > import hooks. > Brrr. What about imports that aren't at the top level (e.g. inside a function)? Bad style already. :D If you want to use the macro, you have to ensure it was already imported. That said, I did say it wasn't as clean; think of it like pre-caching which dictionary that resolved an attribute lookup. Don't start with the complexity, but consider not making the optimization impossible. > > Why not just introduce macros? > Because I've been using Python for 15 years without needing them? And also without anonymous blocks or generator finalizers or resource managers. > Sorry, but "why not add feature X" is exactly what we're trying to > AVOID here. 
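The dict-based branching MAL describes for token parsing, one dict lookup per token instead of an if/elif chain, can be sketched without any new syntax. The token set and handler names below are invented for illustration; a real tokenizer such as mxTextTools or re would supply the tokens.

```python
# One dict lookup per token instead of an if/elif chain -- the shape of
# the fast-parser argument above.  Tokens and handlers are invented.
def on_plus(tok):
    return ("OP", "+")

def on_minus(tok):
    return ("OP", "-")

def on_atom(tok):
    # default branch: anything the dispatch table doesn't know about
    return ("ATOM", tok)

dispatch = {"+": on_plus, "-": on_minus}

def process(tokens):
    return [dispatch.get(tok, on_atom)(tok) for tok in tokens]

print(process(["1", "+", "2"]))  # [('ATOM', '1'), ('OP', '+'), ('ATOM', '2')]
```

The cost MAL objects to is still there, of course: each branch is a function call. A switch statement would inline the branch bodies, which is exactly what this dict-of-functions idiom cannot do.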
If anything is added, it might be better to add a single generalized tool instead of several special cases -- unless the tool is so general as to be hazardous. Unlimited macros are that hazardous. > > If the answer is "We don't want > > > > xx sss (S\ > > > to ever be meaningful", then we need to figure out exactly what to > > prohibit. > I don't understand what the point is of using an example like > "xx sss (S\> [yield works great for a single "anonymous block", but not so >> great for several blocks per macro/template.] > Pehaps you've missed some context here? Nobody seems to be able to > come up with other [than resource wrappers] use cases, that's why > "yield" is so attractive. Sorry; to me it seemed obvious that you would occasionally want to interleave the macro/template and the variable portion. Robert Brewer has since provided good examples at http://mail.python.org/pipermail/python-dev/2005-April/052923.html http://mail.python.org/pipermail/python-dev/2005-April/052924.html > > Or do we really just want a way to say that a function should share its > > local namespace with it's caller or callee? In that case, maybe the answer > > is a "lexical" or "same_namespace" keyword. Or maybe just a recipe to make > > exec or eval do the right thing. > But should the same_namespace modifier be part of the call site or > part of the callee? IMHO, it should be part of the calling site, because it is the calling site that could be surprised to find its own locals modified. The callee presumably runs through a complete call before it has a chance to be surprised. I did leave the decision open because I'm not certain that mention-in-caller wouldn't end up contorting a common code style. (It effectively forces the macro to be in control, and the "meaningful" code to be callbacks.) 
-jJ From tcdelaney at optusnet.com.au Tue Apr 26 00:10:44 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Tue Apr 26 00:10:46 2005 Subject: [Python-Dev] Re: anonymous blocks References: <004201c5492e$ff05ca60$f100a8c0@ryoko> <79990c6b05042511285237126c@mail.gmail.com> Message-ID: <001201c549e3$a07d57a0$0700a8c0@ryoko> Paul Moore wrote: > Hmm, it took me a while to get this, but what you're saying is that > if you modify Guido's "what I really want" solution to use > > VAR = next(it, exc) > > then this builtin next makes "API v2" stuff using __next__ work while > remaining backward compatible with old-style "API v1" stuff using > 0-arg next() (as long as old-style stuff isn't used in a context where > an exception gets passed back in). Yes, but it could also be used (almost) anywhere an explicit obj.next() is used.

    it = iter(seq)
    while True:
        print next(it)

for loops would also change to use builtin next() rather than calling it.next() directly. > I'd suggest that the new builtin have a "magic" name (__next__ being > the obvious one :-)) to make it clear that it's an internal > implementation detail. There aren't many builtins that have magic names, and I don't think this should be one of them - it has obvious uses other than as an implementation detail. > PS The first person to replace builtin __next__ in order to implement > a "next hook" of some sort, gets shot :-) Damn! There goes the use case ;) Tim Delaney From gvanrossum at gmail.com Tue Apr 26 00:11:07 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 00:11:09 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: It seems that what you call macros is really an unlimited preprocessor. I'm even less interested in that topic than in macros, and I haven't seen anything here to change my mind.
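The two-API situation Paul and Tim discuss can be bridged by exactly the kind of builtin they describe. Here is a sketch, where next_compat and OldStyle are invented names, of a helper that prefers a new-style __next__ ("API v2") and falls back to an old-style 0-arg next() ("API v1"); Python's real next() builtin, added later, behaves much like the first branch.

```python
# Prefer the new-style __next__ method and fall back to the old-style
# next() method.  next_compat and OldStyle are illustrative names only.
def next_compat(it):
    meth = getattr(it, "__next__", None)
    if meth is None:
        meth = it.next        # old-style ("API v1") iterator protocol
    return meth()

class OldStyle:
    """An iterator exposing only the old 0-arg next() method."""
    def __init__(self):
        self.n = 0
    def next(self):
        self.n += 1
        return self.n

print(next_compat(iter([10, 20])))  # 10 -- via __next__
print(next_compat(OldStyle()))      # 1  -- via the next() fallback
```

This covers only the 0-argument case; the exception-passing form next(it, exc) from Guido's sketch would need the extra argument threaded through to a next_ex-style method.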
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Tue Apr 26 00:20:04 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 00:20:07 2005 Subject: [Python-Dev] Re: switch statement Message-ID: M.-A. Lemburg wrote: > Having a simple switch statement > would enable writing very fast parsers in Python - ... > Instead of having one function call per token, you'd > only have a single dict lookup. > BTW, has anyone in this thread actually read the PEP 275 ? I haven't actually seen any use cases outside of parsers branching on a constant token. When I see stacked elif clauses, the condition almost always includes some computation (perhaps only ".startswith" or "in" or a regex match), and there are often cases which look at a second variable. If speed for a limited number of cases is the only advantage, then I would say it belongs in (at most) the implementation, rather than the language spec. -jJ From shane at hathawaymix.org Tue Apr 26 00:30:09 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Tue Apr 26 00:30:17 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Message-ID: <426D6F71.6020103@hathawaymix.org> Robert Brewer wrote: > I still prefer more methods, because my actual use-cases are more > complicated. Your solution would work for the specific case I gave, but > try factoring in: > > * A subclass which needs to share locals between begin and post, instead > of pre and post. > > or > > * A set of 10 subclasses which need the same begin() but different end() > code. > > Yielding seems both too restrictive and too inside-out to be readable, > IMO. Ok, that makes sense. However, one of your examples seemingly pulls a name, 'old_children', out of nowhere. That's hard to fix. 
One of the greatest features of Python is the simple name scoping; we can't lose that. Shane From pedronis at strakt.com Tue Apr 26 00:45:43 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue Apr 26 00:44:02 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> References: <3A81C87DC164034AA4E2DDFE11D258E3771EF9@exchange.hqamor.amorhq.net> Message-ID: <426D7317.10104@strakt.com> Robert Brewer wrote: > Shane Hathaway wrote: > >>Robert Brewer wrote: >> >>>So currently, all subclasses just override __set__, which leads to a >>>*lot* of duplication of code. If I could write the base >> >>class' __set__ >> >>>to call "macros" like this: >>> >>> def __set__(self, unit, value): >>> self.begin() >>> if self.coerce: >>> value = self.coerce(unit, value) >>> oldvalue = unit._properties[self.key] >>> if oldvalue != value: >>> self.pre() >>> unit._properties[self.key] = value >>> self.post() >>> self.end() >>> >>> defmacro begin: >>> pass >>> >>> defmacro pre: >>> pass >>> >>> defmacro post: >>> pass >>> >>> defmacro end: >>> pass >> >>Here is a way to write that using anonymous blocks: >> >> def __set__(self, unit, value): >> with self.setting(unit, value): >> if self.coerce: >> value = self.coerce(unit, value) >> oldvalue = unit._properties[self.key] >> if oldvalue != value: >> with self.changing(oldvalue, value): >> unit._properties[self.key] = value >> >> def setting(self, unit, value): >> # begin code goes here >> yield None >> # end code goes here >> >> def changing(self, oldvalue, newvalue): >> # pre code goes here >> yield None >> # post code goes here >> > > ... > >>Which do you prefer? I like fewer methods. ;-) > > > I still prefer more methods, because my actual use-cases are more > complicated. Your solution would work for the specific case I gave, but > try factoring in: > > * A subclass which needs to share locals between begin and post, instead > of pre and post. 
> > or > > * A set of 10 subclasses which need the same begin() but different end() > code. > > Yielding seems both too restrictive and too inside-out to be readable, > IMO.

it seems what you are asking for are functions that are evaluated in the namespace of the caller:

- this seems fragile; the only safe way to implement 'begin' etc. is to know exactly what goes on in __set__ and what names are used there

- if you throw in deferred evaluation for exprs or suites passed in as arguments it gets even worse; and even without considering that, it seems pretty horrid implementation-wise

Notice that even in Common Lisp you cannot really do this; you could define a macro that produces a definition for __set__ and takes fragments corresponding to begin ... etc From abo at minkirri.apana.org.au Tue Apr 26 02:01:05 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Apr 26 02:01:21 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: Message-ID: <1114473665.3698.2.camel@schizo> On Mon, 2005-04-25 at 18:20 -0400, Jim Jewett wrote: [...] > If speed for a limited number of cases is the only advantage, > then I would say it belongs in (at most) the implementation, > rather than the language spec. Agreed. I don't find any switch syntaxes better than if/elif/else. Speed benefits belong in implementation optimisations, not new bad syntax. -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From exogen at gmail.com Tue Apr 26 03:21:37 2005 From: exogen at gmail.com (Brian Beck) Date: Tue Apr 26 03:25:48 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <1114473665.3698.2.camel@schizo> References: <1114473665.3698.2.camel@schizo> Message-ID: Donovan Baarda wrote: > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > benefits belong in implementation optimisations, not new bad syntax. I posted this 'switch' recipe to the Cookbook this morning, it saves some typing over the if/elif/else construction, and people seemed to like it.
Take a look: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/410692 -- Brian Beck Adventurer of the First Order From abo at minkirri.apana.org.au Tue Apr 26 04:20:07 2005 From: abo at minkirri.apana.org.au (Donovan Baarda) Date: Tue Apr 26 04:21:36 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: References: <1114473665.3698.2.camel@schizo> Message-ID: <1114482007.3698.15.camel@schizo> On Mon, 2005-04-25 at 21:21 -0400, Brian Beck wrote: > Donovan Baarda wrote: > > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > > benefits belong in implementation optimisations, not new bad syntax. > > I posted this 'switch' recipe to the Cookbook this morning, it saves > some typing over the if/elif/else construction, and people seemed to > like it. Take a look: > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/410692 Very clever... you have shown that current python syntax is capable of almost exactly replicating a C case statement. My only problem is C case statements are ugly. A simple if/elif/else is much more understandable to me. The main benefit in C of case statements is the compiler can optimise them. This copy of a C case statement will be slower than an if/elif/else, and just as ugly :-) -- Donovan Baarda http://minkirri.apana.org.au/~abo/ From tjreedy at udel.edu Tue Apr 26 04:33:23 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 26 04:33:44 2005 Subject: [Python-Dev] Re: Re: Caching objects in memory References: Message-ID: "Terry Reedy" wrote in message news:d4jm79$uji$1@sea.gmane.org... > Guido: > > But for *immutable* objects (like numbers, strings and tuples) the > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). > > Hope this helps! I would think this is in the docs somewhere but > probably not in a place where one would ever think to look... 
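The caching Guido describes can be observed directly. Note that everything below is a CPython implementation detail, which is exactly Terry's point: the cached range (he recalled "-5 and 100" in 2005; current CPython caches roughly -5 through 256) is not guaranteed by the language.

```python
# Small ints are cached by CPython (an implementation detail).  Using
# int("...") forces the objects to be created at runtime, so the result
# isn't an artifact of compile-time constant folding.
a = int("10")
b = int("10")      # both come back from the small-int cache
c = int("1000")
d = int("1000")    # outside the cache: two distinct objects
print(a is b, c is d)  # True False (in CPython)
```

Code that relies on `is` giving True here is broken by definition; only `==` is guaranteed.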
> > ----------- To be clearer, the above quotes what Guido wrote in the post of his that I am responding to. Only the below is my response. > I am sure that the fact that immutables *may* be cached is in the ref > manual, but I have been under the impression that the private, *mutable* > specifics for CPython are intentionally omitted so that people will not > think of them as either fixed or as part of the language/library. > > I have previously suggested that there be a separate doc for CPython > implementation details like this that some people want but which are not > part of the language or library definition. > Terry J. Reedy From greg.ewing at canterbury.ac.nz Tue Apr 26 04:47:37 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 04:47:54 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <001201c549e3$a07d57a0$0700a8c0@ryoko> References: <004201c5492e$ff05ca60$f100a8c0@ryoko> <79990c6b05042511285237126c@mail.gmail.com> <001201c549e3$a07d57a0$0700a8c0@ryoko> Message-ID: <426DABC9.4020801@canterbury.ac.nz> Tim Delaney wrote: > There aren't many builtins that have magic names, and I don't think this > should be one of them - it has obvious uses other than as an > implementation detail. I think there's some confusion here. As I understood the suggestion, __next__ would be the Python name of the method corresponding to the tp_next typeslot, analogously with __len__, __iter__, etc. There would be a builtin function next(obj) which would invoke obj.__next__(), for use by Python code. For loops wouldn't use it, though; they would continue to call the tp_next typeslot directly. > Paul Moore wrote: >> PS The first person to replace builtin __next__ in order to implement >> a "next hook" of some sort, gets shot :-) I think he meant next(), not __next__. And it wouldn't work anyway, since as I mentioned above, C code would bypass next() and call the typeslot directly. I'm +1 on moving towards __next__, BTW. IMO, that's the WISHBDITFP. 
:-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From michael.walter at gmail.com Tue Apr 26 05:15:52 2005 From: michael.walter at gmail.com (Michael Walter) Date: Tue Apr 26 05:15:55 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: References: Message-ID: <877e9a1705042520151e493f3f@mail.gmail.com> A couple of examples out of my tired head (solely from a user perspective) :-) Embedding domain specific language (ex.: state machine):

    stateful Person:
        state Calm(initial=True):
            def react(event):
                self.chill_pill.take()
                ignore(event)
        state Furious:
            def react(event):
                self.say("Macros are the evil :)")
                react(event)  # xD

    p = Person()
    p.become(Furious)
    p.react(42)

---

Embedding domain specific language (ex.: markup language):

    # no, i haven't thought about whether the presented syntax as such is unambiguous
    # enough to make sense
    def hello_world():
        :
            :
                : "Tralalalala"
            <body>:
                for g in uiods:
                    <h1>: uido2str(g)

---

Embedding domain-specific language (ex.: badly-designed database table):

    deftable Player:
        id: primary_key(integer)  # does this feel backward?
        handle: string
        fans: m2n_assoc(Fan)

---

Control constructs:

    forever:
        print "tralalala"

    unless you.are(LUCKY):
        print "awwww"

I'm not sure whether this is the Python you want it to become, so in a certain sense I feel kind of counterproductive now (sublanguage design is hard at 11 PM, which might actually prove someone's point that the language designer shouldn't allow people to do such things). I'm sure other people are more mature or at least less tired than me, though, so I beg to differ :-), Michael On 4/25/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > It seems that what you call macros is really an unlimited > preprocessor.
> I'm even less interested in that topic than in macros, > and I haven't seen anything here to change my mind. > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/michael.walter%40gmail.com > From greg.ewing at canterbury.ac.nz Tue Apr 26 05:38:48 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 05:39:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042416572da9db71@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> Message-ID: <426DB7C8.5020708@canterbury.ac.nz> Guido van Rossum wrote:

> with VAR = EXPR:
>     BODY
>
> This would translate to the following code:
>
> it = EXPR
> err = None
> while True:
>     try:
>         if err is None:
>             VAR = it.next()
>         else:
>             VAR = it.next_ex(err)
>     except StopIteration:
>         break
>     try:
>         err = None
>         BODY
>     except Exception, err:  # Pretend "except Exception:" == "except:"
>         if not hasattr(it, "next_ex"):
>             raise

I like the general shape of this, but I have one or two reservations about the details. 1) We're going to have to think carefully about the naming of functions designed for use with this statement. If 'with' is going to be in there as a keyword, then it really shouldn't be part of the function name as well. Instead of

    with f = with_file(pathname):
        ...

I would rather see something like

    with f = opened(pathname):
        ...

This sort of convention (using a past participle as a function name) would work for some other cases as well:

    with some_data.locked():
        ...

    with some_resource.allocated():
        ...

On the negative side, not having anything like 'with' in the function name means that the fact the function is designed for use in a with-statement could be somewhat non-obvious.
Since there's not going to be much other use for such a function, this is a bad thing. It could also lead people into subtle usage traps such as

    with f = open(pathname):
        ...

which would fail in a somewhat obscure way. So maybe the 'with' keyword should be dropped (again!) in favour of

    with_opened(pathname) as f:
        ...

2) I'm not sure about the '='. It makes it look rather deceptively like an ordinary assignment, and I'm sure many people are going to wonder what the difference is between

    with f = opened(pathname):
        do_stuff_to(f)

and simply

    f = opened(pathname)
    do_stuff_to(f)

or even just unconsciously read the first as the second without noticing that anything special is going on. Especially if they're coming from a language like Pascal which has a much less magical form of with-statement. So maybe it would be better to make it look more different:

    with opened(pathname) as f:
        ...

* It seems to me that this same exception-handling mechanism would be just as useful in a regular for-loop, and that, once it becomes possible to put 'yield' in a try-statement, people are going to *expect* it to work in for-loops as well. Guido has expressed concern about imposing extra overhead on all for-loops. But would the extra overhead really be all that noticeable? For-loops already put a block on the block stack, so the necessary processing could be incorporated into the code for unwinding a for-block during an exception, and little if anything would need to change in the absence of an exception. However, if for-loops also gain this functionality, we end up with the rather embarrassing situation that there is *no difference* in semantics between a for-loop and a with-statement! This could be "fixed" by making the with-statement not loop, as has been suggested. That was my initial thought as well, but having thought more deeply, I'm starting to think that Guido was right in the first place, and that a with-statement should be capable of looping. I'll elaborate in another post.
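Greg's opened() convention can be sketched under Guido's proposed translation: the generator yields the open file, and the code after the yield runs when the block is done. This is only a sketch, not Greg's actual code; run_with() is an invented driver standing in for the proposed statement, and the demo file path is made up.

```python
import os
import tempfile

# The generator yields the open file; the finally clause runs when the
# block finishes (or raises), closing it.
def opened(pathname):
    f = open(pathname)
    try:
        yield f
    finally:
        f.close()

# Invented driver: what the proposed with-statement would roughly do
# for the normal (non-exception) path -- bind each yielded value and
# run the body.
def run_with(gen, body):
    for var in gen:
        body(var)

# Set up a throwaway file to demonstrate with.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
out = open(path, "w")
out.write("hello")
out.close()

seen = []
run_with(opened(path), lambda f: seen.append(f.read()))
print(seen)  # ['hello']
```

The exception-passing half of Guido's translation (next_ex and the err plumbing) is deliberately omitted here; the sketch shows only why a past-participle name like opened() reads naturally at the call site.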
> So a block could return a value to the generator using a return > statement; the generator can catch this by catching ReturnFlow. > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) This is a very elegant idea, but I'm seriously worried by the possibility that a return statement could do something other than return from the function it's written in, especially if for-loops also gain this functionality. Intercepting break and continue isn't so bad, since they're already associated with the loop they're in, but return has always been an unconditional get-me-out-of-this-function. I'd feel uncomfortable if this were no longer true. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From kbk at shore.net Tue Apr 26 05:56:11 2005 From: kbk at shore.net (Kurt B. Kaiser) Date: Tue Apr 26 05:56:44 2005 Subject: [Python-Dev] Weekly Python Patch/Bug Summary Message-ID: <200504260356.j3Q3uBNw020893@bayview.thirdcreek.com>

Patch / Bug Summary
___________________

Patches :  316 open ( +2) / 2831 closed ( +7) / 3147 total ( +9)
Bugs    :  908 open (+10) / 4941 closed (+20) / 5849 total (+30)
RFE     :  178 open ( +1) /  153 closed ( +2) /  331 total ( +3)

New / Reopened Patches
______________________

package_data chops off first char of default package  (2005-04-15)
       http://python.org/sf/1183712  opened by  Wummel

[ast] fix for 1183468: return/yield in class  (2005-04-16)
       http://python.org/sf/1184418  opened by  logistix

urllib2 dloads failing through HTTP proxy w/ auth  (2005-04-18)
       http://python.org/sf/1185444  opened by  Mike Fleetwood

binascii.b2a_qp does not handle binary data correctly  (2005-04-18)
       http://python.org/sf/1185447  opened by  Eric Huss

Automatically build fpectl module from setup.py  (2005-04-18)
       http://python.org/sf/1185529  opened by  Jeff Epler

Typo in Curses-Function doc  (2005-04-20)
       http://python.org/sf/1186781  opened by  grzankam

subprocess: optional auto-reaping fixing os.wait() lossage  (2005-04-21)
       http://python.org/sf/1187312  opened by  Mattias Engdegård

Add const specifier to PySpam_System prototype  (2005-04-21)
       http://python.org/sf/1187396  opened by  Luis Bruno

Don't assume all exceptions are SyntaxError's  (2005-04-25)
       http://python.org/sf/1189210  opened by  John Ehresman

Patches Closed
______________

fix typos in Library Reference  (2005-04-10)
       http://python.org/sf/1180062  closed by  doerwalter

[AST] Fix for core in test_grammar.py  (2005-04-08)
       http://python.org/sf/1179513  closed by  nascheme

Implemented new 'class foo():pass' syntax  (2005-04-03)
       http://python.org/sf/1176019  closed by  nascheme

range() in for loops, again  (2005-04-12)
       http://python.org/sf/1181334  closed by  arigo

Info Associated with Merge to AST  (2005-01-07)
       http://python.org/sf/1097671  closed by  kbk

New / Reopened Bugs
___________________

Minor error in tutorial  (2005-04-14)
CLOSED http://python.org/sf/1183274  opened by  Konrads Smelkovs

check for return/yield outside function is wrong  (2005-04-15)
       http://python.org/sf/1183468  opened by  Neil Schemenauer

try to open /dev/null as directory  (2005-04-15)
       http://python.org/sf/1183585  opened by  Roberto A. Foglietta

PyDict_Copy() can return non-NULL value on error  (2005-04-15)
CLOSED http://python.org/sf/1183742  opened by  Phil Thompson

Popen4 wait() fails sporadically with threads  (2005-04-15)
       http://python.org/sf/1183780  opened by  Taale Skogan

return val in __init__ doesn't raise TypeError in new-style  (2005-04-15)
CLOSED http://python.org/sf/1183959  opened by  Adal Chiriliuc

dest parameter in optparse  (2005-04-15)
       http://python.org/sf/1183972  opened by  ahmado

Missing trailing newline with comment raises SyntaxError  (2005-04-15)
       http://python.org/sf/1184112  opened by  Eric Huss

example broken in section 1.12 of Extending & Embedding  (2005-04-16)
       http://python.org/sf/1184380  opened by  bamoore

Read-only property attributes raise wrong exception  (2005-04-16)
CLOSED http://python.org/sf/1184449  opened by  Barry A. Warsaw

itertools.imerge: merge sequences  (2005-04-18)
CLOSED http://python.org/sf/1185121  opened by  Jurjen N.E. Bos

pydoc doesn't find all module doc strings  (2005-04-18)
       http://python.org/sf/1185124  opened by  Kent Johnson

PyObject_Realloc bug in obmalloc.c  (2005-04-19)
       http://python.org/sf/1185883  opened by  Kristján Valur

python socketmodule dies on ^c  (2005-04-19)
CLOSED http://python.org/sf/1185931  opened by  nodata

tempnam doc doesn't include link to tmpfile  (2005-04-19)
       http://python.org/sf/1186072  opened by  Ian Bicking

[AST] genexps get scoping wrong  (2005-04-19)
       http://python.org/sf/1186195  opened by  Brett Cannon

[AST] assert failure on ``eval("u'\Ufffffffe'")``  (2005-04-19)
       http://python.org/sf/1186345  opened by  Brett Cannon

[AST] automatic unpacking of arguments broken  (2005-04-19)
       http://python.org/sf/1186353  opened by  Brett Cannon

Python Programming FAQ should be updated for Python 2.4  (2005-02-09)
CLOSED http://python.org/sf/1119439  reopened by  montanaro

nntplib shouldn't raise generic EOFError  (2005-04-20)
       http://python.org/sf/1186900  opened by  Matt Roper

TypeError message on bad iteration is misleading  (2005-04-21)
       http://python.org/sf/1187437  opened by  Roy Smith

Pickle with HIGHEST_PROTOCOL "ord() expected..."  (2005-04-22)
CLOSED http://python.org/sf/1188175  opened by  Heiko Selber

Rebuilding from source on RH9 fails (_tkinter.so missing)  (2005-04-22)
       http://python.org/sf/1188231  opened by  Marty Heyman

Python 2.4 Not Recognized by Any Programs  (2005-04-23)
       http://python.org/sf/1188637  opened by  Yoshi Nagasaki

zipfile module and 2G boundary  (2005-04-24)
       http://python.org/sf/1189216  opened by  Bob Ippolito

Seg Fault when compiling small program  (2005-04-24)
       http://python.org/sf/1189248  opened by  Reginald B. Charney

LINKCC incorrect  (2005-04-25)
       http://python.org/sf/1189330  opened by  Christoph Ludwig

LINKCC incorrect  (2005-04-25)
CLOSED http://python.org/sf/1189337  opened by  Christoph Ludwig

file.write(x) where len(x) > 64*1024**2 is unreliable  (2005-04-25)
CLOSED http://python.org/sf/1189525  opened by  Martin Gfeller

pydoc may hide non-private doc strings.  (2005-04-25)
       http://python.org/sf/1189811  opened by  J Livingston

"Atuple containing default argument values ..."  (2005-04-25)
       http://python.org/sf/1189819  opened by  Chad Whitacre

Bugs Closed
___________

Minor error in tutorial  (2005-04-14)
       http://python.org/sf/1183274  closed by  doerwalter

copy.py bug  (2005-02-03)
       http://python.org/sf/1114776  closed by  anthonybaxter

re.escape(s) prints wrong for chr(0)  (2005-04-13)
       http://python.org/sf/1182603  closed by  nascheme

PyDict_Copy() can return non-NULL value on error  (2005-04-15)
       http://python.org/sf/1183742  closed by  rhettinger

return val in __init__ doesn't raise TypeError in new-style  (2005-04-15)
       http://python.org/sf/1183959  closed by  rhettinger

dir() does not include _  (2005-04-13)
       http://python.org/sf/1182614  closed by  nickjacobson

Read-only property attributes raise wrong exception  (2005-04-16)
       http://python.org/sf/1184449  closed by  bwarsaw

Readline segfault  (2005-04-05)
       http://python.org/sf/1176893  closed by  mwh

python socketmodule dies on ^c  (2005-04-19)
       http://python.org/sf/1185931  closed by  nodata101

Bad sys.executable value for bdist_wininst install script  (2005-04-12)
       http://python.org/sf/1181619  closed by  theller

StringIO and cStringIO don't provide 'name' attribute  (2005-04-03)
       http://python.org/sf/1175967  closed by  mwh

Python Interpreter shell is crashed  (2005-01-12)
       http://python.org/sf/1100673  closed by  mwh

Python Programming FAQ should be updated for Python 2.4  (2005-02-09)
       http://python.org/sf/1119439  closed by  jafo

Dictionary Parsing Problem  (2005-02-05)
       http://python.org/sf/1117048  closed by  tjreedy

2.4.1 breaks pyTTS  (2005-04-07)
       http://python.org/sf/1178624  closed by  doerwalter

Pickle with HIGHEST_PROTOCOL "ord() expected..."
(2005-04-22) http://python.org/sf/1188175 closed by drhok multiple broken links in profiler docs (2005-03-30) http://python.org/sf/1173773 closed by isandler LINKCC incorrect (2005-04-25) http://python.org/sf/1189337 closed by cludwig file.write(x) where len(x) > 64*1024**2 is unreliable (2005-04-25) http://python.org/sf/1189525 closed by tim_one New / Reopened RFE __________________ "replace" function should accept lists. (2005-04-17) CLOSED http://python.org/sf/1184678 opened by Poromenos Make bisect.* functions accept an optional compare function (2005-04-18) http://python.org/sf/1185383 opened by Marcin Ciura RFE Closed __________ "replace" function should accept lists. (2005-04-17) http://python.org/sf/1184678 closed by rhettinger itertools.imerge: merge sequences (2005-04-18) http://python.org/sf/1185121 closed by jneb From greg.ewing at canterbury.ac.nz Tue Apr 26 06:00:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:00:34 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426C54BF.2010906@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> Message-ID: <426DBCCE.40903@canterbury.ac.nz> Brett C. wrote: > And before anyone decries the fact that this might confuse a newbie (which > seems to happen with every advanced feature ever dreamed up), remember this > will not be meant for a newbie but for someone who has experience in Python and > iterators at the minimum, and hopefully with generators. This is dangerously close to the "you don't need to know about it if you're not going to use it" argument, which is widely recognised as false. Newbies might not need to know all the details of the implementation, but they will need to know enough about the semantics of with-statements to understand what they're doing when they come across them in other people's code. Which leads me to another concern. 
How are we going to explain the externally visible semantics of a with-statement in a way that's easy to grok, without mentioning any details of the implementation? You can explain a for-loop pretty well by saying something like "It executes the body once for each item from the sequence", without having to mention anything about iterators, generators, next() methods, etc. etc. How the items are produced is completely irrelevant to the concept of the for-loop. But what is the equivalent level of description of the with-statement going to say? "It executes the body with... ???" And a related question: What are we going to call the functions designed for with-statements, and the objects they return? Calling them generators and iterators (even though they are) doesn't seem right, because they're being used for a purpose very different from generating and iterating. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 06:37:33 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:37:51 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <ca471dc20504250957753a7445@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> Message-ID: <426DC58D.2010102@canterbury.ac.nz> Guido van Rossum wrote: > But for *immutable* objects (like numbers, strings and tuples) the > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). Also, string literals that resemble Python identifiers are often interned, although this is not guaranteed. 
And this only applies to literals, not strings constructed dynamically by the program (unless you explicitly apply intern() to them). Python 2.3.4 (#1, Jun 30 2004, 16:47:37) [GCC 3.2 20020903 (Red Hat Linux 8.0 3.2-7)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> "foo" is "foo" True >>> "foo" is "f" + "oo" False >>> "foo" is intern("f" + "oo") True -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 06:45:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 06:45:30 2005 Subject: [Python-Dev] Re: Re: anonymous blocks In-Reply-To: <d4iv44$9gn$1@sea.gmane.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426CB7C2.8030508@gmail.com> <d4iv44$9gn$1@sea.gmane.org> Message-ID: <426DC75A.1010005@canterbury.ac.nz> Terry Reedy wrote: >>Not supporting iterables makes it harder to write a class which is >>inherently usable in a with block, though. The natural way to make >>iterable classes is to use 'yield' in the definition of __iter__ - if >>iter() is not called, then that trick can't be used. If you're defining it by means of a generator, you don't need a class at all -- just make the whole thing a generator function. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:00:21 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:00:39 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <1114473665.3698.2.camel@schizo> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> Message-ID: <426DCAE5.2070501@canterbury.ac.nz> Donovan Baarda wrote: > Agreed. I don't find any switch syntaxes better than if/elif/else. Speed > benefits belong in implementation optimisations, not new bad syntax. > Two things are mildly annoying about if-elif chains as a substitute for a switch statement: 1) Repeating the name of the thing being switched on all the time, and the operator being used for comparison. 2) The first case is syntactically different from subsequent ones, even though semantically all the cases are equivalent. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:12:14 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:12:34 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> Message-ID: <426DCDAE.8060907@canterbury.ac.nz> Bob Ippolito wrote: > A few weeks ago I put together a patch to site.py for Python 2.5 > <http://python.org/sf/1174614> that solves three major deficiencies: > > [concerning .pth files] While we're on the subject of .pth files, what about the idea of scanning the directory containing the main .py file for .pth files? 
This would make it easier to have collections of Python programs sharing a common set of modules, without having to either install them system-wide or write hairy sys.path-manipulating code or use platform-dependent symlink or PATH hacks. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:48:11 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 07:48:27 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <fb6fbf56050422143614d8431c@mail.gmail.com> References: <fb6fbf56050422143614d8431c@mail.gmail.com> Message-ID: <426DD61B.3030708@canterbury.ac.nz> Jim Jewett wrote: > defmacro myresource(filename): > <make explicit calls to named callback "functions", but > within the same locals() scope.> > > with myresource("thefile"): > def reader(): > ... > def writer(): > ... > def fn(): > .... -1. This is ugly. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 07:59:52 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 08:00:10 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426DD8D8.5040908@canterbury.ac.nz> Michael Chermside wrote: > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. 
I think Jim is right, what we're feeling our > way toward is macros. I considered saying something like that about 3 posts ago, but I was afraid of getting stoned for heresy... > ... Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. From that quote, it would seem what we want to do is prohibit anything that would make code less readable. Or prohibit anything that would permit creating a new dialect. Or something. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Tue Apr 26 08:19:51 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 08:20:07 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> Message-ID: <426DDD87.60908@canterbury.ac.nz> Michael Chermside wrote: > if the answer is that we want to prohibit nothing, then the right > solution is macros. I'm not sure about that. Smalltalk manages to provide very reasonable-looking user-defined control structures without using compile-time macros, just normal runtime evaluation together with block arguments. It does this by starting out with a fairly minimal and very flexible syntax. This raises the question of why people feel the need for macros in Lisp or Scheme, which have an even more minimal and flexible syntax. 
I think part of the reason is that the syntax for passing an unevaluated block is too obtrusive. In Scheme you can define a function (not macro) that is used like this: (with-file "foo/blarg" (lambda (f) (do-something-with f))) But there is a natural tendency to want to be able to cut out the lambda cruft and just write something like: (with-file "foo/blarg" (f) (do-something-with f)) and for that you need a macro. The equivalent in Smalltalk would be something like File open: "foo/blarg" do: [:f f something] which doesn't look too bad (compared to the rest of the language!) because the block-passing syntax is fairly unobtrusive. So in summary, I don't think you necessarily *need* macros to get nice-looking user-defined control structures. It depends on other features of the language. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From bac at OCF.Berkeley.EDU Tue Apr 26 08:30:14 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Tue Apr 26 08:30:25 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DBCCE.40903@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> Message-ID: <426DDFF6.3060808@ocf.berkeley.edu> Greg Ewing wrote: > Brett C. wrote: > >> And before anyone decries the fact that this might confuse a newbie >> (which >> seems to happen with every advanced feature ever dreamed up), remember >> this >> will not be meant for a newbie but for someone who has experience in >> Python and >> iterators at the minimum, and hopefully with generators. 
> > > This is dangerously close to the "you don't need to know about > it if you're not going to use it" argument, which is widely > recognised as false. Newbies might not need to know all the > details of the implementation, but they will need to know > enough about the semantics of with-statements to understand > what they're doing when they come across them in other people's > code. > I am not saying it is totally to be ignored by people staring at Python code, but we don't need to necessarily spell out the intricacies. > Which leads me to another concern. How are we going to explain > the externally visible semantics of a with-statement in a way > that's easy to grok, without mentioning any details of the > implementation? > > You can explain a for-loop pretty well by saying something like > "It executes the body once for each item from the sequence", > without having to mention anything about iterators, generators, > next() methods, etc. etc. How the items are produced is completely > irrelevant to the concept of the for-loop. > > But what is the equivalent level of description of the > with-statement going to say? > > "It executes the body with... ???" > It executes the body, calling next() on the argument name on each time through until the iteration stops. > And a related question: What are we going to call the functions > designed for with-statements, and the objects they return? > Calling them generators and iterators (even though they are) > doesn't seem right, because they're being used for a purpose > very different from generating and iterating. > I like "managers" since they are basically managing resources most of the time for the user. 
-Brett From python-dev at zesty.ca Tue Apr 26 08:47:10 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Tue Apr 26 08:47:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DDFF6.3060808@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> <426DDFF6.3060808@ocf.berkeley.edu> Message-ID: <Pine.LNX.4.58.0504260142080.4786@server1.LFW.org> On Mon, 25 Apr 2005, Brett C. wrote: > It executes the body, calling next() on the argument name on each > time through until the iteration stops. There's a little more to it than that. But on the whole I do support the goal of finding a simple, short description of what this construct is intended to do. If it can be described accurately in a sentence or two, that's a good sign that the semantics are sufficiently clear and simple. > I like "managers" since they are basically managing resources > most of the time for the user. No, please let's not call them that. "Manager" is a very common word to describe all kinds of classes in object-oriented designs, and it is so generic as to hardly mean anything. (Sorry, i don't have a better alternative at the moment.) -- ?!ng From stephen at xemacs.org Tue Apr 26 10:36:16 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 26 10:36:22 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426DDD87.60908@canterbury.ac.nz> (Greg Ewing's message of "Tue, 26 Apr 2005 18:19:51 +1200") References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> Message-ID: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> This raises the question of why people feel the need for Greg> macros in Lisp or Scheme, which have an even more minimal Greg> and flexible syntax. 
I think part of the reason is that the Greg> syntax for passing an unevaluated block is too obtrusive. Greg> [... T]here is a natural tendency to want to be able to cut Greg> out the lambda cruft.... This doesn't feel right to me. By that argument, people would want to "improve" (mapcar (lambda (x) (car x)) list-of-lists) to (mapcar list-of-lists (x) (car x)) Have you ever heard someone complain about that lambda, though? My feeling is that the reason for macros in Lisps is that people want control structures to look like control structures, not like function calls whose actual arguments "just happen" to be anonymous function objects. In this context, the lambda does not merely bind f, it also excludes a lot of other possibilities. I mean when I see (with-locked-file "foo/blarg" (lambda (f) (do-something-with f))) I go "What's this? Oh, here the file is obviously important, and there we have a function of one formal argument with no actual arguments, so it must be that we're processing the file with the function." This emphasizes the application of this function to that file too much for my taste, and I will assume that the behavior of the block is self-contained---it had better not depend on free variables. But with (with-locked-file (f "foo/blarg") (do-something-with-as-modified-by f x)) there's no particular need for the block to exclusively concentrate on handling f, and there's nothing disconcerting about the presence of x. N.B. for non-Lispers: in Common Lisp idiom the list (f "foo/blarg") may be treated as two arguments, but associating f with "foo/blarg" in some way. I think in this context it is much more readable. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
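[Editorial aside: translating the Scheme idiom above into Python makes the lambda-passing style concrete. `with_file` and the sample usage are invented names for illustration, a sketch of the callback pattern under discussion rather than any real API.]

```python
def with_file(path, body):
    # Open the resource, hand it to the caller's "block" (an ordinary
    # function or lambda), and guarantee cleanup afterwards -- the
    # direct analogue of (with-file path (lambda (f) ...)).
    f = open(path)
    try:
        return body(f)
    finally:
        f.close()

# The block is just an ordinary callable:
#   first_line = with_file("notes.txt", lambda f: f.readline())
```

As in the Scheme version, the obtrusive part is the explicit lambda wrapping the block, which is exactly what a dedicated statement would remove.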
From bob at redivi.com Tue Apr 26 10:43:34 2005 From: bob at redivi.com (Bob Ippolito) Date: Tue Apr 26 10:44:30 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <426DCDAE.8060907@canterbury.ac.nz> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> Message-ID: <f0d8fb4c2fd52d7b16349831db70179e@redivi.com> On Apr 26, 2005, at 1:12 AM, Greg Ewing wrote: > Bob Ippolito wrote: >> A few weeks ago I put together a patch to site.py for Python 2.5 >> <http://python.org/sf/1174614> that solves three major deficiencies: > > > > [concerning .pth files] > > While we're on the subject of .pth files, what about > the idea of scanning the directory containing the main > .py file for .pth files? This would make it easier to > have collections of Python programs sharing a common > set of modules, without having to either install them > system-wide or write hairy sys.path-manipulating code > or use platform-dependent symlink or PATH hacks. I don't think I'd ever use that, but it doesn't sound like a terrible idea. -bob From stephen at xemacs.org Tue Apr 26 10:55:10 2005 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue Apr 26 10:55:15 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <426DCAE5.2070501@canterbury.ac.nz> (Greg Ewing's message of "Tue, 26 Apr 2005 17:00:21 +1200") References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> Message-ID: <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> Two things are mildly annoying about if-elif chains as a Greg> substitute for a switch statement: Greg> 1) Repeating the name of the thing being switched on all the Greg> time, and the operator being used for comparison. What's worse, to my mind, is the not infrequent case where the thing being switched on or the operator changes. 
Sure, that's bad style, but sometimes you have to read other people's code like that. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From gvanrossum at gmail.com Tue Apr 26 11:24:53 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 11:24:56 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <ca471dc2050426022458a4ad@mail.gmail.com> > Greg> 1) Repeating the name of the thing being switched on all the > Greg> time, and the operator being used for comparison. > > What's worse, to my mind, is the not infrequent case where the thing > being switched on or the operator changes. Sure, that's bad style, > but sometimes you have to read other people's code like that. You mean like this? if x > 0: ...normal case... elif y > 0: ....abnormal case... else: ...edge case... You have guts to call that bad style! :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Tue Apr 26 11:36:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 11:36:10 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <426DCDAE.8060907@canterbury.ac.nz> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> Message-ID: <ca471dc2050426023629559cab@mail.gmail.com> > While we're on the subject of .pth files, what about > the idea of scanning the directory containing the main > .py file for .pth files? 
This would make it easier to > have collections of Python programs sharing a common > set of modules, without having to either install them > system-wide or write hairy sys.path-manipulating code > or use platform-dependent symlink or PATH hacks. I do that all the time without .pth files -- I just put all the common modules in a package and place the package in the directory containing the "main" .py files. I do have use cases where for reasons of separate development cycles (etc.) I have some code (usually experimental or "unofficial" in some way) in a different place that also needs access to the same set of common modules, and there I use explicit sys.path manipulations. I think that even if the proposed feature was available I wouldn't switch to it -- it's too easy to forget about the .pth file and be confused when it points to the wrong place. That's also the reason why I don't use symlinks or $PYTHONPATH for this purpose. EIBTI. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From flaig at sanctacaris.net Tue Apr 26 12:39:26 2005 From: flaig at sanctacaris.net (flaig@sanctacaris.net) Date: Tue Apr 26 12:39:33 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) Message-ID: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> Actually I was thinking of something related the other day: Wouldn't it be nice to be able to define/overload not only operators but also control structures? That way Python's core language could be kept simple and free of "featuritis" while at the same time everyone who desires a match/case or repeat/until statement, or anything more sophisticated, could implement it for himself. If you suspect that good old Lisp is on my mind, you are probably right :) . Actually, the idea of a programming language whose structures can be adapted to everyone's personal style is still very appealing to me. 
I have no very distinct ideas about how such a thing might be designed (and still less whether it could be made to work efficiently), but perhaps somewhat like this (just to clarify along which paths my thoughts are currently moving):

    structure whaddayacallit:   # name only as a comment
        def opening_clause:
            statements
        def alternative_clause_1:
            statements
        def *alternative_clause_2:   # the asterisk to indicate that this may occur several times
            statements
        def closing-clause:
            statements

e.g.:

    structure multiple_switch:
        condition = None
        def switch(self, c):   # condition must be passed as a lambdoid function
            self.condition, self.finished = c, False
        def *case(self, x, statements):   # so must the statements subordinate to the new "case" expression
            if self.condition( x ):
                statements()
                self.finished = True
                break structure
        def otherwise(self, statements):
            if not self.finished:
                statements()

and the application:

    switch my_favourite_language:
        case Python: print "Hi Guido"
        case Perl: print "Hi Larry"
        otherwise: print "Hi pleb"

At least to me, this has a definitively macroish flavour... and not in the #dumbdown style of C. Or rather say, macros might be a generalized way to achieve this, if they are intelligently designed. (<= That at least shouldn't be the problem, since this is the Python community and not M$'s development department :-) .) Do you think any of this might make sense? -- Rüdiger Marcus PS. Aahz: When describing Ruby as the "antithesis" of Python recently I was thinking in Laskerian rather than Hegelian terms... the differences are not really big, but Ruby has always been positioned as a deliberate challenge to Python.
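[Editorial aside: the effect of the multiple_switch example above can already be had in plain Python with a dict of callables, the usual substitute suggested in these switch threads. A minimal sketch, with all names invented for illustration:]

```python
def greet(language):
    # One callable per case; the dict lookup replaces the repeated
    # 'elif language == ...' comparisons.
    cases = {
        "Python": lambda: "Hi Guido",
        "Perl": lambda: "Hi Larry",
    }
    # dict.get with a default plays the role of the 'otherwise' clause.
    return cases.get(language, lambda: "Hi pleb")()
```

Here greet("Python") returns "Hi Guido" and any unknown language falls through to "Hi pleb", without new syntax.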
Date: Mon, 25 Apr 2005 09:42:54 -0700 > From: Michael Chermside <mcherm@mcherm.com> > Subject: RE: [Python-Dev] defmacro (was: Anonymous blocks) > To: python-dev@python.org > Cc: jimjjewett@gmail.com > Message-ID: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> > Content-Type: text/plain; charset=ISO-8859-1 > > Jim Jewett writes: > > As best I can tell, the anonymous blocks are used to take > > care of boilerplate code without changing the scope -- exactly > > what macros are used for. > > Folks, I think that Jim is onto something here. > > I've been following this conversation, and it sounds to me as if we > are stumbling about in the dark, trying to feel our way toward something > very useful and powerful. I think Jim is right, what we're feeling our > way toward is macros. > > The problem, of course, is that Guido (and others!) are on record as > being opposed to adding macros to Python. (Even "good" macros... think > lisp, not cpp.) I am not quite sure that I am convinced by the argument, > but let me see if I can present it: > > Allowing macros in Python would enable individual programmers or > groups to easily invent their own "syntax". Eventually, there would > develop a large number of different Python "dialects" (as some > claim has happened in the Lisp community) each dependent on macros > the others lack. The most important casualty would be Python's > great *readability*. > > (If this is a strawman argument, i.e. if you know of a better reason > for keeping macros OUT of Python please speak up. Like I said, I've > never been completely convinced of it myself.) > > I think it would be useful if we approached it like this: either what > we want is the full power of macros (in which case the syntax we choose > should be guided by that choice), or we want LESS than the full power > of macros. If we want less, then HOW less? 
> > In other words, rather than hearing what we'd like to be able to DO > with blocks, I'd like to hear what we want to PROHIBIT DOING with > blocks. I think this might be a fruitful way of thinking about the > problem which might make it easier to evaluate syntax suggestions. And > if the answer is that we want to prohibit nothing, then the right > solution is macros. > > -- Michael Chermside > === Chevalier Dr. Dr. Ruediger Marcus Flaig Institute for Immunology University of Heidelberg Im Neuenheimer Feld 305, D-69120 Heidelberg, FRG <flaig@sanctacaris.net> -- [This e-mail was sent with http://www.mail-inspector.de Mail Inspector is a free service of http://www.is-fun.net The sender of this e-mail had the IP: 129.206.124.135] From gvanrossum at gmail.com Tue Apr 26 13:37:47 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 13:37:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426DB7C8.5020708@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> Message-ID: <ca471dc2050426043713116248@mail.gmail.com> [Greg Ewing] > I like the general shape of this, but I have one or two > reservations about the details. That summarizes the feedback so far pretty well. I think we're on to something. And I'm not too proud to say that Ruby has led the way here to some extent (even if Python's implementation would be fundamentally different, since it's based on generators, which has some different possibilities and precludes some Ruby patterns). > 1) We're going to have to think carefully about the naming of > functions designed for use with this statement. If 'with' > is going to be in there as a keyword, then it really shouldn't > be part of the function name as well. Of course. I only used 'with_opened' because it's been the running example in this thread.
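[Editorial aside: for concreteness, the generator shape this thread keeps assuming for a helper like opened() can be sketched as below. Note the yield inside try/finally: permitting exactly that is part of what is being proposed, so this sketch is written against a later Python than the 2.4 of the discussion.]

```python
def opened(path):
    # Yield the open file exactly once; when the consumer finishes
    # (or an exception propagates back in), the finally clause runs
    # and the file is closed.
    f = open(path)
    try:
        yield f
    finally:
        f.close()
```

The with-machinery under discussion would call next() to get the file, run the block, and then resume or close the generator so that the finally clause always executes.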
> I would rather see something like > > with f = opened(pathname): > ... > > This sort of convention (using a past participle as a function > name) would work for some other cases as well: > > with some_data.locked(): > ... > > with some_resource.allocated(): > ... Or how about with synchronized(some_resource): ... > On the negative side, not having anything like 'with' in the > function name means that the fact the function is designed for > use in a with-statement could be somewhat non-obvious. Since > there's not going to be much other use for such a function, > this is a bad thing. This seems a pretty mild problem; one could argue that every function is only useful in a context where its return type makes sense, and we seem to be getting along just fine with naming conventions (or just plain clear naming). > It could also lead people into subtle usage traps such as > > with f = open(pathname): > ... > > which would fail in a somewhat obscure way. Ouch. That one hurts. (I was going to say "but f doesn't have a next() method" when I realized it *does*. :-) It is *almost* equivalent to for f in open(pathname): ... except if the "..." block raises an exception. Fortunately your proposal to use 'as' makes this mistake less likely. > So maybe the 'with' keyword should be dropped (again!) in > favour of > > with_opened(pathname) as f: > ... But that doesn't look so great for the case where there's no variable to be assigned to -- I wasn't totally clear about it, but I meant the syntax to be with [VAR =] EXPR: BLOCK where VAR would have the same syntax as the left hand side of an assignment (or the variable in a for-statement). > 2) I'm not sure about the '='. 
It makes it look rather deceptively > like an ordinary assignment, and I'm sure many people are going > to wonder what the difference is between > > with f = opened(pathname): > do_stuff_to(f) > > and simply > > f = opened(pathname) > do_stuff_to(f) > > or even just unconsciously read the first as the second without > noticing that anything special is going on. Especially if they're > coming from a language like Pascal which has a much less magical > form of with-statement. Right. > So maybe it would be better to make it look more different: > > with opened(pathname) as f: > ... Fredrik said this too, and as long as we're going to add 'with' as a new keyword, we might as well promote 'as' to become a real keyword. So then the syntax would become with EXPR [as VAR]: BLOCK I don't see a particular need for assignment to multiple VARs (but VAR can of course be a tuple of identifiers). > * It seems to me that this same exception-handling mechanism > would be just as useful in a regular for-loop, and that, once > it becomes possible to put 'yield' in a try-statement, people > are going to *expect* it to work in for-loops as well. (You can already put a yield inside a try-except, just not inside a try-finally.) > Guido has expressed concern about imposing extra overhead on > all for-loops. But would the extra overhead really be all that > noticeable? For-loops already put a block on the block stack, > so the necessary processing could be incorporated into the > code for unwinding a for-block during an exception, and little > if anything would need to change in the absence of an exception. Probably. > However, if for-loops also gain this functionality, we end up > with the rather embarrassing situation that there is *no difference* > in semantics between a for-loop and a with-statement! There would still be the difference that a for-loop invokes iter() and a with-block doesn't. Also, for-loops that don't exhaust the iterator leave it available for later use. 
I believe there are even examples of this pattern, where one for-loop searches the iterable for some kind of marker value and the next for-loop iterates over the remaining items. For example: f = open(messagefile) # Process message headers for line in f: if not line.strip(): break if line[0].isspace(): addcontinuation(line) else: addheader(line) # Process message body for line in f: addbody(line) > This could be "fixed" by making the with-statement not loop, > as has been suggested. That was my initial thought as well, > but having thought more deeply, I'm starting to think that > Guido was right in the first place, and that a with-statement > should be capable of looping. I'll elaborate in another post. So perhaps the short description of a with-statement that we give to newbies could be the following: """ The statement: for VAR in EXPR: BLOCK does the same thing as: with iter(EXPR) as VAR: # Note the iter() call BLOCK except that: - you can leave out the "as VAR" part from the with-statement; - they work differently when an exception happens inside BLOCK; - break and continue don't always work the same way. The only time you should write a with-statement is when the documentation for the function you are calling says you should. """ > > So a block could return a value to the generator using a return > > statement; the generator can catch this by catching ReturnFlow. > > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > This is a very elegant idea, but I'm seriously worried by the > possibility that a return statement could do something other > than return from the function it's written in, especially if > for-loops also gain this functionality. But they wouldn't! > Intercepting break > and continue isn't so bad, since they're already associated > with the loop they're in, but return has always been an > unconditional get-me-out-of-this-function. I'd feel uncomfortable > if this were no longer true. Me too. 
Let me explain the use cases that led me to throwing that in (I was running out of time and didn't properly explain it) and then let me propose an alternative. This is a bit long, but important! *First*, in the non-looping use cases (like acquiring and releasing a lock), a return-statement should definitely be allowed when the with-statement is contained in a function. There's lots of code like this out there: def search(self, eligible, default=None): self.lock.acquire() try: for item in self.elements: if eligible(item): return item # no eligible items return default finally: self.lock.release() and this translates quite nicely to a with-statement: def search(self, eligible, default=None): with synchronized(self.lock): for item in self.elements: if eligible(item): return item # no eligible items return default *Second*, it might make sense if break and continue would be handled the same way; here's an example: def alt_search(self): for item in self.elements: with synchronized(item): if item.abandoned(): continue if item.eligible(): break else: item = self.default_item return item.post_process() (I realize the case for continue isn't as strong as that for break, but I think we have to support both if we support one.) *Third*, if there is a try-finally block around a yield in the generator, the finally clause absolutely must be executed when control leaves the body of the with-statement, whether it is through return, break, or continue. This pretty much means these have to be turned into some kind of exception.
So the first example would first be transformed into this: def search(self, eligible, default=None): try: with synchronized(self.lock): for item in self.elements: if eligible(item): raise ReturnFlow(item) # was "return item" # no eligible items raise ReturnFlow(default) # was "return default" except ReturnFlow, exc: return exc.value before applying the transformation of the with-statement, which I won't repeat here (look it up in my previous long post in this thread). (BTW I do agree that it should use __next__(), not next_ex().) I'm assuming the following definition of the ReturnFlow exception: class ReturnFlow(Exception): def __init__(self, value=None): self.value = value The translation of break into raise BreakFlow() and continue into raise ContinueFlow() is now obvious. (BTW ReturnFlow etc. aren't great names. Suggestions?) *Fourth*, and this is what makes Greg and me uncomfortable at the same time as making Phillip and other event-handling folks drool: from the previous three points it follows that an iterator may *intercept* any or all of ReturnFlow, BreakFlow and ContinueFlow, and use them to implement whatever cool or confusing magic they want. For example, a generator can decide that for the purposes of break and continue, the with-statement that calls it is a loop, and give them the usual semantics (or the opposite, if you're into that sort of thing :-). Or a generator can receive a value from the block via a return statement. Notes: - I think there's a better word than Flow, but I'll keep using it until we find something better. - This is not limited to generators -- the with-statement uses an arbitrary "new-style" iterator (something with a __next__() method taking an optional exception argument). - The new __next__() API can also (nay, *must*, to make all this work reliably) be used to define exception and cleanup semantics for generators, thereby rendering obsolete PEP 325 and the second half of PEP 288.
When a generator is GC'ed (whether by reference counting or by the cyclical garbage collector), its __next__() method is called with a BreakFlow exception instance as argument (or perhaps some other special exception created for the purpose). If the generator catches the exception and yields another value, too bad -- I consider that broken behavior. (The alternative would be to keep calling __next__(BreakFlow()) until it doesn't return a value, but that feels uncomfortable in a finalization context.) - Inside a with-statement, user code raising a Flow exception acts the same as the corresponding statement. This is slightly unfortunate, because it might lead one to assume that the same is true for example in a for-loop or while-loop, but I don't want to make that change. I don't think it's a big problem. Given that 1, 2 and 3 combined make 4 inevitable, I think we might as well give in, and *always* syntactically accept return, break and continue in a with-statement, whether or not it is contained in a loop or function. When the iterator does not handle the Flow exceptions, and there is no outer context in which the statement is valid, the Flow exception is turned into an IllegalFlow exception, which is the run-time equivalent of SyntaxError: 'return' outside function (or 'break' outside loop, etc.). Now there's one more twist, which you may or may not like. Presumably (barring obfuscations or bugs) the handling of BreakFlow and ContinueFlow by an iterator (or generator) is consistent for all uses of that particular iterator. For example synchronized(lock) and transactional(db) do not behave as loops, and forever() does. Ditto for handling ReturnFlow. 
This is why I've been thinking of leaving out the 'with' keyword: in your mind, these calls would become new statement types, even though the compiler sees them all the same: synchronized(lock): BLOCK transactional(db): BLOCK forever(): BLOCK opening(filename) as f: BLOCK It does require the authors of such iterators to pick good names, and it doesn't look as good when the iterator is a method of some object: self.elements[0].locker.synchronized(): BLOCK You proposed this too (and I even commented on it, ages ago in this same endless message :-) and while I'm still on the fence, at least I now have a better motivational argument (i.e., that each iterator becomes a new statement type in your mind). One last thing: if we need a special name for iterators and generators designed for use in a with-statement, how about calling them with-iterators and with-generators. The non-looping kind can be called resource management iterators / generators. I think whatever term we come up with should not be a totally new term but a combination of iterator or generator with some prefix, and it should work both for iterators and for generators. That's all I can muster right now (I should've been in bed hours ago) but I'm feeling pretty good about this. 
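[Summarizer's aside: the Flow-exception machinery described above can be simulated in today's Python. In this sketch, ReturnFlow is Guido's proposed name, while run_with() is a hypothetical driver standing in for the compiler's translation of the with-statement; generator close() is used in place of the proposed __next__(exc) protocol, so this is an approximation of the semantics, not an implementation of them.]

```python
import threading

class ControlFlow(Exception):
    """Base for the proposed flow-control exceptions."""

class ReturnFlow(ControlFlow):
    def __init__(self, value=None):
        super().__init__(value)
        self.value = value

def synchronized(lock):
    # A non-looping "with-iterator": acquire before the block runs,
    # release afterwards, even if the block escapes via ReturnFlow.
    lock.acquire()
    try:
        yield
    finally:
        lock.release()

def run_with(witer, body):
    # Hypothetical stand-in for the translation of "with EXPR: BODY".
    next(witer)                # run the generator up to its yield
    try:
        body()                 # execute the block
    except ReturnFlow as exc:
        witer.close()          # the generator's finally clause runs here
        return exc.value       # model "return VALUE" escaping the block
    witer.close()
    return None

lock = threading.Lock()

def block():
    assert lock.locked()       # we are inside the synchronized region
    raise ReturnFlow(42)       # models "return 42" inside the block

print(run_with(synchronized(lock), block))  # -> 42
print(lock.locked())                        # -> False (lock released)
```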
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rodsenra at gpr.com.br Tue Apr 26 14:11:59 2005 From: rodsenra at gpr.com.br (Rodrigo Dias Arruda Senra) Date: Tue Apr 26 14:11:31 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <877e9a1705042520151e493f3f@mail.gmail.com> References: <fb6fbf560504251234553e881f@mail.gmail.com> <ca471dc205042513162c5cff33@mail.gmail.com> <fb6fbf560504251501111fa7a5@mail.gmail.com> <ca471dc20504251511689cffd3@mail.gmail.com> <877e9a1705042520151e493f3f@mail.gmail.com> Message-ID: <20050426091159.74a83735@localhost.localdomain> [ Michael Walter ]: > A couple of examples out of my tired head (solely from a user perspective) :-) > > Embedding domain specific language (ex.: state machine): > ... > > Embedding domain specific language (ex.: markup language): > ... > > Embedding domain-specific language (ex.: badly-designed database table): > ... > > ..., which might actually prove someone's point that the > language designer shouldn't allow people to do such things. The whole macros issue comes down to a tradeoff between power+expressiveness and readability. IMVHO, macros are readability assassins. The power (for any developer) to introduce new syntax is *not* a desirable feature, but something to be avoided. And that alone should be a stronger argument than a hundred use cases. cheers, Senra -- Rodrigo Senra -- MSc Computer Engineer rodsenra(at)gpr.com.br GPr Sistemas Ltda http://www.gpr.com.br/ Personal Blog http://rodsenra.blogspot.com/ From greg.ewing at canterbury.ac.nz Tue Apr 26 14:36:26 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 14:45:31 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426C54BF.2010906@ocf.berkeley.edu> <426DBCCE.40903@canterbury.ac.nz> <426DDFF6.3060808@ocf.berkeley.edu> Message-ID: <426E35CA.60004@canterbury.ac.nz> Brett C.
wrote: > It executes the body, calling next() on the argument > name on each time through until the iteration stops. But that's no good, because (1) it mentions next(), which should be an implementation detail, and (2) it talks about iteration, when most of the time the high-level intent has nothing to do with iteration. In other words, this is too low a level of explanation. Greg From michael.walter at gmail.com Tue Apr 26 14:51:22 2005 From: michael.walter at gmail.com (Michael Walter) Date: Tue Apr 26 14:51:26 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <20050426091159.74a83735@localhost.localdomain> References: <fb6fbf560504251234553e881f@mail.gmail.com> <ca471dc205042513162c5cff33@mail.gmail.com> <fb6fbf560504251501111fa7a5@mail.gmail.com> <ca471dc20504251511689cffd3@mail.gmail.com> <877e9a1705042520151e493f3f@mail.gmail.com> <20050426091159.74a83735@localhost.localdomain> Message-ID: <877e9a17050426055138d243af@mail.gmail.com> On 4/26/05, Rodrigo Dias Arruda Senra <rodsenra@gpr.com.br> wrote: > IMVHO, macros are readability assassins. The power (for any developer) > to introduce new syntax is *not* a desirable feature, but something > to be avoided. And that alone should be a stronger argument than > a hundred use cases. Personally, I believe that EDSLs can improve usability of a library. I've been following this list for quite a while, and trying to see what lengths (hacks) people go (use) to implement "sexy" syntax can give you quite an idea that custom syntax matters. And surely all of these tricks (hacks) are way harder to use than an EDSL would be.
Regards, Michael From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 26 14:49:41 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 26 14:53:47 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <d4ld6r$ms5$1@sea.gmane.org> Guido van Rossum wrote: > [Greg Ewing] >> I like the general shape of this, but I have one or two >> reservations about the details. > > That summarizes the feedback so far pretty well. I think we're on to > something. And I'm not too proud to say that Ruby has led the way here > to some extent (even if Python's implementation would be fundamentally > different, since it's based on generators, which has some different > possibilities and precludes some Ruby patterns). Five random thoughts: 1. So if break and continue are allowed in with statements only when there is an enclosing loop, it would be an inconsistency; consider for item in seq: with gen(): continue when the generator gen catches the ContinueFlow and does with it what it wants. It is then slightly unfair not to allow with x: continue Anyway, I would consider both counterintuitive. So what about making ReturnFlow, BreakFlow and ContinueFlow "private" exceptions that cannot be caught in user code and instead introducing a new statement that allows passing data to the generator? 2. In the process of handling this, would it be reasonable to (re)introduce a combined try-except-finally statement with defined syntax (all except before finally) and behavior (finally is always executed)? 5. What about the intended usage of 'with' as in Visual B.. NO, NO, NOT THE WHIP!
(not that you couldn't emulate this with a clever "generator": def short(x): yield x with short(my.long["object"].reference()) as _: _.spam = _.ham = _.eggs() yours, Reinhold -- Mail address is perfectly valid! From ncoghlan at gmail.com Tue Apr 26 15:03:33 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 15:03:42 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <426E3C25.4090204@gmail.com> Guido van Rossum wrote: [snip] > - I think there's a better word than Flow, but I'll keep using it > until we find something better. How about simply reusing Iteration (ala StopIteration)? Pass in 'ContinueIteration' for 'continue' Pass in 'BreakIteration' for 'break' Pass in 'AbortIteration' for 'return' and finalisation. And advise strongly *against* intercepting AbortIteration with anything other than a finally block. > - The new __next__() API can also (nay, *must*, to make all this work > reliably) be used to define exception and cleanup semantics for > generators, thereby rendering obsolete PEP 325 and the second half > of PEP 288. When a generator is GC'ed (whether by reference > counting or by the cyclical garbage collector), its __next__() > method is called with a BreakFlow exception instance as argument (or > perhaps some other special exception created for the purpose). If > the generator catches the exception and yields another value, too > bad -- I consider that broken behavior. (The alternative would be > to keep calling __next__(BreakFlow()) until it doesn't return a > value, but that feels uncomfortable in a finalization context.)
As suggested above, perhaps the exception used here should be the exception that is raised when a 'return' statement is encountered inside the block, rather than the more-likely-to-be-messed-with 'break' statement. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From greg.ewing at canterbury.ac.nz Tue Apr 26 14:58:41 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue Apr 26 15:07:45 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <426E3B01.1010007@canterbury.ac.nz> Guido van Rossum wrote: > [Greg Ewing] >>* It seems to me that this same exception-handling mechanism >>would be just as useful in a regular for-loop, and that, once >>it becomes possible to put 'yield' in a try-statement, people >>are going to *expect* it to work in for-loops as well. > > (You can already put a yield inside a try-except, just not inside a > try-finally.) Well, my point still stands. People are going to write try-finally around their yields and expect the natural thing to happen when their generator is used in a for-loop. > There would still be the difference that a for-loop invokes iter() and > a with-block doesn't. > > Also, for-loops that don't exhaust the iterator leave it available for > later use. Hmmm. But are these big enough differences to justify having a whole new control structure? Whither TOOWTDI? > """ > The statement: > > for VAR in EXPR: > BLOCK > > does the same thing as: > > with iter(EXPR) as VAR: # Note the iter() call > BLOCK > > except that: > > - you can leave out the "as VAR" part from the with-statement; > - they work differently when an exception happens inside BLOCK; > - break and continue don't always work the same way. 
> > The only time you should write a with-statement is when the > documentation for the function you are calling says you should. > """ Surely you jest. Any newbie reading this is going to think he hasn't a hope in hell of ever understanding what is going on here, and give up on Python in disgust. >>I'm seriously worried by the >>possibility that a return statement could do something other >>than return from the function it's written in. > Let me explain the use cases that led me to throwing that in Yes, I can see that it's going to be necessary to treat return as an exception, and accept the possibility that it will be abused. I'd still much prefer people refrain from abusing it that way, though. Using "return" to spell "send value back to yield statement" would be extremely obfuscatory. > (BTW ReturnFlow etc. aren't great > names. Suggestions?) I'd suggest just calling them Break, Continue and Return. > synchronized(lock): > BLOCK > > transactional(db): > BLOCK > > forever(): > BLOCK > > opening(filename) as f: > BLOCK Hey, I like that last one! Well done! > One last thing: if we need a special name for iterators and generators > designed for use in a with-statement, how about calling them > with-iterators and with-generators. Except that if it's no longer a "with" statement, this doesn't make so much sense... 
Greg From reinhold-birkenfeld-nospam at wolke7.net Tue Apr 26 15:13:58 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Tue Apr 26 15:18:41 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E3C25.4090204@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3C25.4090204@gmail.com> Message-ID: <d4lekc$tnq$1@sea.gmane.org> Nick Coghlan wrote: > Guido van Rossum wrote: > [snip] >> - I think there's a better word than Flow, but I'll keep using it >> until we find something better. > > How about simply reusing Iteration (ala StopIteration)? > > Pass in 'ContinueIteration' for 'continue' > Pass in 'BreakIteration' for 'break' > Pass in 'AbortIteration' for 'return' and finalisation. > > And advise strongly *against* intercepting AbortIteration with anything other > than a finally block. Hmmm... another idea: If break, continue and return keep exactly the current semantics (break or continue the innermost for/while-loop), do we need different exceptions at all? AFAICS AbortIteration (+1 on the name) would be sufficient for all three interrupting statements, and this would prevent misuse too, I think. yours, Reinhold -- Mail address is perfectly valid! From caglar at uludag.org.tr Tue Apr 26 15:29:29 2005 From: caglar at uludag.org.tr (S.Çağlar Onur) Date: Tue Apr 26 15:29:36 2005 Subject: [Python-Dev] Removing --with-wctype-functions support Message-ID: <1114522169.5352.9.camel@poseidon.cekirdek.int> Hi; I just subscribed to this list, so I don't know whether this was discussed before. If so, sorry. I want to know the status of http://mail.python.org/pipermail/python-dev/2004-December/050193.html this thread. Will Python remove wctype functions support from its core? If it will, what about locale-dependent case conversion functions?
Without this support Python behaves wrong in the tr_TR.UTF-8 locale. As a side effect, this problem was reported to Gentoo Linux to add wctype support to the current Python ebuild ( http://bugs.gentoo.org/show_bug.cgi?id=69322 ), but they don't want to add any removed feature to their ebuilds, and also not break anything because portage is built on Python. So :), what will the fate of wctype be? Yours -- S.Çağlar Onur <caglar@uludag.org.tr> http://cekirdek.uludag.org.tr/~caglar/ Linux is like living in a teepee. No Windows, no Gates and an Apache in house! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part Url : http://mail.python.org/pipermail/python-dev/attachments/20050426/4f66994d/attachment.pgp From ncoghlan at gmail.com Tue Apr 26 15:44:30 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 15:44:39 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d4lekc$tnq$1@sea.gmane.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3C25.4090204@gmail.com> <d4lekc$tnq$1@sea.gmane.org> Message-ID: <426E45BE.2020009@gmail.com> Reinhold Birkenfeld wrote: > Nick Coghlan wrote: > >>Guido van Rossum wrote: >>[snip] >> >>>- I think there's a better word than Flow, but I'll keep using it >>> until we find something better. >> >>How about simply reusing Iteration (ala StopIteration)? >> >> Pass in 'ContinueIteration' for 'continue' >> Pass in 'BreakIteration' for 'break' >> Pass in 'AbortIteration' for 'return' and finalisation. >> >>And advise strongly *against* intercepting AbortIteration with anything other >>than a finally block. > > > Hmmm...
another idea: If break, continue and return keep exactly the current > semantics (break or continue the innermost for/while-loop), do we need > different exceptions at all? AFAICS AbortIteration (+1 on the name) would be > sufficient for all three interrupting statements, and this would prevent > misuse too, I think. No, the iterator should be able to keep state around in the case of BreakIteration and ContinueIteration, whereas AbortIteration should shut the whole thing down. In particular "VAR = yield None" is likely to become syntactic sugar for: try: yield None except ContinueIteration, exc: VAR = exc.value We definitely don't want that construct swallowing AbortIteration. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Tue Apr 26 16:13:07 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 16:13:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042416572da9db71@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> Message-ID: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Whew! This is a bit long... On 25 Apr 2005, at 00:57, Guido van Rossum wrote: > After reading a lot of contributions (though perhaps not all -- this > thread seems to bifurcate every time someone has a new idea :-) I haven't read all the posts around the subject, I'll have to admit. I've read the one I'm replying to and its followups pretty carefully, though. > I'm back to liking yield for the PEP 310 use case. I think maybe it was > Doug Landauer's post mentioning Beta, plus scanning some more examples > of using yield in Ruby. Jim Jewett's post on defmacro also helped, as > did Nick Coghlan's post explaining why he prefers 'with' for PEP 310 > and a bare expression for the 'with' feature from Pascal (and other > languages :-).
The history of iterators and generators could be summarized by saying that an API was invented, then it turned out that in practice one way of implementing them -- generators -- was almost universally useful. This proposal seems a bit like an effort to make generators good at doing something that they aren't really intended -- or dare I say suited? -- for. The tail wagging the dog so to speak. > It seems that the same argument that explains why generators are so > good for defining iterators, also applies to the PEP 310 use case: > it's just much more natural to write > > def with_file(filename): > f = open(filename) > try: > yield f > finally: > f.close() This is a syntax error today, of course. When does the finally: clause execute with your proposal? [I work this one out below :)] > than having to write a class with __entry__ and __exit__ and > __except__ methods (I've lost track of the exact proposal at this > point). > At the same time, having to use it as follows: > > for f in with_file(filename): > for line in f: > print process(line) > > is really ugly, This is a non-starter, I hope. I really meant what I said in PEP 310 about loops being loops. > so we need new syntax, which also helps with keeping > 'for' semantically backwards compatible. So let's use 'with', and then > the using code becomes again this: > > with f = with_file(filename): > for line in f: > print process(line) > > Now let me propose a strawman for the translation of the latter into > existing semantics. 
Let's take the generic case: > > with VAR = EXPR: > BODY > > This would translate to the following code: > > it = EXPR > err = None > while True: > try: > if err is None: > VAR = it.next() > else: > VAR = it.next_ex(err) > except StopIteration: > break > try: > err = None > BODY > except Exception, err: # Pretend "except Exception:" == > "except:" > if not hasattr(it, "next_ex"): > raise > > (The variables 'it' and 'err' are not user-visible variables, they are > internal to the translation.) > > This looks slightly awkward because of backward compatibility; what I > really want is just this: > > it = EXPR > err = None > while True: > try: > VAR = it.next(err) > except StopIteration: > break > try: > err = None > BODY > except Exception, err: # Pretend "except Exception:" == > "except:" > pass > > but for backwards compatibility with the existing argument-less next() > API More than that: if I'm implementing an iterator for, uh, iterating, why would one dream of needing to handle an 'err' argument in the next() method? > I'm introducing a new iterator API next_ex() which takes an > exception argument. If that argument is None, it should behave just > like next(). Otherwise, if the iterator is a generator, this will > raised that exception in the generator's frame (at the point of the > suspended yield). If the iterator is something else, the something > else is free to do whatever it likes; if it doesn't want to do > anything, it can just re-raise the exception. Ah, this answers my 'when does finally' execute question above. > Finally, I think it would be cool if the generator could trap > occurrences of break, continue and return occurring in BODY. We could > introduce a new class of exceptions for these, named ControlFlow, and > (only in the body of a with statement), break would raise BreakFlow, > continue would raise ContinueFlow, and return EXPR would raise > ReturnFlow(EXPR) (EXPR defaulting to None of course). Well, this is quite a big thing. 
> So a block could return a value to the generator using a return > statement; the generator can catch this by catching ReturnFlow. > (Syntactic sugar could be "VAR = yield ..." like in Ruby.) > > With a little extra magic we could also get the behavior that if the > generator doesn't handle ControlFlow exceptions but re-raises them, > they would affect the code containing the with statement; this means > that the generator can decide whether return, break and continue are > handled locally or passed through to the containing block. > > Note that EXPR doesn't have to return a generator; it could be any > object that implements next() and next_ex(). (We could also require > next_ex() or even next() with an argument; perhaps this is better.) My main objection to all this is that it conflates iteration and a more general kind of execution control (I guess iteration is a kind of execution control, but I contend that it's a sufficiently common case to get special treatment and also that names like 'for' and 'next' are only applicable to iteration). So, here's a counterproposal! with expr as var: ... code ... is roughly: def _(var): ... code ... __private = expr __private(_) (var optional as in other proposals). so one might write: def auto_closing(f): def inner(block): try: block(f) finally: f.close() return inner and have with auto_closing(open("/tmp/foo")) as f: f.write('bob') The translation above is only approximate because you'd want to make assignments in '...code...' affect the scope they're written in, and also one might want to allow breaks and continues to be handled as in the end of your proposal. And grudgingly, I guess you'd need to make returns behave like that anyway. Has something like this been argued out somewhere in this thread?
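[Summarizer's aside: Hudson's thunk-based translation runs as ordinary Python today. In this sketch, auto_closing is his example manager and _block stands in for the compiler turning the with-body into a function; the names are illustrative, not part of any proposal in the thread.]

```python
import io

def auto_closing(f):
    # The manager: run the block (a thunk) with f, then close f.
    def inner(block):
        try:
            block(f)
        finally:
            f.close()
    return inner

# "with auto_closing(buf) as f: f.write('bob')" would expand to
# roughly the following: the block body becomes a function.
buf = io.StringIO()
written = []

def _block(f):
    f.write('bob')
    written.append(f.getvalue())  # record before the manager closes f

auto_closing(buf)(_block)
print(written)     # -> ['bob']
print(buf.closed)  # -> True
```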
As another example, here's how you'd implement something very like a for loop: def as_for_loop(thing): it = iter(thing) def inner(thunk): while 1: try: v = it.next() except StopIteration: break try: thunk(v) except Continue: continue except Break: break so for x in s: and with as_for_loop(s) as x: are now equivalent (I hope :). Cheers, mwh From mwh at python.net Tue Apr 26 16:26:27 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 16:26:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <67909725b5458513bfe4eb9da573af93@python.net> On 26 Apr 2005, at 15:13, Michael Hudson wrote: > So, here's a counterproposal! And a correction! > with expr as var: > ... code ... > > is roughly: def _(var): ... code ... try: expr(_) except Return, e: return e.value Cheers, mwh From ncoghlan at gmail.com Tue Apr 26 17:14:51 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 17:14:59 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <426E5AEB.3030707@gmail.com> Michael Hudson wrote: > This is a non-starter, I hope. I really meant what I said in PEP 310 > about loops being loops. The more I play with this, the more I want the 'with' construct to NOT be a loop construct. 
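[As an aside: Michael's as_for_loop sketch above really does behave like a for loop once the Break and Continue exception classes it presupposes are filled in. A runnable present-day rendering:]

```python
class Break(Exception):
    """Raised by the thunk to request a 'break' of the driving loop."""

class Continue(Exception):
    """Raised by the thunk to request a 'continue'."""

def as_for_loop(thing):
    it = iter(thing)
    def inner(thunk):
        while 1:
            try:
                v = next(it)        # it.next() in the Python of the time
            except StopIteration:
                break
            try:
                thunk(v)
            except Continue:
                continue
            except Break:
                break
    return inner

# Equivalent of a for loop over [1, 2, 3, 4, 5] that skips 2, stops at 4.
seen = []
def body(x):
    if x == 2:
        raise Continue
    if x == 4:
        raise Break
    seen.append(x)
as_for_loop([1, 2, 3, 4, 5])(body)
# seen == [1, 3]
```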
The main reason is that it would be really nice to be able to write and use a multipart code template as: def template(): # pre_part_1 yield None # post_part_1 yield None # pre_part_2 yield None # post_part_2 yield None # pre_part_3 yield None # post_part_3 def user(): block = template() with block: # do_part_1 with block: # do_part_2 with block: # do_part_3 If 'with' is a looping construct, the above won't work, since the first usage will drain the template. Accordingly, I would like to suggest that 'with' revert to something resembling the PEP 310 definition: resource = EXPR if hasattr(resource, "__enter__"): VAR = resource.__enter__() else: VAR = None try: try: BODY except: raise # Force realisation of sys.exc_info() for use in __exit__() finally: if hasattr(resource, "__exit__"): VAR = resource.__exit__() else: VAR = None Generator objects could implement this protocol, with the following behaviour: def __enter__(): try: return self.next() except StopIteration: raise RuntimeError("Generator exhausted, unable to enter with block") def __exit__(): try: return self.next() except StopIteration: return None def __except__(*exc_info): pass def __no_except__(): pass Note that the code template can deal with exceptions quite happily by utilising sys.exc_info(), and that the result of the call to __enter__ is available *inside* the with block, while the result of the call to __exit__ is available *after* the block (useful for multi-part blocks). If I want to drain the template, then I can use a 'for' loop (albeit without the cleanup guarantees). 
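[A small wrapper class makes the multi-part template idea concrete. Today's with statement is used below purely to stand in for the proposed one, and this __exit__ never suppresses exceptions, which is an assumption: the message leaves the __except__/__no_except__ details open.]

```python
class GenResource:
    """Wrap a generator in the sketched __enter__/__exit__ protocol."""
    def __init__(self, gen):
        self.gen = gen
    def __enter__(self):
        try:
            return next(self.gen)
        except StopIteration:
            raise RuntimeError(
                "Generator exhausted, unable to enter with block")
    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            next(self.gen)
        except StopIteration:
            pass
        return False                # never swallow exceptions here

trace = []
def template():
    trace.append("pre_part_1");  yield None
    trace.append("post_part_1"); yield None
    trace.append("pre_part_2");  yield None
    trace.append("post_part_2")

block = GenResource(template())
with block:                         # runs pre_part_1 / post_part_1
    trace.append("do_part_1")
with block:                         # runs pre_part_2 / post_part_2
    trace.append("do_part_2")
# trace interleaves template parts and body parts, in order
```

Because each __enter__/__exit__ pair advances the generator only one step, the same template object can be entered several times without being drained.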
Taking this route would mean that: * PEP 310 and the question of passing values or exceptions into iterators would again become orthogonal * Resources written using generator syntax aren't cluttered with the repetitive try/finally code PEP 310 is trying to eliminate * 'for' remains TOOW to write an iterative loop * it is possible to execute _different_ suites between each yield in the template block, rather than being constrained to a single suite as in the looping case. * no implications for the semantics of 'return', 'break', 'continue' * 'yield' would not be usable inside a with block, unless the AbortIteration concept was adopted for forcible generator termination. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From jimjjewett at gmail.com Tue Apr 26 17:26:30 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 17:26:32 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse Message-ID: <fb6fbf5605042608267ed17786@mail.gmail.com> Michael Hudson: > This proposal seems a bit like an effort to make generators good at > doing something that they aren't really intended -- or dare I say > suited? -- for. I think it is more an effort to use the right keyword, which has unfortunately already been claimed by generators (and linked to iterators). yield is a sensible way for code to say "your turn, but come back later". But at the moment, it means "I am producing an intermediate value", and the way to call that function is to treat it as an iterator (which seems to imply looping over a closed set, so don't send in more information after the initial setup). Should we accept that "yield" is already used up, or should we shoehorn the concepts until they're "close enough"? > So, here's a counterproposal! > with expr as var: > ... code ... > is roughly: > def _(var): > ... code ... > __private = expr > __private(_) ... 
> The need for approximation in the above translation is necessary > because you'd want to make assignments in '...code...' affect the scope > they're written in, To me, this seems like the core requirement. I see three sensible paths: (1) Do nothing. (2) Add a way to say "Make this function I'm calling use *my* locals and globals." This seems to meet all the agreed-upon-as-good use cases, but there is disagreement over how to sensibly write it. The calling function is the place that could get surprised, but people who want thunks seem to want the specialness in the called function. (3) Add macros. We still have to figure out how to limit their obfuscation. Attempts to detail that goal seem to get sidetracked. -jJ From tjreedy at udel.edu Tue Apr 26 18:00:30 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Tue Apr 26 18:05:05 2005 Subject: [Python-Dev] Re: Re: Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426CB7C2.8030508@gmail.com> <d4iv44$9gn$1@sea.gmane.org> <426DC75A.1010005@canterbury.ac.nz> Message-ID: <d4lo91$9g6$1@sea.gmane.org> "Greg Ewing" <greg.ewing@canterbury.ac.nz> wrote in message news:426DC75A.1010005@canterbury.ac.nz... > Terry Reedy wrote: The part you quoted was by Nick Coghlan, not me, as indicated by the >> (now >>>) instead of > (which would now be >>) in front of the lines. >>>Not supporting iterables makes it harder to write a class which is ... From ark-mlist at att.net Tue Apr 26 18:09:58 2005 From: ark-mlist at att.net (Andrew Koenig) Date: Tue Apr 26 18:09:49 2005 Subject: [Python-Dev] defmacro In-Reply-To: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> > This doesn't feel right to me. By that argument, people would want > to "improve" > > (mapcar (lambda (x) (car x)) list-of-lists) > > to > > (mapcar list-of-lists (x) (car x)) > > Have you ever heard someone complain about that lambda, though? Welllll.... 
Shouldn't you have written (mapcar car list-of-lists) or am I missing something painfully obvious? From facundobatista at gmail.com Tue Apr 26 18:22:15 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 26 18:22:18 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <ca471dc20504250957753a7445@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> Message-ID: <e04bdf31050426092230502ab1@mail.gmail.com> On 4/25/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > > I was in my second class of the Python workshop I'm giving here in one > > Argentine University, and I was explaining how to think using > > name/object and not variable/value. > > > > Using id() for being pedagogic about the objects, the kids saw that > > id(3) was always the same, but id([]) not. I explained to them that > > Python, in some circumstances, caches the object, and I kept them > > happy enough. > > > > But I really don't know what objects and in which circumstances. > > Aargh! Bad explanation. Or at least you're missing something: Not really. It's easier for me to show that id(3) is always the same and id([]) not, and let the kids see that's not so easy and you'll have to look deeper if you want to know better. If I did id(3) and id(500), then the difference would look more subtle, and I would have had to explain it at greater length. Remember, it was the second day (2 hours per day). > implementation is free to use caching. In practice, I believe ints > between -5 and 100 are cached, and 1-character strings are often > cached (but not always). These are exactly my doubts, ;). . 
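[For reference, the caching and interning under discussion are easy to demonstrate. All of this is CPython implementation detail, not a language guarantee, and the exact cached range varies between versions; present-day Python shown.]

```python
import sys

# int(...) builds values at run time, defeating the compile-time constant
# sharing that would otherwise confuse the demonstration.
a = int("7")
b = int("7")
print(a is b)        # True in CPython: small ints come from a cache

c = int("500")
d = int("500")
print(c is d)        # typically False: larger ints are fresh objects

# Strings built at run time are not interned automatically, but
# intern() (sys.intern in Python 3) maps equal strings to one object:
s = "".join(["he", "llo"])
print(sys.intern(s) is sys.intern("hello"))   # True
```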
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From facundobatista at gmail.com Tue Apr 26 18:24:11 2005 From: facundobatista at gmail.com (Facundo Batista) Date: Tue Apr 26 18:24:13 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <426DC58D.2010102@canterbury.ac.nz> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> <426DC58D.2010102@canterbury.ac.nz> Message-ID: <e04bdf3105042609241b161d84@mail.gmail.com> On 4/26/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Also, string literals that resemble Python identifiers > are often interned, although this is not guaranteed. > And this only applies to literals, not strings constructed > dynamically by the program (unless you explicitly apply > intern() to them). This simplifies the whole thing. If the issue arises again, my speech will be: "Don't worry about that, Python worries for you". :D And if *someone* in particular is still interested in it (I'm pretty sure the whole class won't), I'll explain it to him better, and with more time. Thank you! . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From gvanrossum at gmail.com Tue Apr 26 18:57:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 18:57:04 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042608267ed17786@mail.gmail.com> References: <fb6fbf5605042608267ed17786@mail.gmail.com> Message-ID: <ca471dc20504260957523dce36@mail.gmail.com> > (2) Add a way to say "Make this function I'm calling use *my* locals > and globals." This seems to meet all the agreed-upon-as-good use > cases, but there is disagreement over how to sensibly write it. The > calling function is the place that could get surprised, but people > who want thunks seem to want the specialness in the called function. 
I think there are several problems with this. First, it looks difficult to provide semantics that cover all the corners for the blending of two namespaces. What happens to names that have a different meaning in each scope? (E.g. 'self' when calling a method of another object; or any other name clash.) Are the globals also blended? How? Second, this construct would have to make sense for all callables; you seem to want to apply it to functions (and I suppose methods, whether bound or not), but it makes no sense when the callable is implemented as a C function, or is a class, or an object with a __call__ method. Third, I expect that if we solve the first two problems, we'll still find that for an efficient implementation we need to modify the bytecode of the called function. If you really want to pursue this idea beyond complaining "nobody listens to me" (which isn't true BTW), I suggest that you try to define *exactly* how you think it should work. Try to make sure that it can be used in a "statement context" as well as in an "expression context". You don't need to come up with a working implementation, but you should be able to convince me (or Raymond H :-) that it *can* be implemented, and that the performance will be reasonable, and that it won't affect performance when not used, etc. If you think that's beyond you, then perhaps you should accept "no" as the only answer you're gonna get. Because I personally strongly suspect that it won't work, so the burden of "proof", so to speak, is on you. > (3) Add macros. We still have to figure out how to limit their obfuscation. > Attempts to detail that goal seem to get sidetracked. No, the problem is not how to limit the obfuscation. The problem is the same as for (2), only more so: nobody has given even a *remotely* plausible mechanism for how exactly you would get code executed at compile time. You might want to look at Boo, a Python-inspired language that translates to C#. 
They have something they call syntactic macros: http://boo.codehaus.org/Syntactic+Macros . -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pedronis at strakt.com Tue Apr 26 19:12:22 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Tue Apr 26 19:12:27 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> Message-ID: <426E7676.2040501@strakt.com> Michael Hudson wrote: > The history of iterators and generators could be summarized by saying > that an API was invented, then it turned out that in practice one way > of implementing them -- generators -- was almost universally useful. > > This proposal seems a bit like an effort to make generators good at > doing something that they aren't really intended -- or dare I say > suited? -- for. The tail wagging the dog so to speak. > It is fun because the two of us sort of already had this discussion in compressed form a long time ago: http://groups-beta.google.com/groups?q=with+generators+pedronis&hl=en not that I was really convinced about my idea at the time which was very embryonic, and in fact I'm a bit skeptical right now about how much bending or not of generators makes sense, especially from a learnability point of view. 
From mwh at python.net Tue Apr 26 19:48:26 2005 From: mwh at python.net (Michael Hudson) Date: Tue Apr 26 19:48:28 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E7676.2040501@strakt.com> (Samuele Pedroni's message of "Tue, 26 Apr 2005 19:12:22 +0200") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E7676.2040501@strakt.com> Message-ID: <2mk6mp2zv9.fsf@starship.python.net> Samuele Pedroni <pedronis@strakt.com> writes: > Michael Hudson wrote: > >> The history of iterators and generators could be summarized by >> saying that an API was invented, then it turned out that in practice >> one way of implementing them -- generators -- was almost universally >> useful. >> >> This proposal seems a bit like an effort to make generators good at >> doing something that they aren't really intended -- or dare I say >> suited? -- for. The tail wagging the dog so to speak. >> > it is fun because the two of us sort of already had this discussion in > compressed form a long time ago: Oh yes. That was the discussion that led to PEP 310 being written. > http://groups-beta.google.com/groups?q=with+generators+pedronis&hl=en At least I'm consistent :) > not that I was really convinced about my idea at the time which was > very embryonic, and in fact I'm a bit skeptical right now about how > much bending or not of generators makes sense, especially from a > learnability point of view. As am I, obviously. Cheers, mwh -- Arrrrgh, the braindamage! It's not unlike the massively non-brilliant decision to use the period in abbreviations as well as a sentence terminator. Had these people no imagination at _all_? 
-- Erik Naggum, comp.lang.lisp From rrr at ronadam.com Tue Apr 26 20:18:17 2005 From: rrr at ronadam.com (ron adam) Date: Tue Apr 26 20:18:40 2005 Subject: [Python-Dev] Re: anonymous blocks Message-ID: <426E85E9.4060606@ronadam.com> Hi, this is my first post here and I've been following this very interesting discussion as it has developed. A really short intro about me, I was trained as a computer tech in the early 80's... ie. learned transistors, gates, logic etc... And so my focus tends to be from that of a troubleshooter. I'm medically retired now (not a subject for here) and am looking for something meaningful and rewarding that I can contribute to with my free time. I will not post often at first as I am still getting up to speed with CVS and how Python's core works. Hopefully I'm not lagging this discussion too far or adding unneeded noise to it. :-) >> So maybe the 'with' keyword should be dropped (again!) in >> favour of >> >> with_opened(pathname) as f: >> ... >> > > But that doesn't look so great for the case where there's no variable > to be assigned to -- I wasn't totally clear about it, but I meant the > syntax to be > > with [VAR =] EXPR: BLOCK > > where VAR would have the same syntax as the left hand side of an > assignment (or the variable in a for-statement). > I keep wanting to read it as: with OBJECT [from EXPR]: BLOCK >> 2) I'm not sure about the '='. It makes it look rather deceptively >> like an ordinary assignment, and I'm sure many people are going >> to wonder what the difference is between >> >> with f = opened(pathname): >> do_stuff_to(f) >> >> and simply >> >> f = opened(pathname) >> do_stuff_to(f) >> >> or even just unconsciously read the first as the second without >> noticing that anything special is going on. Especially if they're >> coming from a language like Pascal which has a much less magical >> form of with-statement. >> Below is what gives me the clearest picture so far. To me there is nothing 'anonymous' going on here. 
Which is good I think. :-) After playing around with Guido's example a bit, it looks to me like the role of a 'with' block is to define the life of a resource object. So "with OBJECT: BLOCK" seems to me to be the simplest and most natural way to express this. def with_file(filename, mode): """ Create a file resource """ f = open(filename, mode) try: yield f # use yield here finally: # Do at exit of 'with <resource>: <block>' f.close() # Get a resource/generator object and use it. f_resource = with_file('resource.py', 'r') with f_resource: f = f_resource.next() # get values from yields for line in f: print line, # Generator resource with yield loop. def with_file(filename): """ Create a file line resource """ f = open(filename, 'r') try: for line in f: yield line finally: f.close() # print lines in this file. f_resource = with_file('resource.py') with f_resource: while 1: line = f_resource.next() if line == "": break print line, The life of an object used with a 'with' block is shorter than that of the function it is called from, but if the function is short, the life could be the same as the function. Then the 'with' block could be optional if the resource object's __exit__ method is called when the function exits, but that may require some way to tag a resource as being different from other classes and generators to keep from evaluating __exit__ methods of other objects. As far as looping behaviors go, I prefer the loop to be explicitly defined in the resource or the body of the 'with', because it looks to be more flexible. Ron_Adam # "The right question is a good start to finding the correct answer." 
From aahz at pythoncraft.com Tue Apr 26 21:02:56 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 26 21:03:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> Message-ID: <20050426190256.GA14052@panix.com> On Tue, Apr 26, 2005, Guido van Rossum wrote: > > Now there's one more twist, which you may or may not like. Presumably > (barring obfuscations or bugs) the handling of BreakFlow and > ContinueFlow by an iterator (or generator) is consistent for all uses > of that particular iterator. For example synchronized(lock) and > transactional(db) do not behave as loops, and forever() does. Ditto > for handling ReturnFlow. This is why I've been thinking of leaving > out the 'with' keyword: in your mind, these calls would become new > statement types, even though the compiler sees them all the same: > > synchronized(lock): > BLOCK > > transactional(db): > BLOCK > > forever(): > BLOCK > > opening(filename) as f: > BLOCK That's precisely why I think we should keep the ``with``: the point of Python is to have a restricted syntax and requiring a prefix for these constructs makes it easier to read the code. You'll soon start to gloss over the ``with`` but it will be there as a marker for your subconscious. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
From nicksjacobson at hotmail.com Tue Apr 26 21:57:17 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 21:57:20 2005 Subject: [Python-Dev] atexit missing an unregister method Message-ID: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> I was looking at the atexit module the other day; it seems like an elegant way to ensure that resources are cleaned up (that the garbage collector doesn't take care of). But while you can mark functions to be called with the 'register' method, there's no 'unregister' method to remove them from the stack of functions to be called. Nor is there any way to view this stack and e.g. call 'del' on a registered function. This would be useful in the following scenario, in which x and y are resources that need to be cleaned up, even in the event of a program exit: import atexit def free_resource(resource): ... atexit.register(free_resource, x) atexit.register(free_resource, y) # do operations with x and y, potentially causing the program to exit ... # if nothing caused the program to unexpectedly quit, close the resources free_resource(x) free_resource(y) #unregister the functions, so that you don't try to free the resources twice! atexit.unregisterall() Alternatively, it would be great if there were a way to view the stack of registered functions, and delete them from there. --Nick Jacobson _________________________________________________________________ Don’t just search. Find. Check out the new MSN Search! 
http://search.msn.click-url.com/go/onm00200636ave/direct/01/ From gvanrossum at gmail.com Tue Apr 26 21:59:55 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Tue Apr 26 22:00:00 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <ca471dc20504261259440098e0@mail.gmail.com> On 4/26/05, Nick Jacobson <nicksjacobson@hotmail.com> wrote: > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions to > be called. Nor is there any way to view this stack and e.g. call 'del' on a > registered function. > > This would be useful in the following scenario, in which x and y are > resources that need to be cleaned up, even in the event of a program exit: > > import atexit > > def free_resource(resource): > ... > > atexit.register(free_resource, x) > atexit.register(free_resource, y) > # do operations with x and y, potentially causing the program to exit > ... > # if nothing caused the program to unexpectedly quit, close the resources > free_resource(x) > free_resource(y) > #unregister the functions, so that you don't try to free the resources > twice! > atexit.unregisterall() > > Alternatively, it would be great if there were a way to view the stack of > registered functions, and delete them from there. Methinks that the resource cleanup routines ought to be written so as to be reentrant. That shouldn't be too hard (you can always maintain a global flag that means "already called"). 
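[The "already called" flag Guido mentions takes only a few lines. A sketch, with invented names, that makes any cleanup callable safely reentrant:]

```python
import atexit

def make_cleanup(resource, free):
    """Wrap free(resource) so that calling it twice is harmless."""
    done = [False]                  # the "already called" flag
    def cleanup():
        if not done[0]:
            done[0] = True
            free(resource)
    return cleanup

freed = []
cleanup_x = make_cleanup("x", freed.append)
atexit.register(cleanup_x)          # will run at exit...
cleanup_x()                         # ...but an early explicit call is fine
cleanup_x()                         # and a second call is a no-op
# freed == ["x"]: the resource was released exactly once
```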
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From skip at pobox.com Tue Apr 26 22:06:50 2005 From: skip at pobox.com (Skip Montanaro) Date: Tue Apr 26 22:06:54 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <17006.40794.279653.631294@montanaro.dyndns.org> Nick> But while you can mark functions to be called with the 'register' Nick> method, there's no 'unregister' method to remove them from the Nick> stack of functions to be called. Nor is there any way to view Nick> this stack and e.g. call 'del' on a registered function. Nick> This would be useful in the following scenario, in which x and y Nick> are resources that need to be cleaned up, even in the event of a Nick> program exit: Nick> import atexit Nick> def free_resource(resource): Nick> ... Nick> atexit.register(free_resource, x) Nick> atexit.register(free_resource, y) Nick> # do operations with x and y, potentially causing the program to exit Nick> ... Nick> # if nothing caused the program to unexpectedly quit, close the resources Nick> free_resource(x) Nick> free_resource(y) Nick> #unregister the functions, so that you don't try to free the resources Nick> twice! Nick> atexit.unregisterall() This seems like a poor argument for unregistering exit handlers. If you've registered an exit handler, why then explicitly do what you've already asked the system to do? Also, your proposed unregisterall() function would be dangerous. As an application writer you don't know what other parts of the system (libraries you use, for example) might have registered exit functions. 
Skip From aahz at pythoncraft.com Tue Apr 26 22:18:30 2005 From: aahz at pythoncraft.com (Aahz) Date: Tue Apr 26 22:18:33 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <20050426201830.GA5253@panix.com> On Tue, Apr 26, 2005, Nick Jacobson wrote: > > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions > to be called. Nor is there any way to view this stack and e.g. call 'del' > on a registered function. > > This would be useful in the following scenario, in which x and y are > resources that need to be cleaned up, even in the event of a program exit: > > import atexit > > def free_resource(resource): > ... > > atexit.register(free_resource, x) > atexit.register(free_resource, y) This seems like the wrong way. Why not do this: class ResourceCleanup: def register(self, resource, func): ... def unregister(self, resource): ... def __call__(self): ... handler = ResourceCleanup) atexit.register(handler) handler.register(x, free_resource) do(x) handler.unregister(x) Probably further discussion should go to comp.lang.python -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
From martin at v.loewis.de Tue Apr 26 22:24:37 2005 From: martin at v.loewis.de (Martin v. Löwis) Date: Tue Apr 26 22:24:40 2005 Subject: [Python-Dev] Removing --with-wctype-functions support In-Reply-To: <1114522169.5352.9.camel@poseidon.cekirdek.int> References: <1114522169.5352.9.camel@poseidon.cekirdek.int> Message-ID: <426EA385.90609@v.loewis.de> S.Çağlar Onur wrote: > I want to know status of > http://mail.python.org/pipermail/python-dev/2004-December/050193.html > this thread. The status is that they are still there. > Will python remove wctype functions support from its core? I don't know what MAL's plans are these days, but it is likely that he will remove the functions from the places where they are used at the moment. > If it will, what about locale-dependent case conversion functions? > > Without this support python behaves wrong in tr_TR.UTF-8 locale. I can sympathise with the problem. IMO, the right solution is to provide them through the locale module. That would have the advantage that the choice of locale-awareness of Unicode case conversions (etc.) is a per-script decision, rather than an interpreter build-time decision. Patches in this direction (adding the functions to _localemodule.c) are welcome, independent of whether they are removed from the methods on Unicode objects. Such functions should probably polymorphically operate both on byte strings and Unicode strings, allowing to deprecate the locale-specific methods on strings as well. 
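[The tr_TR complaint is about the dotted/dotless i: Unicode's default case mappings, which str.upper()/str.lower() always apply, are locale-independent, so Turkish text comes out wrong. A toy illustration in present-day Python; turkish_upper is a deliberately naive stand-in for the locale-module function being proposed, not a real API:]

```python
# The built-in methods use the locale-independent default mappings:
assert "i".upper() == "I"       # Turkish rules want İ (U+0130) here
assert "I".lower() == "i"       # Turkish rules want ı (U+0131) here

# A locale-aware conversion would have to special-case the two i's
# (this toy version ignores pre-existing capital I's, among other things):
def turkish_upper(s):
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()

print(turkish_upper("izmir"))   # İZMİR, with dotted capital I's
```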
Regards, Martin From python at rcn.com Mon Apr 25 22:25:50 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 26 22:26:26 2005 Subject: [Python-Dev] atexit missing an unregister method In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl> Message-ID: <001401c549d5$06b34200$7c29a044@oemcomputer> [Nick Jacobson] > I was looking at the atexit module the other day; it seems like an elegant > way to ensure that resources are cleaned up (that the garbage collector > doesn't take care of). > > But while you can mark functions to be called with the 'register' method, > there's no 'unregister' method to remove them from the stack of functions > to > be called. . . . > Alternatively, it would be great if there were a way to view the stack of > registered functions, and delete them from there. Please file a feature request on SourceForge. Will mull it over for a while. My first impression is that try/finally is a better tool for the scenario you outlined. The issue with unregister() is that the order of clean-up calls is potentially significant. If the same function is listed more than once, there would be no clear-cut way to know which should be removed when unregister() is called. Likewise, I suspect that exposing the stack will create more pitfalls and risks than it could provide in benefits. Dealing with a stack of functions is likely to be clumsy at best. Raymond Hettinger From mal at egenix.com Tue Apr 26 22:32:13 2005 From: mal at egenix.com (M.-A. Lemburg) Date: Tue Apr 26 22:32:17 2005 Subject: [Python-Dev] Removing --with-wctype-functions support In-Reply-To: <426EA385.90609@v.loewis.de> References: <1114522169.5352.9.camel@poseidon.cekirdek.int> <426EA385.90609@v.loewis.de> Message-ID: <426EA54D.2060604@egenix.com> Martin v. Löwis wrote: > S.Çağlar Onur wrote: > >>I want to know status of >>http://mail.python.org/pipermail/python-dev/2004-December/050193.html >>this thread. > > > The status is that they are still there. 
Due to lack of time on my part. >>Will python remove wctype functions support from its core? > > > I don't know what MAL's plans are these days, but it is likely that > he will remove the functions from the places where they are used > at the moment. Right. I haven't heard any complaints, so that's the plan. >>If it will, what about locale-dependent case conversation functions? >> >>Without this support python behaves wrong in tr_TR.UTF-8 locale. Could you be more specific about the problem ? It's probably best to open a bug report in SourceForge. > I can sympathise with the problem. IMO, the right solution is to provide > them throught the locale module. That would have the advantage that > the choice of locale-awareness of Unicode case conversions (etc.) is > a per-script decision, rather than an interpreter built-time decision. > > Patches in this direction (adding the functions to _localemodule.c) > are welcome, independent of whether they are removed from the methods > on Unicode objects. Such functions should probably polymorphically > operate both on byte strings and Unicode strings, allowing to deprecate > the locale-specific methods on strings as well. +1, though we will only want to deprecate the "locale dependency", not the methods themselves ;-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 26 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
:::: From nicksjacobson at hotmail.com Tue Apr 26 22:40:12 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 22:40:33 2005 Subject: [Python-Dev] Re: atexit missing an unregister method Message-ID: <BAY17-F150A28EE34D5041AD5AB3BA4210@phx.gbl> << This seems like a poor argument for unregistering exit handlers. If you've registered an exit handler, why then explicitly do what you've already asked the system to do? >> 1. To free up memory for the rest of the program. 2. If the following block is in a loop, and you need to allocate & then deallocate resources multiple times:

<<
atexit.register(free_resource, x)
atexit.register(free_resource, y)

# do operations with x and y, potentially causing the program to exit
...

# if nothing caused the program to unexpectedly quit, close the resources
free_resource(x)
free_resource(y)
>>

<< Also, your proposed unregisterall() function would be dangerous. As an application writer you don't know what other parts of the system (libraries you use, for example) might have registered exit functions. Skip >> That's true...it would probably be better to expose the stack of registered functions. That way you can manually unregister functions you've registered. --Nick Jacobson _________________________________________________________________ Express yourself instantly with MSN Messenger! Download today - it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ From nicksjacobson at hotmail.com Tue Apr 26 22:50:28 2005 From: nicksjacobson at hotmail.com (Nick Jacobson) Date: Tue Apr 26 22:50:31 2005 Subject: [Python-Dev] Re: atexit missing an unregister method Message-ID: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Raymond Hettinger wrote: << Will mull it over for a while. My first impression is that try/finally is a better tool for the scenario you outlined. >> You're right. try/finally takes care of my sample scenario. There may still be a case to be made for atexit.unregister(), though.
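For reference, the try/finally version of the loop scenario above might look like this — a minimal sketch where free_resource and the resource values are hypothetical stand-ins, not part of the atexit API:

```python
cleaned_up = []

def free_resource(res):
    # hypothetical cleanup helper, standing in for the free_resource()
    # in the example above; here it just records what was freed
    cleaned_up.append(res)

def use_resources():
    x, y = "x-res", "y-res"
    try:
        pass  # do operations with x and y; an exception here still cleans up
    finally:
        # runs on normal completion *and* on exceptions, so nothing
        # ever needs to be registered or unregistered with atexit
        free_resource(x)
        free_resource(y)

for _ in range(3):  # safe to repeat, unlike piling up atexit registrations
    use_resources()
```

Because the cleanup is scoped to the try block rather than to interpreter exit, repeating it in a loop never accumulates stale handlers.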
--Nick Jacobson _________________________________________________________________ On the road to retirement? Check out MSN Life Events for advice on how to get there! http://lifeevents.msn.com/category.aspx?cid=Retirement From paul at pfdubois.com Tue Apr 26 22:53:05 2005 From: paul at pfdubois.com (Paul Dubois) Date: Tue Apr 26 22:53:09 2005 Subject: [Python-Dev] python.org crashing Mozilla? Message-ID: <426EAA31.1050304@pfdubois.com> Three different computers running Linux / Mozilla are crashing Mozilla when directed to python.org. A Netscape works ok. Are we hacked or are we showing off? From modelnine at ceosg.de Wed Apr 27 00:58:21 2005 From: modelnine at ceosg.de (Heiko Wundram) Date: Tue Apr 26 22:58:48 2005 Subject: [Python-Dev] python.org crashing Mozilla? In-Reply-To: <426EAA31.1050304@pfdubois.com> References: <426EAA31.1050304@pfdubois.com> Message-ID: <200504270058.24606.modelnine@ceosg.de> On Tuesday, 26 April 2005 22:53, Paul Dubois wrote: > Three different computers running Linux / Mozilla are crashing Mozilla > when directed to python.org. A Netscape works ok. Are we hacked or are > we showing off? Firefox on Gentoo works okay...? -- --- Heiko. listening to: Incubus - Megalomaniac see you at: http://www.stud.mh-hannover.de/~hwundram/wordpress/ From fdrake at acm.org Tue Apr 26 23:00:49 2005 From: fdrake at acm.org (Fred L. Drake, Jr.) Date: Tue Apr 26 23:01:06 2005 Subject: [Python-Dev] python.org crashing Mozilla?
In-Reply-To: <426EAA31.1050304@pfdubois.com> References: <426EAA31.1050304@pfdubois.com> Message-ID: <200504261700.49587.fdrake@acm.org> On Tuesday 26 April 2005 16:53, Paul Dubois wrote: > Three different computers running Linux / Mozilla are crashing Mozilla > when directed to python.org. A Netscape works ok. Are we hacked or are > we showing off? Paul, My Firefox 1.0.2 is fine. What version(s) of Mozilla, and what host platforms, would be helpful. -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> From Ugo_DiGirolamo at invision.iip.com Tue Apr 26 23:15:38 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Tue Apr 26 23:14:54 2005 Subject: [Python-Dev] Problem with embedded python Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> I have the following code, that seems to make sense to me. However, it crashes about 1/3 of the times. My platform is Python 2.4.1 on WXP (I tried the release version from the msi and the debug version built by me, both downloaded today to have the latest version). The crash happens while the main thread is in Py_Finalize. I traced the crash to _Py_ForgetReference(op) in object.c at line 1847, where I have op->_ob_prev == NULL. What am I doing wrong? I'm definitely not too sure about the way I'm handling the GIL. 
Thanks in adv for any suggestion/ comment Cheers and ciao Ugo

////////////////////////// TestPyThreads.py //////////////////////////
#include <windows.h>
#include "Python.h"

int main()
{
    PyEval_InitThreads();
    Py_Initialize();
    PyGILState_STATE main_restore_state = PyGILState_UNLOCKED;
    PyGILState_Release(main_restore_state);

    // start the thread
    {
        PyGILState_STATE state = PyGILState_Ensure();
        int trash = PyRun_SimpleString(
            "import thread\n"
            "import time\n"
            "def foo():\n"
            "    f = open('pippo.out', 'w', 0)\n"
            "    i = 0;\n"
            "    while 1:\n"
            "        f.write('%d\\n'%i)\n"
            "        time.sleep(0.01)\n"
            "        i += 1\n"
            "t = thread.start_new_thread(foo, ())\n"
        );
        PyGILState_Release(state);
    }

    // wait 300 ms
    Sleep(300);

    PyGILState_Ensure();
    Py_Finalize();
    return 0;
}

From python at rcn.com Mon Apr 25 23:20:35 2005 From: python at rcn.com (Raymond Hettinger) Date: Tue Apr 26 23:20:49 2005 Subject: [Python-Dev] Re: atexit missing an unregister method In-Reply-To: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Message-ID: <003d01c549dc$9f2ebbc0$7c29a044@oemcomputer> [Raymond Hettinger] > << Will mull it over for a while. My first impression is that try/finally > is a better tool for the scenario you outlined. >> [Nick Jacobson] > You're right. try/finally takes care of my sample scenario. There may > still be a case to be made for atexit.unregister(), though. Now is the time to move the discussion to SF feature requests or to comp.lang.python. If you devote time to "making a case", then also devote equal effort to researching the hazards and API issues. "Potentially useful" is usually trumped by "potentially harmful". Also, if the API is awkward or error-prone, that is a bad sign. Specifically, consider whether exposing the data structure opens the possibility of accidentally violating invariants assumed by other calls of atexit().
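The ambiguity Raymond describes is easy to demonstrate with a plain list standing in for atexit's internal stack (this is an illustration, not the actual atexit implementation, and register/close here are local stand-ins):

```python
registry = []  # stand-in for atexit's internal stack of (func, args) entries

def register(func, *args):
    # mimics atexit.register: the same function may be added repeatedly
    registry.append((func, args))

def close(name):
    print("closing", name)

register(close, "db")
register(close, "socket")

# A hypothetical unregister(close) could only match on the function object,
# and *both* entries match -- there is no clear-cut entry to remove:
matches = [entry for entry in registry if entry[0] is close]
```

Since `close` appears twice with different arguments, removal by function alone is underspecified; an API would have to match on (func, args) pairs or return handles, which is exactly the extra surface area being questioned.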
With respect to the API, consider whether you could explain to a newbie (who has just finished the tutorial) how to access the structure, lookup a target function, and make appropriate modifications without breaking anything else. Raymond From ncoghlan at gmail.com Tue Apr 26 23:21:06 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue Apr 26 23:21:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com> References: <ca471dc205042416572da9db71@mail.gmail.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <5.1.1.6.0.20050424235840.0309ca90@mail.telecommunity.com> Message-ID: <426EB0C2.8020006@gmail.com> Phillip J. Eby wrote: > At 09:12 PM 4/24/05 -0600, Steven Bethard wrote: > >> I guess it would be helpful to see example where the looping >> with-block is useful. > > > Automatically retry an operation a set number of times before hard failure: > > with auto_retry(times=3): > do_something_that_might_fail() > > Process each row of a database query, skipping and logging those that > cause a processing error: > > with x,y,z = log_errors(db_query()): > do_something(x,y,z) > > You'll notice, by the way, that some of these "runtime macros" may be > stackable in the expression. 
These are also possible by combining a normal for loop with a non-looping with (but otherwise using Guido's exception injection semantics):

def auto_retry(attempts):
    success = [False]
    failures = [0]
    error = [None]  # renamed: 'except' is a keyword, so it can't be a variable name
    def block():
        try:
            yield None
        except:
            failures[0] += 1
        else:
            success[0] = True
    while not success[0] and failures[0] < attempts:
        yield block()
    if not success[0]:
        raise Exception # You'd actually propagate the last inner failure

for attempt in auto_retry(3):
    with attempt:
        do_something_that_might_fail()

The non-looping version of with seems to give the best of both worlds - multipart operation can be handled by multiple with statements, and repeated use of the same suite can be handled by nesting the with block inside iteration over an appropriate generator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From martin at v.loewis.de Tue Apr 26 23:29:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 26 23:29:04 2005 Subject: [Python-Dev] Re: atexit missing an unregister method In-Reply-To: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> References: <BAY17-F40F3C1F4E1BEB611E15248A4210@phx.gbl> Message-ID: <426EB29D.30403@v.loewis.de> Nick Jacobson wrote: > You're right. try/finally takes care of my sample scenario. There may > still be a case to be made for atexit.unregister(), though. No. Anybody in need of such a feature can easily unregister it.
allregistrations = []

def _run():
    for fn in allregistrations:
        fn()

atexit.register(_run)

def register(fn):
    allregistrations.append(fn)

def unregister(fn):
    allregistrations.remove(fn)

Regards, Martin From jimjjewett at gmail.com Tue Apr 26 23:30:30 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue Apr 26 23:30:34 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse Message-ID: <fb6fbf5605042614308198cb6@mail.gmail.com> >> (2) Add a way to say "Make this function I'm calling use *my* locals >> and globals." This seems to meet all the agreed-upon-as-good use >> cases, but there is disagreement over how to sensibly write it. The >> calling function is the place that could get surprised, but people >> who want thunks seem to want the specialness in the called function. > I think there are several problems with this. First, it looks > difficult to provide semantics that cover all the corners for the > blending of two namespaces. What happens to names that have a > different meaning in each scope? Programming error. Same name ==> same object. If a function is using one of _your_ names for something incompatible, then don't call that function with collapsed scope. The same "problem" happens with globals today. Code in module X can break if module Y replaces (not shadows, replaces) a builtin with an incompatible object. Except ... > (E.g. 'self' when calling a method of > another object; or any other name clash.) The first argument of a method *might* be a special case. It seems wrong to unbind a bound method. On the other hand, resource managers may well want to use unbound methods for the called code. > Are the globals also blended? How? Yes. The callee does not even get to see its normal namespace. Therefore, the callee does not get to use its normal name resolution. If the name normally resolves in locals (often inlined to a tuple, today), it looks in the shared scope, which is "owned" by the caller.
This is different from a free variable only because the callee can write to this dictionary. If the name is free in that shared scope, (which implies that the callee does not bind it, else it would be added to the shared scope) then the callee looks up the caller's nested stack and then to the caller's globals, and then the caller's builtins. > Second, this construct only makes sense for all callables; Agreed. But using it on a non-function may cause surprising results especially if bound methods are not special-cased. The same is true of decorators, which is why we have (at least initially) "function decorators" instead of "callable decorators". > it makes no sense when the callable is implemented as > a C function, Or rather, it can't be implemented, as the compiler may well have optimized the variable names right out. Stack frame transitions between C and python are already special. > or is a class, or an object with a __call__ method. These are just calls to __init__ (or __new__) or __call__. These may be foolish things to call (particularly if the first argument to a method isn't special-cased), but ... it isn't a problem if the class is written appropriately. If the class is not written appropriately, then don't call it with collapsed scope. > Third, I expect that if we solve the first two > problems, we'll still find that for an efficient implementation we > need to modify the bytecode of the called function. Absolutely. Even giving up the XXX_FAST optimizations would still require new bytecode to not assume them. (Deoptimizing *all* functions, in *all* contexts, is not a sensible tradeoff.) Eventually, an optimizing compiler could do the right thing, but ... that isn't the point. For a given simple algorithm, interpreted python is generally slower than compiled C, but we write in python anyhow -- it is fast enough, and has other advantages. The same is true of anything that lets me not cut-and-paste.
> Try to make sure that it can be used in a "statement context" > as well as in an "expression context". I'm not sure I understand this. The preferred way would be to just stick the keyword before the call. Using 'collapse', it would look like:

def foo(b):
    c=a

def bar():
    a="a1"
    collapse foo("b1")
    print b, c # prints "b1", "a1"
    a="a2"
    foo("b2") # Not collapsed this time
    print b, c # still prints "b1", "a1"

but I suppose you could treat it like the 'global' keyword

def bar():
    a="a1"
    collapse foo # forces foo to always collapse when called within bar
    foo("b1")
    print b, c # prints "b1", "a1"
    a="a2"
    foo("b2") # still collapsed
    print b, c # now prints "b2", "a2"

>> [Alternative 3 ... bigger than merely collapsing scope] >> (3) Add macros. We still have to figure out how to limit their obfuscation. >> Attempts to detail that goal seem to get sidetracked. > No, the problem is not how to limit the obfuscation. The problem is > the same as for (2), only more so: nobody has given even a *remotely* > plausible mechanism for how exactly you would get code executed at > compile time. macros can (and *possibly* should) be evaluated at run-time. Compile time should be possible (there is an interpreter running) and faster, but ... is certainly not required. Even if the macros just rerun the same boilerplate code less efficiently, it is still good to have that boilerplate defined once, instead of cutting and pasting. Or, at least, it is better *if* that once doesn't become unreadable in the process. -jJ From martin at v.loewis.de Tue Apr 26 23:31:01 2005 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue Apr 26 23:31:05 2005 Subject: [Python-Dev] Problem with embedded python In-Reply-To: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> References: <3D4A0A4A0225484B965A23CFD127B82F4A6800@invnmail.invision.iip.com> Message-ID: <426EB315.4040907@v.loewis.de> Ugo Di Girolamo wrote: > What am I doing wrong?
This is not the forum to ask this question, please use python-list@python.org instead. Regards, Martin From Ugo_DiGirolamo at invision.iip.com Tue Apr 26 23:34:49 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Tue Apr 26 23:33:54 2005 Subject: [Python-Dev] Problem with embedded python Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A6801@invnmail.invision.iip.com> Sorry. will do. Ugo -----Original Message----- From: "Martin v. L?wis" [mailto:martin@v.loewis.de] Sent: Tuesday, April 26, 2005 2:31 PM To: Ugo Di Girolamo Cc: python-dev@python.org Subject: Re: [Python-Dev] Problem with embedded python Ugo Di Girolamo wrote: > What am I doing wrong? This is not the forum to ask this question, please use python-list@python.org instead. Regards, Martin From p.f.moore at gmail.com Tue Apr 26 23:40:27 2005 From: p.f.moore at gmail.com (Paul Moore) Date: Tue Apr 26 23:40:29 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042614308198cb6@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> Message-ID: <79990c6b05042614406bd8f95@mail.gmail.com> On 4/26/05, Jim Jewett <jimjjewett@gmail.com> wrote: > I'm not sure I understand this. The preferred way would be > to just stick the keyword before the call. Using 'collapse', it > would look like: > > def foo(b): > c=a > def bar(): > a="a1" > collapse foo("b1") > print b, c # prints "b1", "a1" > a="a2" > foo("b2") # Not collapsed this time > print b, c # still prints "b1", "a1" *YUK* I spent a long time staring at this and wondering "where did b come from?" You'd have to come up with a very compelling use case to get me to like this. Paul. 
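For comparison, the closest spelling in today's Python is to pass the shared namespace around explicitly — a hedged sketch, with `ns` being an ad-hoc dict rather than any proposed language mechanism — which at least makes it obvious where `b` comes from:

```python
def foo(ns, b):
    # writes into the caller's explicitly passed namespace,
    # instead of implicitly sharing the caller's locals
    ns["b"] = b
    ns["c"] = ns["a"]

def bar():
    ns = {"a": "a1"}
    foo(ns, "b1")              # the "collapsed" call, made explicit
    return ns["b"], ns["c"]    # -> ("b1", "a1")
```

Here `bar()` returns `("b1", "a1")`, matching the `# prints` comments in the collapse example, but the data flow between caller and callee is visible in the signature.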
From gvanrossum at gmail.com Wed Apr 27 00:02:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 00:02:06 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <fb6fbf5605042614308198cb6@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> Message-ID: <ca471dc20504261502560c128f@mail.gmail.com> [Jim Jewett] > >> (2) Add a way to say "Make this function I'm calling use *my* locals > >> and globals." This seems to meet all the agreed-upon-as-good use > >> cases, but there is disagreement over how to sensibly write it. The > >> calling function is the place that could get surprised, but people > >> who want thunks seem to want the specialness in the called function. [Guido] > > I think there are several problems with this. First, it looks > > difficult to provide semantics that cover all the corners for the > > blending of two namespaces. What happens to names that have a > > different meaning in each scope? [Jim] > Programming error. Same name ==> same object. Sounds like a recipe for bugs to me. At the very least it is a total breach of abstraction, which is the fundamental basis of the relationship between caller and callee in normal circumstances. The more I understand your proposal the less I like it. > If a function is using one of _your_ names for something incompatible, > then don't call that function with collapsed scope. The same "problem" > happens with globals today. Code in module X can break if module Y > replaces (not shadows, replaces) a builtin with an incompatible object. > > Except ... > > (E.g. 'self' when calling a method of > > another object; or any other name clash.) > > The first argument of a method *might* be a special case. It seems > wrong to unbind a bound method. On the other hand, resource > managers may well want to use unbound methods for the called > code. Well, what would you pass in as the first argument then? > > Are the globals also blended? How? > > Yes. 
The callee does not even get to see its normal namespace. > Therefore, the callee does not get to use its normal name resolution. Another breach of abstraction: if a callee wants to use an imported module, the import should be present in the caller, not in the callee. This seems to me to repeat all the mistakes of the dynamic scoping of early Lisps (including GNU Emacs Lisp I believe). It really strikes me as an endless source of errors that these blended-scope callees (in your proposal) are ordinary functions/methods, which means that they can *also* be called without blending scopes. Having special syntax to define a callee intended for scope-blending seems much more appropriate (even if there's also special syntax at the call site). > If the name normally resolves in locals (often inlined to a tuple, today), > it looks in the shared scope, which is "owned" by the caller. This is > different from a free variable only because the callee can write to this > dictionary. Aha! This suggests that a blend-callee needs to use different bytecode to avoid doing lookups in the tuple of optimized locals, since the indices assigned to locals in the callee and the caller won't match up except by miracle. > If the name is free in that shared scope, (which implies that the > callee does not bind it, else it would be added to the shared scope) > then the callee looks up the caller's nested stack and then to the > caller's globals, and then the caller's builtins. > > > Second, this construct only makes sense for all callables; (I meant this to read "does not make sense for all callables".) > Agreed. (And I presume you read it that way. :-) > But using it on a non-function may cause surprising results > especially if bound methods are not special-cased. > > The same is true of decorators, which is why we have (at least > initially) "function decorators" instead of "callable decorators". Not true. 
It is possible today to write decorators that accept things other than functions -- in fact, this is often necessary if you want to write decorators that combine properly with other decorators that don't return function objects (such as staticmethod and classmethod). > > it makes no sense when the callable is implemented as > > a C function, > > Or rather, it can't be implemented, as the compiler may well > have optimized the variables names right out. Stack frame > transitions between C and python are already special. Understatement of the year. There just is no similarity between C and Python stack frames. How much do you really know about Python's internals??? > > or is a class, or an object with a __call__ method. > > These are just calls to __init__ (or __new__) or __call__. No they're not. Calling a class *first* creates an instance (calling __new__ if it exists) and *then* calls __init__ (if it exists). > These may be foolish things to call (particularly if the first > argument to a method isn't special-cased), but ... it isn't > a problem if the class is written appropriately. If the class > is not written appropriately, then don't call it with collapsed > scope. That's easy for you to say. Since the failure behavior is so messy I'd rather not get started. > > Third, I expect that if we solve the first two > > problems, we'll still find that for an efficient implementation we > > need to modify the bytecode of the called function. > > Absolutely. Even giving up the XXX_FAST optimizations would > still require new bytecode to not assume them. (Deoptimizing > *all* functions, in *all* contexts, is not a sensible tradeoff.) So you actually *agree* that blended-scope functions should be marked as such at the callee definition, not just at the call site. Or how else would you do this? > Eventually, an optimizing compiler could do the right thing, but ... > that isn't the point. 
> > For a given simple algorithm, interpreted python is generally slower > than compiled C, but we write in python anyhow -- it is fast enough, > and has other advantages. The same is true of anything that lets > me not cut-and-paste. Whatever. Any new feature that causes a measurable slowdown for code that does *not* need the feature has a REALLY hard time getting accepted, by me as well as by the Python community. Slow Python down enough, and your target audience reduces to a small bunch of folks who are programming for their own education. > > Try to make sure that it can be used in a "statement context" > > as well as in an "expression context". > > I'm not sure I understand this. The preferred way would be > to just stick the keyword before the call. Using 'collapse', it > would look like:

> def foo(b):
>     c=a
> def bar():
>     a="a1"
>     collapse foo("b1")
>     print b, c # prints "b1", "a1"
>     a="a2"
>     foo("b2") # Not collapsed this time
>     print b, c # still prints "b1", "a1"

I'm trying to sensitize you to potential uses like this:

def bar():
    a = "a1"
    print collapse foo("b1")

> but I suppose you could treat it like the 'global' keyword

>> def bar():
>>     a="a1"
>>     collapse foo # forces foo to always collapse when called within bar
>>     foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # still collapsed
>>     print b, c # now prints "b2", "a2"

Would make more sense if the collapse keyword was at the module level. > >> [Alternative 3 ... bigger than merely collapsing scope] > >> (3) Add macros. We still have to figure out how to limit their obfuscation. > >> Attempts to detail that goal seem to get sidetracked. > > > No, the problem is not how to limit the obfuscation. The problem is > > the same as for (2), only more so: nobody has given even a *remotely* > > plausible mechanism for how exactly you would get code executed at > > compile time. > > macros can (and *possibly* should) be evaluated at run-time.
We must still have very different views on what a macro is. After a macro is run, there is new syntax that needs to be parsed and compiled to bytecode. While Python frequently switches between compile time and run time, anything that requires invoking the compiler each time a macro is used will be so slow that nobody will want to use it. (Python's compiler is very slow, and it's even slower in alternate implementations like Jython and IronPython.) > Compile time should be possible (there is an interpreter running) and > faster, but ... is certainly not required. OK, now you *must* look at the Boo solution. http://boo.codehaus.org/Syntactic+Macros > Even if the macros just rerun the same boilerplate code less efficiently, > it is still good to have that boilerplate defined once, instead of cutting > and pasting. Or, at least, it is better *if* that once doesn't become > unreadable in the process. I am unable to assess the value of this mechanism unless you make a concrete proposal. You seem to have something in mind but you're not doing a good job getting it into mine... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 00:03:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 00:04:03 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <79990c6b05042614406bd8f95@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> <79990c6b05042614406bd8f95@mail.gmail.com> Message-ID: <ca471dc20504261503767cd117@mail.gmail.com> [Paul Moore] > *YUK* I spent a long time staring at this and wondering "where did b come from?" > > You'd have to come up with a very compelling use case to get me to like this. I couldn't have said it better. I said it longer though.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Wed Apr 27 00:18:45 2005 From: tjreedy at udel.edu (Terry Reedy) Date: Wed Apr 27 00:20:27 2005 Subject: [Python-Dev] Re: a few SF bugs which can (probably) be closed References: <Pine.LNX.4.58.0504222034070.772@bagira> Message-ID: <d4mee6$brf$1@sea.gmane.org> "Ilya Sandler" <ilya@bluefir.net> wrote in message news:Pine.LNX.4.58.0504222034070.772@bagira... > Here a few sourceforge bugs which can probably be closed: > > [ 1168983 ] : ftplib.py string index out of range > Original poster reports that the problem disappeared after a patch > committed by Raymond Not clear to me if this is really finished or not. Leaving for Raymond or ... . Closed 3 below. > [ 1178863 ] Variable.__init__ uses self.set(), blocking specialization > seems like a dup of 1178872 Closed latter. > [ 415492 ] Compiler generates relative filenames > seems to have been fixed at some point. I could not reproduce it with > python2.4 > > [ 751612 ] smtplib crashes Windows Kernal. > Seems like an obvious Windows bug (not python's bug) and seems to be > unreproducible Terry J. Reedy From jimjjewett at gmail.com Wed Apr 27 01:42:46 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed Apr 27 01:42:49 2005 Subject: [Python-Dev] scope-collapse Message-ID: <fb6fbf56050426164269d95a90@mail.gmail.com> [Jim Jewett] >> >> (2) Add a way to say "Make this function I'm calling use *my* locals >> >> and globals." This seems to meet all the agreed-upon-as-good use >> >> cases, but there is disagreement over how to sensibly write it. [Guido] >> > What happens to names that have a >> > different meaning in each scope? [Jim] >> Programming error. Same name ==> same object. > Sounds like a recipe for bugs to me. At the very least it is a total > breach of abstraction, which is the fundamental basis of the > relationship between caller and callee in normal circumstances. Yes. 
Collapsing scope is not a good idea in general. But there is no way to avoid (some version of) it under the thunk or even the resource-manager suggestions. I interpret that to mean that either The code *can* get ugly and you rely on conventions. or These constructs should not be added to the language. The pretend-it-is-a-generator proposals try to specify that only certain names will be shared, in only certain ways. That might work (practicality beats purity) but I suspect it will evolve into a wart. It won't be quite strong enough to solve the problem completely (particularly with multiple blocks), but it will be strong enough to obfuscate when mishandled. >> Yes. The callee does not even get to see its normal namespace. >> Therefore, the callee does not get to use its normal name resolution. > This seems to me to repeat all the mistakes of the dynamic scoping > of early Lisps (including GNU Emacs Lisp I believe). With one exception -- the caller must state explicitly that the collapse is happening, and even then, it only goes down one level at a time. Still an ugly tool, but at least not an ugly surprise. > It really strikes me as an endless source of errors that these > blended-scope callees (in your proposal) are ordinary > functions/methods, which means that they can *also* be called without > blending scopes. Having special syntax to define a callee intended for > scope-blending seems much more appropriate (even if there's also > special syntax at the call site). This might well be a good restriction. The number of times it causes annoyance (why do I have to code this twice?) should be outweighed by the number of times it saves a surprise (oops -- those functions both defined the same keyword argument). >> If the name normally resolves in locals (often inlined to a tuple, today), >> it looks in the shared scope, which is "owned" by the caller. This is >> different from a free variable only because the callee can write to this >> dictionary. > Aha!
This suggests that a blend-callee needs to use different bytecode > to avoid doing lookups in the tuple of optimized locals Yes. I believe the translation is mechanical, so that the compiler could choose (or create) the right version based on the caller, but ... I agree that making them a separate kind of callable would simplify things. > (I meant this to read "does not make sense for all callables".) > (And I presume you read it that way. :-) nah... I think it takes special justification to do use anything but duck typing in python. If functions can intersperse with boilerplate, than other callables (and even other suites, such as class definitions) should be able to do the same. But I also agree that it makes sense to wait until it can be done sensibly. Just as @decorator only applies to functions (even if the specific decorator could accept something else), this interspersing should probably not apply to non-functions until the "but I can't use a function" use cases are clear. >> For a given simple algorithm, interpeted python is generally slower >> than compiled C, but we write in python anyhow -- it is fast enough, >> and has other advantages. The same is true of anything that lets >> me not cut-and-paste. > Whatever. Any new feature that causes a measurable slowdown for code > that does *not* need the feature has a REALLY hard time getting > accepted, Agreed. But the scope-collapse penalty is restricted to the caller (which ordered the collapse) and the immediate callees during the collapsed call. >> I'm not sure I understand this. The preferred way would be >> to just stick the keyword before the call. Using 'collapse', it >> would look like: # (Added comment to make the ugliess potential more explicit) >> def foo(b): # Yes, parameters are in the namespace. 
>>     c=a
>> def bar():
>>     a="a1"
>>     collapse foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # Not collapsed this time
>>     print b, c # still prints "b1", "a1"

> I'm trying to sensitize you to potential uses like this:
> def bar():
>     a = "a1"
>     print collapse foo("b1")

In this case, it would print None, as foo didn't bother to return anything.

> but I suppose you could treat it like the 'global' keyword

>> def bar():
>>     a="a1"
>>     collapse foo # forces foo to always collapse when called within bar
>>     foo("b1")
>>     print b, c # prints "b1", "a1"
>>     a="a2"
>>     foo("b2") # still collapsed
>>     print b, c # now prints "b2", "a2"

> Would make more sense if the collapse keyword was at the module level.

??? Are you suggesting that everything defined in the module must live in a single namespace, just because the collapse was wanted in one place?

-jJ

From charles.hartman at conncoll.edu Mon Apr 25 13:53:58 2005
From: charles.hartman at conncoll.edu (Charles Hartman)
Date: Wed Apr 27 01:49:08 2005
Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug
In-Reply-To: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com>
References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com>
Message-ID: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu>

> Someone should think about rewriting the zipfile module to be less
> hideous, include a repair feature, and be up to date with the latest
> specifications <http://www.pkware.com/company/standards/appnote/>.

-- and allow *deleting* a file from a zipfile. As far as I can tell, you now can't (except by rewriting everything but that to a new zipfile and renaming). Somewhere I saw a patch request for this, but it was languishing, a year or more old. Or am I just totally missing something?
Charles Hartman

From greg.ewing at canterbury.ac.nz Wed Apr 27 01:53:07 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 01:53:27 2005
Subject: [Python-Dev] atexit missing an unregister method
In-Reply-To: <BAY17-F7DF544743281830AFF246A4210@phx.gbl>
References: <BAY17-F7DF544743281830AFF246A4210@phx.gbl>
Message-ID: <426ED463.6040800@canterbury.ac.nz>

Nick Jacobson wrote:
> But while you can mark functions to be called with the 'register'
> method, there's no 'unregister' method to remove them from the stack of
> functions to be called.

You can always build your own mechanism for managing cleanup functions however you want, and register a single atexit() handler to invoke it. I don't think there's any need to mess with the way atexit() currently works.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From fumanchu at amor.org Wed Apr 27 02:04:54 2005
From: fumanchu at amor.org (Robert Brewer)
Date: Wed Apr 27 02:03:17 2005
Subject: [Python-Dev] Re: scope-collapse (was: anonymous blocks)
Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>

[Jim Jewett]
> (2) Add a way to say "Make this function I'm calling
> use *my* locals and globals." This seems to meet all
> the agreed-upon-as-good use cases, but there is disagreement
> over how to sensibly write it. The calling function is
> the place that could get surprised, but people who want
> thunks seem to want the specialness in the
> called function.

[Guido]
> I think there are several problems with this. First, it looks
> difficult to provide semantics that cover all the corners for the
> blending of two namespaces. What happens to names that have a
> different meaning in each scope?

[Jim]
> Programming error. Same name ==> same object.
[Guido] > Sounds like a recipe for bugs to me. At the very least it is a total > breach of abstraction, which is the fundamental basis of the > relationship between caller and callee in normal circumstances. The > more I understand your proposal the less I like it. [Jim] > If a function is using one of _your_ names for something > incompatible, then don't call that function with collapsed > scope. The same "problem" happens with globals today. > Code in module X can break if module Y replaces (not shadows, > replaces) a builtin with an incompatible object. > > Except ... > (E.g. 'self' when calling a method of > another object; or any other name clash.) > > The first argument of a method *might* be a special case. It seems > wrong to unbind a bound method. On the other hand, resource > managers may well want to use unbound methods for the called > code. Urg. Please, no. If you're going to blend scopes, the callee should have nothing passed to it. Why would you possibly want it when you already have access to both scopes which are to be blended? [Guido] > Are the globals also blended? How? [Jim] > Yes. The callee does not even get to see its normal namespace. > Therefore, the callee does not get to use its normal name > resolution. [Guido] > Another breach of abstraction: if a callee wants to use an imported > module, the import should be present in the caller, not in the callee. Yes, although if a callee wants to use a module that has not been imported by the caller, it should be able to do so with a new import statement (which then affects the namespace of the caller). [Guido again] > It really strikes me as an endless source of errors that these > blended-scope callees (in your proposal) are ordinary > functions/methods, which means that they can *also* be called without > blending scopes. Having special syntax to define a callee intended for > scope-blending seems much more appropriate (even if there's also > special syntax at the call site). Agreed. 
They shouldn't be ordinary functions at all, in my mind. That means one can also mark the actual call on the callee side, instead of the caller side; in other words, you wouldn't need a "collapse" keyword at all if you formed the callee with a "defmacro" or other (better ;) keyword. I guess if y'all find it surprising, you could keep "collapse".

[Jim]
> If the name normally resolves in locals (often inlined to a tuple, today),
> it looks in the shared scope, which is "owned" by the caller. This is
> different from a free variable only because the callee can write to this
> dictionary.

[Guido]
> Aha! This suggests that a blend-callee needs to use different bytecode
> to avoid doing lookups in the tuple of optimized locals, since the
> indices assigned to locals in the callee and the caller won't match up
> except by miracle.

[Guido]
> Third, I expect that if we solve the first two
> problems, we'll still find that for an efficient implementation we
> need to modify the bytecode of the called function.

[Jim]
> Absolutely. Even giving up the XXX_FAST optimizations would
> still require new bytecode to not assume them. (Deoptimizing
> *all* functions, in *all* contexts, is not a sensible tradeoff.)

I'm afraid I'm only familiar with CPython, but wouldn't callee locals just map to XXX_FAST indices via the caller's co_names tuple? Remapping jump targets, on the other hand, would be something to quickly ban. You shouldn't be able to write trash like:

defmacro keepgoing:
    else:
        continue

[Guido]
> Try to make sure that it can be used in a "statement context"
> as well as in an "expression context". ...
> I'm trying to sensitize you to potential uses like this:
>
> def foo(b):
>     c=a
> def bar():
>     a = "a1"
>     print collapse foo("b1")

If the callees aren't real functions and don't get passed anything, the "sensible" approach would be to disallow expression-context use of them.
Rewrite the above to:

defcallee foo:
    c = a

def bar():
    a = "a1"
    collapse foo
    print c

Robert Brewer
MIS
Amor Ministries
fumanchu@amor.org

From greg.ewing at canterbury.ac.nz Wed Apr 27 02:08:15 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 02:08:30 2005
Subject: [Python-Dev] defmacro
In-Reply-To: <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp>
References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp>
Message-ID: <426ED7EF.1090508@canterbury.ac.nz>

Stephen J. Turnbull wrote:
> This doesn't feel right to me. By that argument, people would want
> to "improve"
>
> (mapcar (lambda (x) (car x)) list-of-lists)
>
> to
>
> (mapcar list-of-lists (x) (car x))

I didn't claim that people would feel compelled to eliminate all uses of lambda; only that, in those cases where they *do* feel so compelled, they might not if lambda weren't such a long word. I was just trying to understand why Smalltalkers seem to get on fine without macros, whereas Lispers feel they are needed. I think Smalltalk's lightweight block-passing syntax has a lot to do with it.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From jimjjewett at gmail.com Wed Apr 27 02:12:19 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed Apr 27 02:12:21 2005
Subject: [Python-Dev] defmacro (was: Anonymous blocks)
Message-ID: <fb6fbf5605042617126c217664@mail.gmail.com>

>> >> (3) Add macros. We still have to figure out how to limit their obfuscation.
>> > nobody has given even a *remotely*
>> > plausible mechanism for how exactly you would get code executed at
>> > compile time.
>> macros can (and *possibly* should) be evaluated at run-time.
> We must still have very different views on what a macro is.

In a compiled language, it is (necessarily) compiled. In an interpreted language, it doesn't have to be.

> After a macro is run, there is new syntax that needs to be parsed
> and compiled to bytecode. ... anything that requires invoking the
> compiler each time a macro is used will be so slow that nobody will
> want to use it.

I had been thinking that the typical use would be during function (or class) definition. The overhead would be similar to that of decorators, and confined mostly to module loading. I do see your point that putting a macro call inside a function could be slow -- but I'm not sure that is a reason to forbid it.

>> Even if the macros just rerun the same boilerplate code less efficiently,
>> it is still good to have that boilerplate defined once, instead of cutting
>> and pasting. Or, at least, it is better *if* that once doesn't become
>> unreadable in the process.

> I am unable to assess the value of this mechanism unless you make a
> concrete proposal. You seem to have something in mind but you're not
> doing a good job getting it into mine...

I'm not confident that macros are even a good idea; I just don't want a series of half-macros. That said, here is a strawman.

defmacro boiler1(name, rejects):
    def %(name) (*args):
        for a in args:
            if a in %(rejects):
                print "Don't send me %s" % a
...
boiler1(novowels, "aeiouy")
boiler2(nokey5, "jkl")

I'm pretty sure that a real version should accept suites instead of just arguments, and the variable portion might even be limited to suites (as in the thunk discussion). It might even be reasonable to mark macro calls as different from function calls.

template novowels from boiler1("aeiou"):
    <suite>

but I can't help thinking that multiple suites should be possible, and then they should be named, and ... that spurred at least one objection.
http://mail.python.org/pipermail/python-dev/2005-April/052949.html -jJ From greg.ewing at canterbury.ac.nz Wed Apr 27 02:13:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:13:24 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426D358C.70509@ieee.org> References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426D358C.70509@ieee.org> Message-ID: <426ED912.40603@canterbury.ac.nz> Shane Holloway (IEEE) wrote: > So, the question comes back to what are blocks in the language > extensibility case? To me, they would be something very like a code > object returned from the compile method. To this we would need to > attach the globals and locals where the block was from. Then we could > use the normal exec statement to invoke the block whenever needed. There's no need for all that. They're just callable objects. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Apr 27 02:18:06 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:18:21 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <fb6fbf5605042617126c217664@mail.gmail.com> References: <fb6fbf5605042617126c217664@mail.gmail.com> Message-ID: <426EDA3E.7090208@canterbury.ac.nz> Jim Jewett wrote: > I had been thinking that the typical use would be during function (or > class) definition. The overhead would be similar to that of decorators, > and confined mostly to module loading. But that's too late, unless you want to resort to bytecode hacking. By the time the module is loaded, its source code has long since been compiled into code objects. 
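For reference, the half of Jim's strawman that runs at module-load time can already be approximated in plain Python by invoking the compiler explicitly with exec -- roughly what such a "macro" would desugar to. A hedged sketch (the helper name and template are illustrative, not from the thread; written in modern print-function syntax):

```python
def make_rejector(name, rejects):
    # Build the function's source from a template, compile it once
    # at definition time, and pull the result out of a scratch namespace.
    src = "\n".join([
        "def %s(*args):" % name,
        "    out = []",
        "    for a in args:",
        "        if a in %r:" % rejects,
        "            out.append(\"Don't send me \" + a)",
        "    return out",
    ])
    namespace = {}
    exec(src, namespace)
    return namespace[name]

novowels = make_rejector("novowels", "aeiouy")
print(novowels("x", "a"))   # ["Don't send me a"]
```

The compiler runs once per definition, which matches Jim's "overhead similar to decorators, confined mostly to module loading" -- the contentious case is only a macro call inside a hot loop.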
-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Wed Apr 27 02:18:48 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 02:18:50 2005 Subject: [Python-Dev] scope-collapse In-Reply-To: <fb6fbf56050426164269d95a90@mail.gmail.com> References: <fb6fbf56050426164269d95a90@mail.gmail.com> Message-ID: <ca471dc2050426171860ff388f@mail.gmail.com> [Jim jewett] > The pretend-it-is-a-generator proposals try to specify that only > certain names will be shared, in only certain ways. Huh? I don't see it this way. There is *no* sharing between the frame of the generator and the frame of the block. The block is a permanent part of the frame surrounding the with-statement, so all names are shared there. > > Would make more sense if the collapse keyword was at the module level. > > ??? Are you suggesting that everything defined in the module must live > in a single namespace, just because the collapse was wanted in one place? No, I was just proposing putting 'collapse foo' in the module, which would mean that (a) the definition of foo is intended to be a macro, and (b) all uses of foo are intended to call that macro. But I still think this whole proposal is built on quicksand, so don't take that suggestion too seriously. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 02:24:44 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 02:24:47 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> Message-ID: <ca471dc2050426172429fb52b@mail.gmail.com> > > Someone should think about rewriting the zipfile module to be less > > hideous, include a repair feature, and be up to date with the latest > > specifications <http://www.pkware.com/company/standards/appnote/>. > > -- and allow *deleting* a file from a zipfile. As far as I can tell, > you now can't (except by rewriting everything but that to a new zipfile > and renaming). Somewhere I saw a patch request for this, but it was > languishing, a year or more old. Or am I just totally missing > something? Please don't propose a grand rewrite (even it's only a single module). Given that the API is mostly sensible, please propose gradual refactoring of the implementation, perhaps some new API methods, and so on. Don't throw away the work that went into making it work in the first place! http://www.joelonsoftware.com/articles/fog0000000069.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Apr 27 02:27:17 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 02:27:32 2005 Subject: [Python-Dev] defmacro (was: Anonymous blocks) In-Reply-To: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> References: <200504261039.j3QAdQU2013249@ger5.wwwserver.net> Message-ID: <426EDC65.3070603@canterbury.ac.nz> flaig@sanctacaris.net wrote: > Actually I was thinking of something related the other day: > Wouldn't it be nice to be able to define/overload not only > operators but also control structures? 
That triggered off something in my mind that's somewhat different from what you went on to talk about. So far we've been talking about ways of defining new syntax. But operator overloading isn't creating new syntax, it's giving a new meaning to existing syntax. So the statement equivalent of that would be defining new meanings for *existing* control structures! For example, when you write

    while expr:
        ...

it gets turned into

    expr.__while__(thunk)

etc. No, I'm not really serious about this -- it was just a wild thought!

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.   |
greg.ewing@canterbury.ac.nz        +--------------------------------------+

From jimjjewett at gmail.com Wed Apr 27 02:30:48 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed Apr 27 02:30:51 2005
Subject: [Python-Dev] Re: scope-collapse (was: anonymous blocks)
In-Reply-To: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>
References: <3A81C87DC164034AA4E2DDFE11D258E3771F0C@exchange.hqamor.amorhq.net>
Message-ID: <fb6fbf5605042617301cf399e2@mail.gmail.com>

On 4/26/05, Robert Brewer <fumanchu@amor.org> wrote:
> [Jim]
> > Absolutely. Even giving up the XXX_FAST optimizations would
> > still require new bytecode to not assume them.
> I'm afraid I'm only familiar with CPython, but wouldn't callee locals
> just map to XXX_FAST indices via the caller's co_names tuple?

Only if all names are in the caller's tuple. In your example at http://mail.python.org/pipermail/python-dev/2005-April/052924.html two of the callees wanted a shared old_children, but that name didn't appear in the caller, so I wouldn't expect the compiler to make room for it in the tuple.
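Jim's point is easy to check in CPython by inspecting the code objects directly: fast-local slots are assigned per code object at compile time, so a name the caller never binds has no slot for a collapsed callee's XXX_FAST opcodes to map onto (a small illustrative sketch, not from the thread):

```python
def caller():
    old = 1             # 'old' gets a fast-local slot in caller
    return old

def callee():
    old_children = []   # this name is bound only in callee
    return old_children

# co_varnames lists each code object's fast locals; 'old_children'
# simply has no index in the caller's tuple.
print('old' in caller.__code__.co_varnames)            # True
print('old_children' in caller.__code__.co_varnames)   # False
print('old_children' in callee.__code__.co_varnames)   # True
```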
-jJ

From jcarlson at uci.edu Wed Apr 27 02:42:48 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 02:43:25 2005
Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse
In-Reply-To: <ca471dc20504261502560c128f@mail.gmail.com>
References: <fb6fbf5605042614308198cb6@mail.gmail.com> <ca471dc20504261502560c128f@mail.gmail.com>
Message-ID: <20050426165507.6401.JCARLSON@uci.edu>

[Guido]
> OK, now you *must* look at the Boo solution.
> http://boo.codehaus.org/Syntactic+Macros

That is an interesting solution; requiring macro writers to actually write an AST modifier seems pretty reasonable to me. Whether we want macros or not... <shrug>

- Josiah

From sabbey at u.washington.edu Wed Apr 27 02:45:01 2005
From: sabbey at u.washington.edu (Brian Sabbey)
Date: Wed Apr 27 02:45:05 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426E5AEB.3030707@gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com>
Message-ID: <Pine.A41.4.61b.0504261605490.142804@dante68.u.washington.edu>

Nick Coghlan wrote:
> Accordingly, I would like to suggest that 'with' revert to something
> resembling the PEP 310 definition:
>
>     resource = EXPR
>     if hasattr(resource, "__enter__"):
>         VAR = resource.__enter__()
>     else:
>         VAR = None
>     try:
>         try:
>             BODY
>         except:
>             raise # Force realisation of sys.exc_info() for use in __exit__()
>     finally:
>         if hasattr(resource, "__exit__"):
>             VAR = resource.__exit__()
>         else:
>             VAR = None
>
> Generator objects could implement this protocol, with the following
> behaviour:
>
>     def __enter__():
>         try:
>             return self.next()
>         except StopIteration:
>             raise RuntimeError("Generator exhausted, unable to enter with block")
>
>     def __exit__():
>         try:
>             return self.next()
>         except StopIteration:
>             return None
>
>     def __except__(*exc_info):
>         pass
>
>     def __no_except__():
>         pass

One peculiarity of this is that every other
'yield' would not be allowed in the 'try' block of a try/finally statement (TBOATFS). Specifically, a 'yield' reached through the call to __exit__ would not be allowed in the TBOATFS. It gets even more complicated when one considers that 'next' may be called inside BODY. In such a case, it would not be sufficient to just disallow every other 'yield' in the TBOATFS. It seems like 'next' would need some hidden parameter that indicates whether 'yield' should be allowed in the TBOATFS. (I assume that if a TBOATFS contains an invalid 'yield', then an exception will be raised immediately before its 'try' block is executed. Or would the exception be raised upon reaching the 'yield'?)

> These are also possible by combining a normal for loop with a non-looping
> with (but otherwise using Guido's exception injection semantics):
>
>     def auto_retry(attempts):
>         success = [False]
>         failures = [0]
>         except = [None]
>
>         def block():
>             try:
>                 yield None
>             except:
>                 failures[0] += 1
>             else:
>                 success[0] = True
>
>         while not success[0] and failures[0] < attempts:
>             yield block()
>         if not success[0]:
>             raise Exception # You'd actually propagate the last inner failure
>
>     for attempt in auto_retry(3):
>         with attempt:
>             do_something_that_might_fail()

I think your example above is a good reason to *allow* 'with' to loop. Writing 'auto_retry' with a looping 'with' would be pretty straightforward and intuitive. But the above, non-looping 'with' example requires two fairly advanced techniques (inner functions, variables-as-arrays trick) that would probably be lost on some python users (and make life more difficult for the rest). But I do see the appeal to having a non-looping 'with'. In many (most?) uses of generators, 'for' and looping 'with' could be used interchangeably. This seems ugly -- more than one way to do it and all that.
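For comparison, the behaviour the generator-based auto_retry versions are emulating can be written today with plain control flow; this is the baseline any 'with'-based variant has to beat for readability (a sketch, not from the thread; assumes at least one attempt):

```python
def auto_retry(attempts, func, *args):
    # Call func up to 'attempts' times; re-raise the last failure
    # if every attempt fails.
    last_exc = None
    for _ in range(attempts):
        try:
            return func(*args)
        except Exception as exc:
            last_exc = exc
    raise last_exc

calls = []
def flaky():
    calls.append(None)
    if len(calls) < 3:
        raise ValueError("transient")
    return "ok"

print(auto_retry(3, flaky))   # "ok", on the third attempt
```

The arrays-as-variables trick and the inner block() function disappear entirely; what the 'with'-based spellings buy is keeping the retried suite inline at the call site instead of hoisting it into a function.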
-Brian

From greg.ewing at canterbury.ac.nz Wed Apr 27 02:53:04 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed Apr 27 02:53:21 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <426E5AEB.3030707@gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com>
Message-ID: <426EE270.1080303@canterbury.ac.nz>

Nick Coghlan wrote:
> def template():
>     # pre_part_1
>     yield None
>     # post_part_1
>     yield None
>     # pre_part_2
>     yield None
>     # post_part_2
>     yield None
>     # pre_part_3
>     yield None
>     # post_part_3
>
> def user():
>     block = template()
>     with block:
>         # do_part_1
>     with block:
>         # do_part_2
>     with block:
>         # do_part_3

That's an interesting idea, but do you have any use cases in mind? I worry that it will be too restrictive to be really useful. Without the ability for the iterator to control which blocks get executed and when, you wouldn't be able to implement something like a case statement, for example.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a        |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From bob at redivi.com Wed Apr 27 03:00:43 2005 From: bob at redivi.com (Bob Ippolito) Date: Wed Apr 27 03:00:46 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <ca471dc2050426172429fb52b@mail.gmail.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> Message-ID: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> On Apr 26, 2005, at 8:24 PM, Guido van Rossum wrote: >>> Someone should think about rewriting the zipfile module to be less >>> hideous, include a repair feature, and be up to date with the latest >>> specifications <http://www.pkware.com/company/standards/appnote/>. >> >> -- and allow *deleting* a file from a zipfile. As far as I can tell, >> you now can't (except by rewriting everything but that to a new >> zipfile >> and renaming). Somewhere I saw a patch request for this, but it was >> languishing, a year or more old. Or am I just totally missing >> something? > > Please don't propose a grand rewrite (even it's only a single module). > Given that the API is mostly sensible, please propose gradual > refactoring of the implementation, perhaps some new API methods, and > so on. Don't throw away the work that went into making it work in the > first place! Well, I didn't necessarily mean it should be thrown away and started from scratch -- however, once you get all the ugly out of it, there's not much left! Obviously there's something wrong with the way it's written if it took years and *several passes* to correctly identify and fix a simple format character case bug. Most of this can be blamed on the struct module, which is more obscure and error-prone than writing the same code in C. One of the most useful things that could happen to the zipfile module would be a stream interface for both reading and writing. 
Right now it's slow and memory hungry when dealing with large chunks. The use case that led me to fix this bug is a tool that archives video to zip files of targa sequences with a reference QuickTime movie... so I end up with thousands of bite-sized chunks.

This >2GB bug really caused me some grief in that I didn't test with such large sequences because I didn't have any. I didn't end up finding out about it until months later because the client *ignored* the exceptions raised by the GUI and came back to me with broken zip files. Fortunately the TOC in a zip file can be reconstructed from an otherwise pristine stream. Of course, I had to rewrite half of the zipfile module to come up with such a recovery program, because it's not designed well enough to let me build such a tool on top of it.

Another "bug" I ran into was that it has some crazy default for the ZipInfo record: it assumes the platform ("create_system") is Windows regardless of where you are! This caused some really subtle and annoying issues with some unzip tools (of course, on everyone's machines except mine). Fortunately someone was able to figure out why and send me a patch, but it was completely unexpected and I didn't see such craziness documented anywhere. If it weren't for this patch, it'd either still be broken, or I'd have switched to some other way of creating archives!

The zipfile module is good enough to create input files for zipimport, which is well tested and generally works -- barring the fact that zipimport has quite a few rough edges of its own. I certainly wouldn't recommend it for any heavy duty tasks in its current state.
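The recovery trick mentioned above -- reconstructing the TOC from an otherwise pristine stream -- amounts to walking the local file headers sequentially instead of trusting the central directory at the end. A minimal sketch of that idea (illustrative, not Bob's actual tool; assumes stored entries with sizes in the local header, which is what ZipFile.writestr emits by default):

```python
import io
import struct
import zipfile

# ZIP local file header: PK\x03\x04, version, flags, method, time, date,
# crc32, compressed size, uncompressed size, name length, extra length.
LOCAL_HEADER = struct.Struct('<4sHHHHHIIIHH')

def scan_names(data):
    """List member names by walking local headers, ignoring the TOC."""
    names, pos = [], 0
    while data[pos:pos + 4] == b'PK\x03\x04':
        fields = LOCAL_HEADER.unpack_from(data, pos)
        csize, name_len, extra_len = fields[7], fields[9], fields[10]
        start = pos + LOCAL_HEADER.size
        names.append(data[start:start + name_len].decode('utf-8'))
        pos = start + name_len + extra_len + csize
    return names

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('a.txt', 'hello')
    zf.writestr('b.txt', 'world')
print(scan_names(buf.getvalue()))   # ['a.txt', 'b.txt']
```

The scan stops naturally at the first central-directory record (signature PK\x01\x02), so it works even when the archive was truncated before the TOC was written.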
-bob From alan.mcintyre at esrgtech.com Wed Apr 27 03:48:06 2005 From: alan.mcintyre at esrgtech.com (Alan McIntyre) Date: Wed Apr 27 03:48:10 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <426EEF56.8090905@esrgtech.com> Bob Ippolito wrote: > One of the most useful things that could happen to the zipfile module > would be a stream interface for both reading and writing. Right now > it's slow and memory hungry when dealing with large chunks. The use > case that lead me to fix this bug is a tool that archives video to zip > files of targa sequences with a reference QuickTime movie.. so I end > up with thousands of bite sized chunks. While it's probably not an improvement on the order of magnitude you're looking for, there's a patch (1121142) that lets you read large items out of a zip archive via a file-like object. I'm occasionally running into the 2GB problem myself, so if any changes are made to get around that I can at least help out by testing it against some "real-life" data sets. Alan From greg.ewing at canterbury.ac.nz Wed Apr 27 05:30:17 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 05:30:34 2005 Subject: [Python-Dev] Re: anonymous blocks vs scope-collapse In-Reply-To: <ca471dc20504261503767cd117@mail.gmail.com> References: <fb6fbf5605042614308198cb6@mail.gmail.com> <79990c6b05042614406bd8f95@mail.gmail.com> <ca471dc20504261503767cd117@mail.gmail.com> Message-ID: <426F0749.60402@canterbury.ac.nz> I don't think this proposal has any chance as long as it's dynamically scoped. It mightn't be so bad if it were lexically scoped, i.e. a special way of defining a function so that it shares the lexically enclosing scope. 
This would be implementable, since the compiler has all the necessary information about both scopes available. Although it might be better to have some sort of "outer" declaration for rebinding in the enclosing scope, instead of doing it on a whole-function basis. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Wed Apr 27 05:34:37 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 05:34:54 2005 Subject: [Python-Dev] Re: Caching objects in memory In-Reply-To: <e04bdf31050426092230502ab1@mail.gmail.com> References: <e04bdf31050422063019fda86b@mail.gmail.com> <d4av64$ogd$1@sea.gmane.org> <e04bdf310504250946371f59c@mail.gmail.com> <ca471dc20504250957753a7445@mail.gmail.com> <e04bdf31050426092230502ab1@mail.gmail.com> Message-ID: <426F084D.4000503@canterbury.ac.nz> Facundo Batista wrote: >>Aargh! Bad explanation. Or at least you're missing something: > > Not really. It's easier for me to show that id(3) is always the same > and id([]) not, and let the kids see that's not so easy and you'll > have to look deeper if you want to know better. I think Guido was saying that it's important for them to know that mutable objects are never in danger of being shared, so you should at least tell them that much. Otherwise they may end up worrying unnecessarily that two of their lists might get shared somehow behind their back. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From stephen at xemacs.org Wed Apr 27 05:58:56 2005 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed Apr 27 05:59:01 2005 Subject: [Python-Dev] defmacro In-Reply-To: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> (Andrew Koenig's message of "Tue, 26 Apr 2005 12:09:58 -0400") References: <006201c54a7a$656bdb40$6402a8c0@arkdesktop> Message-ID: <87ekcwna4f.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Andrew" == Andrew Koenig <ark-mlist@att.net> writes: Andrew> Welllll.... Shouldn't you have written Andrew> (mapcar car list-of-lists) Andrew> or am I missing something painfully obvious? Greg should have written (with-file "foo/blarg" 'do-something-with) too. I guess I should have used do-something-with, too. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From shane at hathawaymix.org Wed Apr 27 06:15:16 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Wed Apr 27 06:15:19 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <426F11D4.1080707@hathawaymix.org> Bob Ippolito wrote: > The zipfile module is good enough to create input files for zipimport.. > which is well tested and generally works -- barring the fact that > zipimport has quite a few rough edges of its own. I certainly wouldn't > recommend it for any heavy duty tasks in its current state. That's interesting because Java seems to suffer from similar problems. In the early days of Java, although a jar file was a zip file, Java wouldn't read jar files created by the standard zip utilities I used. I think the distinction was that the jar utility stored the files uncompressed. 
Java is fixed now, but I think it illustrates that zip files are non-trivial. BTW, I don't think the jar utility can delete files from a zip file either. ;-) Shane From gvanrossum at gmail.com Wed Apr 27 06:19:47 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 06:19:55 2005 Subject: [Python-Dev] Re: [Pythonmac-SIG] zipfile still has 2GB boundary bug In-Reply-To: <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> References: <f8130a8bcf8ee1165b8c6c6c5da86f80@redivi.com> <0a276f3a753a57d89e65013ca77a3714@conncoll.edu> <ca471dc2050426172429fb52b@mail.gmail.com> <6cc9cf942f6b6fa876741dff724e87cd@redivi.com> Message-ID: <ca471dc205042621193ddcdaad@mail.gmail.com> > > Please don't propose a grand rewrite (even it's only a single module). > > Given that the API is mostly sensible, please propose gradual > > refactoring of the implementation, perhaps some new API methods, and > > so on. Don't throw away the work that went into making it work in the > > first place! > > Well, I didn't necessarily mean it should be thrown away and started > from scratch Well, you *did* say "rewrite". :-) > -- however, once you get all the ugly out of it, there's > not much left! Obviously there's something wrong with the way it's > written if it took years and *several passes* to correctly identify and > fix a simple format character case bug. Most of this can be blamed on > the struct module, which is more obscure and error-prone than writing > the same code in C. I think the reason is different -- it just hasn't had all that much use beyond the one use case for which it was written (zipping up the Python library). Also, don't underestimate the baroqueness of the zip spec. > One of the most useful things that could happen to the zipfile module > would be a stream interface for both reading and writing. Right now > it's slow and memory hungry when dealing with large chunks. 
The use > case that led me to fix this bug is a tool that archives video to zip > files of targa sequences with a reference QuickTime movie.. so I end up > with thousands of bite-sized chunks. Sounds like a use case nobody else has tried yet. > This >2GB bug really caused me some grief in that I didn't test with > such large sequences because I didn't have any. I didn't end up > finding out about it until months later because the client *ignored* the > exceptions raised by the GUI and came back to me with broken zip files. > Fortunately the TOC in a zip file can be reconstructed from an > otherwise pristine stream. Of course, I had to rewrite half of the > zipfile module to come up with such a recovery program, because it's > not designed well enough to let me build such a tool on top of it. Given more typical use cases for zip files (sending around collections of source files) I'm not surprised that a bug that only occurs for files >2GB remained hidden for so long. I don't remember if you have Python CVS permissions, but you sound like you really know the module as well as the zip file spec, so I'm hoping that you'll find the time to do some reconstructive surgery on the zip module for Python 2.5, without breaking the existing APIs. I like the idea you have for a stream API; I recall that the one time I had to use it I was surprised that the API dealt with files as string buffers exclusively. > Another "bug" I ran into was that it has some crazy default for the > ZipInfo record: it assumes the platform ("create_system") is Windows > regardless of where you are! I vaguely recall that the initial author was a Windows-head; perhaps he didn't realize how useful the module would be on other platforms, or that it would make any difference at all. > This caused some really subtle and
Fortunately someone was able to figure out why > and send me a patch, but it was completely unexpected and I didn't see > such craziness documented anywhere. If it weren't for this patch, it'd > either still be broken, or I'd have switched to some other way of > creating archives! > > The zipfile module is good enough to create input files for zipimport.. > which is well tested and generally works -- barring the fact that > zipimport has quite a few rough edges of its own. I certainly wouldn't > recommend it for any heavy duty tasks in its current state. So, please fix it! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Wed Apr 27 06:31:45 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed Apr 27 06:32:10 2005 Subject: [Python-Dev] site enhancements (request for review) In-Reply-To: <ca471dc2050426023629559cab@mail.gmail.com> References: <53f1dec01a0d78057c40abb1942cf0f1@redivi.com> <426DCDAE.8060907@canterbury.ac.nz> <ca471dc2050426023629559cab@mail.gmail.com> Message-ID: <426F15B1.3020403@canterbury.ac.nz> Guido van Rossum wrote: > I do that all the time without .pth files -- I just put all the common > modules in a package and place the package in the directory containing > the "main" .py files. That's fine as long as you're willing to put all the main .py files together in one directory, with everything else below it, but sometimes it's not convenient to do that. I had a use for this the other night, involving two applications which each consisted of multiple .py files (belonging only to that application) plus some shared ones. I wanted to have a directory for each application containing all the files private to that application. > it's too easy to forget about the .pth file and be > confused when it points to the wrong place. I don't think I would be confused by that. I would consider the .pth file to be a part of the source code of the application, to be maintained along with it. 
If I got an ImportError for one of the shared modules, checking the .pth file would be a natural thing to do -- just as would checking the sys.path munging code if it were being done that way. And a .pth file would be much easier to maintain than the hairy-looking code you need to write to munge sys.path in an equivalent way. > That's also the reason why > I don't use symlinks or $PYTHONPATH for this purpose. Another reason for avoiding that is portability. My first attempt at solving the aforementioned problem used symlinks. Trouble is, it also had to work on Windows running under Virtual PC mounting the source directory from the host system as a file share, and it turns out that reading a unix symlink from the Windows end just returns the contents of the link. Aaarrghh! Braindamage! -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Wed Apr 27 06:47:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 06:47:17 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426E3B01.1010007@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> Message-ID: <ca471dc205042621472b1f6edf@mail.gmail.com> > > [Greg Ewing] > >>* It seems to me that this same exception-handling mechanism > >>would be just as useful in a regular for-loop, and that, once > >>it becomes possible to put 'yield' in a try-statement, people > >>are going to *expect* it to work in for-loops as well. [Guido] > > (You can already put a yield inside a try-except, just not inside a > > try-finally.) [Greg] > Well, my point still stands. 
People are going to write
> try-finally around their yields and expect the natural
> thing to happen when their generator is used in a
> for-loop.

Well, the new finalization semantics should take care of that when their generator is finalized -- its __next__() will be called with some exception. But as long as you hang on to the generator, it will not be finalized, which is distinctly different from the desired with-statement semantics.

> > There would still be the difference that a for-loop invokes iter()
> > and a with-block doesn't.
> >
> > Also, for-loops that don't exhaust the iterator leave it
> > available for later use.
>
> Hmmm. But are these big enough differences to justify
> having a whole new control structure? Whither TOOWTDI?

Indeed, but apart from declaring that henceforth the with-statement (by whatever name) is the recommended looping construct and a for-statement is just a backwards compatibility macro, I just don't see how we can implement the necessary immediate cleanup semantics of a with-statement. In order to serve as a resource cleanup statement it *must* have stronger cleanup guarantees than the for-statement can give (if only for backwards compatibility reasons).

> > """
> > The statement:
> >
> >     for VAR in EXPR:
> >         BLOCK
> >
> > does the same thing as:
> >
> >     with iter(EXPR) as VAR:  # Note the iter() call
> >         BLOCK
> >
> > except that:
> >
> > - you can leave out the "as VAR" part from the with-statement;
> > - they work differently when an exception happens inside BLOCK;
> > - break and continue don't always work the same way.
> >
> > The only time you should write a with-statement is when the
> > documentation for the function you are calling says you should.
> > """
>
> Surely you jest. Any newbie reading this is going to think
> he hasn't a hope in hell of ever understanding what is going
> on here, and give up on Python in disgust.

And surely you exaggerate. How about this then:

The with-statement is similar to the for-loop.
Until you've learned about the differences in detail, the only time you should write a with-statement is when the documentation for the function you are calling says you should. > >>I'm seriously worried by the > >>possibility that a return statement could do something other > >>than return from the function it's written in. > > > Let me explain the use cases that led me to throwing that in > > Yes, I can see that it's going to be necessary to treat > return as an exception, and accept the possibility that > it will be abused. I'd still much prefer people refrain > from abusing it that way, though. Using "return" to spell > "send value back to yield statement" would be extremely > obfuscatory. That depends on where you're coming from. To Ruby users it will look completely natural because that's what Ruby uses. (In fact it'll be a while before they appreciate the deep differences between yield in Python and in Ruby.) But I accept that in Python we might want to use a different keyword to pass a value to the generator. I think using 'continue' should work; continue with a value has no precedent in Python, and continue without a value happens to have exactly the right semantics anyway. > > (BTW ReturnFlow etc. aren't great > > names. Suggestions?) > > I'd suggest just calling them Break, Continue and Return. Too close to break, continue and return IMO. > > One last thing: if we need a special name for iterators and > > generators designed for use in a with-statement, how about calling > > them with-iterators and with-generators. > > Except that if it's no longer a "with" statement, this > doesn't make so much sense... Then of course we'll call it after whatever the new statement is going to be called. If we end up calling it the foible-statement, they will be foible-iterators and foible-generators. Anyway, I think I'll need to start writing a PEP. I'll ask the PEP editor for a number. 
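[Editorial aside: the "continue EXPR" idea Guido describes for passing a value back into a generator eventually landed, in different clothing, as generator.send() in PEP 342. A sketch of the round trip it enables, in today's Python:]

```python
# "continue EXPR" as proposed here became generator.send(); the sent
# value appears inside the generator as the result of the yield expression.
def accumulator():
    total = 0
    while True:
        value = yield total   # receives whatever the caller send()s
        if value is None:
            break
        total += value

acc = accumulator()
print(next(acc))      # prime the generator up to the first yield -> 0
print(acc.send(10))   # -> 10
print(acc.send(5))    # -> 15
```

The priming call (the first next()) advances the generator to its first yield; only then can values be sent in, which mirrors the protocol sketched in PEP 340's discussion.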
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 09:30:22 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 09:30:33 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042621472b1f6edf@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> Message-ID: <ca471dc20504270030405f922f@mail.gmail.com> I've written a PEP about this topic. It's PEP 340: Anonymous Block Statements (http://python.org/peps/pep-0340.html). Some highlights:

- temporarily sidestepping the syntax by proposing 'block' instead of 'with'
- __next__() argument simplified to StopIteration or ContinueIteration instance
- use "continue EXPR" to pass a value to the generator
- generator exception handling explained

-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jason at diamond.name Wed Apr 27 11:44:22 2005 From: jason at diamond.name (Jason Diamond) Date: Wed Apr 27 11:42:41 2005 Subject: [Python-Dev] Another Anonymous Block Proposal Message-ID: <426F5EF6.9050400@diamond.name> Hi. I hope you don't mind another proposal. Please feel free to tear it apart. A limitation of both Ruby's block syntax and the new PEP 340 syntax is the fact that they don't allow you to pass in more than a single anonymous block parameter. If Python's going to add anonymous blocks, shouldn't it do it better than Ruby? What follows is a proposal for a syntax that allows passing multiple, anonymous callable objects into another callable. No new protocols are introduced and none of it is tied to iterators/generators which makes it much simpler to understand (and hopefully simpler to implement).
This is long and the initial syntax isn't ideal so please bear with me as I move towards what I'd like to see. The Python grammar would get one new production:

    do_statement ::= "do" call ":" NEWLINE
                     ( "with" funcname "(" [parameter_list] ")" ":" suite )*

Here's an example using this new "do" statement:

    do process_file(path):
        with process(file):
            for line in file:
                print line

That would translate into:

    def __process(file):
        for line in file:
            print line
    process_file(path, process=__process)

Notice that the name after each "with" keyword is the name of a parameter to the function being called. This will be what allows multiple block parameters. The implementation of `process_file` could look something like:

    def process_file(path, process):
        try:
            f = file(path)
            process(f)
        finally:
            if f:
                f.close()

There's no magic in `process_file`. It's just a function that receives a callable named `process` as a parameter and it calls that callable with one parameter. There's no magic in the post-translated code, either, except for the temporary `__process` definition which shouldn't be user-visible. The magic comes when the pre-translated code gets each "with" block turned into a hidden, local def and passed in as a parameter to `process_file`. This syntax allows for multiple blocks:

    do process_file(path):
        with process(file):
            for line in file:
                print line
        with success():
            print 'file processed successfully!'
        with error(exc):
            print 'an exception was raised during processing:', exc

That's three separate anonymous block parameters with varying number of parameters in each one. This is what `process_file` might look like now:

    def process_file(path, process, success=None, error=None):
        try:
            try:
                f = file(path)
                process(f)
                if success:
                    success()
            except:
                if error:
                    error(sys.exc_info())
                raise
        finally:
            if f:
                f.close()

I'm sure that being able to pass in multiple, anonymous blocks will be a huge advantage.
Here's an example of how Twisted might be able to use multiple block parameters:

    d = do Deferred():
        with callback(data):
            ...
        with errback(failure):
            ...

(After typing that in, I realized the do_statement production needs an optional assignment part.) There's nothing requiring that anonymous blocks be used for looping. They're strictly parameters which need to be callable. They can, of course, be called from within a loop:

    def process_lines(path, process):
        try:
            f = file(path)
            for line in f:
                process(line)
        finally:
            if f:
                f.close()

    do process_lines(path):
        with process(line):
            print line

Admittedly, this syntax is pretty bulky. The "do" keyword is necessary to indicate to the parser that this isn't a normal call--this call has anonymous block parameters. Having to prefix each one of these parameters with "with" is just following the example of "if/elif/else" blocks. An alternative might be to use indentation the way that class statements "contain" def statements:

    do_statement ::= "do" call ":" NEWLINE
                     INDENT
                     ( funcname "(" [parameter_list] ")" ":" suite )*
                     DEDENT

That would turn our last example into this:

    do process_lines(path):
        process(line):
            print line

The example with the `success` and `error` parameters would look like this:

    do process_file(path):
        process(file):
            for line in file:
                print line
        success():
            print 'file processed successfully!'
        error(exc):
            print 'an exception was raised during processing:', exc

To me, that makes it much easier to see that the three anonymous block statements are part of the "do" statement. It would be ideal if we could even lose the "do" keyword. I think that might make the grammar ambiguous, though. If it was possible, we could do this:

    process_file(path):
        process(file):
            for line in file:
                print line
        success():
            print 'file processed successfully!'
        error(exc):
            print 'an exception was raised during processing:', exc

Now the only difference between a normal call and a call with anonymous block parameters would be the presence of the trailing colon.
I could live with the "do" keyword if this can't be done, however. The only disadvantage to this syntax that I can see is that the simple case of opening a file and processing it is slightly more verbose than it is in Ruby. This is Ruby:

    File.open_and_process("testfile", "r") do |file|
      while line = file.gets
        puts line
      end
    end

This would be the Python equivalent:

    do open_and_process("testfile", "r"):
        process(file):
            for line in file:
                print line

It's one extra line in Python (I'm not counting lines that contain nothing but "end" in Ruby) because we have to specify the name of the block parameter. The extra flexibility that the proposed syntax has (being able to pass in multiple blocks) is worth this extra line, in my opinion. If we wanted to optimize even further for this case, however, we could allow for an alternate form of the "do" statement that lets you only specify one anonymous block parameter. Maybe it would look like this:

    do open_and_process("testfile", "r") process(file):
        for line in file:
            print line

I don't really think this is necessary. I don't mind being verbose if it makes things clearer and simpler. Here's another idea: use "def" instead of "with". They'd have to be indented to avoid ambiguity, though:

    do process_file(path):
        def process(file):
            for line in file:
                print line
        def success():
            print 'file processed successfully!'
        def error(exc):
            print 'an exception was raised during processing:', exc

The presence of the familiar def keyword should help people understand what's happening here. Note that I didn't include an example but there's no reason why an anonymous block parameter couldn't return a value which could be used in the function calling the block. Please, be gentle.
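[Editorial aside: since Jason's "do" statement is explicitly sugar for passing locally defined functions by parameter name, the desugared form already runs today without new syntax. A self-contained sketch of that translation in modern Python (the temp-file plumbing and the `events` list are added here for illustration; they are not part of the proposal):]

```python
import sys
import tempfile

# Desugared form of the proposal: each "with" block becomes a local
# function passed to process_file under its parameter name.
def process_file(path, process, success=None, error=None):
    f = None
    try:
        try:
            f = open(path)
            process(f)
            if success:
                success()
        except Exception:
            if error:
                error(sys.exc_info())
            raise
    finally:
        if f:
            f.close()

events = []

def __process(f):            # the hidden def from the translation
    for line in f:
        events.append(line.rstrip())

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\n")

process_file(tmp.name, process=__process,
             success=lambda: events.append("success"))
print(events)  # -> ['alpha', 'beta', 'success']
```

Everything the "do" statement would add is syntactic: the callables, the keyword-argument wiring, and the calling convention are ordinary Python.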
-- Jason From jason at diamond.name Wed Apr 27 12:06:21 2005 From: jason at diamond.name (Jason Diamond) Date: Wed Apr 27 12:06:28 2005 Subject: [Python-Dev] Another Anonymous Block Proposal In-Reply-To: <20050427055231.R7719@familjen.svensson.org> References: <426F5EF6.9050400@diamond.name> <20050427055231.R7719@familjen.svensson.org> Message-ID: <426F641D.1010802@diamond.name> Paul Svensson wrote: > You're not mentioning scopes of local variables, which seems to be > the issue where most of the previous proposals lose their balance > between hairy and pointless... My syntax is just sugar for nested defs. I assumed the scopes of local variables would be identical when using either syntax. Do you have any pointers that go into the issues I'm probably missing? Thanks. -- Jason From fredrik at pythonware.com Wed Apr 27 12:36:47 2005 From: fredrik at pythonware.com (Fredrik Lundh) Date: Wed Apr 27 12:38:10 2005 Subject: [Python-Dev] Re: Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <d4nplu$sgh$1@sea.gmane.org> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html).
> > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained +1 (most excellent) </F> From ncoghlan at gmail.com Wed Apr 27 13:22:58 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 27 13:23:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <426F7612.6090707@gmail.com> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). > > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained > I'm still trying to build a case for a non-looping block statement, but the proposed enhancements to generators look great. Any further suggestions I make regarding a PEP 310 style block statement will account for those generator changes. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From stephen at xemacs.org Wed Apr 27 13:28:55 2005 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Wed Apr 27 13:29:00 2005 Subject: [Python-Dev] defmacro In-Reply-To: <426ED7EF.1090508@canterbury.ac.nz> (Greg Ewing's message of "Wed, 27 Apr 2005 12:08:15 +1200") References: <20050425094254.wf6wbmkg0pwkocwo@mcherm.com> <426DDD87.60908@canterbury.ac.nz> <87k6mqnddr.fsf@tleepslib.sk.tsukuba.ac.jp> <426ED7EF.1090508@canterbury.ac.nz> Message-ID: <877jiolaq0.fsf@tleepslib.sk.tsukuba.ac.jp> >>>>> "Greg" == Greg Ewing <greg.ewing@canterbury.ac.nz> writes: Greg> I didn't claim that people would feel compelled to eliminate Greg> all uses of lambda; only that, in those cases where they Greg> *do* feel so compelled, they might not if lambda weren't Greg> such a long word. Sure, I understood that. It's just that my feeling is that lambda can't "just quote a suite", it brings lots of other semantic baggage with it. Anyway, with dynamic scope, we can eliminate lambda, can't we? Just pass the suites as quoted lists of forms, compute the macro expansion, and eval it. So it seems to me that the central issue us scoping, not preventing evaluation of the suites. In Lisp, macros are a way of temporarily enabling certain amounts of dynamic scoping for all variables, without declaring them "special". It is very convenient that they don't evaluate their arguments, but that is syntactic sugar, AFAICT. In other words, it's the same idea as the "collapse" keyword that was proposed, but with different rules about what gets collapsed, when. -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. 
From jim at zope.com Wed Apr 27 13:42:07 2005 From: jim at zope.com (Jim Fulton) Date: Wed Apr 27 13:42:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <426F7A8F.8090109@zope.com> Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). > > Some highlights: > > - temporarily sidestepping the syntax by proposing 'block' instead of 'with' > - __next__() argument simplified to StopIteration or ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained This looks pretty cool. Some observations: 1. It looks to me like a bare return or a return with an EXPR3 that happens to evaluate to None inside a block simply exits the block, rather than exiting a surrounding function. Did I miss something, or is this a bug? 2. I assume it would be a hack to try to use block statements to implement something like interfaces or classes, because doing so would require significant local-variable manipulation. I'm guessing that either implementing interfaces (or implementing a class statement in which the class was created before execution of a suite) is not a use case for this PEP. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From ncoghlan at gmail.com Wed Apr 27 13:44:49 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed Apr 27 13:44:55 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426EE270.1080303@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <3cf156b4a85d5b9907c6c9333d8c6af8@python.net> <426E5AEB.3030707@gmail.com> <426EE270.1080303@canterbury.ac.nz> Message-ID: <426F7B31.2040109@gmail.com> Greg Ewing wrote:
> Nick Coghlan wrote:
>
>> def template():
>>     # pre_part_1
>>     yield None
>>     # post_part_1
>>     yield None
>>     # pre_part_2
>>     yield None
>>     # post_part_2
>>     yield None
>>     # pre_part_3
>>     yield None
>>     # post_part_3
>>
>> def user():
>>     block = template()
>>     with block:
>>         # do_part_1
>>     with block:
>>         # do_part_2
>>     with block:
>>         # do_part_3
>
> That's an interesting idea, but do you have any use cases
> in mind?

I was trying to address a use case which looked something like:

    do_begin()
    # code
    if some_condition:
        do_pre()
        # more code
        do_post()
    do_end()

It's actually doable with a non-looping block statement, but I have yet to come up with a version which isn't as ugly as hell.

> I worry that it will be too restrictive to be really useful.
> Without the ability for the iterator to control which blocks
> get executed and when, you wouldn't be able to implement
> something like a case statement, for example.

We can't write a case statement with a looping block statement either, since we're restricted to executing the same suite whenever we encounter a yield expression. At least the non-looping version offers some hope, since each yield can result in the execution of different code. For me, the main sticking point is that we *already* have a looping construct to drain an iterator - a 'for' loop.
The more different the block statement's semantics are from a regular loop, the more powerful I think the combination will be. Whereas if the block statement is just a for loop with slightly tweaked exception handling semantics, then the potential combinations will be far less interesting. My current thinking is that we would be better served by a block construct that guaranteed it would call __next__() on entry and on exit, but did not drain the generator (e.g. by supplying appropriate __enter__() and __exit__() methods on generators for a PEP 310 style block statement, or __enter__(), __except__() and __no_except__() for the enhanced version posted elsewhere in this rambling discussion). However, I'm currently scattering my thoughts across half-a-dozen different conversation threads. So I'm going to stop doing that, and try to put it all into one coherent post :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From duncan.booth at suttoncourtenay.org.uk Wed Apr 27 14:22:20 2005 From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth) Date: Wed Apr 27 14:22:27 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> Message-ID: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: > Guido van Rossum wrote: >> I've written a PEP about this topic. It's PEP 340: Anonymous Block >> Statements (http://python.org/peps/pep-0340.html). >> > Some observations: > > 1. It looks to me like a bare return or a return with an EXPR3 that > happens > to evaluate to None inside a block simply exits the block, rather > than exiting a surrounding function. Did I miss something, or is > this a bug? 
> No, the return sets a flag and raises StopIteration which should make the
> iterator also raise StopIteration at which point the real return happens.

If the iterator fails to re-raise the StopIteration exception (the spec only says it should, not that it must) I think the return would be ignored but a subsequent exception would then get converted into a return value. I think the flag needs reset to avoid this case. Also, I wonder whether other exceptions from next() shouldn't be handled a bit differently. If BLOCK1 throws an exception, and this causes the iterator to also throw an exception then one exception will be lost. I think it would be better to propagate the original exception rather than the second exception. So something like (added lines to handle both of the above):

      itr = EXPR1
      exc = arg = None
      ret = False
      while True:
          try:
              VAR1 = next(itr, arg)
          except StopIteration:
              if exc is not None:
                  if ret:
                      return exc
                  else:
                      raise exc  # XXX See below
              break
    +     except:
    +         if ret or exc is None:
    +             raise
    +         raise exc  # XXX See below
    +     ret = False
          try:
              exc = arg = None
              BLOCK1
          except Exception, exc:
              arg = StopIteration()

From ncoghlan at iinet.net.au Wed Apr 27 15:27:35 2005 From: ncoghlan at iinet.net.au (Nick Coghlan) Date: Wed Apr 27 15:33:11 2005 Subject: [Python-Dev] Integrating PEP 310 with PEP 340 Message-ID: <426F9347.6000505@iinet.net.au> This is my attempt at a coherent combination of what I like about both proposals (as opposed to my assortment of half-baked attempts scattered through the existing discussion). PEP 340 has many ideas I like:

- enhanced yield statements and yield expressions
- enhanced continue and break
- generator finalisation
- 'next' builtin and associated __next__() slot
- changes to 'for' loop

One restriction I don't like is the limitation to ContinueIteration and StopIteration as arguments to next().
The proposed semantics and conventions for ContinueIteration and StopIteration are fine, but I would like to be able to pass _any_ exception in to the generator, allowing the generator to decide if a given exception justifies halting the iteration. The _major_ part I don't like is that the block statement's semantics are too similar to those of a 'for' loop. I would like to see a new construct that can do things a for loop can't do, and which can be used in _conjunction_ with a for loop, to provide greater power than either construct on their own. PEP 310 forms the basis for a block construct that I _do_ like. The question then becomes whether or not generators can be used to write useful PEP 310 style block managers (I think they can, in a style very similar to that of the looping block construct from PEP 340). Block statement syntax from PEP 340:

    block EXPR1 [as VAR1]:
        BLOCK1

Proposed semantics (based on PEP 310, with some ideas stolen from PEP 340):

    blk_mgr = EXPR1
    VAR1 = blk_mgr.__enter__()
    try:
        try:
            BLOCK1
        except Exception, exc:
            blk_mgr.__except__(exc)
        else:
            blk_mgr.__no_except__()
    finally:
        blk_mgr.__exit__()

'blk_mgr' is a hidden variable (as per PEP 340). Note that nothing special happens to 'break', 'return' or 'continue' statements with this proposal.
Generator methods to support the block manager protocol used by the block statement:

    def __enter__(self):
        try:
            return next(self)
        except StopIteration:
            raise RuntimeError("Generator exhausted before block statement")

    def __except__(self, exc):
        try:
            next(self, exc)
        except StopIteration:
            pass

    def __else__(self):
        try:
            next(self)
        except StopIteration:
            pass

    def __exit__(self):
        pass

Writing simple block managers with this proposal (these should be identical to the equivalent PEP 340 block managers):

    def opening(name):
        opened = open(name)
        try:
            yield opened
        finally:
            opened.close()

    def logging(logger, name):
        logger.enter_scope(name)
        try:
            try:
                yield
            except Exception, exc:
                logger.log_exception(exc)
        finally:
            logger.exit_scope()

    def transacting(ts):
        ts.begin()
        try:
            yield
        except:
            ts.abort()
        else:
            ts.commit()

Using simple block managers with this proposal (again, identical to PEP 340):

    block opening(name) as f:
        pass

    block logging(logger, name):
        pass

    block transacting(ts):
        pass

Obviously, the more interesting block managers are those like auto_retry (which is a loop, and hence an excellent match for PEP 340), and using a single generator in multiple block statements (which PEP 340 doesn't allow at all). I'll try to get to those tomorrow (and if I can't find any good use cases for the latter trick, then this idea can be summarily discarded in favour of PEP 340).

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From jim at zope.com Wed Apr 27 15:44:03 2005 From: jim at zope.com (Jim Fulton) Date: Wed Apr 27 15:44:08 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> Message-ID: <426F9723.4080604@zope.com> Duncan Booth wrote: > Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: > > >>Guido van Rossum wrote: >> >>>I've written a PEP about this topic. It's PEP 340: Anonymous Block >>>Statements (http://python.org/peps/pep-0340.html). >>> >> >>Some observations: >> >>1. It looks to me like a bare return or a return with an EXPR3 that >>happens >> to evaluate to None inside a block simply exits the block, rather >> than exiting a surrounding function. Did I miss something, or is >> this a bug? >> > > > No, the return sets a flag and raises StopIteration which should make the > iterator also raise StopIteration at which point the real return happens. Only if exc is not None The only return in the pseudocode is inside "if exc is not None". Is there another return that's not shown? ;) I agree that we leave the block, but it doesn't look like we leave the surrounding scope. Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From pedronis at strakt.com Wed Apr 27 15:48:24 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Wed Apr 27 15:48:36 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426F9723.4080604@zope.com> References: <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com> Message-ID: <426F9828.6060102@strakt.com> Jim Fulton wrote: > Duncan Booth wrote: > >> Jim Fulton <jim@zope.com> wrote in news:426F7A8F.8090109@zope.com: >> >> >>> Guido van Rossum wrote: >>> >>>> I've written a PEP about this topic. It's PEP 340: Anonymous Block >>>> Statements (http://python.org/peps/pep-0340.html). >>>> >>> >>> Some observations: >>> >>> 1. It looks to me like a bare return or a return with an EXPR3 that >>> happens to evaluate to None inside a block simply exits the >>> block, rather >>> than exiting a surrounding function. Did I miss something, or is >>> this a bug? >>> >> >> >> No, the return sets a flag and raises StopIteration which should make >> the iterator also raise StopIteration at which point the real return >> happens. > > > Only if exc is not None > > The only return in the pseudocode is inside "if exc is not None". > Is there another return that's not shown? ;) > > I agree that we leave the block, but it doesn't look like we > leave the surrounding scope. that we are having this discussion at all seems a signal that the semantics are likely too subtle. 
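The subtlety being debated here, a bare 'return' inside a generator surfacing to the caller as StopIteration, can be checked directly with a plain generator. This is a minimal illustration of that one behaviour only, not the full PEP 340 block expansion:

```python
# A bare return in a generator body does not hand a value to anyone;
# it simply ends the iteration, which the caller observes as StopIteration.
def gen():
    yield 1
    return        # ends the generator here
    yield 2       # never reached

g = gen()
first = next(g)   # -> 1
try:
    next(g)
    ended = False
except StopIteration:
    ended = True  # the 'return' surfaced at this call
print(first, ended)
```

The block-statement expansion under discussion has to re-raise that StopIteration correctly for the enclosing function's return to happen, which is exactly the bookkeeping the pseudo-code above argues about.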
From duncan.booth at suttoncourtenay.org.uk Wed Apr 27 16:19:35 2005
From: duncan.booth at suttoncourtenay.org.uk (Duncan Booth)
Date: Wed Apr 27 16:19:40 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com>
Message-ID: <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>

Jim Fulton <jim@zope.com> wrote in news:426F9723.4080604@zope.com:

>> No, the return sets a flag and raises StopIteration which should make
>> the iterator also raise StopIteration at which point the real return
>> happens.
>
> Only if exc is not None
>
> The only return in the pseudocode is inside "if exc is not None".
> Is there another return that's not shown? ;)
>

Ah yes, I see now what you mean. I would think that the relevant pseudo-code should look more like:

    except StopIteration:
        if ret:
            return exc
        if exc is not None:
            raise exc # XXX See below
        break

From pje at telecommunity.com Wed Apr 27 17:01:27 2005
From: pje at telecommunity.com (Phillip J. Eby)
Date: Wed Apr 27 16:57:40 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com>
Message-ID: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>

At 12:30 AM 4/27/05 -0700, Guido van Rossum wrote:
>I've written a PEP about this topic. It's PEP 340: Anonymous Block
>Statements (http://python.org/peps/pep-0340.html).
>
>Some highlights:
>
>- temporarily sidestepping the syntax by proposing 'block' instead of 'with'
>- __next__() argument simplified to StopIteration or ContinueIteration instance
>- use "continue EXPR" to pass a value to the generator
>- generator exception handling explained

Very nice.
It's not clear from the text, btw, if normal exceptions can be passed into __next__, and if so, whether they can include a traceback. If they *can*, then generators can also be considered co-routines now, in which case it might make sense to call blocks "coroutine blocks", because they're basically a way to interleave a block of code with the execution of a specified coroutine. From pje at telecommunity.com Wed Apr 27 17:10:41 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Apr 27 17:06:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050426043713116248@mail.gmail.com> References: <426DB7C8.5020708@canterbury.ac.nz> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> Message-ID: <5.1.1.6.0.20050427110325.02471bf0@mail.telecommunity.com> At 04:37 AM 4/26/05 -0700, Guido van Rossum wrote: >*Fourth*, and this is what makes Greg and me uncomfortable at the same >time as making Phillip and other event-handling folks drool: from the >previous three points it follows that an iterator may *intercept* any >or all of ReturnFlow, BreakFlow and ContinueFlow, and use them to >implement whatever cool or confusing magic they want. Actually, this isn't my interest at all. It's the part where you can pass values or exceptions *in* to a generator with *less* magic than is currently required. This interest is unrelated to anonymous blocks in any case; it's about being able to simulate lightweight pseudo-threads ala Stackless, for use with Twisted. I can do this now of course, but "yield expressions" as described in PEP 340 would eliminate the need for the awkward syntax and frame hackery I currently use. 
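What Phillip describes here, resuming a paused generator from outside with a value instead of frame hackery, is exactly what generator.send() later provided (PEP 342, Python 2.5). A small sketch of the pseudo-thread idea in today's Python; the names worker/run and the request strings are invented for illustration:

```python
# Each 'yield' hands a request object to the scheduler and pauses the
# pseudo-thread; the scheduler resumes it by sending the response back in
# as the value of the yield expression.
def worker():
    a = yield 'fetch-a'          # pause; resumes with the scheduler's reply
    b = yield 'fetch-b'
    yield ('done', a, b)

def run(gen, respond):
    # Drive one pseudo-thread to completion, answering each request.
    request = next(gen)
    while not (isinstance(request, tuple) and request[0] == 'done'):
        request = gen.send(respond(request))
    return request

final = run(worker(), lambda req: req.upper())
print(final)                     # ('done', 'FETCH-A', 'FETCH-B')
```

A real scheduler would interleave many such generators (and dispatch requests to Twisted-style deferred operations), but the send/resume mechanics are the whole trick.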
From fredrik at pythonware.com Wed Apr 27 17:32:16 2005
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed Apr 27 17:36:38 2005
Subject: [Python-Dev] Re: Re: anonymous blocks
References: <426DB7C8.5020708@canterbury.ac.nz> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <5.1.1.6.0.20050427110325.02471bf0@mail.telecommunity.com>
Message-ID: <d4oavq$qef$1@sea.gmane.org>

Phillip J. Eby wrote:

> This interest is unrelated to anonymous blocks in any case; it's about
> being able to simulate lightweight pseudo-threads ala Stackless, for use
> with Twisted. I can do this now of course, but "yield expressions" as
> described in PEP 340 would eliminate the need for the awkward syntax and
> frame hackery I currently use.

since when does

    def mythread(self):
        ...
        yield request
        print self.response
        ...

qualify as frame hackery?

</F>

From jcarlson at uci.edu Wed Apr 27 18:25:13 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 18:27:20 2005
Subject: [Python-Dev] Another Anonymous Block Proposal
In-Reply-To: <426F641D.1010802@diamond.name>
References: <20050427055231.R7719@familjen.svensson.org> <426F641D.1010802@diamond.name>
Message-ID: <20050427091934.640A.JCARLSON@uci.edu>

Jason Diamond <jason@diamond.name> wrote:
>
> Paul Svensson wrote:
>
> > You're not mentioning scopes of local variables, which seems to be
> > the issue where most of the previous proposals lose their balance
> > between hairy and pointless...
>
> My syntax is just sugar for nested defs. I assumed the scopes of local
> variables would be identical when using either syntax.
>
> Do you have any pointers to that go into the issues I'm probably missing?

We already have nested defs in Python, no need for a new syntax there.
The trick is that people would like to be able to execute the body of a def (or at least portions) in the namespace of where it is lexically defined (seemingly making block syntaxes less appealing), and even some who want to execute the body of the def in the namespace where the function is evaluated (which has been discussed as being almost not possible, if not entirely impossible).

- Josiah

From jcarlson at uci.edu Wed Apr 27 18:44:12 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Wed Apr 27 18:45:03 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <20050427093635.640D.JCARLSON@uci.edu>

Guido van Rossum <gvanrossum@gmail.com> wrote:
>
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
>
> Some highlights:
>
> - temporarily sidestepping the syntax by proposing 'block' instead of 'with'
> - __next__() argument simplified to StopIteration or ContinueIteration instance
> - use "continue EXPR" to pass a value to the generator
> - generator exception handling explained

Your code for the translation of a standard for loop is flawed. From the PEP:

    for VAR1 in EXPR1:
        BLOCK1
    else:
        BLOCK2

will be translated as follows:

    itr = iter(EXPR1)
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        finally:
            break
        arg = None
        BLOCK1
    else:
        BLOCK2

Note that in the translated version, BLOCK2 can only ever execute if next raises a StopIteration in the call, and BLOCK1 will never be executed because of the 'break' in the finally clause.

Unless it is too early for me, I believe what you wanted is...
    itr = iter(EXPR1)
    arg = None
    while True:
        VAR1 = next(itr, arg)
        arg = None
        BLOCK1
    else:
        BLOCK2

- Josiah

From gvanrossum at gmail.com Wed Apr 27 18:55:14 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 18:55:19 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>
References: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1> <426F9723.4080604@zope.com> <n2m-g.Xns96459BEDFA1B5duncanrcpcouk@127.0.0.1>
Message-ID: <ca471dc205042709555b24f522@mail.gmail.com>

> I would think that the relevant pseudo-code should look more like:
>
>     except StopIteration:
>         if ret:
>             return exc
>         if exc is not None:
>             raise exc # XXX See below
>         break

Thanks! This was a bug in the PEP due to a last-minute change in how I wanted to handle return; I've fixed it as you show (also renaming 'exc' to 'var' since it doesn't always hold an exception).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steven.bethard at gmail.com Wed Apr 27 19:35:20 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Wed Apr 27 19:35:24 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <d11dcfba05042710355eba8d39@mail.gmail.com>

On 4/27/05, Guido van Rossum <gvanrossum@gmail.com> wrote:
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
So block-statements would be very much like for-loops, except:

(1) iter() is not called on the expression

(2) the fact that break, continue, return or a raised Exception occurred can all be intercepted by the block-iterator/generator, though break, return and a raised Exception all look the same to the block-iterator/generator (they are signaled with a StopIteration)

(3) the while loop can only be broken out of by next() raising a StopIteration, so all well-behaved iterators will be exhausted when the block-statement is exited

Hope I got that mostly right.

I know this is looking a little far ahead, but is the intention that even in Python 3.0 for-loops and block-statements will still be separate statements? It seems like there's a pretty large section of overlap. Playing with for-loop semantics right now isn't possible due to backwards compatibility, but when that limitation is removed in Python 3.0, are we hoping that these two similar structures will be expressed in a single statement?

STeVe
--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From gvanrossum at gmail.com Wed Apr 27 19:42:04 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 19:42:11 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050427093635.640D.JCARLSON@uci.edu>
References: <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050427093635.640D.JCARLSON@uci.edu>
Message-ID: <ca471dc205042710424c7c5006@mail.gmail.com>

> Your code for the translation of a standard for loop is flawed.
> From the PEP:
>
>     for VAR1 in EXPR1:
>         BLOCK1
>     else:
>         BLOCK2
>
> will be translated as follows:
>
>     itr = iter(EXPR1)
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         finally:
>             break
>         arg = None
>         BLOCK1
>     else:
>         BLOCK2
>
> Note that in the translated version, BLOCK2 can only ever execute if
> next raises a StopIteration in the call, and BLOCK1 will never be
> executed because of the 'break' in the finally clause.

Ouch. Another bug in the PEP. It was late. ;-)

The "finally:" should have been "except StopIteration:" I've updated the PEP online.

> Unless it is too early for me, I believe what you wanted is...
>
>     itr = iter(EXPR1)
>     arg = None
>     while True:
>         VAR1 = next(itr, arg)
>         arg = None
>         BLOCK1
>     else:
>         BLOCK2

No, this would just propagate the StopIteration when next() raises it. StopIteration is not caught implicitly except around the next() call made by the for-loop control code.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From Benjamin.Schollnick at xerox.com Wed Apr 27 16:58:26 2005
From: Benjamin.Schollnick at xerox.com (Schollnick, Benjamin)
Date: Wed Apr 27 20:22:38 2005
Subject: [Python-Dev] ZipFile revision....
Message-ID: <266589E1B9392B4C9195CC25A07C73B9AA5033@usa0300ms04.na.xerox.net>

Folks,

There's been a lot of talk lately about changes to the ZipFile module... Along with people stating that there are few "real life" applications for it....

Here's a small "gift"... A "Quick" Backup utility for your files....

Example:

    c:\develope\backup\backup.py --source c:\install_software --target c:\backups\ --label installers
    c:\develope\backup\backup.py --source c:\develope --target c:\backups\ --label development -z .pyc
    c:\develope\backup\backup.py --source "C:\Program Files\Microsoft SQL Server\MSSQL\Data" --target c:\backups\ --label sql

It's evolved a bit, but still could use some work.... It's currently only tested in a windows environment... So don't expect Mac OS X resource forks to be preserved.....
But it creates and verifies 1Gb+ zip files.... If you wish to use this to help benchmark, test, etc, any changes to the ZipFile module please feel free to... - Benjamin """Backup Creator Utility This utility will backup the tree of files that you indicate, into a archive of your choice. """ # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # __version__ = '0.95' # Human Readable Version number version_info = (0,9,5) # Easier format version data for comparisons # i.e. if version_info > (1,2,5) # # if __version__ > '1.00' is a little more contrived. __author__ = 'Benjamin A. Schollnick' __date__ = '2004-12-28' # yyyy-mm-dd __email__ = 'Benjamin.Schollnick@xerox.com' __module_name__ = "Archive Backup Tool" __short_cright__= "" import bas_init import os import os.path import sys import time import zipfile ####################################################################### class zip_file_engine: """The archive backup tool uses pregenerated classes to allow multiple styles of archives to be created. This is the wrapper around the Python ZIPFILE module. """ def __init__ ( self ): """ Inputs -- None Outputs -- None """ self.Backup_File = None self.Backup_Open = False self.Backup_ReadOnly = None self.Backup_FileName = None def close_Backup (self ): """This will close the current Archive file, and reset the internal structures to a clean state. Inputs -- None Outputs -- None """ if self.Backup_Open <> False: self.Backup_File.close () self.Backup_File = None self.Backup_Open = False self.Backup_ReadOnly = None self.Backup_FileName = None def open_Backup ( self, readonly = False, filename = r"./temp.zip"): """This will open a archive file. Currently appending is not formally supported... The Read Only / Read/Write status is set via the readonly flag. Inputs -- Readonly: True = Read/Write False = Read Only Filename contains the full file/pathname of the zip file. 
Outputs -- None """ if self.Backup_Open == True: self.close_Backup () self.Backup_Filename = filename if readonly == False: self.Backup_File = zipfile.ZipFile ( filename, "r", zipfile.ZIP_DEFLATED ) self.Backup_Open = True self.Backup_ReadOnly = True self.Backup_FileName = filename else: self.Backup_File = zipfile.ZipFile ( filename, "w", zipfile.ZIP_DEFLATED ) self.Backup_Open = True self.Backup_ReadOnly = False self.Backup_FileName = filename def Verify_ZipFile ( self, FileName ): """Will create a temporary Zip File object, and verify the Zip file at <filename> location. Inputs - FileName - The filename of the ZIP file to verify. Outputs - True - File Intact CRCs match Anything else, File Corrupted. String Contains the 1st corrupted file. """ temporary_Backup_File = zip_file_engine ( ) temporary_Backup_File.open_Backup ( False, FileName) test_results = temporary_Backup_File.Backup_File.testzip () temporary_Backup_File.close_Backup() return test_results def Verify_Backup (self, FileName ): """ Generic Wrapper around the Verify_ZipFile object. """ return self.Verify_ZipFile ( FileName ) def add_file_to_Backup ( self, filename, archived_filename): """Add a file to the writable Zip file. inputs - filename = Filename of the file to be added archived_filename = the Filename stored in the archive. Outputs - None - Zip file is in Read Only Mode True - File has been added to the Zip File. -1 - Or the Zipfile engine is not initialized. """ if self.Backup_ReadOnly: # Archive is read only! return None elif self.Backup_ReadOnly == False: # Archive is Read Write Mode if self.Backup_File <> None: # Zip File Engine is initialized self.Backup_File.write (filename, archived_filename) # Return Success return True else: # Zip File Engine is *NOT* initialized. return -1 ######################################################################## # No_Archive = 1 ZIP = 2 class backup_system: """Main Class for the Backup Engine. Inputs - default_source = The pathname for the source files. 
default_target = The pathname for the archive to be written to. default_tag = the prepended text tag for the archive file. Outputs - None """ def __init__ ( self, default_source = None, default_target = None, default_exclude = "", default_extensions = "", default_tag = None, prepend = False, quiet = False): """The initialization routines for the Backup Engine. Inputs - default_source = The pathname for the source files. default_target = The pathname for the archive to be written to. default_exclude = default_tag = the text tag for the archive file. prepend = Deterimines the placement of the default tag. True - The tag is prepended to the filename False - The Tag is appended to the filename. The default is to append the default_tag. (False) quiet = Forcibly prevent any output from the directory walk. Outputs - None; Internal Values are initialized for the core engine. """ self.directory_to_backup = default_source self.backup_storage_location = default_target self.base_filename = default_tag self.archive_filename = None self.force_quiet = quiet # self.exlude_files_dir = default_exlude self.exclude_files_dir = default_exclude.upper().strip().split(",") self.exclude_exts = default_extensions.upper().strip().split(",") if default_tag == None: self.archive_filename_template = "%m_%d_%Y__%H_%M_%S" else: if (prepend==True): self.archive_filename_template = self.base_filename + "_%m_%d_%Y__%H_%M_%S" elif (prepend==None) or (prepend==False): self.archive_filename_template = "%m_%d_%Y__%H_%M_%S_" + self.base_filename self.archive_filename_extension = ".zip" self.archive_engine_to_use = None def create_archive_filename ( self ): """This sets the archive filename in the object. This is set, to prevent timing issues internally. Inputs - None Outputs - None; Internally creates the archives filename from the backup_storage_location, and the archive_filename_template. 
""" self.archive_filename = self.backup_storage_location + os.sep + time.strftime (self.archive_filename_template, time.localtime() ) + self.archive_filename_extension def Verify_Backup ( self ): """ Wrapper around the archive_engines verify routines. This will automatically start the verification process, and return the results. Inputs - None Outputs - True - File Intact CRCs match Anything else, File Corrupted. String Contains details from the archiver engine. """ return self.archive_engine_to_use.Verify_Backup ( self.archive_filename ) def start_archive_engine ( self, Backup_Type): """ Initializae the derived archive_engine, depending on the Backup_Type. Inputs - Backup_Type 1 - None 2 - Zip Outputs - None """ if Backup_Type == 2: self.archive_engine_to_use = zip_file_engine () self.create_archive_filename () self.archive_engine_to_use.open_Backup ( readonly = True, filename = self.archive_filename ) def close_archive_file ( self ): """Stop and Close the Archive File. This does terminate the Archive Engine. But does not terminate the Backup_Engine. Inputs - None Outputs - None; Internally resets the archive engine to a closed state. """ self.archive_engine_to_use.close_Backup () def walk_directory_tree ( self, notify_directory = None, notify_file = None ): """Walk the source directory tree, and add each file/directory into the archive file. Inputs - notify_directory (Pointer) - see below notify_file (Pointer) - see below Outputs - None If you wish to have a console output for the walk function, you can have that via the notify_directory and notify_file functions.... Create two stubs and pass them to the walk routine. The routines only have a single input, either a directory name, or a filename, depending on the function. 
def notify_dir ( directory_name ): print print "Processing Directory - %s " % directory_name def notify_file ( file_name ): print "\t\tFile - %s " % file_name Backup_Engine.walk_directory_tree ( notify_directory = notify_dir, notify_file = notify_file ) """ selfexclude = os.path.normpath(self.archive_filename) if self.force_quiet: notify_directory = None notify_file = None for root, dirs, files in os.walk( self.directory_to_backup ): if notify_directory <> None: notify_directory ( root ) for file in files: if notify_file <> None: notify_file ( file ) # # Add the file to the backup zip file. # if os.path.normpath(file) <> selfexclude: # self.archive_engine_to_use.add_file_to_Backup ( root + os.sep + file, root + os.sep + file) # else: # print "Skipping, it is backup file - %s" % file exclude_file = False if os.path.normpath(file) == selfexclude: exclude_file = True if file.strip().upper() in self.exclude_files_dir: exclude_file = True if root <> '.': root_segment = root.strip().upper().split(os.sep) for x in root_segment: if x in self.exclude_files_dir: exclude_file = True #self.exclude_exts for x in self.exclude_exts: #print "X: ", x.strip(), " - ", os.path.splitext( file )[1].upper().strip() #print x.strip() == os.path.splitext( file )[1].upper().strip() if x.strip() == os.path.splitext( file )[1].upper().strip(): exclude_file = True if not exclude_file: self.archive_engine_to_use.add_file_to_Backup ( root + os.sep + file, root + os.sep + file) # else: # print "Skipping - %s" % file exclude_file = False def notify_dir ( directory_name ): print print "Processing Directory - %s " % directory_name def notify_file ( file_name ): print "\t\tFile - %s " % file_name def Backup_Directories_App (): """Example Application that will backup the Directories as stated in the command line. Inputs - None Outputs - None; FileSystem, Archive file. 
""" initialization_data = bas_init.initialization_wrapper () initialization_data.cmd_line_interface.add_option ("-s", "--source", action="store", type="string", dest="source", help="Directory Tree to Read From", default=".") initialization_data.cmd_line_interface.add_option ("-t", "--target", action="store", type="string", dest="target", help="Directory to write the backup to", default=".") initialization_data.cmd_line_interface.add_option ("-l", "--label", action="store", type="string", dest="label", help="What to Label the Backup File as", default="backup") initialization_data.cmd_line_interface.add_option ("-p", "--pre", action="store_true", dest="prelabel", help="If used, the label is prepended to the filename. Otherwise it is appended.") initialization_data.cmd_line_interface.add_option ("-q", "--quiet", action="store_true", dest="quiet", help="Force File & Directory printing to be turned off.") initialization_data.cmd_line_interface.add_option ("-x", "--exclude", action="store", type="string", dest="exclude", help="List of files/directories to exclude", default="") initialization_data.cmd_line_interface.add_option ("-z", "--extensions", action="store", type="string", dest="exclude_exts", help="List of file extensions to exclude", default="") initialization_data.run_cmd_line_parse ( ) print "Initialization Successful..." print Backup_Engine = backup_system ( initialization_data.cmd_line_options.source, initialization_data.cmd_line_options.target, initialization_data.cmd_line_options.exclude, initialization_data.cmd_line_options.exclude_exts, initialization_data.cmd_line_options.label, initialization_data.cmd_line_options.prelabel, initialization_data.cmd_line_options.quiet) Backup_Engine.start_archive_engine ( ZIP ) print "Backup Archive - %s " % Backup_Engine.archive_filename Backup_Engine.walk_directory_tree ( notify_directory = notify_dir, notify_file = notify_file ) Backup_Engine.close_archive_file () print "Verifying the Archive File...." 
    test_result = Backup_Engine.Verify_Backup ( )
    if test_result:
        print
        print "The Backup has failed!"
        print
        print "This file, %s, is bad." % test_result
    else:
        print
        print "The Backup has been verified!"
        print
        print "Backup is successful."

    print
    print "Backup Application has completed."

if __name__ == "__main__":      # If run from the Command line
    Backup_Directories_App ()   # run the unit test.
    itr = iter(EXPR1)
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            BLOCK2
            break
        arg = None
        BLOCK1

- Josiah

From bac at OCF.Berkeley.EDU Wed Apr 27 22:20:34 2005
From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Wed Apr 27 22:20:44 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com>
Message-ID: <426FF412.7010709@ocf.berkeley.edu>

Guido van Rossum wrote:
> I've written a PEP about this topic. It's PEP 340: Anonymous Block
> Statements (http://python.org/peps/pep-0340.html).
>
> Some highlights:
>
> - temporarily sidestepping the syntax by proposing 'block' instead of 'with'
> - __next__() argument simplified to StopIteration or ContinueIteration instance
> - use "continue EXPR" to pass a value to the generator
> - generator exception handling explained

I am at least +0 on all of this now, with a slow warming up to +1 (but then it might just be the cold talking =).

I still prefer the idea of arguments to __next__() be raised if they are exceptions and otherwise just be returned through the yield expression. But I do realize this is easily solved with a helper function now::

    def raise_or_yield(val):
        """Return the argument if not an exception, otherwise raise it.

        Meant to have a yield expression as an argument.

        Worries about Iteration subclasses are invalid since they will
        have been handled by the __next__() method on the generator
        already.
        """
        if isinstance(val, Exception):
            raise val
        else:
            return val

My objections that I had earlier to 'continue' and 'break' being somewhat magical in block statements have subsided. It all seems reasonable now within the context of a block statement.
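Brett's raise_or_yield helper can be exercised with the yield expressions this thread eventually produced (PEP 342's send()). The driver below pushes either a plain value or an exception instance into a generator; the consumer/log names are invented for the sketch:

```python
def raise_or_yield(val):
    """Return the argument if it is not an exception, otherwise raise it."""
    if isinstance(val, Exception):
        raise val
    return val

def consumer(log):
    # Values sent in come back as the result of the yield expression;
    # exception instances are raised at the yield by the helper instead.
    while True:
        try:
            item = raise_or_yield((yield))
        except KeyError:
            log.append('caught KeyError')
        else:
            log.append('got %r' % (item,))

log = []
gen = consumer(log)
next(gen)                      # prime the generator to its first yield
gen.send(42)
gen.send(KeyError('missing'))  # raised inside the generator, caught there
gen.send('hello')
print(log)                     # ['got 42', 'caught KeyError', "got 'hello'"]
```

The one cost Brett notes remains visible here: a genuine exception instance can never be delivered as an ordinary value through this helper, which is the ambiguity the later send()/throw() split removed.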
And while the thought is in my head, I think block statements should be viewed
less as a tweaked version of a 'for' loop and more as an extension to
generators that happens to be very handy for resource management (while
allowing iterators to come over and play on the new swing set as well). I
think if you take that view then the argument that they are too similar to
'for' loops loses some luster (although I doubt Nick is going to buy this =).

Basically block statements are providing a simplified, syntactically
supported way to control a generator externally from itself (or at least
this is the impression I am getting). I just had a flash of worry about how
this would work when code containing a block statement is abstracted into a
function, but then I realized you just push more code into the generator and
handle it there, with the block statement just driving the generator. Seems
like this might provide that last key piece for generators to finally
provide the cool flow control that we all know they are capable of but that
used to require extra work.

-Brett

From gvanrossum at gmail.com Wed Apr 27 22:27:18 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 22:27:36 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com>
	<ca471dc205042416572da9db71@mail.gmail.com>
	<426DB7C8.5020708@canterbury.ac.nz>
	<ca471dc2050426043713116248@mail.gmail.com>
	<426E3B01.1010007@canterbury.ac.nz>
	<ca471dc205042621472b1f6edf@mail.gmail.com>
	<ca471dc20504270030405f922f@mail.gmail.com>
	<5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com>
Message-ID: <ca471dc205042713277846852d@mail.gmail.com>

[Phillip Eby]
> Very nice. It's not clear from the text, btw, if normal exceptions can be
> passed into __next__, and if so, whether they can include a traceback.
If
> they *can*, then generators can also be considered co-routines now, in
> which case it might make sense to call blocks "coroutine blocks", because
> they're basically a way to interleave a block of code with the execution of
> a specified coroutine.

The PEP is clear on this: __next__() only takes Iteration instances,
i.e., StopIteration and ContinueIteration. (But see below.)

I'm not sure what the relevance of including a stack trace would be,
and why that feature would be necessary to call them coroutines.

But... Maybe it would be nice if generators could also be used to
implement exception handling patterns, rather than just resource
release patterns. IOW, maybe this should work:

  def safeLoop(seq):
      for var in seq:
          try:
              yield var
          except Exception, err:
              print "ignored", var, ":", err.__class__.__name__

  block safeLoop([10, 5, 0, 20]) as x:
      print 1.0/x

This should print

  0.1
  0.2
  ignored 0 : ZeroDivisionError
  0.05

I've been thinking of alternative signatures for the __next__() method
to handle this. We have the following use cases:

1. plain old next()
2. passing a value from continue EXPR
3. forcing a break due to a break statement
4. forcing a break due to a return statement
5. passing an exception EXC

Cases 3 and 4 are really the same; I don't think the generator needs to
know the difference between a break and a return statement. And these can
be mapped to case 5 with EXC being StopIteration().

Now the simplest API would be this: if the argument to __next__() is an
exception instance (let's say we're talking Python 3000, where all
exceptions are subclasses of Exception), it is raised when yield resumes;
otherwise it is the return value from yield (may be None). This is
somewhat unsatisfactory because it means that you can't pass an exception
instance as a value.
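[With hindsight: the "simplest API" described above is essentially what PEP 342 later shipped as the generator throw() method. The safeLoop example can be driven with it in modern Python; the sketch below is illustrative, and the names safe_loop and drive are made up, not part of any proposal in this thread.]

```python
def safe_loop(seq):
    # Python 3 spelling of safeLoop: swallow errors thrown back in by
    # the consumer and keep iterating.
    for var in seq:
        try:
            yield var
        except Exception as err:
            print("ignored", var, ":", type(err).__name__)

def drive(seq):
    # Drive safe_loop by hand: when the body fails, throw the error into
    # the generator at its paused yield, then continue with the value
    # that throw() returns.
    gen = safe_loop(seq)
    results = []
    try:
        x = next(gen)
        while True:
            try:
                results.append(1.0 / x)
            except ZeroDivisionError as err:
                x = gen.throw(err)   # raised at the yield inside safe_loop
            else:
                x = next(gen)
    except StopIteration:
        pass
    return results

print(drive([10, 5, 0, 20]))  # [0.1, 0.2, 0.05], with the "ignored" line printed in between
```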
I don't know how much of a problem this will be in practice; I could see it causing unpleasant surprises when someone designs an API around this that takes an arbitrary object, when someone tries to pass an exception instance. Fixing such a thing could be expensive (you'd have to change the API to pass the object wrapped in a list or something). An alternative that solves this would be to give __next__() a second argument, which is a bool that should be true when the first argument is an exception that should be raised. What do people think? I'll add this to the PEP as an alternative for now. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 22:47:32 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 22:47:40 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <005301c54b59$170763e0$6402a8c0@arkdesktop> References: <426F9828.6060102@strakt.com> <005301c54b59$170763e0$6402a8c0@arkdesktop> Message-ID: <ca471dc205042713475c552de6@mail.gmail.com> > I feel like we're quietly, delicately tiptoeing toward continuations... No way we aren't. We're not really adding anything to the existing generator machinery (the exception/value passing is a trivial modification) and that is only capable of 80% of coroutines (but it's the 80% you need most :-). As long as I am BDFL Python is unlikely to get continuations -- my head explodes each time someone tries to explain them to me. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From david.ascher at gmail.com Wed Apr 27 22:53:59 2005 From: david.ascher at gmail.com (David Ascher) Date: Wed Apr 27 22:54:03 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713475c552de6@mail.gmail.com> References: <426F9828.6060102@strakt.com> <005301c54b59$170763e0$6402a8c0@arkdesktop> <ca471dc205042713475c552de6@mail.gmail.com> Message-ID: <dd28fc2f05042713536f191eda@mail.gmail.com> On 4/27/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > As long as I am BDFL Python is unlikely to get continuations -- my > head explodes each time someone tries to explain them to me. You just need a safety valve installed. It's outpatient surgery, don't worry. --david From pje at telecommunity.com Wed Apr 27 22:59:46 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed Apr 27 22:56:04 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713277846852d@mail.gmail.com> References: <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> At 01:27 PM 4/27/05 -0700, Guido van Rossum wrote: >[Phillip Eby] > > Very nice. It's not clear from the text, btw, if normal exceptions can be > > passed into __next__, and if so, whether they can include a traceback. If > > they *can*, then generators can also be considered co-routines now, in > > which case it might make sense to call blocks "coroutine blocks", because > > they're basically a way to interleave a block of code with the execution of > > a specified coroutine. 
>
>The PEP is clear on this: __next__() only takes Iteration instances,
>i.e., StopIteration and ContinueIteration. (But see below.)
>
>I'm not sure what the relevance of including a stack trace would be,
>and why that feature would be necessary to call them coroutines.

Well, you need that feature in order to retain traceback information when
you're simulating threads with a stack of generators. Although you can't
return from a generator inside a nested generator, you can simulate this by
keeping a stack of generators and having a wrapper that passes control
between generators, such that:

    def somegen():
        result = yield othergen()

causes the wrapper to push othergen() on the generator stack and execute
it. If othergen() raises an error, the wrapper resumes somegen() and
passes in the error. If you can only specify the value but not the
traceback, you lose the information about where the error occurred in
othergen().

So, the feature is necessary for anything other than "simple" (i.e.
single-frame) coroutines, at least if you want to retain any possibility of
debugging. :)

>But... Maybe it would be nice if generators could also be used to
>implement exception handling patterns, rather than just resource
>release patterns. IOW, maybe this should work:
>
>   def safeLoop(seq):
>       for var in seq:
>           try:
>               yield var
>           except Exception, err:
>               print "ignored", var, ":", err.__class__.__name__
>
>   block safeLoop([10, 5, 0, 20]) as x:
>       print 1.0/x

Yes, it would be nice. Also, you may have just come up with an even better
word for what these things should be called... patterns. Perhaps they
could be called "pattern blocks" or "patterned blocks". Pattern sounds so
much more hip and politically correct than "macro" or even "code block". :)

>An alternative that solves this would be to give __next__() a second
>argument, which is a bool that should be true when the first argument
>is an exception that should be raised. What do people think?
I think it'd be simpler just to have two methods, conceptually
"resume(value=None)" and "error(value,tb=None)", whatever the actual method
names are.

From mcherm at mcherm.com Wed Apr 27 23:09:59 2005
From: mcherm at mcherm.com (Michael Chermside)
Date: Wed Apr 27 23:10:01 2005
Subject: [Python-Dev] Re: switch statement
Message-ID: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>

Guido writes:
> You mean like this?
>
>     if x > 0:
>         ...normal case...
>     elif y > 0:
>         ....abnormal case...
>     else:
>         ...edge case...
>
> You have guts to call that bad style! :-)

Well, maybe, but this:

    if x == 1:
        do_number_1()
    elif x == 2:
        do_number_2()
    elif x == 3:
        do_number_3()
    elif y == 4:
        do_number_4()
    elif x == 5:
        do_number_5()
    else:
        raise ValueError

is clearly bad style. (Even knowing what I did here, how long does it take
you to find the problem? Hint: line 7.)

I've seen Jim's recipe in the cookbook, and as I said there, I'm impressed
by the clever implementation, but I think it's unwise. PEP 275 proposes an
O(1) solution... either by compiler optimization of certain if-elif-else
structures, or via a new syntax with 'switch' and 'case' keywords. (I
prefer the keywords version myself... that optimization seems awfully
messy, and wouldn't help with the problem above.) Jim's recipe fixes the
problem given above, but it's an O(n) solution, and to me the words
'switch' and 'case' just *scream* "O(1)". But perhaps it's worthwhile,
just because it avoids repeating "x ==".

Really, this seems like a direct analog of another frequently-heard Python
gripe: the lack of a conditional expression. After all, the problems with
these two code snippets:

    if x == 1:        |    if condition_1:
        do_1()        |        y = 1
    elif x == 2:      |    elif condition_2:
        do_2()        |        y = 2
    elif x == 3:      |    elif condition_3:
        do_3()        |        y = 3
    else:             |    else:
        default()     |        y = 4

is the repetition of "x ==" and of "y =".
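[The O(1) behaviour associated here with 'switch'/'case' is traditionally recovered in Python with a dictionary of callables; a minimal sketch, where the do_* handlers are hypothetical:]

```python
def do_1(): return "one"
def do_2(): return "two"
def do_3(): return "three"

# One hash lookup replaces the whole if/elif chain -- and there is no
# way to accidentally test the wrong variable in one branch.
dispatch = {1: do_1, 2: do_2, 3: do_3}

def handle(x):
    try:
        return dispatch[x]()
    except KeyError:
        raise ValueError(x)

print(handle(2))  # two
```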
As my earlier example demonstrates, a structure like this in which the "x ==" or the "y =" VARIES has a totally different *meaning* to the programmer than one in which the "x ==" or "y =" is the same for every single branch. But let's not start discussing conditional expressions now, because there's already more traffic on the list than I can read. -- Michael Chermside From gvanrossum at gmail.com Wed Apr 27 23:50:10 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 23:50:19 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> Message-ID: <ca471dc2050427145022e8985f@mail.gmail.com> [Guido] > >I'm not sure what the relevance of including a stack trace would be, > >and why that feature would be necessary to call them coroutines. [Phillip] > Well, you need that feature in order to retain traceback information when > you're simulating threads with a stack of generators. Although you can't > return from a generator inside a nested generator, you can simulate this by > keeping a stack of generators and having a wrapper that passes control > between generators, such that: > > def somegen(): > result = yield othergen() > > causes the wrapper to push othergen() on the generator stack and execute > it. If othergen() raises an error, the wrapper resumes somegen() and > passes in the error. If you can only specify the value but not the > traceback, you lose the information about where the error occurred in > othergen(). 
> > So, the feature is necessary for anything other than "simple" (i.e. > single-frame) coroutines, at least if you want to retain any possibility of > debugging. :) OK. I think you must be describing continuations there, because my brain just exploded. :-) In Python 3000 I want to make the traceback a standard attribute of Exception instances; would that suffice? I really don't want to pass the whole (type, value, traceback) triple that currently represents an exception through __next__(). > Yes, it would be nice. Also, you may have just come up with an even better > word for what these things should be called... patterns. Perhaps they > could be called "pattern blocks" or "patterned blocks". Pattern sounds so > much more hip and politically correct than "macro" or even "code block". :) Yes, but the word has a much loftier meaning. I could get used to template blocks though (template being a specific pattern, and this whole thing being a non-OO version of the Template Method Pattern from the GoF book). > >An alternative that solves this would be to give __next__() a second > >argument, which is a bool that should be true when the first argument > >is an exception that should be raised. What do people think? > > I think it'd be simpler just to have two methods, conceptually > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > names are. Part of me likes this suggestion, but part of me worries that it complicates the iterator API too much. Your resume() would be __next__(), but that means your error() would become __error__(). This is more along the lines of PEP 288 and PEP 325 (and even PEP 310), but we have a twist here in that it is totally acceptable (see my example) for __error__() to return the next value or raise StopIteration. IOW the return behavior of __error__() is the same as that of __next__(). Fredrik, what does your intuition tell you? 
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From shane at hathawaymix.org Wed Apr 27 23:54:31 2005
From: shane at hathawaymix.org (Shane Hathaway)
Date: Wed Apr 27 23:52:39 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>
References: <20050427140959.qhpyf65lqkls8kkg@mcherm.com>
Message-ID: <42700A17.9020905@hathawaymix.org>

Michael Chermside wrote:
>     if x == 1:        |    if condition_1:
>         do_1()        |        y = 1
>     elif x == 2:      |    elif condition_2:
>         do_2()        |        y = 2
>     elif x == 3:      |    elif condition_3:
>         do_3()        |        y = 3
>     else:             |    else:
>         default()     |        y = 4

This inspired a twisted thought: if you just redefine truth, you don't
have to repeat the variable. <0.9 wink>

    True = x
    if 1:
        do_1()
    elif 2:
        do_2()
    elif 3:
        do_3()
    else:
        default()

Shane

From gvanrossum at gmail.com Wed Apr 27 23:57:01 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Wed Apr 27 23:57:12 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1>
References: <ca471dc20504270030405f922f@mail.gmail.com>
	<426F7A8F.8090109@zope.com>
	<n2m-g.Xns9645880BB6DDCduncanrcpcouk@127.0.0.1>
Message-ID: <ca471dc205042714575e98d89c@mail.gmail.com>

> If the iterator fails to re-raise the StopIteration exception (the spec
> only says it should, not that it must) I think the return would be ignored
> but a subsequent exception would then get converted into a return value. I
> think the flag needs to be reset to avoid this case.

Good catch. I've fixed this in the PEP.

> Also, I wonder whether other exceptions from next() shouldn't be handled a
> bit differently. If BLOCK1 throws an exception, and this causes the
> iterator to also throw an exception, then one exception will be lost. I
> think it would be better to propagate the original exception rather than
> the second exception.

I don't think so. It's similar to this case:

    try:
        raise Foo
    except:
        raise Bar

Here, Foo is also lost.
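[With hindsight: PEP 3134 later softened this, so the first exception is not completely lost; it survives as the __context__ attribute of the second. A short sketch:]

```python
def chained():
    try:
        raise KeyError("Foo")
    except KeyError:
        # Raising a new exception here implicitly chains: the KeyError
        # becomes the ValueError's __context__ instead of vanishing.
        raise ValueError("Bar")

try:
    chained()
except ValueError as err:
    print(type(err.__context__).__name__)  # KeyError
```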
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Wed Apr 27 23:59:42 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Wed Apr 27 23:59:50 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426F7A8F.8090109@zope.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426F7A8F.8090109@zope.com> Message-ID: <ca471dc205042714596c053236@mail.gmail.com> [Jim Fulton] > 2. I assume it would be a hack to try to use block statements to implement > something like interfaces or classes, because doing so would require > significant local-variable manipulation. I'm guessing that > either implementing interfaces (or implementing a class statement > in which the class was created before execution of a suite) > is not a use case for this PEP. I would like to get back to the discussion about interfaces and signature type declarations at some point, and a syntax dedicated to declaring interfaces is high on my wish list. In the mean time, if you need interfaces today, I think using metaclasses would be easier than using a block-statement (if it were even possible using the latter without passing locals() to the generator). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Thu Apr 28 00:00:40 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 28 00:00:48 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <426FF412.7010709@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> Message-ID: <42700B88.40703@gmail.com> Brett C. wrote: > And while the thought is in my head, I think block statements should be viewed > less as a tweaked version of a 'for' loop and more as an extension to > generators that happens to be very handy for resource management (while > allowing iterators to come over and play on the new swing set as well). I > think if you take that view then the argument that they are too similar to > 'for' loops loses some luster (although I doubt Nick is going to be buy this =) . I'm surprisingly close to agreeing with you, actually. I've worked out that it isn't the looping that I object to, it's the inability to get out of the loop without exhausting the entire iterator. I need to think about some ideas involving iterator factories, then my objections may disappear. Cheers, Nick. 
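[With hindsight: the non-looping, generator-driven resource block being discussed here is roughly what eventually shipped as PEP 343's with statement. A sketch in today's Python, where opened() is a made-up manager, not anything proposed in this thread:]

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def opened(path):
    """Resource block: run the body with an open file, then close it."""
    f = open(path, "w")
    try:
        yield f          # the body of the with-block runs here, exactly once
    finally:
        f.close()        # runs on normal exit, break, return, or exception

path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with opened(path) as f:
    f.write("hello")
print(f.closed)  # True
```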
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From ncoghlan at gmail.com Thu Apr 28 00:07:54 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu Apr 28 00:08:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042713277846852d@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> Message-ID: <42700D3A.5020208@gmail.com> Guido van Rossum wrote: > An alternative that solves this would be to give __next__() a second > argument, which is a bool that should be true when the first argument > is an exception that should be raised. What do people think? > > I'll add this to the PEP as an alternative for now. An optional third argument (raise=False) seems a lot friendlier (and more flexible) than a typecheck. Yet another alternative would be for the default behaviour to be to raise Exceptions, and continue with anything else, and have the third argument be "raise_exc=True" and set it to False to pass an exception in without raising it. Cheers, Nick. 
-- 
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net

From gvanrossum at gmail.com Thu Apr 28 00:16:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu Apr 28 00:16:22 2005
Subject: [Python-Dev] Integrating PEP 310 with PEP 340
In-Reply-To: <426F9347.6000505@iinet.net.au>
References: <426F9347.6000505@iinet.net.au>
Message-ID: <ca471dc205042715165dede48d@mail.gmail.com>

[Nick Coghlan]
> This is my attempt at a coherent combination of what I like about both proposals
> (as opposed to my assortment of half-baked attempts scattered through the
> existing discussion).
>
> PEP 340 has many ideas I like:
> - enhanced yield statements and yield expressions
> - enhanced continue and break
> - generator finalisation
> - 'next' builtin and associated __next__() slot
> - changes to 'for' loop
>
> One restriction I don't like is the limitation to ContinueIteration and
> StopIteration as arguments to next(). The proposed semantics and conventions for
> ContinueIteration and StopIteration are fine, but I would like to be able to
> pass _any_ exception in to the generator, allowing the generator to decide if a
> given exception justifies halting the iteration.

I'm close to dropping this if we can agree on the API for passing
exceptions into __next__(); see the section "Alternative __next__() and
Generator Exception Handling" that I just added to the PEP.

> The _major_ part I don't like is that the block statement's semantics are too
> similar to those of a 'for' loop. I would like to see a new construct that can
> do things a for loop can't do, and which can be used in _conjunction_ with a for
> loop, to provide greater power than either construct on their own.

While both 'block' and 'for' are looping constructs, their handling of the
iterator upon premature exit is entirely different, and it's hard to
reconcile these two before Python 3000.
> PEP 310 forms the basis for a block construct that I _do_ like. The question > then becomes whether or not generators can be used to write useful PEP 310 style > block managers (I think they can, in a style very similar to that of the looping > block construct from PEP 340). I've read through your example, and I'm not clear why you think this is better. It's a much more complex API with less power. What's your use case? Why should 'block' be disallowed from looping? TOOWTDI or do you have something better? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 00:22:00 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 00:29:37 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42700D3A.5020208@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> Message-ID: <ca471dc20504271522e79ce4a@mail.gmail.com> [Guido] > > An alternative that solves this would be to give __next__() a second > > argument, which is a bool that should be true when the first argument > > is an exception that should be raised. What do people think? > > > > I'll add this to the PEP as an alternative for now. [Nick] > An optional third argument (raise=False) seems a lot friendlier (and more > flexible) than a typecheck. I think I agree, especially since Phillip's alternative (a different method) is even worse IMO. 
> Yet another alternative would be for the default behaviour to be to raise > Exceptions, and continue with anything else, and have the third argument be > "raise_exc=True" and set it to False to pass an exception in without raising it. You've lost me there. If you care about this, can you write it up in more detail (with code samples or whatever)? Or we can agree on a 2nd arg to __next__() (and a 3rd one to next()). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Thu Apr 28 00:38:53 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Apr 28 00:35:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427145022e8985f@mail.gmail.com> References: <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> At 02:50 PM 4/27/05 -0700, Guido van Rossum wrote: >[Guido] > > >I'm not sure what the relevance of including a stack trace would be, > > >and why that feature would be necessary to call them coroutines. > >[Phillip] > > Well, you need that feature in order to retain traceback information when > > you're simulating threads with a stack of generators. 
Although you can't
> > return from a generator inside a nested generator, you can simulate this by
> > keeping a stack of generators and having a wrapper that passes control
> > between generators, such that:
> >
> >     def somegen():
> >         result = yield othergen()
> >
> > causes the wrapper to push othergen() on the generator stack and execute
> > it. If othergen() raises an error, the wrapper resumes somegen() and
> > passes in the error. If you can only specify the value but not the
> > traceback, you lose the information about where the error occurred in
> > othergen().
> >
> > So, the feature is necessary for anything other than "simple" (i.e.
> > single-frame) coroutines, at least if you want to retain any possibility of
> > debugging. :)
>
>OK. I think you must be describing continuations there, because my
>brain just exploded. :-)

Probably my attempt at a *brief* explanation backfired. No, they're not
continuations or anything nearly that complicated. I'm "just" simulating
threads using generators that yield a nested generator when they need to do
something that might block waiting for I/O. The pseudothread object pushes
the yielded generator-iterator and resumes it. If that generator-iterator
raises an error, the pseudothread catches it, pops the previous
generator-iterator, and passes the error into it, traceback and all.

The net result is that as long as you use a "yield expression" for any
function/method call that might do blocking I/O, and those functions or
methods are written as generators, you get the benefits of Twisted (async
I/O without threading headaches) without having to "twist" your code into
the callback-registration patterns of Twisted. And, by passing in errors
with tracebacks, the normal process of exception call-stack unwinding
combined with pseudothread stack popping results in a traceback that looks
just as if you had called the functions or methods normally, rather than
via the pseudothreading mechanism.
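[A minimal sketch of the pseudothread wrapper described above, written with the send()/throw() generator methods that PEP 342 later added; this is illustrative, not the actual implementation, and all names are hypothetical:]

```python
def run(task):
    # Trampoline: yielding a generator "calls" it; returning from a
    # generator pops back to its caller; an exception is thrown into
    # the caller, so the traceback threads through the whole stack.
    stack = [task]
    value = None
    error = None
    while stack:
        gen = stack[-1]
        try:
            if error is not None:
                err, error = error, None
                yielded = gen.throw(err)
            else:
                yielded = gen.send(value)
        except StopIteration as stop:
            stack.pop()
            value = stop.value        # the generator's return value
        except Exception as err:
            stack.pop()
            if not stack:
                raise                 # nobody left to handle it
            error = err               # pass the failure up one frame
        else:
            if hasattr(yielded, "send"):
                stack.append(yielded) # nested "call": enter the sub-generator
                value = None
            else:
                value = yielded       # plain value: echo it back
    return value

def inner():
    return 42                         # generators may return values since 3.3
    yield                             # unreachable; marks this as a generator

def outer():
    result = yield inner()            # looks like a call, runs via the trampoline
    return result * 2

print(run(outer()))  # 84
```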
Without that, you would only get the error context of 'async_readline()', because the traceback wouldn't be able to show who *called* async_readline. >In Python 3000 I want to make the traceback a standard attribute of >Exception instances; would that suffice? If you're planning to make 'raise' reraise it, such that 'raise exc' is equivalent to 'raise type(exc), exc, exc.traceback'. Is that what you mean? (i.e., just making it easier to pass the darn things around) If so, then I could probably do what I need as long as there exist no error types whose instances disallow setting a 'traceback' attribute on them after the fact. Of course, if Exception provides a slot (or dictionary) for this, then it shouldn't be a problem. Of course, it seems to me that you also have the problem of adding to the traceback when the same error is reraised... All in all it seems more complex than just allowing an exception and a traceback to be passed. >I really don't want to pass >the whole (type, value, traceback) triple that currently represents an >exception through __next__(). The point of passing it in is so that the traceback can be preserved without special action in the body of generators the exception is passing through. I could be wrong, but it seems to me you need this even for PEP 340, if you're going to support error management templates, and want tracebacks to include the line in the block where the error originated. Just reraising the error inside the generator doesn't seem like it would be enough. > > >An alternative that solves this would be to give __next__() a second > > >argument, which is a bool that should be true when the first argument > > >is an exception that should be raised. What do people think? > > > > I think it'd be simpler just to have two methods, conceptually > > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > > names are. 
> >Part of me likes this suggestion, but part of me worries that it >complicates the iterator API too much. I was thinking that maybe these would be a "coroutine API" or "generator API" instead. That is, something not usable except with generator-iterators and with *new* objects written to conform to it. I don't really see a lot of value in making template blocks work with existing iterators. For that matter, I don't see a lot of value in hand-writing new objects with resume/error, instead of just using a generator. So, I guess I'm thinking you'd have something like tp_block_resume and tp_block_error type slots, and generators' tp_iter_next would just be the same as tp_block_resume(None). But maybe this is the part you're thinking is complicated. :) From tcdelaney at optusnet.com.au Thu Apr 28 00:34:59 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 00:39:19 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <00c501c54b79$58655890$0201a8c0@ryoko> Guido van Rossum wrote: > - temporarily sidestepping the syntax by proposing 'block' instead of > 'with' > - __next__() argument simplified to StopIteration or > ContinueIteration instance > - use "continue EXPR" to pass a value to the generator > - generator exception handling explained +1 A minor sticking point - I don't like that the generator has to re-raise any ``StopIteration`` passed in. Would it be possible to have the semantics be: If a generator is resumed with ``StopIteration``, the exception is raised at the resumption point (and stored for later use). When the generator exits normally (i.e. 
``return`` or falls off the end) it re-raises the stored exception (if any) or raises a new ``StopIteration`` exception. So a generator would become effectively::

    try:
        stopexc = None
        exc = None
        BLOCK1
    finally:
        if exc is not None:
            raise exc
        if stopexc is not None:
            raise stopexc
        raise StopIteration

where within BLOCK1:

``raise <exception>`` is equivalent to::

    exc = <exception>
    return

The start of an ``except`` clause sets ``exc`` to None (if the clause is executed, of course).

Calling ``__next__(exception)`` with ``StopIteration`` is equivalent to::

    stopexc = exception
    (raise exception at resumption point)

Calling ``__next__(exception)`` with ``ContinueIteration`` is equivalent to::

    (resume execution with exception.value)

Calling ``__next__(exception)`` with any other value just raises that value at the resumption point - this allows for calling with arbitrary exceptions. Also, within a for-loop or block-statement, we could have ``raise <exception>`` be equivalent to::

    arg = <exception>
    continue

This also takes care of Brett's concern about distinguishing between exceptions and values passed to the generator. Anything except StopIteration or ContinueIteration will be presumed to be an exception and will be raised. Anything passed via ContinueIteration is a value.
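Tim's dispatch rules above can be sketched against the generator methods Python eventually grew (send()/throw(), standardized later by PEP 342). ``ContinueIteration`` here is the hypothetical value-carrier from the PEP 340 draft, not a real builtin, and ``resume`` is an illustrative name, not proposed API:

```python
class ContinueIteration(Exception):
    """Hypothetical PEP 340 carrier: wraps a value passed back in."""
    def __init__(self, value=None):
        Exception.__init__(self)
        self.value = value

def resume(gen, exception=None):
    # None: plain resumption; ContinueIteration: deliver its value at
    # the suspended yield; anything else: raise it at the resumption
    # point.  This mirrors Tim's three cases.
    if exception is None:
        return gen.send(None)
    if isinstance(exception, ContinueIteration):
        return gen.send(exception.value)
    return gen.throw(exception)

def doubler():
    got = yield "ready"
    yield got * 2

it = doubler()
print(resume(it))                         # ready
print(resume(it, ContinueIteration(21)))  # 42
```

Anything that is neither ``None`` nor a ``ContinueIteration`` is presumed to be an exception and is raised inside the generator, matching the last rule above.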
Tim Delaney From gvanrossum at gmail.com Thu Apr 28 00:58:14 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 00:58:17 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> Message-ID: <ca471dc205042715585917829f@mail.gmail.com> [Phillip] > Probably my attempt at a *brief* explanation backfired. No, they're not > continuations or anything nearly that complicated. I'm "just" simulating > threads using generators that yield a nested generator when they need to do > something that might block waiting for I/O. The pseudothread object pushes > the yielded generator-iterator and resumes it. If that generator-iterator > raises an error, the pseudothread catches it, pops the previous > generator-iterator, and passes the error into it, traceback and all. > > The net result is that as long as you use a "yield expression" for any > function/method call that might do blocking I/O, and those functions or > methods are written as generators, you get the benefits of Twisted (async > I/O without threading headaches) without having to "twist" your code into > the callback-registration patterns of Twisted. And, by passing in errors > with tracebacks, the normal process of exception call-stack unwinding > combined with pseudothread stack popping results in a traceback that looks > just as if you had called the functions or methods normally, rather than > via the pseudothreading mechanism. 
Without that, you would only get the > error context of 'async_readline()', because the traceback wouldn't be able > to show who *called* async_readline. OK, I sort of get it, at a very high-level, although I still feel this is wildly out of my league. I guess I should try it first. ;-) > >In Python 3000 I want to make the traceback a standard attribute of > >Exception instances; would that suffice? > > If you're planning to make 'raise' reraise it, such that 'raise exc' is > equivalent to 'raise type(exc), exc, exc.traceback'. Is that what you > mean? (i.e., just making it easier to pass the darn things around) > > If so, then I could probably do what I need as long as there exist no error > types whose instances disallow setting a 'traceback' attribute on them > after the fact. Of course, if Exception provides a slot (or dictionary) > for this, then it shouldn't be a problem. Right, this would be a standard part of the Exception base class, just like in Java. > Of course, it seems to me that you also have the problem of adding to the > traceback when the same error is reraised... I think when it is re-raised, no traceback entry should be added; the place that re-raises it should not show up in the traceback, only the place that raised it in the first place. To me that's the essence of re-raising (and I think that's how it works when you use raise without arguments). > All in all it seems more complex than just allowing an exception and a > traceback to be passed. Making the traceback a standard attribute of the exception sounds simpler; having to keep track of two separate arguments that are as closely related as an exception and the corresponding traceback is more complex IMO. The only reason why it isn't done that way in current Python is that it couldn't be done that way back when exceptions were strings. > >I really don't want to pass > >the whole (type, value, traceback) triple that currently represents an > >exception through __next__(). 
> > The point of passing it in is so that the traceback can be preserved > without special action in the body of generators the exception is passing > through. > > I could be wrong, but it seems to me you need this even for PEP 340, if > you're going to support error management templates, and want tracebacks to > include the line in the block where the error originated. Just reraising > the error inside the generator doesn't seem like it would be enough. *** I have to think about this more... *** > > > I think it'd be simpler just to have two methods, conceptually > > > "resume(value=None)" and "error(value,tb=None)", whatever the actual method > > > names are. > > > >Part of me likes this suggestion, but part of me worries that it > >complicates the iterator API too much. > > I was thinking that maybe these would be a "coroutine API" or "generator > API" instead. That is, something not usable except with > generator-iterators and with *new* objects written to conform to it. I > don't really see a lot of value in making template blocks work with > existing iterators. (You mean existing non-generator iterators, right? existing *generators* will work just fine -- the exception will pass right through them and that's exactly the right default semantics. Existing non-generator iterators are indeed a different case, and this is actually an argument for having a separate API: if the __error__() method doesn't exist, the exception is just re-raised rather than bothering the iterator. OK, I think I'm sold. > For that matter, I don't see a lot of value in > hand-writing new objects with resume/error, instead of just using a generator. Not a lot, but I expect that there may be a few, like an optimized version of lock synchronization. > So, I guess I'm thinking you'd have something like tp_block_resume and > tp_block_error type slots, and generators' tp_iter_next would just be the > same as tp_block_resume(None). > > But maybe this is the part you're thinking is complicated. 
:) No, this is where I feel right at home. ;-) I hadn't thought much about the C-level slots yet, but this is a reasonable proposal. Time to update the PEP; I'm pretty much settled on these semantics now... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 01:01:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 01:02:02 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <00c501c54b79$58655890$0201a8c0@ryoko> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> Message-ID: <ca471dc205042716017a85d241@mail.gmail.com> > A minor sticking point - I don't like that the generator has to re-raise any > ``StopIteration`` passed in. Would it be possible to have the semantics be: > > If a generator is resumed with ``StopIteration``, the exception is raised > at the resumption point (and stored for later use). When the generator > exits normally (i.e. ``return`` or falls off the end) it re-raises the > stored exception (if any) or raises a new ``StopIteration`` exception. I don't like the idea of storing exceptions. Let's just say that we don't care whether it re-raises the very same StopIteration exception that was passed in or a different one -- it's all moot anyway because the StopIteration instance is thrown away by the caller of next(). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tcdelaney at optusnet.com.au Thu Apr 28 01:03:37 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 01:03:47 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com><ca471dc205042416572da9db71@mail.gmail.com><426DB7C8.5020708@canterbury.ac.nz><ca471dc2050426043713116248@mail.gmail.com><426E3B01.1010007@canterbury.ac.nz><ca471dc205042621472b1f6edf@mail.gmail.com><ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> Message-ID: <00e001c54b7d$5888c240$0201a8c0@ryoko> Tim Delaney wrote: > Also, within a for-loop or block-statement, we could have ``raise > <exception>`` be equivalent to:: > > arg = <exception> > continue For this to work, builtin next() would need to be a bit smarter ... specifically, for an old-style iterator, any non-Iteration exception would need to be re-raised there. Tim Delaney From tcdelaney at optusnet.com.au Thu Apr 28 01:07:00 2005 From: tcdelaney at optusnet.com.au (Tim Delaney) Date: Thu Apr 28 01:07:10 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> <ca471dc205042716017a85d241@mail.gmail.com> Message-ID: <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> Guido van Rossum wrote: >> A minor sticking point - I don't like that the generator has to >> re-raise any ``StopIteration`` passed in. Would it be possible to >> have the semantics be: >> >> If a generator is resumed with ``StopIteration``, the exception >> is raised at the resumption point (and stored for later use). >> When the generator exits normally (i.e. 
``return`` or falls off >> the end) it re-raises the stored exception (if any) or raises a >> new ``StopIteration`` exception. > > I don't like the idea of storing exceptions. Let's just say that we > don't care whether it re-raises the very same StopIteration exception > that was passed in or a different one -- it's all moot anyway because > the StopIteration instance is thrown away by the caller of next(). OK - so what is the point of the sentence::

    The generator should re-raise this exception; it should not yield
    another value.

when discussing StopIteration? Tim Delaney From gvanrossum at gmail.com Thu Apr 28 01:17:32 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 01:17:35 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <00c501c54b79$58655890$0201a8c0@ryoko> <ca471dc205042716017a85d241@mail.gmail.com> <00e401c54b7d$d17e7cd0$0201a8c0@ryoko> Message-ID: <ca471dc205042716173c992c2c@mail.gmail.com> > OK - so what is the point of the sentence:: > > The generator should re-raise this exception; it should not yield > another value. > > when discussing StopIteration? It forbids returning a value, since that would mean the generator could "refuse" a break or return statement, which is a little bit too weird (returning a value instead would turn these into continue statements). I'll change this to clarify that I don't care about the identity of the StopIteration instance. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bac at OCF.Berkeley.EDU Thu Apr 28 01:56:33 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Apr 28 01:56:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42700B88.40703@gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> <42700B88.40703@gmail.com> Message-ID: <427026B1.4020002@ocf.berkeley.edu> Nick Coghlan wrote: > Brett C. wrote: > >> And while the thought is in my head, I think block statements should >> be viewed >> less as a tweaked version of a 'for' loop and more as an extension to >> generators that happens to be very handy for resource management (while >> allowing iterators to come over and play on the new swing set as >> well). I >> think if you take that view then the argument that they are too >> similar to >> 'for' loops loses some luster (although I doubt Nick is going to >> buy this =) . > > > I'm surprisingly close to agreeing with you, actually. I've worked out > that it isn't the looping that I object to, it's the inability to get > out of the loop without exhausting the entire iterator. > 'break' isn't enough for you as laid out by the proposal? The raising of StopIteration, which is what 'break' does according to the standard, should be enough to stop the loop without exhausting things. Same way you stop a 'for' loop from executing entirely. -Brett From pje at telecommunity.com Thu Apr 28 02:01:38 2005 From: pje at telecommunity.com (Phillip J.
Eby) Date: Thu Apr 28 01:58:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042715585917829f@mail.gmail.com> References: <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> At 03:58 PM 4/27/05 -0700, Guido van Rossum wrote: >OK, I sort of get it, at a very high-level, although I still feel this >is wildly out of my league. > >I guess I should try it first. ;-) It's not unlike David Mertz' articles on implementing coroutines and multitasking using generators, except that I'm adding more "debugging sugar", if you will, by making the tracebacks look normal. It's just that the *how* requires me to pass the traceback into the generator. At the moment, I accomplish that by doing a 3-argument raise inside of 'events.resume()', but it would be really nice to be able to get rid of 'events.resume()' in a future version of Python. > > Of course, it seems to me that you also have the problem of adding to the > > traceback when the same error is reraised... > >I think when it is re-raised, no traceback entry should be added; the >place that re-raises it should not show up in the traceback, only the >place that raised it in the first place. To me that's the essence of >re-raising (and I think that's how it works when you use raise without >arguments). I think maybe I misspoke. I mean adding to the traceback *so* that when the same error is reraised, the intervening frames are included, rather than lost. 
In other words, IIRC, the traceback chain is normally increased by one entry for each frame the exception escapes. However, if you start hiding that inside of the exception instance, you'll have to modify it instead of just modifying the threadstate. Does that make sense, or am I missing something? > > For that matter, I don't see a lot of value in > > hand-writing new objects with resume/error, instead of just using a > generator. > >Not a lot, but I expect that there may be a few, like an optimized >version of lock synchronization. My point was mainly that we can err on the side of caller convenience rather than callee convenience, if there are fewer implementations. So, e.g. multiple methods aren't a big deal if it makes the 'block' implementation simpler, if only generators and a handful of special template objects are going need to implement the block API. > > So, I guess I'm thinking you'd have something like tp_block_resume and > > tp_block_error type slots, and generators' tp_iter_next would just be the > > same as tp_block_resume(None). > >I hadn't thought much about the C-level slots yet, but this is a >reasonable proposal. Note that it also doesn't require a 'next()' builtin, or a next vs. __next__ distinction, if you don't try to overload iteration and templating. The fact that a generator can be used for templating, doesn't have to imply that any iterator should be usable as a template, or that the iteration protocol is involved in any way. You could just have __resume__/__error__ matching the tp_block_* slots. This also has the benefit of making the delineation between template blocks and for loops more concrete. For example, this: block open("filename") as f: ... could be an immediate TypeError (due to the lack of a __resume__) instead of biting you later on in the block when you try to do something with f, or because the block is repeating for each line of the file, etc. 
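The pseudothread scheme Phillip describes -- a stack of generators in which errors are thrown back into the caller's frame so the traceback spans the whole logical call chain -- can be sketched with the send()/throw() generator methods that later arrived via PEP 342. The names and structure below are illustrative only, not Phillip's actual implementation:

```python
def pseudothread(root):
    """Drive a stack of generators.  A generator "calls" another by
    yielding it; errors are thrown into the caller's frame so the
    traceback shows the entire logical call chain."""
    stack, value, error = [root], None, None
    while stack:
        gen = stack[-1]
        try:
            if error is not None:
                pending, error = error, None  # consume the pending error
                result = gen.throw(pending)   # re-raise at caller's yield
            else:
                result = gen.send(value)      # send(None) also starts a gen
        except StopIteration as stop:
            stack.pop()
            value = stop.value                # generator's return value
        except BaseException as exc:
            stack.pop()
            if not stack:
                raise                         # bubbled past the root
            error = exc                       # pass upward, traceback intact
        else:
            if hasattr(result, "send"):       # yielded a generator:
                stack.append(result)          # "call" it
                value = None
            else:
                value = result                # plain value: echo it back
    return value

def child():
    return 42
    yield                                     # makes this a generator

def parent():
    v = yield child()                         # "call" child, get its result
    return v + 1
```

``pseudothread(parent())`` evaluates to 43; had ``child`` raised instead, the error would propagate out with both frames in its traceback, which is exactly the debugging property Phillip is after.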
From nas at arctrix.com Thu Apr 28 02:02:23 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu Apr 28 02:02:27 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504270030405f922f@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> Message-ID: <20050428000223.GA8869@mems-exchange.org> On Wed, Apr 27, 2005 at 12:30:22AM -0700, Guido van Rossum wrote: > I've written a PEP about this topic. It's PEP 340: Anonymous Block > Statements (http://python.org/peps/pep-0340.html). [Note: most of these comments are based on version 1.2 of the PEP] It seems like what you are proposing is a limited form of coroutines. Just as Python's generators are limited (yield can only jump up one stack frame), these coroutines have a similar limitation. Someone mentioned that we are edging closer to continuations. I think that may be a good thing. One big difference between what you propose and general continuations is in finalization semantics. I don't think anyone has figured out a way for try/finally to work with continuations. The fact that try/finally can be used inside generators is a significant feature of this PEP, IMO. Regarding the syntax, I actually quite like the 'block' keyword. It doesn't seem so surprising that the block may be a loop. Allowing 'continue' to have an optional value is elegant syntax. I'm a little bit concerned about what happens if the iterator does not expect a value. If I understand the PEP, it is silently ignored. That seems like it could hide bugs. OTOH, it doesn't seem any worse then a caller not expecting a return value. It's interesting that there is such similarity between 'for' and 'block'. Why is it that block does not call iter() on EXPR1? 
I guess the fact that 'break' and 'return' work differently is a more significant difference. After thinking about this more, I wonder if iterators meant for 'for' loops and iterators meant for 'block' statements are really very different things. It seems like a block-iterator really needs to handle yield-expressions. I wonder if generators that contain a yield-expression should properly be called coroutines. Practically, I suspect it would just cause confusion. Perhaps passing an Iteration instance to next() should not be treated the same as passing None. It seems like that would make implementing the iterator easier. Why not treat Iteration like any normal value? Then only None, StopIteration, and ContinueIteration would be special. Argh, it took me so long to write this that you are already up to version 1.6 of the PEP. Time to start a new message. :-) Neil From bac at OCF.Berkeley.EDU Thu Apr 28 02:18:19 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 28 02:18:47 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc20504271522e79ce4a@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> <ca471dc20504271522e79ce4a@mail.gmail.com> Message-ID: <42702BCB.9000500@ocf.berkeley.edu> Guido van Rossum wrote: > [Guido] > >>>An alternative that solves this would be to give __next__() a second >>>argument, which is a bool that should be true when the first argument >>>is an exception that should be raised. What do people think? >>> >>>I'll add this to the PEP as an alternative for now.
> > > [Nick] > >>An optional third argument (raise=False) seems a lot friendlier (and more >>flexible) than a typecheck. > > > I think I agree, especially since Phillip's alternative (a different > method) is even worse IMO. > The extra argument works for me as well. > >>Yet another alternative would be for the default behaviour to be to raise >>Exceptions, and continue with anything else, and have the third argument be >>"raise_exc=True" and set it to False to pass an exception in without raising it. > > > You've lost me there. If you care about this, can you write it up in > more detail (with code samples or whatever)? Or we can agree on a 2nd > arg to __next__() (and a 3rd one to next()). > Channeling Nick, I think he is saying that the raising argument should be made True by default and be named 'raise_exc'. -Brett From gvanrossum at gmail.com Thu Apr 28 02:19:08 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 02:19:12 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> Message-ID: <ca471dc2050427171967cec0ac@mail.gmail.com> [Phillip] > It's not unlike David Mertz' articles on implementing coroutines and > multitasking using generators, except that I'm adding more "debugging > sugar", if you will, by making the tracebacks look normal. It's just that > the *how* requires me to pass the traceback into the generator. 
At the > moment, I accomplish that by doing a 3-argument raise inside of > 'events.resume()', but it would be really nice to be able to get rid of > 'events.resume()' in a future version of Python. I'm not familiar with Mertz' articles and frankly I still fear it's head-explosive material. ;-) > I think maybe I misspoke. I mean adding to the traceback *so* that when > the same error is reraised, the intervening frames are included, rather > than lost. > > In other words, IIRC, the traceback chain is normally increased by one > entry for each frame the exception escapes. However, if you start hiding > that inside of the exception instance, you'll have to modify it instead of > just modifying the threadstate. Does that make sense, or am I missing > something? Adding to the traceback chain already in the exception object is totally kosher, if that's where the traceback is kept. > My point was mainly that we can err on the side of caller convenience > rather than callee convenience, if there are fewer implementations. So, > e.g. multiple methods aren't a big deal if it makes the 'block' > implementation simpler, if only generators and a handful of special > template objects are going need to implement the block API. Well, the way my translation is currently written, writing next(itr, arg, exc) is a lot more convenient for the caller than having to write::

    # if exc is True, arg is an exception; otherwise arg is a value
    if exc:
        err = getattr(itr, "__error__", None)
        if err is not None:
            VAR1 = err(arg)
        else:
            raise arg
    else:
        VAR1 = next(itr, arg)

but since this will actually be code generated by the bytecode compiler, I think callee convenience is more important. And the ability to default __error__ to raise the exception makes a lot of sense.
And we could wrap all this inside the next() built-in -- even if the actual object should have separate __next__() and __error__() methods, the user-facing built-in next() function might take an extra flag to indicate that the argument is an exception, and to handle it appropriately (as shown above). > > > So, I guess I'm thinking you'd have something like tp_block_resume and > > > tp_block_error type slots, and generators' tp_iter_next would just be the > > > same as tp_block_resume(None). > > > >I hadn't thought much about the C-level slots yet, but this is a > >reasonable proposal. > > Note that it also doesn't require a 'next()' builtin, or a next vs. > __next__ distinction, if you don't try to overload iteration and > templating. The fact that a generator can be used for templating, doesn't > have to imply that any iterator should be usable as a template, or that the > iteration protocol is involved in any way. You could just have > __resume__/__error__ matching the tp_block_* slots. > > This also has the benefit of making the delineation between template blocks > and for loops more concrete. For example, this:

>     block open("filename") as f:
>         ...

> could be an immediate TypeError (due to the lack of a __resume__) instead > of biting you later on in the block when you try to do something with f, or > because the block is repeating for each line of the file, etc. I'm not convinced of that, especially since all *generators* will automatically be usable as templates, whether or not they were intended as such. And why *shouldn't* you be allowed to use a block for looping, if you like the exit behavior (guaranteeing that the iterator is exhausted when you leave the block in any way)?
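Guido's user-facing next() that routes exceptions through __error__ maps almost directly onto the throw() method generators ended up with. A sketch under that assumption (``block_next`` and its flag protocol are hypothetical names for illustration -- PEP 340 was never adopted in this form):

```python
def block_next(itr, arg=None, is_exc=False):
    # When is_exc is true, hand the exception to the iterator if it can
    # accept one (throw() standing in for the proposed __error__);
    # otherwise raise it on the iterator's behalf.
    if is_exc:
        thrower = getattr(itr, "throw", None)
        if thrower is None:
            raise arg
        return thrower(arg)
    # Plain resumption: send a value into a generator, or just advance
    # an ordinary iterator.
    if hasattr(itr, "send"):
        return itr.send(arg)
    return next(itr)

def managed(items):
    # A resource-management template in the style under discussion; a
    # list stands in for a real resource to keep the sketch runnable.
    resource = list(items)
    try:
        yield resource
    finally:
        resource.clear()          # "close" on any exit
```

Driving it by hand: ``block_next(it)`` enters the template, and ``block_next(it, ValueError(), is_exc=True)`` simulates an exception escaping the block; the ``finally`` clause runs either way, while a plain iterator without ``throw`` simply has the exception re-raised for it.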
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Thu Apr 28 02:43:19 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Thu Apr 28 02:43:21 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050428000223.GA8869@mems-exchange.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> Message-ID: <ca471dc2050427174368f4ee3d@mail.gmail.com> > It seems like what you are proposing is a limited form of > coroutines. Well, I thought that's already what generators were -- IMO there isn't much news there. We're providing a more convenient way to pass a value back, but that's always been possible (see Fredrik's examples). > Allowing 'continue' to have an optional value is elegant syntax. > I'm a little bit concerned about what happens if the iterator does > not expect a value. If I understand the PEP, it is silently > ignored. That seems like it could hide bugs. OTOH, it doesn't seem > any worse then a caller not expecting a return value. Exactly. > It's interesting that there is such similarity between 'for' and > 'block'. Why is it that block does not call iter() on EXPR1? I > guess the fact that 'break' and 'return' work differently is a more > significant difference. Well, perhaps block *should* call iter()? I'd like to hear votes about this. In most cases that would make a block-statement entirely equivalent to a for-loop, the exception being only when there's an exception or when breaking out of an iterator with resource management.
I initially decided it should not call iter() so as to emphasize that this isn't supposed to be used for looping over sequences -- EXPR1 is really expected to be a resource management generator (or iterator). > After thinking about this more, I wonder if iterators meant for > 'for' loops and iterators meant for 'block' statements are really > very different things. It seems like a block-iterator really needs > to handle yield-expressions. But who knows, they might be useful for for-loops as well. After all, passing values back to the generator has been on some people's wish list for a long time. > I wonder if generators that contain a yield-expression should > properly be called coroutines. Practically, I suspect it would just > cause confusion. I have to admit that I haven't looked carefully for use cases for this! I just looked at a few Ruby examples and realized that it would be a fairly simple extension of generators. You can call such generators coroutines, but they are still generators. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Thu Apr 28 02:48:52 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Thu Apr 28 02:48:55 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc205042715585917829f@mail.gmail.com> References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> Message-ID: <20050428004851.GB8869@mems-exchange.org> On Wed, Apr 27, 2005 at 03:58:14PM -0700, Guido van Rossum wrote: > Time to update the PEP; I'm pretty much settled on these semantics > now... 
[I'm trying to do a bit of Guido channeling here. I fear I may not be entirely successful.] The __error__ method seems to simplify things a lot. The purpose of the __error__ method is to notify the iterator that the loop has been exited in some unusual way (i.e. not via a StopIteration raised by the iterator itself). The translation of a block-statement could become:

    itr = EXPR1
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            break
        try:
            arg = None
            BLOCK1
        except Exception, exc:
            err = getattr(itr, '__error__', None)
            if err is None:
                raise exc
            err(exc)

The translation of "continue EXPR2" would become:

    arg = EXPR2
    continue

The translation of "break" inside a block-statement would become:

    err = getattr(itr, '__error__', None)
    if err is not None:
        err(StopIteration())
    break

The translation of "return EXPR3" inside a block-statement would become:

    err = getattr(itr, '__error__', None)
    if err is not None:
        err(StopIteration())
    return EXPR3

For generators, calling __error__ with a StopIteration instance would execute any 'finally' block. Any other argument to __error__ would get re-raised by the generator instance. You could then write:

    def opened(filename):
        fp = open(filename)
        try:
            yield fp
        finally:
            fp.close()

and use it like this:

    block opened(filename) as fp:
        ....

The main difference between 'for' and 'block' is that more iteration may happen after breaking or returning out of a 'for' loop. An iterator used in a block statement is always used up before the block is exited. Maybe __error__ should be called __break__ instead. StopIteration is not really an error. If it is called something like __break__, does it really need to accept an argument? Offhand I can't think of what an iterator might do with an exception. Neil From bac at OCF.Berkeley.EDU Thu Apr 28 02:52:13 2005 From: bac at OCF.Berkeley.EDU (Brett C.)
Date: Thu Apr 28 02:52:20 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427174368f4ee3d@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> <ca471dc2050427174368f4ee3d@mail.gmail.com> Message-ID: <427033BD.4060600@ocf.berkeley.edu> Guido van Rossum wrote: [SNIP] >>It's interesting that there is such similarity between 'for' and >>'block'. Why is it that block does not call iter() on EXPR1? I >>guess that fact that 'break' and 'return' work differently is a more >>significant difference. > > > Well, perhaps block *should* call iter()? I'd like to hear votes about > this. In most cases that would make a block-statement entirely > equivalent to a for-loop, the exception being only when there's an > exception or when breaking out of an iterator with resource > management. > I am -0 on changing it to call iter(). I do like the distinction from a 'for' loop and leaving an emphasis for template blocks (or blocks, or whatever hip term you crazy kids are using for these things at the moment) to use generators. As I said before, I am viewing these blocks as a construct for external control of generators, not as a snazzy 'for' loop. -Brett From pje at telecommunity.com Thu Apr 28 03:00:01 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Thu Apr 28 02:56:26 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427171967cec0ac@mail.gmail.com> References: <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> At 05:19 PM 4/27/05 -0700, Guido van Rossum wrote: >[Phillip] > > This also has the benefit of making the delineation between template blocks > > and for loops more concrete. For example, this: > > > > block open("filename") as f: > > ... > > > > could be an immediate TypeError (due to the lack of a __resume__) instead > > of biting you later on in the block when you try to do something with f, or > > because the block is repeating for each line of the file, etc. > >I'm not convinced of that, especially since all *generators* will >automatically be usable as templates, whether or not they were >intended as such. And why *shouldn't* you be allowed to use a block >for looping, if you like the exit behavior (guaranteeing that the >iterator is exhausted when you leave the block in any way)? It doesn't guarantee that, does it? (Re-reads PEP.) Aha, for *generators* it does, because it says passing StopIteration in, stops execution of the generator. But it doesn't say anything about whether iterators in general are allowed to be resumed afterward, just that they should not yield a value in response to the __next__, IIUC. 
As currently written, it sounds like existing non-generator iterators would not be forced to an exhausted state. As for the generator-vs-template distinction, I'd almost say that argues in favor of requiring some small extra distinction to make a generator template-safe, rather than in favor of making all iterators template-promiscuous, as it were. Perhaps a '@block_template' decorator on the generator? This would have the advantage of documenting the fact that the generator was written with that purpose in mind. It seems to me that using a template block to loop over a normal iterator is a TOOWTDI violation, but perhaps you're seeing something deeper here...? From bac at OCF.Berkeley.EDU Thu Apr 28 03:03:55 2005 From: bac at OCF.Berkeley.EDU (Brett C.) Date: Thu Apr 28 03:04:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050428004851.GB8869@mems-exchange.org> References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org> Message-ID: <4270367B.7000606@ocf.berkeley.edu> Neil Schemenauer wrote: > On Wed, Apr 27, 2005 at 03:58:14PM -0700, Guido van Rossum wrote: > >>Time to update the PEP; I'm pretty much settled on these semantics >>now... > > > [I'm trying to do a bit of Guido channeling here. I fear I may not > be entirely successful.] > > The the __error__ method seems to simplify things a lot. The > purpose of the __error__ method is to notify the iterator that the > loop has been exited in some unusual way (i.e. not via a > StopIteration raised by the iterator itself). 
>
> The translation of a block-statement could become:
>
>     itr = EXPR1
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             break
>         try:
>             arg = None
>             BLOCK1
>         except Exception, exc:
>             err = getattr(itr, '__error__', None)
>             if err is None:
>                 raise exc
>             err(exc)
>
> The translation of "continue EXPR2" would become:
>
>     arg = EXPR2
>     continue
>
> The translation of "break" inside a block-statement would become:
>
>     err = getattr(itr, '__error__', None)
>     if err is not None:
>         err(StopIteration())
>     break
>
> The translation of "return EXPR3" inside a block-statement would become:
>
>     err = getattr(itr, '__error__', None)
>     if err is not None:
>         err(StopIteration())
>     return EXPR3
>
> For generators, calling __error__ with a StopIteration instance
> would execute any 'finally' block. Any other argument to __error__
> would get re-raised by the generator instance.
>
> You could then write:
>
>     def opened(filename):
>         fp = open(filename)
>         try:
>             yield fp
>         finally:
>             fp.close()
>
> and use it like this:
>
>     block opened(filename) as fp:
>         ....

Seems great to me. Clean separation between when the block wants things
to keep going if it can and when it wants to let the generator know it's
all done.

> The main difference between 'for' and 'block' is that more iteration
> may happen after breaking or returning out of a 'for' loop. An
> iterator used in a block statement is always used up before the
> block is exited.

This constant use of the phrase "used up" for these blocks is bugging me
slightly. It isn't like the passed-in generator is having next() called
on it until it stops; it is just finishing up (or cleaning up, choose
your favorite term). It may have had more iterations to go, but the
block signaled it was done and thus the generator got its chance to
finish up and pick up after itself.

> Maybe __error__ should be called __break__ instead.

I like that.

> StopIteration
If it is called something like __break__, > does it really need to accept an argument? Of hand I can't think of > what an iterator might do with an exception. > Could just make the default value be StopIteration. Is there really a perk to __break__ only raising StopIteration and not accepting an argument? The real question of whether people would use the ability of raising other exceptions passed in from the block. If you view yield expressions as method calls, then being able to call __break__ with other exceptions makes sense since you might code up try/except statements within the generator and that will care about what kind of exception gets raised. -Brett From pje at telecommunity.com Thu Apr 28 03:12:19 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Thu Apr 28 03:08:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050427174368f4ee3d@mail.gmail.com> References: <20050428000223.GA8869@mems-exchange.org> <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <20050428000223.GA8869@mems-exchange.org> Message-ID: <5.1.1.6.0.20050427210148.03bf0ae0@mail.telecommunity.com> At 05:43 PM 4/27/05 -0700, Guido van Rossum wrote: >Well, perhaps block *should* call iter()? I'd like to hear votes about >this. In most cases that would make a block-statement entirely >equivalent to a for-loop, the exception being only when there's an >exception or when breaking out of an iterator with resource >management. > >I initially decided it should not call iter() so as to emphasize that >this isn't supposed to be used for looping over sequences -- EXPR1 is >really expected to be a resource management generator (or iterator). 
Which is why I vote for not calling iter(), and further, that blocks not
use the iteration protocol, but rather use a new "block template"
protocol. And finally, that a decorator be used to convert a generator
function to a "template function" (i.e., a function that returns a block
template). I think it's less confusing to have two completely distinct
concepts than to have two things that are very similar, yet different in
a blurry kind of way.

If you want to use a block on an iterator, you can always explicitly do
something like this:

    @blocktemplate
    def iterate(iterable):
        for value in iterable:
            yield value

    block iterate([1,2,3]) as x:
        print x

> > I wonder if generators that contain a yield-expression should
> > properly be called coroutines. Practically, I suspect it would just
> > cause confusion.
>
>I have to admit that I haven't looked carefully for use cases for
>this!

Anything that wants to do co-operative multitasking, basically.

From steven.bethard at gmail.com Thu Apr 28 06:37:36 2005
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu Apr 28 06:37:39 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <20050428004851.GB8869@mems-exchange.org>
References: <ca471dc2050426043713116248@mail.gmail.com> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org>
Message-ID: <d11dcfba050427213772840fcb@mail.gmail.com>

Neil Schemenauer wrote:
> For generators, calling __error__ with a StopIteration instance
> would execute any 'finally' block. Any other argument to __error__
> would get re-raised by the generator instance.

This is only one case, right?
Any exception (including StopIteration) passed to a generator's __error__ method will just be re-raised at the point of the last yield, right? Or is there a need to special-case StopIteration? STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From steven.bethard at gmail.com Thu Apr 28 06:59:27 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Thu Apr 28 06:59:30 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <5.1.1.6.0.20050427193759.032367b0@mail.telecommunity.com> <ca471dc2050427171967cec0ac@mail.gmail.com> <5.1.1.6.0.20050427205110.03b54ec0@mail.telecommunity.com> Message-ID: <d11dcfba05042721597ca99716@mail.gmail.com> Phillip J. Eby wrote: > At 05:19 PM 4/27/05 -0700, Guido van Rossum wrote: > >I'm not convinced of that, especially since all *generators* will > >automatically be usable as templates, whether or not they were > >intended as such. And why *shouldn't* you be allowed to use a block > >for looping, if you like the exit behavior (guaranteeing that the > >iterator is exhausted when you leave the block in any way)? > > It doesn't guarantee that, does it? (Re-reads PEP.) Aha, for *generators* > it does, because it says passing StopIteration in, stops execution of the > generator. But it doesn't say anything about whether iterators in general > are allowed to be resumed afterward, just that they should not yield a > value in response to the __next__, IIUC. As currently written, it sounds > like existing non-generator iterators would not be forced to an exhausted > state. 
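The "exhausted state" convention under discussion -- a well-behaved iterator, once it has raised StopIteration, keeps raising it on every later call -- can be sketched concretely. The class below is purely illustrative (the name `Exhaustible` and the dual `next`/`__next__` spellings are not from the thread):

```python
class Exhaustible(object):
    """Iterator that stays exhausted once StopIteration has been raised."""
    def __init__(self, items):
        self._items = list(items)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        # Once done, every later call must keep raising StopIteration.
        if self._done or not self._items:
            self._done = True
            raise StopIteration
        return self._items.pop(0)

    next = __next__  # Python 2 spelling of the same method

it = Exhaustible([1, 2])
assert [x for x in it] == [1, 2]

# Resuming after exhaustion must fail, every time.
for _ in range(3):
    try:
        it.next()
    except StopIteration:
        pass
    else:
        raise AssertionError("iterator resumed after exhaustion")
```

A non-generator iterator is free to violate this (that is Phillip's point above); nothing in the iteration protocol machinery enforces it.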
I wonder if something can be done like what was done for (dare I say
it?) "old-style" iterators:

    "The intention of the protocol is that once an iterator's next()
    method raises StopIteration, it will continue to do so on subsequent
    calls. Implementations that do not obey this property are deemed
    broken. (This constraint was added in Python 2.3; in Python 2.2,
    various iterators are broken according to this rule.)"[1]

This would mean that if next(itr, ...) raised StopIteration, then
next(itr, ...) should continue to raise StopIteration on subsequent
calls. I don't know how this is done in the current implementation.
Would it be hard to do so for the proposed block-statements? If nothing
else, we might at least clearly document what well-behaved iterators
should do...

STeVe

[1] http://docs.python.org/lib/typeiter.html

--
You can wordify anything if you just verb it.
--- Bucky Katt, Get Fuzzy

From greg.ewing at canterbury.ac.nz Thu Apr 28 08:26:39 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu Apr 28 08:35:45 2005
Subject: [Python-Dev] Re: anonymous blocks
References: <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org>
Message-ID: <4270821F.5000404@canterbury.ac.nz>

Neil Schemenauer wrote:

> The translation of a block-statement could become:
>
>     itr = EXPR1
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             break
>         try:
>             arg = None
>             BLOCK1
>         except Exception, exc:
>             err = getattr(itr, '__error__', None)
>             if err is None:
>                 raise exc
>             err(exc)

That can't be
right. When __error__ is called, if the iterator catches the exception and goes on to do another yield, the yielded value needs to be assigned to VAR1 and the block executed again. It looks like your version will ignore the value from the second yield and only execute the block again on the third yield. So something like Guido's safe_loop() would miss every other yield. I think Guido was right in the first place, and __error__ really is just a minor variation on __next__ that shouldn't have a separate entry point. Greg From greg.ewing at canterbury.ac.nz Thu Apr 28 08:33:20 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu Apr 28 08:42:26 2005 Subject: [Python-Dev] Re: anonymous blocks References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> Message-ID: <427083B0.6040204@canterbury.ac.nz> Guido van Rossum wrote: > And surely you exaggerate. How about this then: > > The with-statement is similar to the for-loop. Until you've > learned about the differences in detail, the only time you should > write a with-statement is when the documentation for the function > you are calling says you should. I think perhaps I'm not expressing myself very well. What I'm after is a high-level explanation that actually tells people something useful, and *doesn't* cop out by just saying "you're not experienced enough to understand this yet". If such an explanation can't be found, I strongly suspect that this doesn't correspond to a cohesive enough concept to be made into a built-in language feature. If you can't give a short, understandable explanation of it, then it's probably a bad idea. 
Greg

From python-dev at zesty.ca Thu Apr 28 09:01:35 2005
From: python-dev at zesty.ca (Ka-Ping Yee)
Date: Thu Apr 28 09:01:45 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <427083B0.6040204@canterbury.ac.nz>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz>
Message-ID: <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org>

On Thu, 28 Apr 2005, Greg Ewing wrote:
> If such an explanation can't be found, I strongly suspect
> that this doesn't correspond to a cohesive enough concept
> to be made into a built-in language feature. If you can't
> give a short, understandable explanation of it, then it's
> probably a bad idea.

In general, i agree with the sentiment of this -- though it's also okay
if there is a way to break the concept down into concepts that *are*
simple enough to have short, understandable explanations.

-- ?!ng

From stephen at xemacs.org Thu Apr 28 10:42:53 2005
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu Apr 28 10:42:58 2005
Subject: [Python-Dev] Re: switch statement
In-Reply-To: <ca471dc2050426022458a4ad@mail.gmail.com> (Guido van Rossum's message of "Tue, 26 Apr 2005 02:24:53 -0700")
References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com>
Message-ID: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp>

>>>>> "Guido" == Guido van Rossum <gvanrossum@gmail.com> writes:

    Guido> You mean like this?

    if x > 0:
        ...normal case...
    elif y > 0:
        ...abnormal case...
    else:
        ...edge case...

The salient example!
If it's no accident that those conditions are mutually exclusive and exhaustive, doesn't that code require at least a comment saying so, and maybe even an assertion to that effect? Where you can use a switch, it gives both, and throws in economy in both source and object code as a bonus. Not a compelling argument---your example shows switches are not universally applicable---but it's a pretty good deal. Guido> You have guts to call that bad style! :-) Exaggeration in defense of elegance is no vice.<wink> -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can "do" free software business; ask what your business can "do for" free software. From pedronis at strakt.com Thu Apr 28 10:51:32 2005 From: pedronis at strakt.com (Samuele Pedroni) Date: Thu Apr 28 10:49:45 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427083B0.6040204@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> Message-ID: <4270A414.2000104@strakt.com> Greg Ewing wrote: > Guido van Rossum wrote: > >> And surely you exaggerate. How about this then: >> >> The with-statement is similar to the for-loop. Until you've >> learned about the differences in detail, the only time you should >> write a with-statement is when the documentation for the function >> you are calling says you should. > > > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". 
>

This makes sense to me, also because a new control statement will
usually not be as hidden as metaclasses and some other possibly obscure
corners can be. OTOH I have the impression that the new toy is too shiny
to have a lucid discussion of whether it could have sharp edges or
produce dizziness for the inexperienced.

From greg.ewing at canterbury.ac.nz Thu Apr 28 10:42:26 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu Apr 28 10:51:32 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com>
Message-ID: <4270A1F2.1030401@canterbury.ac.nz>

Elegant as the idea behind PEP 340 is, I can't shake the feeling that
it's an abuse of generators. It seems to go to a lot of trouble and
complication so you can write a generator and pretend it's a function
taking a block argument.

I'd like to reconsider a thunk implementation. It would be a lot
simpler, doing just what is required without any jiggery pokery with
exceptions and break/continue/return statements. It would be easy to
explain what it does and why it's useful.

Are there any objective reasons to prefer a generator implementation
over a thunk implementation? If for-loops had been implemented with
thunks, we might never have created generators. But generators have
turned out to be more powerful, because you can have more than one of
them on the go at once. Is there a use for that capability here?

I can think of one possible use. Suppose you want to acquire multiple
resources; one way would be to nest block-statements, like

    block opening(file1) as f:
        block opening(file2) as g:
            ...

If you have a lot of resources to acquire, the nesting could get very deep.
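Spelled out with the tools available today, the nested acquisition above corresponds to nested try/finally. In this sketch a toy `Resource` class stands in for `opening()` (the class is illustrative only, chosen so the open/close ordering can be recorded and checked):

```python
class Resource(object):
    """Toy stand-in for a managed resource; records open/close order."""
    log = []

    def __init__(self, name):
        self.name = name
        Resource.log.append('open ' + name)

    def close(self):
        Resource.log.append('close ' + self.name)

# What the two nested block-statements spell out by hand:
r1 = Resource('file1')
try:
    r2 = Resource('file2')
    try:
        pass  # ... body using r1 and r2 ...
    finally:
        r2.close()
finally:
    r1.close()

# Resources are released in reverse order of acquisition.
assert Resource.log == ['open file1', 'open file2',
                        'close file2', 'close file1']
```

Each additional resource adds another level of indentation, which is exactly the deep-nesting problem Greg describes.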
But with the generator implementation, you could do something like

    block iterzip(opening(file1), opening(file2)) as f, g:
        ...

provided iterzip were modified to broadcast __next__ arguments to its
elements appropriately. You couldn't do this sort of thing with a thunk
implementation.

On the other hand, a thunk implementation has the potential to easily
handle multiple block arguments, if a suitable syntax could ever be
devised. It's hard to see how that could be done in a general way with
the generator implementation.

[BTW, I've just discovered we're not the only people with numbered
things called PEPs. I typed "PEP 340" into Google and got "PEP 340:
Prevention and Care of Athletic Injuries"!]

Greg

From ncoghlan at gmail.com Thu Apr 28 14:08:10 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 02:10:18 2005
Subject: [Python-Dev] Integrating PEP 310 with PEP 340
In-Reply-To: <ca471dc205042715165dede48d@mail.gmail.com>
References: <426F9347.6000505@iinet.net.au> <ca471dc205042715165dede48d@mail.gmail.com>
Message-ID: <4270D22A.9020907@gmail.com>

Guido van Rossum wrote:
>>PEP 310 forms the basis for a block construct that I _do_ like. The question
>>then becomes whether or not generators can be used to write useful PEP 310 style
>>block managers (I think they can, in a style very similar to that of the looping
>>block construct from PEP 340).
>
> I've read through your example, and I'm not clear why you think this
> is better. It's a much more complex API with less power. What's your
> use case? Why should 'block' be disallowed from looping? TOOWTDI or do
> you have something better?

I'm no longer clear on why I thought what I suggested would be better
either. Can I use the 'it was late' excuse? :)

Actually, the real reason is that I hadn't figured out what was really
possible with PEP 340. The cases that I thought PEP 310 would handle
better, I've since worked out how to do using the PEP 340 mechanism, and
PEP 340 handles them _far_ more elegantly.
With PEP 340, multi-stage constructs can be handled by using one
generator as an argument to the block, and something else (such as a
class or another generator) to maintain state between the blocks. The
looping nature is a big win, because it lets execution of a contained
block be prevented entirely.

My favourite discovery is that PEP 340 can be used to write a switch
statement like this:

    block switch(value) as sw:
        block sw.case(1):
            # Handle case 1
        block sw.case(2):
            # Handle case 2
        block sw.default():
            # Handle default case

Given the following definitions:

    class _switch(object):
        def __init__(self, switch_var):
            self.switch_var = switch_var
            self.run_default = True

        def case(self, case_value):
            self.run_default = False
            if self.switch_var == case_value:
                yield

        def default(self):
            if self.run_default:
                yield

    def switch(switch_var):
        yield _switch(switch_var)

With the keyword-less syntax previously mentioned, such a 'custom
structure' could look like:

    switch(value) as sw:
        sw.case(1):
            # Handle case 1
        sw.case(2):
            # Handle case 2
        sw.default():
            # Handle default case

(Actually doing a switch using blocks like this would be *insane* for
performance reasons, but it is still rather cool that it is possible)

With an appropriate utility block manager PEP 340 can also be used to
abstract multi-stage operations.
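The switch sketch above can be exercised without the block statement itself: a generator's body only runs when it is iterated, so a plain for-loop over `sw.case(...)` executes its suite zero or one times, which is the essence of the trick. The code below is a slightly adjusted variant of the sketch (here `run_default` is cleared only when a case actually matches, so the default suite still fires when nothing matches; the `dispatch` helper and direct `_switch(value)` construction are illustrative, not part of the proposal):

```python
class _switch(object):
    """Variant of the switch sketch: run_default cleared only on a match."""
    def __init__(self, switch_var):
        self.switch_var = switch_var
        self.run_default = True

    def case(self, case_value):
        if self.switch_var == case_value:
            self.run_default = False
            yield

    def default(self):
        if self.run_default:
            yield

def dispatch(value):
    # Emulate 'block sw.case(N): ...' with for-loops: each suite
    # runs zero or one times, driven by whether the generator yields.
    sw = _switch(value)
    results = []
    for _ in sw.case(1):
        results.append('one')
    for _ in sw.case(2):
        results.append('two')
    for _ in sw.default():
        results.append('default')
    return results

assert dispatch(2) == ['two']
assert dispatch(99) == ['default']
```

Because `case()` is a generator function, merely calling it runs nothing; the match test only executes when the loop advances the generator, which is what makes the lazy, in-order case checking work.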
I haven't got a real use case for this as yet, but the potential is
definitely there:

    def next_stage(itr):
        """Execute a single stage of a multi-stage block manager"""
        arg = None
        next_item = next(itr)
        while True:
            if next_item is StopIteration:
                raise StopIteration
            try:
                arg = yield next_item
            except:
                if not hasattr(itr, "__error__"):
                    raise
                next_item = itr.__error__(sys.exc_info()[1])
            else:
                next_item = next(itr, arg)

    def multi_stage():
        """Code template accepting multiple suites"""
        # Pre stage 1
        result_1 = yield
        # Post stage 1
        yield StopIteration
        result_2 = 0
        if result_1:
            # Pre stage 2
            result_2 = yield
            # Post stage 2
            yield StopIteration
        for i in range(result_2):
            # Pre stage 3
            result_3 = yield
            # Post stage 3
            yield StopIteration
        # Pre stage 4
        result_4 = yield
        # Post stage 4

    def use_multistage():
        blk = multi_stage()
        block next_stage(blk):
            # Stage 1
            continue val_1
        block next_stage(blk):
            # Stage 2
            continue val_2
        block next_stage(blk):
            # Stage 3
            continue val_3
        block next_stage(blk):
            # Stage 4
            continue val_4

Cheers,
Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From steven.bethard at gmail.com Thu Apr 28 19:23:22 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:09 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d11dcfba05042810164214e9d0@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <d11dcfba05042810164214e9d0@mail.gmail.com> Message-ID: <d11dcfba05042810232a6f87a7@mail.gmail.com> On 4/28/05, Steven Bethard <steven.bethard@gmail.com> wrote: > however, the iterable object is notified whenever a 'continue', > 'break', or 'return' statement is executed inside the block-statement. This should read: however, the iterable object is notified whenever a 'continue', 'break' or 'return' statement is executed *or an exception is raised* inside the block-statement. Sorry! STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From gvanrossum at gmail.com Thu Apr 28 16:44:05 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:32:20 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com> <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <ca471dc2050428074475d7d6b0@mail.gmail.com> > Exaggeration in defense of elegance is no vice.<wink> Maybe not, but it still sounds like BS to me. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jim at zope.com Thu Apr 28 13:50:58 2005 From: jim at zope.com (Jim Fulton) Date: Fri Apr 29 02:32:24 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <4270CE22.7030406@zope.com> Greg Ewing wrote: > Elegant as the idea behind PEP 340 is, I can't shake > the feeling that it's an abuse of generators. It seems > to go to a lot of trouble and complication so you > can write a generator and pretend it's a function > taking a block argument. > > I'd like to reconsider a thunk implementation. It > would be a lot simpler, doing just what is required > without any jiggery pokery with exceptions and > break/continue/return statements. It would be easy > to explain what it does and why it's useful. "Simple is better than Complex." Is there a thunk PEP? Jim -- Jim Fulton mailto:jim@zope.com Python Powered! 
CTO (540) 361-1714 http://www.python.org Zope Corporation http://www.zope.com http://www.zope.org From steven.bethard at gmail.com Thu Apr 28 19:16:15 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:41 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427083B0.6040204@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> Message-ID: <d11dcfba05042810164214e9d0@mail.gmail.com> On 4/28/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Guido van Rossum wrote: > > And surely you exaggerate. How about this then: > > > > The with-statement is similar to the for-loop. Until you've > > learned about the differences in detail, the only time you should > > write a with-statement is when the documentation for the function > > you are calling says you should. > > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". How about: """ A block-statement is much like a for-loop, and is also used to iterate over the elements of an iterable object. In a block-statement however, the iterable object is notified whenever a 'continue', 'break', or 'return' statement is executed inside the block-statement. Most iterable objects do not need to be notified of such statement executions, so for most iteration over iterable objects, you should use a for-loop. Functions that return iterable objects that should be used in a block-statement will be documented as such. 
""" If you need more information, you could also include something like: """ When generator objects are used in a block-statement, they are guaranteed to be "exhausted" at the end of the block-statement. That is, any additional call to next() with the generator object will produce a StopIteration. """ STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From Ugo_DiGirolamo at invision.iip.com Thu Apr 28 21:29:10 2005 From: Ugo_DiGirolamo at invision.iip.com (Ugo Di Girolamo) Date: Fri Apr 29 02:32:45 2005 Subject: [Python-Dev] Problem with embedded python - bug? Message-ID: <3D4A0A4A0225484B965A23CFD127B82F4A680F@invnmail.invision.iip.com> I have been having a few more discussions around about this, and I'm starting to think that this is a bug. My take is that, when I call Py_Finalize, the python thread should be shut down gracefully, closing the file and everything. Maybe I'm missing a call to something (?PyEval_FinalizeThreads?) but the docs seem to say that just PyFinalize should be called. The open file seems to be the issue, since if I remove all the references to the file I cannot get the program to crash. I can reproduce the same behavior on two different wxp systems, under python 2.4 and 2.4.1. Ugo -----Original Message----- From: Ugo Di Girolamo Sent: Tuesday, April 26, 2005 2:16 PM To: 'python-dev@python.org' Subject: Problem with embedded python I have the following code, that seems to make sense to me. However, it crashes about 1/3 of the times. My platform is Python 2.4.1 on WXP (I tried the release version from the msi and the debug version built by me, both downloaded today to have the latest version). The crash happens while the main thread is in Py_Finalize. I traced the crash to _Py_ForgetReference(op) in object.c at line 1847, where I have op->_ob_prev == NULL. What am I doing wrong? I'm definitely not too sure about the way I'm handling the GIL. 
Thanks in adv for any suggestion/ comment Cheers and ciao Ugo ////////////////////////// TestPyThreads.py ////////////////////////// #include <windows.h> #include "Python.h" int main() { PyEval_InitThreads(); Py_Initialize(); PyGILState_STATE main_restore_state = PyGILState_UNLOCKED; PyGILState_Release(main_restore_state); // start the thread { PyGILState_STATE state = PyGILState_Ensure(); int trash = PyRun_SimpleString( "import thread\n" "import time\n" "def foo():\n" " f = open('pippo.out', 'w', 0)\n" " i = 0;\n" " while 1:\n" " f.write('%d\\n'%i)\n" " time.sleep(0.01)\n" " i += 1\n" "t = thread.start_new_thread(foo, ())\n" ); PyGILState_Release(state); } // wait 300 ms Sleep(300); PyGILState_Ensure(); Py_Finalize(); return 0; } From ncoghlan at gmail.com Thu Apr 28 14:07:55 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 02:32:53 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <42702BCB.9000500@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <42700D3A.5020208@gmail.com> <ca471dc20504271522e79ce4a@mail.gmail.com> <42702BCB.9000500@ocf.berkeley.edu> Message-ID: <4270D21B.9040401@gmail.com> Brett C. wrote: > Guido van Rossum wrote: >>>Yet another alternative would be for the default behaviour to be to raise >>>Exceptions, and continue with anything else, and have the third argument be >>>"raise_exc=True" and set it to False to pass an exception in without raising it. >> >> >>You've lost me there. If you care about this, can you write it up in >>more detail (with code samples or whatever)? Or we can agree on a 2nd >>arg to __next__() (and a 3rd one to next()). 
> > Channeling Nick, I think he is saying that the raising argument should be made > True by default and be named 'raise_exc'. Pretty close, although I'd say 'could' rather than 'should', as it was an idle thought, rather than something I actually consider a good idea. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From steven.bethard at gmail.com Thu Apr 28 18:21:59 2005 From: steven.bethard at gmail.com (Steven Bethard) Date: Fri Apr 29 02:32:56 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4270821F.5000404@canterbury.ac.nz> References: <ca471dc2050426043713116248@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <5.1.1.6.0.20050427105524.02479e70@mail.telecommunity.com> <ca471dc205042713277846852d@mail.gmail.com> <5.1.1.6.0.20050427164323.0332c2b0@mail.telecommunity.com> <ca471dc2050427145022e8985f@mail.gmail.com> <5.1.1.6.0.20050427180054.0315ec30@mail.telecommunity.com> <ca471dc205042715585917829f@mail.gmail.com> <20050428004851.GB8869@mems-exchange.org> <4270821F.5000404@canterbury.ac.nz> Message-ID: <d11dcfba050428092150ddabac@mail.gmail.com> On 4/28/05, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > Neil Schemenauer wrote: > > > The translation of a block-statement could become: > > > > itr = EXPR1 > > arg = None > > while True: > > try: > > VAR1 = next(itr, arg) > > except StopIteration: > > break > > try: > > arg = None > > BLOCK1 > > except Exception, exc: > > err = getattr(itr, '__error__', None) > > if err is None: > > raise exc > > err(exc) > > That can't be right. When __error__ is called, if the iterator > catches the exception and goes on to do another yield, the > yielded value needs to be assigned to VAR1 and the block > executed again. It looks like your version will ignore the > value from the second yield and only execute the block again > on the third yield. 
Could you do something like: itr = EXPR1 arg = None next_func = next while True: try: VAR1 = next_func(itr, arg) except StopIteration: break try: arg = None next_func = next BLOCK1 except Exception, arg: try: next_func = type(itr).__error__ except AttributeError: raise arg ? STeVe -- You can wordify anything if you just verb it. --- Bucky Katt, Get Fuzzy From michael.walter at gmail.com Thu Apr 28 14:46:22 2005 From: michael.walter at gmail.com (Michael Walter) Date: Fri Apr 29 02:47:23 2005 Subject: [Python-Dev] Re: switch statement In-Reply-To: <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> References: <fb6fbf560504251520797338b2@mail.gmail.com> <1114473665.3698.2.camel@schizo> <426DCAE5.2070501@canterbury.ac.nz> <87fyxdor2p.fsf@tleepslib.sk.tsukuba.ac.jp> <ca471dc2050426022458a4ad@mail.gmail.com> <874qdrjnqq.fsf@tleepslib.sk.tsukuba.ac.jp> Message-ID: <877e9a1705042805464161c3a4@mail.gmail.com> On 4/28/05, Stephen J. Turnbull <stephen@xemacs.org> wrote: > >>>>> "Guido" == Guido van Rossum <gvanrossum@gmail.com> writes: > > Guido> You mean like this? > > if x > 0: > ...normal case... > elif y > 0: > ....abnormal case... > else: > ...edge case... > > The salient example! If it's no accident that those conditions are > mutually exclusive and exhaustive, doesn't that code require at least > a comment saying so, and maybe even an assertion to that effect? I usually do: if ...: return ... if ...: return ... assert ... return ... Michael From gargamel.su at gmail.com Thu Apr 28 23:47:18 2005 From: gargamel.su at gmail.com (Jing Su) Date: Fri Apr 29 02:47:26 2005 Subject: [Python-Dev] noob question regarding the interpreter Message-ID: <f11de95f050428144735fe364e@mail.gmail.com> Hello, I know this is a n00b question, so I apologize ahead of time. I've been taking a look at the python interpreter, trying to understand how it works on the compiled byte-codes. 
Looking through the sources of the 2.4.1 stable version, it looks like Python/ceval.c is the module that does the main dispatch. However, it looks like a switched interpreter. I just find this surprising because python seems to run pretty fast, and a switched interpreter is usually painfully slow. Is there work to change python into a direct-threaded or even JIT'ed interpreter? Has there been previous discussion on this topic? I'd greatly appreciate any pointers to discussions on this topic. Thus far my google-fu has not turned up fruitful hits. Thanks in advance for any help! -Jing From jimjjewett at gmail.com Thu Apr 28 23:53:36 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 02:47:46 2005 Subject: [Python-Dev] anonymous blocks as scope-collapse: detailed proposal Message-ID: <fb6fbf5605042814537f6d2e37@mail.gmail.com> Based on Guido's opinion that caller and callee should both be marked, I have used keywords 'include' and 'chunk'. I therefore call them "Chunks" and "Includers". Examples are based on (1) The common case of a simple resource manager. e.g. http://mail.python.org/pipermail/python-dev/2005-April/052751.html (2) Robert Brewer's Object Relational Mapper http://mail.python.org/pipermail/python-dev/2005-April/052924.html which uses several communicating Chunks in the same Includer, and benefits from Includer inheritance. Note that several cooperating Chunks may use the same name (e.g. old_children) to refer to the same object, even though that object is never mentioned by the Includer. It is possible for the same code object to be both a Chunk and an Includer. Its own included sub-Chunks also share the top Includer's namespace. Chunks and Includers must both be written in pure python, because C frames cannot be easily manipulated. 
They can of course call or be called (as a unit) by extension modules. I have assumed that Chunks should not take arguments. While arguments are useful ("Which pattern should I match against on this inclusion?"), the same functionality *can* be had by binding a known name in the Includer. When that starts to get awkward, it is a sign that you should be using separate namespaces (and callbacks, or value objects). "self" and "cls" are just random names to a Chunk, though using them for any but the conventional meaning will be as foolhardy as it is in a method. Chunks are limited to statement context, as they do not return a value. Includers must provide a namespace. Therefore a single inclusion will turn the entire nearest enclosing namespace into an Includer. ? Should this be limited to nearest enclosing function or method? I can't think of a good use case for including directly from class definition or module toplevel, except registration. And even then, a metaclass might be better. Includers may only be used in a statement context, as the Chunks must be specified in a following suite. (It would be possible to skip the suite if all Chunk names are already bound, but I'm not sure that is a good habit to encourage -- so initially forbid it.) Chunks are defined without a (), in analogy to parentless classes. They are included (called) with a (), so that they can remain first class objects. Example Usage ============= def withfile(filename, mode='r'): """Close the file as soon we're done. This frees up file handles sooner. This is particularly important under Jython, or if you are using files in cyclic structures.""" openfile = open(filename, mode) try: include fileproc() # keyword 'include' prevents XXX_FAST optimization finally: openfile.close() chunk nullreader: # callee Chunk defined for reuse for line in openfile: pass withfile("testr.txt"): # Is this creation of a new block-starter a problem? 
fileproc=nullreader # Using an external Chunk object withfile("testw.txt", "w"): chunk fileproc: # Providing an "inline" Chunk openfile.write("Line 1") # If callers must be supported in expression context #fileproc=nullreader #withfile("tests.txt") # Resolve Chunk name from caller's default # binding, which in this case defaults back # to the current globals. # Is this just asking for trouble? class ORM(object): chunk nullchunk: # The extra processing is not always needed. pass begin=pre=post=end=nullchunk # Default to no extra processing def __set__(self, unit, value): include self.begin() if self.coerce: value = self.coerce(unit, value) oldvalue = unit._properties[self.key] if oldvalue != value: include self.pre() unit._properties[self.key] = value include self.post() include self.end() class TriggerORM(ORM): chunk pre: include super(self,TriggerORM).pre() # self was bound by __set__ old_children = self.children() # inject new variable chunk post: include super(self,TriggerORM).post() for child in self.children(): if child not in old_children: # will see pre's binding notify_somebody("New child %s" % child) As Robert Brewer said, > The above is quite ugly written with callbacks (due to > excessive argument passing), and is currently fragile > when overriding __set__ (due to duplicated code). How to Implement ---------------- The Includer cannot know which variables a Chunk will use (or inject), so the namespace must remain a dictionary. This precludes use of the XXX_FAST bytecodes. But as Robert pointed out, avoiding another frame creation/destruction will compensate somewhat. Two new bytecodes will be needed to handle the jump and return to a different bytecode string without setting up or tearing down a new frame. Position in the Includer bytecode will need to be kept in a stack, though it might make sense to use a frame variable instead of the execution stack. 
With those two exceptions, the Includer and Chunk are both composed entirely of valid statements that can already be compiled to ordinary bytecode. -jJ From ncoghlan at gmail.com Thu Apr 28 14:08:08 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 02:49:16 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <427026B1.4020002@ocf.berkeley.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <ca471dc20504270030405f922f@mail.gmail.com> <426FF412.7010709@ocf.berkeley.edu> <42700B88.40703@gmail.com> <427026B1.4020002@ocf.berkeley.edu> Message-ID: <4270D228.8020607@gmail.com> Brett C. wrote: >>I'm surprisingly close to agreeing with you, actually. I've worked out >>that it isn't the looping that I object to, it's the inability to get >>out of the loop without exhausting the entire iterator. > 'break' isn't' enough for you as laid out by the proposal? The raising of > StopIteration, which is what 'break' does according to the standard, should be > enough to stop the loop without exhausting things. Same way you stop a 'for' > loop from executing entirely. The StopIteration exception effectively exhausted the generator, though. However, I've figured out how to deal with that, and my reservations about PEP 340 are basically gone. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Thu Apr 28 13:26:00 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 02:49:22 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> (Greg Ewing's message of "Thu, 28 Apr 2005 20:42:26 +1200") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <2mfyxb16t3.fsf@starship.python.net> Greg Ewing <greg.ewing@canterbury.ac.nz> writes: > Are there any objective reasons to prefer a generator > implementation over a thunk implementation? I, too, would like to see an answer to this question. I'd like to see an answer in the PEP, too. Cheers, mwh -- All obscurity will buy you is time enough to contract venereal diseases. -- Tim Peters, python-dev From gvanrossum at gmail.com Fri Apr 29 00:15:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:00 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4270A1F2.1030401@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> Message-ID: <ca471dc205042815157cf20297@mail.gmail.com> [Greg Ewing] > Elegant as the idea behind PEP 340 is, I can't shake > the feeling that it's an abuse of generators. It seems > to go to a lot of trouble and complication so you > can write a generator and pretend it's a function > taking a block argument. Maybe. You're not the first one saying this and I'm not saying "no" outright, but I'd like to defend the PEP. There are a number of separate ideas that all contribute to PEP 340. 
One is turning generators into more general coroutines: continue EXPR passes the expression to the iterator's next() method (renamed to __next__() to work around a compatibility issue and because it should have been called that in the first place), and in a generator this value can be received as the return value of yield. Incidentally this makes the generator *syntax* more similar to Ruby (even though Ruby uses thunks, and consequently uses return instead of continue to pass a value back). I'd like to have this even if I don't get the block statement. The second is a solution for generator resource cleanup. There are already two PEPs proposing a solution (288 and 325) so I have to assume this addresses real pain! The only new twist offered by PEP 340 is a unification of the next() API and the resource cleanup API: neither PEP 288 nor PEP 325 seems to specify rigorously what should happen if the generator executes another yield in response to a throw() or close() call (or whether that should even be allowed); PEP 340 takes the stance that it *is* allowed and should return a value from whatever call sent the exception. This feels "right", especially together with the previous feature: if yield can return a value as if it were a function call, it should also be allowed to raise an exception, and catch or propagate it with impunity. Even without a block-statement, these two changes make yield look a lot like invoking a thunk -- but it's more efficient, since calling yield doesn't create a frame. The main advantage of thunks that I can see is that you can save the thunk for later, like a callback for a button widget (the thunk then becomes a closure). You can't use a yield-based block for that (except in Ruby, which uses yield syntax with a thunk-based implementation). But I have to say that I almost see this as an advantage: I think I'd be slightly uncomfortable seeing a block and not knowing whether it will be executed in the normal control flow or later. 
Defining an explicit nested function for that purpose doesn't have this problem for me, because I already know that the 'def' keyword means its body is executed later. The other problem with thunks is that once we think of them as the anonymous functions they are, we're pretty much forced to say that a return statement in a thunk returns from the thunk rather than from the containing function. Doing it any other way would cause major weirdness when the thunk were to survive its containing function as a closure (perhaps continuations would help, but I'm not about to go there :-). But then an IMO important use case for the resource cleanup template pattern is lost. I routinely write code like this: def findSomething(self, key, default=None): self.lock.acquire() try: for item in self.elements: if item.matches(key): return item return default finally: self.lock.release() and I'd be bummed if I couldn't write this as def findSomething(self, key, default=None): block synchronized(self.lock): for item in self.elements: if item.matches(key): return item return default This particular example can be rewritten using a break: def findSomething(self, key, default=None): block synchronized(self.lock): for item in self.elements: if item.matches(key): break else: item = default return item but it looks forced and the transformation isn't always that easy; you'd be forced to rewrite your code in a single-return style which feels too restrictive. > I'd like to reconsider a thunk implementation. It > would be a lot simpler, doing just what is required > without any jiggery pokery with exceptions and > break/continue/return statements. It would be easy > to explain what it does and why it's useful. I don't know. In order to obtain the required local variable sharing between the thunk and the containing function I believe that every local variable used or set in the thunk would have to become a 'cell' (our mechanism for sharing variables between nested scopes). 
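(The cell mechanism mentioned here is observable in current Python; the sketch below uses the nonlocal keyword, which arrived later in Python 3, to get exactly the kind of assigned-to sharing under discussion. It is offered as a retrospective illustration, not as part of the original message:)

```python
def make_counter():
    count = 0            # shared via a cell, not an ordinary local slot
    def bump():
        nonlocal count   # without this, assignment would make count local to bump()
        count += 1
        return count
    return bump

c = make_counter()
c()
c()
# the shared variable lives in a cell attached to the closure
print(c.__closure__[0].cell_contents)  # -> 2
```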
Cells slow down access somewhat compared to regular local variables. Perhaps not entirely coincidentally, the last example above (findSomething() rewritten to avoid a return inside the block) shows that, unlike for regular nested functions, we'll want variables *assigned to* by the thunk also to be shared with the containing function, even if they are not assigned to outside the thunk. I swear I didn't create the example for this purpose -- it just happened. > Are there any objective reasons to prefer a generator > implementation over a thunk implementation? If > for-loops had been implemented with thunks, we might > never have created generators. But generators have > turned out to be more powerful, because you can > have more than one of them on the go at once. Is > there a use for that capability here? I think the async event folks like to use this (see the Mertz references in PEP 288). > I can think of one possible use. Suppose you want > to acquire multiple resources; one way would be to > nest block-statements, like > > block opening(file1) as f: > block opening(file2) as g: > ... > > If you have a lot of resources to acquire, the nesting > could get very deep. But with the generator implementation, > you could do something like > > block iterzip(opening(file1), opening(file2)) as f, g: > ... > > provided iterzip were modified to broadcast __next__ > arguments to its elements appropriately. You couldn't > do this sort of thing with a thunk implementation. > > On the other hand, a thunk implementation has the > potential to easily handle multiple block arguments, if > a suitable syntax could ever be devised. It's hard > to see how that could be done in a general way with > the generator implementation. Right, but the use cases for multiple blocks seem elusive. If you really want to have multiple blocks with yield, I suppose we could use "yield/n" to yield to the n'th block argument, or perhaps yield>>n. 
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 00:51:13 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:23 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> Message-ID: <ca471dc2050428155127cec9a5@mail.gmail.com> [Greg Ewing] > I think perhaps I'm not expressing myself very well. > What I'm after is a high-level explanation that actually > tells people something useful, and *doesn't* cop out by > just saying "you're not experienced enough to understand > this yet". > > If such an explanation can't be found, I strongly suspect > that this doesn't correspond to a cohesive enough concept > to be made into a built-in language feature. If you can't > give a short, understandable explanation of it, then it's > probably a bad idea. [Ping] > In general, i agree with the sentiment of this -- though it's > also okay if there is a way to break the concept down into > concepts that *are* simple enough to have short, understandable > explanations. I don't know. What exactly is the audience supposed to be of this high-level statement? It would be pretty darn impossible to explain even the for-statement to people who are new to programming, let alone generators. And yet explaining the block-statement *must* involve a reference to generators. I'm guessing most introductions to Python, even for experienced programmers, put generators off until the "advanced" section, because this is pretty wild if you're not used to a language that has something similar. 
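(For readers coming to the thread cold, the "wild" feature in question is suspendable execution; a minimal generator, for reference:)

```python
def countdown(n):
    # execution pauses at each yield and resumes on the next request
    while n > 0:
        yield n
        n -= 1

print(list(countdown(3)))  # -> [3, 2, 1]
```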
(I wonder how you'd explain Python generators to an experienced Ruby programmer -- their mind has been manipulated to the point where they'd be unable to understand Python's yield no matter how hard they tried. :-) If I weren't limited to newbies (either to Python or to programming in general) but simply had to explain it to Python programmers pre-Python-2.5, I would probably start with a typical example of the try/finally idiom for acquiring and releasing a lock, then explain how for software engineering reasons you'd want to templatize that, and show the solution with a generator and block-statement. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 00:55:03 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 02:50:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement Message-ID: <ca471dc205042815557616722b@mail.gmail.com> How about, instead of trying to emphasize how different a block-statement is from a for-loop, we emphasize their similarity? A regular old loop over a sequence or iterable is written as: for VAR in EXPR: BLOCK A variation on this with somewhat different semantics swaps the keywords: in EXPR for VAR: BLOCK If you don't need the variable, you can leave the "for VAR" part out: in EXPR: BLOCK Too cute? :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nas at arctrix.com Fri Apr 29 04:24:02 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Fri Apr 29 04:24:05 2005 Subject: [Python-Dev] noob question regarding the interpreter In-Reply-To: <f11de95f050428144735fe364e@mail.gmail.com> References: <f11de95f050428144735fe364e@mail.gmail.com> Message-ID: <20050429022401.GA13119@mems-exchange.org> On Thu, Apr 28, 2005 at 05:47:18PM -0400, Jing Su wrote: > Is there work to change python into a direct-threaded or even JIT'ed > interpreter? People have experimented with making the ceval loop use direct threading. 
If I recall correctly, the resulting speedup was not significant. I suspect the reason is that most of Python's opcodes do a significant amount of work. There's probably more to be gained by moving to a register based VM. Also, I think direct threading is hard to do portably. If you are interested in JIT, take a look at Psyco. Neil From nas at arctrix.com Fri Apr 29 04:35:39 2005 From: nas at arctrix.com (Neil Schemenauer) Date: Fri Apr 29 04:35:42 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <20050429023539.GB13119@mems-exchange.org> On Thu, Apr 28, 2005 at 03:55:03PM -0700, Guido van Rossum wrote: > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK Looks weird to my eyes. On a related note, I was thinking about the extra cleanup 'block' provides. If the 'file' object would provide a suitable iterator, you could write: block open(filename) as line: ... and have the file closed at the end of the block. It does not read so well though. In a way, it seems to make more sense if 'block' called iter() on the expression and 'for' did not. block would guarantee to clean up iterators that it created. 'for' does not but implicitly creates them. Neil From nidoizo at yahoo.com Fri Apr 29 05:21:12 2005 From: nidoizo at yahoo.com (Nicolas Fleury) Date: Fri Apr 29 05:18:54 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <d4s8nh$fk8$1@sea.gmane.org> Guido van Rossum wrote: > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK > > If you don't need the variable, you can leave the "for VAR" part out: > > in EXPR: > BLOCK > > Too cute? :-) > I don't think it reads well. 
I would prefer something that would be understandable for a newbie's eyes, even if it fits more with common usage than with the real semantics behind it. For example a Boost-like keyword like: scoped EXPR as VAR: BLOCK scoped EXPR: BLOCK We may argue that it doesn't mean a lot, but at least if a newbie sees the following code, he would easily guess what it does: scoped synchronized(mutex): scoped opening(filename) as file: ... When compared with: in synchronized(mutex): in opening(filename) for file: ... As a C++ programmer, I still dream I could also do: scoped synchronized(mutex) scoped opening(filename) as file ... which would define a block until the end of the current block... Regards, Nicolas From gvanrossum at gmail.com Fri Apr 29 05:39:23 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 05:39:25 2005 Subject: [Python-Dev] noob question regarding the interpreter In-Reply-To: <f11de95f050428144735fe364e@mail.gmail.com> References: <f11de95f050428144735fe364e@mail.gmail.com> Message-ID: <ca471dc20504282039aeee008@mail.gmail.com> > However, it looks like a switched interpreter. I just > find this surprising because python seems to run pretty fast, and a switched > interpreter is usually painfully slow. This just proves how worthless a generalization that is. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From shane at hathawaymix.org Fri Apr 29 05:56:42 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri Apr 29 05:56:44 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050428155127cec9a5@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> Message-ID: <4271B07A.4010501@hathawaymix.org> Guido van Rossum wrote: > I don't know. What exactly is the audience supposed to be of this > high-level statement? It would be pretty darn impossible to explain > even the for-statement to people who are new to programming, let alone > generators. And yet explaining the block-statement *must* involve a > reference to generators. I'm guessing most introductions to Python, > even for experienced programmers, put generators off until the > "advanced" section, because this is pretty wild if you're not used to > a language that has something similar. (I wonder how you'd explain > Python generators to an experienced Ruby programmer -- their mind has > been manipulated to the point where they'd be unable to understand > Python's yield no matter how hard they tried. :-) I think this concept can be explained clearly. I'd like to try explaining PEP 340 to someone new to Python but not new to programming. I'll use the term "block iterator" to refer to the new type of iterator. This is according to my limited understanding. "Good programmers move commonly used code into reusable functions. Sometimes, however, patterns arise in the structure of the functions rather than the actual sequence of statements. 
For example, many functions acquire a lock, execute some code specific to that function, and unconditionally release the lock. Repeating the locking code in every function that uses it is error prone and makes refactoring difficult. "Block statements provide a mechanism for encapsulating patterns of structure. Code inside the block statement runs under the control of an object called a block iterator. Simple block iterators execute code before and after the code inside the block statement. Block iterators also have the opportunity to execute the controlled code more than once (or not at all), catch exceptions, or receive data from the body of the block statement. "A convenient way to write block iterators is to write a generator. A generator looks a lot like a Python function, but instead of returning a value immediately, generators pause their execution at "yield" statements. When a generator is used as a block iterator, the yield statement tells the Python interpreter to suspend the block iterator, execute the block statement body, and resume the block iterator when the body has executed. "The Python interpreter behaves as follows when it encounters a block statement based on a generator. First, the interpreter instantiates the generator and begins executing it. The generator does setup work appropriate to the pattern it encapsulates, such as acquiring a lock, opening a file, starting a database transaction, or starting a loop. Then the generator yields execution to the body of the block statement using a yield statement. When the block statement body completes, raises an uncaught exception, or sends data back to the generator using a continue statement, the generator resumes. At this point, the generator can either clean up and stop or yield again, causing the block statement body to execute again. When the generator finishes, the interpreter leaves the block statement." Is it understandable so far? 
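[A minimal runnable sketch of the mechanism Shane describes, emulated in present-day Python. PEP 340's block statement was never implemented as specified, so the names `run_block` and `locking` below are illustrative only; note also that `yield` inside `try/finally` was not yet legal in Python 2.4 -- allowing it is part of what this proposal required.]

```python
from threading import Lock

def run_block(block_iter, body):
    """Drive a one-yield generator the way a block statement would:
    run its setup, execute the body, then resume it for cleanup."""
    it = iter(block_iter)
    next(it)                  # run setup code up to the first yield
    try:
        body()                # the block-statement body
    finally:
        try:
            next(it)          # resume the generator so cleanup runs
        except StopIteration:
            pass

def locking(lock):
    lock.acquire()            # setup: acquire the lock
    try:
        yield                 # suspend; the body executes here
    finally:
        lock.release()        # cleanup: always release

lock = Lock()
held = []
run_block(locking(lock), lambda: held.append(lock.locked()))
print(held[0])        # True: the lock was held while the body ran
print(lock.locked())  # False: released afterwards
```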
Shane

From greg.ewing at canterbury.ac.nz Fri Apr 29 06:05:32 2005
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri Apr 29 06:05:50 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com>
Message-ID: <4271B28C.9030504@canterbury.ac.nz>

Guido van Rossum wrote:
> The main advantage of thunks that I can see is that you can save the
> thunk for later, like a callback for a button widget (the thunk then
> becomes a closure).

Or pass it on to another function. This is something we haven't considered -- what if one resource-acquisition generator (RAG?) wants to delegate to another RAG?

With normal generators, one can always use the pattern

    for x in sub_generator(some_args):
        yield x

But that clearly isn't going to work if the generators involved are RAGs, because the exceptions passed in are going to be raised at the point of the yield in the outer RAG, and the inner RAG isn't going to get finalized (assuming the for-loop doesn't participate in the finalization protocol).

To get the finalization right, the inner generator needs to be invoked as a RAG, too:

    block sub_generator(some_args):
        yield

But PEP 340 doesn't say what happens when the block contains a yield.

A thunk implementation wouldn't have any problem with this, since the thunk can be passed down any number of levels before being called, and any exceptions raised in it will be propagated back up through all of them.
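[The delegation problem Greg raises can be seen directly in present-day Python: with plain for-loop delegation, an exception thrown into the outer generator is raised at the outer yield, so the inner generator never gets a chance to react. `yield from`, added much later by PEP 380, forwards the throw inward. A small sketch:]

```python
def inner():
    try:
        yield "value"
    except ValueError:
        yield "handled by inner"

def outer_for():
    for x in inner():     # inner is invisible to throw()
        yield x

def outer_delegating():
    yield from inner()    # throw() is forwarded to inner

g = outer_for()
assert next(g) == "value"
try:
    g.throw(ValueError)   # raised at `yield x`; inner never sees it
except ValueError:
    print("for-loop delegation: ValueError escaped")

g = outer_delegating()
assert next(g) == "value"
print(g.throw(ValueError))   # inner handles it: prints "handled by inner"
```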
> The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Urg, you're right. Unless return is turned into an exception in that case. And then I suppose break and return (and yield?) will have to follow suit. I'm just trying to think how Smalltalk handles this, since it must have a similar problem, but I can't remember the details. > every > local variable used or set in the thunk would have to become a 'cell' > . Cells > slow down access somewhat compared to regular local variables. True, but is the difference all that great? It's just one more C-level indirection, isn't it? > we'll want variables > *assigned to* by the thunk also to be shared with the containing > function, Agreed. We'd need to add a STORE_CELL bytecode or something for this. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 29 06:13:13 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:13:42 2005 Subject: [Python-Dev] Integrating PEP 310 with PEP 340 In-Reply-To: <4270D22A.9020907@gmail.com> References: <426F9347.6000505@iinet.net.au> <ca471dc205042715165dede48d@mail.gmail.com> <4270D22A.9020907@gmail.com> Message-ID: <4271B459.2010208@canterbury.ac.nz> Nick Coghlan wrote: > With an appropriate utility block manager I've just thought of another potential name for them: Block Utilization and Management Function (BUMF) :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From sabbey at u.washington.edu Fri Apr 29 06:15:14 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Fri Apr 29 06:15:17 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> Message-ID: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Guido van Rossum wrote: > Even without a block-statement, these two changes make yield look a > lot like invoking a thunk -- but it's more efficient, since calling > yield doesn't create a frame. I like PEP 340 a lot, probably as much or more than any thunk ideas I've seen. But I want to defend thunks here a little. It is possible to implement thunks without them creating their own frame. They can reuse the frame of the surrounding function. So a new frame does not need to be created when the thunk is called, and, much like with a yield statement, the frame is not taken down when the thunk completes running. The implementation just needs to take care to save and restore members of the frame that get clobbered when the thunk is running. Cells would of course not be required if the thunk does not create its own frame. > The main advantage of thunks that I can see is that you can save the > thunk for later, like a callback for a button widget (the thunk then > becomes a closure). You can't use a yield-based block for that (except > in Ruby, which uses yield syntax with a thunk-based implementation). 
> But I have to say that I almost see this as an advantage: I think I'd > be slightly uncomfortable seeing a block and not knowing whether it > will be executed in the normal control flow or later. Defining an > explicit nested function for that purpose doesn't have this problem > for me, because I already know that the 'def' keyword means its body > is executed later. I would also be uncomfortable if the thunk could be called at a later time. This can be disallowed by raising an exception if such an attempt is made. Such a restriction would not be completely arbitrary. One consequence of having the thunk borrow its surrounding function's frame is that it does not make much sense, implementationally speaking, to allow the thunk to be called at a later time (although I do realize that "it's difficult to implement" is not a good argument for anything). > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). If it is accepted that the thunk won't be callable at a later time, then I think it would seem normal that a return statement would return from the surrounding function. 
-Brian From greg.ewing at canterbury.ac.nz Fri Apr 29 06:17:54 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:18:21 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429023539.GB13119@mems-exchange.org> References: <ca471dc205042815557616722b@mail.gmail.com> <20050429023539.GB13119@mems-exchange.org> Message-ID: <4271B572.2030906@canterbury.ac.nz> Neil Schemenauer wrote: >>A variation on this with somewhat different semantics swaps the keywords: >> >> in EXPR for VAR: >> BLOCK > > Looks weird to my eyes. Probably makes perfect sense if you're Dutch, though. :-) -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From rmunn at pobox.com Fri Apr 29 06:28:44 2005 From: rmunn at pobox.com (Robin Munn) Date: Fri Apr 29 06:28:50 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <4271B7FC.1070801@pobox.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Guido van Rossum wrote: | How about, instead of trying to emphasize how different a | block-statement is from a for-loop, we emphasize their similarity? | | A regular old loop over a sequence or iterable is written as: | | for VAR in EXPR: | BLOCK | | A variation on this with somewhat different semantics swaps the keywords: | | in EXPR for VAR: | BLOCK | | If you don't need the variable, you can leave the "for VAR" part out: | | in EXPR: | BLOCK | | Too cute? :-) Far too close to the "for" loop, IMHO. I read that, I'd have to remind myself every time, "now, which one is it that can receive values passed back in: for ... in, or in ... for?" I'm definitely -1 on that one: too confusing. 
Another possibility just occurred to me. How about "using"? ~ using EXPR as VAR: ~ BLOCK Reads similarly to "with", but leaves the "with" keyword open for possible use later. Since it seems traditional for one to introduce oneself upon first posting to python-dev, my name is Robin Munn. Yes, my name is just one letter different from Robin Dunn's. It's not like I *intended* to cause confusion... :-) Anyway, I was introduced to Python a few years ago, around version 2.1 or so, and fell in love with the fact that I could read my own code six months later and understand it. I try to help out where I can, but I don't know the guts of the interpreter, so on python-dev I mostly lurk. Robin Munn -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (Darwin) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFCcbf16OLMk9ZJcBQRAuYpAJ4n24AgsX3SrW0g7jlWJM+HfzHXMwCfTbTq eJ2mLzg1uLZv09KDUemM+WU= =SXux -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Fri Apr 29 06:45:16 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:45:35 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <ca471dc2050428155127cec9a5@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> Message-ID: <4271BBDC.9060506@canterbury.ac.nz> Guido van Rossum wrote: > I don't know. What exactly is the audience supposed to be of this > high-level statement? It would be pretty darn impossible to explain > even the for-statement to people who are new to programming, let alone > generators. 
If the use of block-statements becomes common for certain tasks such as opening files, it seems to me that people are going to encounter their use around about the same time they encounter for-statements. We need *something* to tell these people to enable them to understand the code they're reading. Maybe it would be sufficient just to explain the meanings of those particular uses, and leave the full general explanation as an advanced topic. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg.ewing@canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Apr 29 06:46:56 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:47:14 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <d11dcfba05042810164214e9d0@mail.gmail.com> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <d11dcfba05042810164214e9d0@mail.gmail.com> Message-ID: <4271BC40.4050802@canterbury.ac.nz> Steven Bethard wrote: > """ > A block-statement is much like a for-loop, and is also used to iterate > over the elements of an iterable object. No, no, no. Similarity to a for-loop is the *last* thing we want to emphasise, because the intended use is very different from the intended use of a for-loop. This is going to give people the wrong idea altogether. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Fri Apr 29 06:50:36 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 06:50:40 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <4271B28C.9030504@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <4271B28C.9030504@canterbury.ac.nz> Message-ID: <ca471dc20504282150c40d3b2@mail.gmail.com> (BTW, I'm trying to update the PEP with a discussion of thunks.) [Guido] > > The main advantage of thunks that I can see is that you can save the > > thunk for later, like a callback for a button widget (the thunk then > > becomes a closure). [Greg] > Or pass it on to another function. This is something we > haven't considered -- what if one resource-acquision- > generator (RAG?) wants to delegate to another RAG? > > With normal generators, one can always use the pattern > > for x in sub_generator(some_args): > yield x > > But that clearly isn't going to work if the generators > involved are RAGs, because the exceptions passed in > are going to be raised at the point of the yield in > the outer RAG, and the inner RAG isn't going to get > finalized (assuming the for-loop doesn't participate > in the finalization protocol). > > To get the finalization right, the inner generator > needs to be invoked as a RAG, too: > > block sub_generator(some_args): > yield > > But PEP 340 doesn't say what happens when the block > contains a yield. The same as when a for-loop contains a yield. 
The sub_generator is entirely unaware of this yield, since the local control flow doesn't actually leave the block (i.e., it's not like a break, continue or return statement). When the loop that was resumed by the yield calls next(), the block is resumed back after the yield. The generator finalization semantics guarantee (within the limitations of all finalization semantics) that the block will be resumed eventually. I'll add this to the PEP, too. I'd say that a yield in a thunk would be more troublesome: does it turn the thunk into a generator or the containing function? It would have to be the thunk, but then things get weird quickly (the caller of the thunk has to treat the result of the call as an iterator). > A thunk implementation wouldn't have any problem with > this, since the thunk can be passed down any number of > levels before being called, and any exceptions raised > in it will be propagated back up through all of them. > > > The other problem with thunks is that once we think of them as the > > anonymous functions they are, we're pretty much forced to say that a > > return statement in a thunk returns from the thunk rather than from > > the containing function. > > Urg, you're right. Unless return is turned into an > exception in that case. And then I suppose break and > return (and yield?) will have to follow suit. But wasn't that exactly what you were trying to avoid? :-) > I'm just trying to think how Smalltalk handles this, > since it must have a similar problem, but I can't > remember the details. > > > every > > local variable used or set in the thunk would have to become a 'cell' > > . Cells > > slow down access somewhat compared to regular local variables. > > True, but is the difference all that great? It's > just one more C-level indirection, isn't it? Alas not. It becomes a call to PyCell_Set() or PyCell_Get(). > > we'll want variables > > *assigned to* by the thunk also to be shared with the containing > > function, > > Agreed. 
We'd need to add a STORE_CELL bytecode or > something for this. This actually exists -- it is used for when an outer function stores into a local that it shares with an inner function. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Apr 29 06:51:56 2005 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri Apr 29 06:52:11 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271B07A.4010501@hathawaymix.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org> Message-ID: <4271BD6C.1010001@canterbury.ac.nz> Shane Hathaway wrote: > "Block statements provide a mechanism for encapsulating patterns of > structure. Code inside the block statement runs under the control of an > object called a block iterator. Simple block iterators execute code > before and after the code inside the block statement. Block iterators > also have the opportunity to execute the controlled code more than once > (or not at all), catch exceptions, or receive data from the body of the > block statement. That actually looks pretty reasonable. Hmmm. "Patterns of structure." Maybe we could call it a "struct" statement. struct opening(foo) as f: ... Then we could confuse both C *and* Ruby programmers at the same time! :-) [No, I don't really mean this. I actually prefer "block" to this.] -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. 
| greg.ewing@canterbury.ac.nz +--------------------------------------+ From gvanrossum at gmail.com Fri Apr 29 07:18:58 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:19:01 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271BBDC.9060506@canterbury.ac.nz> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271BBDC.9060506@canterbury.ac.nz> Message-ID: <ca471dc2050428221873fbf94@mail.gmail.com> > If the use of block-statements becomes common for certain > tasks such as opening files, it seems to me that people are > going to encounter their use around about the same time > they encounter for-statements. We need *something* to > tell these people to enable them to understand the code > they're reading. > > Maybe it would be sufficient just to explain the meanings > of those particular uses, and leave the full general > explanation as an advanced topic. Right. The block statement is a bit like a chameleon: it adapts its meaning to the generator you supply. (Or maybe it's like a sewer: what you get out of it depends on what you put into it. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 07:27:26 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:27:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <4271B7FC.1070801@pobox.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com> Message-ID: <ca471dc205042822271a43bc83@mail.gmail.com> > Far too close to the "for" loop, IMHO. 
I read that, I'd have to remind > myself every time, "now, which one is it that can receive values passed > back in: for ... in, or in ... for?" Whoa! Read the PEP closely. Passing a value back to the iterator (using "continue EXPR") is supported both in the for-loop and in the block-statement; it's new syntax so there's no backwards compatibility issue. The real difference is that when a for-loop is exited through a break, return or exception, the iterator is left untouched; but when the same happens in a block-statement, the iterator's __exit__ or __error__ method is called (I haven't decided what to call it). > Another possibility just occurred to me. How about "using"? Blah. I'm beginning to like block just fine. With using, the choice of word for the generator name becomes iffy IMO; and it almost sounds like it's a simple renaming: "using X as Y" could mean "Y = X". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 07:30:20 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 07:30:23 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <d4s8nh$fk8$1@sea.gmane.org> References: <ca471dc205042815557616722b@mail.gmail.com> <d4s8nh$fk8$1@sea.gmane.org> Message-ID: <ca471dc2050428223023aa80fc@mail.gmail.com> [Nicolas Fleury] > I would prefer something that would be > understandable for a newbie's eyes, even if it fits more with common > usage than with the real semantics behind it. For example a Boost-like > keyword like: > > scoped EXPR as VAR: > BLOCK Definitely not. In too many languages, a "scope" is a new namespace, and that's exactly what a block (by whichever name) is *not*. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Fri Apr 29 09:38:38 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 09:40:31 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271BD6C.1010001@canterbury.ac.nz> References: <4271B07A.4010501@hathawaymix.org> <4271BD6C.1010001@canterbury.ac.nz> Message-ID: <20050429003757.644F.JCARLSON@uci.edu> Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: > That actually looks pretty reasonable. > > Hmmm. "Patterns of structure." Maybe we could call it a > "struct" statement. > > struct opening(foo) as f: > ... > > Then we could confuse both C *and* Ruby programmers at > the same time! :-) And Python programmers who already use the struct module! - Josiah From fumanchu at amor.org Fri Apr 29 09:48:01 2005 From: fumanchu at amor.org (Robert Brewer) Date: Fri Apr 29 09:46:29 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? Message-ID: <3A81C87DC164034AA4E2DDFE11D258E3771F1E@exchange.hqamor.amorhq.net> > [Greg Ewing] > > Elegant as the idea behind PEP 340 is, I can't shake > > the feeling that it's an abuse of generators. It seems > > to go to a lot of trouble and complication so you > > can write a generator and pretend it's a function > > taking a block argument. [Guido] > Maybe. You're not the first one saying this and I'm not saying "no" > outright, but I'd like to defend the PEP. > > There are a number of separate ideas that all contribute to PEP 340. > One is turning generators into more general coroutines: continue EXPR > passes the expression to the iterator's next() method (renamed to > __next__() to work around a compatibility issue and because it should > have been called that in the first place), and in a generator this > value can be received as the return value of yield. 
Incidentally this > makes the generator *syntax* more similar to Ruby (even though Ruby > uses thunks, and consequently uses return instead of continue to pass > a value back). I'd like to have this even if I don't get the block > statement. Completely agree. Maybe we should have PEP 340 push just that, and make a PEP 341 independently for resource-cleanup (which assumes 340)? > [snip] > > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). > > But then an IMO important use case for the resource cleanup template > pattern is lost. I routinely write code like this: > > def findSomething(self, key, default=None): > self.lock.acquire() > try: > for item in self.elements: > if item.matches(key): > return item > return default > finally: > self.lock.release() > > and I'd be bummed if I couldn't write this as > > def findSomething(self, key, default=None): > block synchronized(self.lock): > for item in self.elements: > if item.matches(key): > return item > return default Okay, you've convinced me. The only way I can think of to get the effect I've been wanting would be to recompile the template function every time that it's executed with a different block. Call it a "Python _re_processor" ;). Although you could memoize the the resultant bytecode, etc., it would still be pretty slow, and you wouldn't be able to alter (rebind) the thunk once you'd entered the caller. Even then, you'd have the cell issues you mentioned, trying to push values from the thunk's original scope. Bah. It's so tempting on the semantic level, but the implementation's a bear. 
Robert Brewer MIS Amor Ministries fumanchu@amor.org From jcarlson at uci.edu Fri Apr 29 09:47:49 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 09:49:32 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc2050428223023aa80fc@mail.gmail.com> References: <d4s8nh$fk8$1@sea.gmane.org> <ca471dc2050428223023aa80fc@mail.gmail.com> Message-ID: <20050429003912.6452.JCARLSON@uci.edu> Guido van Rossum <gvanrossum@gmail.com> wrote: > > [Nicolas Fleury] > > I would prefer something that would be > > understandable for a newbie's eyes, even if it fits more with common > > usage than with the real semantics behind it. For example a Boost-like > > keyword like: > > > > scoped EXPR as VAR: > > BLOCK > > Definitely not. In too many languages, a "scope" is a new namespace, > and that's exactly what a block (by whichever name) is *not*. scopeless, unscoped, Scope(tm) (we would be required to use the unicode trademark symbol, of course)... It's way too long, and is too close to a pre-existing keyword, but I think 'finalized' is descriptive. But... finalize EXPR as VAR: BLOCK That reads nice... Maybe even 'cleanup', or 'finalize_after_iteration_without_iter_call' (abbreviated to 'faiwic', of course). <1.0 wink> All right, it's late enough. Enough 'ideas' from me tonight. - Josiah From ncoghlan at gmail.com Fri Apr 29 10:58:03 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri Apr 29 10:58:09 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <4271F71B.8010000@gmail.com> Guido van Rossum wrote: > How about, instead of trying to emphasize how different a > block-statement is from a for-loop, we emphasize their similarity? 
If you want to emphasise the similarity, the following syntax and explanation is something that occurred to me during lunch today: Python offers two variants on the basic iterative loop. "for NAME from EXPR:" enforces finalisation of the iterator. At loop completion, a well-behaved iterator is always completely exhausted. This form supports block management operations, that ensure timely release of resources such as locks or file handles. If the values being iterated over are not required, then the statement may be simplified to "for EXPR:". "for NAME in EXPR:" skips the finalisation step. At loop completion, a well-behaved iterator may still contain additional values. This form allows an iterator to be consumed in stages. Regardless of whether you like the above or not, I think the PEP's proposed use of 'as' is incorrect - it looks like the variable should be referring to the expression being iterated over, rather than the values returned from the iterator. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From mwh at python.net Fri Apr 29 11:35:27 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 11:35:29 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <ca471dc205042815157cf20297@mail.gmail.com> (Guido van Rossum's message of "Thu, 28 Apr 2005 15:15:13 -0700")
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com>
Message-ID: <2mll720vts.fsf@starship.python.net>

Guido van Rossum <gvanrossum@gmail.com> writes:

> [Greg Ewing]
>> Elegant as the idea behind PEP 340 is, I can't shake
>> the feeling that it's an abuse of generators. It seems
>> to go to a lot of trouble and complication so you
>> can write a generator and pretend it's a function
>> taking a block argument.
>
> Maybe. You're not the first one saying this and I'm not saying "no"
> outright, but I'd like to defend the PEP.

This is kind of my point too; I'm not saying that I really prefer the thunk solution, just that I want to see it mentioned. I think the making-generators-more-sexy thing is nice, but I think that's almost orthogonal.

[...]

> Even without a block-statement, these two changes make yield look a
> lot like invoking a thunk -- but it's more efficient, since calling
> yield doesn't create a frame.
>
> The main advantage of thunks that I can see is that you can save the
> thunk for later,

I also find them somewhat easier to understand.

> like a callback for a button widget (the thunk then becomes a
> closure). You can't use a yield-based block for that (except in
> Ruby, which uses yield syntax with a thunk-based implementation).
> But I have to say that I almost see this as an advantage: I think
> I'd be slightly uncomfortable seeing a block and not knowing whether
> it will be executed in the normal control flow or later.
Defining an > explicit nested function for that purpose doesn't have this problem > for me, because I already know that the 'def' keyword means its body > is executed later. > > The other problem with thunks is that once we think of them as the > anonymous functions they are, we're pretty much forced to say that a > return statement in a thunk returns from the thunk rather than from > the containing function. Doing it any other way would cause major > weirdness when the thunk were to survive its containing function as a > closure (perhaps continuations would help, but I'm not about to go > there :-). I'm not so sure about this. Did you read this mail: http://mail.python.org/pipermail/python-dev/2005-April/052970.html ? In this proposal, you have to go to some effort to make the thunk survive the block, and I think if weirdness results, that's the programmer's problem. > But then an IMO important use case for the resource cleanup template > pattern is lost. I routinely write code like this: > > def findSomething(self, key, default=None): > self.lock.acquire() > try: > for item in self.elements: > if item.matches(key): > return item > return default > finally: > self.lock.release() > > and I'd be bummed if I couldn't write this as > > def findSomething(self, key, default=None): > block synchronized(self.lock): > for item in self.elements: > if item.matches(key): > return item > return default If you can't write it this way, the thunk proposal is dead. >> I'd like to reconsider a thunk implementation. It >> would be a lot simpler, doing just what is required >> without any jiggery pokery with exceptions and >> break/continue/return statements. It would be easy >> to explain what it does and why it's useful. > > I don't know. 
In order to obtain the required local variable sharing > between the thunk and the containing function I believe that every > local variable used or set in the thunk would have to become a 'cell' > (our mechanism for sharing variables between nested scopes). Yes. > Cells slow down access somewhat compared to regular local variables. So make them faster. I'm not sure I think this is a good argument. You could also do some analysis and treat variables that are only accessed or written in the block as normal locals. This all makes a block-created thunk somewhat different from an anonymous function, to be sure. But the creating syntax is different, so I don't know if I care (hell, the invoking syntax could be made different too, but I really don't think that's a good idea). > Perhaps not entirely coincidentally, the last example above > (findSomething() rewritten to avoid a return inside the block) shows > that, unlike for regular nested functions, we'll want variables > *assigned to* by the thunk also to be shared with the containing > function, even if they are not assigned to outside the thunk. I swear > I didn't create the example for this purpose -- it just happened. Oh, absolutely. >> On the other hand, a thunk implementation has the >> potential to easily handle multiple block arguments, if >> a suitable syntax could ever be devised. It's hard >> to see how that could be done in a general way with >> the generator implementation. > > Right, but the use cases for multiple blocks seem elusive. If you > really want to have multiple blocks with yield, I suppose we could use > "yield/n" to yield to the n'th block argument, or perhaps yield>>n. > :-) Hmm, it's nearly *May* 1... :) Cheers, mwh -- I'm a keen cyclist and I stop at red lights. Those who don't need hitting with a great big slapping machine. 
-- Colin Davidson, cam.misc From mwh at python.net Fri Apr 29 11:37:30 2005 From: mwh at python.net (Michael Hudson) Date: Fri Apr 29 11:37:32 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> (Brian Sabbey's message of "Thu, 28 Apr 2005 21:15:14 -0700 (PDT)") References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Message-ID: <2mhdhp2aat.fsf@starship.python.net> Brian Sabbey <sabbey@u.washington.edu> writes: > It is possible to implement thunks without them creating their own > frame. They can reuse the frame of the surrounding function. So a new > frame does not need to be created when the thunk is called, and, much > like with a yield statement, the frame is not taken down when the > thunk completes running. The implementation just needs to take care > to save and restore members of the frame that get clobbered when the > thunk is running. Woo. That's cute. Cheers, mwh -- SCSI is not magic. There are fundamental technical reasons why it is necessary to sacrifice a young goat to your SCSI chain now and then. 
-- John Woods

From p.f.moore at gmail.com Fri Apr 29 12:41:19 2005
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri Apr 29 12:41:22 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <4271B07A.4010501@hathawaymix.org>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org>
Message-ID: <79990c6b05042903417313df72@mail.gmail.com>

On 4/29/05, Shane Hathaway <shane@hathawaymix.org> wrote:
> I think this concept can be explained clearly. I'd like to try
> explaining PEP 340 to someone new to Python but not new to programming.
> I'll use the term "block iterator" to refer to the new type of
> iterator. This is according to my limited understanding.
[...]
> Is it understandable so far?

I like it.
Paul.

From pierre.barbier at cirad.fr Fri Apr 29 13:44:46 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Fri Apr 29 13:44:20 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <4271F71B.8010000@gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com>
Message-ID: <42721E2E.8020108@cirad.fr>

Nick Coghlan wrote:
> Python offers two variants on the basic iterative loop.
>
> "for NAME from EXPR:" enforces finalisation of the iterator. At loop
> completion, a well-behaved iterator is always completely exhausted. This
> form supports block management operations that ensure timely release of
> resources such as locks or file handles.
> If the values being iterated over are not required, then the statement
> may be simplified to "for EXPR:".
>
> "for NAME in EXPR:" skips the finalisation step.
At loop completion, a
> well-behaved iterator may still contain additional values. This form
> allows an iterator to be consumed in stages.
>
> Regardless of whether you like the above or not, I think the PEP's
> proposed use of 'as' is incorrect - it looks like the variable should be
> referring to the expression being iterated over, rather than the values
> returned from the iterator.
>
> Cheers,
> Nick.

Well, I would go a step further and keep only the for-loop syntax, mainly because I don't understand why there are two syntaxes for things so close that we can merge them! You can simply state that the for-loop calls the "__error__" method of the object, if available, without invalidating any other property of the new for-loop (i.e. as defined in PEP 340).

One main reason is that a common error could be (using the synchronised iterator introduced in the PEP):

    for l in synchronised(mylock):
        do_something()

It will compile, run, and never raise any error, but the lock will be acquired and never released! Then, I think there is no use case for a generator with __error__ in the for-loop as it is now. So, IMO, it is error-prone and useless to have two different syntaxes for such things.
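The failure mode above is easy to demonstrate with a plain generator. A sketch (assuming a threading.Lock; the point is that the generator's finally clause runs only when the generator object itself is finalised, not when the loop is left - the explicit close() used below to trigger it arrived later, with PEP 342, and stands in here for garbage collection):

```python
import threading

def synchronised(lock):
    # Generator-based lock wrapper, as in the PEP's example.
    lock.acquire()
    try:
        yield lock
    finally:
        lock.release()

mylock = threading.Lock()
gen = synchronised(mylock)
for l in gen:
    break  # leave the loop without exhausting the iterator

assert mylock.locked()      # the finally clause has NOT run yet
gen.close()                 # explicit finalisation runs the finally clause
assert not mylock.locked()  # only now is the lock released
```

Nothing in the loop itself guarantees the release; without a finalising construct it happens whenever the generator object happens to be collected.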
Pierre -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From lcaamano at gmail.com Fri Apr 29 14:45:56 2005 From: lcaamano at gmail.com (Luis P Caamano) Date: Fri Apr 29 14:46:00 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <20050429044559.526E31E4006@bag.python.org> References: <20050429044559.526E31E4006@bag.python.org> Message-ID: <c56e219d050429054572444ab6@mail.gmail.com> On 4/29/05, python-dev-request@python.org <python-dev-request@python.org> wrote: > > Message: 2 > Date: Thu, 28 Apr 2005 21:56:42 -0600 > From: Shane Hathaway <shane@hathawaymix.org> > Subject: Re: [Python-Dev] Re: anonymous blocks > To: guido@python.org > Cc: Ka-Ping Yee <python-dev@zesty.ca>, Python Developers List > <python-dev@python.org> > Message-ID: <4271B07A.4010501@hathawaymix.org> > Content-Type: text/plain; charset=ISO-8859-1 > > > I think this concept can be explained clearly. I'd like to try > explaining PEP 340 to someone new to Python but not new to programming. > I'll use the term "block iterator" to refer to the new type of > iterator. This is according to my limited understanding. > > "Good programmers move commonly used code into reusable functions. > Sometimes, however, patterns arise in the structure of the functions > rather than the actual sequence of statements. For example, many > functions acquire a lock, execute some code specific to that function, > and unconditionally release the lock. Repeating the locking code in > every function that uses it is error prone and makes refactoring difficult. > > "Block statements provide a mechanism for encapsulating patterns of > structure. Code inside the block statement runs under the control of an > object called a block iterator. 
Simple block iterators execute code > before and after the code inside the block statement. Block iterators > also have the opportunity to execute the controlled code more than once > (or not at all), catch exceptions, or receive data from the body of the > block statement. > > "A convenient way to write block iterators is to write a generator. A > generator looks a lot like a Python function, but instead of returning a > value immediately, generators pause their execution at "yield" > statements. When a generator is used as a block iterator, the yield > statement tells the Python interpreter to suspend the block iterator, > execute the block statement body, and resume the block iterator when the > body has executed. > > "The Python interpreter behaves as follows when it encounters a block > statement based on a generator. First, the interpreter instantiates the > generator and begins executing it. The generator does setup work > appropriate to the pattern it encapsulates, such as acquiring a lock, > opening a file, starting a database transaction, or starting a loop. > Then the generator yields execution to the body of the block statement > using a yield statement. When the block statement body completes, > raises an uncaught exception, or sends data back to the generator using > a continue statement, the generator resumes. At this point, the > generator can either clean up and stop or yield again, causing the block > statement body to execute again. When the generator finishes, the > interpreter leaves the block statement." > > Is it understandable so far? > I've been skipping most of the anonymous block discussion and thus, I only had a very vague idea of what it was about until I read this explanation. Yes, it is understandable -- assuming it's correct :-) Mind you though, I'm not new to python and I've been writing system software for 20+ years. 
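Shane's setup/suspend/resume description can be approximated today, without any new syntax, by driving a generator by hand. A sketch (run_block and locking are made-up names, not part of the PEP):

```python
import threading

def run_block(block_iter, body):
    # Drive a generator-based "block iterator": run it to its first
    # yield (the setup), call the body with the yielded value, then
    # resume it so the code after the yield (the cleanup) runs.
    for value in block_iter:
        body(value)

def locking(lock):
    lock.acquire()          # setup, before the body runs
    try:
        yield lock          # suspend while the body executes
    finally:
        lock.release()      # cleanup, after the body returns

lock = threading.Lock()
held = []
run_block(locking(lock), lambda l: held.append(l.locked()))
assert held == [True]       # the body ran while the lock was held
assert not lock.locked()    # the cleanup ran when the iterator finished
```

The proposed block statement would, roughly, have the interpreter play the role of run_block, with the indented suite as the body.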
-- Luis P Caamano Atlanta, GA USA From lbruno at republico.estv.ipv.pt Fri Apr 29 15:50:16 2005 From: lbruno at republico.estv.ipv.pt (Luis Bruno) Date: Fri Apr 29 15:48:19 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <4271B07A.4010501@hathawaymix.org> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org> Message-ID: <20050429145016.00005d4e@LAB2-14.esi> Hello, Shane Hathaway wrote: > Is it understandable so far? Definitely yes! I had the structure upside-down; your explanation is right on target. Thanks! -- Luis Bruno From shane at hathawaymix.org Fri Apr 29 15:48:39 2005 From: shane at hathawaymix.org (Shane Hathaway) Date: Fri Apr 29 15:48:43 2005 Subject: [Python-Dev] Re: anonymous blocks In-Reply-To: <c56e219d050429054572444ab6@mail.gmail.com> References: <20050429044559.526E31E4006@bag.python.org> <c56e219d050429054572444ab6@mail.gmail.com> Message-ID: <42723B37.3050004@hathawaymix.org> Luis P Caamano wrote: > I've been skipping most of the anonymous block discussion and thus, > I only had a very vague idea of what it was about until I read this > explanation. > > Yes, it is understandable -- assuming it's correct :-) To my surprise, the explanation is now in the PEP. (Thanks, Guido!) Shane From jimjjewett at gmail.com Fri Apr 29 16:43:01 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 16:43:05 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement Message-ID: <fb6fbf5605042907431311af71@mail.gmail.com> Nick Coghlan: > Python offers two variants on the basic iterative loop. > "for NAME in EXPR:" skips the finalisation step. 
At loop
> completion, a well-behaved iterator may still contain additional values.
> "for NAME from EXPR:" enforces finalisation of the iterator.
> ... At loop completion, a well-behaved [finalizing] iterator is
> always completely exhausted.

(nitpick): "from" isn't really different from "in". Perhaps

    for NAME inall EXPR:
    for NAME draining EXPR:
    for NAME finalizing EXPR:  # too hard to spell, because of s/z?

(substance): "finalized or not" is a very useful distinction, but I'm not sure it is something the user should have to worry about. Realistically, most of my loops intend to drain the iterator (which the compiler knows because I have no "break"). Regardless of whether I use a break, I still want the iterator cleaned up if it is drained.

The only thing this second loop form does is set a flag saying "No, I won't continue -- and I happen to know that no one else ever will either, even if they do have a reference that prevents garbage collection. I'm *sure* they won't use it." That strikes me as a dangerous thing to get in the habit of saying. Why not just aggressively run the finalization on both forms when the reference count permits?

> This form supports block management operations,

And this seems unrelated to finalization. I understand that as an implementation detail, you need to define the finalizers somehow. But the decision to aggressively finalize (in some manner) and desire to pass a block (that could be for finalization) seem like orthogonal issues.

-jJ

From lcaamano at gmail.com Fri Apr 29 16:43:34 2005
From: lcaamano at gmail.com (Luis P Caamano)
Date: Fri Apr 29 16:43:36 2005
Subject: [Python-Dev] About block statement name alternative
Message-ID: <c56e219d05042907433a52ec34@mail.gmail.com>

How about "bracket" or "bracket_with"?
As in:

    bracket_with synchronized(lock):
        BLOCK

    bracket_with opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()

    bracket_with transactional(db):
        db.store()

    bracket_with auto_retry(3, IOError):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()

    block_with synchronized_opening("/etc/passwd", myLock) as f:
        for line in f:
            print line.rstrip()

    def synchronized_opening(lock, filename, mode="r"):
        bracket_with synchronized(lock):
            bracket_with opening(filename) as f:
                yield f

    bracket_with synchronized_opening("/etc/passwd", myLock) as f:
        for line in f:
            print line.rstrip()

Or for that matter, "block_with", as in:

    block_with transactional(db):
        db.store()

-- Luis P Caamano Atlanta, GA USA

From skip at pobox.com Fri Apr 29 16:48:24 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 29 16:48:28 2005
Subject: [Python-Dev] PEP 340: What is "ret" in block statement semantics?
Message-ID: <17010.18744.51208.918622@montanaro.dyndns.org>

PEP 340 describes the block statement translation as:

    itr = EXPR1
    val = arg = None
    ret = False
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            if ret:
                return val
            if val is not None:
                raise val
            break
        try:
            val = arg = None
            ret = False
            BLOCK1
        except Exception, val:
            arg = StopIteration()

It uses a variable "ret" that is always False. If it does manage to take on a True value, a return statement is executed. How does ret become True? What's the meaning of return in this context? Something seems amiss.

Skip

From skip at pobox.com Fri Apr 29 16:49:09 2005
From: skip at pobox.com (Skip Montanaro)
Date: Fri Apr 29 16:49:12 2005
Subject: [Python-Dev] PEP 340: What is "ret" in block statement semantics?
Message-ID: <17010.18789.916072.361333@montanaro.dyndns.org>

me> It uses a variable "ret" that is always False.

Gaack. Please ignore.
Skip

From ncoghlan at gmail.com Fri Apr 29 17:00:44 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 17:00:50 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42721E2E.8020108@cirad.fr>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr>
Message-ID: <42724C1C.4040200@gmail.com>

Pierre Barbier de Reuille wrote:
> One main reason is a common error could be (using the synchronised
> iterator introduced in the PEP):
>
>     for l in synchronised(mylock):
>         do_something()
>
> It will compile, run, never raise any error but the lock will be
> acquired and never released!

It's better than that. With the code above, CPython is actually likely to release the lock when the loop exits. Change the code to the below to ensure the lock doesn't get released:

    sync = synchronised(mylock)
    for l in sync:
        do_something()

> Then, I think there is no use case of a generator with __error__ in the
> for-loop as it is now. So, IMO, it is error-prone and useless to have
> two different syntaxes for such things.

Hmm. This does make PJE's suggestion of requiring a decorator in order to flag generators for finalisation a little more appealing. Existing generators (without the flag) would not be cleaned up, preserving backwards compatibility. Generators with the flag would allow resource clean-up.

In this case of no new statement syntax, it would probably make more sense to refer to iterators that get cleaned up as finalised iterators, and a builtin with the obvious name would be:

    def finalised(obj):
        obj.__finalise__ = True  # The all-important flag!
        return obj

The syntax below would still be horrible:

    for f in opening(filename):
        for line in f:
            # process line

But such ugliness could be fixed by pushing the inner loop inside the block iterator:

    for line in opened(filename):
        # process line

    @finalised
    def opened(filename):
        f = open(filename)
        try:
            for line in f:
                yield line
        finally:
            f.close()

Then, in Py3K, finalisation could simply become the default for loop behaviour. However, the '__finalise__' flag would result in some impressive code bloat, as any for loop would need to expand to:

    itr = iter(EXPR1)
    if getattr(itr, "__finalise__", False):
        # Finalised semantics
        # I'm trying to channel Guido here.
        # This would really look like whatever the PEP 340 block statement
        # semantics end up being
        val = arg = None
        ret = broke = False
        while True:
            try:
                VAR1 = next(itr, arg)
            except StopIteration:
                BLOCK2
                break
            try:
                val = arg = None
                ret = False
                BLOCK1
            except Exception, val:
                itr.__error__(val)
            if ret:
                try:
                    itr.__error__(StopIteration())
                except StopIteration:
                    pass
                return val
    else:
        # Non-finalised semantics
        arg = None
        while True:
            try:
                VAR1 = next(itr, arg)
            except StopIteration:
                BLOCK2
                break
            arg = None
            BLOCK1

The major danger I see is that you could then write a generator containing a yield inside a try/finally, _without_ applying the finalisation decorator. Leading to exactly the problem described above - the lock (or whatever) is never cleaned up, because the generator is not flagged for finalisation. In this scenario, even destruction of the generator object won't help.

Cheers,
Nick.

P.S. I think PEP 340's proposed for loop semantics are currently incorrect, as BLOCK2 is unreachable.
It should look more like the non-finalised semantics above (with BLOCK2 before the break in the except clause)

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

From ncoghlan at gmail.com Fri Apr 29 17:26:13 2005
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri Apr 29 17:26:18 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <fb6fbf5605042907431311af71@mail.gmail.com>
References: <fb6fbf5605042907431311af71@mail.gmail.com>
Message-ID: <42725215.7030206@gmail.com>

Jim Jewett wrote:
> Why not just aggressively run the finalization on both forms when the
> reference count permits?

So the iterator is always finalised if the for loop has the only reference? Two problems I can see there are that naming the target of the for loop would prevent it being finalised, and that this would make life interesting when the Jython or IronPython folks catch up to Python 2.5. . .

The finalised/not finalised aspect definitely seems to be the key behavioural distinction between the two forms, though. And I think there are legitimate use cases for a non-finalised form. Things like:

    for line in f:
        if end_of_header(line):
            break
        # process header line

    for line in f:
        # process body line

With only a finalised form of iteration available, this would need to be rewritten as something like:

    def header(f):
        line = next(f)
        while not end_of_header(line):
            line = next(f, yield line)

    for line in header(f):
        # process header line

    for line in f:
        # process body line

Considering the above, I actually have grave reservations about *ever* making finalisation the default behaviour of for loops - if I break out of a standard for loop before exhausting the iterator, I would expect to be able to resume the iterator afterwards, rather than having it flushed behind my back.

Cheers,
Nick.
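Concretely, the staged-consumption pattern relies on both loops sharing one iterator object. A small sketch (the data and the end_of_header predicate are made up for illustration):

```python
lines = iter(["Host: example", "Accept: */*", "", "body line 1", "body line 2"])

def end_of_header(line):
    return line == ""  # a blank line terminates the header (assumption)

header, body = [], []
for line in lines:
    if end_of_header(line):
        break
    header.append(line)  # process header line

for line in lines:       # the same iterator resumes where it stopped
    body.append(line)    # process body line

assert header == ["Host: example", "Accept: */*"]
assert body == ["body line 1", "body line 2"]
```

If breaking out of the first loop finalised the iterator, the second loop would see nothing - which is exactly the reservation about making finalisation the default.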
-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://boredomandlaziness.skystorm.net

From pierre.barbier at cirad.fr Fri Apr 29 17:45:21 2005
From: pierre.barbier at cirad.fr (Pierre Barbier de Reuille)
Date: Fri Apr 29 17:44:53 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42724C1C.4040200@gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com>
Message-ID: <42725691.4030308@cirad.fr>

Nick Coghlan wrote:
> Pierre Barbier de Reuille wrote:
>> One main reason is a common error could be (using the synchronised
>> iterator introduced in the PEP):
>>
>>     for l in synchronised(mylock):
>>         do_something()
>>
>> It will compile, run, never raise any error but the lock will be
>> acquired and never released!
>
> It's better than that. With the code above, CPython is actually likely
> to release the lock when the loop exits. Change the code to the below to
> ensure the lock doesn't get released:
>
>     sync = synchronised(mylock)
>     for l in sync:
>         do_something()

Well indeed, but this will be an implementation-dependent behaviour...

>> Then, I think there is no use case of a generator with __error__ in
>> the for-loop as it is now. So, IMO, it is error-prone and useless to
>> have two different syntaxes for such things.
>
> [...]
>
> The major danger I see is that you could then write a generator
> containing a yield inside a try/finally, _without_ applying the
> finalisation decorator. Leading to exactly the problem described above -
> the lock (or whatever) is never cleaned up, because the generator is not
> flagged for finalisation. In this scenario, even destruction of the
> generator object won't help.

Mmmh... why introduce a new flag? Can't you just test for the presence of the "__error__" method? This would solve your problem, wouldn't it?
> > Cheers, > Nick. > > P.S. I think PEP 340's proposed for loop semantics are currently > incorrect, as BLOCK2 is unreachable. It should look more like the > non-finalised semantics above (with BLOCK2 before the break in the > except clause) > -- Pierre Barbier de Reuille INRA - UMR Cirad/Inra/Cnrs/Univ.MontpellierII AMAP Botanique et Bio-informatique de l'Architecture des Plantes TA40/PSII, Boulevard de la Lironde 34398 MONTPELLIER CEDEX 5, France tel : (33) 4 67 61 65 77 fax : (33) 4 67 61 56 68 From aahz at pythoncraft.com Fri Apr 29 18:34:08 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 18:34:10 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <Pine.A41.4.61b.0504281913370.127138@dante65.u.washington.edu> Message-ID: <20050429163408.GA14920@panix.com> On Thu, Apr 28, 2005, Brian Sabbey wrote: > > It is possible to implement thunks without them creating their own frame. > They can reuse the frame of the surrounding function. So a new frame does > not need to be created when the thunk is called, and, much like with a > yield statement, the frame is not taken down when the thunk completes > running. The implementation just needs to take care to save and restore > members of the frame that get clobbered when the thunk is running. > > Cells would of course not be required if the thunk does not create its own > frame. Maybe. It's not clear whether your thunks are lexical (I haven't been following the discussion closely). If it's not lexical, how do locals get handled without cells? 
-- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." From aahz at pythoncraft.com Fri Apr 29 18:38:54 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 18:38:58 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <4271F71B.8010000@gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> Message-ID: <20050429163854.GB14920@panix.com> On Fri, Apr 29, 2005, Nick Coghlan wrote: > > If you want to emphasise the similarity, the following syntax and > explanation is something that occurred to me during lunch today: We don't want to emphasize the similarity. > Python offers two variants on the basic iterative loop. > > "for NAME from EXPR:" enforces finalisation of the iterator. At loop > completion, a well-behaved iterator is always completely exhausted. This > form supports block management operations, that ensure timely release of > resources such as locks or file handles. > If the values being iterated over are not required, then the statement > may be simplified to "for EXPR:". > > "for NAME in EXPR:" skips the finalisation step. At loop completion, a > well-behaved iterator may still contain additional values. This form allows > an iterator to be consumed in stages. -1 -- the Zen of Python implies that we should be able to tell which construct we're using at the beginning of the line. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
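[The cell question Aahz raises is the crux of the thunk discussion: today's nested scopes share *reads* through cells, but a plain assignment in an inner function creates a new local rather than rebinding the outer name. A sketch of the current behaviour (Python's later `nonlocal` statement is what eventually addressed the rebinding half):]

```python
def make_counter():
    count = 0
    def read():
        return count     # free variable: read through a shared cell
    def bump():
        count = 100      # assignment makes 'count' local to bump
        return count
    return read, bump

read, bump = make_counter()
assert read() == 0
assert bump() == 100
assert read() == 0                             # outer 'count' never rebound
assert read.__closure__ is not None            # read really uses a cell
assert read.__closure__[0].cell_contents == 0
assert bump.__closure__ is None                # bump closed over nothing
```

A thunk that behaved like a block body would need assignments such as bump's to go to the shared cell instead, which is why Guido notes that every variable used or set in the thunk would have to become a cell.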
From david.ascher at gmail.com Fri Apr 29 18:42:33 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 18:42:35 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042815557616722b@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> Message-ID: <dd28fc2f05042909422742b720@mail.gmail.com> On 4/28/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > How about, instead of trying to emphasize how different a > block-statement is from a for-loop, we emphasize their similarity? > > A regular old loop over a sequence or iterable is written as: > > for VAR in EXPR: > BLOCK > > A variation on this with somewhat different semantics swaps the keywords: > > in EXPR for VAR: > BLOCK > > If you don't need the variable, you can leave the "for VAR" part out: > > in EXPR: > BLOCK > > Too cute? :-) If you want to truly confuse the Ruby folks, you could go for something like: { EXPR } VAR: BLOCK <wink/> From jcarlson at uci.edu Fri Apr 29 19:02:38 2005 From: jcarlson at uci.edu (Josiah Carlson) Date: Fri Apr 29 19:03:49 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42724C1C.4040200@gmail.com> References: <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com> Message-ID: <20050429095557.6455.JCARLSON@uci.edu> Nick Coghlan <ncoghlan@gmail.com> wrote: > Then, in Py3K, finalisation could simply become the default for loop behaviour. > However, the '__finalise__' flag would result in some impressive code bloat, as > any for loop would need to expand to: > > itr = iter(EXPR1) > if getattr(itr, "__finalise__", False): > # Finalised semantics > # I'm trying to channel Guido here. 
> # This would really look like whatever the PEP 340 block statement > # semantics end up being > val = arg = None > ret = broke = False > while True: > try: > VAR1 = next(itr, arg) > except StopIteration: > BLOCK2 > break > try: > val = arg = None > ret = False > BLOCK1 > except Exception, val: > itr.__error__(val) > if ret: > try: > itr.__error__(StopIteration()) > except StopIteration: > pass > return val The problem is that BLOCK2 is executed within the while loop (the same problem I had with a fix I offered), which may contain a break for breaking out of a higher-level loop construct. Here's one that works as you intended (though perhaps I'm being a bit to paranoid about the __error__ attribute)... val = arg = None ret = ex_block_2 = False while True: try: VAR1 = next(itr, arg) except StopIteration: ex_block_2 = True break try: val = arg = None ret = False BLOCK1 except Exception, val: if hasattr(itr, '__error__): itr.__error__(val) if ret: try: if hasattr(itr, '__error__'): itr.__error__(StopIteration()) except StopIteration: pass return val if ex_block_2: BLOCK2 > P.S. I think PEP 340's proposed for loop semantics are currently incorrect, as > BLOCK2 is unreachable. It should look more like the non-finalised semantics > above (with BLOCK2 before the break in the except clause) Indeed, I also mentioned this on Wednesday. - Josiah From pje at telecommunity.com Fri Apr 29 19:08:04 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Apr 29 19:04:56 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429163854.GB14920@panix.com> References: <4271F71B.8010000@gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> Message-ID: <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> At 09:38 AM 4/29/05 -0700, Aahz wrote: >-1 -- the Zen of Python implies that we should be able to tell which >construct we're using at the beginning of the line. Hm, maybe we should just use "@", then. 
:)  e.g.

    @synchronized(self):
        @with_file("foo") as f:
            # etc.

Although I'd personally prefer a no-keyword approach:

    synchronized(self):
        with_file("foo") as f:
            # etc.

From jcarlson at uci.edu Fri Apr 29 19:08:22 2005
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri Apr 29 19:09:50 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <42724C1C.4040200@gmail.com>
References: <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com>
Message-ID: <20050429100605.6458.JCARLSON@uci.edu>

Nick Coghlan <ncoghlan@gmail.com> wrote:
>     # Non-finalised semantics
>     arg = None
>     while True:
>         try:
>             VAR1 = next(itr, arg)
>         except StopIteration:
>             BLOCK2
>             break
>         arg = None
>         BLOCK1

And that bad boy should be...

    # Non-finalised semantics
    ex_block_2 = False
    arg = None
    while True:
        try:
            VAR1 = next(itr, arg)
        except StopIteration:
            ex_block_2 = True
            break
        arg = None
        BLOCK1
    if ex_block_2:
        BLOCK2

Josiah Carlson wrote:
> Indeed, I also mentioned this on Wednesday.

Though I was somewhat incorrect, as the code examples I offered express
the actual intent.

 - Josiah

From gvanrossum at gmail.com Fri Apr 29 19:16:12 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 29 19:16:15 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com>
Message-ID: <ca471dc205042910162befaaee@mail.gmail.com>

[Phillip J. Eby]
> Although I'd personally prefer a no-keyword approach:
>
>     synchronized(self):
>         with_file("foo") as f:
>             # etc.

I'd like that too, but it was shot down at least once. Maybe we can
resurrect it?

    opening("foo") as f:
        # etc.

is just a beauty!
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jimjjewett at gmail.com Fri Apr 29 19:17:13 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 19:17:16 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? Message-ID: <fb6fbf560504291017362204a7@mail.gmail.com> Brian Sabbey: > It is possible to implement thunks without them creating their own > frame. They can reuse the frame of the surrounding function ... > The implementation just needs to take care > to save and restore members of the frame that get clobbered when the > thunk is running. Michael Hudson: > Woo. That's cute. It *sounds* horrendous, but is actually pretty reasonable. Conceptually, a thunk replaces a suite in the caller. Most frame members are intended to be shared, and changes should be visible -- so they don't have to (and shouldn't) be restored. The only members that need special attention are (f_code, f_lasti) and possibly (f_blockstack, f_iblock). (f_code, f_lasti) would need to be replaced with a stack of pairs. Finishing a code string would mean popping this stack, rather than popping the whole frame. Since a completed suite leaves the blockstack where it started, (f_blockstack, f_iblock) *can* be ignored, though debugging and CO_MAXBLOCKS both *suggest* replacing the pair with a stack of pairs. -jJ From david.ascher at gmail.com Fri Apr 29 19:23:21 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 19:23:23 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <dd28fc2f05042910232e69c41f@mail.gmail.com> On 4/29/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > [Phillip J. 
Eby] > > Although I'd personally prefer a no-keyword approach: > > > > synchronized(self): > > with_file("foo") as f: > > # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! I agree, but does this then work: x = opening("foo") ...stuff... x as f: # etc ? And if not, why not? And if yes, what happens if "stuff" raises an exception? From david.ascher at gmail.com Fri Apr 29 19:24:58 2005 From: david.ascher at gmail.com (David Ascher) Date: Fri Apr 29 19:25:00 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <dd28fc2f05042910232e69c41f@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <dd28fc2f05042910232e69c41f@mail.gmail.com> Message-ID: <dd28fc2f050429102445d8e435@mail.gmail.com> > I agree, but does this then work: > > x = opening("foo") > ...stuff... > x as f: > # etc > > ? And if not, why not? And if yes, what happens if "stuff" raises an > exception? Forget it -- the above is probably addressed by the PEP and doesn't really depend on whether there's a kw or not. From aahz at pythoncraft.com Fri Apr 29 19:42:32 2005 From: aahz at pythoncraft.com (Aahz) Date: Fri Apr 29 19:42:34 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <20050429174232.GB18361@panix.com> On Fri, Apr 29, 2005, Guido van Rossum wrote: > [Phillip J. 
Eby] >> >> Although I'd personally prefer a no-keyword approach: >> >> synchronized(self): >> with_file("foo") as f: >> # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. I'm still -1 for the same reason I mentioned earlier: function calls spanning multiple lines are moderately common in Python code, and it's hard to distinguish these cases because multi-line calls usually get indented like blocks. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." From jimjjewett at gmail.com Fri Apr 29 19:48:57 2005 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri Apr 29 19:48:59 2005 Subject: [Python-Dev] next(arg) was: Anonymous blocks: Thunks or iterators? Message-ID: <fb6fbf56050429104851d1992@mail.gmail.com> Guido van Rossum: > One [of many separate ideas in PEP 340] is turning generators > into more general coroutines: continue EXPR passes the expression > to the iterator's next() method ... I would have been very happy with that a week ago. Seeing the specific implementation changed my mind. The caller shouldn't know what state the generator is in, so the passed-in-message will be the same regardless of which yield accepts it. Unless I have a single-yield generator, this means I end up writing boilerplate code to accept and process the arg at each yield. I don't want more boilerplate. > Even without a block-statement, these two changes make yield look a > lot like invoking a thunk Though it feels backwards to me; yield is returning control to something that already had to coordinate the thunks itself. -jJ From pje at telecommunity.com Fri Apr 29 19:54:43 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Fri Apr 29 19:51:38 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429174232.GB18361@panix.com> References: <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> At 10:42 AM 4/29/05 -0700, Aahz wrote: >On Fri, Apr 29, 2005, Guido van Rossum wrote: > > [Phillip J. Eby] > >> > >> Although I'd personally prefer a no-keyword approach: > >> > >> synchronized(self): > >> with_file("foo") as f: > >> # etc. > > > > I'd like that too, but it was shot down at least once. Maybe we can > > resurrect it? > > > > opening("foo") as f: > > # etc. > >I'm still -1 for the same reason I mentioned earlier: function calls >spanning multiple lines are moderately common in Python code, and it's >hard to distinguish these cases because multi-line calls usually get >indented like blocks. But the indentation of a multi-line call doesn't start with a colon. Or are you saying you're concerned about things like: opening( blah, blah, foo, wah=flah ) as fidgety, widgety, foo: sping() Which is quite ugly, to be sure, but then I don't see where adding an extra keyword helps. From gvanrossum at gmail.com Fri Apr 29 19:55:28 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 19:55:31 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? 
In-Reply-To: <2mll720vts.fsf@starship.python.net> References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <4270A1F2.1030401@canterbury.ac.nz> <ca471dc205042815157cf20297@mail.gmail.com> <2mll720vts.fsf@starship.python.net> Message-ID: <ca471dc20504291055f5bb88e@mail.gmail.com> [Michael Hudson] > I think the making-generators-more-sexy thing is nice, but I'm think > that's almost orthogonal. Not entirely. I agree that "continue EXPR" calling next(EXPR) which enables yield-expressions is entirely orthogonal. But there are already two PEPs asking for passing exceptions and/or cleanup into generators and from there it's only a small step to using them as resource allocation/release templates. The "small step" part is important -- given that we're going to do that work on generators anyway, I expect the changes to the compiler and VM to support the block statement are actually *less* than the changes needed to support thunks. No language feature is designed in isolation. > Did you read this mail: > > http://mail.python.org/pipermail/python-dev/2005-April/052970.html > > ? In this proposal, you have to go to some effort to make the thunk > survive the block, and I think if weirdness results, that's the > programmer's problem. It's not a complete proposal though. You say "And grudgingly, I guess you'd need to make returns behave like that anyway" (meaning they should return from the containing function). But you don't give a hint on how that could be made to happen, and I expect that by the time you've figured out a mechanism, thunks aren't all that simple any more. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From rrr at ronadam.com Fri Apr 29 19:57:25 2005 From: rrr at ronadam.com (Ron Adam) Date: Fri Apr 29 19:57:59 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <fb6fbf5605042907431311af71@mail.gmail.com> References: <fb6fbf5605042907431311af71@mail.gmail.com> Message-ID: <42727585.3090801@ronadam.com> Jim Jewett wrote: >Nick Coghlan: > > >>Python offers two variants on the basic iterative loop. >> > >> "for NAME in EXPR:" skips the finalisation step. At loop >>completion, a well-behaved iterator may still contain additional values. >> > >> "for NAME from EXPR:" enforces finalisation of the iterator. >>... At loop completion, a well-behaved [finalizing] iterator is >>always completely exhausted. >> > >(nitpick): > "from" isn't really different from "in". Perhaps > > for NAME inall EXPR: > for NAME draining EXPR: > for NAME finalizing EXPR: # too hard to spell, because of s/z? > >(substance): > >"finalized or not" is a very useful distinction, but I'm not sure it >is something the user should have to worry about. Realistically, >most of my loops intend to drain the iterator (which the compiler >knows because I have no "break:". Regardless of whether I >use a break, I still want the iterator cleaned up if it is drained. > >The only thing this second loop form does is set a flag saying > > "No, I won't continue -- and I happen to know that no one else > ever will either, even if they do have a reference that prevents > garbage collection. I'm *sure* they won't use it." > >That strikes me as a dangerous thing to get in the habit of saying. > >Why not just agressively run the finalization on both forms when the >reference count permits? > > >>This form supports block management operations, >> > >And this seems unrelated to finalization. I understand that as an >implementation detail, you need to define the finalizers somehow. 
>But the decision to aggressively finalize (in some manner) and
>desire to pass a block (that could be for finalization) seem like
>orthogonal issues.
>
>-jJ
>_______________________________________________
>Python-Dev mailing list
>Python-Dev@python.org
>http://mail.python.org/mailman/listinfo/python-dev
>Unsubscribe: http://mail.python.org/mailman/options/python-dev/rrr%40ronadam.com
>
>

How about 'serve' as in a server of items from a service?

    serve NAME from EXPR:
        <block>

I think this is more descriptive of what it does and will make it
easier to explain.  It also implies the correct relationship between
the block, the name, and the expression.

I think 'block' and 'with' are both *way* too general.  The problem I
see with 'block' is that the term is often used as a general term to
describe the body of other statements.... while, for, if, ... etc.

The generator in this case could be called a 'server' which would
distinguish it from a normal generator.  By using 'serve' as a keyword,
you can then refer to the expression as a whole as a 'service' or a
'resource manager'.

And a simple description of it would be....

A SERVE statement serves NAME(s) from a SERVER to the following
statement block.

(Details of how to use SERVE blocks and SERVERS.)

Ron Adam

From gvanrossum at gmail.com Fri Apr 29 20:00:06 2005
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Fri Apr 29 20:00:08 2005
Subject: [Python-Dev] next(arg) was: Anonymous blocks: Thunks or iterators?
In-Reply-To: <fb6fbf56050429104851d1992@mail.gmail.com>
References: <fb6fbf56050429104851d1992@mail.gmail.com>
Message-ID: <ca471dc2050429110062695af8@mail.gmail.com>

[Guido van Rossum]
> > One [of many separate ideas in PEP 340] is turning generators
> > into more general coroutines: continue EXPR passes the expression
> > to the iterator's next() method ...

[Jim Jewett]
> I would have been very happy with that a week ago.  Seeing the
> specific implementation changed my mind.
> > The caller shouldn't know what state the generator is in, so the > passed-in-message will be the same regardless of which yield > accepts it. Unless I have a single-yield generator, this means > I end up writing boilerplate code to accept and process the arg > at each yield. I don't want more boilerplate. I think your premise is wrong. When necessary (which it usually won't be) the caller can tell the generator's state from the last thing it yielded. Coroutines can easily define a protocol based on this if needed. Anyway, single-yield generators are by far the majority. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From gvanrossum at gmail.com Fri Apr 29 20:01:01 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Fri Apr 29 20:01:04 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42727585.3090801@ronadam.com> References: <fb6fbf5605042907431311af71@mail.gmail.com> <42727585.3090801@ronadam.com> Message-ID: <ca471dc2050429110119120892@mail.gmail.com> [Ron Adam] > How about 'serve' as in a server of items from a service? No, please. This has way too strong connotations with network protocols. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sabbey at u.washington.edu Fri Apr 29 20:10:42 2005 From: sabbey at u.washington.edu (Brian Sabbey) Date: Fri Apr 29 20:10:48 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <fb6fbf560504291017362204a7@mail.gmail.com> References: <fb6fbf560504291017362204a7@mail.gmail.com> Message-ID: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> Jim Jewett wrote: > The only members that need special attention are (f_code, f_lasti) > and possibly (f_blockstack, f_iblock). You don't even need to take care of f_code. The thunk and its surrounding function can share the same code. The thunk gets compiled into the function the same way the body of a for loop would. 
> (f_code, f_lasti) would need to be replaced with a stack of pairs.
> Finishing a code string would mean popping this stack, rather
> than popping the whole frame.

There doesn't need to be a stack; each thunk can store its own f_lasti.
One also needs to store f_back, and, to avoid exception weirdness,
f_exc_XXX.  In this way, calling the thunk is much like resuming a
generator.

-Brian

From mahs at telcopartners.com Fri Apr 29 20:03:33 2005
From: mahs at telcopartners.com (Michael Spencer)
Date: Fri Apr 29 20:17:05 2005
Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f:
Message-ID: <d4tsik$4be$1@sea.gmane.org>

I don't know whether it's true for all the PEP 340 use cases, but all
the current examples would read very naturally if the block-template
could be specified in an extended try statement:

> 1. A template for ensuring that a lock, acquired at the start of a
>    block, is released when the block is left:

    try with_lock(myLock):
        # Code here executes with myLock held.  The lock is
        # guaranteed to be released when the block is left (even
        # if by an uncaught exception).

> 2. A template for opening a file that ensures the file is closed
>    when the block is left:

    try opening("/etc/passwd") as f:
        for line in f:
            print line.rstrip()

> 3. A template for committing or rolling back a database
>    transaction:

    try transaction(mydb):

> 4. A template that tries something up to n times:

    try auto_retry(3):
        f = urllib.urlopen("http://python.org/peps/pep-0340.html")
        print f.read()

> 5. It is possible to nest blocks and combine templates:

    try with_lock(myLock):
        try opening("/etc/passwd") as f:
            for line in f:
                print line.rstrip()

Michael

From jimjjewett at gmail.com Fri Apr 29 20:23:05 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 20:23:08 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
Message-ID: <fb6fbf5605042911233ce849db@mail.gmail.com>

Guido van Rossum:
> -- but it's more efficient, since calling yield doesn't create a frame.

Neither should a thunk.

> The other problem with thunks is that once we think of them as the
> anonymous functions they are, we're pretty much forced to say that a
> return statement in a thunk returns from the thunk rather than from
> the containing function.

Why should a thunk be a function?  We already have first class
functions.  What we're missing is a way to pass around a suite.

    def foo(a):
        if a > 4:
            b = a
            c = process(a)    # thunk line 1
            print a           # thunk line 2
            return            # thunk line 3
        else:
            a.something()

We don't have a good way to package up "c = process(a); print a;
return".  The return should exit the whole function, not just (part
of) the if clause.

Greg:
>> I'd like to reconsider a thunk implementation.  It
>> would be a lot simpler, doing just what is required
>> without any jiggery pokery with exceptions and
>> break/continue/return statements.  It would be easy
>> to explain what it does and why it's useful.

> I don't know. In order to obtain the required local variable sharing
> between the thunk and the containing function I believe that every
> local variable used or set in the thunk would have to become a 'cell'
> (our mechanism for sharing variables between nested scopes).

Cells only work if you have a complete set of names at compile-time.
Your own resource-example added "item" to the namespace inside a
block.  If you don't know which blocks could be used with a pattern,
cells are out.

That said, the compiler code is already two-pass.  Once to find names,
and another time to resolve them.  This just means that for thunks
(and functions that call them) the adjustment will be to LOAD_NAME
instead of getting a LOAD_FAST index.
-jJ

From jjl at pobox.com Fri Apr 29 20:31:17 2005
From: jjl at pobox.com (John J Lee)
Date: Fri Apr 29 20:30:02 2005
Subject: [Python-Dev] Re: anonymous blocks
In-Reply-To: <4271B07A.4010501@hathawaymix.org>
References: <ca471dc205042116402d7d38da@mail.gmail.com> <ca471dc205042416572da9db71@mail.gmail.com> <426DB7C8.5020708@canterbury.ac.nz> <ca471dc2050426043713116248@mail.gmail.com> <426E3B01.1010007@canterbury.ac.nz> <ca471dc205042621472b1f6edf@mail.gmail.com> <427083B0.6040204@canterbury.ac.nz> <Pine.LNX.4.58.0504280158590.4786@server1.LFW.org> <ca471dc2050428155127cec9a5@mail.gmail.com> <4271B07A.4010501@hathawaymix.org>
Message-ID: <Pine.WNT.4.58.0504291927120.2748@vernon>

On Thu, 28 Apr 2005, Shane Hathaway wrote:
[...]
> I think this concept can be explained clearly.  I'd like to try
> explaining PEP 340 to someone new to Python but not new to programming.
[...snip explanation...]
> Is it understandable so far?

Yes, excellent.  Speaking as somebody who scanned the PEP and this
thread and only half-understood either, that was quite painless to
read.  Still not sure whether thunks or PEP 340 are better, but I'm at
least confused on a higher level now.

John

From rrr at ronadam.com Fri Apr 29 20:35:35 2005
From: rrr at ronadam.com (Ron Adam)
Date: Fri Apr 29 20:33:38 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <ca471dc2050429110119120892@mail.gmail.com>
References: <fb6fbf5605042907431311af71@mail.gmail.com> <42727585.3090801@ronadam.com> <ca471dc2050429110119120892@mail.gmail.com>
Message-ID: <42727E77.80504@ronadam.com>

Guido van Rossum wrote:
>[Ron Adam]
>
>>How about 'serve' as in a server of items from a service?
>>
>
>No, please. This has way too strong connotations with network protocols.
>
>

Errr...  you're right of course...  :-/  (I was thinking *way* too
narrow.)

I think the context is correct, just need a synonym that isn't already
used.
    provide, provider
    supply, supplier
    dispense, dispenser
    deal, dealer
    deliver, deliverer

or parcel, meter, dish, give, dole, offer, cede...

Maybe borrow from a different language?

Ron Adam

From jimjjewett at gmail.com Fri Apr 29 20:33:54 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 20:33:56 2005
Subject: [Python-Dev] Anonymous blocks: Thunks or iterators?
In-Reply-To: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu>
References: <fb6fbf560504291017362204a7@mail.gmail.com> <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu>
Message-ID: <fb6fbf5605042911337960ee4d@mail.gmail.com>

On 4/29/05, Brian Sabbey <sabbey@u.washington.edu> wrote:
> Jim Jewett wrote:
> > The only members that need special attention are (f_code, f_lasti)
> > and possibly (f_blockstack, f_iblock).

> You don't even need to take care of f_code.  The thunk and its surrounding
> function can share the same code.  The thunk gets compiled into the
> function the same way the body of a for loop would.

This only works if you already know what the thunk's code will be when
you compile the function.  (Just splicing it in messes up jump
targets.)

> One also needs to store f_back, and, to avoid exception weirdness,
> f_exc_XXX.

f_back lists the previous stack frame (which shouldn't change during a
thunk[1]), and f_exc_XXX is for the most recent exception -- I don't
see any reason to treat thunks differently from loop bodies in that
regard.

[1] If the thunk calls another function (that needs its own frame),
then that is handled the same as any regular function call.
-jJ

From jimjjewett at gmail.com Fri Apr 29 21:01:20 2005
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri Apr 29 21:01:25 2005
Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f:
Message-ID: <fb6fbf5605042912013a7f896@mail.gmail.com>

Michael Spencer:
> I don't know whether it's true for all the PEP 340 use cases, but all the
> current examples would read very naturally if the block-template could be
> specified in an extended try statement:

>> 1. A template for ensuring that a lock, acquired at the start of a
>> block, is released when the block is left:

>     try with_lock(myLock):
>         # Code here executes with myLock held.  The lock is
>         # guaranteed to be released when the block is left (even
>         # if by an uncaught exception).

So we would have try ... finally, try ... except, and try (no close).
It works for me, and should be backwards-compatible.

The cases where it doesn't work as well are:

(1)  You want to insert several different suites.  But the anonymous
yield syntax doesn't work well for that either.  (That is one of the
arguments for thunks instead of generator abuse.)

(2)  You really do want to loop over the suite.  Try doesn't imply a
loop.  But this is a *good* thing.
Resources are not loops, and you can always make the loop explicit as
iteration over the resource:

    def opener(file):
        f = open(file)
        try:
            yield f
        finally:
            f.close()

    try opener(file) as f:
        for line in f:
            process(line)

From aahz at pythoncraft.com Fri Apr 29 21:05:25 2005
From: aahz at pythoncraft.com (Aahz)
Date: Fri Apr 29 21:05:27 2005
Subject: [Python-Dev] PEP 340 - possible new name for block-statement
In-Reply-To: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com>
References: <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com>
Message-ID: <20050429190525.GA2708@panix.com>

On Fri, Apr 29, 2005, Phillip J. Eby wrote:
> At 10:42 AM 4/29/05 -0700, Aahz wrote:
>>On Fri, Apr 29, 2005, Guido van Rossum wrote:
>>> [Phillip J. Eby]
>>>>
>>>> Although I'd personally prefer a no-keyword approach:
>>>>
>>>>     synchronized(self):
>>>>         with_file("foo") as f:
>>>>             # etc.
>>>
>>> I'd like that too, but it was shot down at least once. Maybe we can
>>> resurrect it?
>>>
>>>     opening("foo") as f:
>>>         # etc.
>>
>>I'm still -1 for the same reason I mentioned earlier: function calls
>>spanning multiple lines are moderately common in Python code, and it's
>>hard to distinguish these cases because multi-line calls usually get
>>indented like blocks.
>
> But the indentation of a multi-line call doesn't start with a colon.

Neither does the un-keyworded block.  It starts with a colon on the end
of the previous line.  I thought part of the point of Python was to
minimize reliance on punctuation, especially where it's not clearly
visible?
--
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It's 106 miles to Chicago.  We have a full tank of gas, a half-pack of
cigarettes, it's dark, and we're wearing sunglasses."  "Hit it."
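[Editor's note: Jim's `opener` template above is already valid generator code; the only piece the thread is debating is the machinery that drives it. The driving loop can be sketched today in ordinary (modern) Python with no new syntax -- the explicit `close()` call on the generator stands in for the finalisation semantics under discussion. The file setup via `tempfile` is only there to make the sketch self-contained.]

```python
import os
import tempfile

def opening(path, mode="r"):
    # Single-yield resource template in the style of opener() above:
    # acquire the resource, yield it once, release it in the finally.
    f = open(path, mode)
    try:
        yield f
    finally:
        f.close()

# Throwaway input file so the example runs as-is.
fd, path = tempfile.mkstemp()
os.write(fd, b"line one\n")
os.close(fd)

# Drive the template with a plain for loop; closing the generator in a
# finally clause guarantees the template's cleanup runs even if the
# body raises -- the behaviour the proposed statement would automate.
gen = opening(path)
try:
    for f in gen:
        text = f.read()
finally:
    gen.close()

os.remove(path)
print(text)
```

Exhausting the loop normally already runs the template's `finally` clause; the extra `gen.close()` only matters when the body exits early.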
From andre.roberge at gmail.com Fri Apr 29 22:10:57 2005
From: andre.roberge at gmail.com (André Roberge)
Date: Fri Apr 29 22:16:04 2005
Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement
In-Reply-To: <4271B7FC.1070801@pobox.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com>
Message-ID: <d4u41i$bpc$1@sea.gmane.org>

Robin Munn wrote:
[snip]
>
> Another possibility just occurred to me.  How about "using"?
>
> ~ using EXPR as VAR:
> ~     BLOCK
>

Examples from PEP 340:
==========
def synchronized(lock):
    ...
using synchronized(myLock):
    ...
=====  (+0)
def opening(filename, mode="r"):
    ...
using opening("/etc/passwd") as f:
    ...
=====  (+1)
def auto_retry(n=3, exc=Exception):
    ...
using auto_retry(3, IOError):
    ...
=====  (+1)
def synchronized_opening(lock, filename, mode="r"):
    ...
using synchronized_opening(myLock, "/etc/passwd") as f:
    ...
=====  (+1)

A.R.

From nidoizo at yahoo.com Fri Apr 29 22:26:27 2005
From: nidoizo at yahoo.com (Nicolas Fleury)
Date: Fri Apr 29 22:27:34 2005
Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement
In-Reply-To: <ca471dc2050428223023aa80fc@mail.gmail.com>
References: <ca471dc205042815557616722b@mail.gmail.com> <d4s8nh$fk8$1@sea.gmane.org> <ca471dc2050428223023aa80fc@mail.gmail.com>
Message-ID: <d4u4uq$fd5$1@sea.gmane.org>

Guido van Rossum wrote:
> [Nicolas Fleury]
>> scoped EXPR as VAR:
>>     BLOCK
>
> Definitely not. In too many languages, a "scope" is a new namespace,
> and that's exactly what a block (by whichever name) is *not*.

Humm... what about "context"?

    context EXPR as VAR:
        BLOCK

I may answer the question myself, but is an alternative syntax without
an indentation conceivable?  (yes, even since the implicit block could
be run multiple times).  Because in that case, a keyword like "block"
would not look right.
It seems to me that in most RAII cases, the block could end at the end of the current block and that's fine, and over-indentation can be avoided. However, I realize that the indentation makes more sense in the context of Python and removes some magic that would be natural for a C++ programmer used to presence of stack... Ok, I answer my question, but "context" still sounds nicer to me than "block";) Regards, Nicolas From pje at telecommunity.com Fri Apr 29 23:18:52 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Fri Apr 29 23:15:52 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429190525.GA2708@panix.com> References: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> Message-ID: <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> At 12:05 PM 4/29/05 -0700, Aahz wrote: >On Fri, Apr 29, 2005, Phillip J. Eby wrote: > > At 10:42 AM 4/29/05 -0700, Aahz wrote: > >>On Fri, Apr 29, 2005, Guido van Rossum wrote: > >>> [Phillip J. Eby] > >>>> > >>>> Although I'd personally prefer a no-keyword approach: > >>>> > >>>> synchronized(self): > >>>> with_file("foo") as f: > >>>> # etc. > >>> > >>> I'd like that too, but it was shot down at least once. Maybe we can > >>> resurrect it? > >>> > >>> opening("foo") as f: > >>> # etc. > >> > >>I'm still -1 for the same reason I mentioned earlier: function calls > >>spanning multiple lines are moderately common in Python code, and it's > >>hard to distinguish these cases because multi-line calls usually get > >>indented like blocks. > > > > But the indentation of a multi-line call doesn't start with a colon. > >Neither does the un-keyworded block. 
It starts with a colon on the end >of the previous line. I thought part of the point of Python was to >minimize reliance on punctuation, especially where it's not clearly >visible? Actually, I've just realized that I was misled by your argument into thinking that the possibility of confusing a multi-line call and a block of this sort is a problem. It's not, because template blocks can be viewed as multi-line calls that just happen to include a block of code as one of the arguments. So, mistaking one for the other when you're just skimming the code and not looking at things like "as" or the ":", is really not important. In the second place, the most important cue to understanding the behavior of a template block is the template function itself; the bare syntax gives it the most prominence. Blocks like 'synchronized(self):' should be instantly comprehensible to Java programmers, for example, and 'retry(3):' is also pretty self-explanatory. And so far, template function names and signatures have been quite brief as well. From aahz at pythoncraft.com Sat Apr 30 00:43:00 2005 From: aahz at pythoncraft.com (Aahz) Date: Sat Apr 30 00:43:02 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> References: <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> Message-ID: <20050429224300.GA9425@panix.com> On Fri, Apr 29, 2005, Phillip J. 
Eby wrote: > > Actually, I've just realized that I was misled by your argument into > thinking that the possibility of confusing a multi-line call and a block of > this sort is a problem. It's not, because template blocks can be viewed as > multi-line calls that just happen to include a block of code as one of the > arguments. So, mistaking one for the other when you're just skimming the > code and not looking at things like "as" or the ":", is really not > important. Maybe. I'm not persuaded, but this inclines me toward agreeing with your position. > In the second place, the most important cue to understanding the behavior > of a template block is the template function itself; the bare syntax gives > it the most prominence. Blocks like 'synchronized(self):' should be > instantly comprehensible to Java programmers, for example, and 'retry(3):' > is also pretty self-explanatory. And so far, template function names and > signatures have been quite brief as well. This works IMO IFF Python is regarded as a language with user-defined syntactical structures. Guido has historically disagreed strongly with that philosophy; until and unless he reverses his opinion, this is precisely why the non-keyword version will continue to receive -1 from me. (As it happens, I agree with Guido, so if Guido wants to change, I'll probably argue until I see good reason. ;-) -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "It's 106 miles to Chicago. We have a full tank of gas, a half-pack of cigarettes, it's dark, and we're wearing sunglasses." "Hit it." 
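[Summary note: the templates being debated above, PEP 340's `opening("foo") as f:` and `synchronized(self):`, are ordinary generator functions whose `yield` marks where the user's block would run. A minimal sketch in today's Python, driving the generator by hand since the proposed block syntax does not exist; the filename and helper name are illustrative only:]

```python
import os
import tempfile

def opening(filename, mode="r"):
    # PEP 340-style template: setup before the yield, cleanup after.
    f = open(filename, mode)
    try:
        yield f          # the user's block body would run here, "as f"
    finally:
        f.close()

# Without the proposed syntax, the caller must drive the generator itself.
path = os.path.join(tempfile.gettempdir(), "pep340_demo.txt")
gen = opening(path, "w")
f = next(gen)            # runs the setup; yields the value bound by "as"
f.write("hello")
try:
    next(gen)            # resumes after the yield: runs the finally clause
except StopIteration:
    pass                 # the template is exhausted once cleanup is done
assert f.closed
os.remove(path)
```

The block-statement in the PEP is sugar for exactly this drive-the-generator loop, plus error injection into the template when the block body raises.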
From reinhold-birkenfeld-nospam at wolke7.net Sat Apr 30 00:53:12 2005 From: reinhold-birkenfeld-nospam at wolke7.net (Reinhold Birkenfeld) Date: Sat Apr 30 00:55:45 2005 Subject: [Python-Dev] Re: PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042822271a43bc83@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271B7FC.1070801@pobox.com> <ca471dc205042822271a43bc83@mail.gmail.com> Message-ID: <d4udl0$7j3$1@sea.gmane.org> Guido van Rossum wrote: >> Another possibility just occurred to me. How about "using"? > > Blah. I'm beginning to like block just fine. With using, the choice of > word for the generator name becomes iffy IMO; and it almost sounds > like it's a simple renaming: "using X as Y" could mean "Y = X". FWIW, the first association when seeing "block something:" is with the verb "to block", and not with the noun, which is most displeasing. Reinhold -- Mail address is perfectly valid! From gvanrossum at gmail.com Sat Apr 30 01:02:16 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Apr 30 01:02:18 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <20050429224300.GA9425@panix.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> <20050429224300.GA9425@panix.com> Message-ID: <ca471dc205042916024c03501a@mail.gmail.com> [Phillip] > > In the second place, the most important cue to understanding the behavior > > of a template block is the template function itself; the bare syntax gives > > it the most prominence. Blocks like 'synchronized(self):' should be > > instantly comprehensible to Java programmers, for example, and 'retry(3):' > > is also pretty self-explanatory. 
And so far, template function names and > > signatures have been quite brief as well. [Aahz] > This works IMO IFF Python is regarded as a language with user-defined > syntactical structures. Guido has historically disagreed strongly with > that philosophy; until and unless he reverses his opinion, this is > precisely why the non-keyword version will continue to receive -1 from > me. (As it happens, I agree with Guido, so if Guido wants to change, > I'll probably argue until I see good reason. ;-) Actually, I think this is a nice way to have my cake and eat it too: on the one hand, there still isn't any user-defined syntax, because the keyword-less block syntax is still fixed by the compiler. On the other hand, people are free to *think* of it as introducing syntax if it helps them understand the code better. Just as you can think of each distinct @decorator as a separate piece of syntax that modifies a function/method definition. And just as you can think of a function call as a user-defined language extension. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rodrigobamboo at gmail.com Sat Apr 30 01:15:12 2005 From: rodrigobamboo at gmail.com (Rodrigo B. de Oliveira) Date: Sat Apr 30 01:15:28 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <5917478b050429161524eba5a3@mail.gmail.com> On 4/29/05, Guido van Rossum <gvanrossum@gmail.com> wrote: > [Phillip J. Eby] > > Although I'd personally prefer a no-keyword approach: > > > > synchronized(self): > > with_file("foo") as f: > > # etc. > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! > Yes. 
I like it. EXPRESSION [as VAR]: BLOCK lock(self._monitor): # typing synchronized freaks me out spam() using(DB.open()) as conn: eggs(conn) From gvanrossum at gmail.com Sat Apr 30 01:19:59 2005 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sat Apr 30 01:20:02 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <d4tsik$4be$1@sea.gmane.org> References: <d4tsik$4be$1@sea.gmane.org> Message-ID: <ca471dc205042916194f20c27@mail.gmail.com> [Michael Spencer] > I don't know whether it's true for all the PEP 340 use cases, but all the > current examples would read very naturally if the block-template could be > specified in an extended try statement: Sorry, this emphasizes the wrong thing. A try-statement emphasizes that the body may fail (and then provides some cleanup semantics). IMO a block-statement, while it has cleanup semantics, should emphasize that the block executes under some kind of supervision. The more I think about it the more I like having no keyword at all (see other messages). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From s.percivall at chello.se Sat Apr 30 01:44:38 2005 From: s.percivall at chello.se (Simon Percivall) Date: Sat Apr 30 01:44:41 2005 Subject: [Python-Dev] Anonymous blocks: Thunks or iterators? In-Reply-To: <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> References: <fb6fbf560504291017362204a7@mail.gmail.com> <Pine.A41.4.61b.0504291101290.83522@dante74.u.washington.edu> Message-ID: <0EBAEB8A-29A5-4B51-9894-F808993EC0A3@chello.se> On 29 apr 2005, at 20.10, Brian Sabbey wrote: > [...] The thunk and its surrounding function can share the same > code. The thunk gets compiled into the function the same way the > body of a for loop would. This seems really, truly, nasty! Wouldn't this require you to check the source code of the function you want to integrate your thunk into to avoid namespace collisions? 
Well, no, not to avoid collisions I guess, if it's truly regarded as part of the function. But this means it would use the function's global namespace, etc. You'd be unable to use anything from the scopes in which the thunk is defined, which makes it really, really ... weird. Or have I not gotten it? //Simon From ncoghlan at gmail.com Sat Apr 30 01:55:22 2005 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat Apr 30 01:55:27 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <42725691.4030308@cirad.fr> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <42721E2E.8020108@cirad.fr> <42724C1C.4040200@gmail.com> <42725691.4030308@cirad.fr> Message-ID: <4272C96A.5080709@gmail.com> Pierre Barbier de Reuille wrote: > Mmmmh ... why introduce a new flag? Can't you just test the presence of > the "__error__" method? This would lift your problem, wouldn't it? Perhaps - it would require doing something a little tricky with generators to allow the programmer to specify whether the generator should be finalised or not. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://boredomandlaziness.skystorm.net From shane.holloway at ieee.org Sat Apr 30 02:52:27 2005 From: shane.holloway at ieee.org (Shane Holloway (IEEE)) Date: Sat Apr 30 02:52:54 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042910162befaaee@mail.gmail.com> References: <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> Message-ID: <4272D6CB.5070806@ieee.org> Guido van Rossum wrote: > [Phillip J. Eby] > >>Although I'd personally prefer a no-keyword approach: >> >> synchronized(self): >> with_file("foo") as f: >> # etc. 
> > > I'd like that too, but it was shot down at least once. Maybe we can > resurrect it? > > opening("foo") as f: > # etc. > > is just a beauty! +1 Certainly my favorite because it's direct and easy on the eyes. Second would be:: in opening("foo") as f: # etc. because I can see Aahz's point about introducing the block with a keyword instead of relying on the ":" punctuation and subsequent indentation of the block for skimming code. -Shane Holloway From python-dev at zesty.ca Sat Apr 30 03:21:26 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 30 03:21:29 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <ca471dc205042916194f20c27@mail.gmail.com> References: <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> Message-ID: <Pine.LNX.4.58.0504292007510.4786@server1.LFW.org> On Fri, 29 Apr 2005, Guido van Rossum wrote: > The more I think about it the more I like having no keyword at all > (see other messages). I hope you'll reconsider this. I really think introducing a new statement requires a keyword, for pedagogical reasons as well as readability and consistency. Here's my pitch: All the statements in Python are associated with keywords, except for assignment, which is simple and extremely common. I don't think the block statement is simple enough or common enough for that; its semantics are much too significant to be flagged only by a little punctuation mark like a colon. I can empathize with wanting to avoid a keyword in order to avoid an endless debate about what the keyword will be. But that debate can't be avoided anyway -- we still have to agree on what to call this thing when talking about it and teaching it. The keyword gives us a name, a conceptual tag from which to hang our knowledge and discussions. Once we have a keyword, there can be no confusion about what to call the construct. 
And if there is a distinctive keyword, a Python programmer who comes across this unfamiliar construct will be able to ask someone "What does this 'spam' keyword mean?" or can search on Google for "Python spam" to find out what it means. Without a keyword, they're out of luck. Names are power. -- ?!ng From pje at telecommunity.com Sat Apr 30 03:52:07 2005 From: pje at telecommunity.com (Phillip J. Eby) Date: Sat Apr 30 03:49:22 2005 Subject: [Python-Dev] PEP 340 - possible new name for block-statement In-Reply-To: <ca471dc205042916024c03501a@mail.gmail.com> References: <20050429224300.GA9425@panix.com> <ca471dc205042815557616722b@mail.gmail.com> <4271F71B.8010000@gmail.com> <20050429163854.GB14920@panix.com> <5.1.1.6.0.20050429130113.033208b0@mail.telecommunity.com> <ca471dc205042910162befaaee@mail.gmail.com> <5.1.1.6.0.20050429134751.03099cb0@mail.telecommunity.com> <5.1.1.6.0.20050429170733.031a74e0@mail.telecommunity.com> <20050429224300.GA9425@panix.com> Message-ID: <5.1.1.6.0.20050429212046.032abd70@mail.telecommunity.com> At 04:02 PM 4/29/05 -0700, Guido van Rossum wrote: >Actually, I think this is a nice way to have my cake and eat it too: >on the one hand, there still isn't any user-defined syntax, because >the keyword-less block syntax is still fixed by the compiler. On the >other hand, people are free to *think* of it as introducing syntax if >it helps them understand the code better. Just as you can think of >each distinct @decorator as a separate piece of syntax that modifies a >function/method definition. And just as you can think of a function >call as a user-defined language extension. And, amusingly enough, those folks who wanted a decorator suite can now have their wish, e.g.: decorate(classmethod): def something(cls, blah): ... Given a suitable frame-sniffing implementation of 'decorate'. :) By the way, I notice PEP 340 has two outstanding items with my name on them; let me see if I can help eliminate one real quick. 
Tracebacks: it occurs to me that I may have unintentionally given the impression that I need to pass in an arbitrary traceback, when in fact I only need to pass in the current sys.exc_info(). So, if the error call-in doesn't pass in anything but an error flag, and the template iterator is supposed to just read sys.exc_info(), maybe that would be less of an issue? For one thing, it would make handling arbitrary errors in the template block cleaner, because the traceback for unhandled errors in something like this: synchronized(foo): raise Bar would look something like this: File .... line ... of __main__: synchronized(foo): File .... line ... of synchronized: yield File .... line ... of __main__: raise Bar Which, IMO, is the "correct" traceback for this circumstance, although since the first and last frame would actually be the same, you'd probably only get the lower two entries (the yield and the raise), which is OK too I think. Anyway, I mainly just wanted to note that I'd be fine with having a way to say, "Hey, there's an error, handle it" that doesn't allow passing in the exception or traceback, but is just a flag that means "look at Python's error state" instead of passing a value back in. I can do this because when I need to pass in a traceback, it's because I'm trying to pass a terminated coroutine's error into another coroutine. So, the traceback I want to pass in is Python's existing "last error" state anyway. From pje at telecommunity.com Sat Apr 30 03:54:47 2005 From: pje at telecommunity.com (Phillip J. 
Eby) Date: Sat Apr 30 03:52:01 2005 Subject: [Python-Dev] PEP 340: syntax suggestion - try opening(filename) as f: In-Reply-To: <Pine.LNX.4.58.0504292007510.4786@server1.LFW.org> References: <ca471dc205042916194f20c27@mail.gmail.com> <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> Message-ID: <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> At 08:21 PM 4/29/05 -0500, Ka-Ping Yee wrote: >All the statements in Python are associated with keywords, except >for assignment, which is simple and extremely common. I don't >think the block statement is simple enough or common enough for >that; its semantics are much too significant to be flagged only >by a little punctuation mark like a colon. Don't forget the 'as' clause. >I can empathize with wanting to avoid a keyword in order to >avoid an endless debate about what the keyword will be. But >that debate can't be avoided anyway -- we still have to agree >on what to call this thing when talking about it and teaching it. A "template invocation", perhaps, for the statement, and a "templated block" for the actual block. The expression part of the statement would be the "template expression" which must result in a "template iterator". >The keyword gives us a name, a conceptual tag from which to hang >our knowledge and discussions. Once we have a keyword, there >can be no confusion about what to call the construct. And if >there is a distinctive keyword, a Python programmer who comes >across this unfamiliar construct will be able to ask someone >"What does this 'spam' keyword mean?" or can search on Google for >"Python spam" to find out what it means. Without a keyword, >they're out of luck. Names are power. help(synchronized) or help(retry) would doubtless display useful information. Conversely, try Googling for Python's "for" or "if" keywords, and see if you get anything useful -- I didn't. 
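[Summary note: Phillip's help() point above is easy to check. A PEP 340 template is just a function (a generator, under the PEP's iterator approach), so its docstring is exactly what help() would page. A small sketch in today's Python; pydoc.render_doc builds the same text interactive help() shows, and the synchronized template follows the PEP's own example:]

```python
import pydoc
import threading

def synchronized(lock):
    """Acquire *lock*, run the block body, then release the lock on the
    way out, even if the body raises."""
    lock.acquire()
    try:
        yield             # the user's block body would run here
    finally:
        lock.release()

# pydoc.render_doc produces the same text help(synchronized) would display,
# so the template's docstring really is one help() call away.
doc = pydoc.render_doc(synchronized)
assert "run the block body" in doc

# The template also works when driven by hand, sans block syntax:
lock = threading.Lock()
gen = synchronized(lock)
next(gen)                 # acquires the lock
assert lock.locked()
try:
    next(gen)             # resumes past the yield, releasing the lock
except StopIteration:
    pass
assert not lock.locked()
```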
From python-dev at zesty.ca Sat Apr 30 09:44:20 2005 From: python-dev at zesty.ca (Ka-Ping Yee) Date: Sat Apr 30 09:44:23 2005 Subject: [Python-Dev] Keyword for block statements In-Reply-To: <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> References: <ca471dc205042916194f20c27@mail.gmail.com> <d4tsik$4be$1@sea.gmane.org> <ca471dc205042916194f20c27@mail.gmail.com> <5.1.1.6.0.20050429213620.0322cec0@mail.telecommunity.com> Message-ID: <Pine.LNX.4.58.0504300227440.4786@server1.LFW.org> On Fri, 29 Apr 2005, Phillip J. Eby wrote: > At 08:21 PM 4/29/05 -0500, Ka-Ping Yee wrote: > >All the statements in Python are associated with keywords, except > >for assignment, which is simple and extremely common. I don't > >think the block statement is simple enough or common enough for > >that; its semantics are much too significant to be flagged only > >by a little punctuation mark like a colon. > > Don't forget the 'as' clause. It's optional, and you have to skip an arbitrarily long expression to get to it. > >if there is a distinctive keyword, a Python programmer who comes > >across this unfamiliar construct will be able to ask someone > >"What does this 'spam' keyword mean?" or can search on Google for > >"Python spam" to find out what it means. Without a keyword, > >they're out of luck. Names are power. > > help(synchronized) or help(retry) would doubtless display useful > information. The programmer who writes the function used to introduce a block can hardly be relied upon to explain the language semantics. We don't expect the docstring of every class to repeat an explanation of Python classes, for example. The language reference manual is for that; it's a different level of documentation. > Conversely, try Googling for Python's "for" or "if" keywords, > and see if you get anything useful -- I didn't. 
I tried some of my favourite Python keywords :) and found that the following searches all successfully turn up information on the associated kinds of Python statements in the first couple of hits: python if python else python del python while python assert python yield python break python continue python pass python raise python try python finally python class python for statement python return statement python print statement -- ?!ng From python at rcn.com Sat Apr 30 17:34:22 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Apr 30 17:35:36 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1112972430.19904.9.camel@geddy.wooz.org> Message-ID: <001101c54d9a$22c70da0$8d22c797@oemcomputer> I haven't heard back from Greg Stein, Jim Fulton, or Paul Prescod. If anyone can get in touch with them, that would be great. I suspect that Jim may want to keep the commit privileges active and that Paul and Greg are done with commits for the time being. Raymond Hettinger From aleaxit at yahoo.com Sat Apr 30 20:59:53 2005 From: aleaxit at yahoo.com (Alex Martelli) Date: Sat Apr 30 20:59:55 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <001101c54d9a$22c70da0$8d22c797@oemcomputer> References: <001101c54d9a$22c70da0$8d22c797@oemcomputer> Message-ID: <e807dd3a37c599a5629f73c8edae0f8f@yahoo.com> On Apr 30, 2005, at 08:34, Raymond Hettinger wrote: > I haven't heard back from Greg Stein, Jim Fulton, or Paul Prescod. > > If anyone can get in touch with them, that would be great. > I suspect that Jim may want to keep the commit privileges active > and that Paul and Greg are done with commits for the time being. 
Greg (gstein at lyra dot org, also gstein at google dot com), I assume, might also want to keep the commit privileges -- he's now working on the opensource projects at Google, and actively speaking about "Python at Google" (he did so both at Pycon and ACCU/PythonUK), so it seems far from unlikely to me that he might be back to active contributions soon. Anyway, you can ask him directly. Alex From prescod at gmail.com Sat Apr 30 22:38:58 2005 From: prescod at gmail.com (Paul Prescod) Date: Sat Apr 30 22:39:00 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <001101c54d9a$22c70da0$8d22c797@oemcomputer> References: <1112972430.19904.9.camel@geddy.wooz.org> <001101c54d9a$22c70da0$8d22c797@oemcomputer> Message-ID: <1cb725390504301338dccb8c9@mail.gmail.com> I haven't been using Python recently and don't have plans to contribute to its development. Go ahead and drop me from the list. From python at rcn.com Sat Apr 30 23:21:42 2005 From: python at rcn.com (Raymond Hettinger) Date: Sat Apr 30 23:22:04 2005 Subject: [Python-Dev] Developer list update In-Reply-To: <1cb725390504301338dccb8c9@mail.gmail.com> Message-ID: <000901c54dca$9acbb0a0$8d22c797@oemcomputer> Thanks for the note. Let me know if you need to be switched on again at some point. Raymond Hettinger > -----Original Message----- > From: Paul Prescod [mailto:prescod@gmail.com] > Sent: Saturday, April 30, 2005 4:39 PM > To: Raymond Hettinger > Cc: python-dev@python.org > Subject: Re: [Python-Dev] Developer list update > > I haven't been using Python recently and don't have plans to > contribute to its development. Go ahead and drop me from the list.